What happens if you give two AIs opposing beliefs and make them argue on a given topic? This is exactly the question that became the starting point for Agent Chat — an experiment where two agents with different personalities engage in a real-time dialogue, backing their arguments with real facts from Wikipedia, Tavily, NewsAPI, ArXiv, and Alpha Vantage.
⚠️ Important about the architecture: Agent Chat is an experimental project, intentionally kept simple to quickly test how agents with different personalities behave. Polling instead of WebSocket, synchronous DB reads in a loop, no message queues: these are deliberate compromises for the sake of easy launch and readable code. A production multi-agent system would need a more robust architecture.
💡 Recommendation: run it locally with Ollama — it's completely free.
No API keys, no costs. qwen3:8b is enough to see a live dialogue
of agents with real facts from Wikipedia and ArXiv.
GitHub: github.com/VadimKharovyuk/Agent_Chat — MIT license, full code, README with launch instructions.
Contents
- The Idea and How It Works in Action
- Stack: Spring Boot 4, Spring AI 2.0, Ollama, and OpenRouter
- Architecture in 5 Minutes — Entity, Layers, Flow from Request to Dialogue
- AiProviderConfig: How to Switch Ollama and OpenRouter via @Profile
- AgentConversationService — Service Layer: What's Here and Why Dialogue Logic Isn't Here
- generateTopic() — How an Agent Invents a Topic from Real News
- AgentConversationRunner — The Heart of the Project: @Async Loop, Stop Phrases, HISTORY_SIZE
- ask() — How Context is Built and Why Role Order Matters
- Five Tools: Wikipedia, Tavily, NewsAPI, Alpha Vantage, ArXiv
- How to Write a Prompt So Agents Actually Argue
- Deployment: Ollama Locally and Railway in Production
- Conclusions + GitHub
The Idea and How It Works in Action
The classic problem with AI agents is that they are too polite. Ask GPT to argue, and it will agree within two messages. Agent Chat solves this through system prompts with strict prohibitions and clear beliefs for each agent.
The flow looks like this:
Topic → Agent A Responds → Agent B Responds → Agent A ...
The user specifies:
- Topic — for example, "Should AI be regulated by the state?"
- System Prompt for Agent A — role, beliefs, prohibitions, format
- System Prompt for Agent B — opposing position
- Number of Rounds — 1 round = 2 messages (A + B)
Agents conduct the dialogue automatically. Each message is stored in PostgreSQL. The conversation can be stopped at any moment. An agent can also end the dialogue independently — if its response contains a stop phrase like "goodbye" or "farewell."
The most interesting part: agents can access real sources to back up their arguments — Wikipedia, current news, scientific articles, stock prices. This makes the dialogue more lively and less prone to hallucinations.
If you already have other models installed in Ollama — you can try them. But from experience: not all models follow system prompt instructions well. Some ignore prohibitions and start agreeing with the opponent after 2-3 rounds, others don't call tools at all. For more details on which models actually work on local hardware — read the article about Ollama on 8GB RAM.
Stack: Spring Boot 4, Spring AI 2.0, Ollama, and OpenRouter
| Component | Technology | Purpose |
|-----------|------------|---------|
| Backend | Java 21, Spring Boot 4.0.6 | Main framework |
| AI Framework | Spring AI 2.0.0-M5 | Abstraction over LLM providers |
| LLM (local) | Ollama (qwen3:8b / llama3.1:8b) | Local development without costs |
| LLM (prod) | OpenRouter (deepseek/deepseek-chat) | Cloud provider for Railway |
| Database | PostgreSQL | Storing conversations and messages |
| Frontend | Thymeleaf + Bootstrap 5 | Simple UI without a separate SPA |
| Tools | Wikipedia, Tavily, NewsAPI, Alpha Vantage, ArXiv | Real facts for arguments |
The key stack decision is Spring AI as an abstraction layer. All code in Runner and Service works with the ChatModel interface — it doesn't know whether it's Ollama or OpenRouter under the hood. Switching happens exclusively through Spring Profiles.
Architecture in 5 Minutes — Entity, Layers, Flow from Request to Dialogue
The project structure is classic for Spring Boot — a clear division of responsibility between layers:
src/main/java/com/example/agent_chat/
├── config/
│ └── AiProviderConfig.java # Ollama / OpenRouter providers
├── controller/
│ ├── HomeController.java
│ └── AgentConversationController.java
├── entity/
│ ├── AgentConversation.java # Conversation: topic, prompts, status
│ ├── AgentMessage.java # Message: sender, content, round
│ ├── AgentSender.java # Enum: AGENT_A / AGENT_B
│ └── ConversationStatus.java # Enum: RUNNING / STOPPED / FINISHED
├── repository/
│ ├── AgentConversationRepository.java
│ └── AgentMessageRepository.java
├── service/
│ ├── AgentConversationService.java # Thin service layer
│ ├── AgentConversationRunner.java # @Async dialogue loop
│ └── WikipediaSearchTool.java # + other tools
└── dto/
├── StartConversationRequest.java
├── ConversationResponse.java
└── ExperimentMapper.java
Flow from request to the first agent message:
HTTP POST /start
↓
AgentConversationController
↓
AgentConversationService.start()
↓ saves AgentConversation to DB with RUNNING status
↓
AgentConversationRunner.run() ← @Async (separate thread)
↓
[rounds loop]
↓
ask() → Spring AI → Ollama / OpenRouter → response
↓
saveMessage() → AgentMessage to DB
↓
HTTP GET /conversation/{id} ← frontend polling
Important: The controller returns the conversationId immediately, without waiting for the agents to finish the dialogue. The frontend polls the conversation status itself. This is a standard pattern for @Async operations.
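Outside of Spring, the same "start async, return an ID, poll a status store" pattern can be sketched in plain Java. Everything below, class and method names included, is illustrative and not project code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Minimal sketch of the pattern: the caller gets an ID immediately,
// the long job runs in a background thread, the client polls a status store.
public class AsyncStartPollDemo {
    enum Status { RUNNING, FINISHED }

    static final Map<Long, Status> store = new ConcurrentHashMap<>();
    static final AtomicLong ids = new AtomicLong();

    // Analogue of POST /start: schedule the long job, return the ID at once
    static long start() {
        long id = ids.incrementAndGet();
        store.put(id, Status.RUNNING);
        Thread worker = new Thread(() -> {
            // ... the long "dialogue" would run here ...
            try { Thread.sleep(100); } catch (InterruptedException ignored) { }
            store.put(id, Status.FINISHED);
        });
        worker.start();
        return id;
    }

    // Analogue of GET /conversation/{id}: the frontend polls this
    static Status poll(long id) { return store.get(id); }

    public static void main(String[] args) throws InterruptedException {
        long id = start(); // returns immediately, work continues in background
        while (poll(id) != Status.FINISHED) {
            Thread.sleep(50); // frontend-style polling
        }
        System.out.println("conversation " + id + " finished");
    }
}
```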
AiProviderConfig: How to Switch Ollama and OpenRouter via @Profile
One of the most elegant solutions in the project is configuring providers via Spring Profiles. All Runner and Service code depends on the ChatModel interface — the specific implementation is injected via DI depending on the active profile:
@Configuration
public class AiProviderConfig {
// ── LOCAL: Ollama ─────────────────────────────────────────────
@Configuration
@Profile("local")
static class OllamaConfig {
@Bean
@Primary
public ChatModel primaryChatModel(OllamaChatModel ollamaChatModel) {
return ollamaChatModel;
}
@Bean("agentChatModel")
public ChatModel agentChatModel(OllamaChatModel ollamaChatModel) {
return ollamaChatModel;
}
}
// ── PROD: OpenRouter ──────────────────────────────────────────
@Configuration
@Profile("openai")
static class OpenAiConfig {
@Bean
@Primary
public ChatModel primaryChatModel(OpenAiChatModel openAiChatModel) {
return openAiChatModel;
}
@Bean("agentChatModel")
public ChatModel agentChatModel(OpenAiChatModel openAiChatModel) {
return openAiChatModel;
}
}
}
Note the two beans: primaryChatModel and agentChatModel. This is not duplication — these are two different roles:
- primaryChatModel — used in AgentConversationService.generateTopic() to generate a topic. This is a quick request without tools.
- agentChatModel — used in AgentConversationRunner for the main dialogue. This is the bean injected via @Qualifier("agentChatModel").
Switching between profiles:
# Locally — application-local.properties
spring.ai.ollama.chat.model=qwen3:8b
# Production — environment variable
SPRING_PROFILES_ACTIVE=openai
OPENAI_API_KEY=your_openrouter_key
Note the @ConditionalOnProperty(name = "app.agent.experiment.enabled", havingValue = "true") on the Runner and Service. This means that all agent functionality is disabled by default and is enabled explicitly. This is useful when you want to deploy the application without agents or add new features gradually.
AgentConversationService — Service Layer: What's Here and Why Dialogue Logic Isn't Here
The service layer in this project is intentionally thin. It does not contain dialogue logic — all of it is in the Runner. The responsibilities of AgentConversationService:
- Create AgentConversation in the DB and delegate the start to the Runner
- Stop the conversation by changing the status
- Provide CRUD for reading conversations
- Generate a topic via generateTopic()
@Service
@ConditionalOnProperty(name = "app.agent.experiment.enabled",
havingValue = "true", matchIfMissing = false)
public class AgentConversationService {
private final AgentConversationRepository conversationRepository;
private final AgentMessageRepository messageRepository;
private final AgentConversationRunner runner;
private final ChatModel primaryChatModel;
private final NewsApiSearchTool newsApiSearchTool;
public Long start(StartConversationRequest request) {
// Save conversation to DB
AgentConversation conversation = new AgentConversation();
conversation.setTopic(request.topic());
conversation.setSystemPromptA(request.systemPromptA());
conversation.setSystemPromptB(request.systemPromptB());
conversation.setTotalRounds(0);
conversationRepository.save(conversation);
int maxRounds = request.maxRounds() > 0 ? request.maxRounds() : 100;
// Delegate to Runner — it will go to an @Async thread
runner.run(
conversation.getId(),
request.systemPromptA(),
request.systemPromptB(),
request.topic(),
maxRounds
);
// Return ID immediately — don't wait for completion
return conversation.getId();
}
public void stop(Long conversationId) {
AgentConversation conversation = conversationRepository
.findById(conversationId)
.orElseThrow(() -> new IllegalArgumentException("Not found: " + conversationId));
conversation.setStatus(ConversationStatus.STOPPED);
conversation.setFinishedAt(LocalDateTime.now());
conversationRepository.save(conversation);
}
@Transactional
public void deleteById(Long id) {
messageRepository.deleteByConversationId(id);
conversationRepository.deleteById(id);
}
}
Note the stop() method: it simply changes the status in the DB to STOPPED. It does not stop the thread directly — the Runner checks the status at the beginning of each round and between A and B. This is a safer approach than interrupting the thread.
Why is the loop logic not in Service? Single Responsibility Principle. Service manages the conversation state in the DB and provides an API for the controller. Runner executes the dialogue itself. If tomorrow you need to add WebSockets instead of polling or change the agent alternation logic — these changes will only affect the Runner, not the Service.
generateTopic() — How an Agent Invents a Topic from Real News
A separate interesting feature: if the user doesn't want to invent a topic themselves — the system generates it automatically from real news. Here's the complete method:
public String generateTopic() {
List<String> queries = List.of(
"technology AI society",
"economy inflation future",
"climate environment crisis",
"politics democracy freedom",
"science space exploration",
"healthcare medicine future",
"education technology students",
"cryptocurrency bitcoin finance"
);
// Select a random category
String randomQuery = queries.get(
(int) (Math.random() * queries.size())
);
// Get fresh news for the category
String news = newsApiSearchTool.searchNews(randomQuery);
// Ask the LLM to formulate a provocative topic
String prompt = """
Based on this news, come up with one provocative topic
for a philosophical discussion.
The topic should be controversial — so that two agents with opposing
views can argue.
Respond ONLY with the topic — one sentence, no explanations, no quotes.
News: %s
""".formatted(news);
return primaryChatModel.call(prompt).trim();
}
Three steps: random category → NewsAPI → LLM formulates the topic. If something goes wrong (NewsAPI is unavailable, LLM doesn't respond) — fallback to a default topic: "Will artificial intelligence change the future of humanity?".
Important detail: primaryChatModel is used here, not agentChatModel. Generating a topic doesn't require tools and complex context — it's a simple text-in text-out request. The separation of beans is justified.
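The fallback itself is not shown inside generateTopic() above; one plausible way to wire it is a thin try/catch wrapper. The wrapper below is a sketch of my own, only the default topic string comes from the article:

```java
import java.util.function.Supplier;

// Sketch of the fallback described above: if topic generation fails
// (NewsAPI down, LLM not responding), return the default topic instead.
// The wrapper is illustrative, not the repository code.
public class TopicFallback {
    static final String DEFAULT_TOPIC =
            "Will artificial intelligence change the future of humanity?";

    static String generateTopicSafely(Supplier<String> generator) {
        try {
            String topic = generator.get();
            // Also guard against an empty LLM response
            return (topic == null || topic.isBlank()) ? DEFAULT_TOPIC : topic.trim();
        } catch (Exception e) {
            // NewsAPI unavailable, LLM timeout, etc.
            return DEFAULT_TOPIC;
        }
    }

    public static void main(String[] args) {
        // Simulate a failing NewsAPI call
        System.out.println(generateTopicSafely(() -> {
            throw new RuntimeException("NewsAPI is unavailable");
        }));
    }
}
```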
AgentConversationRunner — The Heart of the Project: @Async Loop, Stop Phrases, HISTORY_SIZE
AgentConversationRunner is a component that executes the dialogue itself in a separate thread. Let's break down the key parts:
Main loop
@Async
public void run(Long conversationId, String systemPromptA,
String systemPromptB, String topic, int maxRounds) {
String message = topic; // The first message is the topic of the conversation
for (int round = 1; round <= maxRounds; round++) {
// Check if manually stopped
AgentConversation conversation = conversationRepository
.findById(conversationId).orElseThrow();
if (conversation.getStatus() == ConversationStatus.STOPPED) return;
// Agent A responds
List<AgentMessage> historyA = messageRepository
.findByConversationIdOrderByRoundNumberAsc(conversationId);
String replyA = ask(systemPromptA, historyA, message, AgentSender.AGENT_A);
saveMessage(conversation, AgentSender.AGENT_A, replyA, round);
// Check for stop phrase and manual stop between A and B
if (containsStopPhrase(replyA)) { finish(conversation, round); return; }
conversation = conversationRepository.findById(conversationId).orElseThrow();
if (conversation.getStatus() == ConversationStatus.STOPPED) return;
// Agent B responds
List<AgentMessage> historyB = messageRepository
.findByConversationIdOrderByRoundNumberAsc(conversationId);
String replyB = ask(systemPromptB, historyB, replyA, AgentSender.AGENT_B);
saveMessage(conversation, AgentSender.AGENT_B, replyB, round);
if (containsStopPhrase(replyB)) { finish(conversation, round); return; }
message = replyB; // B's reply becomes the input message for A
sleep(500); // Small pause between rounds
}
finish(conversation, maxRounds);
}
Stop phrases
private static final List<String> STOP_PHRASES = List.of(
    "до свидания", "прощай", "на этом всё",   // Russian: "goodbye", "farewell", "that's all"
    "goodbye", "farewell", "конец разговора"  // "конец разговора" = "end of the conversation"
);
If an agent decides to end the conversation naturally, the system recognizes it and stops the loop. Phrases are checked after each response — both after A and after B.
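containsStopPhrase() itself is not shown in the article; a plausible implementation is a case-insensitive substring check against the same list. The method body below is an assumption:

```java
import java.util.List;
import java.util.Locale;

// Plausible implementation of containsStopPhrase(): lowercase the reply
// and look for any stop phrase as a substring. The phrase list matches
// the Runner; the method body is a sketch, not the repository code.
public class StopPhraseCheck {
    static final List<String> STOP_PHRASES = List.of(
            "до свидания", "прощай", "на этом всё",
            "goodbye", "farewell", "конец разговора"
    );

    static boolean containsStopPhrase(String reply) {
        if (reply == null) return false;
        String lower = reply.toLowerCase(Locale.ROOT);
        return STOP_PHRASES.stream().anyMatch(lower::contains);
    }

    public static void main(String[] args) {
        System.out.println(containsStopPhrase("On that note, goodbye!")); // true
        System.out.println(containsStopPhrase("I disagree completely.")); // false
    }
}
```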
HISTORY_SIZE = 8
Not the entire conversation history is passed in each request — only the last 8 messages. This is a critical limitation: without it, the context window overflows in long conversations, and the cost of the request increases proportionally to the number of rounds. 8 messages = 4 rounds back — enough for dialogue coherence.
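The windowing itself reduces to one line. A self-contained sketch, equivalent to the skip-based version used later in ask():

```java
import java.util.List;
import java.util.stream.IntStream;

// Equivalent of the HISTORY_SIZE window: keep only the last 8 entries
// of the history, however long the conversation gets.
public class HistoryWindow {
    static final int HISTORY_SIZE = 8;

    static <T> List<T> lastN(List<T> history) {
        return history.subList(Math.max(0, history.size() - HISTORY_SIZE), history.size());
    }

    public static void main(String[] args) {
        List<Integer> history = IntStream.rangeClosed(1, 20).boxed().toList();
        System.out.println(lastN(history)); // prints [13, 14, 15, 16, 17, 18, 19, 20]
    }
}
```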
Pay attention to the double read from the DB. Before Agent A's request and before Agent B's request — separate queries to the repository to get the latest history. This is not a bug — it's a conscious decision: between A and B, A's response is already saved in the DB, so B must see the updated history.
ask() — How Context is Built and Why Role Order Matters
The ask() method is the most technical part of the project. Let's break it down in detail:
private String ask(String systemPrompt, List<AgentMessage> history,
String lastMessage, AgentSender currentSender) {
List<Message> messages = new ArrayList<>();
// 1. System prompt — agent's role and beliefs
messages.add(new SystemMessage(systemPrompt));
// 2. Last HISTORY_SIZE messages with correct roles
history.stream()
.skip(Math.max(0, history.size() - HISTORY_SIZE))
.forEach(m -> {
if (m.getSender() == currentSender) {
// Own utterance → AssistantMessage
messages.add(new AssistantMessage(m.getContent()));
} else {
// Opponent's utterance → UserMessage
messages.add(new UserMessage(m.getContent()));
}
});
// 3. Last message from the opponent
messages.add(new UserMessage(lastMessage));
// 4. Request with all 5 tools
ToolCallback[] tools = ToolCallbacks.from(
wikipediaSearchTool, tavilySearchTool,
alphaVantageTool, arxivSearchTool, newsApiSearchTool
);
return agentChatModel.call(
new Prompt(messages,
ToolCallingChatOptions.builder()
.toolCallbacks(tools)
.build()))
.getResult().getOutput().getText();
}
The key point is mapping roles in history. The LLM expects AssistantMessage to be what it said itself, and UserMessage to be what the user (in our case, the opponent) said. If you mix them up, the model will "forget" its position and start agreeing with the opponent.
Therefore, for each agent, the same DB record can be either an AssistantMessage or a UserMessage — depending on which agent is responding at the moment.
Commented out removeThinkingBlock() code. In the repository, there is a commented-out version of ask() with the removal of <think>...</think> blocks. Some models (qwen3 in particular) return internal thoughts in think tags — and if they are not removed, they will end up in the response. For production use with qwen3, I recommend uncommenting this logic.
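That removal likely reduces to a regex strip. A sketch of what removeThinkingBlock() plausibly does; the regex is my reconstruction of the described behavior, not the repository code:

```java
import java.util.regex.Pattern;

// Sketch of removeThinkingBlock(): strips the <think>...</think> blocks
// that qwen3 emits before the actual answer. The regex is a reconstruction,
// not the commented-out code from the repository.
public class ThinkBlockStripper {
    // (?s) makes '.' match newlines, since think blocks span multiple lines
    private static final Pattern THINK_BLOCK =
            Pattern.compile("(?s)<think>.*?</think>\\s*");

    static String removeThinkingBlock(String reply) {
        return reply == null ? "" : THINK_BLOCK.matcher(reply).replaceAll("").trim();
    }

    public static void main(String[] args) {
        String raw = "<think>The user wants an argument...</think>Free markets win. Why?";
        System.out.println(removeThinkingBlock(raw)); // prints: Free markets win. Why?
    }
}
```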
Five Tools: Wikipedia, Tavily, NewsAPI, Alpha Vantage, ArXiv
Each tool is a Spring component with a method annotated with @Tool. Spring AI automatically registers them and passes their descriptions to the LLM. The model itself decides which tool to call based on the context of the question.
WikipediaSearchTool — facts and definitions
@Tool(description = """
Searches for information on Wikipedia.
Use for definitions, facts, history, biographies.
Use ONLY one or two words for searching.
""")
public String searchWikipedia(String query) {
// Wikipedia's search handles long phrases poorly,
// so shorten the query to its first word
String shortQuery = query.trim().split("\\s+")[0];
WikiSearchResponse response = restClient.get()
.uri("https://ru.wikipedia.org/w/api.php", uriBuilder -> uriBuilder
.queryParam("action", "query")
.queryParam("list", "search")
.queryParam("srsearch", shortQuery)
.queryParam("format", "json")
.queryParam("srlimit", "1")
.build())
.retrieve()
.body(WikiSearchResponse.class);
// Guard against an empty result — get(0) on an empty list would throw
if (response == null || response.query().search().isEmpty()) {
return "Nothing found on Wikipedia for: " + shortQuery;
}
String title = response.query().search().get(0).title();
String snippet = response.query().search().get(0).snippet()
.replaceAll("<[^>]+>", "").trim(); // Remove HTML tags
return "Article: " + title + "\n" + snippet;
}
An important detail: Wikipedia returns snippets with HTML tags (<span class="searchmatch">, etc.) — these need to be removed before passing to the LLM.
AlphaVantageTool — stock prices for economic discussions
@Tool(description = """
Retrieves the current stock price or financial data of a company.
Query is the stock ticker: AAPL, GOOGL, TSLA, AMZN.
""")
public String getStockPrice(String symbol) {
Map response = restClient.get()
.uri(uriBuilder -> uriBuilder
.path("/query")
.queryParam("function", "GLOBAL_QUOTE")
.queryParam("symbol", symbol.toUpperCase())
.queryParam("apikey", apiKey)
.build())
.retrieve()
.body(Map.class);
Map<String, String> quote = (Map<String, String>) response.get("Global Quote");
// Alpha Vantage returns an empty/odd payload when the daily limit is hit
if (quote == null || quote.get("05. price") == null) {
return "No data for " + symbol + " (the free limit of 25 requests/day may be exhausted)";
}
return String.format("Stock %s: $%s | Change: %s | High: $%s | Low: $%s",
symbol, quote.get("05. price"), quote.get("10. change percent"),
quote.get("03. high"), quote.get("04. low"));
}
Table of all tools
| Tool | Usage | Free limit |
|------|-------|------------|
| Wikipedia | Definitions, facts, biographies, scientific concepts | ✅ Unlimited |
| Tavily Search | Current news, fresh statistics, web search | 1,000 / month |
| NewsAPI | Fresh news on a topic as an argument | 100 / day |
| Alpha Vantage | Stock prices, financial data for economic discussions | 25 / day |
| ArXiv | Scientific articles and research | ✅ Unlimited |
Practical advice on @Tool description: the tool's description is a system prompt for the LLM that explains when and how to use it. The more precise the description, the less often the model will call the tool inappropriately or with incorrect parameters. Pay attention to "Use ONLY one or two words" in the Wikipedia tool — without it, the model would send long sentences and get empty results.
How to Write a Prompt So Agents Actually Argue
The quality of the dialogue almost entirely depends on the system prompt. Here's a structure that works:
Mandatory prompt elements:
- Role — who this agent is, their character and beliefs
- Position — what they advocate for and what they believe
- Prohibitions — what they NEVER agree with (most important!)
- Format — concise, with facts, a question at the end
- Language — specify explicitly
Example of a strong prompt (Agent A — capitalist):
You are a tough capitalist, a billionaire, a corporate owner.
You believe that the free market is the only path to prosperity.
Before responding, search Wikipedia for facts about GDP, standard of living.
NEVER agree with communist ideas.
Speak with numbers and facts. You despise planned economies.
Respond concisely — 2-3 sentences.
End with a provocative question.
Respond ONLY in Ukrainian.
Example of a weak prompt (don't do this):
You are a supporter of capitalism. Defend your position.
The difference between a strong and a weak prompt is in the detail of the prohibitions. Without a clear "NEVER agree," the model will find a compromise after 2-3 messages, and the dialogue will become boring.
Tips for a lively dialogue:
- Specify concrete sources for searching: "search Wikipedia", "check stock prices"
- Demand a question at the end of each utterance — this provokes a response
- Limit length: 2-3 sentences is optimal, more becomes boring
- Specify the language explicitly — without it, the model might switch
- For qwen3:8b add: "do not use think tags in your response"
Deployment: Ollama Locally and Railway in Production
Local launch with Ollama
# 1. Install and launch the model
ollama pull qwen3:8b # quality testing (~5GB)
# or
ollama pull llama3.1:8b # fast testing (~4.7GB)
ollama serve
# 2. Create DB
psql -c "CREATE DATABASE Agent_Chat;"
# 3. application-local.properties
spring.ai.ollama.chat.model=qwen3:8b
spring.datasource.url=jdbc:postgresql://localhost:5432/Agent_Chat
spring.datasource.username=postgres
spring.datasource.password=your_password
spring.profiles.active=local
# 4. Run
mvn spring-boot:run
Open: http://localhost:1024
Comparison of models for local testing:
| Model | Response time | Tool calling quality | RAM |
|-------|---------------|----------------------|-----|
| qwen3:8b | 2+ min | ⭐⭐⭐ Follows instructions better | ~8GB |
| llama3.1:8b | 20–30 sec | ⭐⭐ Faster but weaker | ~6GB |
Production on Railway via OpenRouter
# Environment variables on Railway
SPRING_PROFILES_ACTIVE=openai
OPENAI_API_KEY=your_openrouter_key # OpenRouter key
DB_URL=jdbc:postgresql://...
DB_USERNAME=postgres
DB_PASSWORD=your_password
APP_AGENT_EXPERIMENT_ENABLED=true # enable agents
Why Railway and not Heroku or Fly.io? Railway provides PostgreSQL for free and simple deployment via GitHub. For an experimental project, this is optimal — no need to set up a separate DB service. OpenRouter with deepseek/deepseek-chat costs significantly less than direct OpenAI — and for a test project, this is the right choice.
Conclusions
Agent Chat is not a production-ready product, but a live experiment demonstrating several important architectural approaches:
- @Async loop with DB status check — a simple and reliable way to manage long-running background operations without message queues
- Spring Profiles for switching providers — all code depends on an interface, the specific implementation is injected via DI
- Thin service layer + separate Runner — proper separation of responsibilities when the logic is complex and asynchronous
- @Tool with detailed description — the quality of the tool's description directly affects the correctness of its invocation by the model
- HISTORY_SIZE limitation — an essential element for controlling costs and context size in long conversations
The full code is available on GitHub — MIT license, can be used as a basis for your own agent projects:
→ github.com/VadimKharovyuk/Agent_Chat
Read also:
→ Which Ollama model to choose for an agent with tool calling: comparison and benchmarks — if you want to delve deeper into choosing a local model for tool calling.
→ GPT-Realtime-2 vs Gemini Live API: what to choose in 2026 — if you are considering voice agents instead of text-based ones.
Sources: Agent Chat GitHub repository, Spring AI Documentation, Ollama Model Library, OpenRouter API Docs