07. Multi-Agent Systems¶
Why This Chapter?¶
A single "jack-of-all-trades" agent struggles once it has too many tools (roughly 20+): during tool selection the model starts picking the wrong tool, or none at all, and makes mistakes.
It's more efficient to divide responsibility: create a team of narrow specialists, managed by a main agent (Supervisor). Each specialist knows only their tools and focuses on their area.
Real-World Case Study¶
Situation: You've created a DevOps agent with 15 tools: network checks, database work, service management, logs, metrics, security, etc.
Problem: The agent gets confused with tools. When a user asks "Check DB availability and find out version", the agent may call the wrong tool or skip a step.
Solution: Multi-Agent system with Supervisor and specialists. Supervisor delegates the task to Network Expert for availability check and DB Expert for version retrieval. Each specialist focuses only on their area.
Theory in Simple Terms¶
How Does Multi-Agent Work?¶
- Supervisor receives a task from the user
- Supervisor analyzes the task and decides which specialists are needed
- Supervisor calls specialists via tool calls
- Specialists perform tasks in isolated context
- Results are returned to Supervisor, who assembles the response
Takeaway: Context isolation — each specialist receives only their task, not the entire Supervisor history. This saves tokens and helps focus attention.
Supervisor Pattern (Boss-Subordinate)¶
Architecture:
- Supervisor: Main brain. Has no domain tools of its own — its only tools are the ones that call the specialists — but knows who can do what.
- Workers: Specialized agents with a narrow set of tools.
Context isolation: Worker doesn't see the entire Supervisor conversation, only their task. This saves tokens and focuses attention.
graph TD
User[User] --> Supervisor[Supervisor Agent]
Supervisor --> Worker1[Network Specialist]
Supervisor --> Worker2[DB Specialist]
Supervisor --> Worker3[Security Specialist]
Worker1 --> Supervisor
Worker2 --> Supervisor
Worker3 --> Supervisor
Supervisor --> User
DevOps Example — Magic vs Reality¶
Magic:
Supervisor "thinks" and "delegates" tasks to specialists
Reality: a chain of ordinary API calls and tool calls, walked through step by step below.
How Multi-Agent Works in Practice¶
Step 1: Supervisor Receives Task
// Supervisor has tools to call Workers
supervisorTools := []openai.Tool{
	{
		Type: openai.ToolTypeFunction, // go-openai requires the tool type to be set
		Function: &openai.FunctionDefinition{
			Name:        "ask_network_expert",
			Description: "Ask the network specialist about connectivity, pings, ports",
			Parameters: json.RawMessage(`{
				"type": "object",
				"properties": {
					"question": {"type": "string"}
				},
				"required": ["question"]
			}`),
		},
	},
	{
		Type: openai.ToolTypeFunction,
		Function: &openai.FunctionDefinition{
			Name:        "ask_database_expert",
			Description: "Ask the DB specialist about SQL, schemas, data",
			Parameters: json.RawMessage(`{
				"type": "object",
				"properties": {
					"question": {"type": "string"}
				},
				"required": ["question"]
			}`),
		},
	},
}
supervisorMessages := []openai.ChatCompletionMessage{
{Role: "system", Content: "You are a Supervisor. Delegate tasks to specialists."},
{Role: "user", Content: "Check if DB server is available, and if yes — find out version"},
}
Step 2: Supervisor Generates Tool Calls for Workers
supervisorResp, _ := client.CreateChatCompletion(ctx, openai.ChatCompletionRequest{
Model: "gpt-4o",
Messages: supervisorMessages,
Tools: supervisorTools,
})
supervisorMsg := supervisorResp.Choices[0].Message
// supervisorMsg.ToolCalls = [
// {Function: {Name: "ask_network_expert", Arguments: "{\"question\": \"Check availability of db-host.example.com\"}"}},
// {Function: {Name: "ask_database_expert", Arguments: "{\"question\": \"What PostgreSQL version on db-host?\"}"}},
// ]
Why did Supervisor call both tools?
- Supervisor receives task "check availability" → links to Network Expert
- Supervisor receives "find out version" → links to DB Expert
- Supervisor understands sequence: first network, then DB
Step 3: Runtime (Your Code) Calls Worker for Network Expert
Note: Runtime is the agent code you write in Go. See Chapter 00: Preface for definition.
// Runtime intercepts tool call "ask_network_expert"
func askNetworkExpert(question string) string {
// Create NEW context for Worker (isolation!)
workerMessages := []openai.ChatCompletionMessage{
{Role: "system", Content: "You are a Network Specialist. Use ping tool to check connectivity."},
{Role: "user", Content: question}, // Only question, without entire Supervisor history!
}
// Worker has its own tools
	workerTools := []openai.Tool{
		{
			Type: openai.ToolTypeFunction, // go-openai requires the tool type to be set
			Function: &openai.FunctionDefinition{
				Name:        "ping",
				Description: "Ping a host to check connectivity",
				Parameters: json.RawMessage(`{
					"type": "object",
					"properties": {"host": {"type": "string"}},
					"required": ["host"]
				}`),
			},
		},
	}
// Run Worker as separate agent
workerResp, _ := client.CreateChatCompletion(ctx, openai.ChatCompletionRequest{
Model: "gpt-4o-mini",
Messages: workerMessages, // Isolated context!
Tools: workerTools,
})
workerMsg := workerResp.Choices[0].Message
// workerMsg.ToolCalls = [{Function: {Name: "ping", Arguments: "{\"host\": \"db-host.example.com\"}"}}]
	// Execute ping (a real runtime would parse the host out of
	// workerMsg.ToolCalls[0].Function.Arguments instead of hardcoding it)
	pingResult := ping("db-host.example.com") // "Host is reachable"
// Worker receives result and formulates response
	workerMessages = append(workerMessages, workerMsg)
	workerMessages = append(workerMessages, openai.ChatCompletionMessage{
		Role:       "tool",
		Content:    pingResult,
		ToolCallID: workerMsg.ToolCalls[0].ID, // link the result to the tool call
	})
workerResp2, _ := client.CreateChatCompletion(ctx, openai.ChatCompletionRequest{
Model: "gpt-4o-mini",
Messages: workerMessages,
Tools: workerTools,
})
// Return Worker's final response to Supervisor
return workerResp2.Choices[0].Message.Content // "Host db-host.example.com is reachable"
}
Key isolation point:
- Worker doesn't see entire Supervisor history
- Worker receives only its question and its context
- This saves tokens and focuses Worker's attention
Step 4: Runtime (Your Code) Calls Worker for DB Expert
func askDatabaseExpert(question string) string {
// Similar to Network Expert, but with different tools
workerMessages := []openai.ChatCompletionMessage{
{Role: "system", Content: "You are a DB Specialist. Use SQL tools."},
{Role: "user", Content: question}, // Isolated context!
}
	workerTools := []openai.Tool{
		{
			Type: openai.ToolTypeFunction,
			Function: &openai.FunctionDefinition{
				Name:        "sql_query",
				Description: "Execute a SELECT query",
				Parameters: json.RawMessage(`{
					"type": "object",
					"properties": {"query": {"type": "string"}},
					"required": ["query"]
				}`),
			},
		},
	}
// Worker generates SQL and executes
// Returns: "PostgreSQL 15.2"
return "PostgreSQL 15.2"
}
Step 5: Worker Results Returned to Supervisor
// Runtime adds results as tool messages
supervisorMessages = append(supervisorMessages, supervisorMsg)
supervisorMessages = append(supervisorMessages, openai.ChatCompletionMessage{
Role: "tool",
Content: askNetworkExpert("Check availability of db-host.example.com"), // "Host is reachable"
ToolCallID: supervisorMsg.ToolCalls[0].ID,
})
supervisorMessages = append(supervisorMessages, openai.ChatCompletionMessage{
Role: "tool",
Content: askDatabaseExpert("What PostgreSQL version on db-host?"), // "PostgreSQL 15.2"
ToolCallID: supervisorMsg.ToolCalls[1].ID,
})
Step 6: Supervisor Collects Results and Responds
// Send Worker results to Supervisor
supervisorResp2, _ := client.CreateChatCompletion(ctx, openai.ChatCompletionRequest{
Model: "gpt-4o",
Messages: supervisorMessages, // Supervisor receives results from both Workers!
Tools: supervisorTools,
})
finalMsg := supervisorResp2.Choices[0].Message
// finalMsg.Content = "DB server is available (ping successful). PostgreSQL version: 15.2"
Why this isn't magic:
- Supervisor calls Workers as regular tools — this isn't "delegation", but a tool call
- Workers are separate agents — each with its own context and tools
- Context isolation — Worker doesn't see Supervisor history, only its question
- Runtime manages everything — it intercepts Supervisor tool calls, runs Workers, collects results
Takeaway: Multi-Agent isn't magic "commanding", but a mechanism for calling specialized agents via tool calls with context isolation.
Other Multi-Agent System Patterns¶
Supervisor/Worker is the basic pattern. In practice, others are used as well.
Pattern: Router Agent¶
Router Agent receives a request and routes it to one matching specialist. Unlike Supervisor, Router does not coordinate multiple agents — it selects one.
┌──────┐     ┌────────┐     ┌────────────────┐
│ User │────→│ Router │────→│ Network Agent  │
└──────┘     │        │     └────────────────┘
             │        │     ┌────────────────┐
             │        │────→│ DB Agent       │
             │        │     └────────────────┘
             │        │     ┌────────────────┐
             │        │────→│ Security Agent │
             └────────┘     └────────────────┘
Implementation:
// Router determines which agent should handle the request
func routeRequest(query string, client *openai.Client) (string, error) {
resp, err := client.CreateChatCompletion(ctx, openai.ChatCompletionRequest{
Model: "gpt-4o-mini", // Cheap model — classification task
Messages: []openai.ChatCompletionMessage{
{
Role: openai.ChatMessageRoleSystem,
Content: `Classify the request and route to the correct specialist.
Available specialists:
- "network": connectivity, DNS, ports, network troubleshooting
- "database": SQL, schemas, queries, DB performance
- "security": access, vulnerabilities, incidents
Return ONLY the specialist name.`,
},
{Role: openai.ChatMessageRoleUser, Content: query},
},
Temperature: 0,
})
if err != nil {
return "", err
}
return strings.TrimSpace(resp.Choices[0].Message.Content), nil
}
// Usage
specialist, _ := routeRequest("Can't connect to PostgreSQL", client)
// specialist = "database"
result := runSpecialist(specialist, "Can't connect to PostgreSQL") // Run the matching agent
When Router beats Supervisor:
- Request belongs to a single domain (no coordination needed)
- Minimal latency required (one LLM call for routing instead of several)
- You have 10+ specialists — Router scales more easily
Pattern: Handoffs (Context Transfer)¶
A Handoff is when one agent transfers control to another, including part of the context. Unlike Supervisor/Worker, there is no "boss" here — agents are peers.
Warm handoff — next agent receives full context:
type Handoff struct {
FromAgent string // Who transfers
ToAgent string // Who receives
Context []openai.ChatCompletionMessage // What is transferred
Reason string // Why
}
func performHandoff(h Handoff, client *openai.Client) (string, error) {
// Build context for receiving agent
handoffMessages := []openai.ChatCompletionMessage{
{
Role: openai.ChatMessageRoleSystem,
Content: fmt.Sprintf(
"You are a %s specialist. You received a handoff from %s.\nReason: %s\nContinue the conversation.",
h.ToAgent, h.FromAgent, h.Reason,
),
},
}
// Append transferred context
handoffMessages = append(handoffMessages, h.Context...)
resp, err := client.CreateChatCompletion(ctx, openai.ChatCompletionRequest{
Model: "gpt-4o",
Messages: handoffMessages,
Tools: getToolsForAgent(h.ToAgent),
})
if err != nil {
return "", err
}
return resp.Choices[0].Message.Content, nil
}
Cold handoff — next agent receives only a summary:
// Instead of full context, pass a compressed summary
summary := summarizeConversation(messages)
handoff := Handoff{
FromAgent: "l1_support",
ToAgent: "l2_engineer",
Context: []openai.ChatCompletionMessage{
{Role: openai.ChatMessageRoleUser, Content: summary},
},
Reason: "Issue requires deeper investigation",
}
When to use Handoffs:
- Escalation (L1 Support → L2 Engineer)
- Domain switch (a network problem turns out to be a DB problem)
- Long-running tasks with multiple stages
Pattern: Subagents (Agent Hierarchy)¶
A Subagent is an agent that another agent spins up dynamically for a subtask. Unlike a fixed Worker in the Supervisor pattern, a Subagent is created on the fly for one specific task and discarded afterwards.
// Agent works on a task and realizes a subtask is needed
func solveWithSubagent(task string, parentTools []openai.Tool) string {
// Parent agent decomposes the task
subtasks := decomposeTask(task)
var results []string
for _, subtask := range subtasks {
// Create a Subagent for each subtask
subResult := runSubagent(subtask, parentTools)
results = append(results, subResult)
}
// Combine results
return synthesizeResults(results)
}
func runSubagent(task string, tools []openai.Tool) string {
messages := []openai.ChatCompletionMessage{
{
Role: openai.ChatMessageRoleSystem,
Content: "You are a focused agent. Complete the specific task given to you.",
},
{Role: openai.ChatMessageRoleUser, Content: task},
}
// Subagent runs in its own loop with available tools
return runAgentLoop(messages, tools, 5) // maxIterations = 5
}
Pattern: Custom DAG Workflows¶
For tasks with complex dependencies — DAG (Directed Acyclic Graph). Agents execute in dependency order; independent ones run in parallel.
type WorkflowStep struct {
AgentID string // Which agent executes
DependsOn []string // Which steps it depends on
Task string // The task
}
type Workflow struct {
Steps []WorkflowStep
}
// Example: Incident analysis
workflow := Workflow{
Steps: []WorkflowStep{
{AgentID: "log_analyzer", DependsOn: nil, Task: "Analyze logs from the last hour"},
{AgentID: "metrics_checker", DependsOn: nil, Task: "Check CPU and memory metrics"},
{AgentID: "correlator", DependsOn: []string{"log_analyzer", "metrics_checker"}, Task: "Correlate results"},
{AgentID: "reporter", DependsOn: []string{"correlator"}, Task: "Write incident report"},
},
}
// log_analyzer and metrics_checker run in parallel (no dependencies)
// correlator waits for both results
// reporter waits for correlator
For more on workflow patterns, see Chapter 10: Planning and Workflows.
A2A (Agent-to-Agent) Protocol¶
In the patterns above, agents communicate through runtime (your code). A2A is a standardized protocol for inter-agent communication, proposed by Google.
Key concepts:
- Agent Card — JSON description of an agent: what it can do, what tasks it accepts.
- Task — unit of work with a lifecycle: submitted → working → completed/failed.
- Message/Artifact — data exchange between agents.
// Agent Card — agent description for other agents
type AgentCard struct {
Name string `json:"name"`
Description string `json:"description"`
URL string `json:"url"` // Agent endpoint
Capabilities []string `json:"capabilities"` // What it can do
InputSchema json.RawMessage `json:"input_schema"` // What data it accepts
}
// Task — unit of work
type A2ATask struct {
ID string `json:"id"`
Status string `json:"status"` // "submitted", "working", "completed", "failed"
Input string `json:"input"`
Output string `json:"output,omitempty"`
}
// Client agent sends a task to another agent
func sendA2ATask(agentURL string, task A2ATask) (*A2ATask, error) {
body, _ := json.Marshal(task)
resp, err := http.Post(agentURL+"/tasks", "application/json", bytes.NewReader(body))
if err != nil {
return nil, err
}
defer resp.Body.Close()
	var result A2ATask
	if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
		return nil, err
	}
	return &result, nil
}
When you need A2A:
- Agents run on different servers (microservice architecture)
- Agents are written in different languages (Go + Python)
- You need a standard interface for integrating with external agents
When A2A is overkill:
- All agents in one process — use tool calls (Supervisor/Worker)
- Prototype — start with plain functions
For more on A2A, see Chapter 18: Tool Protocols.
Scaling Multi-Agent Systems¶
As load grows, a single Supervisor isn't enough.
Worker Pool:
type WorkerPool struct {
workers map[string]chan WorkerTask // Task channel for each Worker type
results chan WorkerResult
}
func (p *WorkerPool) Submit(agentType string, task string) {
p.workers[agentType] <- WorkerTask{Task: task, ResultCh: p.results}
}
Load balancing:
- Round-robin — distribute tasks evenly across Workers
- By load — send to an idle Worker
- By specialization — Router picks a Worker by task type
Decision Table: When to Use Which Pattern¶
| Pattern | When to Use | Complexity |
|---|---|---|
| Supervisor/Worker | Task requires coordinating multiple specialists | Medium |
| Router | Request belongs to a single domain | Low |
| Handoffs | Escalation, domain switch | Medium |
| Subagents | Dynamic task decomposition | High |
| DAG Workflow | Tasks with complex dependencies | High |
| A2A | Distributed agents on different servers | High |
Common Errors¶
Error 1: No Context Isolation¶
Symptom: context overflows, token usage balloons, and the Worker gets distracted by irrelevant history.
Cause: the Worker is given the full Supervisor message history instead of an isolated context.
Solution:
// BAD: Worker receives entire Supervisor history
workerMessages := supervisorMessages // Full history!
// GOOD: Worker receives only its question
workerMessages := []openai.ChatCompletionMessage{
{Role: "system", Content: "You are a Network Specialist."},
{Role: "user", Content: question}, // Only question!
}
Error 2: Supervisor Doesn't Know Who to Call¶
Symptom: Supervisor doesn't call needed specialists or calls wrong ones.
Cause: Tool descriptions for calling Workers are not clear enough.
Solution:
// GOOD: Clear description of when to call each specialist
{
Name: "ask_network_expert",
Description: "Ask the network specialist about connectivity, pings, ports, network troubleshooting. Use this when user asks about network issues, connectivity, or network-related problems.",
},
{
Name: "ask_database_expert",
Description: "Ask the DB specialist about SQL, schemas, data, database queries. Use this when user asks about database, SQL, or data-related questions.",
},
Error 3: Worker Doesn't Return Result¶
Symptom: Supervisor doesn't receive answer from Worker or receives empty answer.
Cause: Worker doesn't complete its work or result isn't returned to Supervisor.
Solution:
// GOOD: Worker completes work and returns result
func askNetworkExpert(question string) string {
// ... Worker performs task ...
// Return Worker's final response
return workerResp2.Choices[0].Message.Content // "Host is reachable"
}
// Supervisor receives result
supervisorMessages = append(supervisorMessages, openai.ChatCompletionMessage{
Role: "tool",
Content: askNetworkExpert("..."), // Worker result
ToolCallID: supervisorMsg.ToolCalls[0].ID,
})
Mini-Exercises¶
Exercise 1: Implement Context Isolation¶
Implement a function to create isolated context for Worker:
func createWorkerContext(question string, workerRole string) []openai.ChatCompletionMessage {
// Create isolated context for Worker
// Only System Prompt and user question
}
Expected result:
- Worker receives only System Prompt and its question
- Worker doesn't see Supervisor history
Exercise 2: Implement Supervisor with Two Specialists¶
Create a Supervisor with two specialists (Network Expert and DB Expert):
Expected result:
- Supervisor can call both specialists
- Tool descriptions are clear and understandable
- Supervisor correctly selects specialist for task
Completion Criteria / Checklist¶
Completed:
- Supervisor correctly delegates tasks to specialists
- Workers operate in isolated context
- Worker results are returned to Supervisor
- Supervisor collects results and formulates final response
- Tool descriptions for calling Workers are clear
Not completed:
- Worker receives entire Supervisor history (no isolation)
- Supervisor doesn't know who to call (poor descriptions)
- Worker doesn't return result to Supervisor
- Supervisor doesn't collect results from Workers
Production Notes¶
When using Multi-Agent systems in production:
- Correlation by run_id: use a single run_id for the entire Supervisor → Worker → Tool chain. This lets you track the full request path in logs.
- Chain tracing: trace each step of the chain (Supervisor → Worker → Tool) for debugging. More: Chapter 19: Observability and Tracing.
- Context isolation: each Worker must have its own isolated context (already described above). This is critical for preventing context overflow.
Connection with Other Chapters¶
- Tools: How Supervisor calls Workers via tool calls, see Chapter 03: Tools
- Autonomy: How Supervisor manages the work loop, see Chapter 04: Autonomy
- Architecture: Agent components (Runtime, Memory, Planning) that multi-agent systems are built from, see Chapter 09: Agent Architecture
What's Next?¶
After studying Multi-Agent, proceed to:
- 08. Evals and Reliability — how to test agents