Textbook: Designing Autonomous AI Agents
Version: 2.0
Author: Kirill Shvakov
For Course: AI Agent Course
Target Audience: Programmers who want to build production AI agents
Translations
- English (EN) — English version
- Русский (RU) — Russian version
📚 Table of Contents
Part I: Fundamentals
- 00. Preface — How to use this textbook, requirements, and what an agent is
- 01. LLM Physics — Tokens, context, temperature, determinism, probabilistic nature
- 02. Prompting as Programming — ICL, Few-Shot, CoT, task structuring, SOP
Part II: Practice-first (build an agent)
- 03. Tools and Function Calling — JSON Schema, validation, error handling, tool↔runtime contract
- 04. Autonomy and Loops — ReAct loop, stopping, anti-loops, observability
- 05. Safety and Human-in-the-Loop — Confirmation, Clarification, Risk Scoring, Prompt Injection
- 06. RAG and Knowledge Base — Chunking, Retrieval, Grounding, search modes, limits
- 07. Multi-Agent Systems — Supervisor/Worker, context isolation, task routing
- 08. Evals and Reliability — Evals, prompt regressions, quality metrics, test datasets
Part III: Architecture and Runtime Core
- 09. Agent Anatomy — Memory, Tools, Planning, Runtime
- 10. Planning and Workflow Patterns — Plan→Execute, Plan-and-Revise, task decomposition, DAG/workflow, stop conditions
- 11. State Management — Tool idempotency, retries with exponential backoff, deadlines, persist state, task resumption
- 12. Agent Memory Systems — Short/long-term memory, episodic/semantic memory, forgetting/TTL, memory verification, storage/retrieval
- 13. Context Engineering — Context layers, fact selection policies, summarization, token budgets, context assembly from state+memory+retrieval
- 14. Ecosystem and Frameworks — Choosing between custom runtime and frameworks, portability, avoiding vendor lock-in
Part IV: Practice (case studies/practices)
- 15. Real-World Case Studies — Examples of agents in different domains (DevOps, Support, Data, Security, Product)
- 16. Best Practices and Application Areas — Best practices for creating and maintaining agents, application areas
Part V: Platform Infrastructure/Security
- 17. Security and Governance — Threat modeling, risk scoring, prompt injection protection (canonical), tool sandboxing, allowlists, policy-as-code, RBAC, dry-run modes, audit
- 18. Tool Protocols and Tool Servers — Tool↔runtime contract at process/service level, schema versioning, authn/authz
Part VI: Production Readiness
- 19. Observability and Tracing — Structured logging, tracing agent runs and tool calls, metrics, log correlation
- 20. Cost & Latency Engineering — Token budgets, iteration limits, caching, fallback models, batching, timeouts
- 21. Workflow and State Management in Production — Queues and asynchrony, scaling, distributed state
- 22. Prompt and Program Management — Prompt versioning, prompt regressions via evals, configs and feature flags, A/B testing
- 23. Evals in CI/CD — Quality gates in CI/CD, dataset versioning, handling flaky cases, security tests
- 24. Data and Privacy — PII detection and masking, secret protection, log redaction, log storage and TTL
- 25. Production Readiness Index — Prioritization guide (1 day / 1–2 weeks) and quick links to production topics
Appendices
- Appendix: Reference Guides — Glossary, checklists, SOP templates, decision tables, Capability Benchmark
🗺️ Reading Path
For Beginners (recommended path: practice-first)
- Start with Preface: learn what an agent is and how to use this textbook
- Study LLM Physics: the foundation for understanding everything else
- Master Prompting: the foundation of working with agents
- Build a working agent:
  - Tools and Function Calling: the agent's "hands"
  - Autonomy and Loops: how agents work in loops
  - Safety and Human-in-the-Loop: protecting against dangerous actions
- Expand capabilities:
  - RAG and Knowledge Base: working with documentation
  - Multi-Agent Systems: teams of specialized agents
  - Evals and Reliability: testing agents
- Dive deeper into architecture:
  - Agent Anatomy: components and their interactions
  - Planning and Workflow Patterns: planning complex tasks
  - State Management: execution reliability
  - Agent Memory Systems: long-term memory
  - Context Engineering: context management
- Practice: Complete laboratory assignments alongside reading chapters
For Experienced Programmers
You can skip basic chapters and go directly to:
- Tools and Function Calling
- Autonomy and Loops
- Case Studies — for understanding real-world applications
Quick Track: Core Concepts in 10 Minutes
If you're an experienced developer and want to quickly understand the essence:
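- What is an agent?
  - Agent = LLM + Tools + Memory + Planning
  - LLM is the "brain" that makes decisions
  - Tools are the "hands" that perform actions
  - Memory is history and long-term storage
  - Planning is the ability to break down a task into steps
- How does the agent loop work?
  - The Runtime sends messages[] and the tool definitions to the LLM.
  - If the LLM replies with a tool call, the Runtime executes the tool and appends the result to messages[].
  - The Runtime sends the updated history back to the LLM and repeats until the LLM returns a final answer.
- Key points:
  - The LLM doesn't execute code. It generates JSON with an execution request.
  - The Runtime (your code) executes real Go functions.
  - The LLM doesn't "remember" the past. It sees only the history in messages[], which the Runtime assembles and resends on every call.
  - Temperature = 0 makes agent behavior more deterministic, although identical outputs are not guaranteed.
- Minimal example:

  ```go
  // Fragment: assumes client, ctx, and checkStatus() string are defined elsewhere.
  // The identifiers match the go-openai package (github.com/sashabaranov/go-openai),
  // which is assumed here.

  // 1. Define the tool
  tools := []openai.Tool{{
      Type: openai.ToolTypeFunction,
      Function: &openai.FunctionDefinition{
          Name:        "check_status",
          Description: "Check server status",
      },
  }}

  // 2. Send the request to the model
  messages := []openai.ChatCompletionMessage{
      {Role: openai.ChatMessageRoleSystem, Content: "You are a DevOps engineer"},
      {Role: openai.ChatMessageRoleUser, Content: "Check server status"},
  }
  resp, err := client.CreateChatCompletion(ctx, openai.ChatCompletionRequest{
      Model:    openai.GPT3Dot5Turbo,
      Messages: messages,
      Tools:    tools,
  })
  if err != nil {
      log.Fatalf("chat completion failed: %v", err)
  }

  // 3. Check whether the model requested a tool call
  msg := resp.Choices[0].Message
  if len(msg.ToolCalls) > 0 {
      // 4. Execute the tool (Runtime)
      result := checkStatus()

      // 5. Add the assistant's tool request and the tool result to the history.
      //    The tool message must reference the ID of the call it answers.
      messages = append(messages, msg, openai.ChatCompletionMessage{
          Role:       openai.ChatMessageRoleTool,
          Content:    result,
          ToolCallID: msg.ToolCalls[0].ID,
      })

      // 6. Send the updated history back to the model
  }
  ```

- What to read next:
  - Chapter 03: Tools (detailed protocol)
  - Chapter 04: Autonomy (agent loop)
  - Chapter 09: Agent Anatomy (architecture)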
After Completing the Main Course
After studying chapters 1-16, proceed to:
- Part V: Platform Infrastructure/Security — security, governance, tool protocols
- Part VI: Production Readiness — practical guide to production readiness with step-by-step implementation recipes
🔗 Connection with Laboratory Assignments
| Textbook Chapter | Corresponding Laboratory Assignments |
|---|---|
| 01. LLM Physics | Lab 00 (Capability Check) |
| 02. Prompting | Lab 01 (Basics) |
| 03. Tools | Lab 02 (Tools), Lab 03 (Architecture) |
| 04. Autonomy | Lab 04 (Autonomy) |
| 05. Safety | Lab 05 (Human-in-the-Loop) |
| 02. Prompting (SOP) | Lab 06 (Incident) |
| 06. RAG | Lab 07 (RAG) |
| 07. Multi-Agent | Lab 08 (Multi-Agent) |
| 09. Agent Anatomy | Lab 01 (Basics), Lab 09 (Context Optimization) |
| 10. Planning and Workflow Patterns | Lab 10 (Planning & Workflow) |
| 11. State Management | Lab 10 (Planning & Workflow) — partially |
| 12. Agent Memory Systems, 13. Context Engineering | Lab 11 (Memory & Context Engineering) |
| 18. Tool Protocols and Tool Servers | Lab 12 (Tool Server Protocol) |
| 17. Security and Governance | Lab 13 (Agent Security Hardening) — Optional |
| 22. Prompt and Program Management | Lab 01 (Basics) — partially |
| 23. Evals in CI/CD | Lab 14 (Evals in CI) — Optional |
📖 How to Use This Textbook
- Read sequentially — each chapter builds on previous ones
- Practice alongside reading — complete the corresponding laboratory assignment after each chapter
- Use as a reference — return to relevant sections when working on projects
- Study examples — each chapter includes examples from different domains (DevOps, Support, Data, Security, Product)
- Complete exercises — mini-exercises in each chapter help reinforce the material
- Check your understanding — use checklists for self-assessment
Happy Learning! 🚀