The 5 AI Agent Patterns That Actually Work
Building an AI agent isn't just about connecting an LLM to tools and hoping for the best. There are proven architectures — patterns — that structure how an agent reasons, acts, and improves. Some are simple and robust, others are sophisticated and powerful.
In this article, we review the 5 AI agent patterns that actually work in production. For each one, you'll find a clear explanation, an architecture diagram, real-world use cases, and an honest analysis of strengths and weaknesses.
🔄 Pattern 1: ReAct (Reasoning + Acting)
How It Works
ReAct is the most popular and intuitive pattern. Published by Yao et al. in 2022, it combines reasoning and acting in an iterative loop. At each step, the agent:
- Thinks (Thought) — Analyzes the situation, plans the next step
- Acts (Action) — Calls a tool or performs an operation
- Observes (Observation) — Receives the result of the action
- Repeats until it has enough information to respond
Architecture Diagram
```
User question
      │
      ▼
 ┌─────────┐
 │ THOUGHT │ ← "I need to look up the population of France"
 └────┬────┘
      │
      ▼
 ┌─────────┐
 │ ACTION  │ ← search("population France 2024")
 └────┬────┘
      │
      ▼
 ┌─────────┐
 │ OBSERVE │ ← "68.4 million inhabitants"
 └────┬────┘
      │
      ▼
 ┌─────────┐
 │ THOUGHT │ ← "I have the info, I can respond"
 └────┬────┘
      │
      ▼
┌──────────┐
│ RESPONSE │ ← "France has 68.4 million inhabitants"
└──────────┘
```
Implementation Example
```python
import re

class ReActAgent:
    def __init__(self, llm, tools):
        self.llm = llm
        self.tools = tools
        self.max_iterations = 10

    def run(self, question: str) -> str:
        prompt = f"""You are a ReAct agent. For each step:
Thought: your reasoning
Action: tool_name(params)
Observation: (will be filled automatically)
...
When you have the final answer:
Thought: I have enough information
Final Answer: your answer

Question: {question}"""
        history = prompt
        for _ in range(self.max_iterations):
            response = self.llm.complete(history)
            history += response
            if "Final Answer:" in response:
                return response.split("Final Answer:")[-1].strip()
            if "Action:" in response:
                action = self.parse_action(response)
                result = self.execute_tool(action)
                history += f"\nObservation: {result}\n"
        return "Sorry, I couldn't find the answer."

    def parse_action(self, text):
        # Extract tool name and parameters from a line like: Action: search("query")
        action_line = next(l for l in text.split("\n") if l.startswith("Action:"))
        match = re.match(r"Action:\s*(\w+)\((.*)\)", action_line)
        name, raw_args = match.group(1), match.group(2)
        params = [p.strip().strip("'\"") for p in raw_args.split(",") if p.strip()]
        return {"name": name, "params": params}

    def execute_tool(self, action):
        tool = self.tools.get(action["name"])
        if tool is None:
            return f"Tool '{action['name']}' not found"
        return tool(*action["params"])
```
Ideal Use Cases
- Information retrieval: Factual questions requiring multiple sources
- Conversational assistants: Chatbots with API access
- Diagnostic tasks: Step-by-step problem investigation
- Document Q&A: Searching and synthesizing information
Strengths and Weaknesses
| Strengths | Weaknesses |
|---|---|
| Simple to implement | Can loop on complex tasks |
| Transparent reasoning | No long-term planning |
| Works with all LLMs | High token cost (full history) |
| Very well documented | Sensitive to tool errors |
| Easy debugging (visible trace) | Iteration count must be manually limited |
📋 Pattern 2: Plan-and-Execute
How It Works
Unlike ReAct, which reasons step by step, Plan-and-Execute separates planning from execution. The agent starts by creating a complete plan, then executes each step sequentially. If a step fails, it can re-plan.
This pattern is inspired by work on BabyAGI and hierarchical planners. It excels at complex tasks that require a big-picture view.
Architecture Diagram
```
    User question
         │
         ▼
┌──────────────────┐
│     PLANNER      │
│  (powerful LLM)  │
│                  │
│  Plan:           │
│  1. Search X     │
│  2. Analyze Y    │
│  3. Calculate Z  │
│  4. Synthesize   │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐      ┌──────────┐
│     EXECUTOR     │─────▶│  Step 1  │ ✅
│  (LLM + tools)   │      └──────────┘
│                  │      ┌──────────┐
│                  │─────▶│  Step 2  │ ✅
│                  │      └──────────┘
│                  │      ┌──────────┐
│                  │─────▶│  Step 3  │ ❌ Failed!
│                  │      └──────────┘
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│    RE-PLANNER    │
│  Adjusts plan:   │
│  3b. Alternative │
│  4. Synthesize   │
└──────────────────┘
```
Implementation Example
```python
import json

class PlanAndExecuteAgent:
    def __init__(self, planner_llm, executor_llm, tools):
        self.planner = planner_llm    # Powerful model (e.g., GPT-4, Claude Opus)
        self.executor = executor_llm  # Fast model (e.g., GPT-4o-mini, Haiku)
        self.tools = tools

    def run(self, task: str) -> str:
        # Phase 1: Planning
        plan = self.create_plan(task)
        results = []
        # Phase 2: Execution (a while loop, so a re-plan takes effect immediately)
        i, replans = 0, 0
        while i < len(plan):
            step = plan[i]
            try:
                result = self.execute_step(step, results)
                results.append({"step": step, "result": result, "status": "success"})
            except Exception as e:
                results.append({"step": step, "result": str(e), "status": "failed"})
                replans += 1
                if replans > 3:  # Give up after too many re-plans
                    break
                # Re-plan, then retry from the replacement step
                plan = self.replan(task, plan, results, i)
                continue
            i += 1
        # Phase 3: Synthesis
        return self.synthesize(task, results)

    def create_plan(self, task: str) -> list:
        prompt = f"""Create a step-by-step plan to accomplish this task.
Return a JSON list of steps.

Task: {task}
Available tools: {list(self.tools.keys())}"""
        response = self.planner.complete(prompt)
        return json.loads(response)

    def execute_step(self, step: dict, previous_results: list) -> str:
        context = "\n".join(
            f"Step {i+1}: {r['result']}"
            for i, r in enumerate(previous_results)
        )
        prompt = f"""Execute this step using the available tools.

Context from previous steps:
{context}

Step to execute: {step['description']}"""
        return self.executor.complete_with_tools(prompt, self.tools)

    def replan(self, task, original_plan, results, failed_index):
        prompt = f"""The original plan failed at step {failed_index + 1}.

Original plan: {json.dumps(original_plan)}
Results: {json.dumps(results)}

Create a new plan starting from step {failed_index + 1}."""
        new_steps = json.loads(self.planner.complete(prompt))
        return original_plan[:failed_index] + new_steps

    def synthesize(self, task, results):
        prompt = f"""Synthesize the results to answer the task.

Task: {task}
Results: {json.dumps(results)}"""
        return self.planner.complete(prompt)
```
Ideal Use Cases
- Report writing: Research → Analysis → Writing → Proofreading
- Code projects: Analysis → Design → Implementation → Testing
- Business workflows: Market research → Recommendations → Action plan
- Predictable multi-step tasks: Data pipelines, ETL
Strengths and Weaknesses
| Strengths | Weaknesses |
|---|---|
| Big-picture view before action | Initial planning can be poor |
| Ability to re-plan | Slower (2 phases) |
| Lightweight model for execution (cost savings) | Coupling between steps sometimes poorly handled |
| Ideal for long, structured tasks | Plan can become outdated mid-execution |
| Clear execution trace | More complex to implement than ReAct |
🪞 Pattern 3: Reflexion
How It Works
Reflexion adds a layer of self-evaluation to the agent. After producing a result, the agent critiques its own work and iterates to improve it. It's the AI equivalent of proofreading your paper before turning it in.
Published by Shinn et al. in 2023, this pattern showed significant improvements on code benchmarks (HumanEval) and reasoning tasks.
Architecture Diagram
```
  User task
      │
      ▼
┌───────────┐
│   ACTOR   │ ← Produces a first response/solution
└─────┬─────┘
      │
      ▼
┌───────────┐
│ EVALUATOR │ ← Analyzes quality (tests, criteria, score)
└─────┬─────┘
      │
┌─────▼─────┐
│ Score OK? │──── YES ───▶ Final response ✅
└─────┬─────┘
      │ NO
      ▼
┌───────────┐
│ REFLEXION │ ← "The answer is missing X, the error is Y"
└─────┬─────┘
      │
      ▼
┌───────────┐
│   ACTOR   │ ← New attempt with reflection as context
└─────┬─────┘
      │
      ▼
   (loop)
```
Implementation Example
```python
import json

class ReflexionAgent:
    def __init__(self, llm, evaluator, max_retries=3):
        self.llm = llm
        self.evaluator = evaluator
        self.max_retries = max_retries

    def run(self, task: str) -> str:
        reflections = []
        best = ("", -1)  # (response, score)
        for attempt in range(self.max_retries + 1):
            # Phase 1: Produce a response
            response = self.act(task, reflections)
            # Phase 2: Evaluate
            evaluation = self.evaluate(task, response)
            if evaluation["passed"]:
                return response
            if evaluation["score"] > best[1]:
                best = (response, evaluation["score"])
            # Phase 3: Reflect
            reflection = self.reflect(task, response, evaluation)
            reflections.append(reflection)
        # Out of retries: return the best attempt
        return best[0]

    def act(self, task: str, reflections: list) -> str:
        reflection_context = ""
        if reflections:
            reflection_context = "\n\nReflections on your previous attempts:\n"
            for i, r in enumerate(reflections):
                reflection_context += f"\nAttempt {i+1}: {r}\n"
        prompt = f"""Complete this task to the best of your ability.
{reflection_context}
Task: {task}"""
        return self.llm.complete(prompt)

    def evaluate(self, task: str, response: str) -> dict:
        # Can be an LLM, automated tests, or a score
        prompt = f"""Evaluate this response on a scale of 1-10.
Identify specific issues.

Task: {task}
Response: {response}

Return JSON: {{"score": X, "passed": bool, "issues": [...]}}"""
        result = self.evaluator.complete(prompt)
        return json.loads(result)

    def reflect(self, task, response, evaluation) -> str:
        prompt = f"""Analyze why this response is unsatisfactory
and provide specific instructions for improvement.

Task: {task}
Response: {response}
Issues: {evaluation['issues']}

Reflection:"""
        return self.llm.complete(prompt)
```
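As the comment in `evaluate` notes, the evaluator doesn't have to be an LLM. For code generation in particular, running the generated code against unit tests gives an objective pass/fail signal. Here is a minimal sketch (the `run_tests_evaluator` name and score values are illustrative, not part of the pattern); you would override `evaluate` to call something like it instead of the LLM:

```python
import os
import subprocess
import sys
import tempfile

def run_tests_evaluator(code: str, test_code: str) -> dict:
    """Run candidate code plus its tests in a subprocess, and map the
    outcome onto the {"score", "passed", "issues"} shape the agent expects."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code + "\n\n" + test_code + "\n")
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, path],
            capture_output=True, text=True, timeout=30,
        )
    finally:
        os.unlink(path)
    passed = proc.returncode == 0
    return {
        "score": 10 if passed else 2,
        "passed": passed,
        # Keep the tail of the traceback as the "issue" to reflect on
        "issues": [] if passed else [proc.stderr.strip()[-500:]],
    }
```

Because the signal comes from real test execution, this evaluator can't be "wrong" the way an LLM judge can, which is one reason Reflexion shines on code benchmarks.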
Ideal Use Cases
- Code generation: Write → Test → Fix
- Content writing: Write → Evaluate quality → Rewrite
- Math problem solving: Calculate → Verify → Correct
- Translation: Translate → Evaluate fidelity → Refine
Strengths and Weaknesses
| Strengths | Weaknesses |
|---|---|
| Progressive quality improvement | Cost multiplied (N attempts) |
| Self-correction of errors | The evaluator can be wrong too |
| Works well for code (automated tests) | Risk of going in circles |
| Rich reasoning trace | Model may "over-correct" |
| Simulates trial-and-error learning | Higher latency |
🤝 Pattern 4: Multi-Agent
How It Works
Instead of a single agent doing everything, the multi-agent pattern distributes work among several specialized agents. Each agent has a specific role, dedicated tools, and potentially a different LLM.
This pattern was popularized by frameworks like CrewAI, AutoGen, and LangGraph. It draws from how human teams work: specialization and collaboration.
Architecture Diagram
```
                  User task
                      │
                      ▼
             ┌─────────────────┐
             │  ORCHESTRATOR   │
             │  (Coordinator)  │
             └────────┬────────┘
                      │
       ┌──────────────┼──────────────┐
       ▼              ▼              ▼
┌────────────┐ ┌────────────┐ ┌────────────┐
│   AGENT    │ │   AGENT    │ │   AGENT    │
│  Research  │ │  Analysis  │ │  Writing   │
│            │ │            │ │            │
│  Tools:    │ │  Tools:    │ │  Tools:    │
│  - Web     │ │  - Python  │ │  - Markdown│
│  - News API│ │  - Charts  │ │  - CMS     │
└──────┬─────┘ └──────┬─────┘ └──────┬─────┘
       │              │              │
       └──────────────┼──────────────┘
                      │
                      ▼
                Final result
```
Common Topologies
There are several ways to organize agents:
| Topology | Description | Example |
|---|---|---|
| Sequential | Agent A → Agent B → Agent C | Content pipeline |
| Hierarchical | Manager delegates to workers | Development team |
| Debate | Agents argue for/against | Decision analysis |
| Peer-to-peer | Agents communicate freely | Brainstorming |
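The sequential and hierarchical topologies are implemented in the example below; the debate topology is simple enough to sketch on its own. This assumes only the minimal `.complete(prompt)` LLM interface used throughout this article (`run_debate` and its parameter names are illustrative):

```python
def run_debate(question: str, pro_llm, con_llm, judge_llm, rounds: int = 2) -> str:
    """Debate topology: two agents argue opposite sides of a question
    for a fixed number of rounds, then a judge renders a verdict."""
    transcript = ""
    for r in range(1, rounds + 1):
        pro = pro_llm.complete(
            f"Argue FOR: {question}\nDebate so far:{transcript}")
        transcript += f"\n[PRO, round {r}]: {pro}"
        con = con_llm.complete(
            f"Argue AGAINST: {question}\nDebate so far:{transcript}")
        transcript += f"\n[CON, round {r}]: {con}"
    # The judge sees the full transcript, not just the last round
    return judge_llm.complete(
        f"Question: {question}\nDebate:{transcript}\n"
        "Weigh both sides and give a final, justified verdict.")
```

Note that each debater sees the opponent's previous arguments via the shared transcript; that accumulation is what makes a debate more than two independent opinions.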
Implementation Example
```python
class Agent:
    def __init__(self, name, role, llm, tools=None):
        self.name = name
        self.role = role
        self.llm = llm
        self.tools = tools or []

    def execute(self, task: str, context: str = "") -> str:
        prompt = f"""You are {self.name}, a specialized agent.
Your role: {self.role}

Context from previous steps:
{context}

Task: {task}"""
        return self.llm.complete_with_tools(prompt, self.tools)


class MultiAgentOrchestrator:
    def __init__(self):
        self.agents = {}

    def add_agent(self, agent: Agent):
        self.agents[agent.name] = agent

    def run_sequential(self, task: str, agent_order: list) -> str:
        """Sequential execution: each agent passes its result to the next."""
        context = ""
        for agent_name in agent_order:
            agent = self.agents[agent_name]
            result = agent.execute(task, context)
            context += f"\n\n[{agent.name}]:\n{result}"
        return context

    def run_hierarchical(self, task: str, manager_name: str) -> str:
        """The manager decides which agent to call and when."""
        manager = self.agents[manager_name]
        workers = {k: v for k, v in self.agents.items() if k != manager_name}
        worker_descriptions = "\n".join(
            f"- {name}: {a.role}" for name, a in workers.items()
        )
        context = ""
        for _ in range(10):  # Max 10 delegations
            prompt = f"""You are the manager. Here's your team:
{worker_descriptions}

Overall task: {task}
Work done: {context}

Delegate the next subtask to an agent, or say DONE if everything is complete.
Format: DELEGATE:agent_name:subtask OR DONE:summary"""
            decision = manager.llm.complete(prompt)
            if decision.startswith("DONE"):
                return decision.split(":", 1)[-1]  # Tolerates a bare "DONE"
            if decision.startswith("DELEGATE"):
                _, agent_name, subtask = decision.split(":", 2)
                agent = workers.get(agent_name.strip())
                if agent:
                    result = agent.execute(subtask, context)
                    context += f"\n[{agent.name}]: {result}"
        return context


# Usage (fast_llm, smart_llm, and the tool objects are placeholders
# for your own LLM clients and tool implementations)
orchestrator = MultiAgentOrchestrator()
orchestrator.add_agent(Agent(
    "researcher", "Web research expert",
    llm=fast_llm, tools=[web_search, web_fetch]
))
orchestrator.add_agent(Agent(
    "analyst", "Data analysis expert",
    llm=smart_llm, tools=[python_exec, chart_gen]
))
orchestrator.add_agent(Agent(
    "writer", "Professional writer",
    llm=smart_llm, tools=[markdown_gen]
))
orchestrator.add_agent(Agent(
    "manager", "Project manager who coordinates the team",
    llm=smart_llm
))

# Hierarchical execution
result = orchestrator.run_hierarchical(
    "Write a report on AI trends in 2025",
    manager_name="manager"
)
```
Ideal Use Cases
- Content production: Research → Writing → Proofreading → SEO
- Software development: Architect → Developer → Tester → Reviewer
- Business analysis: Data collection → Analysis → Recommendations → Report
- Customer support: Triage → Specialist → Validation → Response
Strengths and Weaknesses
| Strengths | Weaknesses |
|---|---|
| Specialization (each agent is optimized) | Coordination complexity |
| Parallelization possible | Costly inter-agent communication |
| Different models per agent (cost savings) | Harder to debug |
| Scalable (easy to add agents) | "Telephone game" risk (distorted info) |
| Mirrors how human teams work | Overhead for simple tasks |
📚 Pattern 5: Tool-Augmented RAG
How It Works
Classic RAG (Retrieval-Augmented Generation) retrieves relevant documents and injects them into the LLM's context. Tool-Augmented RAG goes further: the agent can dynamically choose its data sources and search methods via tools.
Instead of a fixed RAG pipeline (query → vector search → LLM), the agent decides:
- Where to search (vector database, SQL, API, web)
- How to search (semantic search, filters, aggregations)
- When to search (iteratively, until it has enough info)
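For contrast, the fixed pipeline the agent replaces fits in a few lines; every question triggers exactly one vector search against one source, regardless of what the question actually needs (`embed` and `vector_db` here stand in for your own embedding function and vector store):

```python
def classic_rag(question: str, embed, vector_db, llm, top_k: int = 5) -> str:
    """Classic RAG: one fixed retrieval step, then one generation step.
    No choice of source, no iteration, no refinement."""
    hits = vector_db.query(vector=embed(question), top_k=top_k)
    context = "\n\n".join(hit.text for hit in hits)
    return llm.complete(
        f"Answer using only this context:\n{context}\n\nQuestion: {question}")
```

Tool-Augmented RAG turns each of these hard-coded decisions (source, method, number of searches) into a choice the agent makes at runtime.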
Architecture Diagram
```
       User question
             │
             ▼
      ┌─────────────┐
      │  RAG AGENT  │
      │    (LLM)    │
      └──────┬──────┘
             │
             │ Dynamic source selection
             │
    ┌────────┼───────┬──────────┐
    ▼        ▼       ▼          ▼
┌────────┐ ┌────┐ ┌──────┐ ┌──────────┐
│Vector  │ │SQL │ │ Ext. │ │   Web    │
│Search  │ │    │ │ API  │ │  Search  │
│        │ │    │ │      │ │          │
│Pinecone│ │PG  │ │REST  │ │Brave/    │
│Chroma  │ │    │ │      │ │Google    │
└───┬────┘ └─┬──┘ └──┬───┘ └────┬─────┘
    │        │       │          │
    └────────┴───┬───┴──────────┘
                 │
                 ▼
          ┌──────────────┐
          │  Synthesis   │
          │ with sources │
          └──────────────┘
```
Implementation Example
```python
import json
import requests

class ToolAugmentedRAG:
    def __init__(self, llm, vector_db, sql_db, embed_fn):
        self.llm = llm
        self.vector_db = vector_db
        self.db = sql_db
        self.embed = embed_fn
        self.tools = {
            "vector_search": self.vector_search,
            "sql_query": self.sql_query,
            "web_search": self.web_search,
            "api_fetch": self.api_fetch,
        }

    def vector_search(self, query: str, collection: str, top_k: int = 5) -> list:
        """Semantic search in a vector database."""
        embedding = self.embed(query)
        results = self.vector_db.query(
            collection=collection,
            vector=embedding,
            top_k=top_k
        )
        return [{"text": r.text, "score": r.score, "source": r.metadata}
                for r in results]

    def sql_query(self, query: str) -> list:
        """Execute a read-only SQL query."""
        # Naive read-only guard; production code should also use a
        # read-only database role rather than trusting string checks
        if not query.strip().upper().startswith("SELECT"):
            raise ValueError("Only SELECT queries are allowed")
        return self.db.execute(query).fetchall()

    def web_search(self, query: str, num_results: int = 5) -> list:
        """Search the web (brave_search is your own search client)."""
        return brave_search(query, count=num_results)

    def api_fetch(self, url: str, params: dict = None) -> dict:
        """Call an external API."""
        response = requests.get(url, params=params, timeout=10)
        return response.json()

    def run(self, question: str) -> str:
        sources = []
        max_searches = 5
        for _ in range(max_searches):
            prompt = f"""You are a research agent.

Question: {question}

Sources found so far:
{json.dumps(sources, indent=2, default=str) if sources else "None"}

Available tools:
- vector_search(query, collection, top_k): Semantic search
- sql_query(query): SQL query
- web_search(query, num_results): Web search
- api_fetch(url, params): API call

Do you have enough information to answer? If yes, respond FINAL:your_answer
Otherwise, call a tool: TOOL:tool_name:params_json"""
            decision = self.llm.complete(prompt)
            if decision.startswith("FINAL:"):
                return decision[len("FINAL:"):]
            if decision.startswith("TOOL:"):
                _, tool_name, params_str = decision.split(":", 2)
                params = json.loads(params_str)
                result = self.tools[tool_name](**params)
                sources.append({
                    "tool": tool_name,
                    "params": params,
                    "result": result
                })
        # Forced synthesis after max_searches
        return self.synthesize(question, sources)

    def synthesize(self, question: str, sources: list) -> str:
        prompt = f"""Answer the question using only the sources below, citing them.

Question: {question}
Sources: {json.dumps(sources, indent=2, default=str)}"""
        return self.llm.complete(prompt)
```
Ideal Use Cases
- Advanced customer support: Search docs, tickets, CRM
- Legal analysis: Search case law + legislation
- Business intelligence: Cross-reference internal data + open sources
- Technical assistants: Documentation + logs + monitoring
Strengths and Weaknesses
| Strengths | Weaknesses |
|---|---|
| Multiple, dynamic sources | More complex than classic RAG |
| Agent chooses the best source | LLM may pick the wrong source |
| Iterative search (refinement) | Higher latency (multiple searches) |
| Fewer hallucinations (verified sources) | Cost proportional to number of searches |
| Adapts to varied questions | Requires well-described tools |
📊 Overall Comparison of the 5 Patterns
| Criteria | ReAct | Plan & Execute | Reflexion | Multi-Agent | Tool-Aug. RAG |
|---|---|---|---|---|---|
| Complexity | ⭐ | ⭐⭐ | ⭐⭐ | ⭐⭐⭐ | ⭐⭐ |
| Result quality | ⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ |
| Token cost | ⭐⭐ | ⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐ |
| Latency | ⭐ | ⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐ |
| Debugging difficulty | ⭐ | ⭐⭐ | ⭐⭐ | ⭐⭐⭐ | ⭐⭐ |
| Scalability | ⭐⭐ | ⭐⭐ | ⭐⭐ | ⭐ | ⭐⭐ |
| Use case | General | Long tasks | Critical quality | Complex projects | Info retrieval |
(⭐ = low, ⭐⭐⭐ = high; for complexity, cost, latency, and debugging difficulty, more stars mean a heavier burden)
🔀 How to Choose the Right Pattern?
Here's a quick decision tree:
```
Is the task simple (1-3 steps)?
├── YES → ReAct
└── NO → Does quality need to be maximized?
    ├── YES → Reflexion
    └── NO → Is the task centered on information retrieval?
        ├── YES → Tool-Augmented RAG
        └── NO → Does the task have predictable steps?
            ├── YES → Plan-and-Execute
            └── NO → Are varied skills needed?
                ├── YES → Multi-Agent
                └── NO → ReAct
```
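The same decision logic can be written as a small helper. This is one reasonable linearization of the tree above (the function name and boolean task traits are illustrative):

```python
def choose_pattern(simple: bool, quality_critical: bool,
                   retrieval_centric: bool, predictable_steps: bool,
                   varied_skills: bool) -> str:
    """Pick an agent pattern from a few yes/no task traits."""
    if simple:
        return "ReAct"
    if quality_critical:
        return "Reflexion"
    if retrieval_centric:
        return "Tool-Augmented RAG"
    if predictable_steps:
        return "Plan-and-Execute"
    if varied_skills:
        return "Multi-Agent"
    return "ReAct"  # Sensible default for everything else
```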
Combining Patterns
In practice, the best agents combine multiple patterns:
- ReAct + Reflexion: The agent acts step by step, then evaluates and corrects
- Plan-and-Execute + Multi-Agent: The planner delegates to specialized agents
- Multi-Agent + Tool-Augmented RAG: A researcher agent with augmented RAG
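The first combination is easy to prototype: wrap any agent exposing a `run(task)` method (like the `ReActAgent` above) in an evaluate-and-retry loop. The `PASS`/`FAIL` prompt format and the `max_rounds` value are illustrative choices, not part of either pattern:

```python
def react_with_reflexion(task: str, react_agent, evaluator_llm,
                         max_rounds: int = 3) -> str:
    """ReAct + Reflexion: run the agent, critique its answer with an
    evaluator LLM, and retry with the critique injected into the task."""
    feedback = ""
    answer = ""
    for _ in range(max_rounds):
        answer = react_agent.run(f"{task}\n{feedback}".strip())
        verdict = evaluator_llm.complete(
            f"Task: {task}\nAnswer: {answer}\n"
            "Reply PASS, or FAIL: <what is missing>.")
        if verdict.startswith("PASS"):
            return answer
        feedback = f"A reviewer flagged your previous answer: {verdict}"
    return answer  # Best effort after max_rounds
```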
OpenClaw, for example, uses a mix of ReAct (think-act-observe loop) and Tool-Augmented RAG (dynamic access to multiple tools and sources via MCP).
🚀 Tips for Implementing Your First Agent
- Start with ReAct — It's the simplest and covers 80% of use cases
- Add Reflexion if quality isn't good enough — An automated evaluator can transform a mediocre agent into an excellent one
- Move to Multi-Agent only if the problem is genuinely complex — Coordination has a cost
- Measure everything — Tokens consumed, latency, success rate, response quality
- Limit iterations — An agent that loops is expensive and produces nothing
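For the "measure everything" point, even a tiny wrapper beats nothing: latency and call counts are free to collect, while token counts depend on your LLM client's response objects. A minimal sketch (the `measured` decorator is illustrative):

```python
import time
from functools import wraps

def measured(fn):
    """Record call count, error count, and cumulative latency
    for an agent entry point."""
    stats = {"calls": 0, "errors": 0, "total_seconds": 0.0}

    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        except Exception:
            stats["errors"] += 1
            raise
        finally:
            stats["calls"] += 1
            stats["total_seconds"] += time.perf_counter() - start

    wrapper.stats = stats  # Inspect after a batch of runs
    return wrapper
```

Decorate your agent's `run` method (or any tool function) and inspect `.stats` after a batch of tasks to spot looping agents and latency outliers early.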
The future of AI agents lies in the intelligent composition of these patterns, adapted to each use case. Master them one by one, then combine them to build truly powerful systems.
📚 Related Articles
- MCP, Function Calling, Tool Use: The Complete Guide — Understand the mechanisms that let agents use tools
- Automating a Complete Pipeline with an Agent — Put these patterns into practice in a real-world case
- Automate Your Life with OpenClaw — How OpenClaw uses these patterns in production
- Configuring OpenClaw: SOUL, AGENTS, and Skills — Give your agent a personality and capabilities
- What Is OpenClaw? — The AI agent that combines ReAct and Tool-Augmented RAG