Claude Cookbooks Study, Part 5 of 8

Claude Cookbooks (5): Agent Patterns

Introduction

I opened patterns/agents/ expecting a couple of abstract diagrams and a lot of "agent-y" hype. Instead, it's a set of runnable notebooks that quietly encode the patterns I keep reaching for when I'm trying to make an LLM system reliable rather than impressive in a demo.

The big shift for me is this: "agent patterns" are mostly about controlling the shape of work (what gets produced, in what order, with what checks), not about giving the model more freedom. When a workflow feels flaky, these patterns are usually where I start.


1. Prompt Chaining

Location: patterns/agents/basic_workflows.ipynb

When I'm stuck with a task that's too big for one prompt, prompt chaining is my default move. It's boring, but it works: you force intermediate artifacts (notes → outline → draft) and you get far fewer "teleportation" answers where the model skips steps.

A common mistake (I've done this plenty) is letting the chain steps be vaguely specified. If step 1 doesn't produce a crisp artifact, step 2 becomes a mushy rewrite of mush.

from anthropic import Anthropic

client = Anthropic()

def prompt_chain(initial_input: str, steps: list[str]) -> str:
    current_input = initial_input
    
    for step in steps:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=2000,
            messages=[{
                "role": "user",
                "content": f"{step}\n\nInput:\n{current_input}"
            }]
        )
        current_input = response.content[0].text
    
    return current_input

# Example: Research -> Outline -> Draft
steps = [
    "Research this topic and list 5 key points:",
    "Create a detailed outline from these points:",
    "Write a blog post following this outline:"
]

result = prompt_chain("Quantum computing applications in finance", steps)

With Validation Gates

If the output matters (customer-facing copy, SQL, anything "ship-able"), I've found that adding a gate is more effective than telling the model to "be careful." You're basically acknowledging that the model will sometimes drift, and you're giving yourself a checkpoint to catch it early.

The trick is to keep validators short and decisive: PASS/FAIL with a reason. If the validator becomes another essay generator, it's not doing its job.
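One detail worth getting right: substring checks like `"FAIL" in text` can misfire when a PASS verdict happens to mention the word FAIL in its reason. A small parsing helper (my own sketch, assuming the "PASS or FAIL: &lt;reason&gt;" format the validator prompt asks for) is more robust, and it fails closed on off-format replies:

```python
def parse_verdict(text: str) -> tuple[bool, str]:
    """Parse a validator reply of the form 'PASS' or 'FAIL: <reason>'.

    Returns (passed, reason). Anything that doesn't start with PASS is
    treated as a failure, so an off-format reply fails closed.
    """
    first_line = text.strip().splitlines()[0] if text.strip() else ""
    if first_line.upper().startswith("PASS"):
        return True, ""
    # Keep the reason text after "FAIL:", or the whole line if unformatted
    reason = first_line.split(":", 1)[1].strip() if ":" in first_line else first_line
    return False, reason
```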

def validated_chain(initial_input: str, steps: list[dict]) -> str:
    current_input = initial_input
    
    for step in steps:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=2000,
            messages=[{"role": "user", "content": f"{step['prompt']}\n\nInput:\n{current_input}"}]
        )
        output = response.content[0].text
        
        if "validator" in step:
            validation = client.messages.create(
                model="claude-sonnet-4-20250514",
                max_tokens=500,
                messages=[{"role": "user", "content": f"{step['validator']}\n\nContent:\n{output}\n\nRespond with PASS or FAIL: <reason>"}]
            )
            verdict = validation.content[0].text.strip()
            # Check the leading token rather than substring presence, so a PASS
            # reply that happens to mention "FAIL" in its reason doesn't trip the gate
            if verdict.upper().startswith("FAIL"):
                raise ValueError(f"Validation failed: {verdict}")
        
        current_input = output
    
    return current_input
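For reference, the step dicts that `validated_chain` reads look like this (a made-up example; `"prompt"` and `"validator"` are the only keys the function touches, and `"validator"` is optional per step):

```python
# Hypothetical step definitions for validated_chain
steps = [
    {
        "prompt": "Write a SQL query answering this question:",
        "validator": "Check that this is valid, read-only SQL (SELECT only).",
    },
    {
        # No validator: a plain chain step whose output passes through unchecked
        "prompt": "Explain what this query does in one paragraph:",
    },
]

# result = validated_chain("Which customers spent the most last quarter?", steps)
```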

2. Routing

Routing is the pattern I reach for when I'm tempted to build a single "do everything" prompt. In practice, specialization beats cleverness: if you can classify inputs into a small set of intents, you can swap system prompts (or tools) and get more consistent behavior.

The part that's easy to forget: routing doesn't have to be perfect. You just need it to be good enough that the "wrong" handler isn't catastrophic. I usually add a safe fallback (like general_question) and I avoid handlers that assume access to sensitive systems unless I've got checks around them.

def classify_intent(user_input: str) -> str:
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=100,
        messages=[{
            "role": "user",
            "content": f"""Classify this request into exactly one category:
- technical_support
- billing_inquiry
- general_question
- complaint

Request: {user_input}

Category:"""
        }]
    )
    return response.content[0].text.strip().lower()

def route_request(user_input: str) -> str:
    intent = classify_intent(user_input)
    
    handlers = {
        "technical_support": "You are a technical support specialist. Provide detailed troubleshooting steps.",
        "billing_inquiry": "You are a billing specialist. Access account information and explain charges.",
        "general_question": "You are a helpful assistant. Provide clear, informative answers.",
        "complaint": "You are a customer success manager. Acknowledge concerns and offer solutions."
    }
    
    system_prompt = handlers.get(intent, handlers["general_question"])
    
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1500,
        system=system_prompt,
        messages=[{"role": "user", "content": user_input}]
    )
    
    return response.content[0].text
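Since `classify_intent` returns free text, I like to normalize it against the known set before routing. The `handlers.get` fallback above already catches unknown labels, but a normalizer (a hypothetical helper, not from the notebook) makes the "safe fallback" explicit and tolerates chatty replies like "Category: complaint":

```python
KNOWN_INTENTS = {"technical_support", "billing_inquiry", "general_question", "complaint"}

def normalize_intent(raw: str, fallback: str = "general_question") -> str:
    """Map a raw classifier reply onto a known category, falling back
    safely when the model returns extra words or an unknown label."""
    cleaned = raw.strip().lower()
    if cleaned in KNOWN_INTENTS:
        return cleaned
    # Tolerate replies with a prefix or trailing prose around the label
    for intent in KNOWN_INTENTS:
        if intent in cleaned:
            return intent
    return fallback
```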

3. Parallelization

Parallelization is where you can buy back latency when you have independent subtasks. When I use it, I'm usually trying to avoid a "slow waterfall" where the model serially does sentiment → topics → entities → summary.

The obvious footgun: you can hit rate limits or spike costs without noticing. If I'm doing this in production, I usually put a cap on concurrency and I treat failures as "partial results," not total failure.

import asyncio
from anthropic import AsyncAnthropic

async_client = AsyncAnthropic()

async def parallel_analyze(text: str) -> dict:
    async def analyze_aspect(aspect: str, prompt: str) -> tuple[str, str]:
        response = await async_client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=500,
            messages=[{"role": "user", "content": f"{prompt}\n\nText: {text}"}]
        )
        return aspect, response.content[0].text
    
    tasks = [
        analyze_aspect("sentiment", "Analyze the sentiment of this text. Return: positive/negative/neutral with confidence."),
        analyze_aspect("topics", "Extract the main topics from this text as a comma-separated list."),
        analyze_aspect("entities", "Extract named entities (people, places, organizations) as JSON."),
        analyze_aspect("summary", "Summarize this text in one sentence."),
    ]
    
    results = await asyncio.gather(*tasks)
    return dict(results)

# Usage
# result = asyncio.run(parallel_analyze("...long text..."))
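Both production guards I mentioned fit in a few lines. Here's a minimal sketch using stub coroutines instead of real API calls: an `asyncio.Semaphore` caps how many tasks are in flight, and `return_exceptions=True` turns a failed subtask into a partial result instead of sinking the whole batch.

```python
import asyncio

async def gather_capped(coros, limit: int = 4) -> list:
    """Run coroutines with at most `limit` in flight; failures come back
    as exception objects in the result list rather than raising."""
    sem = asyncio.Semaphore(limit)

    async def run(coro):
        async with sem:
            return await coro

    # return_exceptions=True converts per-task failures into partial results
    return await asyncio.gather(*(run(c) for c in coros), return_exceptions=True)

# Demo with stub tasks (no API calls): one succeeds, one fails
async def ok(x):
    return x * 2

async def boom():
    raise RuntimeError("rate limited")

results = asyncio.run(gather_capped([ok(21), boom()], limit=2))
# results[0] is 42; results[1] is a RuntimeError instance
```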

4. Evaluator-Optimizer Pattern

Location: patterns/agents/evaluator_optimizer.ipynb

This pattern is basically "write → critique → rewrite," and it's one of the few loops I'll still use even when I'm trying to keep systems simple. It's especially good when you have qualitative targets ("clear," "persuasive," "more rigorous") that are hard to encode as rules.

The thing I don't think gets emphasized enough: this loop can burn tokens fast. I try to be disciplined about (1) max iterations and (2) a target score. Otherwise it becomes an infinite polishing machine.

def evaluate(content: str, criteria: list[str]) -> dict:
    criteria_text = "\n".join([f"- {c}" for c in criteria])
    
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1000,
        messages=[{
            "role": "user",
            "content": f"""Evaluate this content against the criteria.

Criteria:
{criteria_text}

Content:
{content}

Return JSON:
{{
    "score": 0-10,
    "strengths": ["..."],
    "weaknesses": ["..."],
    "suggestions": ["..."]
}}"""
        }]
    )
    import json
    return json.loads(response.content[0].text)

def optimize(content: str, feedback: dict) -> str:
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=2000,
        messages=[{
            "role": "user",
            "content": f"""Improve this content based on feedback.

Original:
{content}

Feedback:
Weaknesses: {feedback['weaknesses']}
Suggestions: {feedback['suggestions']}

Provide the improved version:"""
        }]
    )
    return response.content[0].text

def iterative_improve(initial_content: str, criteria: list[str], max_iterations: int = 3, target_score: int = 8) -> str:
    content = initial_content
    
    for i in range(max_iterations):
        evaluation = evaluate(content, criteria)
        print(f"Iteration {i+1}: Score {evaluation['score']}/10")
        
        if evaluation["score"] >= target_score:
            print("Target score reached!")
            break
        
        content = optimize(content, evaluation)
    
    return content
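One practical wrinkle with the loop above: `json.loads` on the raw reply throws whenever the model wraps its JSON in prose or a markdown fence. A defensive extraction helper (my own sketch, not from the notebook) is cheap insurance:

```python
import json

def extract_json(text: str):
    """Pull the first JSON object out of a model reply that may wrap it
    in prose or markdown fences; json.loads alone would raise on those."""
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end == -1 or end < start:
        raise ValueError(f"No JSON object found in: {text[:80]!r}")
    return json.loads(text[start:end + 1])
```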

5. Orchestrator-Workers Pattern

Location: patterns/agents/orchestrator_workers.ipynb

When people say "agent," this is usually what they mean: a coordinator that decomposes a task and delegates to specialists. I like this pattern when the task naturally splits (research + writing + review), but I've also found it's the easiest one to overbuild.

A common mistake is letting the orchestrator create a huge plan with vague subtasks. I've had better luck when the orchestrator produces a small number of concrete subtasks, and when each worker has a sharp role and constraints.

def orchestrator(task: str) -> str:
    # Step 1: Break down the task
    breakdown = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1000,
        messages=[{
            "role": "user",
            "content": f"""Break this task into subtasks for specialized workers.

Available workers:
- researcher: Finds information and facts
- writer: Creates written content
- analyst: Analyzes data and provides insights
- reviewer: Reviews and improves content

Task: {task}

Return a JSON array of subtasks. In depends_on, reference earlier subtasks
by index using the key "subtask_<index>" (e.g. "subtask_0"):
[{{"worker": "...", "subtask": "...", "depends_on": []}}]"""
        }]
    )
    
    import json
    subtasks = json.loads(breakdown.content[0].text)
    
    # Step 2: Execute subtasks (respecting dependencies)
    results = {}
    
    def execute_subtask(subtask: dict) -> str:
        context = ""
        for dep in subtask.get("depends_on", []):
            if dep in results:
                context += f"\nFrom {dep}: {results[dep]}"
        
        worker_prompts = {
            "researcher": "You are a research specialist. Find relevant information and cite sources.",
            "writer": "You are a skilled writer. Create engaging, clear content.",
            "analyst": "You are a data analyst. Provide insights and interpretations.",
            "reviewer": "You are an editor. Improve clarity, fix errors, enhance quality."
        }
        
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1500,
            system=worker_prompts.get(subtask["worker"], "You are a helpful assistant."),
            messages=[{"role": "user", "content": f"Task: {subtask['subtask']}{context}"}]
        )
        return response.content[0].text
    
    # Sequential execution: assumes the model listed subtasks in dependency order
    for i, subtask in enumerate(subtasks):
        results[f"subtask_{i}"] = execute_subtask(subtask)
    
    # Step 3: Synthesize results
    all_results = "\n\n---\n\n".join([f"**{k}**:\n{v}" for k, v in results.items()])
    
    synthesis = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=2000,
        messages=[{
            "role": "user",
            "content": f"""Synthesize these worker outputs into a final deliverable.

Original task: {task}

Worker outputs:
{all_results}

Create a cohesive final result:"""
        }]
    )
    
    return synthesis.content[0].text
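The loop above runs subtasks in the order the model listed them, which only works if that order already respects `depends_on`. A small topological sort (a sketch, assuming a hypothetical "subtask_&lt;index&gt;" naming convention for dependencies) makes the ordering explicit and catches cycles:

```python
def order_by_dependencies(subtasks: list[dict]) -> list[int]:
    """Return subtask indices in an order that satisfies depends_on,
    assuming dependencies are named "subtask_<index>". Simple
    Kahn-style passes; raises on cycles."""
    pending = {i: {int(d.split("_")[1]) for d in t.get("depends_on", [])}
               for i, t in enumerate(subtasks)}
    ordered = []
    while pending:
        # Anything with no unmet dependencies is ready to run
        ready = [i for i, deps in pending.items() if not deps]
        if not ready:
            raise ValueError("Dependency cycle in subtasks")
        for i in sorted(ready):
            ordered.append(i)
            del pending[i]
        for deps in pending.values():
            deps.difference_update(ready)
    return ordered
```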

Summary

- Prompt Chaining: linear workflows with dependent steps
- Routing: different handling based on input type
- Parallelization: independent subtasks that can run concurrently
- Evaluator-Optimizer: iterative quality improvement
- Orchestrator-Workers: complex tasks requiring multiple specialists

In the next post, I switch from "patterns in notebooks" to the Claude Agent SDK—the stuff you'd actually reach for when you want to package these ideas into something you can ship.