
Agentic Workflow Patterns: Building Agents That Coordinate

Jun 23, 2026
Molisha Shah

Agentic workflow patterns give multi-agent systems a reusable coordination architecture. They define how agents pass work, share context, validate outputs, and recover from failures. Pattern selection matters early because it fixes the handoff, routing, validation, and recovery mechanisms at each coordination boundary.

TL;DR

Multi-agent systems fail when coordination outgrows a single prompt. A UC Berkeley MAST analysis of 1,600+ multi-agent execution traces identified 14 failure modes clustering into system design issues, inter-agent misalignment, and task verification. Five coordination patterns map workflow shape to validation, observability, and human checkpoints, preventing errors from cascading across stages.

Multi-agent coordination fails when handoffs carry silent intermediate errors into later stages. Workflow patterns add contracts, routing, and audit checkpoints before downstream agents consume flawed context. In a sequential chain, a subtly incorrect intermediate output can pass through intact, and every downstream agent then treats it as fact, with no stack trace or alert appearing.

The MAST taxonomy groups multi-agent failures into specification and system design issues, inter-agent misalignment, and task verification. That makes coordination design part of the reliability boundary, not a secondary implementation detail.

A workflow pattern differs from a one-off agent prompt because it defines the structure of coordination. It specifies what each agent receives, what it returns, how control transfers, and where execution pauses. This guide covers five patterns staff engineers can apply directly, each mapped to the failure mode it addresses.

For long-running and parallel work, those patterns also depend on persistent context, escalation rules, and review checkpoints remaining available across sessions. Augment Cosmos is a cloud agents platform built for exactly this coordination layer: its Environments, Experts, and Sessions turn individual prompts into auditable, replayable workflows with organizational memory that carries effective configurations team-wide rather than keeping them in one engineer's local setup.

Why Do Coordination Patterns Affect Multi-Agent Reliability?

Agentic coordination patterns matter because routing, validation, and recovery mechanisms give multi-agent systems known places to catch flawed handoffs. Research on multi-agent systems shows that errors can propagate and sometimes amplify across agents, especially when communication is dense or poorly controlled, while validation checkpoints and other gating mechanisms can limit that spread.

Production failures commonly take three forms. Context loss occurs when a long reasoning chain causes an agent to miss or dilute relevant context, which leads to hallucinations when critical information is missing. Cascading errors emerge when a single step fails and the agent then explores an entirely different trajectory; Anthropic describes this as the compound nature of errors in agentic systems. Unauditable outputs occur when a task completes incorrectly without generating an error signal, leaving no trace for root cause analysis. A fourth, design-time failure is topology mismatch, where the chosen pattern does not fit the shape of the work.

| Coordination failure | How it appears | Pattern-level control |
| --- | --- | --- |
| Context loss | Agent misses or dilutes relevant context | Constrain shared state to task-relevant history and handoff fields |
| Cascading errors | One step fails; the agent explores a different trajectory | Add validation gates before downstream agents consume output |
| Unauditable outputs | Task completes incorrectly without an error signal | Emit structured events for root-cause analysis |
| Topology mismatch | Static pattern selection does not match the workflow topology | Select patterns from the task shape before implementation |

Task topology should drive pattern selection. Fan-out benefits parallelizable tasks but hurts sequential ones. Static pattern selection becomes a failure mode when the workflow topology does not match the coordination structure.

What Prerequisites Do Multi-Agent Coordination Patterns Require?

Before choosing a pattern, define shared context, handoff contracts, and observability hooks. Skipping one removes a recovery mechanism.

A shared context strategy governs how agents access information without overwhelming one another. LangChain frames context engineering as write, select, compress, and isolate. Anthropic's three techniques are compaction, structured note-taking, and context isolation via sub-agents. Irrelevant history adds coordination work in multi-agent systems, so shared state should be constrained to task-relevant history and handoff fields.
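
One way to picture the "select" step is a filter that passes an agent only the history tagged for its task, newest first, within a size budget. This is a minimal sketch, not any framework's API: the `tags` and `text` fields, the `select_context` name, and the character budget are all assumptions made for illustration.

```python
def select_context(history: list[dict], task_tags: set[str], max_chars: int = 2000) -> list[dict]:
    """Return task-relevant messages in original order, capped by a character budget."""
    # Keep only messages that share at least one tag with the current task.
    relevant = [m for m in history if task_tags & set(m["tags"])]

    selected: list[dict] = []
    used = 0
    # Walk newest-first so the budget is spent on the most recent context.
    for message in reversed(relevant):
        if used + len(message["text"]) > max_chars:
            break
        selected.append(message)
        used += len(message["text"])

    # Restore chronological order for the receiving agent.
    selected.reverse()
    return selected
```

A real system would likely budget in tokens rather than characters and might summarize what it drops instead of discarding it.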

Augment Cosmos's Context Engine uses semantic dependency graph analysis across 400,000+ files to select task-relevant code, linked issues, PR feedback, documentation, and ticketing context for each task.

Handoff contracts define what flows between agents. OpenAI's structured outputs with strict: true guarantee schema adherence, but structural conformance does not guarantee semantic correctness, so production systems need both schema validation and semantic validation. Agent observability tools record every action as a structured event; without this instrumentation, a task can complete incorrectly and leave no event for root cause analysis.
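
The difference between schema validation and semantic validation can be sketched with the standard library alone. The `ReviewHandoff` contract, its fields, and the allowed severity levels below are hypothetical; the point is only that a payload can pass the structural check and still fail the semantic one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewHandoff:
    """Hypothetical contract passed from an analysis agent to a fixing agent."""
    file_path: str
    severity: str
    line_start: int
    line_end: int


ALLOWED_SEVERITIES = {"low", "medium", "high"}  # assumed levels for this sketch


def validate_schema(payload: dict) -> ReviewHandoff:
    """Structural check: required fields exist and have the expected types."""
    expected = {"file_path": str, "severity": str, "line_start": int, "line_end": int}
    for name, kind in expected.items():
        if name not in payload:
            raise ValueError(f"missing field: {name}")
        if not isinstance(payload[name], kind):
            raise ValueError(f"wrong type for {name}")
    return ReviewHandoff(**{name: payload[name] for name in expected})


def validate_semantics(handoff: ReviewHandoff, known_files: set[str]) -> list[str]:
    """Semantic check: values must make sense for the task, not merely parse."""
    problems = []
    if handoff.file_path not in known_files:
        problems.append("file_path does not exist in the repository")
    if handoff.severity not in ALLOWED_SEVERITIES:
        problems.append("severity is not one of the allowed levels")
    if handoff.line_start < 1 or handoff.line_end < handoff.line_start:
        problems.append("line range is empty or inverted")
    return problems
```

A payload such as `{"file_path": "app/db.py", "severity": "high", "line_start": 40, "line_end": 12}` conforms to the schema, yet the semantic check flags its inverted line range before a downstream agent acts on it.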

Step-by-Step Workflow: Five Coordination Patterns

Five agentic coordination patterns map workflow topology to validation, routing, and recovery mechanisms. Use the shape of the work to choose the structure before choosing a framework.

| Pattern | Workflow topology | Main control point | Failure mode addressed | Primary limit |
| --- | --- | --- | --- | --- |
| Sequential pipeline | Ordered stages | Validation gates between stages | Error compounding | Cross-table reference failure |
| Parallel fan-out | Independent subtasks | Aggregator synthesis | Race conditions | Prerequisite sequential checks |
| Supervisor-worker | Dynamic subtasks | Central supervisor | Single-agent context collapse | Vague delegation |
| Self-correcting loop | Retryable work with clear criteria | Retry boundary | Unbounded loop cost | Easier prompts can deteriorate |
| Human-in-the-loop | Irreversible or reviewable actions | Policy-defined pause | Unauthorized irreversible actions | A resume requires a persistent state |

Step 1: Sequential Pipeline

The sequential pipeline, also called prompt chaining, is a deterministic agent graph. Each agent processes the output of the previous one and passes a response downstream. Teams accept added stage-to-stage latency in exchange for narrower LLM calls, with programmatic checks on intermediate steps to keep the process on track.

LangGraph enforces handoffs using a state variable, such as current_step or active_agent, that persists across turns. Conversation history integrity is critical: when handing off, include both the tool call and its ToolMessage response, or the history becomes malformed.

Sequential pipeline controls: persist the current stage's state with a state variable; preserve the integrity of conversation history by including the tool call and its ToolMessage response; add validation gates between stages before downstream agents consume intermediate output.

This pattern addresses error compounding. The MAST taxonomy identifies step repetition and disobeying task constraints as distinct failure modes that validation gates between stages can prevent. The pattern's weakness is cross-table reference failure, as information moves through a fixed order and may not revisit earlier sections.
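
The pipeline controls can be sketched without any framework. In this illustration the stage functions are plain Python stand-ins for LLM calls, and `run_pipeline`, `GateFailure`, and the stage names are assumptions; only the `current_step` state variable comes from the text above.

```python
from typing import Callable

Stage = Callable[[dict], dict]
Gate = Callable[[dict], bool]


class GateFailure(Exception):
    """Raised when a stage's output fails its validation gate."""


def run_pipeline(state: dict, stages: list[tuple[str, Stage, Gate]]) -> dict:
    """Run stages in order; stop before a flawed output reaches the next stage."""
    for name, stage, gate in stages:
        # Work on a shallow copy so a failing stage cannot replace accepted state.
        candidate = stage(dict(state))
        candidate["current_step"] = name
        if not gate(candidate):
            raise GateFailure(f"stage '{name}' produced output that failed validation")
        state = candidate
    return state


# Stand-in stages; a real system would call a model here.
def extract(state: dict) -> dict:
    state["items"] = [s.strip() for s in state["raw"].split(",") if s.strip()]
    return state


def summarize(state: dict) -> dict:
    state["summary"] = f"{len(state['items'])} items"
    return state
```

With a gate such as `lambda s: len(s["items"]) > 0` on the `extract` stage, an empty extraction raises `GateFailure` instead of flowing silently into `summarize`.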

Step 2: Parallel Fan-Out with Aggregation

Parallel fan-out with aggregation routes a task from a dispatcher to multiple specialist agents operating concurrently, then flows their outputs to an aggregator for synthesis. The pattern fits work that can be decomposed into independent subtasks. Each sub-agent must operate independently of the others.

The aggregator controls the main risk, as concurrent agents can produce inconsistent outputs that still appear valid in isolation. The Claude Opus 4.5 multi-agent system card names this synthesis problem as a core difficulty for the orchestrator. Synthesis functions include majority voting, weighted synthesis, orchestrator overrides, and termination based on consensus or quality thresholds.

Fan-out aggregation concentrates conflict resolution in one place: decompose independent subtasks before dispatch; assign specialists with isolated task-relevant state; monitor concurrent outputs for disagreement; reconcile inconsistent outputs; terminate based on consensus or quality thresholds.

Workflow orchestration platforms can coordinate specialized agents with isolated context in incident-response-style decomposition. This pattern addresses race conditions. The hard limit is task shape: tasks that require prerequisite sequential checks can fail when dispatch and aggregation run before the prerequisite path finishes.

Cosmos's Parallel Tool Calls connect to this pattern by reducing serial execution bottlenecks. Its Auggie agent executes independent tool calls concurrently, while the Tasklist capability breaks complex work into actionable steps with progress tracking.

Step 3: Supervisor-Worker Delegation

Supervisor-worker delegation uses a central orchestrating agent that breaks down tasks, delegates them to worker agents, and synthesizes their results. The distinguishing property is dynamic subtask selection: the orchestrator determines subtasks from the specific input rather than relying on predefined subtasks.

Microsoft's Magentic-One uses a dual-ledger design. A Task Ledger maintains facts, while a Progress Ledger lets the orchestrator self-reflect on progress and replan when stuck. The LangGraph supervisor pattern routes work so that only the supervisor responds to the user, centralizing logging, monitoring, and escalation rules.

Delegation needs worker boundaries to prevent duplicate work: define the worker's objective, specify the expected output format, name the tools the worker should use, and define task boundaries to prevent overlapping assignments.

Evaluating AI coding tools for this pattern requires checking whether the tool can preserve task boundaries while agents work across complex codebases. This pattern addresses single-agent context collapse. Delegation fails when instructions are vague: Anthropic fixed duplicate work from vague delegation by requiring each subagent description to include an objective, output format, tool guidance, and clear task boundaries.
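
The four delegation fields can be checked mechanically before a supervisor dispatches work. The sketch below assumes a hypothetical `WorkerBrief` structure in which task boundaries are expressed as the set of file paths a worker may touch; other systems may draw boundaries differently.

```python
from dataclasses import dataclass, field


@dataclass
class WorkerBrief:
    objective: str
    output_format: str
    tools: list[str]
    # Hypothetical boundary: the files this worker is allowed to change.
    owned_paths: set[str] = field(default_factory=set)


def check_briefs(briefs: dict[str, WorkerBrief]) -> list[str]:
    """Report vague briefs and overlapping assignments before dispatch."""
    problems: list[str] = []
    claimed: dict[str, str] = {}  # path -> first worker that claimed it
    for worker, brief in briefs.items():
        if not brief.objective.strip():
            problems.append(f"{worker}: missing objective")
        if not brief.output_format.strip():
            problems.append(f"{worker}: missing output format")
        if not brief.tools:
            problems.append(f"{worker}: no tools named")
        for path in sorted(brief.owned_paths):
            if path in claimed:
                problems.append(f"{path}: assigned to both {claimed[path]} and {worker}")
            else:
                claimed[path] = worker
    return problems
```

A supervisor could refuse to delegate until `check_briefs` returns an empty list, which turns the "no vague delegation" rule into a gate rather than a convention.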

Step 4: Self-Correcting Loop

The self-correcting loop lets an agent evaluate its own output against criteria and retry within a boundary. Reflexion separates the loop into an Actor, an Evaluator, and a Self-Reflector: the Actor produces a trajectory, the Evaluator produces a scalar score, and the Self-Reflector converts failed trajectories into verbal feedback diagnosing what went wrong. Teams should use this pattern only when clear evaluation criteria exist.

The retry boundary is the main engineering parameter. In LangGraph, a conditional edge can route to END once the run exceeds the retry limit. Without such a bound, loops can run unchecked: in one documented production incident, two agents were stuck in an infinite conversation loop, and costs escalated before anyone detected it.

A bounded self-correcting loop depends on stop conditions: set a finite retry limit; route to END when the run exceeds the retry limit; track the delta between consecutive attempts; terminate if retries only rephrase the same query with minor variations.

This pattern addresses the cost of unbounded loops and repetitive, non-improving retries. Self-reflection works best when initial accuracy is low and external verification is available. It risks performance deterioration on easier prompts.
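
The stop conditions can be sketched as a plain loop. This is not Reflexion's or LangGraph's implementation: `generate` and `evaluate` are caller-supplied stand-ins for the Actor and Evaluator, and the thresholds and the use of `difflib` text similarity as the "delta" are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from typing import Callable


def self_correct(
    generate: Callable[[str], str],
    evaluate: Callable[[str], tuple[float, str]],
    max_retries: int = 3,
    pass_score: float = 0.8,
    min_change: float = 0.1,
) -> tuple[str, str]:
    """Retry until the output passes, stalls, or the retry limit is reached."""
    feedback = ""
    previous = None
    previous_score = 0.0
    output = ""
    for _ in range(max_retries):
        output = generate(feedback)
        score, feedback = evaluate(output)
        if score >= pass_score:
            return output, "passed"
        if previous is not None:
            # Delta between consecutive attempts: 0.0 means identical text.
            change = 1.0 - SequenceMatcher(None, previous, output).ratio()
            if change < min_change and score <= previous_score:
                # The retry only rephrased the last attempt without improving.
                return output, "stalled"
        previous, previous_score = output, score
    return output, "retry_limit"
```

The second element of the result says why the loop ended, which gives the caller a place to escalate a `stalled` or `retry_limit` outcome instead of treating the last attempt as a success.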

Step 5: Human-in-the-Loop Checkpoint

The human-in-the-loop checkpoint pauses execution at policy-defined escalation points before the agent continues. LangGraph's middleware checks each tool call against a configurable interrupt_on policy that maps tool names to approval configs. When a model proposes a reviewable action, the middleware issues an interrupt and halts execution. The human can approve, edit, reject, or respond before execution resumes.
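
The control flow of a policy-defined pause can be sketched independently of any framework. The code below is not LangGraph's middleware: the `INTERRUPT_ON` mapping imitates the shape of an `interrupt_on` policy, and `ToolCall`, `NeedsReview`, `gate`, and `resolve` are hypothetical names. The "respond" decision is left out to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolCall:
    name: str
    args: dict


# Hypothetical policy: tool name -> decisions a reviewer is allowed to take.
INTERRUPT_ON = {
    "execute_sql": {"approve", "edit", "reject"},
    "write_file": {"approve", "reject"},
}


class NeedsReview(Exception):
    """Signals that execution must pause until a human decides."""

    def __init__(self, call: ToolCall, allowed: set[str]):
        super().__init__(f"{call.name} requires review")
        self.call = call
        self.allowed = allowed


def gate(call: ToolCall) -> ToolCall:
    """Let unlisted tools through; pause on any tool named in the policy."""
    allowed = INTERRUPT_ON.get(call.name)
    if allowed:
        raise NeedsReview(call, allowed)
    return call


def resolve(pending: NeedsReview, decision: str, edited_args: Optional[dict] = None) -> Optional[ToolCall]:
    """Apply the reviewer's decision; None means the call must not run."""
    if decision not in pending.allowed:
        raise ValueError(f"'{decision}' is not permitted for {pending.call.name}")
    if decision == "reject":
        return None
    if decision == "edit":
        return ToolCall(pending.call.name, edited_args or {})
    return pending.call
```

Because the pause happens in `gate`, before the tool runs, a rejected or edited call never produces the side effect the reviewer objected to.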

State persistence makes pauses resumable. LangGraph requires a checkpointer to persist agent state between interrupt and resume, with AsyncPostgresSaver recommended for production. Without a persistent checkpointer, resuming restarts from scratch.

Human-in-the-loop checkpoints control side effects:

| Checkpoint concern | Required mechanism | Failure prevented |
| --- | --- | --- |
| Reviewable tool calls | interrupt_on policy mapping tool names to approval configs | Unreviewed writes or SQL execution |
| Resumable pauses | Persistent checkpointer and same thread_id on resume | Restarting from scratch after an interruption |
| In-flight state changes | Versioned state classes with migration functions | Broken resume after schema changes |
| Irreversible actions | Approval before side effects | Destructive operations after the fact |
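
The "resumable pauses" and "in-flight state changes" rows can be illustrated together with a small checkpoint store. This sketch uses `sqlite3` and JSON rather than a LangGraph checkpointer, and the `CheckpointStore` class, the `version` field, and the v1-to-v2 rename are all hypothetical.

```python
import json
import sqlite3
from typing import Optional

CURRENT_VERSION = 2


def migrate_v1_to_v2(state: dict) -> dict:
    """Hypothetical schema change: v2 renamed 'step' to 'current_step'."""
    state = dict(state)
    state["current_step"] = state.pop("step")
    state["version"] = 2
    return state


MIGRATIONS = {1: migrate_v1_to_v2}


class CheckpointStore:
    """Persists one state snapshot per thread_id so a paused run can resume."""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints "
            "(thread_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
        )

    def save(self, thread_id: str, state: dict) -> None:
        self.conn.execute(
            "INSERT OR REPLACE INTO checkpoints VALUES (?, ?)",
            (thread_id, json.dumps(state)),
        )
        self.conn.commit()

    def load(self, thread_id: str) -> Optional[dict]:
        row = self.conn.execute(
            "SELECT state FROM checkpoints WHERE thread_id = ?", (thread_id,)
        ).fetchone()
        if row is None:
            return None  # unknown thread: nothing to resume
        state = json.loads(row[0])
        # Upgrade snapshots written by older code before handing them back.
        while state["version"] < CURRENT_VERSION:
            state = MIGRATIONS[state["version"]](state)
        return state
```

Resuming with the same `thread_id` returns the paused state, migrated to the current schema; a different `thread_id` returns nothing, which is the "restart from scratch" failure the table describes.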

OpenAI's Codex deployment shows the production scenario: auto-review replaces user approval at the sandbox boundary with review by a separate agent that considers intent, environment, and likely impact.

This pattern addresses unauthorized irreversible actions such as destructive database operations, infrastructure changes, and financial transactions. The EU AI Act, Art. 14, makes human oversight a regulatory requirement for high-risk systems.

Augment Cosmos connects this pattern to policy-defined pauses by assigning intermediate workflow stages to agent experts, while humans review prioritization, spec intent, and code-evolution context. Cosmos's code review agent achieved a 59% F-score in code review benchmarks. Teams comparing AI code review tools should evaluate whether review checkpoints happen before side effects and whether review output is auditable.

Choosing and Wiring a Pattern Before You Write Code

Choose the coordination pattern from the workflow topology before writing code. Ordered work, independent subtasks, retry criteria, and irreversible actions each point to different controls. No single structure fits every task.

Before implementation: identify whether subtasks are independent or ordered; locate where irreversible actions occur; determine where evaluation criteria are clear enough to bound a retry loop; select the pattern that matches the workflow topology; define the handoff contract and wire observability before deploying.
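
The checklist can be read as a small decision function. The mapping below is one interpretation of the pattern table in this guide, not a rule from any framework, and the `TaskShape` fields are hypothetical names for the questions in the checklist.

```python
from dataclasses import dataclass


@dataclass
class TaskShape:
    subtasks_independent: bool      # can subtasks run without each other's output?
    subtasks_known_upfront: bool    # or must they be derived from each input?
    has_clear_eval_criteria: bool   # is there a reliable check to bound retries?
    has_irreversible_actions: bool  # deletes, infra changes, payments, and so on


def select_patterns(shape: TaskShape) -> list[str]:
    """Pick a base topology, then layer retry and review controls on top."""
    patterns = []
    if not shape.subtasks_known_upfront:
        patterns.append("supervisor-worker")
    elif shape.subtasks_independent:
        patterns.append("parallel fan-out")
    else:
        patterns.append("sequential pipeline")
    if shape.has_clear_eval_criteria:
        patterns.append("self-correcting loop")
    if shape.has_irreversible_actions:
        patterns.append("human-in-the-loop")
    return patterns
```

The function returns a list because the patterns compose: an ordered workflow with a testable output and a destructive final step would use a pipeline, a bounded retry loop, and an approval checkpoint together.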

Production systems also need defined execution environments, behavior rules, and persisted state. Augment Cosmos uses Environments, Experts, and Sessions to turn prompts into workflows that persist across long-running and parallel work, with organizational memory that carries effective agent configurations team-wide rather than keeping them in one engineer's local setup.


Written by

Molisha Shah

Molisha is an early GTM and Customer Champion at Augment Code, where she focuses on helping developers understand and adopt modern AI coding practices. She writes about clean code principles, agentic development environments, and how teams are restructuring their workflows around AI agents. She holds a degree in Business and Cognitive Science from UC Berkeley.

