Can the Coordinator, Implementor, and Verifier use the same underlying LLM?

Yes, if the workflow isolates prompts, context, and verification criteria. VeriMAP's verification-aware planning shows that independence comes from where verification functions are generated, not necessarily from which model runs them. Logical isolation through separate prompts and contexts is the minimum; physical isolation adds defense-in-depth when same-model failure modes are a concern.

When does a single agent outperform the CIV pattern?

Single-agent execution is preferable when the change affects a single file or module, standard linting and type checks provide sufficient verification, and the task completes in under 30 minutes. Coordination overhead must be justified by parallelization savings or by the risk of correlated errors.

How do teams handle model version changes that break agent behavior?

Treat the model version as a dependency requiring explicit change management. Cognition's experience rebuilding Devin for Claude Sonnet 4.5 demonstrates that agent orchestration logic is coupled to specific model behavioral characteristics. Run integration tests against new model versions before production deployment.

What verification techniques catch errors that LLM judges miss?

Deterministic AST-based analysis catches static failures, such as calls to non-existent APIs, that LLM judges can miss. Layering deterministic, LLM-based, and execution-based gates covers failure modes that any single technique leaves open.

How does the CIV pattern interact with existing CI/CD pipelines?

The Verifier's output maps to CI gate checks. Merge-readiness packs bundle verification evidence into structured artifacts intended to support review and merge decisions. CI/CD pipelines often use manual approval, review, and security gates for sensitive changes such as production deployments and CI configuration updates.

Agentic SDLC Implementation: The Coordinator-Implementor-Verifier Pattern

The Coordinator-Implementor-Verifier pattern structures an agentic SDLC around separate planning, execution, and validation roles, thereby isolating reasoning paths in complex multi-file workflows.

TL;DR

Single-agent SDLC workflows fail when a single agent writes code and evaluates it using the same reasoning path. The Coordinator-Implementor-Verifier pattern assigns planning, execution, and validation to separate roles, with verification grounded in the original specification. Role separation pays off on multi-file changes spanning three or more independent modules; for single-file work, coordination overhead exceeds the correlated-error risk it prevents.

The same agent that writes the code is often the one that reviews it, runs the tests, and decides when to stop iterating. That works for a single-file change. It breaks down on multi-file work, where generation errors survive review because the reviewing pass shares context with the pass that produced them, retries chain together without anyone deciding when to stop, and merge readiness becomes a judgment call with no evidence trail.

Augment Cosmos is an orchestration layer for agentic software development workflows: it coordinates planning, execution, and verification across separate agent roles, with shared context and structured handoffs between them. The same architectural understanding that lets the underlying Context Engine reason across 400,000+ files becomes the substrate the Coordinator uses to divide work safely.


Why Role Separation Defines Reliable Agentic SDLC Implementation

Role separation improves reliability in the agentic SDLC by reducing correlated failures between code generation and evaluation. When the same agent writes code and judges it, the judgment runs on the same reasoning patterns that produced the errors. Better models reduce the error rate without fully addressing the structural overlap between the producing and validating paths.

VeriMAP is one published example of this pattern. It proposes a four-module architecture (Planner, Executor, Verifier, Coordinator) in which the Planner decomposes the task and encodes passing criteria as verification functions for each subtask; the Executor produces outputs; the Verifier evaluates those outputs against the planner-defined functions; and the Coordinator manages sequencing, retries, and replanning. Validation lives on a separate path from code production rather than running inside the same loop.
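The idea of encoding passing criteria at planning time can be sketched in a few lines. This is an illustration of the principle only, not VeriMAP's implementation; `Subtask`, `verify`, and the two example checks are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable

# A verification function takes an executor output and returns True on pass.
Check = Callable[[str], bool]


@dataclass
class Subtask:
    name: str
    instruction: str
    # Checks are attached when the plan is written, before any output exists.
    checks: dict[str, Check] = field(default_factory=dict)


def verify(subtask: Subtask, output: str) -> list[str]:
    """Return the names of the planner-defined checks that the output fails."""
    return [name for name, check in subtask.checks.items() if not check(output)]


# Planner side: the passing criteria come from the plan, not from the executor.
subtask = Subtask(
    name="add-slugify",
    instruction="Implement slugify(title) in utils.py",
    checks={
        "defines_function": lambda out: "def slugify(" in out,
        "no_todo_left": lambda out: "TODO" not in out,
    },
)

# Executor side: produces an output without seeing how it will be judged.
executor_output = "def slugify(title):\n    return title.lower().replace(' ', '-')\n"

# Verifier side: evaluates the output only against the planner's functions.
failed = verify(subtask, executor_output)
```

Because the checks are data on the subtask, the Verifier needs no access to the Executor's reasoning to decide pass or fail.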

The Coordinator-Implementor-Verifier pattern in this guide follows the same separation principle: the Verifier operates under separate prompts and context from the Implementor, and validation criteria come from the Coordinator's specification rather than the Implementor's reasoning trail. That specification is the ground truth for each handoff, keeping long execution chains tied to the original requirements rather than drifting as agents iterate. Whether a verifier and executor can share a model depends on which failure modes the verifier needs to catch. Cosmos workspaces extend logical isolation with branch-level physical isolation: per-agent git worktrees with independent branch checkout and file state.

The Three Roles in a Multi-Agent SDLC: Coordinator, Implementor, Verifier

The Coordinator controls planning and routing; the Implementor executes the scoped work; and the Verifier determines whether the outputs satisfy the specification and should proceed.

Coordinator: Planning, Decomposition, and Routing

The Coordinator converts an incoming request into a structured specification, bounded tasks, and explicit handoffs. Anthropic's multi-agent research system documented specific failure modes in its orchestration design. Subagent tasks need four elements to avoid misinterpretation and duplicate work:

An objective
An output format
Guidance on tools and sources
Clear task boundaries

Those constraints keep decomposition actionable and prevent each subagent from inferring task structure independently.
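One way to enforce the four elements is to make them required fields on the handoff object and reject incomplete tasks before dispatch. The sketch below assumes a hypothetical `SubagentTask` type and `missing_elements` helper; neither comes from Anthropic's system.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SubagentTask:
    objective: str
    output_format: str
    tool_guidance: str
    boundaries: str


def missing_elements(task: SubagentTask) -> list[str]:
    """Return the names of required elements that are empty or whitespace-only."""
    return [f.name for f in fields(task) if not getattr(task, f.name).strip()]


task = SubagentTask(
    objective="Rename the user_id column to account_id in the billing module",
    output_format="Unified diff plus a one-paragraph summary",
    tool_guidance="Use the repository search tool; do not call external APIs",
    boundaries="",  # left empty on purpose to show the check firing
)

# A Coordinator could refuse to dispatch until this list is empty.
problems = missing_elements(task)
```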

Implementor: Scoped Execution in Isolated Environments

The Implementor receives bounded tasks from the Coordinator and executes coding, modification, or remediation work within an isolated environment. Scoped work limits interference between concurrent changes across independent modules.

Two inputs constrain each Implementor's work: the task specification and the execution environment. Each Implementor should follow the single-responsibility principle: it executes its assigned task and leaves routing to the Coordinator.

Verifier: Independent Validation Against the Specification

The Verifier evaluates Implementor outputs through testing, review, quality assurance, and compliance checks. Its independence comes from measuring those outputs against the Coordinator's specification rather than the Implementor's reasoning trail.

VeriMAP's verification-aware framing supports response improvement loops without prescribing a binary success-or-fail verifier contract. In production CIV workflows, structured feedback is returned to the Coordinator's retry context, closing the loop. Documented systems use different verifier mechanisms: some rely on a single verifier, others on consensus-based review.

How the Pattern Operates Across the Software Development Lifecycle

CIV is this guide's shorthand for the Coordinator-Implementor-Verifier separation, not a published standard. The phase mapping below draws on cited research and production sources.

Phase 1: Specification and Planning

The Coordinator maps dependencies and produces a structured specification that downstream agents can follow before any code changes begin.

Before proposing changes, the Coordinator identifies affected files, traces call graphs, and maps dependencies. Downstream agents consume the resulting specification directly. Specification-driven development treats it as the source of truth, and humans review and approve before any Implementor writes code. Teams formalizing this stage can compare AI coding tools before deciding how the Coordinator should hold state.

Phase 2: Implementation

CIV's implementation model can use isolated Implementor agents in bounded parallel work, with each agent in its own git worktree so concurrent changes do not collide. Factory AI's Missions architecture is a production analog: an orchestrator breaks projects into milestones, decomposes milestones into features, and gives each feature a fresh worker session with clean context. Missions parallelize only where coordination overhead stays low.

That production model has three visible properties:

Workers execute assigned tasks independently
Workers do not coordinate directly with other workers
Workers commit via git, so each subsequent worker inherits a clean state

Parallel execution creates a coordination challenge. Cognition's multi-agent analysis describes cases where agents assume they share state with other agents when they do not. The Coordinator addresses this by giving each Implementor a shared source of truth.
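The per-agent worktree isolation described in this phase can be sketched as a small command builder. The branch prefix `agent/`, the sibling-directory layout, and the `worktree_command` helper are assumptions for illustration; only `git worktree add -b <branch> <path> <commit-ish>` is standard git.

```python
from pathlib import PurePosixPath


def worktree_command(repo_root: PurePosixPath, task_id: str, base: str = "main") -> list[str]:
    """Build the git command that gives one Implementor its own branch and checkout."""
    branch = f"agent/{task_id}"  # hypothetical naming convention
    path = repo_root.parent / f"{repo_root.name}-wt-{task_id}"
    # `git worktree add -b <branch> <path> <commit-ish>` creates the branch and
    # checks it out in a separate working directory that shares the same object store.
    return ["git", "-C", str(repo_root), "worktree", "add", "-b", branch, str(path), base]


cmd = worktree_command(PurePosixPath("/srv/repos/shop"), "task-17")
# To execute for real: subprocess.run(cmd, check=True)
```

Building the command separately from running it keeps the routing decision inspectable and easy to test without touching a repository.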

Phase 3: Code Review and Verification

Code review and verification compare implementation output to the original specification through layered validation, catching static failures, semantic drift, and incomplete implementation.

Effective verification combines multiple techniques. Teams comparing code review tools and static analysis tools can see how these techniques serve different failure classes:

Deterministic gates first: AST-based analysis detects API hallucination and similar static failures.
LLM-based reasoning second: Structured semi-formal reasoning evaluates patch correctness against the task and specification.
Dynamic testing third: Execution-based validation runs generated tests against the modified code.

This ordering catches syntactic and type-level errors at lower cost before probabilistic evaluation runs. Code hallucination research identifies failure modes such as non-existent API calls, dependency conflicts, and internal inconsistencies, with different classes often requiring different detection mechanisms.
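A minimal sketch of the ordering, assuming Python patches: a deterministic AST gate runs first and short-circuits the pipeline, so the more expensive gates never run on code that references a non-existent module attribute. The `unknown_module_attrs` and `run_gates` helpers are hypothetical, and the LLM and test gates are placeholders that always pass.

```python
import ast
import importlib


def unknown_module_attrs(source: str) -> list[str]:
    """Deterministic gate: flag `module.attr` references where attr does not exist."""
    tree = ast.parse(source)
    imported = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                imported[alias.asname or alias.name] = alias.name
    problems = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name):
            module_name = imported.get(node.value.id)
            if module_name is None:
                continue
            try:
                # Importing runs module code; a production gate would more likely
                # resolve names against a prebuilt index instead.
                module = importlib.import_module(module_name)
            except ImportError:
                continue
            if not hasattr(module, node.attr):
                problems.append(f"{node.value.id}.{node.attr}")
    return problems


def run_gates(source, gates):
    """Run gates in order and stop at the first one that reports problems."""
    for name, gate in gates:
        problems = gate(source)
        if problems:
            return name, problems
    return None, []


patch = "import math\nprint(math.sqroot(9))\n"
gates = [
    ("deterministic", unknown_module_attrs),
    ("llm_review", lambda src: []),     # placeholder for an LLM-based judge
    ("dynamic_tests", lambda src: []),  # placeholder for executing generated tests
]
failed_gate, problems = run_gates(patch, gates)
```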

Phase 4: Feedback Loop and Iteration

Feedback-loop iteration makes CIV practical. Verifier diagnostics return to the Coordinator, which sends each specific failure, with the Verifier's diagnostic attached, back to the appropriate Implementor instead of restarting the whole workflow. The Coordinator marks a subtask complete only after all verification functions pass.
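The loop can be sketched as a bounded retry in which each new attempt receives the previous diagnostics. The `run_subtask` function, the `max_attempts` cap, and the two stand-in callables are assumptions for illustration; real Implementor and Verifier calls would replace the fakes.

```python
from typing import Callable


def run_subtask(
    implement: Callable[[str, list[str]], str],
    verify: Callable[[str], list[str]],
    spec: str,
    max_attempts: int = 3,
) -> tuple[bool, str, int]:
    """Retry one subtask, feeding the Verifier's diagnostics into each new attempt."""
    diagnostics: list[str] = []
    output = ""
    for attempt in range(1, max_attempts + 1):
        output = implement(spec, diagnostics)
        diagnostics = verify(output)
        if not diagnostics:  # complete only when every check passes
            return True, output, attempt
    return False, output, max_attempts  # stop and escalate instead of looping forever


# Stand-ins: the fake Implementor corrects its output once it sees a diagnostic.
def fake_implement(spec: str, diagnostics: list[str]) -> str:
    return "return a + b" if diagnostics else "return a - b"


def fake_verify(output: str) -> list[str]:
    return [] if "a + b" in output else ["add(2, 3) returned -1, expected 5"]


ok, output, attempts = run_subtask(fake_implement, fake_verify, "add(a, b) returns the sum")
```

Only the failing subtask is retried; other subtasks and their verified outputs are untouched.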

Phase 5: Deployment Preparation and Merge Readiness

Deployment preparation bundles evidence from specification, testing, and review for downstream CI and human approval.

The merge-readiness pack (MRP) is the agentic SE concept for bundling evidence that agent-produced work meets merge criteria. Reasonable MRP contents include:

Functional completeness: acceptance criteria met, feature behaves as specified
Sound verification: test plan covering happy paths, edge cases, and failure modes

The same evidence then maps each lifecycle phase to a clear handoff point:

SDLC Phase	Coordinator Action	Implementor Action	Verifier Action
Specification	Decomposes task, produces structured spec	N/A	N/A
Implementation	Monitors execution, manages retries	Codes in isolated worktree	N/A
Code Review	Routes Verifier feedback to Implementors	Applies targeted fixes	Validates against specification
Testing	Coordinates test execution strategy	Generates or updates tests	Executes tests, interprets results
Deployment Prep	Assembles merge-readiness pack	N/A	Signs off on all gates
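A merge-readiness pack can be represented as a small structured artifact that CI or a reviewer can read. The field names, the `SPEC-001` identifier, and the rule that every gate must pass are assumptions for this sketch, not a published MRP schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class MergeReadinessPack:
    spec_id: str
    acceptance_criteria_met: dict[str, bool]
    gates_passed: dict[str, bool]

    def ready(self) -> bool:
        """Ready only if evidence exists and every criterion and gate passed."""
        return (
            bool(self.acceptance_criteria_met)
            and bool(self.gates_passed)
            and all(self.acceptance_criteria_met.values())
            and all(self.gates_passed.values())
        )


pack = MergeReadinessPack(
    spec_id="SPEC-001",  # hypothetical identifier
    acceptance_criteria_met={"rename applied at all call sites": True},
    gates_passed={"deterministic": True, "llm_review": True, "dynamic_tests": False},
)

# Serialize the evidence so a CI job or human reviewer can consume it.
artifact = json.dumps({**asdict(pack), "ready": pack.ready()}, indent=2)
```

An empty pack is treated as not ready, so missing evidence cannot pass by default.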

Workflow Decomposition Patterns for the CIV Architecture

CIV is a logical role separation, not an orchestration runtime. The patterns below are general orchestration architectures on which CIV can be implemented.

Pattern 1: Graph-State Supervisor with Dynamic Routing

Graph-state supervision implements CIV via explicit nodes and edges, allowing a Coordinator to route work dynamically while preserving state. LangGraph models workflows as graphs defined by State, Nodes, and Edges. The supervisor coordinates specialized worker nodes, and teams can model verification as a separate workflow step. LangGraph's persistence and interrupt primitives provide native human-in-the-loop capability.
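The routing idea can be shown without any framework. The sketch below is plain Python and deliberately does not use LangGraph's API; the node functions, the `route` table, and the `max_steps` guard are all hypothetical stand-ins for real agent calls.

```python
def coordinator(state: dict) -> dict:
    state["plan"] = ["edit billing module"]
    return state


def implementor(state: dict) -> dict:
    state["attempts"] = state.get("attempts", 0) + 1
    state["output"] = f"patch-v{state['attempts']}"
    return state


def verifier(state: dict) -> dict:
    # Stand-in rule: the second attempt passes.
    state["passed"] = state["attempts"] >= 2
    return state


NODES = {"coordinator": coordinator, "implementor": implementor, "verifier": verifier}


def route(current: str, state: dict):
    """Conditional edges: the verifier's result decides where work goes next."""
    if current == "coordinator":
        return "implementor"
    if current == "implementor":
        return "verifier"
    return None if state["passed"] else "implementor"


def run(start: str = "coordinator", max_steps: int = 10):
    state, current, trace = {}, start, []
    while current is not None and len(trace) < max_steps:
        state = NODES[current](state)  # each node reads and updates shared state
        trace.append(current)
        current = route(current, state)
    return state, trace


final_state, trace = run()
```

Keeping routing in one function makes the retry edge from verifier to implementor explicit and auditable.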

Pattern 2: Event-Stream with Append-Only Log

Event-stream orchestration records CIV execution as actions and observations, using an append-only history for recovery and inspection. OpenHands uses a stateless agent that emits Actions, an append-only EventLog that stores history, and a Workspace that executes Actions and returns Observations. The architecture supports decomposition extensions and side models for verifying task completeness.
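A minimal sketch of the append-only pattern, assuming a two-kind event model. It mirrors the shape described above but is not OpenHands code; `Event`, `EventLog`, and `next_action` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass(frozen=True)
class Event:
    kind: Literal["action", "observation"]
    payload: str


class EventLog:
    """History is only ever appended to, never edited in place."""

    def __init__(self) -> None:
        self._events: list[Event] = []

    def append(self, event: Event) -> int:
        self._events.append(event)
        return len(self._events) - 1  # position of the new event

    def replay(self, upto: Optional[int] = None) -> tuple[Event, ...]:
        """Return an immutable view of the first `upto` events (all if None)."""
        return tuple(self._events[:upto])


def next_action(history: tuple[Event, ...]) -> Optional[Event]:
    """Stateless agent step: the decision depends only on the log passed in."""
    if not history:
        return Event("action", "run_tests")
    last = history[-1]
    if last.kind == "observation" and "failed" in last.payload:
        return Event("action", "fix_failure")
    return None


log = EventLog()
log.append(next_action(log.replay()))
log.append(Event("observation", "1 test failed"))  # what a workspace might return
log.append(next_action(log.replay()))
```

Because the agent holds no state of its own, replaying a prefix of the log reproduces the decision it made at that point.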

Pattern 3: Preparation-First Context Engineering

Preparation-first context engineering moves CIV coordination earlier, with specification grounding and repository familiarization happening before agent execution begins. The Mise en Place approach externalizes domain expertise into structured documents, then produces detailed design artifacts through human-agent dialogue, then decomposes specifications into structured, dependency-aware records. The coordination burden shifts from runtime to preparation time.

Layered Verification Architectures

Verification in an agent-in-the-loop SDLC requires layered pipelines because no single technique catches all failure types.

Specification Isolation for Verifiers

Providing implementation code to verification agents too early biases their outputs. Specification-grounded test generation should run first; implementation-level coverage analysis comes later.

Security Scanning as Verification

Security scanning fits the CIV pattern because security checks can operate as independent validation gates before changes move toward deployment. Google's CodeMender post describes an LLM judge tool that evaluates functional equivalence after modifications. The agent self-corrects when the verifier detects failures.

Treat these checks as verification gates that sit alongside SAST, DAST, software composition analysis, and infrastructure-as-code scanning. This keeps security verification aligned with the same independent-gate logic used elsewhere in the CIV pattern.

Failure Containment and Operational Governance

The containment practices below come from agentic AI safety and orchestration literature; CIV adopts them because role separation alone does not contain runtime failures.

Production Failure Modes in Agentic Systems

Three categories of production failures require distinct containment strategies: execution and loop failures, reasoning quality failures, and multi-agent orchestration failures.

Architectural primitives for agentic AI include maximum step limits, tool-call caps, and idempotent tool design. Common attack surfaces in agentic systems include goal hijack, tool misuse, privilege abuse, cascading failures, and memory poisoning.
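The three primitives can be sketched together: a budget object enforces the step limit and tool-call cap, and a wrapper makes a side-effecting tool safe to repeat. `RunBudget`, `IdempotentTool`, and the idempotency-key format are hypothetical; the limits shown are arbitrary.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a run crosses one of its hard limits."""


class RunBudget:
    def __init__(self, max_steps: int, max_tool_calls: int) -> None:
        self.max_steps = max_steps
        self.max_tool_calls = max_tool_calls
        self.steps = 0
        self.tool_calls = 0

    def charge_step(self) -> None:
        if self.steps >= self.max_steps:
            raise BudgetExceeded(f"step limit {self.max_steps} reached")
        self.steps += 1

    def charge_tool_call(self) -> None:
        if self.tool_calls >= self.max_tool_calls:
            raise BudgetExceeded(f"tool-call cap {self.max_tool_calls} reached")
        self.tool_calls += 1


class IdempotentTool:
    """Wrap a side-effecting function so repeating a key never repeats the effect."""

    def __init__(self, fn) -> None:
        self._fn = fn
        self._results = {}

    def __call__(self, idempotency_key: str, *args):
        if idempotency_key not in self._results:
            self._results[idempotency_key] = self._fn(*args)
        return self._results[idempotency_key]


created = []  # records the real side effects
create_branch = IdempotentTool(lambda name: created.append(name) or name)
budget = RunBudget(max_steps=5, max_tool_calls=2)

for _ in range(2):  # the agent retries the same tool call
    budget.charge_tool_call()
    create_branch("task-17:create-branch", "agent/task-17")
```

The retry still counts against the cap, but the branch is created only once.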

Sandboxing and Blast Radius Control

AI agent sandboxes must prevent local damage, enable closed-loop feedback for self-correction, and ensure multi-tenant isolation in shared platforms. Docker-based isolation with git worktree routing can enforce a default-deny policy for access to production resources. Systems should block technical access to production unless the workflow explicitly allows it.

Circuit Breakers and Rollback

Circuit breakers adapt distributed-systems ideas to agentic workflows by preventing cascading failures, triggering fallback behavior, and enabling graceful degradation. LangGraph checkpoints support replaying prior executions and forking from past checkpoints, which lets teams re-run a branch while preserving the original execution history. Git branch isolation provides passive rollback because unmerged work never reaches the main branch.
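A minimal circuit breaker for agent calls might look like the sketch below. The thresholds, the injected `clock`, and the half-open behavior are assumptions chosen for illustration, not taken from a specific library.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_after: float = 30.0,
                 clock=time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()  # open: skip the failing dependency entirely
            # Half-open: allow one trial call; a single failure reopens the circuit.
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            return fallback()  # degrade gracefully instead of propagating
        self.failures = 0
        return result


now = [0.0]  # fake clock so the example is deterministic
breaker = CircuitBreaker(failure_threshold=2, reset_after=10.0, clock=lambda: now[0])
calls = []


def flaky_agent_call():
    calls.append(1)
    raise RuntimeError("agent call failed")


results = [breaker.call(flaky_agent_call, lambda: "fallback") for _ in range(4)]
```

After two failures the breaker opens, so the third and fourth requests return the fallback without invoking the agent at all.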

Human-in-the-Loop Checkpoint Positioning

Human-in-the-loop checkpoints change governance because oversight placed before execution, during critical subtasks, and before shipping can prevent high-impact failures from propagating. Martin Fowler's analysis of humans and agents distinguishes between in-the-loop and on-the-loop oversight modes. In practice, three checkpoints map well to CIV: before work begins, at intermediate artifacts for critical subtasks, and at final review before shipping.

Context engineering raises the probability of correct results, but results remain probabilistic as long as LLMs are involved.

Decision Criteria: When and How to Adopt the CIV Pattern

Teams should adopt CIV based on the scope of change, verification needs, and failure tolerance. Coordination overhead pays off when isolated workspaces, verifier feedback, and merge gates prevent more downstream rework than they add during planning.


When Multi-Agent Separation Pays Off

ROI turns positive for complex multi-file tasks when three conditions hold: the change spans multiple independent modules, a reusable specification can guide multiple agents, and the handoff overhead of parallel execution stays lower than the downstream rework it prevents.

Cognition's analysis of multi-agent systems observes that most practical multi-agent setups are limited to read-only subagents, and that naive multi-agent architectures often fail without careful design of context and orchestration. Treat read-only patterns as the mature baseline and write-capable patterns as requiring more engineering investment.

Context Window Management for Long-Horizon Tasks

Bounded task context changes long-horizon execution because each Implementor works from scoped inputs rather than carrying the full accumulated session history. LLM generation results become less reliable the longer a session continues. The CIV pattern addresses this structurally by ensuring that each Implementor receives a bounded task with scoped context. Anthropic's guidance on context engineering recommends memory files to track progress across complex tasks.

Consolidated Decision Matrix

The thresholds below are practical guidelines, not research-backed numbers; teams should calibrate them against their own coordination overhead and error tolerance.

Decision Point	Choose Single Agent	Choose CIV Pattern
Change scope	Single file or module	Three or more independent modules
Verification needs	Linting and type checks are sufficient	Specification-grounded behavioral validation is required
Execution duration	Under 30 minutes	Multi-hour or multi-day autonomous runs
Error tolerance	Errors caught in standard PR review	Correlated errors create downstream failures
Team maturity	Individual agent adoption phase	Shared workflow patterns and governance in place
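The matrix can be turned into a checklist that reports which rows point toward CIV for a given change. The function below is a hypothetical helper; it reads "multi-hour" as 120 minutes or more, which is an assumption, and it leaves the final decision to the team rather than applying a scoring rule.

```python
def civ_signals(
    independent_modules: int,
    lint_and_types_sufficient: bool,
    expected_minutes: int,
    correlated_errors_costly: bool,
    shared_governance_in_place: bool,
) -> list[str]:
    """Return the decision-matrix rows that favor the CIV pattern for this change."""
    signals = []
    if independent_modules >= 3:
        signals.append("change scope")
    if not lint_and_types_sufficient:
        signals.append("verification needs")
    if expected_minutes >= 120:  # assumption: "multi-hour" means two hours or more
        signals.append("execution duration")
    if correlated_errors_costly:
        signals.append("error tolerance")
    if shared_governance_in_place:
        signals.append("team maturity")
    return signals


refactor = civ_signals(4, False, 180, True, True)
typo_fix = civ_signals(1, True, 20, False, False)
```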

Orchestration Infrastructure Considerations

CIV workflows need separate orchestration infrastructure when they persist across sessions, repositories, and organizational boundaries. Context, policy, and feedback must remain attached to the work as it moves between roles.

The protocol stack is converging under the Agentic AI Foundation at the Linux Foundation, with MCP for agent-to-tool communication and A2A for agent-to-agent communication. Governance remains less standardized than those communication layers.

Three artifacts must persist across handoffs: context, policy, and feedback.

Production Lessons from Deployed CIV-Pattern Systems

Production systems show how role separation responds to coordination failures, validation gaps, and model-coupling risks under real workloads.

Cognition's Devin: Model-Agent Coupling as Operational Risk

When Cognition rebuilt Devin around Claude Sonnet 4.5, the process required reworking the agent's behavior to accommodate the new model's characteristics. Sub-agent communication back to managers required explicit remediation. The lesson: changing the underlying model can require changes to agent behavior and communication patterns.

Factory AI's Missions: Milestone Validation for Long-Horizon Runs

Factory AI's Missions architecture uses milestone-level validation for multi-day autonomous runs, with each feature getting a fresh worker session rather than carrying state across sequential agent work. The pattern is an analog for CIV deployments where the Verifier catches behavioral regressions that syntax checks miss.

Google: Separating Reproduction from Repair

Google's agentic bug reproduction research treats reproduction as a distinct stage that runs before repair. The system first demonstrates the issue, then assigns repair work. This decomposition maps directly to CIV: a reproduction Implementor runs first, the Verifier confirms the reproduction is valid, and only then does a repair Implementor receive the task.

Implement the CIV Pattern Starting with Specification-Grounded Verification

The trade-off in agentic SDLC implementation is simple: tighter separation adds orchestration overhead, while weaker separation lets correlated errors persist long enough to become production problems.

Start with one bounded workflow in which the Coordinator produces a structured specification with explicit acceptance criteria, the Implementor works in an isolated git worktree, and the Verifier blocks promotion until the change passes deterministic, specification-grounded checks. Once that loop works reliably, widen the pattern to broader refactors, cross-service changes, and longer-horizon execution with less drift and cleaner retry control. That sequence keeps governance ahead of automation and makes failure containment part of the workflow rather than a later patch.

Frequently Asked Questions About Agentic SDLC and the CIV Pattern

The Frequently Asked Questions below address the implementation questions that most often block adoption: model reuse, single-agent exceptions, model-version risk, verification coverage, and CI/CD integration.


Written by

Ani Galstian
