When is LangGraph not enough for a production agent system?

LangGraph stops being enough when the external production systems a team maintains around it take more engineering effort than the agent logic itself. The Thoughtworks Radar (April 2026) moved it from Adopt to Trial, flagging the global shared-state architecture as not always the best approach and noting that observability depends on the separate LangSmith product rather than native runtime tracing.

Do CrewAI and AutoGen solve the production gaps LangGraph leaves open?

No. All three frameworks require external persistence, governance, observability, or routing layers for production agent systems. CrewAI gates RBAC behind its Enterprise tier and relies on Portkey for failover, whereas AutoGen's Team abstraction lacks built-in checkpointing per its migration guide.

What is the difference between an agent framework and an agent platform?

An agent framework gives teams orchestration primitives and leaves production operations to the team. An agent platform provides the runtime around those primitives: shared memory, structured event emission, policy-enforced human-in-the-loop, and BYOK model routing across providers.

When does building your own agent platform make sense?

Building makes sense under three conditions: the team has truly unique requirements no platform addresses, the agent is the core product, or the organization has regulated data that cannot leave its environment. For other cases, a managed platform becomes more compelling as engineering work expands across memory, observability, governance, routing, and reliability.

Is Cosmos available now?

Augment Cosmos is generally available and included on all paid plans. It is not in preview or gated behind a specific plan.

Agent Frameworks vs. Platforms: Is LangGraph Enough?

A managed platform is a good fit when teams spend more time maintaining state stores, trace pipelines, policy checks, model gateways, and recovery paths than on improving agent behavior.

TL;DR

LangGraph, CrewAI, and AutoGen cover orchestration. Multi-session production use adds crash recovery, tracing, policy enforcement, routing, and cross-session memory. Conventional frameworks leave those responsibilities outside workflow wiring. Augment Cosmos persists corrections and patterns across sessions through tenant and private memory, and adds structured event emission and policy-enforced human-in-the-loop checkpoints.

Production agent frameworks reach their limits when systems must survive worker crashes and provider rate limits, account for cost, redact before model calls, and carry corrections across sessions. LangGraph, CrewAI, and AutoGen expose orchestration primitives for these systems. Teams still own the production layer around those primitives.

The Thoughtworks Technology Radar (April 2026) moved LangGraph from Adopt to Trial. It noted that the LangGraph architecture, which treats every multi-agent system as a stateful graph with a globally shared state, is not always the best approach. This evaluation provides teams with a decision framework for when engineering work around an agent exceeds the agent logic itself, and for where a unified cloud agents platform fits when they cross that line.

In my evaluation across teams building production multi-agent systems, the pattern is consistent: orchestration primitives are not the bottleneck. The surrounding production layer is. The question this article answers is when that layer justifies moving to a platform. When teams reach the point where memory stores, trace pipelines, policy layers, and model gateways consume more sprint time than agent behavior, a managed platform becomes the more efficient path. Augment Cosmos is the runtime that packages those layers: persistent memory, structured observability, policy-enforced checkpoints, and BYOK model routing. All of them are included on all paid plans.

Frameworks vs Platforms at a Glance

The table below maps five production dimensions (orchestration model, memory, observability, governance, and model routing), plus lifecycle wiring, learning over time, and best-fit use case, across LangGraph, CrewAI, AutoGen, and a managed cloud agents platform. Use it to identify which option best aligns with your production requirements before reading the detailed breakdown.

Dimension	LangGraph	CrewAI	AutoGen	Managed cloud agents platform
Orchestration model	Stateful graph with nodes, conditional edges, supersteps	Sequential or hierarchical crews with delegation	Conversation runtime with group chat and actor-model core	Agent runtime coordinating long-running execution through isolated sessions and scheduling
Built-in memory	Checkpointers plus stores; production typically uses a durable backend such as Postgres	Short-term, long-term, entity memory with scoring	No built-in Team checkpointing; external state required	Shared filesystem with tenant and private memory
Observability	Delegated to LangSmith (separate paid product)	Verbose logs plus third-party integrations	Delegated to external tooling	Every action emits a structured event
Governance	No native policy engine or PII handling	RBAC gated behind Enterprise tier	Hardened only in successor framework	Governance-first cloud/local selection framework
Model routing	Multi-model, no cost-aware routing	Provider routing via Portkey, not native	No native routing or failover	BYOK across Anthropic, OpenAI, Bedrock, Vertex, and open-source models
Cross-lifecycle wiring	Engineering teams build it	Engineering teams build it	Engineering teams build it	Configured once across build, tests, review, deploy
Learning over time	Checkpoints plus LangMem or external Mem0	Tunable recency and importance scoring	Manual	Memory persists corrections and patterns across sessions
Best-fit use case	Custom control flow, durable graph logic	Role-based multi-agent collaboration	Event-driven distributed experiments	Production agents across a team and lifecycle

Agent Frameworks vs Managed Platforms: The Core Differences

Agent frameworks give teams orchestration primitives and low-level control. Teams use them when they want to own workflow structure, state transitions, and agent interaction patterns directly.

LangChain homepage with tagline 'Ship agents that wow' on a dark background with an animated agent lifecycle diagram spanning Build, Observe, Evaluate, and Deploy stages

LangGraph describes itself as a low-level orchestration framework and runtime for building, managing, and deploying long-running, stateful agents. The StateGraph class, parameterized by a user-defined State object, lets teams define exactly how control flows through nodes. Command bundles state updates with navigation; Send spawns parallel node executions for map-reduce patterns.
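The stateful-graph idea can be illustrated without the library. The sketch below is plain Python, not LangGraph's API: the node functions, the `run` loop, and the `"END"` marker are hypothetical names chosen for illustration. It shows the two properties the paragraph describes: every node reads and updates one shared state object, and a conditional edge picks the next node from that state.

```python
from typing import Callable, Dict, TypedDict


class State(TypedDict):
    draft: str
    revisions: int


# Hypothetical nodes: each receives the shared state and returns a partial update.
def write(state: State) -> dict:
    return {"draft": state["draft"] + "x", "revisions": state["revisions"] + 1}


def review(state: State) -> dict:
    return {}  # this reviewer only inspects state and changes nothing


# Conditional edge: the next node is chosen from the current shared state.
def after_review(state: State) -> str:
    return "END" if state["revisions"] >= 3 else "write"


NODES: Dict[str, Callable[[State], dict]] = {"write": write, "review": review}
EDGES: Dict[str, Callable[[State], str]] = {
    "write": lambda state: "review",  # unconditional edge
    "review": after_review,           # conditional edge
}


def run(state: State, entry: str = "write", max_steps: int = 20) -> State:
    current = entry
    for _ in range(max_steps):
        if current == "END":
            return state
        # Merge the node's partial update into the globally shared state.
        state = {**state, **NODES[current](state)}
        current = EDGES[current](state)
    raise RuntimeError("step limit reached without hitting END")


final = run({"draft": "", "revisions": 0})
```

Because every node sees the whole state, isolating one agent's view means adding filtering on top, which is the friction the Radar comment points at.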

CrewAI homepage with tagline 'Accelerate AI agent adoption and start delivering production value' on a black background with enterprise customer logos including IBM and Docusign

CrewAI organizes workflows around Flows and Crews. A hierarchical Process assigns a manager agent to allocate tasks based on capability, review outputs, and assess completion. Tasks support guardrails, Pydantic output schemas, and conditional execution.

AutoGen documentation homepage describing it as a framework for building AI agents and applications, with code snippets for the Studio and AgentChat installation options

AutoGen uses the AgentChat API for high-level multi-agent applications and the Core API from the 0.4 redesign, which adopts the actor model of computation to support distributed, highly scalable, event-driven agentic systems.

These orchestration models work best when workflow control is the main deliverable and the engineering team intentionally owns the surrounding production systems. Outside that boundary, each framework hands the production layer back to the team.

The Production Layer Agent Frameworks Hand Back to You

Agent frameworks handle orchestration. Production teams still need decisions and infrastructure across five areas: memory persistence, observability, governance, model routing, and reliability.

Memory persistence surfaces first. LangGraph's default InMemorySaver is ephemeral: when the process stops, the data is lost. The docs explicitly instruct teams to use a database-backed store in production, which requires a separate psycopg install and a connection string. Teams maintain two systems: checkpointers for thread-scoped memory and stores for cross-thread memory. AutoGen is blunter still. Its migration guide states that the Team abstraction does not provide built-in checkpointing, and that any persistence must be implemented externally.
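The difference between an ephemeral and a database-backed checkpointer is easiest to see across a simulated restart. This is a minimal sketch under stated assumptions: `InMemoryCheckpoints` and `SqliteCheckpoints` are hypothetical classes, not LangGraph's checkpointer interface, and SQLite from the standard library stands in for the Postgres backend the docs recommend.

```python
import json
import os
import sqlite3
import tempfile


class InMemoryCheckpoints:
    """Ephemeral store: state lives only as long as this object does."""

    def __init__(self):
        self._data = {}

    def put(self, thread_id, state):
        self._data[thread_id] = state

    def get(self, thread_id):
        return self._data.get(thread_id)


class SqliteCheckpoints:
    """Durable store: state is written to a database file on every put."""

    def __init__(self, path):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints "
            "(thread_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
        )
        self._conn.commit()

    def put(self, thread_id, state):
        self._conn.execute(
            "INSERT OR REPLACE INTO checkpoints VALUES (?, ?)",
            (thread_id, json.dumps(state)),
        )
        self._conn.commit()

    def get(self, thread_id):
        row = self._conn.execute(
            "SELECT state FROM checkpoints WHERE thread_id = ?", (thread_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None

    def close(self):
        self._conn.close()


db_path = os.path.join(tempfile.mkdtemp(), "checkpoints.db")

# First "process": both stores record the same checkpoint.
durable = SqliteCheckpoints(db_path)
durable.put("thread-1", {"step": 2})
durable.close()
volatile = InMemoryCheckpoints()
volatile.put("thread-1", {"step": 2})

# Simulated restart: fresh objects, as a new worker would create.
recovered_durable = SqliteCheckpoints(db_path).get("thread-1")
recovered_volatile = InMemoryCheckpoints().get("thread-1")
```

Only the database-backed store returns the checkpoint after the restart. The connection string, schema migrations, and backups that come with a real Postgres deployment are the part the team ends up owning.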

Observability creates the next layer. The OpenTelemetry GenAI semantic conventions are still in development and not yet stable, and the fragmentation between OpenInference and OpenLLMetry requires span-processor translation pipelines.
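A translation pipeline of this kind is, at its core, a key-mapping step applied to span attributes before export. The sketch below shows that step only. The `source_a.*` and `source_b.*` attribute names are invented placeholders, not the real OpenInference, OpenLLMetry, or OpenTelemetry keys, and `normalize` is a hypothetical helper rather than an OpenTelemetry span-processor API.

```python
from typing import Dict, Mapping

# Placeholder key maps. The left-hand keys are invented for illustration and do
# not correspond to any real instrumentation library's attribute names.
KEY_MAPS: Dict[str, Dict[str, str]] = {
    "source_a": {
        "source_a.model": "model",
        "source_a.tokens.in": "input_tokens",
        "source_a.tokens.out": "output_tokens",
    },
    "source_b": {
        "source_b.model_name": "model",
        "source_b.prompt_tokens": "input_tokens",
        "source_b.completion_tokens": "output_tokens",
    },
}


def normalize(source: str, attributes: Mapping[str, object]) -> Dict[str, object]:
    """Translate one span's attributes into a single internal schema."""
    key_map = KEY_MAPS[source]
    normalized: Dict[str, object] = {
        key_map[key]: value for key, value in attributes.items() if key in key_map
    }
    # Surface anything the map does not cover instead of dropping it silently.
    unmapped = sorted(key for key in attributes if key not in key_map)
    if unmapped:
        normalized["unmapped_keys"] = unmapped
    return normalized


a = normalize(
    "source_a",
    {"source_a.model": "m1", "source_a.tokens.in": 10, "source_a.tokens.out": 3},
)
b = normalize(
    "source_b",
    {
        "source_b.model_name": "m1",
        "source_b.prompt_tokens": 10,
        "source_b.completion_tokens": 3,
        "source_b.extra": 1,
    },
)
```

Each map has to be revised whenever an upstream convention changes, which is why unstable conventions turn into recurring maintenance work.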

Governance cuts across identity, permissions, approval points, and auditability. Microsoft's open-source Agent Governance Toolkit (April 2026) exists because these frameworks do not natively include governance. An agent may hold an API key with write access to a production system and may keep operating after the employee who deployed it has left the organization. Teams without a governance layer must prove control over identity, permissions, approval points, and auditability outside the framework runtime.
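As a rough illustration of what "approval points and auditability outside the framework runtime" means in code, here is a minimal policy gate. Everything in it is hypothetical: `PolicyGate`, the `approver` callback, and the action names are assumptions for this sketch and do not come from the Agent Governance Toolkit or any framework discussed here.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class PolicyGate:
    """Wraps agent actions with an approval check and an audit trail."""

    requires_approval: Set[str]            # actions that need a human decision
    approver: Callable[[str, str], bool]   # (agent_id, action) -> approved?
    audit_log: List[dict] = field(default_factory=list)

    def execute(self, agent_id: str, action: str, run: Callable[[], str]) -> str:
        needs_human = action in self.requires_approval
        approved = self.approver(agent_id, action) if needs_human else True
        # Record every attempt, including denied ones, before anything runs.
        self.audit_log.append(
            {
                "agent": agent_id,
                "action": action,
                "needs_human": needs_human,
                "approved": approved,
            }
        )
        if not approved:
            raise PermissionError(f"{action} denied for {agent_id}")
        return run()


# Usage: an approver that rejects everything, standing in for a human reviewer.
gate = PolicyGate(requires_approval={"deploy"}, approver=lambda agent, action: False)
result = gate.execute("agent-7", "read_repo", lambda: "ok")
try:
    gate.execute("agent-7", "deploy", lambda: "deployed")
    blocked = False
except PermissionError:
    blocked = True
```

A production version would also tie `agent_id` to an identity that can be revoked, so the departed-employee case in the paragraph above is caught at the gate rather than after the fact.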

Model routing and reliability complete the picture. Across LangGraph, CrewAI, and AutoGen, cost-aware routing and failover sit outside the native orchestration model. The Thoughtworks Radar notes that LiteLLM's drop_params mode silently discards unsupported parameters, meaning capabilities may be lost across routing decisions without visibility.
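The sketch below shows one way a gateway could combine cost-aware ordering, failover, and parameter dropping while keeping the dropped parameters visible to the caller. It is not LiteLLM's implementation or API: `Provider`, `route`, the prices, and the stub provider functions are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Set, Tuple


class ProviderError(Exception):
    pass


@dataclass
class Provider:
    name: str
    cost_per_1k_tokens: float        # hypothetical price used only for ordering
    supported_params: Set[str]
    call: Callable[[str, Dict[str, object]], str]


def route(
    prompt: str, params: Dict[str, object], providers: Sequence[Provider]
) -> Tuple[str, str, List[str]]:
    """Try providers cheapest-first; return (provider name, reply, dropped params)."""
    failures: List[str] = []
    for provider in sorted(providers, key=lambda p: p.cost_per_1k_tokens):
        # Report which parameters this provider cannot honor instead of hiding it.
        dropped = sorted(k for k in params if k not in provider.supported_params)
        kept = {k: v for k, v in params.items() if k in provider.supported_params}
        try:
            return provider.name, provider.call(prompt, kept), dropped
        except ProviderError as exc:
            failures.append(f"{provider.name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(failures))


# Stub providers: the cheap one is rate limited, the fallback lacks `seed`.
def rate_limited(prompt: str, params: Dict[str, object]) -> str:
    raise ProviderError("429 rate limited")


def echo(prompt: str, params: Dict[str, object]) -> str:
    return f"echo:{prompt}"


providers = [
    Provider("primary", 0.5, {"temperature", "seed"}, rate_limited),
    Provider("fallback", 2.0, {"temperature"}, echo),
]
name, reply, dropped = route("hi", {"temperature": 0.2, "seed": 7}, providers)
```

Returning the dropped list is the design choice that matters here: the request still succeeds on the fallback, but the caller can see that `seed` was not honored.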

A threshold emerges when three or more of these external systems consume sprint time. That is the signal to evaluate whether a managed platform is the right next step. Teams already evaluating workflow orchestration options will recognize the same constraints mapped here.

Where LangGraph, CrewAI, and AutoGen Fall Short in Production

Each framework has a natural boundary where its design assumptions no longer serve production requirements. The table maps those boundaries across five dimensions, followed by a detailed breakdown of what each framework does well and where teams typically hit the wall.

Ceiling dimension	LangGraph	CrewAI	AutoGen
Orchestration boundary	Global shared-state graph model	Only sequential and hierarchical processes	Actor-model Core API for experiments
Memory pressure	Checkpointer, store, and external Mem0 stack	Memory scoring exists; broader platform layers remain external	No built-in Team checkpointing
Observability pressure	LangSmith stitched in as a separate product	Verbose logs and third-party integrations	External tooling required
Governance pressure	A separate policy layer is required	RBAC is gated behind the Enterprise tier	Enterprise hardening moves to successor framework
Routing and reliability	Separate gateway for routing and failover	Portkey contribution rather than native substrate	Surrounding systems handle durability

LangGraph provides durable control flow through stateful graphs, supersteps, and the interrupt primitive. The Thoughtworks Radar moved it from Adopt to Trial because the global shared-state model is not always the right approach. Teams hit the wall when they fight the global state to isolate one agent's view and stitch together LangSmith, external Mem0, and a separate gateway. At that point, the question is whether AI-first dev workflows at the enterprise level require an entirely different architectural model.

CrewAI supports role-based collaboration through sequential and hierarchical crews. The Process class allows only those two shapes. Complex routing beyond them means leaving the framework's grain. CrewAI gates governance: RBAC sits under the Enterprise tier. Portkey documents production reliability features as its contribution, not CrewAI's native layer.

AutoGen supports event-driven multi-agent experiments through its Core API and actor-model runtime. Microsoft has positioned the Microsoft Agent Framework as the successor to Semantic Kernel and AutoGen, stating that most of its investment is now focused on it. The migration guide notes that orchestration patterns are now hardened with durability, observability, and security in the successor, which implies they were not sufficiently hardened in AutoGen. Teams are building durability and governance on a framework whose own maintainer is steering investment elsewhere.

How Augment Cosmos Addresses the Production Layer

Augment Cosmos combines governed execution, structured observability, persistent memory, and lifecycle wiring into a single runtime. Environments define where agents run and what they can touch. Experts define how agents behave, what tools they use, and what events they subscribe to. Sessions turn one-off prompts into auditable, replayable workflows that can remain private to a single engineer or become a shared capability the whole organization can draw on.

The managed runtime packages the recurring engineering work into five layers: tenant and private memory that persist corrections and patterns across sessions; structured events emitted for every action; policy-enforced human-in-the-loop checkpoints that set where human judgment is required; BYOK and Prism model routing spanning Anthropic, OpenAI, Bedrock, Vertex, and open-source models; and lifecycle wiring configured once across build, tests, review, and deployment.

Cosmos maps the framework gaps to managed primitives:

Framework gap	Cosmos primitive	Operational change
Memory persistence	Tenant and private memory	Corrections and patterns persist across sessions
Observability	Structured events	Every action emits an event inside the runtime
Governance	Policy-enforced human-in-the-loop	Human judgment is required at configured checkpoints
Routing	BYOK and Prism model routing	Providers span Anthropic, OpenAI, Bedrock, Vertex, and open-source models
Lifecycle wiring	Configuration set once across build, tests, review, and deploy	Agents do not need to be rewired into each stage

Augment Cosmos's Context Engine processes entire codebases across 400,000+ files through semantic dependency graph analysis.

Some teams, including Stripe, Ramp, and Uber, are building this kind of system themselves. The managed platform pattern consolidates memory, observability, governance, routing, and lifecycle wiring into a single runtime, rather than having platform engineers maintain those layers separately. The trade-off is real: teams give up some low-level control in exchange for not maintaining the operational layer.

Matching Each Option to the Right Team

Framework and platform fit depend on what the team is actually building and which production constraints it is willing to own. The table below maps common team situations to the option that fits best and explains the reasoning behind each recommendation.

Team situation	Best fit	Why
Custom control flow; the agent is the product	LangGraph	Direct control over node transitions, conditional edges, and durability modes
Role-based collaboration fits; Enterprise governance suffices	CrewAI	Sequential and hierarchical crews map cleanly to collaborative tasks
Distributed, event-driven research; succession risk is acceptable	AutoGen	Actor-model Core API supports scalable pub/sub agent systems
Regulated data cannot leave the organization's environment	DIY framework	Managed platforms run on the vendor's infrastructure
Maintaining more scaffolding than agent logic across a team	Managed cloud agents platform	Memory, observability, governance, and routing come built in
Agents needed across build, tests, review, and deploy	Managed cloud agents platform	Configured once across the software development lifecycle

Making the Call: Framework or Platform

Agent-platform migration makes sense when framework scaffolding consumes sprint time that would otherwise go to agent behavior. If three or more of the following consume that sprint time, the threshold has been crossed: memory stores take more work than workflow logic; trace pipelines must span complete agent runs; policy layers must prove control and auditability; gateways must handle providers, failover, or rate limits; recovery paths must keep sessions usable after crashes or worker changes.
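The three-or-more rule is simple enough to write down as a check a team could run during sprint review. The signal names and the `crossed_threshold` function below are hypothetical labels for the five items listed above, not part of any tool.

```python
from typing import Set

# The five scaffolding areas named in the paragraph above, as hypothetical labels.
SIGNALS = (
    "memory_stores",
    "trace_pipelines",
    "policy_layers",
    "model_gateways",
    "recovery_paths",
)


def crossed_threshold(consuming_sprint_time: Set[str], minimum: int = 3) -> bool:
    """Return True when at least `minimum` scaffolding areas consume sprint time."""
    unknown = consuming_sprint_time - set(SIGNALS)
    if unknown:
        raise ValueError(f"unknown signals: {sorted(unknown)}")
    return len(consuming_sprint_time) >= minimum


evaluate_platform = crossed_threshold(
    {"memory_stores", "trace_pipelines", "policy_layers"}
)
stay_on_framework = not crossed_threshold({"memory_stores"})
```

The default of three mirrors the article's threshold; a team with stricter control requirements might reasonably raise it.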

For teams already funding that platform work, the move is to decide whether low-level control still outweighs delivery cost. For auditable multi-session workflows, Cosmos Sessions carry corrections forward under governance, with persistent state logs that bring context into later sessions.

Frequently Asked Questions About Agent Frameworks vs Platforms

Written by

Ani Galstian
