Skip to content
Install
Back to Blog

We don't need more agents. We need a better system. So we built one.

May 4, 2026
Vinay Perneti
Chris Kelly
Vinay Perneti, Chris Kelly
We don't need more agents. We need a better system. So we built one.

Today we're putting Augment Cosmos into public preview. Cosmos is the operating system for agentic software development: agents that run anywhere (your env, our cloud) and work across your SDLC, humans steering where judgment matters. And, Cosmos is built for teams: shared context and memory, self-improving agent loops, connections to all of your tools, multi-model by default. You can try it now.

Architecture diagram of Cosmos in three horizontal sections. Top: a build-with-an-agent flow showing how teams create custom Experts in three steps. Step 1, describe it ("Build me a security scanner for our APIs that runs weekly"). Step 2, an agent builds the Expert and wires up dependencies. Step 3, the Expert lands in the registry, available to the whole team. Middle: the Cosmos block, labeled "the operating system for agentic software development," containing three layers. Apps and Reference Experts (Deep Code Review, PR Author, E2E Testing, Incident Response). System Services (Expert Registry, Knowledge Base, Human-in-the-Loop, Learning Flywheel, Integrations). Cosmos Core (Agent Runtime, Context Engine, Event Bus, Organizational Knowledge Layer). Bottom: an Infrastructure band listing Laptops, Dev-VMs, Our Cloud, and Your Cloud (coming soon).

An operating system for agentic software development.

The more useful thing for us to share isn't what shipped. It's why we needed to build it.

Three core beliefs shape why we built Cosmos.

1. You cannot transform an organization without a system.

Every eng leader we talk to is experiencing the same disconnect. 99% of their engineers have adopted agents. More code than ever is being written. But that productivity boost hasn’t translated to the org.

We wrote about this a few months ago in the hardest part about going AI-native. It’s not enough to have new technology. You need systems and people to adapt as well.

When individuals adopt agents without a unifying system, four things compound.

  • Setups fragment: every engineer builds their own workflow with no shared patterns.
  • Expertise gets trapped: the engineer who figured out the great prompt for your billing service has it in their config and nowhere else.
  • There's no quality signal: no way to know which agent setups actually work across teams.
  • The review bottleneck gets worse, not better: humans get pulled in only at the final PR, which is exactly where catching problems costs the most.

2. The software factory of the future will be fully agentic.

The Opus 4.5 moment in November was the first time a large portion of serious software engineers agreed it no longer makes sense to write most code by hand.

At the rate at which models are improving, we'll have models that are 10x more capable than Opus 4.5 by the end of this year! That’s an astounding reality: a near future with models 10x more capable than the model that changed the way people think about coding. The way we build software in that world will be very different than how we've been doing it for decades.

We believe the future is small teams of people working with large teams of agents: agentic engineering with agents across the SDLC. Until Cosmos, the tooling for that way of working did not exist.

3. Your agentic SDLC needs to be model-agnostic.

Recent jumps (Opus 4.7, GPT‑5.5) show how fast the rate of model improvement is. Coding is close to becoming a solved problem, yet we expect model development to continue, and to focus on bigger and harder tasks. That means the state of the art models will soon be overpowered for software tasks. Defaulting to a frontier model for everything will be wasteful and expensive. The rational choice is to find the best combination of cost/quality models to get the job done.

Lock in to one provider and you inherit their pricing, their roadmap, and the migration tax you'll pay when you need to diversify. Staying model‑agnostic avoids that trap.

That's the direction we're building toward with Prism: model routing that already delivers ~20–30% token savings without giving up quality.

Cosmos turns agent interruption into flow

Individual productivity gains from agents are real. Engineers are shipping more code than ever. But the organization doesn't feel 10x more productive, because the work is still structured around the human breaking in at every step. Synthesizing feedback. Triaging in meetings and threads. Researching the problem. Writing the spec. Coding with an agent. Testing with an agent. Reviewing the PR. Shipping and monitoring. That's eight interruptions across one product improvement loop.

The next phase isn't humans doing with agents assisting. It's agents doing with humans steering.

With Cosmos, eight interruptions become three checkpoints.

Post image

Cosmos turns agent interruption into flow

Prioritization. Instead of starting your day in Slack reading the feedback channel, an agent has been there the whole time: monitoring, aggregating signal, finding patterns. By the time you're at your desk, it's proposing the day's priorities. You review.

Two kinds of feedback matter here. The first is obvious: you can correct the priorities themselves. A Slack reply is enough. But the second kind is more important: you can correct the mental model. "Here's how prioritization should work for this kind of feedback going forward." The agent remembers, so tomorrow's prioritization comes back better.

Human leverage shifts from doing the prioritization to teaching the priority function. Once the list is confirmed, Cosmos can launch parallel agents to open PRs or take a stab at the work. Specs come back to you for review, because we believe the future is outcome-oriented, and the human's job is to look at the spec before agents independently go write, test, and review the code.

Contextual understanding via spec and intent review. Code review was an aha moment for us. Every code review tool out there is built on an assumption that's about to be wrong: that a human is reading the code. So they optimize for precision: surface the highest-importance issues, keep the noise down, respect the reader's time.

But if the reviewer is an agent, you don't want precision. You want recall. You want to catch every bug possible. That's a fundamentally different product. We call it deep code review, and it only works because you're not asking a human to read the output.

So how does the human stay in the loop on what's changing? After a lot of experimentation, we landed on an experience where a human works with an agent and a PR to understand how the codebase is evolving: the experience surfaces the places where key assumptions are shifting, the things the human should actually pay attention to. That's how humans maintain an understanding of the codebase, which, as we argued in a recent post, turns out to be essential to shipping with confidence.

Cosmos gets smarter the more you use it

As we were building the primitives for our cloud agents platform, something far more powerful emerged. We realized what we really wanted were specialized agents that remember and get better with feedback, not just agents that can execute a task.

By designing agents differently, giving them shared memory, and enabling the system to update itself, we unlocked agents that worked much more like teammates. The clearest example at Augment is Milo, our tester agent. We tried to load Milo up front with all the context about how testing is done at Augment. Everything we knew, in their context window. It failed. The approach that worked was different. We scoped Milo to be the best testing expert at Augment, and we tuned for continuous learning and memory. When Milo ran a test and stumbled, an engineer on Slack would jump in to coach them. Milo is designed to distill important information from conversations and store it.

The power of specialized agents is that they get better with feedback. Over time they become the best agents for your environment, on a well-scoped task, in a way no off-the-shelf agent ever can be.

Public preview

Cosmos is in public preview today for MAX plan users. It's early. There are rough edges. We're sharing it because we'd rather learn in public with a few teams than polish in private.

Individual adoption is not organizational transformation. If you've been reading along, if you're an eng leader watching individual adoption run ahead of org impact, watching expertise get trapped, watching the review queue grow, we'd love to put Cosmos in front of you and learn together.

Written by

Vinay Perneti

Vinay Perneti

VP of Engineering

As VP of Engineering, Vinay supports product, research, and engineering teams building AI agents that truly understand large, complex codebases. Before joining Augment, Vinay led product and platform organizations at Meta and Pure Storage. He’s drawn to problems that live at the intersection of technology and people, like how teams evolve, how AI reshapes the craft of software engineering, and what it takes to build things that evoke delight.

Chris Kelly

Chris Kelly

Product Lead, Cosmos

Chris leads product initiatives across Augment to make building software better for every developer. Chris has been making developers happier and more productive for 15 years at innovative companies like New Relic, GitHub, Salesforce, and FireHydrant. You can find him at @amateurhuman everywhere on the internet.

Get Started

Give your codebase the agents it deserves

Install Augment to get started. Works with codebases of any size, from side projects to enterprise monorepos.