Introducing Augment Prism: model routing to reduce cost and maintain quality
Today we're shipping Prism: a new option in the Augment model picker that efficiently routes each turn to the model that fits the work.
On our internal multi-turn coding benchmark, Prism matches the best individual model on quality at 20–30% lower cost per task than frontier models.
Teams sending 10,000 user messages a month can expect to save on the order of $20,000 a month in token spend, at similar or better quality.
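As a back-of-envelope check on how those numbers fit together: the per-message cost and the midpoint savings rate below are illustrative assumptions, not Augment pricing.

```python
# Illustrative arithmetic only. Assumptions (not from Augment):
# a frontier model costs ~$8.00 per user message on an agentic
# coding task, and Prism saves 25% (midpoint of the 20-30% range).
frontier_cost_per_message = 8.00   # assumed USD per message
savings_rate = 0.25                # midpoint of stated 20-30%
messages_per_month = 10_000

monthly_savings = frontier_cost_per_message * savings_rate * messages_per_month
print(f"${monthly_savings:,.0f}")  # → $20,000
```

Under those assumptions the arithmetic lands on the $20,000/month figure; teams with lighter per-message token usage would see proportionally less.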
We know that developers and teams have strong preferences for particular model families. With Prism, you stay in the family you like, at lower cost.
→ Prism (GPT + Kimi) targets GPT 5.5
→ Prism (Claude + Gemini) targets Opus 4.7
Switching models mid-conversation has a real cost: the new model can't reuse the previous model's prompt cache, so the next turn re-reads the full context at uncached rates. Prism's job is to switch only when the expected win from a different model exceeds that cache-eviction penalty.
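The decision rule above can be sketched as a simple per-turn comparison. This is a minimal illustration of the tradeoff, not Augment's implementation; every name and number here is hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    """Hypothetical per-turn estimate for one model in the pool."""
    name: str
    expected_score: float   # predicted quality on this turn, 0..1
    cost_per_turn: float    # estimated USD to run this turn

def should_switch(current: Candidate, best_other: Candidate,
                  cache_eviction_cost: float,
                  usd_per_quality_point: float = 10.0) -> bool:
    # Expected win = quality gain priced in dollars, plus any
    # per-turn cost saving. Switch only when that win exceeds the
    # one-time cost of losing the warm prompt cache.
    quality_win = (best_other.expected_score - current.expected_score) * usd_per_quality_point
    cost_win = current.cost_per_turn - best_other.cost_per_turn
    return quality_win + cost_win > cache_eviction_cost

# A slightly better, slightly cheaper alternative still has to clear
# the cache-eviction bar before a switch is worth it.
current = Candidate("model_a", expected_score=0.80, cost_per_turn=0.40)
other = Candidate("model_b", expected_score=0.86, cost_per_turn=0.35)
print(should_switch(current, other, cache_eviction_cost=0.50))  # → True
print(should_switch(current, other, cache_eviction_cost=1.00))  # → False
```

The interesting property is the asymmetry: small quality wins don't trigger a switch, because the eviction cost acts as hysteresis that keeps a session on one model unless the gap is decisive.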
Augment can do this because we are model agnostic. No single model wins every task, so we give our customers access to all the industry leaders. That's the foundation Prism is built on: a pool of frontier models, with the routing decision made per turn rather than at the start of a session.
Prism is in the picker today: VS Code, JetBrains, CLI (/model), and web.
Billing rolls up under a single Prism line item. The underlying model that handled any given turn isn't surfaced.