Three things worth knowing
- A single GitHub repo has compiled the internal system prompts of 28+ major AI coding tools, including Cursor, Windsurf, Claude Code, Augment Code, and Devin AI, and it now has 134K stars.
- For the first time, developers can read the actual instructions each tool sends to its model, rather than relying on marketing copy or guesswork. In practice, this changes how you evaluate and build with these tools.
- This also raises a real security question for AI startups: if your prompts are this extractable, how are you protecting them?

There's a GitHub repository I keep coming back to whenever someone asks me how to evaluate AI coding tools. It collects the raw system prompts of nearly every major AI coding assistant, and it just crossed 134,000 stars and 33,700 forks.
Maintained by developer Lucas Valbuena, the repo exposes how tools like Cursor, Windsurf, Claude Code, Augment Code, and Devin AI actually instruct their underlying models. If you've ever wondered why one tool refuses a certain request or formats output differently from another, this is where you go to find out. It's the closest thing to a public teardown of the competition that I've seen, and I think most developers are sleeping on it.
What Happened
The repository system-prompts-and-models-of-ai-tools has been accumulating extracted system prompts since early 2025. As of its latest update on March 28, 2026, it has 489 commits across 28 contributors and covers more than 28 distinct AI tools.
What makes this more useful than most collections of its kind is the depth. The JSON tool schemas are in there too, not just the prompt text, and the schemas are the part that actually shows you what a tool has been given permission to do. That combination is rare, and it's what makes comparison meaningful rather than superficial.
What stood out to me is that it's still being actively updated. Recent commits include changes to v0's prompt (March 8, 2026) and Anthropic's Claude Sonnet 4.6 prompt (March 4, 2026). That cadence matters; a snapshot from a year ago would tell you very little about how these tools behave today.
Key Features
- Full prompt text for 28+ tools. Each tool has its own directory with raw system prompts, sometimes across multiple versions. Cursor includes an "Agent Prompt 2.0," which alone is worth reading if you're building anything agent-based.
- Tool and function call schemas. Several entries go beyond the prompt text to include the JSON definitions of internal tools. Windsurf's entries run through "Tools Wave 11," and Augment Code includes GPT-5 tool definitions. I find the schema files more revealing than the prompts themselves; they show you what capabilities a vendor actually prioritized shipping.
- Version history. With 489 commits, you can diff the prompts to see how they have changed over time. This is where it gets interesting: watching how vendors quietly adjust their instructions tells you a lot about where they're struggling, iterating, or quietly rolling back decisions.
- Full coverage across tool types. IDE agents (Cursor, Windsurf, VSCode Agent, Xcode), autonomous agents (Devin AI, Manus), and app builders (Lovable, Replit, v0) are all represented. That breadth is what makes it a real reference, not just a curiosity.
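To make the schema point concrete, here is a rough sketch of the shape those tool-definition JSON files tend to take, written as a TypeScript type plus one hypothetical entry. The field names follow the JSON Schema convention most providers use; the `edit_file` tool itself is invented for illustration and not taken from any specific entry in the repo.

```typescript
// Sketch of the shape a function-call/tool schema typically takes.
// Field names mirror the common JSON Schema convention; the example
// tool below is hypothetical, not copied from any vendor's files.
interface ToolDefinition {
  name: string;        // identifier the model emits when calling the tool
  description: string; // tells the model when the tool is appropriate
  parameters: {
    type: "object";
    properties: Record<string, { type: string; description?: string }>;
    required?: string[];
  };
}

// Hypothetical example: a file-editing tool an IDE agent might expose.
const editFile: ToolDefinition = {
  name: "edit_file",
  description: "Apply a text replacement to a file in the workspace.",
  parameters: {
    type: "object",
    properties: {
      path: { type: "string", description: "Workspace-relative file path" },
      oldText: { type: "string", description: "Exact text to replace" },
      newText: { type: "string", description: "Replacement text" },
    },
    required: ["path", "oldText", "newText"],
  },
};

console.log(editFile.name); // "edit_file"
```

Reading the real schemas with this shape in mind makes it much easier to spot what a vendor prioritized: how many tools are exposed, how tightly their parameters are constrained, and how much the descriptions coach the model.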
Why It Matters
System prompts are the hidden layer that actually shapes how an AI coding tool behaves. They define what the model prioritizes, what it refuses, how it formats output, and which tools it can call. Before repos like this surfaced, there was no real way to compare that across products. You were just taking vendors at their word.
The real shift here is that tool evaluation no longer has to be purely experiential. You don't have to spend weeks with each product to form a view; you can read what each tool actually sends to its model and make a more informed call before you commit.
- You can see whether a tool defaults to cautious refusals or aggressive code generation, whether it has file system access, and how it handles multi-step tasks.
- For anyone building AI-powered developer tools, these prompts are a reference architecture. Real patterns for tool-use schemas, context management, and agent behavior that have already shipped to millions of users.
- The version history layer makes it a living record, not a static snapshot, which means it gets more useful over time, not less.
There's also a prompt-security angle I'm seeing come up more often in conversations. The repo maintainer flags that exposed prompts can become attack surfaces and links to a service called ZeroLeaks that aims to identify extraction risks. A year ago, this felt theoretical. I don't think it does anymore, and if you're an AI startup that hasn't thought about prompt extraction, this repo is a good reminder to start.
Example Use Case
Say you're building a VS Code extension that uses Claude to assist with TypeScript refactoring. You need to decide how to structure your system prompt: what context to include, how to handle file references, and what tool calls to expose. This is exactly the kind of problem I'd point someone to this repo for.
Instead of starting from scratch, open the Cursor Prompts and VSCode Agent directories side by side. See how Cursor's Agent Prompt 2.0 structures tool definitions versus how the VSCode Agent approaches the same problem. Borrow the patterns that fit, skip the ones you've already seen cause issues in tools you've tested yourself.
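As a starting point, prompt assembly for that extension might look something like the sketch below. Everything here, the wording, the file-delimiting convention, the diff-output instruction, is an assumption for illustration; the repo's Cursor and VSCode Agent directories show what the production versions actually do.

```typescript
// Minimal sketch of assembling a system prompt for a TypeScript
// refactoring assistant. Wording and structure are assumptions for
// illustration, not copied from any tool's actual prompt files.
interface FileContext {
  path: string;
  contents: string;
}

function buildSystemPrompt(files: FileContext[]): string {
  const header = [
    "You are a TypeScript refactoring assistant running inside a VS Code extension.",
    "Only propose edits to the files provided below.",
    "Output each edit as a unified diff, one diff per file.",
  ].join("\n");

  // A pattern that recurs across extracted prompts: wrap each file in a
  // clearly delimited block so the model can cite paths unambiguously.
  const context = files
    .map((f) => `<file path="${f.path}">\n${f.contents}\n</file>`)
    .join("\n\n");

  return `${header}\n\n${context}`;
}

const prompt = buildSystemPrompt([
  { path: "src/math.ts", contents: "export const add = (a: number, b: number) => a + b;" },
]);
console.log(prompt.includes('<file path="src/math.ts">')); // true
```

The interesting comparisons come after a skeleton like this exists: how much context the shipped tools include per file, whether they truncate large files, and how they instruct the model to reference line ranges.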

What this makes easier is skipping the trial-and-error phase that eats up most of the time in early prompt engineering. You're not guessing at best practices, you're reading what's already working at scale.
Competitive Context
Reading these prompts side by side, a few things stand out that don't show up in any product comparison chart.
Cursor and Windsurf both ship with detailed agent prompts and extensive tool schemas. Windsurf's collection, running through 11 "waves" of tool definitions, suggests a team that has been aggressively iterating on its function-calling surface, likely in response to real failure modes they were seeing in production. Cursor's second-generation agent prompt points to a meaningful rewrite, not just incremental tuning. That kind of structural change usually signals a core UX problem they were trying to fix.
Augment Code's inclusion of a GPT-5 tools JSON file is the kind of detail I wouldn't have caught without this repo. It confirms multi-model provider support in a way the product page doesn't surface directly, which likely means more flexibility in how Augment routes tasks based on model availability or cost.
Devin AI's DeepWiki prompt points to a knowledge-retrieval layer beyond basic code generation. This suggests Devin is investing more in structured context than in raw generation, which makes sense for an autonomous agent that needs to reason across large codebases, not just complete single-file tasks.
Claude shows up both as a standalone model (with Sonnet 4.6 entries) and as the backend for several other tools in the collection. That's consistent with what I'm seeing in the broader ecosystem. Claude is increasingly the default inference layer that other products build on, rather than a standalone product.
My Take
If you're not reading these prompts, you're missing the clearest available window into how these tools actually work. Marketing pages tell you what a tool can do. System prompts tell you how it's been instructed to do it, and those are very different things.
For developers choosing among Cursor, Windsurf, Augment Code, or Devin, this repo offers a signal that no feature comparison chart can match. For teams building AI dev tools, it's the reference library I wish existed when I was starting out. And for AI startups still treating their prompts as proprietary secrets: they're probably not as protected as you think, and this repo is evidence of that.
Curious how Augment Code's prompts compare? See for yourself.
Free tier available · VS Code extension · Takes 2 minutes
Written by

Ani Galstian
Developer Evangelist