
Garry Tan open-sources gstack: what developers should know

Apr 7, 2026
Ani Galstian

Three things worth knowing

  • Y Combinator CEO Garry Tan open-sourced his personal Claude Code configuration as a toolkit called gstack, and it hit 66K GitHub stars within weeks.
  • It packages 23 specialist skills and 8 power tools into slash commands that run inside Claude Code and seven other AI coding agents.
  • This is a structured bet that opinionated prompts, not custom tooling, are the right abstraction layer for AI-assisted development.

When a YC CEO open-sources his personal AI coding setup and it hits 66,000 GitHub stars in a matter of weeks, I pay attention. That's what happened with gstack, Garry Tan's MIT-licensed toolkit that turns Claude Code into what he describes as "a virtual engineering team."

The repo packages 23 specialist skills that fill CEO, Designer, Eng Manager, Release Manager, Doc Engineer, and QA roles, all running as slash commands inside Claude Code and seven other AI coding agents. Tan claims the setup let him ship 600,000+ lines of production code in 60 days while running YC full-time. I don't take that claim at face value, but the architecture behind it is worth understanding regardless.

[Screenshot: the garrytan/gstack GitHub repository showing 66K stars, 9.1k forks, 35 contributors, and a directory listing of skill folders including agents, codex, design, and careful.]

What Happened

Tan released gstack as a public MIT-licensed repository, largely co-authored with Claude Opus 4.6, as evidenced by the commit history. As of April 6, 2026, the project is at v0.15.14.0 with 204 commits, 33 contributors, and 9,100 forks.

What I'd highlight here is the pace of adoption relative to how specific this tool is. Most repos that go viral are broad and approachable. gstack is opinionated and workflow-heavy, which tells me the 66K stars reflect a real pain point: developers are tired of rebuilding the same process scaffolding around their AI tools on every project.

The way the repo was built is a signal in itself: a toolkit for AI-assisted development, produced largely by the workflow it packages. That's a meaningful data point about what this kind of workflow actually looks like in practice.

Key Features

  • 23 specialist skills in a sprint structure. The workflow follows a think, plan, build, review, test, ship, and reflect sequence. Each skill feeds output into the next, so you're not just prompting in isolation.
  • Real browser testing via /qa. Launches Playwright-based Chromium, clicks through flows, finds bugs, generates regression tests, and commits fixes with atomic commits. This is the feature I'd evaluate first if your team is doing any frontend work.
  • Multi-agent support across 8 hosts. Works with Claude Code, OpenAI Codex CLI, Cursor, OpenCode, Factory Droid, Slate, Kiro, and OpenClaw. Adding a new host requires one TypeScript config file, which is a low bar.
  • Team install mode. Running ./setup --team installs a SessionStart hook that auto-updates gstack at the start of each session, throttled to once per hour. No vendored files end up in your repos.
  • Cross-model review with /codex. Gets an independent code review from OpenAI's Codex CLI, then generates a cross-model analysis showing overlapping and unique findings. The multi-model angle is underrated here.
  • Safety guardrails. /careful warns before destructive commands, /freeze locks edits to one directory, and /guard activates both. /investigate auto-freezes to the module being debugged.
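The once-per-hour throttle on the team auto-update is easy to picture in code. The sketch below is illustrative only; the function and constant names are my own, not gstack's actual hook implementation.

```typescript
// Hypothetical sketch of a once-per-hour update throttle, the kind of check
// a SessionStart hook could run before pulling new skills.
const ONE_HOUR_MS = 60 * 60 * 1000;

function shouldUpdate(lastUpdateMs: number | null, nowMs: number): boolean {
  // First session ever: no recorded update, so always update.
  if (lastUpdateMs === null) return true;
  // Otherwise only update if at least an hour has passed.
  return nowMs - lastUpdateMs >= ONE_HOUR_MS;
}

// A session starting 30 minutes after the last update is skipped.
console.log(shouldUpdate(0, 30 * 60 * 1000)); // false
console.log(shouldUpdate(0, 2 * ONE_HOUR_MS)); // true
```

The point of the throttle is that every developer stays current without the hook hitting the network on every session start.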

Why It Matters

The core bet gstack makes is that structured prompts, not custom tooling, are the right abstraction layer for AI-assisted development. Every skill is a Markdown file. There's no proprietary runtime. The entire system runs on Claude Code's existing skill mechanism, which means there's nothing to maintain beyond the prompts themselves.
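Because each skill is just a Markdown file picked up by the host's skill mechanism, the "runtime" amounts to a naming convention. A minimal sketch of that idea, with names I've invented for illustration (gstack's real discovery is handled by Claude Code itself):

```typescript
// Illustrative only: a skill pack built on plain Markdown files needs little
// more than a mapping from file name to slash command.
function slashCommand(skillFile: string): string {
  // "qa.md" -> "/qa"; "plan-eng-review.md" -> "/plan-eng-review"
  return "/" + skillFile.replace(/\.md$/, "");
}

const skills = ["qa.md", "ship.md", "plan-eng-review.md"];
console.log(skills.map(slashCommand)); // ["/qa", "/ship", "/plan-eng-review"]
```

That simplicity is the whole maintenance story: updating a skill means editing a prompt file, not shipping a new binary.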

For teams already using Claude Code, that's a meaningful starting point. You're not adopting new infrastructure, you're adopting a process.

The multi-host architecture is the part I keep coming back to, though. The declarative HostConfig system means the same skill templates generate output for Claude Code, Codex, Cursor, and seven other agents from a single source. As teams increasingly use multiple AI coding tools across a codebase, having a single configuration layer that spans them all is a real practical win.
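To make the declarative idea concrete, here is a guess at what a per-host config could look like. Every field name, path, and function below is an assumption for illustration; gstack's actual HostConfig shape lives in the repo.

```typescript
// Hypothetical shape for a declarative host config: one record per agent,
// and the same skill templates rendered for each host from a single source.
interface HostConfig {
  name: string;          // e.g. "claude-code", "cursor"
  skillsDir: string;     // where this host discovers skill files (assumed path)
  commandPrefix: string; // how this host spells slash commands
}

function renderInstallPath(host: HostConfig, skill: string): string {
  return `${host.skillsDir}/${skill}.md`;
}

const claudeCode: HostConfig = {
  name: "claude-code",
  skillsDir: "~/.claude/skills", // illustrative location, not verified
  commandPrefix: "/",
};

console.log(renderInstallPath(claudeCode, "qa")); // "~/.claude/skills/qa.md"
```

Adding a ninth host would mean adding one more record like `claudeCode`, which matches the article's "one TypeScript config file" claim.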

The /autoplan command chains CEO review, design review, and engineering review into a single pipeline. The /ship command bootstraps test frameworks if none exist and enforces coverage audits. These are opinions encoded as prompts. Skip the ones that don't fit your team.

Example Use Case

A TypeScript team using Claude Code and Codex installs gstack globally with ./setup --team, then bootstraps their repo with gstack-team-init. Every developer's Claude Code session auto-updates gstack silently from that point on.

A developer runs /office-hours to describe a new API endpoint. It writes a design doc. /autoplan chains through CEO, design, and engineering reviews. After implementation, /review auto-fixes lint issues and flags a race condition. /qa opens a real browser, tests the endpoint through the UI, and commits a regression test. /ship syncs main, runs tests, and opens the PR.

This is the workflow I'd walk through with a team that's been asking, “How do we actually structure AI-assisted development?” The answer here is concrete enough to evaluate in an afternoon.

Competitive Context

gstack sits in a growing space of Claude Code skill packs and prompt libraries. What I find most distinct about it is its scope. Most alternatives focus on code generation or review. gstack covers the full sprint lifecycle from ideation through deployment, which is a different level of ambition.


The /codex skill is worth calling out specifically. It explicitly bridges Claude and OpenAI's Codex CLI, treating multi-model review as a first-class feature rather than a platform choice. That's a clear signal that Tan isn't optimizing for one vendor; he's optimizing for the best output regardless of where it comes from.

Compared to using Claude Code with custom CLAUDE.md instructions alone, the key difference is structured handoffs. The /office-hours design doc is read by /plan-eng-review, which writes a test plan picked up by /qa. That chain is what separates gstack from ad hoc prompting, and it's the hardest thing to replicate without a framework like this.
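The handoff chain described above can be sketched as a pipeline where each stage consumes the previous stage's artifact. The stage names mirror the skills mentioned in the article, but the types and functions are mine, not gstack code.

```typescript
// Minimal sketch of "structured handoffs": each stage reads the prior
// stage's artifact and produces the next, so nothing happens in isolation.
type Artifact = { stage: string; content: string };

function runStage(name: string, input: Artifact): Artifact {
  // In gstack each stage is a prompt run by the agent; here we just tag
  // the artifact so the chain of custody stays visible.
  return { stage: name, content: `${name}(${input.content})` };
}

const designDoc: Artifact = { stage: "office-hours", content: "api-endpoint" };
const testPlan = runStage("plan-eng-review", designDoc);
const qaReport = runStage("qa", testPlan);

console.log(qaReport.content); // "qa(plan-eng-review(api-endpoint))"
```

With ad hoc prompting, each of these steps starts from a blank context; the framework's value is that the artifacts flow forward automatically.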

My Take

gstack is a free, MIT-licensed skill pack that encodes one developer's opinionated workflow into 23 specialist skills and 8 power tools. For teams that want structured AI-assisted sprints rather than just code completion, this is worth an hour of evaluation time.

The 30-second install and /office-hours command are the fastest way to find out if the opinions match yours. At 66K stars and 9,100 forks, enough developers have found them worth adopting that I'd at least run the install before writing your own process scaffolding from scratch.

gstack shows what's possible when you structure AI coding agents. Intent is built to do that out of the box, with deep codebase context and enterprise-grade reliability.


Written by

Ani Galstian


Developer Evangelist
