
Garry Tan's gstack hits 89.7K stars: what developers should know

May 5, 2026
Ani Galstian

Three things worth knowing

  • Y Combinator CEO Garry Tan open-sourced his personal Claude Code setup as gstack, and it hit 89.7K GitHub stars in under two months.
  • It packages 23 opinionated skills that serve as CEO, Designer, Eng Manager, Release Manager, Doc Engineer, and QA roles, running as slash commands inside Claude Code and nine other AI coding agents.
  • The real bet here is that AI coding agents need process, not just prompts. gstack is the most concrete expression of that idea I've seen from someone shipping at this scale.

When a YC CEO open-sources his personal AI coding setup and it hits 89,700 GitHub stars in under two months, I pay attention. gstack is opinionated, workflow-heavy, and specific to one person's development philosophy. That kind of specificity usually limits adoption. Here, it drove it. Developers aren't starring this out of curiosity. They recognize the problem it's solving in their own work.

The repo packages 23 skills that serve as CEO, Designer, Eng Manager, Release Manager, Doc Engineer, and QA roles inside Claude Code and nine other AI coding agents. Tan claims the setup lets him run 10-15 sprints in parallel. The architecture behind it is worth understanding, regardless of whether those numbers hold in your environment.

[Screenshot: the garrytan/gstack GitHub repository showing 89.7K stars, 13.2K forks, and a directory listing that includes agents, browse, careful, and codex folders.]

What Happened

Garry Tan published gstack as a collection of Claude Code skills. The latest release is v1.26.3.0 (May 4, 2026), which added a /sync-gbrain skill and a native code-surface orchestrator. The repo has 260 commits across 237 branches and 49 contributors. Most commits are co-authored by Claude Opus 4.7, which is worth noting: this project was built using the same workflow it's trying to teach.

It runs on Bun and TypeScript, installs via a single git clone, and supports Claude Code, OpenAI Codex CLI, Cursor, and seven other runtimes.

Most repos that go viral are broad. gstack is a single developer's complete development philosophy encoded as 23 skills. 89.7K stars suggests a lot of developers share that philosophy, or at least want to.

Key Features

  • 23 slash-command skills in a sprint structure: Organized around a full development lifecycle, from /office-hours for product thinking and /plan-ceo-review for strategic scope to /plan-eng-review for architecture, /review for code quality, /qa for browser-based testing, and /ship for release management. Each skill feeds output into the next. You're running a pipeline, not prompting in isolation.
  • Real browser automation: A built-in Playwright-powered browse server ($B commands) runs at roughly 100ms per command, with anti-bot stealth, cookie import from Chrome, Arc, and Brave, and a Chrome extension sidebar (see the first sketch after this list). This is the feature I'd evaluate first if your team does any frontend work.
  • Multi-agent coordination via /pair-agent: Lets Claude Code, OpenClaw, Codex, or any curl-capable agent share a browser session with scoped tokens, tab isolation, and rate limiting.
  • ML prompt-injection defense: A layered classifier using TestSavantAI BERT-small, an optional DeBERTa-v3 ensemble, a Haiku transcript check, and canary-token scans of page content and tool outputs before they reach the agent (see the second sketch after this list). Most teams haven't thought seriously about prompt injection at the agent level. gstack has.
  • Cross-platform support: Full Mac and Linux coverage with a curated Windows CI lane covering ~50% of the full test suite. Skills are installed to host-specific paths for each supported agent.
  • Persistent memory via GBrain: /setup-gbrain and /sync-gbrain handle code indexing, cross-machine sync, and per-repo trust policies covering read-write, read-only, and deny. This is what makes gstack feel like infrastructure rather than a config pack.
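
To make the browse server concrete, here's a minimal sketch of the kind of Playwright session it wraps. The URL, cookie values, and selectors below are hypothetical; gstack's own $B commands layer stealth, rate limiting, and tab isolation on top of primitives like these.

```typescript
// A raw Playwright session approximating what gstack's browse server
// automates. Everything site-specific here is made up for illustration.
import { chromium } from "playwright";

const browser = await chromium.launch({ headless: false });
const context = await browser.newContext();

// gstack imports cookies from Chrome, Arc, or Brave automatically; a
// hand-rolled session would inject them like this.
await context.addCookies([
  { name: "session", value: "<copied-from-browser>", domain: "staging.example.com", path: "/" },
]);

const page = await context.newPage();
await page.goto("https://staging.example.com");
await page.getByRole("button", { name: "Create endpoint" }).click();
await browser.close();
```

And a sketch of the canary-token half of the injection defense, under the assumption that a secret marker is planted in the agent's private context and any tool output that echoes it back gets blocked, since a leak means page content is steering the agent. The hook name is mine, not gstack's API.

```typescript
// One way canary-token scanning can work: a leaked canary in tool output
// indicates injected instructions or context exfiltration, so block it.
import { randomBytes } from "node:crypto";

const canary = `canary-${randomBytes(16).toString("hex")}`;

function scanToolOutput(output: string): string {
  if (output.includes(canary)) {
    throw new Error("possible prompt injection: canary token leaked into tool output");
  }
  return output;
}
```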

Why It Matters

Most teams I see using AI coding agents run them like autocomplete. One prompt, one response, no structure. That works for small tasks. It falls apart on anything multi-step.

gstack's answer is process. /office-hours produces a design doc that /plan-eng-review reads. /review catches bugs that /ship verifies are fixed before the PR opens. Each stage has a defined input and output. According to the README, Tan runs 10-15 parallel sprints using Conductor workspaces with gstack providing the guardrails.
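
As a sketch of that handoff discipline, each stage can be thought of as declaring what it reads and what it writes. The artifact names below are illustrative, not gstack's actual file layout.

```typescript
// Each stage consumes the previous stage's artifact and produces its own,
// so a failed handoff is visible instead of silent. Names are hypothetical.
interface Stage {
  command: string;   // slash command that runs the stage
  reads: string[];   // artifacts produced by earlier stages
  writes: string;    // artifact the next stage consumes
}

const pipeline: Stage[] = [
  { command: "/office-hours",    reads: [],                     writes: "design-doc.md" },
  { command: "/plan-eng-review", reads: ["design-doc.md"],      writes: "eng-plan.md" },
  { command: "/review",          reads: ["eng-plan.md"],        writes: "review-findings.md" },
  { command: "/ship",            reads: ["review-findings.md"], writes: "pr-summary.md" },
];
```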

The model-specific overlays for Opus 4.7 behavior differences and the workspace-aware version allocation are what I find most instructive. That's what a mature production AI setup looks like: a structured pipeline with defined handoffs and specific decisions about which model handles which stage.
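
gstack's actual overlay format isn't documented here, so treat the following as a guess at the shape of the idea: the same skill, with per-model steering.

```typescript
// Hypothetical per-model overlay: identical skill, different guidance,
// because models have different failure modes. Not gstack's real schema.
const overlays: Record<string, { promptSuffix: string }> = {
  "claude-opus-4.7": { promptSuffix: "Prefer small, reviewable diffs." },
  "codex-cli":       { promptSuffix: "Write the test matrix before code." },
};
```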

Example Use Case

A TypeScript team shipping a new API endpoint runs the full gstack pipeline in a single session. They start with /office-hours to pressure-test the feature design, then run /plan-eng-review to lock in the architecture and generate a test matrix. After building, /review catches bugs and auto-fixes obvious issues. /qa https://staging.example.com opens a real Chromium browser, clicks through the new endpoint's UI, and verifies behavior. /ship syncs main, runs tests, audits coverage, and opens a PR.

Every step produces a persistent artifact: design docs, test plans, coverage audits. The next session picks up where the last one left off. That's the part I'd demo to a skeptic.
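
A rough sketch of why that persistence matters, with a hypothetical artifact directory standing in for wherever gstack actually writes its outputs:

```typescript
// Rehydrate a new session from the artifacts earlier stages wrote, so
// /ship starts from the /review findings instead of from scratch.
// The directory path and file names are assumptions for illustration.
import { readdir, readFile } from "node:fs/promises";

const artifactDir = ".gstack/artifacts";
const names = await readdir(artifactDir); // e.g. ["design-doc.md", "test-plan.md"]

const artifacts = await Promise.all(
  names.map(async (name) => ({
    name,
    body: await readFile(`${artifactDir}/${name}`, "utf8"),
  })),
);
// Prepend these to the next stage's prompt as prior context.
```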

Competitive Context

gstack is built around Claude Code as its primary host environment. The 23 skills are written as Markdown templates (SKILL.md files) that Claude interprets at runtime, which keeps the surface area intentionally small. The project does one thing well rather than trying to cover every runtime equally, and the quality shows.

The /codex skill uses OpenAI's Codex CLI as a second opinion reviewer. I find that more interesting than most multi-model features I've seen. Treating model disagreement as a quality signal, rather than a configuration problem, is a smarter approach.
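
A minimal sketch of that pattern, with placeholder reviewer functions standing in for actual Claude and Codex CLI calls:

```typescript
// Disagreement as a quality signal: unanimous verdicts pass through,
// any split escalates to a human rather than being silently reconciled.
type Verdict = "approve" | "request-changes";

async function secondOpinion(
  diff: string,
  reviewers: Array<(diff: string) => Promise<Verdict>>, // e.g. Claude, Codex
): Promise<Verdict | "escalate"> {
  const verdicts = await Promise.all(reviewers.map((review) => review(diff)));
  return new Set(verdicts).size === 1 ? verdicts[0] : "escalate";
}
```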

Compared to ECC and GSD, gstack is the most opinionated. ECC standardizes agent configuration across tools. GSD manages context and spec-driven planning. gstack encodes a complete development philosophy. Teams aligned with Tan's approach get a substantial head start. Teams with different workflows will want to fork it and adapt rather than adopt it wholesale.

My Take

Study the repo before you install it. The 23 skills and the sprint structure are worth reading, even if you build your own version. The prompt injection defense and the GBrain memory layer are decisions most teams haven't made yet, and gstack shows one concrete way to make them.

At 89.7K stars and 13.2K forks, the community weight is real. Fork it, strip out what doesn't fit your stack, and keep the structure that does.

gstack proves process beats prompts. Cosmos is built on that same idea for entire engineering teams.

See Cosmos in action

Free tier available · VS Code extension · Takes 2 minutes

Written by

Ani Galstian

Technical Writer

Ani writes about enterprise-scale AI coding tool evaluation, agentic development security, and the operational patterns that make AI agents reliable in production. His guides cover topics like AGENTS.md context files, spec-as-source-of-truth workflows, and how engineering teams should assess AI coding tools across dimensions like auditability and security compliance.
