
How to Build Your AGENTS.md (2026): The Context File That Makes AI Coding Agents Actually Work

Mar 31, 2026
Ani Galstian

AGENTS.md is a Markdown file placed at the root of a repository that provides AI coding agents with persistent, project-specific operational guidance: build commands, coding conventions, testing rules, and constraints the agent cannot infer from the codebase alone. Building an effective AGENTS.md requires writing only what agents cannot discover independently, structuring rules for machine parsing rather than human readability, and accepting a measurable inference-cost trade-off that pays off only when the file is human-curated rather than auto-generated.

TL;DR

The central question isn't whether to create an AGENTS.md. It's whether yours will improve agent performance or just add token overhead. The ETH Zurich study found that LLM-generated context files reduced task success rates by approximately 3% on average, increased inference costs by over 20%, and required 2-4 additional reasoning steps. Human-curated files provided a marginal 4% performance gain, but still incurred the same token overhead. This guide covers the structure, content decisions, and modular organization that determines which side of that line your file lands on, plus how tools like Intent address the maintenance problem manual files can't solve at scale.

See how Intent's living specs keep parallel agents aligned across cross-service refactors.

Build with Intent

Free tier available · VS Code extension · Takes 2 minutes


Why AI Coding Agents Need a Context File

Every coding agent, whether Claude Code, Cursor, GitHub Copilot, or Codex, starts each session blind to your project's specific conventions. The agent knows how to write Python or TypeScript in general, but it does not know that your team uses Pixi instead of pip, that your API client never throws exceptions, or that the vendor/ directory should never be modified.

Before AGENTS.md emerged as a standard, teams maintained a patchwork of tool-specific files to communicate these constraints. An Augment blog post describes the experience: "Open a typical project that's been through a few months of AI-assisted development. You'll find some combination of CLAUDE.md, .cursorrules, and copilot-instructions.md, AGENTS.md, and maybe a Gemini.md for good measure. Almost the same content in each one. Slowly drifting apart."

The spec repo puts it plainly: "Think of AGENTS.md as a README for agents: a dedicated, predictable place to provide context and instructions to help AI coding agents work on your project." OpenAI helped pioneer the AGENTS.md format for Codex, and in December 2025 it was donated to the Agentic AI Foundation (AAIF), a directed fund under the Linux Foundation, alongside Anthropic's donation of the Model Context Protocol (MCP) and Block's donation of Goose.

| File | Primary Audience | Purpose |
| --- | --- | --- |
| README.md | Human developers | Project overview, installation, usage |
| CONTRIBUTING.md | Human contributors | How to submit PRs, code style for humans |
| AGENTS.md | AI coding agents | Build commands, test runners, conventions, constraints for autonomous agents |

The Quality Threshold: What ETH Zurich Found About Context File Effectiveness

The ETH study evaluated multiple coding agents and LLMs across two benchmarks, comparing LLM-generated and developer-written context files and their performance relative to no repository context. The findings challenge two common practices.

LLM-generated context files hurt performance. In 5 out of 8 tested settings, LLM-generated files reduced task success rates. Agents took 2.45 to 3.92 additional steps per task, and inference costs increased by 20% to 23%.

Developer-written context files help, but modestly. Human-curated files outperformed LLM-generated files for all four agents tested, with a gain of roughly 4 percentage points on the AGENTbench benchmark.

| Context File Type | Cost Increase | Task Success Change |
| --- | --- | --- |
| LLM-generated (auto-init) | +20 to 23% | −0.5% (SWE-bench Lite) to −2% (AGENTbench) |
| Developer-written (human-curated) | Up to 19% (shorter files, lower cost than LLM-generated) | Marginal improvement (AGENTbench) |
| No context file | Baseline | Baseline |

A critical follow-up experiment removed all other documentation from the repository before re-evaluating. Under those conditions, LLM-generated files improved performance by 2.7%, confirming the core insight: LLM-generated context files are redundant with existing documentation that agents already access independently. Duplicating that content adds cost without adding signal.

What "Non-Inferable Details" Means in Practice

The study concludes that human-written files should describe only minimal requirements, custom-built commands, and specific tooling choices, while avoiding content that agents can already discover independently.

| Content Type | Include? | Reason |
| --- | --- | --- |
| Custom build commands not documented elsewhere | Yes | Non-inferable |
| Highly specific tooling choices (e.g., pixi instead of pip) | Yes | Non-inferable |
| Codebase overviews and architecture summaries | No | Agents find these independently |
| Anything already in README or existing docs | No | Redundant; increases steps and cost |

Per the study, architecture sections do not earn their tokens: removing an "Architecture" section while keeping only commands, constraints, and non-standard patterns produces the same agent behavior at a lower token budget.

Core Sections Every AGENTS.md Needs

GitHub analysis and OpenAI docs converge on the sections that consistently improve agent behavior. Each section targets a specific class of agent errors.

Section 1: Stack Definition With Exact Versions

Without version constraints, the agent defaults to whichever API conventions are most represented in its training data. The Inngest repo illustrates the principle, pinning exact versions and flagging non-negotiable constraints explicitly:

```markdown
## Tech Stack
- Framework: Next.js 15 (App Router + Pages Router hybrid)
- Language: TypeScript
- Package Manager: pnpm (always use pnpm, never npm)
- Node Version: 22.x (required)
```

Section 2: Executable Commands With Full Flags

Place commands early; the agent references them repeatedly throughout a task. From mcollina/skills:

```markdown
## Common commands
- Install deps: `npm install`
- Typecheck: `npm run typecheck`
- Lint: `npm run lint`
- Run tests: `npm test` (alias for `node --test`)
- Run a single test file: `node --test path/to/file.test.ts`
- Run tests matching a name: `node --test --test-name-pattern "pattern"`
```

Per OpenAI docs, AGENTS.md can specify programmatic checks Codex will attempt to run before finishing a task. These are advisory, not mechanically enforced: agents may skip checks if they judge them unnecessary, so clarity of instruction matters more than assumed compliance.
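Because the checks are advisory, some teams wire the same commands into a small pre-finish runner that executes them mechanically and reports every failure at once. A minimal Python sketch under that assumption; the command names here are hypothetical, mirroring what an AGENTS.md might list:

```python
import subprocess

# Hypothetical check commands, mirroring entries an AGENTS.md might list.
CHECKS = {
    "typecheck": ["npm", "run", "typecheck"],
    "lint": ["npm", "run", "lint"],
    "test": ["npm", "test"],
}

def run_checks(checks):
    """Run every check and collect pass/fail, rather than stopping
    at the first failure, so the agent sees the full picture."""
    results = {}
    for name, cmd in checks.items():
        proc = subprocess.run(cmd, capture_output=True, text=True)
        results[name] = proc.returncode == 0
    return results
```

Running this in CI as well as locally turns "advisory" into "verified": even if the agent skips a check, the pipeline does not.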

Section 3: Coding Conventions and Patterns

One real snippet showing your style beats three paragraphs describing it. The most valuable convention to document is the counterintuitive one. The NetCore repo includes this:

```text
All client `api`, `apiVoid` and `apiForm` methods never throws exceptions -
it always returns an `ApiResult<T>` which contains either a response for
successful responses or an error with a populated `ResponseStatus`, as such
using `try/catch` around `client.api` calls is always wrong.
```

Without this, an agent wraps every api call in try/catch. The file explains the mechanism that enables the agent to generalize correctly to novel situations.
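The mechanism is language-agnostic. A minimal Python sketch of the same never-throws result pattern; the `ApiResult` and `api` names are illustrative here, not taken from the NetCore repo's actual implementation:

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")

@dataclass
class ApiResult(Generic[T]):
    """Exactly one of `response` or `error` is set; callers branch on
    `ok` instead of wrapping calls in try/except."""
    response: Optional[T] = None
    error: Optional[str] = None

    @property
    def ok(self) -> bool:
        return self.error is None

def api(endpoint: str) -> ApiResult[dict]:
    try:
        # Perform the request; here we simulate a failure for one endpoint.
        if endpoint == "/broken":
            raise ConnectionError("upstream down")
        return ApiResult(response={"endpoint": endpoint})
    except Exception as exc:
        # Exceptions are converted into a populated error, never re-raised.
        return ApiResult(error=str(exc))
```

An agent that understands the mechanism knows that `if result.ok:` is the correct call-site pattern and that try/except around `api()` is dead code.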

Section 4: Testing Rules

From phodal/auto-dev:

```markdown
## Testing Guidelines
- Write unit tests for all new functionality
- Mock external dependencies when appropriate
- Ensure tests are deterministic and isolated
```

For complex build systems, exact commands matter more than guidelines. The CBMC repo includes:

```bash
cmake --build build
ctest --test-dir build -V -L CORE -j$(nproc)
cd unit && ../build/bin/unit
```

Section 5: "Don't Touch" Zones and Permission Boundaries

"Never commit secrets" was the most common helpful constraint across 2,500+ repositories per GitHub analysis. A three-tier system gives the agent an explicit priority hierarchy when rules interact:

```markdown
### ✅ Always
- Run linting before committing
- List only human authors in git commits
### ⚠️ Ask First
- Database schema changes
- Adding new dependencies
### 🚫 Never
- Commit secrets or `.env` files
- Force push to main
- Modify content within [protected] blocks
```
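Because agents consume these files mechanically, the tier headings double as a simple machine-readable schema. A rough Python sketch of extracting the tiers from such a section, assuming the heading and bullet conventions shown above:

```python
def parse_boundaries(markdown: str) -> dict:
    """Group bullet items under their most recent `###` tier heading."""
    tiers, current = {}, None
    for line in markdown.splitlines():
        line = line.strip()
        if line.startswith("### "):
            current = line.removeprefix("### ").strip()
            tiers[current] = []
        elif line.startswith("- ") and current is not None:
            tiers[current].append(line[2:])
    return tiers
```

This is why consistent structure matters more than eloquent prose: a predictable heading scheme lets tooling lint the boundaries section the same way an agent reads it.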

Vercel's template AGENTS.md, by contrast, begins with an "Architecture Guidelines: Repository Page Structure" section; it does not start with security rules or use a CRITICAL: severity prefix, though related docs recommend security guardrails.

Section 6: Non-Standard Tooling

AGENTS.md delivers the highest ROI for tools underrepresented in LLM training data:

```markdown
## Package management
- `pixi run <command>`
- `pixi run python script.py`
- `pixi run pytest`
```

For standard tools like npm, pytest, or cargo, agents already know the conventions. Focus on what the agent genuinely cannot know.

Tool-Specific Variants: CLAUDE.md, .cursorrules, and copilot-instructions.md

AGENTS.md is converging as a cross-tool standard. Claude Code includes auto-memory, which builds persistent learning across sessions without manual configuration, and a claudeMdExcludes config prevents instruction bleed in large monorepos. Cursor's .cursor/rules/ system uses YAML frontmatter to scope rules by glob pattern. GitHub Copilot uses .github/copilot-instructions.md for repository-wide defaults and path-specific .instructions.md files for targeted rules.


| Feature | CLAUDE.md | Cursor `.mdc` | copilot-instructions | `.windsurf/rules` |
| --- | --- | --- | --- | --- |
| Format | Plain Markdown | Markdown with required YAML frontmatter | Plain Markdown | Plain Markdown |
| Multi-file support | Yes | Yes, as `.cursor/rules/` | Yes, as `.github/instructions/` | Yes, as `.windsurf/rules/` |
| File-based scoping | User vs. project level | Glob patterns in frontmatter | Path-specific `.instructions.md` files | — |
| Auto-memory | Built-in | Unknown | Yes (Copilot Memory) | Yes (Memories) |
| Agent-decided rule inclusion | No | Yes (description field) | No | No |
| AGENTS.md interop | No (uses CLAUDE.md) | Yes | Yes | Unknown |

For multi-tool teams, the symlink pattern keeps files from diverging: "Note: CLAUDE.md is a symlink to AGENTS.md. They are the same file."

Modular Rules: When and How to Split Your Context File

A monolithic AGENTS.md loads every rule into the agent's context on every invocation. Start with a single file; split it into subdirectories when it exceeds 150-200 lines. The maas repo represents the upper bound: a 371-line root file. Beyond that scale, modular organization becomes necessary for token budget reasons.

Place context files at any directory level; the agent reads the file closest to the file being edited:

```text
project/
├── AGENTS.md            # Root: org-wide standards, global commands
├── apps/
│   ├── web/
│   │   └── AGENTS.md    # Web app overrides and additions
│   └── api/
│       └── AGENTS.md    # API service overrides and additions
└── infra/
    └── AGENTS.md        # Terraform/infrastructure rules
```

Per the Codex spec, more deeply nested files take precedence in case of conflicting instructions.
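The "closest file wins" rule can be modeled as a walk from the repo root down to the edited file's directory, merging rules so deeper files override shallower ones. A simplified Python sketch of that precedence (key-level merging is an assumption; the Codex spec only states that deeper files take precedence):

```python
from pathlib import Path

def effective_rules(edited_file: Path, repo_root: Path, rules: dict) -> dict:
    """Merge per-directory rule dicts from repo root down to the edited
    file's directory; deeper entries are applied last, so they win conflicts.
    `rules` maps a directory Path to that directory's parsed AGENTS.md."""
    merged = dict(rules.get(repo_root, {}))
    current = repo_root
    for part in edited_file.parent.relative_to(repo_root).parts:
        current = current / part
        merged.update(rules.get(current, {}))
    return merged
```

In the tree above, editing `apps/web/page.tsx` would pick up the root standards plus anything `apps/web/AGENTS.md` overrides, while `infra/` edits never see web-app rules.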

| Condition | Approach |
| --- | --- |
| Single root file under 150 to 200 lines | Monolithic root file sufficient |
| Rules exceed 150-200 lines | Split: root for org standards, subdirectory files for specifics |
| Cross-cutting concerns (security, testing, CI) | Cursor `.mdc` files per concern with glob patterns |
| Multiple AI tools in use | Canonical AGENTS.md + tool-specific symlinks |
| Enterprise compliance requirements | Windsurf system-level rules + workspace rules |

The Cost Tradeoff: Roughly 20% Inference Overhead

The ETH study measured the following overhead across context file types:

| Metric | Value |
| --- | --- |
| Inference cost increase (LLM-generated context files) | 20 to 23% |
| Inference cost increase (developer-provided context files) | Up to 19% |
| Reasoning token increase (GPT-series, LLM-generated files) | +14% to +22% |
| Reasoning token increase (GPT-series, human-written files) | +2% to +20% |

Using Claude Sonnet 4.6 pricing ($3.00/MTok input, $15.00/MTok output) with a baseline agentic task of roughly 50K input tokens and 5K output tokens:

| Monthly Task Volume | Monthly Overhead Cost |
| --- | --- |
| 1,000 tasks | ~$45 |
| 10,000 tasks | ~$450 |
| 100,000 tasks | ~$4,500 |

Prompt caching is the primary mitigation; cache reads are 90% cheaper than standard input pricing. The 20% overhead applies regardless of whether the file is auto-generated or human-written. LLM-generated files give negative returns: worse performance at higher cost. Human-curated files yield roughly a 4-percentage-point improvement. Writing manually is worth the overhead. Auto-generating and committing is not.
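The per-task arithmetic behind those figures can be checked directly. A short Python sketch using the prices and token counts stated above; the cache-read rate is an assumption derived from the quoted 90% discount on input pricing:

```python
IN_PRICE, OUT_PRICE = 3.00 / 1e6, 15.00 / 1e6  # $ per token (Sonnet 4.6)
IN_TOK, OUT_TOK = 50_000, 5_000                # baseline agentic task

# Baseline cost per task: $0.15 input + $0.075 output = $0.225
baseline = IN_TOK * IN_PRICE + OUT_TOK * OUT_PRICE
overhead_per_task = 0.20 * baseline            # ~20% context-file overhead

def monthly_overhead(tasks: int) -> float:
    return tasks * overhead_per_task

# Assumption: with prompt caching, the repeated input tokens are billed
# at 10% of the standard input price (90% cheaper cache reads).
cached_overhead = 0.20 * (IN_TOK * IN_PRICE * 0.10 + OUT_TOK * OUT_PRICE)
```

At 1,000 tasks the overhead works out to roughly $45 per month, matching the table, and caching cuts the input-side share of that overhead by an order of magnitude.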

Failure Patterns That Undermine AGENTS.md

Auto-generated files perform worse than no file. Per the ETH study, LLM-generated files reduced task success rates by 0.5% to 2% while increasing inference costs by over 20%. Rules should respond to observed failure, not be generated speculatively.


Context file bloat reduces task success. More rules do not produce better performance. Every time an agent makes a mistake, the default reaction is to add another rule. Rules are rarely removed. The file accumulates contradictory patches and one-off fixes, working directly against effective context engineering.

Silent rule dropout in long sessions. Documented Claude Code issues report agents ignoring CLAUDE.md instructions mid-session, an instance of the "lost in the middle" phenomenon. Keep files short, place critical rules early, and start new sessions for new tasks. Anthropic's guidance notes that as context grows, agents preserve architectural decisions while discarding redundant tool outputs.

Stale structural references actively mislead. Context files documenting repository structure become liabilities when the codebase changes. Per the ETH study, architectural overviews increased inference cost and encouraged broader file traversal without improving task success.

Explore how Intent's coordinator, implementor, and verifier agents reduce manual reconciliation across long-running work.


Complete AGENTS.md Template

This template synthesizes patterns from OpenAI docs, GitHub analysis, and production repositories, including the Vercel Next.js and Inngest repos.

```markdown
# AGENTS.md - [Project Name]
## Project Overview
[One sentence: stack, versions, what makes it architecturally non-standard]
## Key Commands
- Install: `npm install`
- Dev server: `npm run dev`
- Build: `npm run build`
- Typecheck: `npm run typecheck`
- Lint: `npm run lint`
- Test all: `npm test`
- Test single file: `npx vitest run src/path/to/file.test.ts`
## Project Structure
- `src/` - application source code
- `src/components/` - React components
- `tests/` - test files (mirror src/ structure)
- See `src/App.tsx` for routing entry point
## Code Style
[Insert one representative code snippet from this codebase here]
- Named exports only, no default exports
- Keep files under [N] lines when possible
## Non-Obvious Patterns
[Document counterintuitive architectural decisions with mechanism explanations]
## Testing Rules
- Write tests for all new functionality
- Tests must be deterministic and isolated
- Mock all external dependencies
- Run `npm test` before marking any task complete
## Boundaries
### ✅ Allowed without asking
- Read files, list directory contents
- Run lint, typecheck, single test files
### ⚠️ Ask first
- Install or remove packages
- Delete files
- Push to git or open PRs
### 🚫 Never
- Commit secrets, `.env` files, or credentials
- Force push to main or protected branches
- Modify `vendor/`, `dist/`, or `build/` directories
## Key Files
- `src/main.ts` - application entry point
- `src/config/` - all environment and feature configuration
```

Version-control this file and treat updates as code changes. Remove the "Project Structure" section if your directory layout follows framework conventions the agent already knows. The "Non-Obvious Patterns" section is where AGENTS.md delivers the highest signal-to-noise ratio.

From Manual Context Files to Automated Context Management

Manual AGENTS.md files face a fundamental maintenance challenge: context files drift as codebases evolve, and there is no automated way to detect staleness.

When used as an MCP server alongside a manually maintained AGENTS.md, Augment Code's Context Engine is most useful during cross-service refactoring tasks. The Context Engine semantically indexes and maps the codebase across hundreds of thousands of files, maintaining a live understanding of how files connect without requiring manual updates.

Intent takes this further with spec-driven development. Instead of maintaining a static instruction file that agents read before acting, Intent introduces living specs that agents update as they work: "When an agent completes work, the spec updates to reflect reality."

| Dimension | Manual AGENTS.md | Intent's Context Engine |
| --- | --- | --- |
| Maintenance | The developer writes and updates manually | Agents update the living spec as they work |
| Scope | Single Markdown file at repo root | Real-time semantic index across hundreds of thousands of files |
| Staleness risk | Requires manual remediation after refactors | Real-time indexing |
| Work isolation | Shared across all work | Per-workspace git worktrees |
| Dependency tracking | Not built-in | Cross-service dependency tracking |

Intent's coordinator agent analyzes the codebase, drafts the spec, generates tasks, and delegates to specialist agents. Implementor agents execute tasks in parallel waves. A verifier agent checks results against the spec and flags inconsistencies. The spec auto-updates to reflect what was actually built, addressing the staleness problem that manual AGENTS.md files cannot solve at scale.

Start With a Minimal AGENTS.md Before Context Drift Sets In

The hardest part of AGENTS.md isn't writing it. It's keeping it accurate as the underlying codebase changes. Non-inferable details, counterintuitive patterns, and custom tooling constraints deliver the highest signal, but they drift fastest as codebases evolve.

Start with the template above. Focus the first version on commands, boundaries, and the one or two architectural decisions that look wrong to an outsider but are intentional. Version-control changes and review them like code. When the maintenance burden outgrows what manual curation can sustain, that's the point where tools like Intent solve a real problem rather than a theoretical one.

See how Intent's living specs and isolated workspaces keep multi-agent development aligned as codebases change.



