
Agent Memory vs. Context Engineering: What Persists Between Sessions and What Doesn't

Apr 17, 2026
Paula Hingel

The key distinction between agent memory and context engineering is the scope of persistence: agent memory determines what information survives between sessions, while context engineering determines what information is loaded into the next session's finite context window. Confusing the two, or neglecting either, produces agents that forget critical decisions, re-suggest rejected patterns, and erode developer trust within days.

TL;DR

AI coding agents are stateless between sessions, but the failure mode looks different depending on which persistence layer is missing. If an agent re-derives the same architectural decisions every session, agent memory is missing or not being curated. If it applies team conventions inconsistently across developers, the context file is missing or not shared. If it loses track of a feature's current state mid-build, a living spec is needed. Three persistence layers solve this: static context files encode team-wide knowledge, agent memory captures session-learned decisions with human curation, and living specs maintain evolving implementation intent.

Most AI coding teams hit the persistence problem before they have a name for it. An agent that suggested the right library last week suggests the wrong one today. A teammate's agent ignores the naming conventions your agent finally learned. A multi-day feature build loses its thread every morning and re-derives decisions that were already made. None of these failures trace back to a bad model or a weak prompt; they trace back to missing or misplaced persistence.

The discipline splits into two layers that teams routinely conflate. Agent memory is external storage that survives session boundaries: files, databases and vector stores. Context engineering is the practice of selecting what gets loaded from that storage into the model's finite context window for the current session. Both are required, and they solve different problems. Adding more memory without engineering how it loads produces agents that drown in stale context. Engineering context precisely, without populating memory, produces agents that start fresh every session, regardless of how carefully the window is managed.

Intent addresses the third layer that neither memory nor context engineering covers: evolving feature state. Living specs auto-update as agents complete work, propagating changes to all active agents without requiring a developer to manually synchronize anything. This guide provides the decision framework for placing information in the layer that matches its scope, stability, and audience.

See how Intent's living specs keep agents aligned across sessions without manual synchronization.

Build with Intent

Free tier available · VS Code extension · Takes 2 minutes


The Persistence Problem: Agents Forget Everything Between Sessions

LLMs are stateless, and each conversation starts with no memory of previous interactions. An agent that chose RS256 tokens on Monday has no recollection of that decision by Wednesday. The developer who corrected an agent's import pattern last week will correct it again this week, and again the week after.

The persistence problem surfaces differently depending on the development scenario:

| Scenario | What Breaks Without Persistence | Layer That Solves It |
| --- | --- | --- |
| Multi-day feature build | Agent re-derives Monday's architectural choices on Tuesday | Living spec + agent memory |
| Team handoff | New developer's agent ignores prior decisions | Context files + promoted rules |
| AI model swap | New model ignores project-specific conventions | Context files (model-agnostic) |
| Context window overflow | Critical decisions evicted mid-session | Context engineering (compression + prioritization) |
| Weekend/holiday break | Agent loses all working state from Friday | Agent memory + living spec resume point |

Agent amnesia is a persistent problem. Context engineering exists because an agent running in a loop generates increasingly large amounts of data that could be relevant to the next turn of inference, and this information must be cyclically refined, as described in Anthropic's context engineering research.

The distinction between agent memory and context engineering clarifies where that solution lives:

| Concept | Definition | Persistence |
| --- | --- | --- |
| Agent memory | External storage that survives session ends: files, databases, vector stores | Persists across days, weeks, months |
| Context engineering | The discipline of selecting, compressing, and injecting memory into the model's finite context window | Ongoing, multi-turn, session-spanning management |
| Short-term memory | In-session working state: recent messages, tool outputs, temporary variables | Ephemeral; lost on session end |

Memory is the library. Context engineering is the librarian who decides which books to put on the desk for this session. Without both layers working together, memory without context engineering becomes hoarding, and context engineering without memory becomes amnesia.

| | Agent Memory | Context Engineering |
| --- | --- | --- |
| Goal | Retain decisions across sessions | Load the right subset into the next prompt |
| Failure mode | Stale or bloated recall | Wrong information loaded, right information excluded |
| Persistence | Days to months | Per-turn, ephemeral selection |
| Human role | Curate what's stored | Design what's loaded |
| Cost | Storage + retrieval latency | Token budget per session |
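The librarian's job can be sketched as a greedy, priority-ordered fill of the session's token budget. This is a minimal illustration, not Intent's actual loading logic; the `priority` field and `estimate_tokens` callback are assumptions:

```python
def select_for_context(memories, token_budget, estimate_tokens):
    """Greedy context selection: take the highest-priority memories
    first and stop adding once the token budget is spent."""
    chosen, used = [], 0
    for m in sorted(memories, key=lambda m: m["priority"], reverse=True):
        cost = estimate_tokens(m["text"])
        if used + cost > token_budget:
            continue  # skip memories that no longer fit
        chosen.append(m)
        used += cost
    return chosen
```

A crude `estimate_tokens` such as `lambda text: len(text) // 4` is enough to see the trade-off: raising the budget admits lower-priority memories, while shrinking it forces the librarian to leave books on the shelf.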

The Three Persistence Layers: Context Files, Agent Memory, Living Specs

Effective agent persistence requires three distinct layers, each handling a different scope of context.

| Layer | Mechanism | What It Captures | Scope |
| --- | --- | --- | --- |
| Context files | AGENTS.md, .augment-guidelines, rules | Static operational knowledge, coding standards, build commands | Team-wide, version-controlled, permanent |
| Agent memory | Auto-captured memories with human curation | Session-learned decisions, corrected patterns, project goals | Per-developer, cross-session, reviewable |
| Living specs | Intent's spec documents | Evolving implementation intent, task completion state, architectural decisions | Feature-scoped, auto-updating, agent-readable |

The persistence matrix shows which artifacts survive session boundaries and how they reach the model:

| Artifact | Persists Across Sessions? | Stored Where | How It's Loaded |
| --- | --- | --- | --- |
| Chat history | No | Ephemeral | Only within active session |
| Decision logs (agent memory) | Yes | Memory system | Injected at session start via context engineering |
| AGENTS.md / rules | Yes | Repository | Auto-discovered by directory traversal |
| Living spec | Yes | Spec document | Auto-injected for relevant tasks |
| Tool outputs | No | Ephemeral | Only within active session |
| Vector embeddings (Context Engine) | Yes | Semantic index | Retrieved by semantic search |

What This Looks Like in Practice

  • Auth module (multi-day build): Without persistence layers, Tuesday's session defaults to HS256 because the agent has no memory of Monday's RS256 decision. A teammate's agent suggests Lodash for token parsing because the "use jose library" decision was never shared. With three-layer persistence, AGENTS.md captures "use jose library, not jsonwebtoken." Agent memory stores "RS256 with rotating key pairs chosen for compliance reasons." The living spec tracks task completion status. Tuesday's agent loads all three layers and continues from where Monday left off.
  • Team handoff: Without persistence, the replacement developer's agent re-proposes approaches already evaluated and discarded. With three-layer persistence, AGENTS.md captures team conventions; agent memory captures "tried Redis Streams for event bus, switched to NATS due to backpressure handling"; and the living spec shows which tasks are complete, in progress, and blocked.
  • Context window overflow: Without persistence, critical early decisions are evicted from the context window as the session grows, causing the agent to contradict its own earlier reasoning. With context engineering, Intent's Context Engine retrieves relevant code semantically, agent memory stores key decisions outside the context window, and living specs maintain task state without consuming session tokens.

Context Files: Team-Wide, Static, Manually Curated

Context files like AGENTS.md and workspace rules are a widespread practice for tailoring AI coding agents to repositories, as documented in research on repository context files.

What Belongs in Context Files

A piece of information belongs in a static context file if it passes two conditions simultaneously: it is undiscoverable (the agent cannot infer it from reading the codebase) and it is universal (it applies to virtually every task in the project). Information that passes both conditions includes build and test commands, conventions that contradict defaults, environment constraints the agent cannot observe, and behavioral boundaries such as always/ask-first/never rules.

```markdown
# AGENTS.md

## Key Commands
- Install: pnpm install
- Dev server: pnpm dev
- Test single file: npx vitest run src/path/to/file.test.ts

## Always / Ask First / Never
- Always: run typecheck before marking a task complete
- Ask first: changes to authentication flow, database schema migrations
- Never: modify src/config/production.ts directly
```

When using Intent's hierarchical rule discovery, the system walks up from the current file's directory, including any AGENTS.md and CLAUDE.md files found along the path. Teams working in monorepos can place an AGENTS.md inside each package; agents automatically read the nearest relevant file in the directory tree, so the closest one takes precedence.
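That upward walk can be sketched in a few lines. This is an illustrative helper, not Intent's actual implementation; `discover_rule_files` and its ordering convention are our own invention:

```python
from pathlib import Path

def discover_rule_files(start: Path, repo_root: Path,
                        names=("AGENTS.md", "CLAUDE.md")) -> list[Path]:
    """Walk upward from the current directory to the repo root,
    collecting rule files. Nearest files come first, which lets
    them take precedence when their contents are later merged."""
    found = []
    current, root = start.resolve(), repo_root.resolve()
    while True:
        for name in names:
            candidate = current / name
            if candidate.is_file():
                found.append(candidate)
        # stop at the repo root (or the filesystem root as a safety net)
        if current == root or current == current.parent:
            break
        current = current.parent
    return found
```

In a monorepo with `packages/api/AGENTS.md` and a root `AGENTS.md`, a call from inside `packages/api` would return the package file first, matching the nearest-file-wins behavior described above.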

What Does NOT Belong in Context Files

Anthropic's Claude Code docs recommend keeping context files concise because longer files consume more context and reduce adherence. Three categories do not belong: anything the agent can discover by reading the codebase, task-specific context (feature requirements belong in living specs), and code snippets that can become outdated.

Agent Memory: Session-Learned, Auto-Captured, Reviewable

Context files capture what teams already know at the start of a project. Agent memory captures what teams learn during work: debugging decisions, corrected patterns, project goals mentioned in conversation, and constraints discovered through iteration.

The Curation Problem

Every auto-memory system has the same core tension: automatic memory capture reduces developer burden, but uncurated memory degrades agent performance. Without review, auto-captured memories accumulate stale assumptions that contaminate future context, causing the agent to apply wrong approaches with apparent confidence.

The practical mitigation is to treat the approval queue as a lightweight end-of-session ritual: batch pending reviews at session end, when context is freshest, and approve, edit, or discard in a single pass. Memories reviewed in context are more accurately curated than memories reviewed in isolation days later.

Memory Review: Curating What Agents Remember

Augment Code's Memory Review, released September 8, 2025, lets developers review, edit, and curate memories as they are created through an approval workflow. Memory creation triggers include long-term project goals mentioned in chat, decisions made during debugging, relevant code or system details, and developer corrections to agent output.

Each memory entry includes a source field: Source: Agent (proposed by the agent) or Source: Correction (triggered when the developer corrects the agent's output). The curation flow:

  1. During the conversation, the agent proposes a memory (draft state)
  2. The IDE shows a "memories pending review" panel
  3. Per memory, the developer chooses: Save, Edit, or Discard
  4. Nothing gets stored without the developer's sign-off

Promoting Memories to Team Rules

Saved memories can be promoted to workspace Rules, making individual learning available team-wide:

| Tier | Mechanism | Scope | Approval |
| --- | --- | --- | --- |
| Memories | Per-developer, cross-session | Individual workspace | Per-memory human approval |
| Rules (.augment-guidelines) | Team-wide, repo-committed | All developers, all sessions | Committed to version control |

The two-tier structure creates a pipeline: individual developers learn from their sessions, curate what matters, and promote patterns the entire team should follow.
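Promotion itself can be as simple as appending the curated memory to the committed rules file. A hypothetical sketch (the real workflow runs through the IDE and version control, not a script):

```python
from pathlib import Path

def promote_to_rule(memory_text: str, guidelines: Path) -> None:
    """Append a curated memory to the team-wide rules file so the
    pattern applies to every developer's agent once committed.
    Idempotent: promoting the same text twice adds one entry."""
    existing = guidelines.read_text() if guidelines.exists() else ""
    if memory_text in existing:
        return  # already promoted
    with guidelines.open("a") as f:
        f.write(f"- {memory_text}\n")
```

The commit step is what makes the promotion team-wide: until the updated `.augment-guidelines` lands in the repository, the pattern remains individual knowledge.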

Tools like Intent's living specs auto-update as agents complete work, keeping every human and agent aligned.


Living Specs: Feature-Scoped, Auto-Updating, Agent-Readable

Context files permanently encode what a team knows. Agent memory captures what a developer learns session by session. Neither handles evolving implementation intent that changes as a feature takes shape over days or weeks.

Why Memory Files Cannot Replace Specs

Static context files and agent memories both fail for feature-level persistence because they lack two properties that active development requires: bidirectional updates and structured task tracking. Static documentation files capture what the team knew at the start of a project, but not what the team learned during a given session. Manual session-end update workflows are fragile because they depend on the developer remembering to record the right details.


Intent's living specs solve this by functioning as a coordination layer: the spec auto-updates as agents complete work and propagates changes to requirements to active agents.

```markdown
### JWT Authentication, Cross-Service Implementation

**Architecture**
- auth-service: token issuance, refresh, revocation
- api-gateway: JWT validation middleware, rate limiting
- Signing: RS256 with rotating key pairs

**Tasks**
✓ Set up RS256 key pair generation
✓ Implement token issuance endpoint
/ JWT validation middleware in gateway [in progress]
○ Rate limiting on /auth/token
○ Integration tests across services
```

When an agent completes a task, the spec updates automatically. When a requirement changes, the update propagates to all active agents working in parallel.
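A toy model of that auto-update, assuming the spec's tasks are held as a status map; `update_task` and `render_tasks` are invented names, not Intent's API:

```python
MARKS = {"done": "✓", "in_progress": "/", "todo": "○"}

def update_task(spec: dict, task: str, status: str) -> dict:
    """Return a new spec with one task's status changed; callers
    broadcast the returned spec to every active agent."""
    if status not in MARKS:
        raise ValueError(f"unknown status: {status}")
    return {**spec, "tasks": {**spec["tasks"], task: status}}

def render_tasks(spec: dict) -> list[str]:
    """Render the task list the way the spec document displays it."""
    return [f"{MARKS[s]} {t}" for t, s in spec["tasks"].items()]
```

Returning a new spec rather than mutating in place mirrors the propagation model: each agent reads the latest published version rather than a private copy that can drift.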

Decision Framework: What Belongs Where

The following decision flow determines the correct persistence layer for any piece of information:

  1. Can the agent discover this by reading the codebase? Yes → Do not store it. No → Continue.
  2. Does this apply to every task in this project? Yes → Context file (AGENTS.md, workspace rules). No → Continue.
  3. Is this stable over weeks or months? Yes → Agent memory (promote to rules if team-wide). No → Continue.
  4. Is this specific to the current feature, and is it actively changing? Yes → Living spec. No → Agent memory.
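The four questions reduce to a small routing function. This sketch mirrors the decision flow above rather than any shipped tooling:

```python
def choose_layer(discoverable: bool, universal: bool,
                 stable_long_term: bool,
                 feature_scoped_and_changing: bool) -> str:
    """Route a piece of information to its persistence layer,
    asking the four questions in order."""
    if discoverable:
        return "do not store"
    if universal:
        return "context file"
    if stable_long_term:
        return "agent memory"  # promote to rules if team-wide
    if feature_scoped_and_changing:
        return "living spec"
    return "agent memory"
```

For example, a build command (`discoverable=False`, `universal=True`) routes to a context file, while an open question blocking the current feature routes to the living spec.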
| Information Type | Layer | Example | Why This Layer |
| --- | --- | --- | --- |
| Build commands | Context file | pnpm test --coverage | Universal, stable, undiscoverable |
| Naming conventions | Context file | "Named exports only, no defaults" | Team-wide standard |
| "Use Redis, not Memcached" | Agent memory → promote to rules | Decision made during debugging | Learned mid-session, applicable to future tasks |
| "Amex CVV support pending product confirmation" | Living spec | Open question blocking implementation | Feature-scoped, will resolve and change |
| Task completion status | Living spec | "✓ token issuance, / validation middleware" | Changes daily during active development |
| Auth module retry logic warning | Context file | "Don't refactor src/auth/retry.js" | Gotcha that applies across all tasks |

OpenAI Codex documentation recommends a clear trigger for promoting information from agent memory to a context file: when the agent makes the same mistake twice, conduct a retrospective and update AGENTS.md with the resulting guidance.

What Breaks When Information Lives in the Wrong Layer

| Anti-Pattern | What Happens |
| --- | --- |
| Feature decisions in AGENTS.md | Context file grows past 200 lines; agent adherence drops as token consumption increases without proportional guidance improvement |
| Team conventions in memory only | Individual developer's agent follows the pattern; teammate's agent suggests the wrong library because the memory isn't shared |
| Evolving feature intent in static files | Spec rot: the file says one thing, the code does another, and agents read the stale version |
| Full chat history as memory | Mostly noise; failed attempts, superseded decisions, and clarifying questions all treated as current truth |
| No persistence layer at all | Agent reinvents conventions and resurfaces already-rejected approaches every session |

Intent's Context Engine processes 400,000+ files through semantic dependency analysis, providing architectural understanding across entire codebases. But even with deep codebase indexing, the Context Engine cannot infer undocumented team decisions, operational procedures, or evolving feature intent. Those require explicit persistence through the right layer.

Classify Your Persistence Layer Before the Next Session Reset

The practical next step is to identify which persistence failure is currently costing the most rework and address that layer first. Re-derived decisions every session mean the agent's memory is missing or not being curated. Inconsistent conventions among developers mean team conventions are stored in individual memory rather than in a shared context file. Feature state lost between sessions means a living spec is needed.

If the issue affects every task, add it to AGENTS.md or the workspace rules. If it was learned during work and needs review, keep it in the agent's memory. If it changes as a feature evolves, track it in a living spec so every active agent stays aligned.

Intent fits the third case directly: living specs keep evolving plans, tasks, and implementation state synchronized across sessions and across agents.



Written by Paula Hingel

Paula writes about the patterns that make AI coding agents actually work — spec-driven development, multi-agent orchestration, and the context engineering layer most teams skip. Her guides draw on real build examples and focus on what changes when you move from a single AI assistant to a full agentic codebase.
