Three things worth knowing
- claude-mem is an open-source persistent memory plugin for AI coding agents, now at 74.8K stars and 6.4K forks.
- Every new Claude Code session starts cold. claude-mem captures what happened in previous sessions, compresses it with AI, and injects relevant context automatically when a new session starts.
- v13.1.0 adds a server-beta runtime backed by Postgres and BullMQ, which means persistent memory now scales to team deployments, not just individual developers.
Close a Claude Code session. Open a new one. The architectural decisions from yesterday, the bug you tracked down, and the files you refactored are no longer available. You're explaining your own codebase to an agent that was there for the whole thing.
Every developer I talk to accepts this as normal. It shouldn't be.
The session amnesia problem compounds with every tool you add. More agents, more sessions, more re-explaining. The capability of the models keeps growing. The memory layer doesn't come with them. claude-mem is the most adopted community fix I've seen for this, and at 74.8K stars and 6.4K forks, a lot of developers have clearly hit the same wall.

What Happened
Developer Alex Newman shipped claude-mem v13.1.0 on May 11, 2026. The project has 1,895 commits, 269 releases, and 109 contributors. The commit history shows extensive co-authorship with Claude: the tool was built using the same workflow it serves.
The headline in v13.1.0 is the server-beta runtime: Postgres storage, BullMQ job queues, API key scoping, and audit trails. Generation workers scale horizontally via Docker Compose. Multiple developers can share a single Postgres-backed memory backend with tenant isolation. Each project's observations, sessions, and generation jobs stay strictly separate.
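For a rough sense of the shape, here is a minimal sketch of such a deployment as a Docker Compose file. The service and image names are my assumptions, not the project's actual configuration, and BullMQ implies a Redis instance behind the queues:

```yaml
# Hypothetical sketch of a shared server-beta deployment.
# Service and image names are illustrative; consult the project's
# docs for the real compose file.
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_DB: claude_mem
      POSTGRES_PASSWORD: change-me
  redis:
    # BullMQ job queues are backed by Redis
    image: redis:7
  worker:
    # Generation workers scale horizontally; raise replicas as load grows
    image: claude-mem-worker:latest   # hypothetical image name
    environment:
      DATABASE_URL: postgres://postgres:change-me@postgres:5432/claude_mem
      REDIS_URL: redis://redis:6379
    deploy:
      replicas: 3
```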
The license also changed from AGPL-3.0 to Apache-2.0. For teams embedding claude-mem in proprietary systems, that removes the open-source deployment obligation. It's a quiet signal about where the project is headed.
Key Features
- Agent-agnostic memory layer: Works with Claude Code, Codex, Gemini CLI, Copilot, OpenCode, OpenClaw, Windsurf, Cursor, and Hermes via lifecycle hooks and plugin integrations.
- AI-compressed observations: A background worker captures tool usage, generates semantic summaries via a second AI call, and stores them in SQLite with FTS5 full-text search.
- Hybrid search with Chroma vector DB: Combines keyword and semantic retrieval, so context injection pulls the most relevant past observations, not just the most recent.
- Three-layer progressive disclosure: Search returns compact index entries (~50-100 tokens each), timeline adds chronological context, and `get_observations` fetches full details only for filtered IDs. Roughly 10x token savings compared to fetching everything upfront.
- Privacy controls: Wrap sensitive content in `<private>` tags to exclude it from storage and AI summarization (see the example after this list). All data stays local by default.
- One-command install: `npx claude-mem install` handles Bun, uv, plugin registration, and worker startup. IDE auto-detection covers Claude Code, Cursor, Windsurf, Gemini CLI, and OpenCode.
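As an example of the privacy tags, a prompt can fence off a credential so it never reaches the store or the summarization call (illustrative content, hypothetical values):

```text
Rotate the staging database password and update the deploy script.
<private>
DATABASE_URL=postgres://admin:s3cret@staging-db:5432/app
</private>
```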
To get started, run:
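```bash
# Installs Bun and uv if needed, registers the plugin, and starts the worker
npx claude-mem install
```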
Why It Matters
AI coding agents can now perform multi-file refactors, bug investigations, and architectural changes. The problem is that capability and memory don't grow together. The more complex your workflow, the more it costs to re-explain context every session.
What I find significant about claude-mem's traction is the scope. 74.8K stars on a memory plugin says the session amnesia problem is widespread, and that developers aren't waiting for model vendors to solve it. They're building the fix themselves.
The server-beta runtime is the part worth paying attention to for teams. Individual developers already had the local SQLite mode. Postgres with tenant isolation allows teams to pool memory across developers while maintaining project-level isolation. That's shared agent memory as infrastructure, not a personal config. I haven't seen many open-source projects get there this quickly.
Example Use Case
A backend engineer works on a Node.js API with Express and Postgres. On Monday, they use Claude Code to refactor authentication middleware, switching from session cookies to JWT. claude-mem captures every file read, edit, and test run. The worker compresses these into a summary along these lines (illustrative; paths and details are hypothetical):
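```text
Refactored auth from session cookies to JWT. Middleware consolidated in
src/middleware/auth.js; login and refresh routes updated. Token expiry
set to 15 minutes. All auth tests passing.
```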
On Wednesday, they open a new Claude Code session to add role-based access control. The SessionStart hook injects the compressed context from Monday. Claude already knows the app uses JWT, where the middleware lives, and which routes have changed. The re-explanation doesn't happen.
That time saving is small per session. Across a week of multi-session development work, it compounds.
Competitive Context
- Claude Code: Ships with a `.claude/` directory for per-project instructions. The limitation is that these are static files the developer writes and maintains by hand; claude-mem generates and updates context automatically from actual session data.
- GitHub Copilot: Relies on the open file and repository context within the IDE. It has no built-in mechanism for persisting what happened in previous sessions.
- claude-mem: Middleware layer covering Claude Code, Gemini, Codex, OpenCode, Windsurf, Cursor, and more. Positioned as a vendor-agnostic layer rather than a feature tied to any single agent.
That matters for teams running more than one coding agent across their stack.
My Take
Install it if you run multi-session AI coding workflows. One command, data stays local by default, and token cost stays low. The Apache-2.0 license change and the server-beta runtime together tell me this project is moving toward team and enterprise use cases.
I'm curious whether team memory actually changes how developers coordinate, or whether it just reduces the re-explanation tax. Worth testing to find out.
Written by

Molisha Shah
GTM
Molisha is an early GTM and Customer Champion at Augment Code, where she focuses on helping developers understand and adopt modern AI coding practices. She writes about clean code principles, agentic development environments, and how teams are restructuring their workflows around AI agents. She holds a degree in Business and Cognitive Science from UC Berkeley.