Three things worth knowing
- claude-mem is a Claude Code plugin that gives AI coding sessions persistent memory, now at 72.4K stars and 6.2K forks.
- It automatically captures what Claude does during sessions, compresses observations with AI, and injects relevant context into future sessions.
- If you're tired of re-explaining your codebase every time you start a new conversation, this is the most practical fix I've seen.
Every developer I talk to who uses Claude Code regularly hits the same wall: close the session, open a new one, and Claude has no idea what you were doing. You re-explain the architecture, re-paste the decisions, re-set the context. claude-mem is the most practical fix I've seen for this.
The plugin automatically captures what Claude does during coding sessions, compresses the observations using AI, and injects relevant context back into the system when a new session starts. It just crossed 72.4K GitHub stars and 6.2K forks, which tells me this pain point is a lot more widespread than vendors like to acknowledge.

What Happened
Developer Alex Newman (@thedotmack) built claude-mem as a Claude Code plugin that hooks into session lifecycle events to observe every tool use, file read, and edit. A background worker compresses those observations using Claude's agent-sdk, stores them in SQLite, and retrieves relevant context when a new session starts.
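To make that pipeline concrete, here is a minimal TypeScript sketch of the capture side. The shapes and names (ToolUseEvent, Observation, compressWithModel, onPostToolUse) are illustrative assumptions, not claude-mem's actual internals; the real plugin persists to SQLite and compresses with Claude's agent-sdk rather than an in-memory array and a string truncation.

```typescript
// Hypothetical shapes -- claude-mem's real hook payloads and schema may differ.
interface ToolUseEvent {
  sessionId: string;
  tool: string;      // e.g. "Read", "Edit", "Bash"
  input: string;     // what the tool was asked to do
  timestamp: number;
}

interface Observation {
  sessionId: string;
  summary: string;   // compressed, searchable description of what happened
  raw: string;       // original event detail, kept for drill-down
  timestamp: number;
}

// Stand-in for the AI compression step; the plugin reportedly uses Claude's
// agent-sdk here. This placeholder just truncates for demonstration.
async function compressWithModel(event: ToolUseEvent): Promise<string> {
  return `${event.tool}: ${event.input.slice(0, 120)}`;
}

// Stand-in for the SQLite-backed store described in the article.
const memoryStore: Observation[] = [];

// A PostToolUse-style hook: fires after every tool call and records an
// observation without the developer doing anything manually.
async function onPostToolUse(event: ToolUseEvent): Promise<void> {
  const summary = await compressWithModel(event);
  memoryStore.push({
    sessionId: event.sessionId,
    summary,
    raw: event.input,
    timestamp: event.timestamp,
  });
}
```

The split between summary and raw is the trade-off that matters: cheap summaries are what get injected into new sessions, while the full detail stays on disk until something asks for it.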
The project is at v12.6.4 as of May 5, 2026, with 1,840 commits, 109 contributors, and 259 releases in roughly seven months. The commit history shows extensive co-authorship with Claude Opus 4.6 and 4.7; the tool was built using the same AI it serves.
Installation is a single command: npx claude-mem install. For Gemini CLI or OpenCode, add --ide gemini-cli or --ide opencode.
Key Features
- Automatic capture via lifecycle hooks: Five hooks (SessionStart, PostToolUse, Stop, UserPromptSubmit, SessionEnd) record observations without manual tagging or commands. You don't have to remember to log anything.
- AI-compressed summaries: Raw observations get processed through Claude's agent-sdk into semantic summaries stored in SQLite with FTS5 full-text search. The compression is what keeps token costs manageable.
- Hybrid vector search: A Chroma vector database handles semantic search alongside keyword matching. A 3-layer MCP tool workflow (search, timeline, get_observations) reduces token usage by roughly 10x compared to fetching full details up front; a sketch of that layered retrieval follows this list. That efficiency matters on long-running projects.
- Multi-IDE support: Works with Claude Code, Cursor, Gemini CLI, OpenCode, Windsurf, and Codex CLI.
- Web viewer UI: A local dashboard at http://localhost:37777 shows the live observation stream, project filtering, and memory search.
- Privacy controls: Wrap content in <private> tags to exclude it from storage. All data stays local in ~/.claude-mem/.
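The layered retrieval is the part that keeps token costs down, and a sketch makes the reason obvious. The function names below mirror the tool names listed above (search, timeline, get_observations), but the signatures, scoring, and in-memory store are hypothetical, not claude-mem's actual MCP interface.

```typescript
// Hypothetical record shape; claude-mem's actual schema may differ.
interface MemoryRecord {
  id: number;
  summary: string;   // short compressed description (cheap to inject)
  detail: string;    // full observation text (expensive to inject)
  timestamp: number;
}

const store: MemoryRecord[] = []; // stands in for SQLite FTS5 + Chroma

// Layer 1: search -- return only lightweight summaries that match a query.
function search(query: string): Pick<MemoryRecord, "id" | "summary">[] {
  const q = query.toLowerCase();
  return store
    .filter((r) => r.summary.toLowerCase().includes(q))
    .map(({ id, summary }) => ({ id, summary }));
}

// Layer 2: timeline -- order a set of hits chronologically so the agent can
// reason about what happened when, still without pulling full detail.
function timeline(ids: number[]): Pick<MemoryRecord, "id" | "timestamp">[] {
  return store
    .filter((r) => ids.includes(r.id))
    .sort((a, b) => a.timestamp - b.timestamp)
    .map(({ id, timestamp }) => ({ id, timestamp }));
}

// Layer 3: get_observations -- fetch full detail only for the few records
// that actually matter. Most records never reach this layer, which is where
// the roughly 10x token saving comes from.
function getObservations(ids: number[]): MemoryRecord[] {
  return store.filter((r) => ids.includes(r.id));
}
```

The saving comes from the shape of the funnel: many cheap summaries go in, and only a handful of full records come out.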
Why It Matters
AI coding assistants lose all project context when a session ends. Developers compensate by re-explaining architecture, pasting prior decisions, or maintaining manual context files. claude-mem removes that overhead entirely.
A developer can close a Claude Code session on Friday and resume on Monday with Claude already aware of the refactoring decisions, bug fixes, and architectural patterns from the prior week. That's the scenario I keep coming back to when I think about what persistent memory actually changes in practice. The compressed observation format keeps context injection within token budgets, so you're not paying a huge cost for the continuity.
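The budget point is easy to picture: injection only works if the plugin can rank compressed summaries and stop once a limit is hit. A rough sketch of that selection step follows; the 4-characters-per-token estimate and recency-only ranking are purely illustrative assumptions, not how claude-mem actually scores memories.

```typescript
// Naive token estimate (roughly 4 characters per token); illustrative only.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Pick the most recent summaries that fit inside the injection budget.
// The real plugin presumably ranks by relevance too; recency alone keeps
// this sketch short.
function buildContext(
  summaries: { summary: string; timestamp: number }[],
  budgetTokens: number,
): string {
  const picked: string[] = [];
  let used = 0;
  for (const s of [...summaries].sort((a, b) => b.timestamp - a.timestamp)) {
    const cost = estimateTokens(s.summary);
    if (used + cost > budgetTokens) break;
    picked.push(s.summary);
    used += cost;
  }
  return picked.reverse().join("\n"); // oldest first, reads chronologically
}
```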
At 72.4K stars, 6.2K forks, and 259 releases, this has moved well past side project territory.
Example Use Case
A TypeScript team is migrating a Next.js app from the Pages Router to the App Router. Over three days of Claude Code sessions, a developer converts 40+ route files, refines data-fetching patterns, and fixes hydration bugs.
Without claude-mem, each new session requires re-explaining the migration strategy, which files have been converted, and which patterns to follow. With claude-mem installed, the PostToolUse hook captures every file read and edited. The worker compresses these into observations like "converted /pages/dashboard to /app/dashboard/page.tsx using server components for data fetching." When the developer opens a new session the next morning, Claude already knows the migration is 60% complete and which patterns to apply.
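To make the resume scenario concrete, here is roughly what a few compressed observations from that migration might look like, and the kind of digest a new session could be handed. Only the /pages/dashboard example comes from the scenario above; the other strings are invented for illustration.

```typescript
// Invented examples in the style of the observation quoted above.
const migrationObservations = [
  "converted /pages/dashboard to /app/dashboard/page.tsx using server components for data fetching",
  "fixed hydration mismatch in /app/settings/page.tsx by moving date formatting to the client",
  "established pattern: route-level loading.tsx for every converted segment",
];

// A session-start digest built from those summaries -- the kind of context
// that gets injected so the next morning's session starts warm.
const sessionStartContext = [
  "Ongoing work: Pages Router -> App Router migration (roughly 60% of routes converted).",
  ...migrationObservations.map((o) => `- ${o}`),
].join("\n");

console.log(sessionStartContext);
```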
That's the workflow I'd demo to anyone skeptical about whether persistent memory actually changes how you work. It does.
Competitive Context
Claude Code ships with CLAUDE.md for manual project context, but developers write and maintain it themselves. claude-mem automates that entirely and generates it from actual session data rather than what someone remembered to document.
Compared to standalone AI memory tools like Mem0 or SuperMemory, claude-mem runs as a native Claude Code plugin using Claude's own agent-sdk for compression. That tight integration with the Anthropic ecosystem is its strength: the memory layer speaks the same language as the tool it's serving. The multi-IDE support via Cursor, Gemini CLI, and OpenCode broadens the reach for teams that aren't all-in on one runtime.
The AGPL-3.0 license is worth knowing about. Teams deploying modified versions on network servers must open-source their changes. For most users, this doesn't matter, but enterprise teams should read the license before building on top of it.
My Take
claude-mem solves a specific, painful problem cleanly. If you use Claude Code across multiple sessions on the same codebase, especially for debugging, refactoring, or long-running feature work, install it. One command, runs locally, zero changes to your existing workflow.
At 72.4K stars, 6.2K forks, and 109 contributors, the community weight is real.
claude-mem solves memory for one developer. Cosmos builds it into the whole team.
Free tier available · VS Code extension · Takes 2 minutes
Written by

Paula Hingel
Technical Writer
Paula writes about the patterns that make AI coding agents actually work — spec-driven development, multi-agent orchestration, and the context engineering layer most teams skip. Her guides draw on real build examples and focus on what changes when you move from a single AI assistant to a full agentic codebase.