You'd reach for this when running multiple LLMs that need to share context without resending it, and hitting token limits, on every request. It compresses semantic information into a persistent memory layer that swarm agents can query instead of repeating full context windows. The source gives few implementation details, but the core idea is clear: treat shared memory as a service rather than cramming everything into prompts. That makes it useful if you're orchestrating multiple AI agents that need to reference common knowledge or past interactions without each one carrying the full conversational history. It is served remotely over SSE transport, so it sits between your agents rather than inside them.
claude mcp add --transport sse io.github.evozim-recallmax https://recallmax-mcp.vercel.app/api/mcp
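
To make the "shared memory as a service" idea concrete, here is a minimal in-process sketch. It is not RecallMax's API, which the source does not document: the `SharedMemory` class, its `remember` and `recall` methods, and the word-overlap scoring are all hypothetical stand-ins chosen for illustration. A real service would presumably use compressed or semantic retrieval rather than plain keyword matching.

```python
from dataclasses import dataclass, field


@dataclass
class SharedMemory:
    """Hypothetical stand-in for a shared memory service (not RecallMax's API)."""

    notes: list[tuple[str, str]] = field(default_factory=list)  # (agent, note)

    def remember(self, agent: str, note: str) -> None:
        """Store a short note once, instead of repeating it in every prompt."""
        self.notes.append((agent, note))

    def recall(self, query: str, limit: int = 3) -> list[str]:
        """Return up to `limit` notes that share the most words with the query."""
        terms = set(query.lower().split())
        scored = []
        for agent, note in self.notes:
            overlap = len(terms & set(note.lower().split()))
            if overlap:
                scored.append((overlap, f"{agent}: {note}"))
        # Highest overlap first; ties keep insertion order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:limit]]


memory = SharedMemory()
memory.remember("planner", "user prefers PostgreSQL over MySQL")
memory.remember("researcher", "deadline for the migration is Friday")

# A third agent asks only for what it needs instead of carrying the full history.
context = memory.recall("user database preference")
print(context)
```

The point of the design is that each agent's prompt carries only the few recalled notes, while the full history lives in one place that every agent can query.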