This is a token optimization layer that sits between your prompts and LLM APIs and reduces context size through semantic pruning. Reach for it when you keep hitting token limits or want to cut API costs without losing the meaning of your prompts. The server runs as a gateway service: you send verbose context, and it compresses that context before forwarding it to the actual LLM. Think of it as a preprocessing step that strips redundant information while preserving semantic content. It is useful for long-document summarization workflows, chunked retrieval-augmented generation, or any scenario where you work with large context windows and want automatic optimization without manual prompt engineering.
claude mcp add --transport sse io.github.evozim-token-diet https://token-diet-mcp.vercel.app/api/mcp
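
To make the idea of stripping redundant information concrete, here is a minimal sketch of a pruning pass. It is not the server's actual algorithm, which the description does not specify. The function name `prune_redundant`, the `threshold` parameter, and the use of word-set (Jaccard) overlap as a rough proxy for semantic similarity are all assumptions made for illustration.

```python
from __future__ import annotations

import re


def _tokens(sentence: str) -> frozenset[str]:
    """Lowercased word set, used here as a crude stand-in for meaning."""
    return frozenset(re.findall(r"[a-z0-9']+", sentence.lower()))


def prune_redundant(text: str, threshold: float = 0.8) -> str:
    """Drop sentences whose word set overlaps heavily with an earlier one.

    Hypothetical illustration only: Jaccard overlap on word sets replaces
    whatever semantic comparison the real service performs.
    """
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    kept: list[str] = []
    seen: list[frozenset[str]] = []
    for sentence in sentences:
        words = _tokens(sentence)
        if not words:
            continue
        # A sentence is redundant if it is close enough to any kept sentence.
        is_duplicate = any(
            len(words & prior) / len(words | prior) >= threshold for prior in seen
        )
        if not is_duplicate:
            kept.append(sentence)
            seen.append(words)
    return " ".join(kept)


if __name__ == "__main__":
    verbose = (
        "The invoice is due on Friday. "
        "Please note the invoice is due on Friday! "
        "Payment can be made by bank transfer."
    )
    print(prune_redundant(verbose, threshold=0.6))
```

In this sketch a lower `threshold` prunes more aggressively, so sentences that merely share vocabulary may be dropped as well. A real semantic pruner would likely compare embeddings instead of raw word sets.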