A proxy layer that wraps existing MCP servers and replaces full tool schemas with compressed stubs (tool name plus one-line description), then lazy-loads complete schemas only when tools are actually invoked. Verified benchmarks show 6-7x token reduction across 10+ servers with 100+ tools, dropping overhead from 34k to 5k tokens. Includes disk caching with 24-hour TTL, response compression for large JSON payloads, and writes auditable proof logs to local JSONL files you can review with the built-in report command. Reach for this when you're running multiple MCP servers and context window overhead is eating your budget before the LLM even starts reasoning.
claude mcp add --transport stdio io.github.kira-autonoma-context-proxy uvx context-proxy