Adds a pre-LLM shield that flags prompt injections, jailbreaks, and social engineering attacks before they reach your agent. It exposes AgentShield's classification API (99.4% recall, sub-100 ms p95 latency) through MCP tools so Claude can check user input or tool outputs for malicious payloads. The classify operation returns a verdict, a category, and a confidence score. Useful when you are building agents that handle untrusted input or need runtime protection beyond system prompts. The free tier gives you 100 requests per day. The benchmark harness is reproducible, so you can verify the numbers yourself against the deepset, PINT, jackhhao, and SPML datasets.
claude mcp add --transport stdio io.github.dl-eigenart-agentshield-mcp -- npx -y @eigenart/agentshield-mcp
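
Once the server is installed, the result of a classify call still has to be acted on by your agent. Here is a minimal sketch of one way to gate input on that result. It assumes a result shape that is not confirmed by this listing: the field names `verdict`, `category`, and `confidence`, the verdict strings `"malicious"` and `"benign"`, a confidence range of 0.0 to 1.0, and the 0.8 threshold are all hypothetical. Check the actual tool output and adjust accordingly.

```python
from dataclasses import dataclass


# Hypothetical shape of a classify result. The listing only says the result
# carries a verdict, a category, and a confidence score; the field names,
# the verdict strings, and the 0.0-1.0 confidence range are assumptions.
@dataclass(frozen=True)
class Classification:
    verdict: str       # assumed values: "malicious" or "benign"
    category: str      # assumed example: "prompt_injection"
    confidence: float  # assumed range: 0.0 to 1.0


def should_block(result: Classification, threshold: float = 0.8) -> bool:
    """Return True when the input should be withheld from the agent."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    # Block only when the verdict is malicious AND the score clears the bar.
    return result.verdict == "malicious" and result.confidence >= threshold


def gate(text: str, result: Classification, threshold: float = 0.8) -> str:
    """Pass the text through unchanged, or replace it with a block notice."""
    if should_block(result, threshold):
        return f"[blocked: {result.category}]"
    return text
```

Requiring both a malicious verdict and a minimum confidence lets you trade false positives against missed attacks by tuning a single threshold. Low-confidence flags pass through in this sketch; a stricter deployment might log them or route them for review instead.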