A testing harness for writing eval fixtures as YAML files and running them directly from your MCP client. Each fixture defines steps made up of tool calls, inputs, and assertions such as output_contains, schema_match, or latency_under. There are two modes: live mode spawns a real MCP server over stdio and tests against actual tool responses, while simulation mode runs assertions against static expected_output strings. The tools include run_suite to execute all tests, regression_report to compare runs, and create_test_case to scaffold new fixtures. Step outputs can be piped into downstream inputs using template syntax. Useful when you are building MCP servers and want regression tests in version control without leaving your editor.
claude mcp add --transport stdio dbsectrainer-mcp-eval-runner -- npx -y mcp-eval-runner
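
To make the simulation mode and output piping described above more concrete, here is a minimal Python sketch of the idea. It is not the package's implementation. The fixture layout (`id`, `tool`, `input`, `assertions`), the `{{steps.<id>.output}}` template form, and the function names are all assumptions made for illustration; only `output_contains` and `expected_output` are names taken from the description. The fixture is written as a dict rather than YAML so the sketch needs no third-party parser.

```python
import re

# Hypothetical fixture. Field names other than expected_output and
# output_contains are illustrative assumptions, not the tool's real schema.
FIXTURE = {
    "name": "search_then_summarize",
    "steps": [
        {
            "id": "search",
            "tool": "search_docs",
            "input": {"query": "refund policy"},
            "expected_output": "Refunds are issued within 14 days.",
            "assertions": [{"output_contains": "14 days"}],
        },
        {
            "id": "summarize",
            "tool": "summarize",
            # Assumed template form: pulls in the output of an earlier step.
            "input": {"text": "{{steps.search.output}}"},
            "expected_output": "Summary: refunds take 14 days.",
            "assertions": [
                {"output_contains": "refunds"},
                {"output_contains": "30 days"},  # deliberately failing check
            ],
        },
    ],
}

TEMPLATE = re.compile(r"\{\{\s*steps\.(\w+)\.output\s*\}\}")


def render(value, outputs):
    """Substitute earlier step outputs into strings, recursing into dicts."""
    if isinstance(value, str):
        return TEMPLATE.sub(lambda m: outputs[m.group(1)], value)
    if isinstance(value, dict):
        return {key: render(item, outputs) for key, item in value.items()}
    return value


def check(assertion, output):
    """Evaluate one single-key assertion against a step's output string."""
    ((kind, expected),) = assertion.items()
    if kind == "output_contains":
        return expected in output
    raise ValueError(f"unsupported assertion in this sketch: {kind}")


def run_simulation(fixture):
    """Run every step against its static expected_output; no server is spawned."""
    outputs = {}
    results = []
    for step in fixture["steps"]:
        resolved_input = render(step["input"], outputs)
        output = step["expected_output"]  # simulation: the static string is the output
        outputs[step["id"]] = output
        for assertion in step["assertions"]:
            results.append({
                "step": step["id"],
                "input": resolved_input,
                "assertion": assertion,
                "passed": check(assertion, output),
            })
    return results


if __name__ == "__main__":
    for result in run_simulation(FIXTURE):
        status = "PASS" if result["passed"] else "FAIL"
        print(status, result["step"], result["assertion"])
```

In this sketch a live mode would differ only in where `output` comes from: the tool's actual response from a spawned server instead of the static string.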