Reads every benchmark run JSON in your docs folder and shows how win rate, latency, escalation rate, and cost have drifted over time. The smoke gate is a binary pass or fail, so it misses slow creep: a win rate that slides from 100% to 85% can still pass smoke while signaling real degradation. Run it before releases to catch performance regressions the gate missed, or after corpus changes to verify consistency. It raises a flag when win rate drops or latency climbs 1.5x between the first and last run. In short, it turns your accumulated benchmark history into a regression detector instead of letting those JSON files sit unused.
npx skills add https://github.com/ruvnet/ruflo --skill cost-trend
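
The first-versus-last comparison described above can be sketched roughly as follows. This is an illustration only, not the skill's actual implementation: the field names `timestamp`, `win_rate`, and `latency_ms`, the helper names `load_runs` and `detect_drift`, and the sample values are all assumptions, and the real run JSON schema may differ. Only the 1.5x latency factor and the 100% to 85% win-rate slide come from the description.

```python
import json
from pathlib import Path

# From the description: flag when latency climbs 1.5x between first and last run.
LATENCY_FACTOR = 1.5


def load_runs(folder):
    """Load every *.json file in `folder`, ordered by an assumed `timestamp` field."""
    runs = [json.loads(path.read_text()) for path in Path(folder).glob("*.json")]
    return sorted(runs, key=lambda run: run["timestamp"])


def detect_drift(runs):
    """Compare the first and last run and return a list of warning strings."""
    if len(runs) < 2:
        return []  # a trend needs at least two runs
    first, last = runs[0], runs[-1]
    warnings = []
    # Any drop in win rate is flagged here; the real threshold is not documented.
    if last["win_rate"] < first["win_rate"]:
        warnings.append(
            f"win rate dropped: {first['win_rate']:.0%} -> {last['win_rate']:.0%}"
        )
    if last["latency_ms"] >= LATENCY_FACTOR * first["latency_ms"]:
        ratio = last["latency_ms"] / first["latency_ms"]
        warnings.append(f"latency climbed {ratio:.2f}x")
    return warnings


# Made-up history (hypothetical values) that mirrors the slow creep described above.
history = [
    {"timestamp": "2024-01-01", "win_rate": 1.00, "latency_ms": 200},
    {"timestamp": "2024-02-01", "win_rate": 0.92, "latency_ms": 240},
    {"timestamp": "2024-03-01", "win_rate": 0.85, "latency_ms": 310},
]

for warning in detect_drift(history):
    print(warning)
```

No single step in this history looks alarming, yet comparing the endpoints reports both the win-rate drop and the latency climb. To run the same check against files on disk, pass the result of `load_runs("docs")` to `detect_drift`, where `"docs"` stands in for wherever your run JSON files live.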