This is a structured workflow for running machine learning experiments in four mandatory stages: reproduce the baseline, tune hyperparameters, validate your novel method, and run ablations. Each stage has an attempt budget (20, 12, 12, 18) and a gate condition you must pass before moving forward. The discipline is the point here. It forces you to verify your baseline actually works before you start testing your fancy new architecture, and it stops you from burning 50 GPU hours on hyperparameter searches that should have taken 12 runs. Integrates with other EvoScientist skills for loading prior experimental strategies and diagnosing failures. If you tend to jump straight to testing your idea without validating the baseline setup first, this will save you from yourself.
npx skills add https://github.com/evoscientist/evoskills --skill experiment-pipeline