You need this the moment you start wondering why your DSPy optimizer isn't improving. The real insight here is that GEPA lives or dies on rich textual feedback, not just scores. The skill has your metric return a `dspy.Prediction(score=..., feedback=...)` instead of a dict (which breaks parallel aggregation), shows you how to build multi-axis metrics that actually teach the optimizer something useful, and hammers home the separate valset rule because optimizers overfit frighteningly fast. The canonical example walks through correctness plus citation plus conciseness with specific feedback strings. It also covers CI integration with cached LMs and the usual tracing hooks for MLflow and Weights & Biases. If your metric is a bare float, you're leaving signal on the table.
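To make the multi-axis idea concrete, here is a minimal sketch of a metric that scores correctness, citation, and conciseness, and explains each judgment in a feedback string. It is not the skill's own code. The field names `answer` and `citations`, the 60-word budget, and the axis weights are all assumptions chosen for illustration, and the fallback import exists only so the sketch runs without DSPy installed.

```python
from types import SimpleNamespace

try:
    import dspy
    Prediction = dspy.Prediction
except ImportError:
    # Fallback so this sketch runs without DSPy; not part of any real setup.
    Prediction = SimpleNamespace

MAX_WORDS = 60  # hypothetical conciseness budget
WEIGHTS = {"correct": 0.6, "cited": 0.25, "concise": 0.15}  # hypothetical weights


def multi_axis_metric(gold, pred, trace=None, pred_name=None, pred_trace=None):
    """Score one prediction on three axes and say why, in plain text."""
    answer = (getattr(pred, "answer", "") or "").strip()
    citations = getattr(pred, "citations", None) or []
    notes = []

    # Axis 1: correctness (simple containment check, an assumption).
    correct = gold.answer.strip().lower() in answer.lower()
    notes.append(
        "Answer is correct."
        if correct
        else f"Answer is wrong: expected it to contain '{gold.answer}'."
    )

    # Axis 2: at least one citation is attached.
    cited = len(citations) > 0
    notes.append(
        f"Cites {len(citations)} source(s)."
        if cited
        else "No citations given: cite the passage that supports the answer."
    )

    # Axis 3: conciseness against the word budget.
    n_words = len(answer.split())
    concise = n_words <= MAX_WORDS
    notes.append(
        f"Length is fine ({n_words} words)."
        if concise
        else f"Too long ({n_words} words): stay under {MAX_WORDS}."
    )

    score = (
        WEIGHTS["correct"] * correct
        + WEIGHTS["cited"] * cited
        + WEIGHTS["concise"] * concise
    )
    return Prediction(score=round(score, 3), feedback=" ".join(notes))
```

A metric shaped like this would typically be passed as `metric=` when constructing the GEPA optimizer, with the training and validation examples kept as separate sets. The feedback strings matter more than the exact weights: they tell the optimizer which axis failed and what to change.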
npx -y skills add intertwine/dspy-agent-skills --skill dspy-evaluation-harness --agent claude-code
Installs into .claude/skills of the current project.