Implements self-critique loops in which Claude generates output, evaluates it against your criteria, then refines it based on its own feedback. Includes evaluator-optimizer patterns, test-driven code refinement, and LLM-as-judge scoring with JSON-structured critiques. Most useful for quality-critical tasks like code generation, reports, or analysis where you have clear success metrics. The reflection patterns prevent single-shot mediocrity by forcing iterative improvement, though you'll want to set an iteration limit to avoid endless loops. Works best when your evaluation criteria are specific rather than subjective.
npx skills add https://github.com/github/awesome-copilot --skill agentic-eval
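
For orientation, here is a minimal sketch of the evaluator-optimizer loop the description refers to, written against the Anthropic TypeScript SDK. The prompts, the `PASS_SCORE` threshold, the iteration cap, and the model alias are illustrative assumptions, not the skill's actual internals.

```typescript
// Minimal evaluator-optimizer sketch: generate, judge against criteria with a
// JSON-structured critique, refine until the score clears a threshold or the
// iteration cap is hit. All prompts and constants below are assumptions.
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

const MODEL = "claude-3-5-sonnet-latest"; // assumption: any capable model works
const MAX_ITERATIONS = 3;                 // cap to avoid endless refinement loops
const PASS_SCORE = 8;                     // accept once the judge scores >= 8/10

async function complete(prompt: string): Promise<string> {
  const msg = await client.messages.create({
    model: MODEL,
    max_tokens: 1024,
    messages: [{ role: "user", content: prompt }],
  });
  const block = msg.content[0];
  return block.type === "text" ? block.text : "";
}

async function refineWithCritique(task: string, criteria: string): Promise<string> {
  let draft = await complete(task);

  for (let i = 0; i < MAX_ITERATIONS; i++) {
    // Evaluator step (LLM-as-judge): request a JSON-structured critique.
    const judgement = await complete(
      `Evaluate the following output against these criteria: ${criteria}\n\n` +
      `Output:\n${draft}\n\n` +
      `Respond with JSON only: {"score": <1-10>, "critique": "<specific issues>"}`
    );

    let score = 0;
    let critique = "";
    try {
      const parsed = JSON.parse(judgement);
      score = parsed.score;
      critique = parsed.critique;
    } catch {
      break; // judge did not return valid JSON; keep the current draft
    }

    if (score >= PASS_SCORE) break; // good enough, stop refining

    // Optimizer step: regenerate the draft using the judge's critique.
    draft = await complete(
      `Task: ${task}\n\nPrevious attempt:\n${draft}\n\n` +
      `Critique: ${critique}\n\n` +
      `Rewrite the output to address every point in the critique.`
    );
  }
  return draft;
}
```

As a usage example, `refineWithCritique("Write a changelog entry for v2.1", "concise, factual, covers breaking changes")` runs at most three generate-judge-refine rounds; the specific criteria string is what keeps the judge's scores from being subjective.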