Inspired by Karpathy's autoresearch loop, this skill systematically improves your SKILL.md files through cycles of evaluation, modification, and validation. It scores each skill across 8 dimensions totaling 100 points (60 for structure, such as workflow clarity and error handling, and 40 for effectiveness, tested with real prompts), then runs hill-climbing optimization under git version control. The key insight is dual evaluation: rather than only checking whether your markdown looks good, it spawns independent sub-agents to run test prompts and compares their output quality against a baseline. Each change must strictly beat the previous score or it is automatically reverted. Use this when you want to stop guessing whether your skills work better and start measuring it. Note that it pauses for human approval between optimizations, so you will need to confirm each change.
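The accept-or-revert rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the skill's actual implementation: `Score`, `toy_evaluate`, `hill_climb`, and the `approve` callback are hypothetical names, the toy scorer stands in for the real 8-dimension rubric and sub-agent runs, and skill versions are held in memory rather than in git commits.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Score:
    """Hypothetical two-part score: structure (0-60) plus effectiveness (0-40)."""
    structure: int
    effectiveness: int

    @property
    def total(self) -> int:
        return self.structure + self.effectiveness


def toy_evaluate(text: str) -> Score:
    """Stand-in scorer (an assumption): counts headings and examples, capped at 60/40."""
    structure = min(60, 20 * text.count("## "))
    effectiveness = min(40, 10 * text.count("Example"))
    return Score(structure, effectiveness)


def hill_climb(
    skill: str,
    candidates: Iterable[str],
    evaluate: Callable[[str], Score],
    approve: Callable[[str], bool],
) -> Tuple[str, Score, List[Tuple[str, int]]]:
    """Keep a candidate only if a human approves it and it strictly beats the best score."""
    best, best_score = skill, evaluate(skill)
    history = [("baseline", best_score.total)]
    for candidate in candidates:
        if not approve(candidate):  # human-in-the-loop gate
            history.append(("skipped", best_score.total))
            continue
        score = evaluate(candidate)
        if score.total > best_score.total:  # strict improvement required
            best, best_score = candidate, score
            history.append(("kept", score.total))
        else:  # a tie or regression falls back to the previous best
            history.append(("reverted", best_score.total))
    return best, best_score, history
```

Calling `hill_climb(base, candidates, toy_evaluate, approve=lambda c: True)` returns the best version, its score, and a log of which candidates were kept or reverted. Using `>` rather than `>=` is the design choice that matters here: a candidate that merely ties the current best is treated as a revert, so the loop never drifts sideways.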
npx -y skills add alchaincyf/darwin-skill --skill darwin-skill --agent claude-code
Installs to .claude/skills
juliusbrussee/caveman
mattpocock/skills
shadcn/improve
obra/superpowers
forrestchang/andrej-karpathy-skills
vercel-labs/skills