Instead of hand-tuning prompts and memory systems, this framework automatically searches over the scaffolding code around your base model: what it stores, what it retrieves, and what it sees at each step. You define an evaluation metric, and a proposer agent (Claude Code by default) generates and tests harness variants in a loop, evolving everything from memory strategies to context construction. The repo includes reference experiments for text classification and terminal tasks, plus an onboarding doc that walks you through adapting it to new domains. Think of it as automated hyperparameter search, but over the entire wrapper around your LLM rather than just numerical configs. It uses uv for dependency management, and each experiment is self-contained.
npx skills add https://github.com/aradotso/trending-skills --skill meta-harness-optimization
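
To make the propose-and-evaluate loop described above concrete, here is a minimal sketch in Python. It is not the framework's actual API: `Harness`, `propose`, `search`, and `toy_metric` are hypothetical names invented for illustration. A random mutation stands in for the proposer agent, and a toy scoring function stands in for a real evaluation metric.

```python
import random
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Harness:
    # Hypothetical knobs standing in for real scaffolding code.
    memory_items: int  # how many past items the harness stores
    retrieve_k: int    # how many items it pulls into context per step


def propose(parent: Harness, rng: random.Random) -> Harness:
    """Stand-in for the proposer agent: perturb one knob at random."""
    if rng.random() < 0.5:
        step = rng.choice([-8, 8])
        return replace(parent, memory_items=max(0, parent.memory_items + step))
    step = rng.choice([-1, 1])
    return replace(parent, retrieve_k=max(0, parent.retrieve_k + step))


def search(
    seed: Harness,
    evaluate: Callable[[Harness], float],
    rounds: int,
    rng: random.Random,
) -> Tuple[Harness, float, List[Tuple[Harness, float]]]:
    """Greedy loop: propose a variant, score it, keep it only if it improves."""
    best, best_score = seed, evaluate(seed)
    history = [(best, best_score)]
    for _ in range(rounds):
        candidate = propose(best, rng)
        score = evaluate(candidate)
        history.append((candidate, score))
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score, history


def toy_metric(h: Harness) -> float:
    # Assumed toy metric (higher is better), peaked at memory_items=32, retrieve_k=4.
    return -abs(h.memory_items - 32) / 8 - abs(h.retrieve_k - 4)


if __name__ == "__main__":
    best, score, history = search(Harness(0, 0), toy_metric, 50, random.Random(0))
    print(best, score, len(history))
```

In a real run, the metric would execute the harness against a task suite, and the proposer would rewrite harness code rather than nudge two integers. The shape of the loop (propose, evaluate, keep the best) is the part this sketch is meant to show.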