This is a full-stack diagnostic for when your agent or LLM application works in the playground but breaks in production, or when wrapper layers somehow make the base model worse. It audits 12 layers where things go wrong: system prompts, memory pollution, tool discipline failures, hidden repair loops that silently rerun LLM calls, and rendering corruption between generation and delivery. You get severity-ranked findings with code-first fixes, not prompt tweaks. Use it when shipping agent features to production, when "the agent is getting worse" reports come in, or when you've been debugging agent behavior for more than 15 minutes without finding the root cause. The workflow searches your codebase for anti-patterns like tool requirements only in prompts instead of code, memory admission without user-correction priority, and silent output mutation layers.
npx skills add https://github.com/affaan-m/everything-claude-code --skill agent-architecture-audit