This is the operational playbook for agents that run longer than a single terminal session. It covers what matters when Claude runs in production: lifecycle controls, observability hooks, safety boundaries, and incident response patterns. The metrics section is especially practical: it tracks retries per task and failure class distribution rather than vanity numbers. If you're moving beyond local experiments into containerized deployments, PM2 processes, or CI/CD pipelines, this gives you the baseline controls you need. The incident pattern is sound: freeze, capture traces, isolate, patch minimally, then resume gradually. It is straightforward ops discipline adapted for agentic workloads.
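To make the two metrics named above concrete, here is a minimal sketch of how retries per task and failure class distribution might be computed from run records. The `TaskRun` record, its field names, and the failure class labels are hypothetical and chosen for illustration; they are not taken from the skill itself.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class TaskRun:
    """Hypothetical record of one agent task (not the skill's own schema)."""
    task_id: str
    attempts: int                        # total attempts, including the first
    failure_class: Optional[str] = None  # None means the task never failed


def retries_per_task(runs: List[TaskRun]) -> float:
    """Average number of retries (attempts beyond the first) per task."""
    if not runs:
        return 0.0
    return sum(max(r.attempts - 1, 0) for r in runs) / len(runs)


def failure_class_distribution(runs: List[TaskRun]) -> Dict[str, float]:
    """Share of each failure class among tasks that recorded a failure."""
    counts = Counter(r.failure_class for r in runs if r.failure_class is not None)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: n / total for cls, n in counts.items()}


# Illustrative data only; the class labels are assumptions.
runs = [
    TaskRun("a", attempts=1),
    TaskRun("b", attempts=3, failure_class="timeout"),
    TaskRun("c", attempts=2, failure_class="tool_error"),
    TaskRun("d", attempts=2, failure_class="timeout"),
]
avg_retries = retries_per_task(runs)
distribution = failure_class_distribution(runs)
```

Tracking the distribution rather than a single failure count is what makes the numbers actionable: a spike in one class points at a specific fix, while a raw total does not.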
npx -y skills add affaan-m/everything-claude-code --skill enterprise-agent-ops --agent claude-code
Installs into .claude/skills of the current project.
sickn33/antigravity-awesome-skills
moizibnyousaf/ai-agent-skills
github/awesome-copilot