When production is on fire, this skill gives you the SRE playbook: severity classification, incident command structure, observability-driven investigation using OpenTelemetry and distributed tracing, and blameless post-mortem templates. It walks you through the first five minutes with clear stabilization steps, then guides communication cadences, recovery validation, and error budget analysis. The modern observability coverage is solid: Prometheus, Grafana, service mesh analysis, and chaos engineering correlation. If you're building incident response muscles, or need structured thinking during an outage instead of panic, this is the systematic approach that keeps you from missing obvious steps when everything's breaking.
npx skills add https://github.com/sickn33/antigravity-awesome-skills --skill incident-responder
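
For a feel of what the error budget analysis step involves, here is a minimal Python sketch of the standard arithmetic. It is not taken from the skill itself: the function names, the 30-day window, and the 99.9% target are all assumptions chosen for illustration.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability SLO over a window.

    slo_target is a fraction such as 0.999 (hypothetical example value).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be a fraction between 0 and 1")
    window_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * window_minutes


def budget_remaining_fraction(
    slo_target: float, downtime_minutes: float, window_days: int = 30
) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget_minutes(slo_target, window_days)
    return (budget - downtime_minutes) / budget


if __name__ == "__main__":
    budget = error_budget_minutes(0.999)
    left = budget_remaining_fraction(0.999, downtime_minutes=21.6)
    print(f"budget: {budget:.1f} min, remaining: {left:.0%}")
```

Under these assumptions, a 99.9% target over 30 days allows roughly 43.2 minutes of downtime, so a 21.6-minute outage spends about half the budget. A real analysis would usually draw the downtime figure from your monitoring data rather than a hand-entered number.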