NVIDIA's runtime safety layer for LLM apps: it runs programmable checks before queries reach your model and before responses reach users. Safety rules are written in the Colang 2.0 DSL and cover jailbreak detection, PII filtering, hallucination checks, and toxicity blocking. Use it when you need several safety mechanisms in one place rather than stitching together separate APIs. It runs on a CPU or a T4 GPU and typically adds 100-500 ms of overhead. Because it relies on pattern matching, expect to tune thresholds to avoid false positives, but having everything from input validation to fact checking in a single framework beats managing five different services. With over 4,300 GitHub stars and deployments in NVIDIA's enterprise products, it's battle-tested in production.
npx skills add https://github.com/orchestra-research/ai-research-skills --skill nemo-guardrails
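
To make the "checks before the model, checks before the user" idea concrete, here is a minimal sketch in plain Python. It is not the NeMo Guardrails API and does not use Colang; every name in it (`input_rail`, `output_rail`, `guarded_call`, the patterns, the 0.5 threshold) is a hypothetical stand-in chosen for illustration. It only shows the general shape of a pattern-matching input check with a tunable threshold, followed by an output check that redacts email addresses.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical patterns for illustration only; a real rule set would be far broader.
JAILBREAK_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"pretend (you are|to be) .* without (any )?restrictions", re.IGNORECASE),
]
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

REFUSAL = "Sorry, I can't help with that request."


@dataclass
class RailResult:
    allowed: bool
    text: str
    reason: Optional[str] = None


def jailbreak_score(text: str) -> float:
    """Fraction of the jailbreak patterns that match the text (0.0 to 1.0)."""
    hits = sum(1 for pattern in JAILBREAK_PATTERNS if pattern.search(text))
    return hits / len(JAILBREAK_PATTERNS)


def input_rail(user_text: str, threshold: float = 0.5) -> RailResult:
    """Block the query when its score reaches the threshold."""
    score = jailbreak_score(user_text)
    if score >= threshold:
        return RailResult(False, user_text, f"jailbreak score {score:.2f}")
    return RailResult(True, user_text)


def output_rail(model_text: str) -> RailResult:
    """Redact email addresses from the model response before it reaches the user."""
    return RailResult(True, EMAIL_PATTERN.sub("[REDACTED_EMAIL]", model_text))


def guarded_call(
    user_text: str,
    model: Callable[[str], str],
    threshold: float = 0.5,
) -> str:
    """Run the input rail, call the model, then run the output rail."""
    checked = input_rail(user_text, threshold)
    if not checked.allowed:
        return REFUSAL
    return output_rail(model(checked.text)).text
```

Calling `guarded_call(prompt, model)` with any callable that maps a string to a string wires the two checks around it. Raising `threshold` makes the input check more permissive, which is the kind of tuning the description refers to when it mentions avoiding false positives.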