This is a comprehensive platform for LLM observability and evaluation that actually brings together the full stack: OpenTelemetry-native tracing, 50+ evaluation metrics, guardrails, and an OpenAI-compatible gateway. You instrument your OpenAI, LangChain, or LlamaIndex code with a few lines, and it automatically traces every call. The evaluation side handles everything from hallucination detection to custom rubrics and batch runs over datasets. It's self-hostable via Docker Compose or Kubernetes, which matters if you're dealing with sensitive data or want to avoid vendor lock-in. The skill gives you working examples for both Python and TypeScript, plus the actual setup commands for spinning up the infrastructure.
npx skills add https://github.com/aradotso/trending-skills --skill future-agi-platform