If you're running LLMs in production without tracking what they do, you're flying blind. This skill gives Claude expertise in Langfuse, an open-source observability platform that traces LLM calls, manages prompts, and tracks costs, with integrations for OpenAI, LangChain, and LlamaIndex. It covers the full workflow, from basic tracing setup to scoring responses and A/B testing prompts. The standout pattern is the drop-in OpenAI wrapper, which logs your calls automatically. It's most valuable when you're debugging why GPT-4 gave a weird response last Tuesday or working out which prompt version actually performs better. Just remember to call flush() in serverless environments, where the process can exit before buffered traces are sent, or you'll lose them.
npx skills add https://github.com/davila7/claude-code-templates --skill langfuse
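
The flush() reminder above is easier to see with a toy model. The sketch below does not use the Langfuse SDK: `BufferedTracer`, `handler`, and `delivered` are hypothetical stand-ins, written on the assumption that a tracing client batches events in memory and only sends them when the batch fills up or flush() is called. Under that assumption, a short-lived serverless handler needs to flush before it returns.

```python
from typing import Callable


class BufferedTracer:
    """Hypothetical stand-in for a batching tracing client (not the Langfuse API)."""

    def __init__(self, send: Callable[[list[dict]], None], batch_size: int = 10) -> None:
        self._send = send              # callable that delivers a batch of events
        self._batch_size = batch_size  # events to collect before sending on our own
        self._buffer: list[dict] = []

    def trace(self, name: str, **fields) -> None:
        # Events are only buffered here; nothing is sent until the batch is full.
        self._buffer.append({"name": name, **fields})
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        # Deliver whatever is buffered, then start a fresh buffer.
        if self._buffer:
            self._send(self._buffer)
            self._buffer = []


# Stand-in for the remote backend: delivered events simply land in this list.
delivered: list[dict] = []
tracer = BufferedTracer(send=delivered.extend)


def handler(event: dict) -> dict:
    """Hypothetical serverless entry point."""
    try:
        tracer.trace("llm-call", prompt=event["prompt"])
        return {"status": "ok"}
    finally:
        # Without this, the single buffered event would still be in memory
        # when the function instance is frozen or torn down.
        tracer.flush()
```

Putting the flush in a `finally` block means traces are delivered even when the handler raises, which is usually when you most want them. With the real SDK, the equivalent step would be calling the client's own flush() before the function returns.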