This walks you through the three pieces you need for LangSmith evaluations: writing evaluators (LLM-as-judge or custom code), defining run functions that capture your agent's outputs, and actually running the evals, either locally with evaluate() or by uploading them via the CLI. The golden rule here is solid: always inspect your actual output structure before writing extraction logic, because frameworks vary wildly. One thing to watch: LLM-as-judge evaluators can't be uploaded yet, only run locally, so for dataset comparisons you'll want evaluate() with local evaluators. The examples cover both Python and TypeScript, and there's a helpful table showing how local and uploaded evaluators behave differently, which matters more than you'd think, since the expected return formats aren't the same.
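Here's a minimal sketch of how those three pieces fit together in Python, assuming a recent langsmith SDK, a LANGSMITH_API_KEY in the environment, and an existing dataset; the dataset name, target function, and evaluator below are illustrative, not part of the skill itself.

```python
# A minimal local evaluation sketch, assuming a recent langsmith Python SDK,
# LANGSMITH_API_KEY set in the environment, and an existing dataset named
# "my-agent-dataset" (the name, target, and evaluator here are illustrative).
from langsmith import evaluate
from langsmith.schemas import Example, Run


def run_agent(inputs: dict) -> dict:
    # Run function: swap this stub for a call to your real agent and return
    # its outputs as a dict. Inspect the actual output structure your
    # framework produces before writing any extraction logic.
    return {"answer": f"echo: {inputs.get('question', '')}"}


def exact_match(run: Run, example: Example) -> dict:
    # Custom code evaluator: compare the run's output to the reference
    # output stored on the dataset example.
    predicted = (run.outputs or {}).get("answer", "")
    expected = (example.outputs or {}).get("answer", "")
    return {"key": "exact_match", "score": int(predicted == expected)}


results = evaluate(
    run_agent,                       # target / run function
    data="my-agent-dataset",         # dataset of examples in LangSmith
    evaluators=[exact_match],        # local evaluators run alongside the target
    experiment_prefix="agent-eval",  # groups the results as one experiment
)
```

Since LLM-as-judge evaluators only run locally for now, they'd slot into the same evaluators list here rather than being uploaded.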
To install the skill:

```
npx skills add https://github.com/langchain-ai/langsmith-skills --skill langsmith-evaluator
```