This skill handles the evaluation workflow for Google's Agent Development Kit (ADK), covering everything from writing evalsets to debugging why your agent's scores tanked. The real value is in the eval-fix loop guidance: it walks you through the 5-10+ iteration cycle you'll actually go through, with a useful table calling out shortcuts that waste time (like lowering thresholds instead of fixing your agent). You get concrete metric-selection advice (tool_trajectory_avg_score for CI/CD, final_response_match_v2 for semantic checks), schema examples for both evalsets and config files, and separate references for user simulation, multimodal inputs, and built-in tools. Use this when you need to systematically improve agent quality rather than just eyeballing outputs.
npx skills add https://github.com/google/agents-cli --skill google-agents-cli-eval
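To make the config-file side concrete, here is a minimal sketch of what an eval criteria file might look like. The metric names match those called out above; the threshold values, the file name, and the exact shape your ADK version expects (plain floats vs. nested criterion objects) are assumptions to check against the skill's own schema examples:

```
{
  "criteria": {
    "tool_trajectory_avg_score": 1.0,
    "final_response_match_v2": 0.8
  }
}
```

A file like this is typically passed to the eval runner, e.g. something along the lines of `adk eval path/to/agent path/to/evalset.json --config_file_path=test_config.json`; run `adk eval --help` to confirm the flags for your installed version.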