This skill wraps the `ax` CLI to help you create, run, and analyze model evaluation experiments in Arize. It handles the full workflow: export a dataset, process examples through your model, collect outputs and evaluations, then push the runs back as a named experiment. The commands cover listing experiments, exporting results (with automatic escalation to Arrow Flight for bulk transfers over 500 runs), and creating new experiments from JSON, CSV, or Parquet files. It is useful when you're A/B testing models, measuring accuracy across prompts, or benchmarking different configurations. The skill enforces a strict rule: it will never fabricate model outputs or scores, so running an experiment requires real API access to the model being evaluated.
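The middle of that workflow (process examples through your model, collect outputs, never invent a result) can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the skill's actual implementation: the `id`, `input`, `example_id`, and `output` field names and the `build_runs` and `write_runs` helpers are hypothetical, and no `ax` commands or flags are shown because the description above does not spell them out.

```python
import json
from pathlib import Path
from typing import Callable, Optional


def build_runs(examples: list[dict], model: Callable[[str], Optional[str]]) -> list[dict]:
    """Run every exported example through a real model call and collect the outputs.

    Field names ("id", "input", "example_id", "output") are assumptions made for
    this sketch, not a documented Arize schema.
    """
    runs = []
    for example in examples:
        # Any API error from the model call propagates; nothing is substituted.
        output = model(example["input"])
        if output is None:
            # Refuse to fill in a placeholder: a missing output is a failed run.
            raise ValueError(f"no model output for example {example['id']!r}")
        runs.append({"example_id": example["id"], "output": output})
    return runs


def write_runs(runs: list[dict], path: Path) -> None:
    """Save the collected runs as JSON, one of the file formats named above."""
    path.write_text(json.dumps(runs, indent=2), encoding="utf-8")
```

In practice `model` would be a thin wrapper around your real provider client. Raising on a missing output, rather than writing a stand-in value, mirrors the no-fabrication rule: a failed call stops the experiment instead of producing a score for an answer the model never gave.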
npx -y skills add github/awesome-copilot --skill arize-experiment --agent claude-code
Installs into .claude/skills of the current project.
github/awesome-copilot