Benchmark E2e

187 installs183 stars

Summary

This is an overnight CI pipeline for the vercel-plugin project. It spins up nine realistic projects (recipe platform, trivia game, code review bot), exercises skill injection through claude --print, verifies dev servers actually boot, then diffs which skills were expected versus which actually got injected. The output is a machine-readable report with copy-pasteable YAML fixes for missing patterns. Run it in a loop with sleep 3600 between iterations and you get a self-improvement cycle that shows exactly which gaps closed since the last run. Honestly feels like internal tooling that escaped into the wild, but if you're building a Claude plugin with similar injection logic, the contracts (run-manifest.json, events.jsonl) are worth stealing.

Install to Claude Code

npx -y skills add vercel-labs/vercel-plugin --skill benchmark-e2e --agent claude-code

Installs into .claude/skills of the current project.

CodeRabbit

AI writes the code. CodeRabbit catches the slop.

Try For Free →

AppSignal

Monitor with ease. Code with confidence.

Start Free Trial →

AI notepad for back-to-back meetings

Notes, actions and memory. Without a meeting bot. First month 100% off.

Download for free →

Keep your Mac awake

Keep your Mac awake while Claude Code and 40+ AI agents run. Sleeps when they're idle.

One time payment $9 →

Email for Agents: Free tier available

Give your AI agent a complete email layer—sending, inbound inboxes, and sandbox testing.

Get 4K emails/month free →

Context.dev

Integrate web data into your AI product. One API to scrape website & brand data.

Get API Key Now →

CodeScene MCP Server

Your agent targets a perfect 10 Code Health score. Deterministic. Every commit.

Try For Free →

Make your agent a DeFi expert

Agent, run crypto. Access onchain data & trade routes via 1inch.

Install now →

CodeRabbit

AI writes the code. CodeRabbit catches the slop.

Try For Free →

AppSignal

Monitor with ease. Code with confidence.

Start Free Trial →

AI notepad for back-to-back meetings

Notes, actions and memory. Without a meeting bot. First month 100% off.

Download for free →

Keep your Mac awake

Keep your Mac awake while Claude Code and 40+ AI agents run. Sleeps when they're idle.

One time payment $9 →

Email for Agents: Free tier available

Give your AI agent a complete email layer—sending, inbound inboxes, and sandbox testing.

Get 4K emails/month free →

Context.dev

Integrate web data into your AI product. One API to scrape website & brand data.

Get API Key Now →

CodeScene MCP Server

Your agent targets a perfect 10 Code Health score. Deterministic. Every commit.

Try For Free →

Make your agent a DeFi expert

Agent, run crypto. Access onchain data & trade routes via 1inch.

Install now →

Files

SKILL.md

Select a file.

Featured

CodeRabbit

AI writes the code. CodeRabbit catches the slop.

Try For Free →

AppSignal

Monitor with ease. Code with confidence.

Start Free Trial →

AI notepad for back-to-back meetings

Notes, actions and memory. Without a meeting bot. First month 100% off.

Download for free →

Keep your Mac awake

Keep your Mac awake while Claude Code and 40+ AI agents run. Sleeps when they're idle.

One time payment $9 →

Email for Agents: Free tier available

Give your AI agent a complete email layer—sending, inbound inboxes, and sandbox testing.

Get 4K emails/month free →

Context.dev

Integrate web data into your AI product. One API to scrape website & brand data.

Get API Key Now →

CodeScene MCP Server

Your agent targets a perfect 10 Code Health score. Deterministic. Every commit.

Try For Free →

Make your agent a DeFi expert

Agent, run crypto. Access onchain data & trade routes via 1inch.

Install now →

Benchmark E2e

Install to Claude Code

Benchmark E2e

Install to Claude Code

Recommended

Recommended