CLAUDE CODE MARKETPLACES

Built for the Claude Code community with Claude Code by @mertduzgun

Independent project, not affiliated with Anthropic

Eval Audit

Editor's Note

This skill audits your LLM evaluation pipeline and reports what is broken. It walks through six diagnostic areas: whether you ran error analysis on actual traces, whether your judges return binary verdicts or noisy Likert scores, whether those judges are validated against human labels using true positive rate (TPR) and true negative rate (TNR), whether you rely on similarity metrics such as ROUGE as primary evals, how your human review process works, and whether you have enough labeled data. The output is a prioritized findings report with concrete next steps that link to other skills. It is most useful when you inherit an eval system you don't trust, or when your evals are running but you suspect they miss real failures. The checks are opinionated but grounded in what actually breaks in production.
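As an illustration of the third check, validating a binary judge against human labels comes down to two rates: how often the judge passes outputs that humans passed (TPR) and how often it fails outputs that humans failed (TNR). The sketch below is not taken from the skill itself; the function name `judge_agreement` and the sample labels are hypothetical and made up for the example.

```python
from typing import Sequence


def judge_agreement(human: Sequence[bool], judge: Sequence[bool]) -> dict[str, float]:
    """Compare binary judge verdicts with human labels (True = pass).

    Hypothetical helper for illustration; not part of the eval-audit skill.
    """
    if len(human) != len(judge):
        raise ValueError("human and judge must have the same length")

    # True positives: human passed and judge passed.
    tp = sum(h and j for h, j in zip(human, judge))
    # True negatives: human failed and judge failed.
    tn = sum((not h) and (not j) for h, j in zip(human, judge))

    pos = sum(human)          # outputs humans labeled as pass
    neg = len(human) - pos    # outputs humans labeled as fail
    if pos == 0 or neg == 0:
        raise ValueError("need at least one human pass and one human fail")

    return {"tpr": tp / pos, "tnr": tn / neg}


# Made-up labels for eight traces (True = pass, False = fail).
human = [True, True, True, False, False, True, False, False]
judge = [True, True, False, False, True, True, False, False]

rates = judge_agreement(human, judge)
print(rates)  # {'tpr': 0.75, 'tnr': 0.75}
```

Reporting the two rates separately, rather than a single accuracy figure, keeps a judge that passes almost everything from looking good on a dataset where most outputs are passes.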

Install

npx skills add https://github.com/hamelsmu/evals-skills --skill eval-audit
Votes: 0
Installs: 353
GitHub Stars: 1.3k
Categories: Security
First Seen: Jun 3, 2026

Related Security Skills

security-audit-scanner

tbartel74/Vigil-Code

0
8
Automated security scanning for Vigil Guard v2.0.0. Use for OWASP Top 10 checks, TruffleHog secret detection, npm/pip vulnerability scanning, 3-branch service security, heuristics-service audit, and CI/CD security pipelines.
permission-auditor

useai-pro/openclaw-skills-security

0
377
58
permission auditor
supabase-audit-auth-config

yoanbernabeu/supabase-pentest-skills

0
237
43
supabase audit auth config
supabase-audit-auth-users

yoanbernabeu/supabase-pentest-skills

0
208
43
supabase audit auth users
supabase-audit-auth-signup

yoanbernabeu/supabase-pentest-skills

0
205
43
supabase audit auth signup
supabase-audit-authenticated

yoanbernabeu/supabase-pentest-skills

0
182
43
supabase audit authenticated