Connects Claude to PMAT's codebase analysis engine through 19 MCP tools for technical debt assessment, mutation testing, and AI context generation. Exposes operations to grade code quality using six orthogonal metrics (A+ through F scale), run semantic search across 20+ languages with complexity annotations, and perform git history RAG with commit fusion. You'd reach for this when doing code reviews, refactoring sessions, or generating comprehensive codebase summaries for AI assistants. Integrates directly with Claude Desktop, Cline, and other MCP-compatible tools to surface repository health scores, compliance checks, and quality gate enforcement without leaving your AI workflow.
Zero-configuration AI context generation for any codebase
PMAT (Pragmatic Multi-language Agent Toolkit) provides everything needed to analyze code quality and generate AI-ready context:
.pmat-gates.toml config

Part of the PAIML Stack, following Toyota Way quality principles (Jidoka, Genchi Genbutsu, Kaizen).
pmat query "cache invalidation" --churn --duplicates --entropy --faults
Every result includes TDG grade, Big-O complexity, git churn, code clones, pattern diversity, fault annotations, call graph, and syntax-highlighted source.
# Install from crates.io
cargo install pmat
# Or from source (latest)
git clone https://github.com/paiml/paiml-mcp-agent-toolkit
cd paiml-mcp-agent-toolkit && cargo install --path .
# Generate AI-ready context
pmat context --output context.md --format llm-optimized
# Analyze code complexity
pmat analyze complexity
# Grade technical debt (A+ through F)
pmat analyze tdg
# Score repository health
pmat repo-score .
# Pre-flight verify before committing (CI-faithful: fmt + complexity + satd + clippy + tests)
pmat verify --format json
# Run mutation testing
pmat mutate --target src/
# Start MCP server (stdio) for Claude Code, Cline, etc.
MCP_VERSION=2024-11-05 pmat
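An MCP client opens a stdio session by sending a JSON-RPC `initialize` request as its first message. The sketch below builds that request for protocol version 2024-11-05. The client name and version are hypothetical placeholders. Piping the request into pmat assumes the server reads newline-delimited JSON-RPC on stdin, which the MCP stdio transport specifies but this README does not state.

```shell
# Build a JSON-RPC 2.0 "initialize" request for the given MCP protocol version.
# "example-client" and "0.0.1" are hypothetical placeholder values.
mcp_initialize_request() {
  local protocol_version="$1"
  printf '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"%s","capabilities":{},"clientInfo":{"name":"example-client","version":"0.0.1"}}}\n' "$protocol_version"
}

# Usage (assumes pmat is installed and on PATH):
#   mcp_initialize_request 2024-11-05 | MCP_VERSION=2024-11-05 pmat
```

In normal use an MCP-compatible client such as Claude Code or Cline performs this handshake itself; sending it by hand is mainly useful as a smoke test that the server starts.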
pmat verify

pmat verify runs the exact gate set CI enforces (format, complexity, satd, clippy, tests), fail-fast and with machine-readable output, so an agent gets "green here ⇒ green in CI" before committing. The canonical loop: edit → pmat verify --format json → fix on red → commit on green. See docs/agent-instructions/autonomous-verify-loop.md.
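The verify-then-commit loop can be sketched as a small shell gate. This is a minimal illustration, assuming `pmat verify` exits non-zero when any gate is red (the exit-code contract is not stated here). `verify_gate` and `VERIFY_REPORT` are hypothetical names introduced for the example.

```shell
# Run a verification command, save its stdout as a report, and
# succeed only when the command itself exits with status 0.
# Assumption: a red gate is signalled by a non-zero exit status.
verify_gate() {
  local report="${VERIFY_REPORT:-verify-report.json}"
  if "$@" > "$report"; then
    echo "green"
  else
    echo "red (details in $report)" >&2
    return 1
  fi
}

# Usage (assumes pmat and git are installed):
#   verify_gate pmat verify --format json && git commit -am "message"
```

Passing the command as arguments keeps the gate independent of pmat, so the same function can wrap any check that reports failure through its exit status.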
PMAT releases are dogfooded with ultracode — Claude Code's multi-agent dynamic-workflow orchestration — as both the test harness and the target workload:
Findings from each sweep are adversarially re-verified by skeptic agents before they drive fixes — see the release case studies in the pmat book.
Generate comprehensive context for AI assistants:
pmat context # Basic analysis
pmat context --format llm-optimized # AI-optimized output
pmat context --include-tests # Include test files
Six orthogonal metrics for accurate quality assessment:
pmat analyze tdg # Project-wide grade
pmat analyze tdg --include-components # Per-component breakdown
pmat tdg baseline create # Create quality baseline
pmat tdg check-regression # Detect quality degradation
Grading Scale:
Validate test suite effectiveness:
pmat mutate --target src/lib.rs # Single file
pmat mutate --target src/ --threshold 85 # Quality gate
pmat mutate --failures-only # CI optimization
Supported Languages: Rust, Python, TypeScript, JavaScript, Go, C/C++, C#, Lua, Lean, Java, Kotlin, Ruby, Swift, PHP, Bash, SQL, Scala, YAML, Markdown + MLOps model formats (GGUF, SafeTensors, APR)
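A mutation-score threshold such as `--threshold 85` compares the share of killed mutants against a minimum. The sketch below shows that arithmetic with integer math. It is an illustration only: how PMAT counts timed-out or unviable mutants is not stated here, and both function names are hypothetical.

```shell
# mutation_score KILLED TOTAL -> integer percentage of killed mutants (rounded down)
mutation_score() {
  local killed="$1" total="$2"
  [ "$total" -gt 0 ] || return 2   # avoid division by zero
  echo $(( killed * 100 / total ))
}

# passes_threshold KILLED TOTAL THRESHOLD -> exit 0 when score >= THRESHOLD
passes_threshold() {
  local score
  score=$(mutation_score "$1" "$2") || return 2
  [ "$score" -ge "$3" ]
}

mutation_score 170 200                                   # prints 85
passes_threshold 159 200 80 || echo "below threshold"    # 159/200 rounds down to 79
```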
Evidence-based quality metrics (0-289 scale, 11 categories):
pmat rust-project-score # Fast mode (~3 min)
pmat rust-project-score --full # Comprehensive (~10-15 min)
pmat repo-score . --deep # Full git history
Pre-configured AI prompts enforcing EXTREME TDD:
pmat prompt --list # Available prompts
pmat prompt code-coverage # 85%+ coverage enforcement
pmat prompt debug # Five Whys analysis
pmat prompt quality-enforcement # All quality gates
Search git history by intent using TF-IDF semantic embeddings:
# Fuse git history into code search
pmat query "fix memory leak" -G
# Search with churn, clones, entropy, faults
pmat query "error handling" --churn --duplicates --entropy --faults
# Run the example
cargo run --example git_history_demo
Automatic quality enforcement:
pmat hooks install # Install pre-commit hooks
pmat hooks install --tdg-enforcement # With TDG quality gates
pmat hooks status # Check hook status
pmat comply

30+ automated checks across code quality, best practices, and governance:
pmat comply check # Run all compliance checks
pmat comply check --strict # Exit non-zero on failure
pmat comply check --format json # Machine-readable output
pmat comply migrate # Update to latest version
Key Checks:
Provable-Contracts Enforcement (CB-1200..1210):
- binding.yaml functions exist in src/, detects ghost bindings (L0-L3 enforcement levels)
- tests/contract_traits.rs for compiler-verified trait impls (13 kernel traits)

Configure via .pmat.yaml:
comply:
  thresholds:
    min_tdg_grade: "A"
    pv_lint_is_error: true # CB-1201: FAIL on pv lint failure
    min_binding_existence: 95 # CB-1208: 95% binding verification
    require_all_traits: true # CB-1209: 13/13 traits required
    min_kani_coverage: 20 # CB-1206: minimum Kani proof %
pmat infra-score

CI/CD quality scoring (0-100 + 10 bonus for provable-contracts):
pmat infra-score # Text output
pmat infra-score --format json # Machine-readable
pmat infra-score -v --failures-only # Show only failing checks
Categories: Workflow Architecture (25pts), Build Reliability (25pts), Quality Pipeline (20pts), Deployment & Release (15pts), Supply Chain (15pts), Provable Contracts bonus (10pts).
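As a quick check on the category weights, the five base categories sum to the 100-point scale, and the provable-contracts bonus raises the ceiling to 110. The variable names below are hypothetical; only the point values come from the list above.

```shell
# Category maxima taken from the list above.
workflow=25 build=25 quality=20 deploy=15 supply=15 bonus=10

base_max=$(( workflow + build + quality + deploy + supply ))
echo "base maximum: $base_max"                        # prints: base maximum: 100
echo "maximum with bonus: $(( base_max + bonus ))"    # prints: maximum with bonus: 110
```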
pmat query --docs

Search documentation files (Markdown, text, YAML) alongside code:
pmat query "authentication" --docs # Code + docs results
pmat query "deployment" --docs-only # Only documentation
pmat query "API endpoints" --no-docs # Exclude docs (default)
pmat kaizen

Toyota Way continuous improvement (scan, auto-fix, commit):
pmat kaizen --dry-run # Scan only (no changes)
pmat kaizen # Apply safe auto-fixes
pmat kaizen --commit --push # Fix, commit, and push
pmat kaizen --format json -o report.json # CI/CD integration
# Cross-stack mode: scan all batuta stack crates in one invocation
pmat kaizen --cross-stack --dry-run # Scan all crates
pmat kaizen --cross-stack --commit # Fix and commit per-crate
pmat kaizen --cross-stack -f json # Grouped JSON report
pmat extract

Extract function boundaries with metadata:
pmat extract src/lib.rs # Extract functions from file
pmat extract --list src/ # List all functions with imports and visibility
# For Claude Code
pmat context --output context.md --format llm-optimized
# With semantic search
pmat embed sync ./src
pmat semantic search "error handling patterns"
# Add to your CI pipeline
steps:
- uses: actions/checkout@v4
- run: cargo install pmat
- run: pmat analyze tdg --fail-on-violation --min-grade B
- run: pmat mutate --target src/ --threshold 80
# 1. Create baseline
pmat tdg baseline create --output .pmat/baseline.json
# 2. Check for regressions
pmat tdg check-regression \
--baseline .pmat/baseline.json \
--max-score-drop 5.0 \
--fail-on-regression
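The idea behind `--max-score-drop` can be illustrated with a small comparison: a run counts as a regression when the baseline score minus the current score exceeds the allowed drop. This is a sketch of the concept, not PMAT's implementation. The function name is hypothetical, and treating a drop exactly equal to the limit as acceptable is an assumption.

```shell
# score_regressed BASELINE CURRENT MAX_DROP
# Exit 0 when (BASELINE - CURRENT) is strictly greater than MAX_DROP.
# awk is used because shell arithmetic has no floating point.
score_regressed() {
  awk -v b="$1" -v c="$2" -v m="$3" 'BEGIN { exit !((b - c) > m) }'
}

score_regressed 92.0 85.5 5.0 && echo "regression"   # drop of 6.5 exceeds 5.0
score_regressed 92.0 88.0 5.0 || echo "within limit" # drop of 4.0 is allowed
```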
pmat/
├── src/
│ ├── cli/ Command handlers and dispatchers
│ ├── services/ Analysis engines (TDG, SATD, complexity, agent context)
│ ├── mcp_server/ MCP protocol server
│ ├── mcp_pmcp/ PMCP protocol integration
│ └── models/ Configuration and data models
├── examples/ 89 runnable examples
└── docs/
└── specifications/ Technical specs
| Metric | Value |
|---|---|
| Tests | 21,200+ passing |
| Coverage | 99.66% |
| Mutation Score | >80% |
| Languages | 20 supported + MLOps model formats |
| MCP Tools | 20 available |
Per Popper's demarcation criterion, all claims are measurable and testable:
| Commitment | Threshold | Verification Method |
|---|---|---|
| Context Generation | < 5 seconds for 10K LOC project | time pmat context on test corpus |
| Memory Usage | < 500 MB for 100K LOC analysis | Measured via heaptrack in CI |
| Test Coverage | ≥ 85% line coverage | cargo llvm-cov (CI enforced) |
| Mutation Score | ≥ 80% killed mutants | pmat mutate --threshold 80 |
| Build Time | < 3 minutes incremental | cargo build --timings |
| CI Pipeline | < 15 minutes total | GitHub Actions workflow timing |
| Binary Size | < 50 MB release binary | ls -lh target/release/pmat |
| Language Parsers | All 20 languages parse without panic | Fuzz testing in CI |
How to Verify:
# Run self-assessment with Popper Falsifiability Score
pmat popper-score --verbose
# Individual commitment verification
cargo llvm-cov --html # Coverage ≥85%
pmat mutate --threshold 80 # Mutation ≥80%
cargo build --timings # Build time <3min
Failure = Regression: Any commitment violation blocks CI merge.
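The binary-size commitment in the table has no command in the verification list, so here is a hedged sketch of how such a check could be scripted. It assumes "50 MB" means 50 × 1024 × 1024 bytes. `size_under_mb` is a hypothetical helper, not part of PMAT.

```shell
# size_under_mb FILE LIMIT_MB
# Exit 0 when FILE is smaller than LIMIT_MB * 1024 * 1024 bytes,
# exit 1 when it is not, and exit 2 when FILE does not exist.
size_under_mb() {
  [ -f "$1" ] || return 2
  local bytes
  bytes=$(( $(wc -c < "$1") ))   # arithmetic expansion strips any padding from wc
  [ "$bytes" -lt $(( $2 * 1024 * 1024 )) ]
}

# Usage after a release build:
#   size_under_mb target/release/pmat 50 || echo "binary exceeds 50 MB"
```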
All benchmarks use Criterion.rs with proper statistical methodology:
| Operation | Mean | 95% CI | Std Dev | Sample Size |
|---|---|---|---|---|
| Context (1K LOC) | 127ms | [124, 130] | ±12.3ms | n=1000 runs |
| Context (10K LOC) | 1.84s | [1.79, 1.90] | ±156ms | n=500 runs |
| TDG Scoring | 156ms | [148, 164] | ±18.2ms | n=500 runs |
| Complexity Analysis | 23ms | [22, 24] | ±3.1ms | n=1000 runs |
Comparison Baselines (vs. Alternatives):
| Metric | PMAT | ctags | tree-sitter | Effect Size |
|---|---|---|---|---|
| 10K LOC parsing | 1.84s | 0.3s | 0.8s | d=0.72 (medium) |
| Memory (10K LOC) | 287MB | 45MB | 120MB | - |
| Semantic depth | Full | Syntax only | AST only | - |
See docs/BENCHMARKS.md for complete statistical analysis.
PMAT uses ML for semantic search and embeddings. All ML operations are reproducible:
Random Seed Management:
Model Artifacts:
PMAT does not train models but uses these data sources for evaluation:
| Dataset | Source | Purpose | Size |
|---|---|---|---|
| CodeSearchNet | GitHub/Microsoft | Semantic search benchmarks | 2M functions |
| PMAT-bench | Internal | Regression testing | 500 queries |
Data provenance and licensing documented in docs/ml/REPRODUCIBILITY.md.
PMAT is built on the PAIML Sovereign Stack - pure-Rust, SIMD-accelerated libraries:
| Library | Purpose | Version |
|---|---|---|
| aprender | ML library (text similarity, clustering, topic modeling) | 0.41 |
| aprender-graph | CSR graph database (PageRank, Louvain) | 0.41 |
| aprender-db | Columnar analytics database (lib trueno_db) | 0.41 |
| aprender-rag | RAG pipeline with VectorStore | 0.41 |
| aprender-viz | Terminal graph visualization | 0.41 |
| aprender-compute | SIMD/GPU compute for matrix operations (lib trueno) | 0.41 |
| aprender-zram-core | SIMD LZ4/ZSTD compression (optional) | 0.41 |
| aprender-contracts | Provable contracts (with aprender-contracts-macros) | 0.49 |
| pmcp | MCP protocol SDK | 2.9 |
| pmat | Code analysis toolkit | 3.19.2 |
Key Benefits:
See CONTRIBUTING.md for development setup, testing, and pull request guidelines.
MIT License - see LICENSE for details.