A research-grade biomedical integration that wraps AlphaFold DB and eight other public data sources (MONDO, HPO, Open Targets, ClinVar, gnomAD, DisGeNET, ChEMBL, Ensembl) behind 29 MCP tools. You get variant clinical reports, disease-target landscapes, heuristic druggability scoring, and cross-species structural comparisons via topological data analysis. Everything flows through a local SQLite knowledge graph with query and export capabilities. Ships with 730 tests and 100% coverage, but this is an unfunded independent project with no scientific validation yet and zero certification for clinical or regulated use. Reach for it when you need programmatic access to protein structures and biomedical ontologies in a research context, not production healthcare.
A Model Context Protocol server — an AlphaFold MCP server — that wraps AlphaFold DB and 8 other public biomedical data sources behind a set of MCP tool calls, backed by a local SQLite knowledge graph with query and export tools (results can be persisted through its API; automatic per-invocation persistence is not yet wired).
This is an unfunded, independent open-source project. It is not a service, not certified for any regulated use, and its outputs are research aids that should be reviewed by qualified humans before any clinical or regulatory use.
This project is not affiliated with, endorsed by, or sponsored by Google DeepMind or EMBL-EBI. "AlphaFold" is a trademark of its respective owner and is used here only to describe the public data (the AlphaFold DB API) that this software consumes.
Status: v1.2.0 (Beta). Engineering-validated (730 tests, 100%
line and branch coverage). Not yet scientifically validated by
independent domain experts; not yet deployed in production. See
STATUS.md and LIMITATIONS.md.
A Python MCP server that:

- Keeps a local SQLite knowledge graph (storage/knowledge_graph.py) with query and export tools. Tool results can be persisted to it through the knowledge-graph API; automatic per-invocation persistence is not yet wired, so the store is populated only when a caller writes to it explicitly.
- Provides persistent-homology TDA through the optional [tda] extra (gudhi).

It targets mcp-spec 2025-06-18 and runs on Python 3.10–3.13.
- The ACMG/AMP criteria that generate_variant_clinical_report produces are a draft surface of the upstream evidence the server can fetch automatically. They are not a substitute for clinical-laboratory variant review.
- The tier that assess_target_druggability returns is a heuristic built from drug-precedent counts, Open Targets tractability labels, pLDDT, and gnomAD constraint. It is not a validated prediction.

For a complete, itemised list of known limitations (with module
references, impact, and planned resolution), see LIMITATIONS.md.
For the high-level posture — what is engineering-validated vs. what is
not yet scientifically validated — see STATUS.md.
pip install alphafold-sovereign-mcp
Or run it without installing using uvx:
uvx alphafold-sovereign-mcp
Every release on PyPI is built by the release.yml workflow under
OIDC Trusted Publishing, attached to a signed GitHub Release with
SLSA L3 build provenance and Sigstore (cosign) signatures, and
mirrored to a Zenodo DOI. Verify the supply chain with
scripts/replicate.sh.
git clone https://github.com/smaniches/alphafold-sovereign-mcp
cd alphafold-sovereign-mcp
uv pip install -e .
# With persistent-homology TDA (requires gudhi):
# uv pip install -e ".[tda]"
alphafold-sovereign --version # → 1.2.0
alphafold-sovereign --self-test # → PASS on the offline BRCA1 fixture
--self-test boots the server in offline mode and exercises the
deterministic logic of generate_variant_clinical_report against a
built-in BRCA1:c.5266dupC fixture. No network calls; returns exit
code 0 on PASS, non-zero on FAIL.
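The exit-code contract above makes the self-test easy to use as a gate in a setup script or CI job. The following is a minimal sketch under that assumption; the helper name run_self_test is hypothetical, and only the alphafold-sovereign --self-test command and its PASS/FAIL exit codes come from this README.

```python
import subprocess
from typing import Sequence

# Command taken from this README; the helper around it is illustrative only.
SELF_TEST_COMMAND = ("alphafold-sovereign", "--self-test")


def run_self_test(command: Sequence[str] = SELF_TEST_COMMAND) -> bool:
    """Return True when the command exits with code 0 (PASS), False otherwise."""
    try:
        completed = subprocess.run(list(command), check=False)
    except FileNotFoundError:
        # The executable is not on PATH, so treat that as a failed check.
        return False
    return completed.returncode == 0
```

A caller would typically stop its own pipeline when run_self_test() returns False.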
Add to claude_desktop_config.json:
{
  "mcpServers": {
    "alphafold-sovereign": {
      "command": "alphafold-sovereign-mcp",
      "args": []
    }
  }
}
Restart Claude Desktop and the tools become available in conversations.
See the examples/ directory for three end-to-end
illustrations of what a session looks like.
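If claude_desktop_config.json already lists other servers, the entry shown earlier has to be merged rather than pasted over the file. A minimal sketch of that merge follows; the function names are hypothetical, and the location of the config file is left to the caller because it depends on the platform.

```python
import json
from pathlib import Path
from typing import Any

# Entry copied from the configuration snippet in this README.
SERVER_NAME = "alphafold-sovereign"
SERVER_ENTRY = {"command": "alphafold-sovereign-mcp", "args": []}


def add_server_entry(config: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a Claude Desktop config with the server entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[SERVER_NAME] = dict(SERVER_ENTRY)
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Merge the entry into the JSON file at `path`, creating it if absent."""
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    updated = add_server_entry(existing)
    path.write_text(json.dumps(updated, indent=2) + "\n", encoding="utf-8")
```

Existing servers and unrelated top-level keys are preserved; only the alphafold-sovereign entry is added or overwritten.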
ALPHAFOLD_OFFLINE=1 alphafold-sovereign-mcp
Refuses all outbound HTTP. Serves only from the local SQLite cache.
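To illustrate the idea, here is a minimal sketch of an environment-variable guard that refuses outbound requests. It is not the project's implementation: OfflineModeError, is_offline, and guard_outbound are hypothetical names, and treating only the literal value "1" as "offline" is an assumption based on the command above.

```python
import os
from typing import Mapping, Optional


class OfflineModeError(RuntimeError):
    """Raised when an outbound request is attempted in offline mode."""


def is_offline(env: Optional[Mapping[str, str]] = None) -> bool:
    """True when ALPHAFOLD_OFFLINE is set to "1" in the given environment."""
    source = os.environ if env is None else env
    return source.get("ALPHAFOLD_OFFLINE") == "1"


def guard_outbound(url: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the URL unchanged when online; refuse it when offline."""
    if is_offline(env):
        raise OfflineModeError(f"offline mode: refusing outbound request to {url}")
    return url
```

Calling the guard before every HTTP request keeps the refusal in one place, so a cache lookup can be tried first and the network is never touched when the flag is set.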
The server exposes 29 MCP tools across four modules. Each tool's input schema is a Pydantic model; results are JSON.
tools/disease.py

| Tool | What it does |
|---|---|
| lookup_disease | MONDO record + hierarchy + ICD cross-references |
| search_diseases | Full-text MONDO ontology search |
| lookup_phenotype | HPO term + associated diseases |
| get_gene_phenotype_profile | HPO phenotypes + gnomAD constraint for a gene |
| get_disease_targets | Top drug targets for a MONDO disease (Open Targets) |
| get_target_diseases | Top diseases for a UniProt target (Open Targets) |
| get_common_disease_targets | Parallel profiling across curated MONDO diseases |
| triage_variant_3d | HGVS → ClinVar + gnomAD constraint (disease/structure context: pointer notes) |
| phenotype_to_structures | HPO → diseases → OT targets → UniProt IDs |
| get_orphan_disease_atlas | Orphanet → MONDO → HPO + OT targets |
| compare_disease_target_overlap | Jaccard similarity of target sets for two diseases |
| resolve_icd10_to_mondo | ICD-10 code → MONDO disease record |
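compare_disease_target_overlap is described as a Jaccard similarity of two target sets. For readers unfamiliar with the measure, a minimal sketch of the standard formula follows. The function name is hypothetical, and returning 0.0 when both sets are empty is an assumption, not documented server behaviour.

```python
from typing import Iterable


def jaccard_similarity(targets_a: Iterable[str], targets_b: Iterable[str]) -> float:
    """Size of the intersection divided by size of the union.

    Returns 0.0 when both inputs are empty (a convention chosen for this sketch).
    """
    set_a, set_b = set(targets_a), set(targets_b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


# Two of the four distinct targets are shared, so the similarity is 2 / 4.
overlap = jaccard_similarity({"EGFR", "KRAS", "TP53"}, {"KRAS", "TP53", "BRAF"})
```

A value of 1.0 means the two diseases share exactly the same target set; 0.0 means no shared targets.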
tools/precision_medicine.py

| Tool | What it does |
|---|---|
| generate_variant_clinical_report | HGVS → multi-source report + draft ACMG/AMP criteria |
| assess_target_druggability | UniProt → HOT/WARM/COLD/NOT_DRUGGABLE tier |
| synthesize_protein_dossier | UniProt → multi-source briefing |
| map_disease_drug_landscape | MONDO → approved drugs + pipeline + ChEMBL phase counts |
| classify_variant_acmg | HGVS → ACMG/AMP criteria checklist (PVS1, PM2, PP3, BP4, BP7, BS1, PP5) |
| find_drug_repurposing_candidates | MONDO → candidates ranked by OT evidence × ChEMBL phase |
The ACMG/AMP criteria produced are a draft: they reflect the upstream evidence the server can fetch automatically, and they are not a substitute for clinical-laboratory review.
tools/structure_intelligence.py

| Tool | What it does |
|---|---|
| analyze_structural_confidence | pLDDT distribution + PAE-derived domain map |
| compute_topology_fingerprint | 64-dim TDA fingerprint (Betti numbers β₀ β₁ β₂) |
| compare_proteins_topologically | Pairwise L2 fingerprint-distance matrix for 2–10 proteins |
| find_evolutionary_structural_shifts | Cross-species structural divergence (TDA + Ensembl orthologs) |
| score_binding_pocket_geometry | Geometric pocket detection + heuristic druggability index |
| detect_intrinsically_disordered | IDR map (linkers, tails, long IDRs) |
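compare_proteins_topologically reports a pairwise L2 distance matrix over fingerprints. A minimal sketch of that computation is shown here on toy two-dimensional vectors; the real fingerprints are 64-dimensional, and the function name, the protein labels, and the nested-dict output shape are all hypothetical.

```python
import math
from typing import Mapping, Sequence


def pairwise_l2(fingerprints: Mapping[str, Sequence[float]]) -> dict[str, dict[str, float]]:
    """Euclidean (L2) distance between every pair of fingerprint vectors."""
    names = list(fingerprints)
    return {
        a: {b: math.dist(fingerprints[a], fingerprints[b]) for b in names}
        for a in names
    }


# Toy vectors for illustration only; these are not real TDA fingerprints.
toy = {"protein_a": [0.0, 0.0], "protein_b": [3.0, 4.0]}
matrix = pairwise_l2(toy)
```

The matrix is symmetric with zeros on the diagonal, and math.dist raises ValueError if two vectors differ in length.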
tools/knowledge_graph_tools.py

| Tool | What it does |
|---|---|
| query_variant_database | Search locally stored variant triage results |
| query_protein_database | Search locally stored protein assessments |
| get_knowledge_graph_stats | Database size, entity counts, last activity |
| export_research_dataset | Export tables to JSON for pandas/ML pipelines |
| find_drug_gene_network | Traverse the accumulated drug–gene–disease graph |
For three documented end-to-end illustrations of a Claude Desktop
session against this server — variant triage on BRCA1 c.5266dupC,
target characterisation on EGFR, and a drug-discovery walk-through
on Imatinib → BCR-ABL → CML — see the examples/
directory. Each example includes the user prompt, the tool calls
the model issues, the server's response shape, and the model's
paraphrased reply.
generate_variant_clinical_report(hgvs="BRCA1:c.181T>G")
The server resolves the HGVS, fetches ClinVar, gnomAD, AlphaMissense (via AlphaFold DB), Open Targets disease evidence, ChEMBL drug data, and Ensembl VEP consequence annotations, and returns a single JSON record with the cross-referenced fields plus the ACMG/AMP criteria that the available evidence supports.
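On the wire, an MCP client issues such a call as a JSON-RPC 2.0 tools/call request. The sketch below builds that message; the helper name is hypothetical, and passing hgvs as a flat top-level argument is an assumption, since the exact argument shape is defined by the tool's input schema, which this README does not reproduce.

```python
import json
from typing import Any


def build_tool_call(name: str, arguments: dict[str, Any], request_id: int = 1) -> str:
    """Serialise an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


# Assumed argument shape: a single flat "hgvs" field, as in the call above.
request = build_tool_call(
    "generate_variant_clinical_report",
    {"hgvs": "BRCA1:c.181T>G"},
)
```

In practice an MCP client library or Claude Desktop constructs this message; the sketch only makes the request shape visible.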
find_drug_repurposing_candidates(disease_mondo_id="MONDO:0007739")
Returns drugs whose Open Targets evidence connects them to the disease, ranked by a composite of OT evidence score × the maximum ChEMBL clinical phase reached against the target.
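The ranking described above can be sketched as a plain product followed by a descending sort. This is an illustration under stated assumptions: the Candidate class, its field names, and the placeholder drug labels are hypothetical, and any normalisation or tie-breaking the server applies is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    drug: str                  # placeholder label, not a real compound
    ot_evidence_score: float   # Open Targets evidence score
    max_chembl_phase: int      # highest ChEMBL clinical phase reached


def composite_score(candidate: Candidate) -> float:
    """OT evidence score multiplied by the maximum clinical phase."""
    return candidate.ot_evidence_score * candidate.max_chembl_phase


def rank_candidates(candidates: list[Candidate]) -> list[Candidate]:
    """Highest composite score first."""
    return sorted(candidates, key=composite_score, reverse=True)


ranked = rank_candidates([
    Candidate("drug_c", 0.3, 3),
    Candidate("drug_a", 0.5, 4),
    Candidate("drug_b", 0.9, 2),
])
```

Note how drug_a outranks drug_b despite weaker evidence, because the phase term multiplies rather than adds.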
find_evolutionary_structural_shifts(
gene_symbol="ACE2",
target_species=["mus_musculus", "rhinolophus_ferrumequinum"]
)
For each species: fetches the ortholog (Ensembl), the AlphaFold structure, computes the TDA fingerprint, and returns the L2 fingerprint distance from the human structure along with sequence identity.
| Source | What we use | License |
|---|---|---|
| AlphaFold DB v6 (EBI/DeepMind) | Structures, pLDDT, PAE, AlphaMissense | CC BY 4.0 |
| MONDO (OLS4) | Disease ontology, ICD cross-refs | CC BY 4.0 |
| HPO (JAX) | Phenotype terms, gene-disease links | HPO license (free for all use) |
| Open Targets | Disease–target evidence | CC0 1.0 (data) |
| ClinVar (NCBI) | Variant pathogenicity | Public domain |
| gnomAD v4 | Population allele frequencies | CC0 1.0 |
| DisGeNET | Gene–disease association scores | Free academic tier / commercial (MedBioinformatics) |
| ChEMBL v37 (EMBL-EBI) | Drug bioactivity, MoA, ADMET | CC BY-SA 3.0 |
| Ensembl (EMBL-EBI) | VEP, orthologs, gene lookup | No restrictions (data); Apache 2.0 (code) |
UniProt accessions are used throughout as protein identifiers — they key AlphaFold structures and Open Targets cross-references — but the UniProt API itself is not queried as a data source. Domain (InterPro), Gene Ontology, experimental-structure (RCSB PDB), and tissue-expression (Human Protein Atlas) lookups are not integrated in this release.
See NOTICE for full attributions.
clients/_base.py
├── Air-gap enforcement (refuses sockets when ALPHAFOLD_OFFLINE=1)
├── Token-bucket rate limiting (aiolimiter)
├── Exponential backoff with jitter (tenacity)
├── Circuit breaker (CLOSED / OPEN / HALF_OPEN)
└── Content-addressed SHA-256 dedup of upstream responses
storage/knowledge_graph.py
├── SQLite WAL mode (embedded, ACID)
├── 6 entity tables: proteins, variants, diseases, drugs, genes, phenotypes
├── 4 relationship tables: protein_disease, protein_drug, variant_disease, gene_phenotype
├── tool_invocations audit table (SHA-256 of input + output, timestamps)
└── Analytical views: variant_summary, drug_landscape
domain/disease.py
└── Pure Python frozen dataclasses (PathogenicityClass, VariantReport, ...)
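The tool_invocations audit table listed under storage/knowledge_graph.py stores SHA-256 digests of each tool's input and output. A minimal sketch of how such a row could be written follows. The column names, the canonical-JSON hashing scheme, and the helper names are assumptions for illustration, not the project's actual schema.

```python
import hashlib
import json
import sqlite3
from datetime import datetime, timezone
from typing import Any


def canonical_sha256(payload: Any) -> str:
    """SHA-256 hex digest of the payload serialised as key-sorted, compact JSON."""
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def open_audit_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a SQLite database and create a hypothetical audit table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tool_invocations ("
        "tool TEXT, input_sha256 TEXT, output_sha256 TEXT, created_at TEXT)"
    )
    return conn


def record_invocation(conn: sqlite3.Connection, tool: str,
                      tool_input: Any, tool_output: Any) -> None:
    """Insert one audit row holding digests rather than the payloads themselves."""
    conn.execute(
        "INSERT INTO tool_invocations VALUES (?, ?, ?, ?)",
        (tool, canonical_sha256(tool_input), canonical_sha256(tool_output),
         datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()
```

Sorting keys before hashing means two logically identical payloads produce the same digest regardless of key order.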
See ARCHITECTURE.md for the full module map.
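For readers unfamiliar with the CLOSED / OPEN / HALF_OPEN circuit-breaker pattern listed under clients/_base.py, here is a minimal generic sketch. It is not the project's implementation: the class shape, the default threshold, and the cooldown value are assumptions.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """CLOSED until repeated failures, then OPEN, then HALF_OPEN after a cooldown."""

    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "CLOSED"
        if self._clock() - self._opened_at >= self.reset_timeout:
            return "HALF_OPEN"  # cooldown elapsed: a probe request is allowed
        return "OPEN"

    def allow_request(self) -> bool:
        """Requests are refused only while the breaker is OPEN."""
        return self.state != "OPEN"

    def record_success(self) -> None:
        """Any success closes the breaker and clears the failure count."""
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        if self.state == "HALF_OPEN":
            # The probe failed: reopen and restart the cooldown.
            self._opened_at = self._clock()
            return
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()
```

The injectable clock makes the state transitions testable without sleeping; a caller checks allow_request() before each upstream call and reports the outcome afterwards.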
- Tests: 730, as counted by pytest --collect-only.
- Coverage (src/alphafold_sovereign/clients, domain, storage, server, tools): 100% line + branch, every shipped module at 100%.
- Linting: ruff (full ruleset). Type checking: mypy --strict on the domain, clients, and storage subtrees.
- Security scanning: bandit plus CodeQL security-extended.
- Supply-chain verification: scripts/replicate.sh.

The full CI matrix (Python 3.10, 3.11, 3.12, 3.13 × Ubuntu, macOS)
runs on every push. Test counts and coverage percentages above are
the numbers a git clone && uv run pytest produces on the current
HEAD; if you find a divergence, please open an issue.
DCO sign-off required (git commit -s). No copyright assignment.
Coverage gate: CI enforces 100% line and branch coverage on the shipped surface (nox -s cov).
Full guide: CONTRIBUTING.md.
- uniprot-mcp: Model Context Protocol server for UniProt Swiss-Prot and TrEMBL (pip install uniprot-mcp-server).
- semantic-scholar-mcp: Semantic Scholar MCP server, 200M+ academic papers (pip install s2-mcp-server).

Machine-readable metadata: CITATION.cff (GitHub
renders a "Cite this repository" button in the sidebar that consumes
this file).
@software{maniches_alphafold_sovereign_mcp,
  author  = {Maniches, Santiago},
  title   = {AlphaFold Sovereign MCP},
  year    = {2026},
  version = {1.2.0},
  url     = {https://github.com/smaniches/alphafold-sovereign-mcp},
  license = {Apache-2.0},
  orcid   = {0009-0005-6480-1987},
  doi     = {10.5281/zenodo.20134773}
}
When citing results derived from this software, please also cite the upstream data sources (AlphaFold DB, Open Targets, ChEMBL, Ensembl, ClinVar, gnomAD, MONDO, HPO, DisGeNET) according to their own citation requirements.
Copyright 2024–2026 Santiago Maniches.
Licensed under the Apache License, Version 2.0. See LICENSE.
Patent reservation: see PATENTS.md.
Trademark policy: see TRADEMARKS.md.