Unified search and fetch across the scientific data repositories researchers actually use. A single search query fans out to Zenodo, DataCite members such as Figshare and Dryad, NCBI omics databases (GEO, SRA, BioProject), PubMed, DataONE, OmicsDI, and HuggingFace datasets, then deduplicates the results by DOI and normalizes them into a single model. The resolve tool pulls full file manifests with checksums, and the fetch tool streams files to disk with MD5 verification. The server also expands taxonomy synonyms through NCBI so that renamed species don't break queries, bridges papers to their underlying datasets via eLink and ScholeXplorer, and surfaces trust signals such as citation counts and version freshness. The optional operate tool lets you preview schemas and run read-only SQL against remote Parquet files without downloading them first.
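To make the DOI deduplication step concrete, here is a minimal sketch of how results from several repositories could be merged. It is not taken from this project's code: the `Record` class, its field names, and the `normalize_doi` and `dedupe_by_doi` helpers are all hypothetical, and the DOIs in the usage note are placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Prefixes commonly found in front of a bare DOI (assumed list, not exhaustive).
_DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


@dataclass
class Record:
    """Hypothetical normalized search hit; field names are illustrative only."""
    title: str
    source: str
    doi: Optional[str] = None
    also_found_in: List[str] = field(default_factory=list)


def normalize_doi(doi: str) -> str:
    """Lowercase a DOI and strip a leading resolver URL or 'doi:' scheme."""
    cleaned = doi.strip().lower()
    for prefix in _DOI_PREFIXES:
        if cleaned.startswith(prefix):
            return cleaned[len(prefix):]
    return cleaned


def dedupe_by_doi(records: List[Record]) -> List[Record]:
    """Keep the first record per DOI; note other sources that returned it.

    Records without a DOI cannot be matched, so they are kept as-is.
    """
    first_seen: Dict[str, Record] = {}
    unique: List[Record] = []
    for rec in records:
        if rec.doi is None:
            unique.append(rec)
            continue
        key = normalize_doi(rec.doi)
        if key in first_seen:
            first_seen[key].also_found_in.append(rec.source)
        else:
            first_seen[key] = rec
            unique.append(rec)
    return unique
```

For example, passing one hit with DOI `10.1234/example` from one source and another with `https://doi.org/10.1234/EXAMPLE` from a second source yields a single record whose `also_found_in` list names the second source. DOIs are compared case-insensitively here because DOI names are case-insensitive.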
claude mcp add --transport stdio musharna-data-aggregator-mcp uvx data-aggregator-mcp
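The description above mentions streaming files to disk with MD5 verification. As a rough illustration of that idea, and not the tool's actual implementation, the sketch below hashes a download chunk by chunk while writing it, then compares the digest with the checksum from a manifest. The function name `stream_to_disk`, its parameters, and the `.part` temporary-file convention are assumptions made for this example.

```python
import hashlib
import os
from typing import BinaryIO


def stream_to_disk(
    src: BinaryIO,
    dest_path: str,
    expected_md5: str,
    chunk_size: int = 1 << 20,
) -> int:
    """Copy src to dest_path in chunks, verifying the MD5 digest on the fly.

    Data is written to a temporary '.part' file first and only moved into
    place if the checksum matches. Returns the number of bytes written.
    """
    digest = hashlib.md5()
    written = 0
    tmp_path = dest_path + ".part"
    with open(tmp_path, "wb") as out:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)  # hash exactly the bytes that hit the disk
            out.write(chunk)
            written += len(chunk)

    actual = digest.hexdigest()
    if actual != expected_md5.strip().lower():
        os.remove(tmp_path)  # never leave a corrupt file behind
        raise ValueError(
            f"MD5 mismatch for {dest_path}: expected {expected_md5}, got {actual}"
        )

    os.replace(tmp_path, dest_path)  # atomic rename on the same filesystem
    return written
```

Any binary file-like object works as `src`, for instance an open HTTP response body. Hashing during the copy avoids reading the file a second time, and the temporary file means a failed or interrupted transfer never appears under the final name.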