Arxiv Mcp Server

2.8k10 toolsSTDIOregistry active

Summary

The Arxiv MCP Server enables AI assistants to search and access research papers from arXiv through the Model Context Protocol, providing tools to query papers with filters for date ranges and categories, download and read paper content, and list downloaded papers. It solves the problem of programmatically integrating arXiv's research repository with AI models by offering a standardized interface for paper discovery and access without requiring direct API management by the client.

Install to Claude Code

verified

claude mcp add arxiv --env ARXIV_STORAGE_PATH=YOUR_ARXIV_STORAGE_PATH -- uvx arxiv-mcp-server --storage-path '${ARXIV_STORAGE_PATH}'

Run in your terminal. Replace YOUR_* placeholders with real values; add --scope user to install for every project.

Review the command, arguments, and environment values before installing — MCP servers run with your local permissions.

CodeRabbit

AI writes the code. CodeRabbit catches the slop.

Try For Free →

Make your agent a DeFi expert

Agent, run crypto. Access onchain data & trade routes via 1inch.

Install now →

Make money from your Skills

On Capafy, your Skill runs online 24/7 as an agent product, and you get paid every time someone uses it.

Start earning →

AppSignal

Monitor with ease. Code with confidence.

Start Free Trial →

Vibe Prospecting MCP

Connect Claude to +800M contacts, +150M companies. Find & Enrich leads in chat.

Try For Free →

Context.dev

Integrate web data into your AI product. One API to scrape website & brand data.

Get API Key Now →

CodeRabbit

AI writes the code. CodeRabbit catches the slop.

Try For Free →

Make your agent a DeFi expert

Agent, run crypto. Access onchain data & trade routes via 1inch.

Install now →

Make money from your Skills

On Capafy, your Skill runs online 24/7 as an agent product, and you get paid every time someone uses it.

Start earning →

AppSignal

Monitor with ease. Code with confidence.

Start Free Trial →

Vibe Prospecting MCP

Connect Claude to +800M contacts, +150M companies. Find & Enrich leads in chat.

Try For Free →

Context.dev

Integrate web data into your AI product. One API to scrape website & brand data.

Get API Key Now →

Tools

Verified live against the running server on Jun 10, 2026.

verified live10 tools

search_papersSearch for papers on arXiv with advanced filtering and query optimization. QUERY CONSTRUCTION GUIDELINES: - Use QUOTED PHRASES for exact matches: "multi-agent systems", "neural networks", "machine learning" - Combine related concepts with OR: "AI agents" OR "software agents" O...6 params

Search for papers on arXiv with advanced filtering and query optimization. QUERY CONSTRUCTION GUIDELINES: - Use QUOTED PHRASES for exact matches: "multi-agent systems", "neural networks", "machine learning" - Combine related concepts with OR: "AI agents" OR "software agents" O...

Parameters* required

query*string

Search query using quoted phrases for exact matches (e.g., '"machine learning" OR "deep learning"') or specific technical terms. Avoid overly broad or generic terms.

date_tostring

End date for papers (YYYY-MM-DD format). Use with date_from to find historical work, e.g., '2020-12-31' for older research.

sort_bystring

Sort results by 'relevance' (most relevant first, default) or 'date' (newest first). Use 'relevance' for focused searches, 'date' for recent developments.one of relevance · date

date_fromstring

Start date for papers (YYYY-MM-DD format). Use to find recent work, e.g., '2023-01-01' for last 2 years.

categoriesarray

Strongly recommended: arXiv categories to focus search (e.g., ['cs.AI', 'cs.MA'] for agent research, ['cs.LG'] for ML, ['cs.CL'] for NLP, ['cs.CV'] for vision). Greatly improves relevance.

max_resultsinteger

Maximum number of results to return (default: 10, max: 50). Use 15-20 for comprehensive searches.

download_paperDownload a paper from arXiv and return its full text content. Tries the HTML version first for clean extraction; falls back to PDF conversion if HTML is unavailable. Returns the paper content directly so you can read it immediately.1 params

Download a paper from arXiv and return its full text content. Tries the HTML version first for clean extraction; falls back to PDF conversion if HTML is unavailable. Returns the paper content directly so you can read it immediately.

Parameters* required

paper_id*string

The arXiv ID of the paper to download (e.g. '2103.12345')

list_papersList all papers that have been downloaded and stored locally via download_paper. Returns arXiv IDs only — use read_paper to access content. Returns an empty list if no papers have been downloaded yet. Workflow: search_papers -> download_paper -> list_papers -> read_paper.

List all papers that have been downloaded and stored locally via download_paper. Returns arXiv IDs only — use read_paper to access content. Returns an empty list if no papers have been downloaded yet. Workflow: search_papers -> download_paper -> list_papers -> read_paper.

No parameters — call it with no arguments.

read_paperRead the full text content of a paper that was previously downloaded via download_paper. Returns the paper in markdown format. Will fail with a clear error if the paper has not been downloaded yet — call download_paper first. Workflow: search_papers -> download_paper -> read_p...1 params

Read the full text content of a paper that was previously downloaded via download_paper. Returns the paper in markdown format. Will fail with a clear error if the paper has not been downloaded yet — call download_paper first. Workflow: search_papers -> download_paper -> read_p...

Parameters* required

paper_id*string

The arXiv ID of the paper to read

get_abstractFetch the abstract and metadata of an arXiv paper by ID, WITHOUT downloading the full paper. Use this before download_paper to assess relevance and save tokens. Returns: title, authors, abstract, categories, published date, and PDF URL. Workflow tip: search_papers -> get_abstr...1 params

Fetch the abstract and metadata of an arXiv paper by ID, WITHOUT downloading the full paper. Use this before download_paper to assess relevance and save tokens. Returns: title, authors, abstract, categories, published date, and PDF URL. Workflow tip: search_papers -> get_abstr...

Parameters* required

paper_id*string

The arXiv paper ID (e.g. '2401.12345' or '2404.19756')

semantic_searchSemantic similarity search over papers you have already downloaded locally via download_paper. Supports free-text queries (e.g. 'attention mechanisms for long sequences') or finding papers similar to a given paper_id. IMPORTANT: only searches your local downloaded collection —...3 params

Semantic similarity search over papers you have already downloaded locally via download_paper. Supports free-text queries (e.g. 'attention mechanisms for long sequences') or finding papers similar to a given paper_id. IMPORTANT: only searches your local downloaded collection —...

Parameters* required

querystring

Free-text semantic query.

paper_idstring

Find papers semantically similar to this arXiv paper ID.

max_resultsinteger

Maximum number of results to return (default: 10).default: 10

reindexRebuild the local semantic index for downloaded papers.1 params

Rebuild the local semantic index for downloaded papers.

Parameters* required

clear_existingboolean

If true, clear the existing index before rebuilding.default: true

citation_graphReturn papers citing an arXiv paper and papers that it references using Semantic Scholar's citation graph.1 params

Return papers citing an arXiv paper and papers that it references using Semantic Scholar's citation graph.

Parameters* required

paper_id*string

arXiv ID (for example: 2401.12345).

watch_topicSave or update a persistent research topic watch. When checked via check_alerts, returns only papers published since the last check — acting as a standing alert for new work on a topic. The topic string uses the same query syntax as search_papers (quoted phrases, field specifi...3 params

Save or update a persistent research topic watch. When checked via check_alerts, returns only papers published since the last check — acting as a standing alert for new work on a topic. The topic string uses the same query syntax as search_papers (quoted phrases, field specifi...

Parameters* required

topic*string

Query string to monitor. Uses arXiv search syntax — quoted phrases for exact matches, field specifiers (ti:, au:, abs:), and boolean operators (AND, OR, ANDNOT). Example: '"reinforcement learning" AND "robotics"'.

categoriesarray

Optional arXiv category filter (e.g. ['cs.LG', 'cs.AI']). Narrows results to specific fields.

max_resultsinteger

Maximum papers to return per alert check (default: 10).default: 10

check_alertsCheck all saved topic watches for newly published papers since the last check. Omitting the topic parameter runs ALL saved watches and returns new papers for each. Passing a topic string checks only that specific watch. Updates each watch's last_checked timestamp after running...1 params

Check all saved topic watches for newly published papers since the last check. Omitting the topic parameter runs ALL saved watches and returns new papers for each. Passing a topic string checks only that specific watch. Updates each watch's last_checked timestamp after running...

Parameters* required

topicstring

Optional: check only this specific watched topic (must match the topic string used in watch_topic exactly). Omit to check all saved watches.

ArXiv MCP Server

🔍 Enable AI assistants to search and access arXiv papers through a simple MCP interface.

The ArXiv MCP Server provides a bridge between AI assistants and arXiv's research repository through the Model Context Protocol (MCP). It allows AI models to search for papers and access their content in a programmatic way.

🤝 Contribute • 📝 Report Bug

✨ Core Features

🔎 Paper Search: Query arXiv papers with filters for date ranges and categories
📄 Paper Access: Download and read paper content
📋 Paper Listing: View all downloaded papers
🗃️ Local Storage: Papers are saved locally for faster access
📝 Prompts: A set of research prompts for paper analysis

🔒 Security

Prompt Injection Risk

Paper content retrieved from arXiv is untrusted external input.

When an AI assistant downloads or reads a paper through this server, the paper's text is passed directly into the model's context. A maliciously crafted paper could embed adversarial instructions designed to hijack the AI's behavior — for example, instructing it to exfiltrate data, invoke other tools with unintended arguments, or override system-level instructions. This is a known class of attack described by OWASP as LLM01: Prompt Injection and by the OWASP Agentic AI framework as AG01: Prompt Injection in LLM-Integrated Systems.

Recommended Mitigations

Use read-only MCP configurations — where possible, configure the MCP client so that the arxiv-mcp-server cannot trigger write operations or invoke other tools on your behalf.
Review paper content before acting on AI summaries — if an AI summary asks you to run commands or visit external URLs that were not part of your original request, treat that as a red flag.
Be cautious in multi-tool setups — agentic pipelines that combine this server with filesystem, shell, or browser tools are higher risk; a prompt injection in a paper could chain tool calls unexpectedly.
Treat AI-generated summaries as data, not instructions — always apply human judgment before executing any action the AI recommends after reading a paper.

References

🚀 Quick Start

Installing via Smithery

To install ArXiv Server for Claude Desktop automatically via Smithery:

npx -y @smithery/cli install arxiv-mcp-server --client claude

Installing via Claude Desktop (.mcpb)

The .mcpb bundle is the one-click install path for Claude Desktop on macOS. It bundles the server code and Python package dependencies, so users do not need uv, pip, or manual MCP JSON configuration. Python 3.11+ must still be available on the user's machine.

Download the artifact matching your Mac from the latest release:
- Apple Silicon: arxiv-mcp-server-darwin-arm64-<version>.mcpb
- Intel: arxiv-mcp-server-darwin-x86_64-<version>.mcpb
In Claude Desktop open Settings → Extensions (or drag-and-drop the file onto the Claude Desktop window).
Click Install and, when prompted, set your preferred paper storage directory (defaults to ~/.arxiv-mcp-server/papers).

Claude Desktop launches the bundled server over stdio — no configuration file edits needed.

Installing Manually

Important — use uv tool install, not npm/pnpm or uv pip install

This project publishes the supported server as a Python package on PyPI. Do not install arxiv-mcp-server with npm install, pnpm add, or npx arxiv-mcp-server: the npm package with this name is an unrelated third-party package and has its own Python-detection wrapper.

Running uv pip install arxiv-mcp-server installs the package into the current virtual environment but does not place the arxiv-mcp-server executable on your PATH. You must use uv tool install so that uv creates an isolated environment and exposes the executable globally:

uv tool install arxiv-mcp-server

After this, the arxiv-mcp-server command will be available on your PATH.

PDF fallback (older papers): Most arXiv papers have an HTML version which the base install handles automatically. For older papers that only have a PDF, the server needs the [pdf] extra (pymupdf4llm). Install it with:
uv tool install 'arxiv-mcp-server[pdf]'

You can verify it with:

arxiv-mcp-server --help

If you previously ran uv pip install arxiv-mcp-server and the command is missing, uninstall it and re-install with uv tool install as shown above.

For development:

# Clone and set up development environment
git clone https://github.com/blazickjp/arxiv-mcp-server.git
cd arxiv-mcp-server

# Create and activate virtual environment
uv venv
source .venv/bin/activate

# Install with test dependencies (development only — no global executable)
uv pip install -e ".[test]"

🤖 Codex Plugin Integration

This repository now includes a Codex plugin manifest at .codex-plugin/plugin.json and a portable MCP config at .mcp.json so Codex-oriented tooling can discover the server without inventing its own install recipe.

The Codex integration uses the same stdio launch path documented elsewhere in this README:

{
  "mcpServers": {
    "arxiv": {
      "command": "uvx",
      "args": ["arxiv-mcp-server"]
    }
  }
}

If your Codex client supports plugin manifests, point it at ./.codex-plugin/plugin.json. If it only supports raw MCP configuration, use ./.mcp.json directly.

🔌 MCP Integration

Add this configuration to your MCP client config file:

{
    "mcpServers": {
        "arxiv-mcp-server": {
            "command": "uv",
            "args": [
                "tool",
                "run",
                "arxiv-mcp-server",
                "--storage-path", "/path/to/paper/storage"
            ]
        }
    }
}

For Development:

{
    "mcpServers": {
        "arxiv-mcp-server": {
            "command": "uv",
            "args": [
                "--directory",
                "path/to/cloned/arxiv-mcp-server",
                "run",
                "arxiv-mcp-server",
                "--storage-path", "/path/to/paper/storage"
            ]
        }
    }
}

HTTP Transport

For server deployments where stdio is not practical, run the server with Streamable HTTP:

TRANSPORT=http HOST=127.0.0.1 PORT=8080 arxiv-mcp-server --storage-path /path/to/papers

Then configure an MCP client that supports Streamable HTTP:

{
    "mcpServers": {
        "arxiv-mcp-server": {
            "type": "http",
            "url": "http://127.0.0.1:8080/mcp"
        }
    }
}

The default HTTP bind host is 127.0.0.1. Streamable HTTP enables MCP DNS rebinding protection by default and allows loopback hosts for the configured port. If exposing the server through a reverse proxy, keep it bound to localhost unless you have added authentication and network controls upstream; set ALLOWED_HOSTS and ALLOWED_ORIGINS to the external host/origin values your proxy forwards.

🔒 Security Note

arXiv papers are user-generated, untrusted content. Paper text returned by this server may contain prompt injection attempts — crafted text designed to manipulate an AI assistant's behavior. Treat all paper content as untrusted input.

In production environments, apply appropriate sandboxing and avoid feeding raw paper content into agentic pipelines that have access to sensitive tools or data without review. See SECURITY.md for the full security policy.

💡 Available Tools

Core Workflow

The typical workflow for deep paper research is:

search_papers → download_paper → read_paper

list_papers shows what you have locally. semantic_search searches across your local collection.

1. Paper Search

Search arXiv with optional category, date, and boolean filters. Enforces arXiv's 3-second rate limit automatically. If rate limited, wait 60 seconds before retrying.

result = await call_tool("search_papers", {
    "query": "\"KAN\" OR \"Kolmogorov-Arnold Networks\"",
    "max_results": 10,
    "date_from": "2024-01-01",
    "categories": ["cs.LG", "cs.AI"],
    "sort_by": "date"   # or "relevance" (default)
})

Supported categories include cs.AI, cs.LG, cs.CL, cs.CV, cs.NE, stat.ML, math.OC, quant-ph, eess.SP, and more. See tool description for the full list.

2. Paper Download

Download a paper by its arXiv ID. Tries HTML first, falls back to PDF. Stores the paper locally for read_paper and semantic_search. The response includes content_length, returned_chars, next_start, and is_truncated so clients can safely page through very large papers without mistaking client-side output caps for failed downloads.

result = await call_tool("download_paper", {
    "paper_id": "2401.12345"
})

# For very large papers, request bounded chunks:
result = await call_tool("download_paper", {
    "paper_id": "2401.12345",
    "start": 0,
    "max_chars": 50000
})

For older papers that only have a PDF, install the [pdf] extra: uv tool install 'arxiv-mcp-server[pdf]'

3. List Papers

List all papers downloaded locally. Returns arXiv IDs only — use read_paper to access content.

result = await call_tool("list_papers", {})

4. Read Paper

Read the full text of a locally downloaded paper in markdown. Requires download_paper to be called first. Use start and max_chars with the returned next_start value to page through large papers.

result = await call_tool("read_paper", {
    "paper_id": "2401.12345"
})

result = await call_tool("read_paper", {
    "paper_id": "2401.12345",
    "start": 50000,
    "max_chars": 50000
})

📝 Research Prompts

The server offers specialized prompts to help analyze academic papers:

Paper Analysis Prompt

A comprehensive workflow for analyzing academic papers that only requires a paper ID:

result = await call_prompt("deep-paper-analysis", {
    "paper_id": "2401.12345"
})

This prompt includes:

Detailed instructions for using available tools (list_papers, download_paper, read_paper, search_papers)
A systematic workflow for paper analysis
Comprehensive analysis structure covering:
- Executive summary
- Research context
- Methodology analysis
- Results evaluation
- Practical and theoretical implications
Future research directions
Broader impacts

Pro Prompt Pack

summarize_paper: concise structured summary for one paper.
compare_papers: side-by-side technical comparison across paper IDs.
literature_review: thematic synthesis across a topic and optional paper set.

⚙️ Configuration

Configure through command-line options and environment variables:

Setting	Purpose	Default
`--storage-path`	Paper storage location	`~/.arxiv-mcp-server/papers`
`MAX_RESULTS`	Maximum search results	`50`
`REQUEST_TIMEOUT`	API timeout in seconds	`60`
`TRANSPORT`	Transport type: `stdio`, `http`, or `streamable-http`	`stdio`
`HOST`	Host to bind to in HTTP mode	`127.0.0.1`
`PORT`	Port to listen on in HTTP mode	`8000`
`ALLOWED_HOSTS`	Comma-separated extra allowed Host header values for Streamable HTTP DNS rebinding protection	empty
`ALLOWED_ORIGINS`	Comma-separated extra allowed Origin header values for Streamable HTTP DNS rebinding protection	empty

🧪 Testing

Run the test suite:

python -m pytest

🧪 Experimental Features

These features are not yet fully tested and may behave unexpectedly. Use with caution.

The following tools require additional dependencies and are under active development:

uv pip install -e ".[pro]"

Semantic Search

Semantic similarity search over your locally downloaded papers only. Returns empty results if no papers have been downloaded yet. Requires [pro] dependencies.

result = await call_tool("semantic_search", {
    "query": "test-time adaptation in multimodal transformers",
    "max_results": 5
})
# or find papers similar to a known paper:
result = await call_tool("semantic_search", {
    "paper_id": "2404.19756",
    "max_results": 5
})

Citation Graph

Fetch references and citing papers via Semantic Scholar. Works on any arXiv ID — no local download required.

result = await call_tool("citation_graph", {
    "paper_id": "2401.12345"
})

Research Alerts

Save topic watches and poll for newly published papers since the last check. Uses the same query syntax as search_papers.

# Register a watch (idempotent — calling again updates the existing watch)
await call_tool("watch_topic", {
    "topic": "\"multi-agent reinforcement learning\"",
    "categories": ["cs.AI", "cs.LG"],
    "max_results": 10
})

# Check all watches — returns only papers published since last check
result = await call_tool("check_alerts", {})

# Check a single watch
result = await call_tool("check_alerts", {"topic": "\"multi-agent reinforcement learning\""})

Advanced Prompts

summarize_paper, compare_papers, and literature_review for deeper research workflows. Requires [pro] dependencies.

📄 License

Released under the Apache License 2.0. See the LICENSE file for details.

Made with ❤️ by the Pearl Labs Team

Featured

CodeRabbit

AI writes the code. CodeRabbit catches the slop.

Try For Free →

Make your agent a DeFi expert

Agent, run crypto. Access onchain data & trade routes via 1inch.

Install now →

Make money from your Skills

On Capafy, your Skill runs online 24/7 as an agent product, and you get paid every time someone uses it.

Start earning →

AppSignal

Monitor with ease. Code with confidence.

Start Free Trial →

Vibe Prospecting MCP

Connect Claude to +800M contacts, +150M companies. Find & Enrich leads in chat.

Try For Free →

Context.dev

Integrate web data into your AI product. One API to scrape website & brand data.

Get API Key Now →

Configuration

ARXIV_STORAGE_PATH

Optional path for storing downloaded papers locally.

ArXiv MCP Server

🔍 Enable AI assistants to search and access arXiv papers through a simple MCP interface.

🤝 Contribute • 📝 Report Bug

✨ Core Features

🔎 Paper Search: Query arXiv papers with filters for date ranges and categories
📄 Paper Access: Download and read paper content
📋 Paper Listing: View all downloaded papers
🗃️ Local Storage: Papers are saved locally for faster access
📝 Prompts: A set of research prompts for paper analysis

🔒 Security

Prompt Injection Risk

Paper content retrieved from arXiv is untrusted external input.

Recommended Mitigations

Use read-only MCP configurations — where possible, configure the MCP client so that the arxiv-mcp-server cannot trigger write operations or invoke other tools on your behalf.
Review paper content before acting on AI summaries — if an AI summary asks you to run commands or visit external URLs that were not part of your original request, treat that as a red flag.
Be cautious in multi-tool setups — agentic pipelines that combine this server with filesystem, shell, or browser tools are higher risk; a prompt injection in a paper could chain tool calls unexpectedly.
Treat AI-generated summaries as data, not instructions — always apply human judgment before executing any action the AI recommends after reading a paper.

References

🚀 Quick Start

Installing via Smithery

To install ArXiv Server for Claude Desktop automatically via Smithery:

npx -y @smithery/cli install arxiv-mcp-server --client claude

Installing via Claude Desktop (.mcpb)

Download the artifact matching your Mac from the latest release:
- Apple Silicon: arxiv-mcp-server-darwin-arm64-<version>.mcpb
- Intel: arxiv-mcp-server-darwin-x86_64-<version>.mcpb
In Claude Desktop open Settings → Extensions (or drag-and-drop the file onto the Claude Desktop window).
Click Install and, when prompted, set your preferred paper storage directory (defaults to ~/.arxiv-mcp-server/papers).

Claude Desktop launches the bundled server over stdio — no configuration file edits needed.

Installing Manually

Important — use uv tool install, not npm/pnpm or uv pip install

This project publishes the supported server as a Python package on PyPI. Do not install arxiv-mcp-server with npm install, pnpm add, or npx arxiv-mcp-server: the npm package with this name is an unrelated third-party package and has its own Python-detection wrapper.

Running uv pip install arxiv-mcp-server installs the package into the current virtual environment but does not place the arxiv-mcp-server executable on your PATH. You must use uv tool install so that uv creates an isolated environment and exposes the executable globally:

uv tool install arxiv-mcp-server

After this, the arxiv-mcp-server command will be available on your PATH.

PDF fallback (older papers): Most arXiv papers have an HTML version which the base install handles automatically. For older papers that only have a PDF, the server needs the [pdf] extra (pymupdf4llm). Install it with:
uv tool install 'arxiv-mcp-server[pdf]'

You can verify it with:

arxiv-mcp-server --help

If you previously ran uv pip install arxiv-mcp-server and the command is missing, uninstall it and re-install with uv tool install as shown above.

For development:

# Clone and set up development environment
git clone https://github.com/blazickjp/arxiv-mcp-server.git
cd arxiv-mcp-server

# Create and activate virtual environment
uv venv
source .venv/bin/activate

# Install with test dependencies (development only — no global executable)
uv pip install -e ".[test]"

🤖 Codex Plugin Integration

The Codex integration uses the same stdio launch path documented elsewhere in this README:

{
  "mcpServers": {
    "arxiv": {
      "command": "uvx",
      "args": ["arxiv-mcp-server"]
    }
  }
}

If your Codex client supports plugin manifests, point it at ./.codex-plugin/plugin.json. If it only supports raw MCP configuration, use ./.mcp.json directly.

🔌 MCP Integration

Add this configuration to your MCP client config file:

{
    "mcpServers": {
        "arxiv-mcp-server": {
            "command": "uv",
            "args": [
                "tool",
                "run",
                "arxiv-mcp-server",
                "--storage-path", "/path/to/paper/storage"
            ]
        }
    }
}

For Development:

{
    "mcpServers": {
        "arxiv-mcp-server": {
            "command": "uv",
            "args": [
                "--directory",
                "path/to/cloned/arxiv-mcp-server",
                "run",
                "arxiv-mcp-server",
                "--storage-path", "/path/to/paper/storage"
            ]
        }
    }
}

HTTP Transport

For server deployments where stdio is not practical, run the server with Streamable HTTP:

TRANSPORT=http HOST=127.0.0.1 PORT=8080 arxiv-mcp-server --storage-path /path/to/papers

Then configure an MCP client that supports Streamable HTTP:

{
    "mcpServers": {
        "arxiv-mcp-server": {
            "type": "http",
            "url": "http://127.0.0.1:8080/mcp"
        }
    }
}

🔒 Security Note

💡 Available Tools

Core Workflow

The typical workflow for deep paper research is:

search_papers → download_paper → read_paper

list_papers shows what you have locally. semantic_search searches across your local collection.

1. Paper Search

Search arXiv with optional category, date, and boolean filters. Enforces arXiv's 3-second rate limit automatically. If rate limited, wait 60 seconds before retrying.

result = await call_tool("search_papers", {
    "query": "\"KAN\" OR \"Kolmogorov-Arnold Networks\"",
    "max_results": 10,
    "date_from": "2024-01-01",
    "categories": ["cs.LG", "cs.AI"],
    "sort_by": "date"   # or "relevance" (default)
})

Supported categories include cs.AI, cs.LG, cs.CL, cs.CV, cs.NE, stat.ML, math.OC, quant-ph, eess.SP, and more. See tool description for the full list.

2. Paper Download

result = await call_tool("download_paper", {
    "paper_id": "2401.12345"
})

# For very large papers, request bounded chunks:
result = await call_tool("download_paper", {
    "paper_id": "2401.12345",
    "start": 0,
    "max_chars": 50000
})

For older papers that only have a PDF, install the [pdf] extra: uv tool install 'arxiv-mcp-server[pdf]'

3. List Papers

List all papers downloaded locally. Returns arXiv IDs only — use read_paper to access content.

result = await call_tool("list_papers", {})

4. Read Paper

result = await call_tool("read_paper", {
    "paper_id": "2401.12345"
})

result = await call_tool("read_paper", {
    "paper_id": "2401.12345",
    "start": 50000,
    "max_chars": 50000
})

📝 Research Prompts

The server offers specialized prompts to help analyze academic papers:

Paper Analysis Prompt

A comprehensive workflow for analyzing academic papers that only requires a paper ID:

result = await call_prompt("deep-paper-analysis", {
    "paper_id": "2401.12345"
})

This prompt includes:

Detailed instructions for using available tools (list_papers, download_paper, read_paper, search_papers)
A systematic workflow for paper analysis
Comprehensive analysis structure covering:
- Executive summary
- Research context
- Methodology analysis
- Results evaluation
- Practical and theoretical implications
Future research directions
Broader impacts

Pro Prompt Pack

summarize_paper: concise structured summary for one paper.
compare_papers: side-by-side technical comparison across paper IDs.
literature_review: thematic synthesis across a topic and optional paper set.

⚙️ Configuration

Configure through command-line options and environment variables:

Setting	Purpose	Default
`--storage-path`	Paper storage location	`~/.arxiv-mcp-server/papers`
`MAX_RESULTS`	Maximum search results	`50`
`REQUEST_TIMEOUT`	API timeout in seconds	`60`
`TRANSPORT`	Transport type: `stdio`, `http`, or `streamable-http`	`stdio`
`HOST`	Host to bind to in HTTP mode	`127.0.0.1`
`PORT`	Port to listen on in HTTP mode	`8000`
`ALLOWED_HOSTS`	Comma-separated extra allowed Host header values for Streamable HTTP DNS rebinding protection	empty
`ALLOWED_ORIGINS`	Comma-separated extra allowed Origin header values for Streamable HTTP DNS rebinding protection	empty

🧪 Testing

Run the test suite:

python -m pytest

🧪 Experimental Features

These features are not yet fully tested and may behave unexpectedly. Use with caution.

The following tools require additional dependencies and are under active development:

uv pip install -e ".[pro]"

Semantic Search

Semantic similarity search over your locally downloaded papers only. Returns empty results if no papers have been downloaded yet. Requires [pro] dependencies.

result = await call_tool("semantic_search", {
    "query": "test-time adaptation in multimodal transformers",
    "max_results": 5
})
# or find papers similar to a known paper:
result = await call_tool("semantic_search", {
    "paper_id": "2404.19756",
    "max_results": 5
})

Citation Graph

Fetch references and citing papers via Semantic Scholar. Works on any arXiv ID — no local download required.

result = await call_tool("citation_graph", {
    "paper_id": "2401.12345"
})

Research Alerts

Save topic watches and poll for newly published papers since the last check. Uses the same query syntax as search_papers.

# Register a watch (idempotent — calling again updates the existing watch)
await call_tool("watch_topic", {
    "topic": "\"multi-agent reinforcement learning\"",
    "categories": ["cs.AI", "cs.LG"],
    "max_results": 10
})

# Check all watches — returns only papers published since last check
result = await call_tool("check_alerts", {})

# Check a single watch
result = await call_tool("check_alerts", {"topic": "\"multi-agent reinforcement learning\""})

Advanced Prompts

summarize_paper, compare_papers, and literature_review for deeper research workflows. Requires [pro] dependencies.

📄 License

Released under the Apache License 2.0. See the LICENSE file for details.

Made with ❤️ by the Pearl Labs Team

Arxiv Mcp Server

Install to Claude Code

Tools

ArXiv MCP Server

✨ Core Features

🔒 Security

Prompt Injection Risk

Recommended Mitigations

References

🚀 Quick Start

Installing via Smithery

Installing via Claude Desktop (.mcpb)

Installing Manually

🤖 Codex Plugin Integration

🔌 MCP Integration

HTTP Transport

🔒 Security Note

💡 Available Tools

Core Workflow

1. Paper Search

2. Paper Download

3. List Papers

4. Read Paper

📝 Research Prompts

Paper Analysis Prompt

Pro Prompt Pack

⚙️ Configuration

🧪 Testing

🧪 Experimental Features

Semantic Search

Citation Graph

Research Alerts

Advanced Prompts

📄 License

Configuration

Arxiv Mcp Server

Install to Claude Code

Tools

ArXiv MCP Server

✨ Core Features

🔒 Security

Prompt Injection Risk

Recommended Mitigations

References

🚀 Quick Start

Installing via Smithery

Installing via Claude Desktop (.mcpb)

Installing Manually

🤖 Codex Plugin Integration

🔌 MCP Integration

HTTP Transport

🔒 Security Note

💡 Available Tools

Core Workflow

1. Paper Search

2. Paper Download

3. List Papers

4. Read Paper

📝 Research Prompts

Paper Analysis Prompt

Pro Prompt Pack

⚙️ Configuration

🧪 Testing

🧪 Experimental Features

Semantic Search

Citation Graph

Research Alerts

Advanced Prompts

📄 License

Configuration

Related Search & Web Crawling MCP Servers

Related Search & Web Crawling MCP Servers