A comprehensive scraping toolkit that brings 26 web extraction tools into Claude and other MCP clients. The server includes both free local operations (text extraction, link parsing, metadata scraping) and metered premium features (deep crawling, autonomous research agent, stealth browsing). Notable is the extract_with_llm tool that runs against local Ollama models by default, keeping your data on your machine with no per-token cost. The autonomous agent tool is interesting: describe what you need in natural language and it plans and executes the research without requiring URLs upfront. Includes pre-built templates for common sites like Amazon, GitHub, and YouTube. Free tier starts at 1,000 credits with rollover, and the 15 local tools require no API key at all.
26 web scraping, crawling, deep-research & autonomous-extraction tools for Claude, Cursor & any MCP client.
Clean Markdown & structured JSON from any site. Get started with 1,000 free credits — no credit card required.
⭐ Star us on GitHub to follow along — it genuinely helps others discover the project.
agent, a unified multi-format scrape, document processing, stealth browsing, and more, callable directly from your AI assistant.extract_with_llm runs against a local Ollama model out of the box: no LLM API key, no per-token cost, and your data never leaves your machine. Cloud (OpenAI/Anthropic) is opt-in.agent — describe what you need in natural language; it plans, gathers, and shapes an answer under orchestrator-enforced hard stops (max steps/URLs/wall-clock) — no URLs required.| CrawlForge MCP | Firecrawl | Raw scraping API | |
|---|---|---|---|
| Native MCP server | ✅ 26 tools | ✅ | ❌ |
| Free tier | ✅ 1,000 credits, rollover | Limited | Varies |
| Self-hosted / local LLM extraction (Ollama) | ✅ default, $0/token | ❌ | ❌ |
| Autonomous agent (no URLs needed) | ✅ agent | ✅ | ❌ |
| Deep research with source verification | ✅ deep_research | Partial | ❌ |
| Browser automation / actions | ✅ scrape_with_actions | ✅ | Varies |
| Stealth / anti-detection engines | ✅ Chromium + Camoufox | ✅ | Add-on |
| Pre-built site templates | ✅ 10 sites | ❌ | ❌ |
| License | MIT | AGPL-3.0 | Proprietary |
Comparison reflects publicly documented capabilities at time of writing. CrawlForge is MIT-licensed and MCP-first — built to plug straight into AI coding assistants.
npm install -g crawlforge-mcp-server
The 15 free local tools work immediately with no API key at all — skip straight to step 3 if that's all you need. To unlock the metered premium tools (search_web, crawl_deep, stealth_mode, agent, …):
npx crawlforge-setup
This will:
Don't have an API key? Get one free at https://www.crawlforge.dev/signup
One-step setup (v4.6.0+):
crawlforge initdetects your API key, installs the agent skill, and idempotently merges the MCP config stanza into Claude Code, Claude Desktop, and Cursor. Usecrawlforge init --all --yesto configure every detected client non-interactively.
Add to claude_desktop_config.json:
{
"mcpServers": {
"crawlforge": {
"command": "npx",
"args": ["-y", "crawlforge-mcp-server"]
}
}
}
Location:
~/Library/Application Support/Claude/claude_desktop_config.json%APPDATA%/Claude/claude_desktop_config.json~/.config/Claude/claude_desktop_config.jsonRestart Claude Desktop to activate.
The setup wizard automatically configures Claude Code by adding to ~/.claude.json:
{
"mcpServers": {
"crawlforge": {
"type": "stdio",
"command": "crawlforge-mcp"
}
}
}
After setup, restart Claude Code to activate.
The setup wizard automatically configures Cursor by adding to ~/.cursor/mcp.json:
{
"mcpServers": {
"crawlforge": {
"type": "stdio",
"command": "crawlforge-mcp"
}
}
}
Restart Cursor to activate.
Which launch command?
npx -y crawlforge-mcp-serverneeds no global install and always runs the published version (recommended for Claude Desktop). For a global install (npm i -g crawlforge-mcp-server), use the dedicatedcrawlforge-mcpbin — it resolves on yourPATH, so it survives Node/nvm version switches. The barecrawlforgecommand still launches the server when an MCP client spawns it over stdio (backward compatibility for configs created before v4.2.5); interactively it's the CLI — runcrawlforge mcpto start the server by hand.
CrawlForge is open-core: 15 tools run locally on your machine and are completely free — no API key required. The metered premium tools cover real infrastructure (search fees, proxies, browser farms) and need an API key.
Free Local Tools (0 credits, no API key needed)
| Tool | What it does |
|---|---|
fetch_url | Fetch content from any URL |
extract_text | Extract clean text from web pages |
extract_links | Get all links from a page |
extract_metadata | Extract page metadata (title, OG tags, schema.org) |
scrape | Unified single-fetch, multi-format extraction. Pass a formats array (markdown/html/rawHtml/text/links/metadata/screenshot/json-schema) plus onlyMainContent; one fetch serves every requested format with per-format partial-success warnings. The screenshot format is the one metered exception (2 credits — needs a server browser) |
scrape_structured | Extract structured data with CSS selectors |
scrape_template | Structured data from well-known sites (Amazon, GitHub, LinkedIn, YouTube, Reddit, Hacker News, npm, and more) without writing selectors |
extract_content | Enhanced content extraction |
summarize_content | Generate intelligent summaries |
analyze_content | Comprehensive content analysis |
extract_structured | LLM-powered schema-driven extraction (your own LLM key or local Ollama) |
extract_with_llm | Natural-language extraction. Defaults to a local Ollama model — no API key, no API costs. Pass provider: "openai" | "anthropic" with the matching key for cloud models |
process_document | Multi-format document processing |
list_ollama_models | List the Ollama models installed locally (helps you pick a model for extract_with_llm) |
get_batch_results | Retrieve paginated results for a batch_scrape job by batchId |
Metered Premium Tools (3–10 credits, API key required)
| Tool | Credits | What it does |
|---|---|---|
map_site | 3 | Discover and map website structure (optional search= ranks the discovered URLs) |
track_changes | 3 | Monitor content changes over time |
search_web | 5 | Search the web using Google Search API |
crawl_deep | 5 | Deep crawl entire websites |
batch_scrape | 5 | Process multiple URLs simultaneously |
scrape_with_actions | 5 | Browser automation chains |
generate_llms_txt | 5 | Generate AI interaction guidelines |
localization | 5 | Multi-language and geo-location management |
agent | 8 | Autonomous research/extraction from a natural-language prompt — no URLs required. Plans, gathers, and shapes an answer under hard safety stops (max steps/URLs/wall-clock enforced by the orchestrator, never the LLM) |
deep_research | 10 | Multi-stage research with source verification |
stealth_mode | 10 | Anti-detection browser management |
For the full canonical capabilities reference (all tools, CLI commands, stealth engines, research workflow), see SKILL.md.
15 local tools are free forever — no API key, no credit card. Credits only meter the premium tools that run on CrawlForge infrastructure.
| Plan | Credits/Month | Best For |
|---|---|---|
| Free | 1,000 | Testing & personal projects |
| Hobby ($19) | 5,000 | Small projects & development |
| Professional ($99) | 50,000 | Professional use & production |
| Business ($399) | 250,000 | Large scale operations |
All plans include:
# Optional: Set API key via environment
export CRAWLFORGE_API_KEY="cf_live_your_api_key_here"
# Optional: Custom API endpoint (for enterprise)
export CRAWLFORGE_API_URL="https://api.crawlforge.dev"
# As of v3.0.18, this variable is validated against an allow-list of CrawlForge backend hosts.
# Optional: Local LLM (Ollama) overrides — extract_with_llm defaults to Ollama
export OLLAMA_BASE_URL="http://localhost:11434" # default
export OLLAMA_DEFAULT_MODEL="llama3.2" # default; any locally-pulled model name works
# Optional: Cloud LLM keys — only needed when you pass provider: "openai" or "anthropic"
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
extract_with_llm with Ollama)extract_with_llm defaults to a local Ollama model — no API key, no API costs, no data leaving your machine.
# 1. Install Ollama: https://ollama.com
# 2. Pull any model from https://ollama.com/library
ollama pull llama3.2
# 3. Discover what's installed (from your MCP client)
# list_ollama_models()
# 4. Extract — defaults to Ollama with the model from step 2
# extract_with_llm({ url: "https://example.com", prompt: "…", model: "llama3.2" })
Your configuration is stored at ~/.crawlforge/config.json:
{
"apiKey": "cf_live_...",
"userId": "user_...",
"email": "you@example.com"
}
Once configured, use these tools in your AI assistant:
"Search for the latest AI news"
"Extract all links from example.com"
"Crawl the documentation site and summarize it"
"Monitor this page for changes"
"Extract product prices from this e-commerce site"
~/.crawlforge/config.json{crawlforge.dev, www.crawlforge.dev, api.crawlforge.dev}, HTTPS required). Setting CRAWLFORGE_API_URL to an arbitrary host is blocked at parse time.scrape_with_actions accepts only 7 action types (wait, click, type, press, scroll, screenshot, executeJavaScript). No download, file-write, or arbitrary cross-page navigation primitives exist.executeJavaScript action throws by default. Set ALLOW_JAVASCRIPT_EXECUTION=true at deploy time to enable (not recommended in production).deep_research (>50 URLs), batch_scrape (sync mode, >25 URLs), crawl_deep (projected >500 pages), extract_structured (schema has >3 required fields with no LLM configured). Credit-low situations also elicit. Confirmation is best-effort: if the MCP client does not support elicitation the tool proceeds (fail-open).withAuth(); metered tools check and deduct credits before execution (fail-closed since v3.0.18). Free local tools (cost 0) skip the credit path entirely.See docs/sandboxing-and-approvals.md for the full reference.
v3.0.3 (2025-10-01): Removed authentication bypass vulnerability. All users must authenticate with valid API keys.
For the full security policy and how to report a vulnerability, see SECURITY.md.
MIT License - see LICENSE file for details.
Contributions are welcome! Please read our Contributing Guide first.
Built with ❤️ by the CrawlForge team
io.github.pipeworx-io/brave-search
marcopesani/mcp-server-serper
brave/brave-search-mcp-server
com.mcparmory/google-search-console
acamolese/google-search-console-mcp
io.github.sarahpark/google-search-console