Autonomous web crawler packaged as a single Rust binary with MCP server built in. Exposes 17 browser tools (navigate, click, fill forms, run JS, screenshots, multi-tab control) plus a goal-driven run_goal agent that takes plain English descriptions and figures out what to click and extract. Smart about when to use headless Chrome versus fast HTTP fetching by detecting JS framework markers. Works as both an MCP client (can call other MCP servers to extend its toolset) and server (plug into Claude Desktop, Cursor, Zed). Ships with stealth browser capabilities and supports 25 LLM providers. Reach for this when you need structured scraping without writing selectors or want to give Claude real browser control beyond conversational browsing.
█████╗ ██████╗██████╗ █████╗ ██╗ ██╗██╗ ██╔══██╗██╔════╝██╔══██╗██╔══██╗██║ ██║██║ ███████║██║ ██████╔╝███████║██║ █╗ ██║██║ ██╔══██║██║ ██╔══██╗██╔══██║██║███╗██║██║ ██║ ██║╚██████╗██║ ██║██║ ██║╚███╔███╔╝███████╗ ╚═╝ ╚═╝ ╚═════╝╚═╝ ╚═╝╚═╝ ╚═╝ ╚══╝╚══╝ ╚══════╝
LLM-powered web crawler. Describe what you want in plain English — get structured data back.
Single binary. No Python runtime. 21 tools. 25 LLM providers. MCP server built-in.
Most web scraping still means writing code: XPath selectors, pagination logic, retry handling, anti-bot workarounds. LLMs can read pages like humans do, but wiring one up to a browser is a project in itself.
acrawl is that wiring, packaged as a single Rust binary. You describe a goal; the agent figures out which pages to visit, what to click, what to extract, and when it's done.
cargo build --release produces a self-contained executable. No Python, no Node runtime — just Rust and a Chromium download for browser automation.__next_data__, __nuxt, __vue, ng-app, React roots), auth redirects, and short <noscript> bodies — then transparently escalates to a headless browser.acrawl mcp exposes all 17 browser tools plus an autonomous run_goal agent to any MCP-compatible client — Claude Code, Cursor, Windsurf, VS Code, Zed, JetBrains, TRAE, Gemini CLI, and more. Install with acrawl mcp install.| acrawl | browser-use | Stagehand | Skyvern | Firecrawl | Playwright MCP | Scrapy | Playwright scripts | |
|---|---|---|---|---|---|---|---|---|
| No code needed | Yes | No | No | Partial | No | No | No | No |
| Single binary | Yes | No | No | No | No | No | No | No |
| JS rendering | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes |
| LLM-powered navigation | Yes | Yes | Yes | Yes | Limited | No | No | No |
| No Python / Node needed | Yes | No | No | No | No | No | No | No |
| Form filling / interaction | Yes | Yes | Yes | Yes | No | Yes | No | Yes |
| Sub-agent parallelism | Yes | No | No | Partial | Partial | No | Partial | No |
| 25 LLM providers | Yes | Via LiteLLM | Partial | Partial | N/A | N/A | N/A | N/A |
| MCP client (use tools) | Yes | No | No | No | No | No | No | No |
| MCP server (expose as tools) | Yes | No | No | No | Yes | Yes | No | No |
| Stealth browser built-in | Yes | Cloud only | Via Browserbase | Cloud only | No | No | No | No |
| Open source | Yes | Yes (MIT) | Yes (MIT) | Yes (Apache) | Engine only | Yes (MIT) | Yes (BSD) | Yes (Apache) |
Notes:
pip install. Every action calls an LLM: 2-5s/step, ~$0.02-0.30/task. Cloud tier adds stealth; self-hosted is bare Playwright.act(), extract(), observe()). Action caching reuses successful clicks without re-calling the LLM. Requires Node and, for production, Browserbase cloud hosting.Most AI providers offer some form of browsing, but it is designed for conversational information retrieval, not programmatic web automation. Key constraints:
| acrawl | ChatGPT Agent | Claude Computer Use | Claude in Chrome | Gemini Deep Research | Copilot / Edge | |
|---|---|---|---|---|---|---|
| Real JS-rendered browser | Yes | Yes (sandboxed cloud VM) | Indirect (dev provides env) | Yes (your Chrome) | No (search API only) | Limited (Bing retrieval) |
| Click / fill forms | Yes | Yes (requires user confirmation) | Yes | Yes | No | Limited |
| Programmable / scriptable | Yes | No | Yes (API beta) | No | No | No |
| Sub-agent parallelism | Yes | No | No | No | No | No |
| MCP server (expose as tools) | Yes | No | No | No | No | No |
| Returns structured data | Yes | No (text summaries) | No (screenshots) | No | No | No |
| Stealth / anti-bot | Yes | No | No | No | No | No |
| No vendor lock-in | Yes (25 providers) | OpenAI only | Anthropic only | Anthropic only | Google only | OpenAI / Bing only |
| Runs without paid subscription | Yes (OSS; LLM key needed) | No (Plus/Pro/Business) | No (API cost) | No (Max plan) | Partial | Yes (free tier) |
Notes:
Linux / macOS (x64 / ARM64):
curl -fsSL https://raw.githubusercontent.com/Mingye-Lu/AgenticCrawler/main/install.sh | bash
Windows (x64, PowerShell):
irm https://raw.githubusercontent.com/Mingye-Lu/AgenticCrawler/main/install.ps1 | iex
This downloads the latest binary, verifies its SHA256 checksum, and sets up CloakBrowser for stealth browser automation. Requires Node.js 20+ for browser features.
acrawl checks for updates on startup and shows a notification when a new version is available.
git clone https://github.com/Mingye-Lu/AgenticCrawler.git
cd AgenticCrawler
cargo build --release
# Install CloakBrowser (required for browser automation — binary auto-downloads on first use)
npm install
The acrawl Bridge extension lets acrawl control your real browser (with your sessions, cookies, and existing extensions) instead of a headless CloakBrowser instance. Download acrawl-extension.zip from the latest release, unzip it, then load it into your browser:
| Browser | Extensions page | Developer mode toggle |
|---|---|---|
| Chrome | chrome://extensions | Top-right |
| Edge | edge://extensions | Bottom-left |
| Brave | brave://extensions | Top-right |
| Arc / Vivaldi / Opera | <browser>://extensions | Varies |
Enable Developer mode, click Load unpacked, and select the unzipped folder. Then run /extension in the acrawl REPL to connect. See extension/README.md for full setup details.
# Set up your LLM provider (interactive prompt)
./target/release/acrawl auth anthropic # or: openai, other
Credentials are stored in ~/.acrawl/credentials.json. Override the config directory with ACRAWL_CONFIG_HOME.
# Interactive REPL
./target/release/acrawl
# One-shot mode
./target/release/acrawl prompt "scrape all book titles and prices from books.toscrape.com"
# Resume a saved session
./target/release/acrawl --resume session.json /status /compact
Scrape a product catalog:
acrawl > scrape all book titles, prices, and ratings from books.toscrape.com
The agent navigates to the site, reads the page, extracts the data, paginates through all 50 pages, and returns structured JSON.
Fill and submit a form:
acrawl > go to example.com/contact, fill in name "Jane Doe", email "jane@example.com",
message "Hello", and submit the form
The agent locates form fields, fills them in, clicks submit, and confirms the result.
Monitor a price:
acrawl > check the current price of "Rust in Action" on books.toscrape.com
Single-page extraction — the agent fetches, reads, and returns the price without unnecessary navigation.
Extract from JS-rendered pages:
acrawl > get all repository names and star counts from github.com/trending
Static HTTP won't work here. acrawl detects React/Next.js markers and automatically escalates to a headless browser to render the JavaScript.
Parallel multi-page crawl:
acrawl > scrape the title, author, and price of every book across all 50 pages on books.toscrape.com.
Fork a sub-agent for each page to speed this up.
The agent spawns up to 5 concurrent sub-agents, each on its own browser tab, to crawl pages in parallel. Results are merged when all sub-agents finish.
| Tool | Description |
|---|---|
navigate | Go to a URL (supports format: markdown/text/html/fit_markdown). Uses HTTP first, auto-escalates to browser when JS is detected. Returns structured content with a page_map. fit_markdown prunes boilerplate DOM nodes before conversion, saving tokens. |
go_back | Browser back button. Returns page_state with the resulting page structure. |
scroll | Scroll up or down by pixel amount (pixels, default: 500). Returns page_state after scrolling. |
switch_tab | Switch to a different browser tab by index. Returns page_state of the new tab. |
wait | Wait for a CSS selector to reach a given state (visible, hidden, attached, detached) or a fixed timeout (up to 300s). Returns page_state after the condition is met. |
The navigate tool's format parameter controls how the page is returned:
| Format | Description |
|---|---|
markdown | Full HTML → markdown conversion. All content preserved. |
fit_markdown | Recommended. Prunes boilerplate before conversion, saving 30-60% tokens on typical pages. |
text | Plain text, no markdown. |
html | Raw HTML. |
fit_markdown works in two passes:
Hard-block removal, elements whose class or id attribute contains any of these strings are removed immediately: nav, footer, header, sidebar, ads, comment, promo, advert, social, share.
Score-based pruning, remaining elements are scored; anything below 0.48 is removed. The score is:
0.4 × text_density
+ 0.2 × (1 − link_density)
+ 0.2 × tag_weight
+ 0.1 × class_id_score
+ 0.1 × ln(text_length + 1)
Tag weights: article = 1.5 · h1 = 1.2 · h2 = 1.1 · h3/p/section = 1.0 · h4 = 0.9 · h5/table = 0.8 · h6 = 0.7 · span = 0.3 · div/li/ul/ol = 0.5.
Use markdown instead of fit_markdown when: the page has important content inside elements named sidebar, nav, or similar, for example, metadata panels, related-article links, or author info stored in a sidebar div.
If fit_markdown prunes all content (empty result), the tool automatically falls back to plain text.
| Tool | Description |
|---|---|
click | Click an element by CSS selector. Returns page_state after the click. |
click_at | Click at specific viewport coordinates (x, y). Use for canvas, maps, or SVGs. Returns page_state. |
fill_form | Fill form fields by selector or name, with optional auto-submit. Returns page_state. |
select_option | Select a dropdown option by value, label, or index. Returns page_state. |
hover | Hover over an element to reveal tooltips or menus. Returns page_state. |
press_key | Press a keyboard key (Enter, Escape, Tab, etc.), optionally targeting an element. Returns page_state. |
execute_js | Run arbitrary JavaScript in the page context and return the result. |
| Tool | Description |
|---|---|
page_map | Get the page's structural map: headings, landmarks, forms, links, and interactive elements (with selectors and state). Supports scope to query within a specific element (e.g. a modal). |
read_content | Extract text by heading name or CSS selector, with offset/limit pagination for large pages. |
list_resources | List all links, images, and forms on the current page. |
screenshot | Capture a full-page screenshot (base64 PNG). |
save_file | Download a URL to the output directory (path traversal protected). |
| Tool | Description |
|---|---|
fork | Spawn a sub-agent on a new browser tab with its own goal and step budget. |
wait_for_subagents | Wait for specific or all sub-agents to finish and collect results. |
subagent_status | Check the status and results of one or all active sub-agents without blocking. |
cancel_subagent | Cancel a running sub-agent by ID. |
Interaction tools (click, click_at, fill_form, hover, press_key, go_back, scroll, switch_tab, select_option) all return a page_state object. There are two variants:
Full page_state, returned after the first interaction on a URL, or when changes are too extensive to diff:
{
"url": "https://example.com/page",
"title": "Page Title",
"page_map": {
"headings": [{ "level": 1, "text": "...", "id": "...", "selector": "...", "char_count": 0, "preview": "..." }],
"landmarks": [{ "tag": "nav", "role": "navigation", "id": "...", "selector": "...", "text_preview": "..." }],
"links": [{ "text": "...", "href": "...", "selector": "..." }],
"interactive": { "counts": { "buttons": 0, "inputs": 0, "selects": 0, "textareas": 0, "total": 0 }, "elements": [] },
"meta": { "title": "...", "url": "...", "description": "..." },
"truncated_links": false,
"truncated_forms": false,
"truncated_landmarks": false
}
}
Diff page_state, returned on subsequent interactions on the same URL, showing only what changed:
{
"url": "https://example.com/page",
"title": "Page Title",
"changed": true,
"changes": {
"added_headings": [],
"removed_headings": [],
"added_links": [],
"removed_links": [],
"added_landmarks": [],
"removed_landmarks": [],
"added_interactive": [{ "selector": "...", "tag": "...", "text": "..." }],
"removed_interactive": [],
"modified_interactive": [{ "selector": "...", "tag": "...", "text": "...", "state_changes": { "aria_expanded": "true" } }]
}
}
If nothing changed, { "url": "...", "title": "...", "changed": false } is returned. On bridge failure, { "url": "unknown", "title": "unknown", "page_map": null }. Caps: max 50 links, 20 landmarks per page_state.
The agent can fork child agents to crawl multiple pages concurrently. Each child gets its own browser tab, step budget, and independent state.
| Setting | Default | Description |
|---|---|---|
max_concurrent_per_parent | 5 | Max children running in parallel per parent |
max_fork_depth | 3 | Max nesting depth (agents forking agents) |
max_total_agents | 10 | Global cap across all parents |
fork_child_max_steps | 15 | Step budget per child agent |
fork_wait_timeout_secs | 60 | Timeout waiting for sub-agents |
Before a child agent is spawned, its scope is registered in a shared claim registry. This prevents two sibling agents from crawling the same URL simultaneously.
Rules:
SinglePage (exact URL), UrlList (all-or-nothing batch), UrlPattern (regex, checked for overlap with all existing claims).UrlList, it is silently deduplicated (the LLM sometimes produces duplicates).Claiming is automatic, it happens inside fork before the child starts, not inside navigate. The agent does not call it explicitly.
Every navigate call goes through a two-tier fetch router:
__next_data__, __nuxt, __vue, ng-app, _react, data-reactroot/login, /signin, /auth, /oauth, accounts.google.com<noscript> tagWhen --no-headless / --headed is set, all fetches go directly through the browser.
| Category | Provider | Auth | Env Var |
|---|---|---|---|
| Popular | Anthropic | API key | ANTHROPIC_API_KEY |
| OpenAI | API key | OPENAI_API_KEY | |
| Google Gemini | API key | GEMINI_API_KEY | |
| DeepSeek | API key | DEEPSEEK_API_KEY | |
| Enterprise | Amazon Bedrock | AWS SigV4 | AWS_ACCESS_KEY_ID |
| Azure OpenAI | Azure API key | AZURE_OPENAI_API_KEY | |
| Google Vertex AI | GCP service account | GOOGLE_APPLICATION_CREDENTIALS | |
| GitHub Copilot | Device OAuth | — | |
| SAP AI Core | API key | SAP_AI_CORE_API_KEY | |
| GitLab Duo | GitLab token | GITLAB_TOKEN | |
| OSS Hosting | Groq | API key | GROQ_API_KEY |
| Cerebras | API key | CEREBRAS_API_KEY | |
| DeepInfra | API key | DEEPINFRA_API_KEY | |
| Together AI | API key | TOGETHER_API_KEY | |
| Mistral AI | API key | MISTRAL_API_KEY | |
| Specialized | Perplexity | API key | PERPLEXITY_API_KEY |
| xAI (Grok) | API key | XAI_API_KEY | |
| Cohere | API key | COHERE_API_KEY | |
| Alibaba (DashScope) | API key | DASHSCOPE_API_KEY | |
| Gateways | OpenRouter | API key | OPENROUTER_API_KEY |
| Vercel AI | API key | VERCEL_API_KEY | |
| Cloudflare Workers AI | API token | CLOUDFLARE_API_TOKEN | |
| Cloudflare AI Gateway | API token | CLOUDFLARE_API_TOKEN | |
| Other | Venice AI | API key | VENICE_API_KEY |
| Custom (OpenAI-compatible) | API key (optional) | — |
To use any OpenAI-compatible endpoint (Ollama, LMStudio, vLLM, a local proxy, etc.):
acrawl auth other
You'll be prompted for a base URL and an optional API key. Examples:
| Setup | Base URL | Model string |
|---|---|---|
| Ollama (local) | http://localhost:11434/v1 | other/llama3.2 |
| LMStudio | http://localhost:1234/v1 | other/local-model |
| vLLM | http://localhost:8000/v1 | other/meta-llama/Llama-3.1-8B |
| Any OpenAI-compatible API | Your endpoint | other/<model-id> |
This creates a credentials.json entry with auth_method: "api_key" and your base_url. Leave the API key blank if your server doesn't require one.
Models use the provider/model-id format: anthropic/claude-sonnet-4-6, openai/gpt-4o, amazon-bedrock/anthropic.claude-sonnet-4-6-20250514-v1:0, etc.
GitHub Copilot, device code flow:
acrawl auth copilot
https://github.com/login/device) and an 8-character user code, and attempts to open your browser automatically.credentials.json.No API key is needed, the entire flow is interactive. If authorization succeeds, credentials are stored automatically.
GitLab Duo, API key:
acrawl auth gitlab
Prompts for a GitLab Personal Access Token (PAT). Paste your token and press Enter. No browser redirect required. Note: GitLab Duo does not support tool calling.
Amazon Bedrock, AWS credentials:
acrawl auth amazon-bedrock
Prompts for AWS Access Key ID, AWS Secret Access Key, and region (default: us-east-1). These are stored directly in credentials.json and used to sign requests with AWS SigV4.
Azure OpenAI:
acrawl auth azure
Prompts for Resource Name (e.g. myresource), Deployment Name (e.g. gpt-4o), and API key.
The default interface is a full terminal UI with:
/ to see all commands with Tab completion/model opens a searchable list grouped by provider category/auth walks through provider setup interactively/debug toggles raw tool call input/output in the transcriptCtrl+T cycles through high/medium/low for reasoning models (o3, o4-mini)Keybindings:
| Key | Action |
|---|---|
Enter | Submit prompt |
Shift+Enter / Ctrl+J | Insert newline |
PageUp / PageDown | Scroll transcript |
Ctrl+T | Cycle reasoning effort |
Ctrl+C | Interrupt task (busy) or exit (idle) |
Esc Esc | Interrupt task (double-tap while busy) |
Tab | Auto-complete slash command |
Running acrawl without a TTY on stdout (e.g. piped or redirected) exits with an error pointing at acrawl prompt for one-shot use and acrawl --resume for session maintenance.
--resume session.json reloads a conversation. Resume-safe slash commands (/status, /compact, /cost, /config, /version, /export, /help, /clear) can be appended to the command line./export [file] writes a human-readable markdown transcript.ToolResult is ever left without its matching ToolUse.[… output truncated from N chars] marker before the preserved window is calculated.compaction_llm_summarization: true in settings.json, this sends the removed messages to the model and uses its output as the summary, with a fallback to the template if the LLM fails./session list to browse, /session switch <id> to switch.Use --allowedTools to restrict which tools the agent can invoke (comma-separated, flag is repeatable):
acrawl prompt "scrape titles" --allowedTools navigate,read_content,screenshot
Omit --allowedTools to allow all 21 tools. Useful for locking down a crawl to read-only tools or excluding fork/wait_for_subagents when sub-agent parallelism is not desired.
acrawl supports Model Context Protocol servers as a client, allowing you to extend the agent with custom tools. MCP tools are namespaced as server_name__tool_name and available alongside the built-in 21.
Supported transports: stdio, SSE, HTTP, WebSocket.
acrawl mcp starts a built-in MCP server that exposes acrawl's browser automation capabilities to external agents like Claude Code, Cursor, VS Code, Zed, JetBrains, TRAE, Gemini CLI, or any MCP-compatible client.
The server provides 18 tools in two modes:
Direct browser tools (17) — fine-grained control for clients that orchestrate themselves:
navigate, click, click_at, fill_form, page_map, read_content, screenshot, go_back, scroll, wait, select_option, execute_js, hover, press_key, switch_tab, list_resources, save_file
Autonomous agent (1) — delegate a full crawl task:
run_goal — Execute a high-level crawl goal autonomously. The agent plans, navigates, and extracts data using its own LLM loop. Requires ~/.acrawl/credentials.json configured with a model.Transport: stdio only (no SSE / HTTP / WebSocket in this release).
acrawl mcp install
Interactive installer that auto-detects your IDEs, lets you toggle which to configure (Space to select, Enter to confirm), and writes the correct config for each. Supports global (user-level) and project-level scopes.
Supported clients: Claude Code, Claude Desktop, Cursor, Windsurf, VS Code (Copilot), OpenCode, Zed, TRAE, JetBrains IDEs, Gemini CLI, Qwen Code, Codex CLI, Hermes, OpenClaw, Goose, Crush, Aider.
If you prefer to configure manually, add this to your IDE's MCP config file:
| IDE | Config file | Configuration |
|---|---|---|
| Claude Code | .mcp.json (project)~/.claude.json (user) |
|
| Cursor | .cursor/mcp.json | |
| Windsurf | ~/.codeium/windsurf/mcp_config.json | |
| Claude Desktop | %APPDATA%\Claude\claude_desktop_config.json (Win)~/Library/Application Support/Claude/claude_desktop_config.json (Mac) | |
| TRAE | .trae/mcp.json | |
| Gemini CLI | ~/.gemini/settings.json | |
| Qwen Code | ~/.qwen/settings.json | |
| VS Code (Copilot) | .vscode/mcp.json |
|
| OpenCode | opencode.json |
|
| Zed | ~/.config/zed/settings.json |
|
Or via the Claude Code CLI directly:
claude mcp add acrawl -- acrawl mcp
The browser tools share a persistent session across calls. run_goal creates its own isolated agent and browser.
Requirements: The 17 browser tools work without any configuration. run_goal requires ~/.acrawl/credentials.json (via acrawl auth) for its internal LLM.
acrawl [OPTIONS] [COMMAND]
Commands:
prompt <text> Run a single goal non-interactively
mcp Start MCP server (stdio transport)
mcp install Install MCP config into your IDEs interactively
auth [provider] Configure provider credentials
system-prompt Print the system prompt (for debugging)
Options:
--model MODEL Model in provider/id format (e.g. anthropic/claude-sonnet-4-6)
--output-format FORMAT text | json
--resume FILE Resume a saved session (with optional /commands)
--compact Compact history on resume
--headless[=BOOL] Force browser headless on/off
--no-headless, --headed Launch browser in visible mode
--allowedTools TOOLS Restrict available tools (comma-separated, repeatable)
-p TEXT Shorthand for prompt mode
-V, --version Print version
| Command | Description | Resume-safe |
|---|---|---|
/help | List available commands | Yes |
/status | Session info — model, tokens, cost | Yes |
/model [name] | Show or switch the active model | No |
/compact | Compact conversation history | Yes |
/clear | Start a fresh session | Yes |
/cost | Detailed cost breakdown | Yes |
/sessions | Open the session picker (TUI) | No |
/export [file] | Export conversation to markdown | Yes |
/config [model] | View acrawl config | Yes |
/auth [provider] | Configure credentials | No |
/headed | Switch to visible browser | No |
/headless | Switch to headless browser | No |
/extension [stop] | Start/show the extension bridge, or stop it | No |
/cloakbrowser | Switch back to CloakBrowser mode | No |
/debug | Show debug details for the last browser tool call | No |
/version | Version and build info | Yes |
/exit | Exit and save session | No |
All config lives in ~/.acrawl/ (override with ACRAWL_CONFIG_HOME).
credentials.jsonManaged via acrawl auth. Stores per-provider:
| Field | Description |
|---|---|
active_provider | Currently selected provider |
auth_method | api_key, oauth, or aws_sigv4 |
api_key | Provider API key |
oauth | OAuth tokens — access, refresh, expiry, scopes |
default_model | Default model for this provider |
base_url | Custom API endpoint (e.g. local Ollama, Azure resource) |
Azure additionally requires resource_name and deployment_name. Bedrock requires aws_access_key_id, aws_secret_access_key, and region. Vertex requires gcp_project_id and gcp_region.
settings.jsonCreated with defaults on first run.
| Field | Default | Description |
|---|---|---|
headless | true | Run browser without a visible window |
max_steps | 50 | Max agent loop iterations per goal |
output_dir | "output" | Where save_file writes output |
auto_compact_input_tokens | 200000 | Token threshold for auto-compaction |
reasoning_effort | "high" | For reasoning models: high / medium / low |
max_concurrent_per_parent | 5 | Max concurrent sub-agents per parent |
max_fork_depth | 3 | Max nesting depth for forked agents |
max_total_agents | 10 | Global cap on total agents |
fork_child_max_steps | 15 | Step budget for each child agent |
fork_wait_timeout_secs | 60 | Timeout for wait_for_subagents |
browser_backend | null | Active browser backend: "extension" or null (CloakBrowser) |
extension_bridge_port | 19876 | Port for Chrome extension bridge WebSocket server |
All fields are optional; omitting a field uses the default. The optimization block accepts a nested object with the following fields (all default to false/0/null, safe to omit entirely):
| Field | Default | Description |
|---|---|---|
html_diff_mode | false | On repeated visits to the same URL, returns only changed content sections with [unchanged: N sections] markers. 50 to 70% token reduction on multi-turn sessions. No behavior change on first visit. |
content_aware_profiles | false | Auto-selects a cleaning profile based on the task keyword: ReadingMode for extraction tasks, Minimal for interaction tasks, Aggressive for content > 50KB. |
loop_detection | false | Detects repeated identical actions and injects escalating nudges (soft, medium, strong). Also detects page stagnation. |
loop_detection_window | 20 | Rolling window size for action hash comparison. |
loop_nudge_threshold | 5 | Number of repeated actions before first nudge fires. |
page_fingerprinting | false | Enables lightweight page fingerprints used by loop detection and action caching. |
failure_classification | false | Classifies errors into 16 categories (SelectorNotFound, CaptchaDetected, RateLimited, etc.) using keyword matching. Zero LLM cost. |
self_healing | false | On SelectorNotFound/SelectorAmbiguous, fetches a fresh page_map and text-matches to a replacement element ref. Logs [healed: @eOLD -> @eNEW]. Zero LLM calls. |
self_healing_max_retries | 2 | Max healing attempts per failed action. |
action_caching | false | Caches results of read-only tools (page_map, read_content, list_resources, execute_js) keyed by tool + input + page fingerprint. Cache is invalidated when the page changes. |
action_cache_ttl_secs | 30 | Cache entry TTL in seconds. |
planning_interval | 0 | Every N steps, injects a planning checkpoint into the system prompt. 0 = disabled. |
confidence_tracking | false | Asks the LLM to self-report confidence after each action ([confidence: HIGH/MEDIUM/LOW]). Two consecutive LOWs trigger a stagnation alert. |
compound_enrichment | false | Adds enrichment metadata to complex form controls in page_map: date format hints, range min/max/value, select option lists (max 20 + overflow count), file accept types, textarea maxlength. Max 200 bytes per element. |
budget_max_session_cost_usd | null | Session cost limit in USD. Null = no limit. |
budget_enforcement | null | How to enforce the budget: warn injects a warning into the prompt; block terminates the session when the limit is reached. |
budget_warn_threshold_pct | 80 | Percentage of budget at which warnings start. |
per_agent_cost_tracking | false | When ON, /cost shows a per-child-agent cost breakdown. |
| Variable | Description |
|---|---|
ACRAWL_CONFIG_HOME | Override config directory (default: ~/.acrawl/) |
Provider-specific env vars (see provider table above) are read as fallbacks when no credentials.json entry exists.
acrawl ships 14 vendor-derived optimizations (sourced from browser-use, Stagehand, crawl4ai, Skyvern, Spider, nanobrowser, and ZeroClaw). All are disabled by default, enable selectively via settings.json.
Example settings.json with a cost-optimized profile:
{
"optimization": {
"html_diff_mode": true,
"action_caching": true,
"page_fingerprinting": true,
"loop_detection": true,
"self_healing": true,
"budget_max_session_cost_usd": 0.50,
"budget_enforcement": "warn"
}
}
| Optimization | Flag | Benefit |
|---|---|---|
| HTML Diff Mode | html_diff_mode | Reduces tokens by 50 to 70% on repeated visits by returning only changed content. |
| Content-Aware Profiles | content_aware_profiles | Auto-selects cleaning profiles (ReadingMode, Minimal, Aggressive) based on task. |
| Loop Detection | loop_detection | Prevents infinite loops by detecting repeated actions and injecting nudges. |
| Page Fingerprinting | page_fingerprinting | Generates lightweight page fingerprints for loop detection and action caching. |
| Failure Classification | failure_classification | Classifies errors into 16 categories using keyword matching with zero LLM cost. |
| Self-Healing | self_healing | Automatically heals broken selectors using text-matching with zero LLM calls. |
| Action Caching | action_caching | Caches read-only tool results to avoid redundant LLM calls. |
| Planning Interval | planning_interval | Injects periodic planning checkpoints to keep the agent focused. |
| Confidence Tracking | confidence_tracking | Tracks LLM self-reported confidence to alert on stagnation. |
| Compound Enrichment | compound_enrichment | Enriches complex form controls in the page map with metadata. |
| Budget Limit | budget_max_session_cost_usd | Sets a hard session cost limit in USD to prevent runaway costs. |
| Budget Enforcement | budget_enforcement | Controls whether to warn or block when the session budget is reached. |
| Budget Warning | budget_warn_threshold_pct | Triggers warnings when a percentage of the budget is consumed. |
| Per-Agent Cost Tracking | per_agent_cost_tracking | Breaks down costs per child agent in the /cost command. |
acrawl works well on most public web content, but some situations are outside what the agent can reliably handle:
| Scenario | Behavior |
|---|---|
| CAPTCHA / bot challenges | CloakBrowser uses stealth techniques to avoid bot detection, but unsolvable CAPTCHAs (image puzzles, Cloudflare Turnstile requiring proof-of-work) will block progress. Use the real-browser extension (/extension) where your browser already has a trusted session. |
| SMS / TOTP 2FA | The agent can fill in a 2FA code if you paste it into the REPL, but it cannot receive or generate codes itself. |
| Login-walled content | For sites where you must be logged in, use the extension mode so the agent operates in your existing authenticated browser session. |
| Single-page apps that load content on scroll | The agent can scroll to trigger lazy loading, but infinite-scroll feeds with no end condition may require explicit step limits. |
| PDF and binary file content | save_file downloads any URL to disk. The agent cannot read the text content of a saved PDF, use navigate on a URL that serves HTML, or pipe the download through a text extractor externally. |
| WebGL / canvas fingerprinting | Some anti-bot systems fingerprint the GPU via WebGL. CloakBrowser mitigates common checks but cannot spoof hardware-level fingerprints. |
| Sites that require a real mouse trajectory | Bot-detection systems that analyse mouse movement patterns may flag headless browser interactions. Switch to extension mode for these sites. |
flowchart LR
Goal([Goal\nnatural language]) --> Plan
Plan --> Navigate --> Observe --> Act --> Extract
Extract -->|repeat until done| Plan
Extract --> Output([Output\nJSON / CSV])
navigate hits the FetchRouter, which tries HTTP first and auto-escalates to a headless Chromium browser when JavaScript, auth redirects, or framework markers are detected./extension command) using CDP over a local WebSocket bridge.fork child agents onto separate browser tabs, each with independent state and step budgets. wait_for_subagents collects results; cancel_subagent aborts a running child; subagent_status polls without blocking.crates/
core/ Shared types, traits, error hierarchy (acrawl-core)
api/ 25 provider clients (Anthropic, OpenAI, Gemini, DeepSeek, Bedrock, Azure, ...), SSE streaming
browser/ PlaywrightBridge, ExtensionBridge, FetchRouter, BrowserContext, WsBridgeServer
agent/ 21 tools, agent loop, sub-agent fork/join, CrawlState
runtime/ ConversationRuntime, config, sessions, MCP client stack, OAuth PKCE
render/ Markdown rendering, tool output formatting, OutputSink
mcp-server/ Built-in MCP server (JSON-RPC over stdio), IDE installer
tui/ Ratatui terminal UI (acrawl-tui)
ui/ Shared application layer (LiveCli, session management, tool executor, auth)
cli/ Thin binary entry point (main.rs, self_update.rs, uninstall.rs)
commands/ 17 slash commands with resume-safety annotations
11 crates, ~40K lines of Rust, 1,097 tests.
cargo build --release # build
cargo test --workspace # run all tests
cargo clippy --workspace --all-targets -- -D warnings # lint (pedantic)
cargo fmt --check # format check
See CONTRIBUTING.md for the full development guide.
See CHANGELOG.md.
See SECURITY.md for the security policy and how to report vulnerabilities.
therealtimex/browser-use
jae-jae/fetcher-mcp
merajmehrabi/puppeteer-mcp-server
com.thenextgennexus/playwright-mcp-server
saik0s/mcp-browser-use