Wraps the Octen Extract API so Claude can pull live web pages as clean markdown, with three filters most extract tools skip: category labels (tech, health, finance), page structure flags (article, homepage, login wall, no main content), and query-driven highlights instead of full body dumps. The big win is filtering upstream. If your RAG pipeline fetches 100 URLs, perhaps 20 are index pages or paywalls, and you spend LLM tokens just to discover they are useless. Octen flags them at fetch time via page_structure, so you can skip the embedding step for those pages entirely. Pass a query parameter and you get ranked snippets per page instead of paying to process the full content. It supports batches of up to 20 URLs, a configurable cache TTL, and optional image or video URL extraction.
claude mcp add --transport stdio octen-team-octen-mcp uvx octen-mcp
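
The upstream filtering described above might look roughly like the following on the client side. This is a minimal sketch, not the Octen API: the result records are plain dicts, and the `page_structure` label values (`homepage`, `login_wall`, `no_main_content`) are assumed spellings that may differ from what the tool actually returns. The helper names `batches` and `keep_for_embedding` are hypothetical. Only the batch limit of 20 comes from the description.

```python
from typing import Iterable, Iterator

MAX_BATCH = 20  # batch limit stated in the description

# Assumed label spellings; the real page_structure vocabulary may differ.
SKIP_STRUCTURES = {"homepage", "login_wall", "no_main_content"}


def batches(urls: list[str], size: int = MAX_BATCH) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def keep_for_embedding(results: Iterable[dict]) -> list[dict]:
    """Drop records whose (assumed) page_structure marks them as low value.

    Records without a page_structure key are kept.
    """
    return [r for r in results if r.get("page_structure") not in SKIP_STRUCTURES]


if __name__ == "__main__":
    urls = [f"https://example.com/page/{i}" for i in range(100)]
    print(len(list(batches(urls))))  # number of requests needed for 100 URLs

    # Hand-written sample records standing in for extract results.
    sample = [
        {"url": "https://example.com/post", "page_structure": "article"},
        {"url": "https://example.com/", "page_structure": "homepage"},
        {"url": "https://example.com/account", "page_structure": "login_wall"},
    ]
    print([r["url"] for r in keep_for_embedding(sample)])
```

Filtering on a set of labels to skip, rather than a set to keep, means a page with an unfamiliar or missing label still reaches the embedding step instead of being dropped silently.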