Extracts text from documents, detects language, pulls keywords, and chunks content for RAG pipelines. Built by Intelagent as part of their open-source MCP collection. It runs over stdio and has a mock mode that works without dependencies, which is useful for testing. If you're preprocessing files before feeding them into an embedding model, or need to break PDFs and other documents into semantic chunks, this handles the grunt work. Check the file-processor package in their monorepo for the full tool list. It is meant to pair with their planned knowledge-grid server, which would cover the indexing and retrieval side of the stack.
claude mcp add --transport stdio io.github.intelagentstudios-mcp-file-processor -- npx -y @intelagent/mcp-file-processor
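
For a rough sense of what the chunking step in a RAG preprocessing pipeline involves, here is a minimal Python sketch. It is not the server's implementation or API: the function name `chunk_words` and its parameters are hypothetical, and it uses a fixed-size word window with overlap, a simpler technique than the semantic chunking described above.

```python
def chunk_words(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into overlapping chunks of at most `chunk_size` words.

    Hypothetical helper for illustration only; the file-processor server's
    actual chunking strategy and parameters are not documented here.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk made only of overlap words is emitted.
        if start + chunk_size >= len(words):
            break
    return chunks
```

Each chunk repeats the last `overlap` words of the previous one, so a sentence that straddles a boundary still appears intact in at least one chunk before embedding. Empty input yields an empty list.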