Adds three PDF tools to Claude: list_pdfs for finding files with glob patterns, read_pdf for extracting content as ordered blocks, and grep_pdf for text search. The smart bit is corruption detection: extraction starts in text mode, then automatically switches to image mode when it hits encoding garbage or mangled output. You can force text_only or image_only mode and configure page limits to avoid token explosions. Junk images like logos and headers are filtered out. Results come back as JSON with page numbers and content blocks in reading order. Install via pip or uvx, then point Claude Desktop at the server.py script. The grep_pdf search tool requires pdfgrep to be installed.
claude mcp add --transport stdio io.github.pyjudge-pdf4vllm -- uvx pdf4vllm-mcp
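
To give a feel for the text-first, image-fallback idea described above, here is a minimal Python sketch of one way such a check could work. It is not the server's actual implementation: the function names `garbage_ratio` and `choose_mode`, the set of "suspicious" character classes, and the 0.2 threshold are all assumptions made for illustration.

```python
import unicodedata


def garbage_ratio(text: str) -> float:
    """Return the fraction of non-whitespace characters that look like debris.

    Hypothetical heuristic: counts the Unicode replacement character plus
    control, private-use, and unassigned code points.
    """
    chars = [c for c in text if not c.isspace()]
    if not chars:
        # No extractable text at all: treat the page as unusable in text mode.
        return 1.0
    bad = 0
    for c in chars:
        category = unicodedata.category(c)
        if c == "\ufffd" or category in ("Cc", "Co", "Cn"):
            bad += 1
    return bad / len(chars)


def choose_mode(text: str, threshold: float = 0.2) -> str:
    """Pick "text" when extraction looks clean, otherwise fall back to "image".

    The threshold is an assumed value, not one taken from the project.
    """
    return "image" if garbage_ratio(text) > threshold else "text"


if __name__ == "__main__":
    print(choose_mode("Quarterly results improved across all regions."))
    print(choose_mode("\ufffd\ufffd\x01\x02ab"))
```

A ratio-based check like this is cheap to run per page, so the fallback decision can be made page by page rather than for the whole document.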