Brings local document extraction into Claude through a two-tier OCR pipeline paired with local LLMs served by vLLM or Ollama. The first tier runs PaddleOCR or EasyOCR on CPU; low-confidence results fall back to GOT-OCR2.0 or a VLM on GPU. You define extraction schemas as Pydantic models or use one of eight built-in schemas (including invoices, receipts, Korean tax forms, and bills of lading), and it returns validated JSON with confidence scores and cross-field checks such as check-digit verification and total-sum checks. The stdio transport exposes extract, ocr, validate, and batch commands. Reach for this when you need structured data from scanned documents without cloud APIs, or when you want custom validation rules, such as container number checksums or business registration verification, built into the extraction flow.
claude mcp add --transport stdio quartzunit-docpick uvx docpick
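
To make the cross-field checks mentioned above concrete, here is a minimal standard-library sketch of two such rules: an ISO 6346 container number check digit and a line-item sum check. The function names (`container_check_digit`, `is_valid_container_number`, `totals_match`) and the tolerance value are assumptions made for illustration. They are not docpick's API, and the tool's own validators may work differently.

```python
import re
import string
from decimal import Decimal


def _letter_values():
    """ISO 6346 letter values: A=10 upward, skipping multiples of 11."""
    values = {}
    n = 10
    for ch in string.ascii_uppercase:
        if n % 11 == 0:  # 11, 22 and 33 are not assigned to any letter
            n += 1
        values[ch] = n
        n += 1
    return values


LETTER_VALUES = _letter_values()


def container_check_digit(prefix10: str) -> int:
    """Compute the check digit for the first 10 characters of a container number."""
    total = 0
    for position, ch in enumerate(prefix10.upper()):
        value = int(ch) if ch.isdigit() else LETTER_VALUES[ch]
        total += value * (2 ** position)  # weights are 1, 2, 4, ..., 512
    return total % 11 % 10  # a remainder of 10 maps to check digit 0


def is_valid_container_number(number: str) -> bool:
    """Hypothetical validator: 4 letters + 7 digits, last digit is the check digit."""
    number = number.strip().upper()
    if not re.fullmatch(r"[A-Z]{4}[0-9]{7}", number):
        return False
    return container_check_digit(number[:10]) == int(number[10])


def totals_match(line_amounts, stated_total, tolerance=Decimal("0.01")) -> bool:
    """Hypothetical sum check: line items must add up to the stated total."""
    computed = sum((Decimal(a) for a in line_amounts), Decimal("0"))
    return abs(computed - Decimal(stated_total)) <= tolerance


if __name__ == "__main__":
    print(is_valid_container_number("CSQU3054383"))
    print(totals_match(["19.99", "5.01"], "25.00"))
```

Checks like these could be attached to a Pydantic schema as field or model validators, so that a failed checksum or a mismatched total is reported alongside the extracted JSON instead of passing through silently.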