Wraps Apple's Vision framework to run OCR, face detection, barcode reading, and image classification entirely on your Mac. Instead of sending a 44-page PDF to Claude as 73,500 tokens, it first extracts structured text locally (paragraphs, bounding boxes, reading order), which cuts the input to around 2,400 tokens, roughly a 30x reduction. Works with any MCP client over stdio. Exposes five tools: ocr_image for text extraction, detect_faces, detect_barcodes, classify_image, and analyze_document for full pipelines that return JSON the model can rebuild as Markdown or HTML. Requires macOS 13.0+ and runs offline after install. Useful when you're processing contracts, invoices, or medical records and want the files to stay local while still getting structured output the LLM can reason over.
claude mcp add --transport stdio woladi-macos-vision-mcp uvx macos-vision-mcp
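
To give a feel for the structured output, here is a minimal Python sketch that flattens an analyze_document-style result into Markdown-ready text by following reading order. The JSON shape is an assumption made for illustration: the field names `paragraphs`, `text`, `order`, and `bbox`, and the sample values, are hypothetical and may not match what the server actually returns.

```python
import json

# Hypothetical analyze_document output. The real schema may differ:
# "paragraphs", "text", "order", and "bbox" are assumed names.
sample = json.loads("""
{
  "paragraphs": [
    {"text": "Payment is due within 30 days.", "order": 1,
     "bbox": [0.1, 0.6, 0.8, 0.05]},
    {"text": "Invoice summary", "order": 0,
     "bbox": [0.1, 0.9, 0.5, 0.05]}
  ]
}
""")


def to_markdown(doc):
    """Join paragraph texts in reading order, separated by blank lines."""
    paragraphs = sorted(doc["paragraphs"], key=lambda p: p["order"])
    return "\n\n".join(p["text"].strip() for p in paragraphs)


markdown = to_markdown(sample)
print(markdown)
```

In practice the model does this reconstruction itself from the JSON; the sketch only shows why reading order and paragraph boundaries are enough to recover a clean document without sending the page images.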