Connects Claude to Microsoft's Florence-2 vision model for local image processing. Exposes two tools: ocr, which extracts text from images and PDFs, and caption, which generates descriptive summaries of image content. Both work with local files and web URLs. Reach for this when you need offline OCR or want to generate alt text and image descriptions without calling external APIs. Ships as a pre-built bundle that installs into Claude Desktop, or runs via uvx for Goose and LM Studio. The model is downloaded on first run and runs locally from then on, so image processing stays on your machine.
claude mcp add --transport stdio jkawamoto-mcp-florence2 uvx mcp-florence2
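
To make the two tools described above more concrete, here is a minimal Python sketch of the JSON-RPC request an MCP client sends when it invokes one of them. The `tools/call` method and the `name`/`arguments` layout come from the MCP specification. The helper `build_tool_call` and the argument name `src` are hypothetical and used only for illustration; the real argument schema should be read from the server's `tools/list` response.

```python
import json


def build_tool_call(request_id, tool, sources):
    """Build an MCP `tools/call` JSON-RPC request as a plain dict.

    `tool` must be one of the two tool names the listing mentions.
    """
    if tool not in ("ocr", "caption"):
        raise ValueError(f"unknown tool: {tool}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": tool,
            # ASSUMPTION: "src" is a hypothetical argument name; check the
            # schema the server reports via tools/list before relying on it.
            "arguments": {"src": list(sources)},
        },
    }


if __name__ == "__main__":
    # Example inputs: one local file and one web URL (both placeholders).
    request = build_tool_call(
        1, "ocr", ["./scan.png", "https://example.com/page.pdf"]
    )
    print(json.dumps(request, indent=2))
```

In practice the MCP client (Claude Desktop, Goose, LM Studio) builds and sends this message for you; the sketch only shows the shape of the call.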