This Claude Code skill wraps DeepSeek's OCR vision model for extracting text from images, PDFs, and documents. It is built for production use with vLLM inference (around 2,500 tokens/s on an A100) and supports batch processing, dynamic resolutions, and multiple prompt modes, including document-to-markdown conversion, free OCR, and grounded text location. Setup is somewhat heavy: it pins specific CUDA and PyTorch versions, and you need to download a vLLM wheel from the project's releases. Once running, it handles table detection with whitelisted tokens and context optical compression for longer documents. It is a good option if you process documents at scale and want structured markdown output rather than raw text strings.
npx skills add https://github.com/aradotso/trending-skills --skill deepseek-ocr
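Once installed, the prompt modes described above map naturally onto vLLM's offline multimodal API. The sketch below is a hedged illustration, not the skill's actual code: the prompt strings, the `build_ocr_request` helper, and the mode names are assumptions based on the description, and the exact special tokens may differ between model releases.

```python
# Hedged sketch: building vLLM-style multimodal requests for DeepSeek-OCR.
# The prompt strings are assumed from the skill's described modes
# (document-to-markdown, free OCR, grounded text location) and may not
# match the model's exact token format.

def build_ocr_request(image, mode="markdown"):
    """Return a request dict for one image in the chosen prompt mode."""
    prompts = {
        "markdown": "<image>\nConvert the document to markdown.",
        "free": "<image>\nFree OCR.",
        "grounding": "<image>\nLocate the text in the image.",
    }
    return {
        "prompt": prompts[mode],
        "multi_modal_data": {"image": image},
    }

# With a GPU and the model weights available, requests like these can be
# batched through vLLM's offline API (assumed invocation):
#
#   from vllm import LLM, SamplingParams
#   llm = LLM(model="deepseek-ai/DeepSeek-OCR", trust_remote_code=True)
#   outputs = llm.generate(
#       [build_ocr_request(img) for img in images],
#       SamplingParams(max_tokens=4096, temperature=0.0),
#   )
```

Batching all pages of a PDF into a single `llm.generate` call is what lets vLLM's continuous batching reach the throughput figures quoted above.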