This Claude Code skill wraps DeepSeek's OCR vision model for extracting text from images, PDFs, and documents. It is built for production use with vLLM inference (around 2,500 tokens/s on an A100) and supports batch processing, dynamic resolutions, and multiple prompt modes, including document-to-markdown conversion, free OCR, and grounded text location. Setup is somewhat heavy: it pins specific CUDA and PyTorch versions, and you need to download a vLLM wheel from the project's releases. Once running, it handles table detection with whitelisted tokens and context optical compression for longer documents. It is a good option if you process documents at scale and want structured markdown output rather than raw text strings.
npx skills add https://github.com/aradotso/trending-skills --skill deepseek-ocr
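Once installed, the prompt modes described above map naturally onto vLLM's offline multimodal API. The sketch below is a hedged illustration, not the skill's actual code: the prompt strings, the `build_ocr_request` helper, and the mode names are assumptions based on the description, and the exact special tokens may differ between model releases.

```python
# Hedged sketch: building vLLM-style multimodal requests for DeepSeek-OCR.
# The prompt strings are assumed from the skill's described modes
# (document-to-markdown, free OCR, grounded text location) and may not
# match the model's exact token format.

def build_ocr_request(image, mode="markdown"):
    """Return a request dict for one image in the chosen prompt mode."""
    prompts = {
        "markdown": "<image>\nConvert the document to markdown.",
        "free": "<image>\nFree OCR.",
        "grounding": "<image>\nLocate the text in the image.",
    }
    return {
        "prompt": prompts[mode],
        "multi_modal_data": {"image": image},
    }

# With a GPU and the model weights available, requests like these can be
# batched through vLLM's offline API (assumed invocation):
#
#   from vllm import LLM, SamplingParams
#   llm = LLM(model="deepseek-ai/DeepSeek-OCR", trust_remote_code=True)
#   outputs = llm.generate(
#       [build_ocr_request(img) for img in images],
#       SamplingParams(max_tokens=4096, temperature=0.0),
#   )
```

Batching all pages of a PDF into a single `llm.generate` call is what lets vLLM's continuous batching reach the throughput figures quoted above.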