This is a solid OCR wrapper for the common case of extracting text from images and PDFs without fussing with API keys or cloud services. It runs locally using PaddleOCR by default, with an option to use DeepSeek-OCR for more complex tasks like table extraction. The prompt option is clever: you can ask it to structure output as markdown tables or JSON instead of just dumping raw text. It supports 100+ languages and the usual image formats. If you're building agents that need to read screenshots, process scanned documents, or pull data from charts, this handles the annoying parts so you don't have to wire up OCR libraries yourself.
npx skills add https://github.com/mr-shaper/opencode-skills-paddle-ocr --skill ocr
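
If you use the prompt option to get a markdown table back, an agent will usually want that table as structured records. Here is a minimal sketch of that post-processing step in plain Python. It does not call the skill itself: the `sample` string is made-up stand-in output, and `parse_markdown_table` is a hypothetical helper, not part of the skill.

```python
import json


def parse_markdown_table(text: str) -> list[dict[str, str]]:
    """Turn a markdown pipe table into a list of row dicts (hypothetical helper)."""
    rows = []
    for line in text.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # ignore any prose around the table
        # Drop the outer pipes, then split the remaining text into cells.
        rows.append([cell.strip() for cell in line.strip("|").split("|")])

    if len(rows) < 2:
        return []  # need at least a header row and one more row

    header, body = rows[0], rows[1:]
    # Skip the separator row (e.g. |---|:--:|) if it follows the header.
    if body and all(cell and set(cell) <= set("-:") for cell in body[0]):
        body = body[1:]
    return [dict(zip(header, row)) for row in body]


# Made-up example of table-shaped OCR output (assumption, not real skill output).
sample = """
| Item | Qty | Price |
|------|-----|-------|
| Widget | 2 | 9.50 |
| Gadget | 1 | 20.00 |
"""

records = parse_markdown_table(sample)
print(json.dumps(records, indent=2))
```

Values stay as strings here on purpose, since OCR output can contain misread digits and it is safer to validate before converting to numbers.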