This OCR skill gives you six engines to choose from depending on your needs. Local options like Tesseract and EasyOCR keep processing on your machine for privacy, PaddleOCR is the pick for Chinese/Japanese/Korean text or tables, and cloud services like Google Vision or AWS Textract are there when you need maximum accuracy. The preprocessing pipeline corrects the usual image problems (skew, noise, poor contrast) that tank OCR accuracy on real-world photos. It comes with Python and Node.js implementations, and the structured output parsing is useful for invoices and forms where you need more than raw text. If you do any serious text extraction work, having all these engines in one place saves a lot of integration headache.
npx skills add https://github.com/fearovex/claude-config --skill image-ocr
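
To make the engine trade-offs described above concrete, here is a minimal Python sketch of how a caller might choose an engine per job. It is not the skill's actual API: `OcrRequest`, `choose_engine`, the engine identifier strings, and the tie-breaking defaults (Tesseract over EasyOCR, Google Vision over AWS Textract) are all hypothetical and only encode the rules of thumb from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrRequest:
    """Hypothetical description of one OCR job (not part of the skill)."""
    must_stay_local: bool = False     # privacy: the image may not leave the machine
    has_cjk_text: bool = False        # Chinese/Japanese/Korean content
    has_tables: bool = False          # tabular layout that should be recovered
    needs_max_accuracy: bool = False  # accuracy matters more than cost or latency


def choose_engine(req: OcrRequest) -> str:
    """Return an engine name following the trade-offs in the description."""
    # CJK text and tables are the cases the description assigns to PaddleOCR.
    # PaddleOCR runs locally, so it also satisfies a privacy constraint.
    if req.has_cjk_text or req.has_tables:
        return "paddleocr"
    # Cloud services are considered only when the image may leave the machine.
    # Assumption: Google Vision is picked arbitrarily over AWS Textract here.
    if req.needs_max_accuracy and not req.must_stay_local:
        return "google_vision"
    # Assumption: Tesseract is the default local engine; EasyOCR would be an
    # equally valid choice under the description's wording.
    return "tesseract"


if __name__ == "__main__":
    print(choose_engine(OcrRequest(has_tables=True)))          # paddleocr
    print(choose_engine(OcrRequest(needs_max_accuracy=True)))  # google_vision
    print(choose_engine(OcrRequest(must_stay_local=True)))     # tesseract
```

Keeping the decision in one small function makes it easy to adjust the defaults once you have measured which engine performs best on your own documents.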