OCR Document Processor

4.3k installs · 60 stars

Summary

The ocr-document-processor skill extracts text and structured data from scanned images, PDFs, and handwritten documents using optical character recognition (OCR), supporting specialized parsing for receipts and business cards. It serves developers and data processors who need to convert unstructured visual documents into searchable, machine-readable formats like JSON, markdown, or HTML. The skill solves the problem of recovering legible text from low-quality or skewed scans while providing confidence assessments and specialized extraction modes for common document types.

Install to Claude Code

npx -y skills add dkyazzentwatwa/chatgpt-skills --skill ocr-document-processor --agent claude-code

Installs into .claude/skills of the current project.

Files

SKILL.md

OCR Document Processor

Handle OCR-heavy inputs where text must be recovered from images or scanned pages.

Use This For

OCR on images and scanned PDFs
Searchable PDF export
Structured extraction to text, markdown, JSON, or HTML
Table extraction from scanned material
Receipt parsing and business card parsing

Workflow

Decide whether plain OCR, structured extraction, or document-specific parsing is needed.
Preprocess noisy inputs before extraction when skew, blur, or shadows are present.
Use scripts/ocr_processor.py for core OCR tasks.
Use the focused helpers when the input is specialized:
- scripts/business_card_scanner.py
- scripts/receipt_scanner.py
Return confidence caveats when the source is low quality, rotated, handwritten, or multilingual.
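
The workflow above can be sketched as a small planning step that runs before any OCR. This is a minimal illustration, not the skill's actual code: only the three script paths come from the workflow, while `OcrPlan`, `plan_ocr`, the document-type keys, and the condition labels are hypothetical names chosen for the example. It does not invoke the scripts, since their command-line options are not documented here.

```python
from dataclasses import dataclass, field

# Script paths are taken from the workflow; all other names are hypothetical.
CORE_SCRIPT = "scripts/ocr_processor.py"
SPECIALIZED_SCRIPTS = {
    "business_card": "scripts/business_card_scanner.py",
    "receipt": "scripts/receipt_scanner.py",
}

# Conditions the workflow names as reasons to preprocess or to add a caveat.
PREPROCESS_TRIGGERS = {"skew", "blur", "shadows"}
CAVEAT_TRIGGERS = {"low_quality", "rotated", "handwritten", "multilingual"}


@dataclass
class OcrPlan:
    script: str        # which helper script to run
    preprocess: bool   # whether to clean up the input first
    caveats: list = field(default_factory=list)  # conditions to report back


def plan_ocr(doc_type: str, conditions: set) -> OcrPlan:
    """Pick a script and decide on preprocessing and caveats."""
    # Fall back to the core OCR script for anything that is not specialized.
    script = SPECIALIZED_SCRIPTS.get(doc_type, CORE_SCRIPT)
    preprocess = bool(conditions & PREPROCESS_TRIGGERS)
    caveats = sorted(conditions & CAVEAT_TRIGGERS)
    return OcrPlan(script=script, preprocess=preprocess, caveats=caveats)


if __name__ == "__main__":
    plan = plan_ocr("receipt", {"skew", "handwritten"})
    print(plan.script, plan.preprocess, plan.caveats)
```

Keeping the decision separate from the OCR call makes it easy to review which caveats will accompany a result before any text is extracted.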

Guardrails

Prefer explicit language selection when accuracy matters.
Do not claim fields are exact when OCR confidence is weak.
Route non-scanned digital PDFs to document-converter-suite instead of OCR by default.
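
One way to apply these guardrails is to attach a review flag to each extracted field and to route PDFs by whether they already carry a text layer. The sketch below is an assumption-laden illustration: `hedge_fields`, `route_pdf`, the `has_text_layer` input, and the 0.80 threshold are hypothetical and not part of the skill, and the sample values are made up. How a text layer is detected is left out of scope.

```python
# Hypothetical cutoff; a real threshold should be tuned per document type.
DEFAULT_THRESHOLD = 0.80


def hedge_fields(fields: dict, threshold: float = DEFAULT_THRESHOLD) -> dict:
    """Mark fields whose OCR confidence is too weak to report as exact.

    `fields` maps a field name to a (value, confidence) pair, with
    confidence expressed as a number between 0 and 1.
    """
    report = {}
    for name, (value, confidence) in fields.items():
        report[name] = {
            "value": value,
            "confidence": confidence,
            "needs_review": confidence < threshold,
        }
    return report


def route_pdf(has_text_layer: bool) -> str:
    """Send digital PDFs to document-converter-suite, scans to OCR."""
    if has_text_layer:
        return "document-converter-suite"
    return "ocr-document-processor"


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample = {"total": ("42.10", 0.95), "merchant": ("Cafe Lun4", 0.55)}
    for name, info in hedge_fields(sample).items():
        label = "uncertain" if info["needs_review"] else "ok"
        print(f"{name}: {info['value']} ({label})")
    print(route_pdf(has_text_layer=True))
```

Reporting the confidence alongside the value lets a downstream reader decide whether to verify a field against the source image.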
