If you work with Korean government documents or HWP files, this skill covers the entire parsing pipeline. It converts HWP 5.x, HWPX, and PDF to Markdown or structured IRBlock data, extracts tables with correct colspan handling, diffs documents across formats, and converts Markdown back to HWPX. Form field extraction pulls labeled data out of official documents. It ships as a library, a CLI, and an MCP server, so you can parse interactively in Claude or batch-process hundreds of files. An OCR plugin system handles image-based PDFs, where many parsers fail silently, and the parser reads the proprietary binary formats that trip up generic PDF tools.
npx skills add https://github.com/aradotso/trending-skills --skill kordoc-korean-document-parser
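
To make the colspan point concrete, here is a minimal sketch of what flattening a table with merged cells into Markdown involves. It is not kordoc's API or implementation: the `(text, colspan)` cell shape and the `expand_row` and `to_markdown` helpers are hypothetical names invented for illustration. Markdown tables have no merged cells, so this sketch keeps the text in the first column of each span and pads the remaining columns with empty cells.

```python
from typing import List, Tuple

# Hypothetical cell shape for illustration: (text, colspan).
# This is an assumption, not kordoc's IRBlock structure.
Cell = Tuple[str, int]


def expand_row(row: List[Cell]) -> List[str]:
    """Expand merged cells: text goes in the first column, the rest of the span is empty."""
    out: List[str] = []
    for text, colspan in row:
        out.append(text)
        out.extend([""] * (max(colspan, 1) - 1))
    return out


def to_markdown(rows: List[List[Cell]]) -> str:
    """Render rows as a Markdown table, treating the first row as the header."""
    if not rows:
        return ""
    grid = [expand_row(r) for r in rows]
    width = max(len(r) for r in grid)
    # Pad short rows so every line has the same number of columns.
    grid = [r + [""] * (width - len(r)) for r in grid]
    lines = ["| " + " | ".join(r) + " |" for r in grid]
    lines.insert(1, "|" + " --- |" * width)
    return "\n".join(lines)


if __name__ == "__main__":
    table = [
        [("Applicant", 2), ("Date", 1)],
        [("Name", 1), ("ID", 1), ("2024-01-01", 1)],
    ]
    print(to_markdown(table))
```

Without the padding step, the header row would have two columns and the body row three, and most Markdown renderers would misalign or drop the table.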