This is a vision-model wrapper for extracting structured data from images. Drop in a chart screenshot, table, or diagram; it captions the image via `scripts/caption.py`, and you then parse the markdown table output into a DataFrame for analysis or export. Auto-detection is decent, but you'll want custom prompts for precision work. Built-in caching means repeat runs are free, and batch mode handles whole folders. The repo includes parsing helpers and a matplotlib setup with Chinese font configuration. The main gotcha is that large tables can be truncated, so you may need to caption them in chunks. It's fast, there's no API key juggling, and the JSON output mode is clean for scripting. Use it when you need numbers out of a PNG, not for photo editing or image generation.
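The parsing step mentioned above can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the repo's own parsing helper: the function name `parse_markdown_table` and the sample caption text are hypothetical, and it assumes the caption output contains a plain pipe-delimited markdown table with no escaped pipes inside cells.

```python
def parse_markdown_table(text: str) -> list[dict[str, str]]:
    """Parse the first pipe-delimited markdown table found in `text`.

    Hypothetical helper for illustration; returns one dict per data row,
    keyed by the header cells. All values are kept as strings.
    """
    rows: list[list[str]] = []
    for line in text.splitlines():
        line = line.strip()
        if not (line.startswith("|") and line.endswith("|")):
            if rows:
                break  # the first table has ended
            continue   # still in the text before the table
        # Drop the outer pipes, then split into trimmed cells.
        rows.append([cell.strip() for cell in line[1:-1].split("|")])

    if len(rows) < 2:
        return []

    def is_separator(cells: list[str]) -> bool:
        # Matches alignment rows such as |---|:--:|---:|
        return all(cell and set(cell) <= set("-: ") for cell in cells)

    header, body = rows[0], rows[1:]
    return [dict(zip(header, r)) for r in body if not is_separator(r)]


# Made-up sample standing in for caption output.
sample = """Extracted table from the chart:

| Quarter | Revenue |
|---|---:|
| Q1 | 120 |
| Q2 | 135 |

End of caption.
"""

records = parse_markdown_table(sample)
for record in records:
    print(record)
```

If pandas is installed, `pandas.DataFrame(records)` turns the result into a DataFrame; numeric columns would still need converting from strings, for example with `pandas.to_numeric`.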
npx -y skills add opensensenova/sensenova-skills --skill sn-da-image-caption --agent claude-code

Installs into .claude/skills of the current project.