Wraps OpenAI's Whisper model in a Python CLI for local audio and video transcription. You get single-file transcription, batch processing, and multiple output formats, including SRT and VTT for subtitles. The model selection guide is helpful: it covers tiny through large with clear tradeoffs between speed and accuracy. Most useful for content teams turning podcasts into blog posts or generating YouTube captions without sending files to third-party APIs. Requires ffmpeg and some Python setup, but once it is running, batch work is straightforward. The 10x speedup with a GPU is real and pays off if you regularly process hours of content.
npx skills add https://github.com/guia-matthieu/clawfu-skills --skill whisper-transcription
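
To make the SRT output concrete, here is a minimal sketch of how timed transcript segments can be turned into subtitle text. It is not the skill's own code: the helper names `srt_timestamp` and `segments_to_srt` are hypothetical, and the input shape (dicts with `start`, `end`, and `text`, times in seconds) is assumed to match the segments returned by the open-source `openai-whisper` package.

```python
# Hypothetical helpers for illustration; not taken from the skill itself.

def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Render segments (dicts with 'start', 'end', 'text') as SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        # Each cue: sequence number, time range, then the caption text.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)


# Assumed usage with the openai-whisper package (needs ffmpeg installed):
#   import whisper
#   result = whisper.load_model("base").transcribe("episode.mp3")
#   print(segments_to_srt(result["segments"]))
```

VTT differs mainly in its `WEBVTT` header line and in using a period rather than a comma before the milliseconds, so the same structure carries over with small changes.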