This skill runs OpenAI's Whisper locally on your machine, so you can transcribe audio without sending files to an API. After the initial model download it works fully offline, and you get five model sizes to pick from depending on whether you care more about speed or accuracy. The base model (74M parameters) works fine for most recordings, but turbo (809M parameters) is probably the sweet spot if you have the disk space. Output is plain text by default, with timestamps and JSON available if you need structured data. It requires ffmpeg and uses a uv-managed Python environment, so setup is straightforward. A good option if you're dealing with sensitive audio or just don't want to pay per minute for transcription.
npx skills add https://github.com/thinkfleetai/thinkfleet-engine --skill local-whisper
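To give a feel for the three output modes mentioned above (plain text, timestamps, JSON), here is a minimal sketch. The skill's own command-line interface isn't shown here, so this calls the upstream `openai-whisper` Python package directly, which is an assumption about what runs underneath. The helper names `format_timestamp`, `render`, and `transcribe` are hypothetical and exist only for illustration.

```python
import json


def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def render(result: dict, mode: str = "text") -> str:
    """Format a Whisper-style result dict (hypothetical helper).

    `result` is expected to have a "text" string and a "segments" list whose
    items carry "start", "end", and "text", as returned by whisper's
    `model.transcribe()`.
    """
    if mode == "text":
        return result["text"].strip()
    if mode == "timestamps":
        return "\n".join(
            f"[{format_timestamp(s['start'])} -> {format_timestamp(s['end'])}] "
            f"{s['text'].strip()}"
            for s in result["segments"]
        )
    if mode == "json":
        segments = [
            {"start": s["start"], "end": s["end"], "text": s["text"].strip()}
            for s in result["segments"]
        ]
        return json.dumps(
            {"text": result["text"].strip(), "segments": segments}, indent=2
        )
    raise ValueError(f"unknown mode: {mode!r}")


def transcribe(path: str, model_name: str = "base") -> dict:
    """Transcribe a local audio file with the openai-whisper package.

    Assumes `openai-whisper` and ffmpeg are installed. The import is kept
    inside the function so the formatting helpers work without it.
    """
    import whisper

    model = whisper.load_model(model_name)  # downloads weights on first use
    return model.transcribe(path)
```

Something like `print(render(transcribe("meeting.mp3", "turbo"), "timestamps"))` would then print one line per segment, where `meeting.mp3` stands in for your own file. Keeping the formatting separate from the transcription call means you can switch output modes without running the model again.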