Transcribes audio and video to text using Qwen3-ASR via one of two paths: local MLX inference on Apple Silicon Macs (15-27x realtime, no API key) or a remote API on other platforms. The skill auto-detects your hardware and recommends the best mode, handles long recordings without truncation (a common gotcha with the default max_tokens setting), and includes fallback chunking strategies when needed. It works with meetings, lectures, podcasts, screen recordings, and any other audio or video you want converted to text. The bundled scripts handle extraction, model loading, and cleanup. After transcription, it prompts you to run a separate fixer skill, since raw ASR output always has recognition errors. A solid choice if you need offline transcription on a Mac or want to point at your own vLLM server.
npx skills add https://github.com/daymade/claude-code-skills --skill asr-transcribe-to-text
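
To illustrate the hardware auto-detection mentioned in the description, here is a minimal sketch of how a mode recommendation could be made. It is not the skill's actual code: the function name `recommend_mode` and the mode labels `"local-mlx"` and `"remote-api"` are hypothetical, and the only check assumed is "macOS on an arm64 CPU means Apple Silicon".

```python
import platform
from typing import Optional


def recommend_mode(system: Optional[str] = None, machine: Optional[str] = None) -> str:
    """Recommend a transcription path from the host platform.

    Hypothetical helper: the labels "local-mlx" and "remote-api" are
    illustrative and are not taken from the skill's scripts.
    """
    # Fall back to the running interpreter's view of the host when no
    # explicit values are passed (explicit values make the logic testable).
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine

    # Assumption: MLX inference needs macOS ("Darwin") on an arm64 CPU.
    if system == "Darwin" and machine == "arm64":
        return "local-mlx"
    # Everything else (Linux, Windows, Intel Macs) goes to the remote API.
    return "remote-api"


if __name__ == "__main__":
    print(recommend_mode())
```

Passing the platform values as optional arguments keeps the decision logic separate from the host it runs on, so the same function can be checked against combinations you do not have on hand.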