Adds offline speech-to-text and speaker diarization to Claude through whisper.cpp with zero cloud dependencies. Exposes five MCP tools: transcribe audio files to text, SRT, VTT, or JSON with word-level timestamps; generate live captions with partial and final events; identify who spoke when via sherpa-onnx diarization; capture from a microphone with RNNoise denoising and VAD segmentation; export to WAV or FLAC. The engine is a C++ pipeline wrapping PortAudio, FFmpeg, and Whisper, shipped as a Python package with stdio transport. Useful when audio must stay on-device for privacy or cost reasons, or when you need meeting transcripts with speaker labels without calling a commercial API.
claude mcp add --transport stdio io.github.chicogong-ffvoice uvx ffvoice
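
To make the SRT output format concrete, here is a minimal sketch of how timestamped segments map onto SRT cues. It is not ffvoice's implementation or API: the function names and the (start, end, text) tuple layout are hypothetical, chosen only for illustration. The cue layout itself (index, "HH:MM:SS,mmm --> HH:MM:SS,mmm", text, blank line) is the standard SRT convention.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms_total = round(seconds * 1000)
    hours, rem = divmod(ms_total, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Build SRT text from (start_seconds, end_seconds, text) tuples.

    The tuple layout is an assumption for this sketch, not ffvoice's
    actual segment type.
    """
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text.strip()}"
        )
    # Cues are separated by a blank line; the file ends with a newline.
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    demo = [(0.0, 1.25, " Hello "), (1.25, 2.0, "world")]
    print(segments_to_srt(demo), end="")
```

VTT differs mainly in its "WEBVTT" header line and in using a period rather than a comma before the milliseconds, so the same structure would carry over with small changes.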