Three preprocessing commands that generate assets for video compositions: text-to-speech with Kokoro (54 voices, 9 languages, runs locally), Whisper transcription (word-level timestamps for captions), and background removal with u2net (transparent cutouts for overlay work). Each tool downloads its model on first run and caches it under ~/.cache/hyperframes/. The real value is in the output formats: TTS produces clean WAV files ready to drop into a timeline; transcription normalizes everything to the same JSON shape whether you're importing SRT or VTT or running fresh inference; background removal can emit both the cutout and the inverse plate in one pass, which saves a step when you need layered composites. Useful if you're assembling programmatic video and need narration, captions, or transparent talking heads without calling external APIs.
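The actual JSON schema hyperframes emits isn't documented here, but the normalization idea can be sketched. The snippet below is a minimal, hypothetical example of collapsing SRT cues into one segment-level shape (`{"segments": [{"start", "end", "text"}]}`); the field names and structure are assumptions for illustration, not the tool's real schema.

```python
import re

# Assumed timestamp pattern: HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (VTT).
TS = re.compile(r"(\d+):(\d+):(\d+)[,.](\d+)")

def ts_to_seconds(ts: str) -> float:
    """Convert a subtitle timestamp to seconds as a float."""
    h, m, s, ms = map(int, TS.match(ts).groups())
    return h * 3600 + m * 60 + s + ms / 1000

def srt_to_segments(srt: str) -> dict:
    """Normalize SRT text into a hypothetical segment-list JSON shape."""
    segments = []
    for block in srt.strip().split("\n\n"):
        lines = block.splitlines()
        # Skip malformed blocks; line 2 of a cue holds "start --> end".
        if len(lines) < 2 or "-->" not in lines[1]:
            continue
        start, end = (p.strip() for p in lines[1].split("-->"))
        segments.append({
            "start": ts_to_seconds(start),
            "end": ts_to_seconds(end),
            "text": " ".join(lines[2:]).strip(),
        })
    return {"segments": segments}

demo = """1
00:00:00,000 --> 00:00:01,500
Hello world.

2
00:00:01,500 --> 00:00:03,250
Second caption line."""

print(srt_to_segments(demo))
```

Once captions from any source land in one shape like this, downstream timeline code only has to handle a single format, which is presumably the point of the tool's normalization step.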
npx skills add https://github.com/heygen-com/hyperframes --skill hyperframes-media