This is essentially an audio API router that saves you from juggling multiple services. Point it at text and it generates speech via ElevenLabs; ask for background music and it calls fal.ai's models; request sound effects and it picks the fastest available option. The routing logic is solid and it handles the async complexity of music generation, but you're still dependent on the reliability and cost of the external APIs. It's most useful when you're building something that needs several types of audio generation and you don't want to implement each service's quirks yourself. Setup requires an ElevenLabs key at minimum; fal.ai access unlocks the music features.
npx skills add https://github.com/pexoai/pexo-skills --skill videoagent-audio-studio
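
To make the routing idea concrete, here is a minimal sketch of how a request type might be mapped to a provider based on which API keys are configured. It is not the skill's actual code: the `Provider` class, the `pick_provider` function, the environment variable names, and the latency figures are all assumptions made up for illustration, as is the idea that both providers can serve sound effects.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Provider:
    name: str
    env_key: str          # hypothetical environment variable name
    est_latency_s: float  # made-up figure, for illustration only


ELEVENLABS = Provider("elevenlabs", "ELEVENLABS_API_KEY", 2.0)
FAL = Provider("fal.ai", "FAL_KEY", 8.0)

# Candidate providers per request kind, mirroring the description above:
# speech -> ElevenLabs, music -> fal.ai, sound effects -> fastest available.
# Listing both providers under "sfx" is an assumption of this sketch.
CANDIDATES = {
    "speech": [ELEVENLABS],
    "music": [FAL],
    "sfx": [ELEVENLABS, FAL],
}


def pick_provider(kind: str, env: Mapping[str, str]) -> Provider:
    """Return the fastest candidate for `kind` whose API key is set in `env`."""
    if kind not in CANDIDATES:
        raise ValueError(f"unknown audio kind: {kind!r}")
    # Keep only providers whose key is present and non-empty.
    available = [p for p in CANDIDATES[kind] if env.get(p.env_key)]
    if not available:
        needed = ", ".join(p.env_key for p in CANDIDATES[kind])
        raise RuntimeError(f"no provider configured for {kind!r}; set one of: {needed}")
    return min(available, key=lambda p: p.est_latency_s)
```

In this sketch, calling `pick_provider("music", os.environ)` with only an ElevenLabs key set raises an error, which matches the point that music needs fal.ai access. A real router would also have to submit the job and poll for the result, since music generation is asynchronous.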