If you need text-to-speech with voice cloning on Alibaba Cloud, this skill uses their Qwen TTS models to replicate a voice from sample audio. You get two model options, a standard one and a realtime variant, both from early 2026. Setup requires a Python virtual environment with Alibaba Cloud's SDK installed. It has passed security audits from three different tools, which is reassuring given that you may be handling voice data. It's clearly built for Chinese cloud infrastructure, so it makes sense if you're already in that ecosystem or have compliance requirements that Alibaba Cloud satisfies. Otherwise there's not much point, since more accessible alternatives exist.
npx skills add https://github.com/cinience/alicloud-skills --skill alicloud-ai-audio-tts-voice-clone
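
Since setup depends on a Python virtual environment, a quick preflight check can catch a misconfigured shell before you send any voice samples. This is only a minimal sketch using the standard library: the `preflight` helper is hypothetical, and the `DASHSCOPE_API_KEY` variable name is an assumption about how the SDK reads credentials, so confirm it against the skill's own documentation.

```python
import os
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def preflight(env=None, in_venv=None):
    """Return a list of setup problems; an empty list means ready.

    Hypothetical helper, not part of the skill or the SDK.
    """
    env = os.environ if env is None else env
    in_venv = in_virtualenv() if in_venv is None else in_venv

    problems = []
    if not in_venv:
        problems.append("not running inside a virtual environment")
    # Assumed credential variable name; verify it in the skill's docs.
    if not env.get("DASHSCOPE_API_KEY"):
        problems.append("DASHSCOPE_API_KEY is not set")
    return problems
```

Calling `preflight()` with no arguments inspects the current interpreter and environment. The optional parameters exist only so the check can be exercised without changing your shell.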