This connects Claude to Alibaba Cloud's Qwen VL multimodal model for image understanding tasks. You feed it an image and text prompt, it returns analysis via the DashScope API in compatible mode. The setup includes a Python script for image analysis and saves both raw responses and normalized results to an output directory for traceability. If you're already in the Alibaba Cloud ecosystem or need an alternative to Western vision APIs, this gives you a straightforward integration. It's got decent traction with 337 installs and passes most security audits, though Snyk flagged a warning worth checking.
npx skills add https://github.com/cinience/alicloud-skills --skill alicloud-ai-multimodal-qwen-vl