This gives Claude the ability to work with Open-AutoGLM, a framework that lets you control Android, HarmonyOS, and iOS devices through natural language by running a 9B-parameter vision-language model. You point it at screenshots and say things like "open Meituan and search for nearby hot pot restaurants," and it generates structured commands for the matching device bridge: ADB for Android, HDC for HarmonyOS, WebDriverAgent for iOS. The skill covers both self-hosted deployment with vLLM or SGLang and third-party APIs like BigModel. The documentation is thorough on the model-serving requirements, which matters because the chain-of-thought output is fragile and breaks if you misconfigure the `max_pixels` or context-length parameters. Useful if you're building phone automation workflows or testing mobile apps with AI-driven interactions instead of brittle UI selectors.
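If you take the self-hosted route, the serving step is where those fragile parameters live. Here's a minimal sketch of a vLLM launch; the model path, pixel cap, and context length are assumptions rather than the project's documented values, so take the real ones from the Open-AutoGLM README:

```bash
# Hypothetical vLLM launch for the 9B vision-language model behind
# Open-AutoGLM. The model path, max_pixels value, and context length
# below are illustrative guesses, not the project's documented config.
#
# --max-model-len: context window; too small and the chain-of-thought
#   output gets truncated mid-plan.
# --mm-processor-kwargs: caps the pixel count of each screenshot before
#   it reaches the vision encoder; a wrong cap skews the coordinates
#   the model emits for taps and swipes.
vllm serve zai-org/AutoGLM-Phone-9B \
  --served-model-name autoglm-phone \
  --max-model-len 32768 \
  --mm-processor-kwargs '{"max_pixels": 1605632}'
```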
npx skills add https://github.com/aradotso/trending-skills --skill open-autoglm-phone-agent
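Once the skill is installed and pointed at a running server, the wire format is just an OpenAI-style chat completion with a screenshot attached. A hedged curl sketch, assuming vLLM's default port and the served model name from the launch above (the skill's real prompts wrap the instruction in a system prompt and action schema that this example omits):

```bash
# Hypothetical request against the OpenAI-compatible endpoint vLLM
# exposes. Endpoint, port, and model name are assumptions.
# base64 -w0 is GNU coreutils; on macOS use `base64 -i screenshot.png`.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "autoglm-phone",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "image_url",
         "image_url": {"url": "data:image/png;base64,'"$(base64 -w0 screenshot.png)"'"}},
        {"type": "text",
         "text": "open Meituan and search for nearby hot pot restaurants"}
      ]
    }]
  }'
```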