This skill teaches you to build AI agents that control computers through vision and actions, using the same perception-reasoning-action loop that powers Anthropic's Computer Use and OpenAI's Operator. The patterns cover the fundamentals (screenshot → vision model → execute mouse/keyboard → repeat) plus security sandboxing with Docker containers, which is critical because giving an AI direct system access is asking for trouble. It includes working code for Claude's official computer use tools, which now support zoom actions in Opus 4.5 for inspecting UI details. The honest reality is that dropdowns and scrollbars are still tricky for these agents. Use this skill when you need desktop automation beyond browser-only tools, but always run it in an isolated environment.
npx skills add https://github.com/davila7/claude-code-templates --skill computer-use-agents
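
The screenshot → vision model → execute → repeat loop from the description can be sketched roughly as follows. This is a minimal illustration and not the skill's actual code: `Action`, `run_agent`, and the `capture`, `decide`, and `execute` callables are hypothetical names introduced here, and the stubs stand in for a real screen grabber, vision model, and input driver. In practice the real versions of those callables would run inside the sandboxed container.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Action:
    """One step chosen by the model. Field names are hypothetical."""
    kind: str                      # e.g. "click", "type", "done"
    x: Optional[int] = None
    y: Optional[int] = None
    text: Optional[str] = None


def run_agent(
    goal: str,
    capture: Callable[[], bytes],
    decide: Callable[[str, bytes, list], Action],
    execute: Callable[[Action], None],
    max_steps: int = 20,
) -> list:
    """Run screenshot -> decide -> execute until the model returns "done".

    Returns the list of actions the model chose. max_steps bounds the
    loop so a confused model cannot keep acting indefinitely.
    """
    history: list = []
    for _ in range(max_steps):
        screenshot = capture()                       # perception
        action = decide(goal, screenshot, history)   # reasoning
        history.append(action)
        if action.kind == "done":
            break
        execute(action)                              # action
    return history


if __name__ == "__main__":
    # Stubs only: a scripted "model" that clicks, types, then stops.
    script = iter([
        Action("click", x=100, y=200),
        Action("type", text="hello"),
        Action("done"),
    ])
    performed: list = []
    taken = run_agent(
        goal="say hello",
        capture=lambda: b"fake-png-bytes",
        decide=lambda goal, shot, hist: next(script),
        execute=performed.append,
    )
    print([a.kind for a in taken])
    print(len(performed), "actions executed")
```

Passing the three stages in as callables keeps the loop itself free of any model or OS dependency, so the same function can be exercised with stubs locally and wired to real screenshot and input code only inside the isolated environment.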