A full-featured computer-use server that gives Claude direct control over Windows 10/11 desktops through 29 tools spanning UI Automation (UIA), Chrome DevTools Protocol, keyboard/mouse control, and terminal operations. The core innovation is a discover-then-act pattern: instead of guessing pixel coordinates, you call desktop_discover to obtain short-lived entity leases, then desktop_act operates on those semantic targets, with perception guards that verify window identity before any input lands. Under the hood is a native Rust engine (2 ms UIA queries, SSE2 image diffing) with a transparent PowerShell fallback. It also includes Set-of-Marks visual OCR for non-accessible apps, reactive perception graphs that reduce screenshot round-trips, and a 1:1 coordinate mode in which image pixels map directly to screen positions. Ships with an auto-dock CLI, full CJK support via GetWindowTextW, and a failsafe that kills the server when you move the mouse to the top-left corner of the screen.
claude mcp add --transport stdio harusame64-desktop-touch-mcp uvx desktop-touch-mcp
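
The discover-then-act flow described above can be pictured with a small, self-contained simulation. This is only a sketch of the idea, not the server's API: FakeDesktop, Lease, StaleLeaseError, the 5-second lease lifetime, and the integer window IDs are all hypothetical names and values chosen for illustration. The real tools are desktop_discover and desktop_act, and their actual parameters are not shown here.

```python
import time
from dataclasses import dataclass


class StaleLeaseError(Exception):
    """Raised when a lease can no longer be trusted (hypothetical name)."""


@dataclass(frozen=True)
class Lease:
    entity_id: str     # semantic target, e.g. a named button
    window_id: int     # identity of the window at discovery time
    expires_at: float  # deadline on the injected clock


class FakeDesktop:
    """In-memory stand-in for a desktop; nothing here touches a real UI."""

    def __init__(self, ttl=5.0, clock=time.monotonic):
        self.ttl = ttl                # assumed lease lifetime in seconds
        self.clock = clock            # injectable so expiry can be tested
        self.foreground_window = 101  # currently focused window
        self.entities = {"save_button": 101, "other_app_ok": 202}
        self.log = []                 # actions that actually "landed"

    def discover(self):
        """Return leases for entities in the foreground window only."""
        deadline = self.clock() + self.ttl
        return [
            Lease(name, window, deadline)
            for name, window in self.entities.items()
            if window == self.foreground_window
        ]

    def act(self, lease, action):
        """Guard first, then act: reject expired or mismatched leases."""
        if self.clock() > lease.expires_at:
            raise StaleLeaseError("lease expired; discover again")
        if lease.window_id != self.foreground_window:
            raise StaleLeaseError("window changed since discovery")
        self.log.append((lease.entity_id, action))
        return "ok"
```

In this model, a caller takes a lease from discover() and passes it to act(). If the focused window changes or the lease ages out in between, act() raises instead of sending input to the wrong place, which is the failure mode the perception guards are meant to prevent.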