Connects Claude to the DINO-X visual AI platform through its object detection, localization, and captioning APIs. Exposes four core operations: full-scene object detection, text-prompted detection (searching for specific objects by name), human pose estimation with 17 keypoints, and local visualization that draws annotated bounding boxes. Runs over STDIO for local use or streamable HTTP for cloud deployment. You'd reach for this when building visual agents that need to count objects, locate specific items in images, or extract structured data from visual content for automation pipelines. Requires a DINO-X API key; new users receive a free quota.
claude mcp add --transport stdio idea-research-dino-x-mcp uvx dino-x-mcp
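
As a rough illustration of the object-counting use case, here is a minimal Python sketch of how an automation pipeline might tally detections per category once results are available. The record layout (`category`, `bbox`, `score`) and the `count_objects` helper are assumptions made for this example, not the server's documented output format; adapt the field names to whatever the detection tools actually return.

```python
from collections import Counter

# Hypothetical detection records. The field names below are assumptions
# for illustration; the real tool output may be structured differently.
detections = [
    {"category": "person", "bbox": [12, 30, 110, 240], "score": 0.91},
    {"category": "dog", "bbox": [150, 120, 260, 230], "score": 0.84},
    {"category": "person", "bbox": [300, 28, 395, 250], "score": 0.42},
]


def count_objects(records, min_score=0.5):
    """Count detections per category, skipping any below min_score."""
    return Counter(r["category"] for r in records if r["score"] >= min_score)


if __name__ == "__main__":
    # Print the per-category tally for the sample records.
    print(dict(count_objects(detections)))
```

Filtering on a confidence threshold before counting is a common design choice, since low-scoring boxes tend to inflate totals; the 0.5 default here is arbitrary and should be tuned per task.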