A remote MCP server that runs full multimodal analysis on video URLs from YouTube, TikTok, Instagram, Vimeo, Twitter, and direct links. It exposes four tools: quick_transcribe returns a timestamped transcript with speaker identification, deep_analyze runs the full pipeline (transcript, keyframe vision, and OCR combined into one structured output), clip_context analyzes a specific timestamp range, and batch_analyze processes up to 10 videos in parallel. It connects over streamable HTTP with OAuth and requires no local installation. Built on yt-dlp, Groq Whisper, Tesseract OCR, and Claude Vision. Reach for it when you need an agent to reason over what's shown on screen, not just what's said in the audio.
claude mcp add --transport http app.contendeo-contendeo https://contendeo.app/mcp/
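
Once the server is registered, an MCP client invokes the tools through JSON-RPC 2.0 `tools/call` requests. The sketch below only illustrates the shape of such requests and how a longer URL list could be split to respect the 10-video limit of batch_analyze. The argument name `urls`, the helper names, and the example.com links are assumptions made for illustration; the actual input schema is whatever the server reports via `tools/list`.

```python
import json
from itertools import count

# From the description above: batch_analyze processes up to 10 videos per call.
MAX_BATCH = 10

_request_ids = count(1)


def tool_call(name, arguments):
    """Build the body of an MCP JSON-RPC 2.0 `tools/call` request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def batch_requests(urls):
    """Split urls into groups of at most MAX_BATCH, one batch_analyze call each."""
    return [
        # "urls" is an assumed argument name, not confirmed by the server's schema.
        tool_call("batch_analyze", {"urls": urls[i:i + MAX_BATCH]})
        for i in range(0, len(urls), MAX_BATCH)
    ]


if __name__ == "__main__":
    # Placeholder links; any supported video URL would go here.
    videos = [f"https://example.com/video{i}.mp4" for i in range(23)]
    for request in batch_requests(videos):
        print(json.dumps(request, indent=2))
```

In normal use the agent issues these calls itself after the `claude mcp add` step, so a hand-built payload like this is mainly useful for understanding or debugging what goes over the wire.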