Modal gives you GPU access without managing infrastructure, and this skill covers how to use it. You write Python code to define your compute needs (T4 through H100), and Modal handles scaling, cold starts, and billing by the second. The guide walks through deploying ML models as APIs, running batch jobs, and using persistent storage for model caching. It's honest about when to use alternatives like RunPod for long-running workloads or SkyPilot for multi-cloud setups. The real value is in the practical patterns: dynamic batching, keeping containers warm, and using the class-based approach with lifecycle hooks to avoid reloading models on every request.
npx skills add https://github.com/orchestra-research/ai-research-skills --skill modal-serverless-gpu