Tencent's HY-World 2.0 reconstructs 3D scenes from images or video, or generates them from text prompts. Its core is WorldMirror 2.0, a 1.2B-parameter feed-forward model that predicts depth, surface normals, camera parameters, and 3D Gaussian Splatting attributes in a single forward pass. It handles resolutions from 50K to 500K pixels and supports prior injection, so you can supply camera poses you already know. The skill wraps the Python pipeline, with fp16 options to reduce memory use and export to PLY or OBJ. Use it when you need to turn a folder of photos into a navigable 3D scene, or to prototype spatial AI tasks without building geometry by hand. The Gradio app makes quick tests approachable.
npx skills add https://github.com/aradotso/trending-skills --skill hy-world-2-0-3d-world-model
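
If your photos fall outside the 50K to 500K pixel range mentioned in the description, one option is to resize them before reconstruction. The sketch below computes target dimensions only. It assumes the range refers to total pixel count (width times height), and `fit_to_pixel_budget` is a hypothetical helper written for illustration, not part of HY-World 2.0 or the skill. The pipeline may already resize inputs on its own, so treat this as an optional preprocessing step.

```python
import math

# Assumed bounds on total pixel count, taken from the description above.
MIN_PIXELS = 50_000
MAX_PIXELS = 500_000


def fit_to_pixel_budget(width, height, min_pixels=MIN_PIXELS, max_pixels=MAX_PIXELS):
    """Return (new_width, new_height) whose pixel count fits the budget.

    Hypothetical helper: the aspect ratio is preserved up to integer rounding.
    Images already inside the budget are returned unchanged.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")

    pixels = width * height
    if min_pixels <= pixels <= max_pixels:
        return width, height

    if pixels > max_pixels:
        # Shrink both sides by the same factor, rounding down so the
        # product cannot exceed max_pixels.
        scale = math.sqrt(max_pixels / pixels)
        new_w = min(max(1, math.floor(width * scale)), max_pixels)
        # The extra cap only matters for extreme aspect ratios.
        new_h = max(1, min(math.floor(height * scale), max_pixels // new_w))
        return new_w, new_h

    # Enlarge both sides by the same factor, rounding up so the
    # product reaches at least min_pixels.
    scale = math.sqrt(min_pixels / pixels)
    return math.ceil(width * scale), math.ceil(height * scale)


if __name__ == "__main__":
    for size in [(640, 480), (4000, 3000), (100, 100)]:
        print(size, "->", fit_to_pixel_budget(*size))
```

The returned dimensions can be passed to whatever image library you already use for the actual resize. Scaling both sides by the square root of the pixel ratio keeps the aspect ratio close to the original, which matters if you also plan to supply known camera parameters.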