This skill brings LLaVA, an open-source vision-language model, into your Claude workflow for image understanding tasks. With 23,000+ GitHub stars and model sizes ranging from 7B to 34B parameters, it's designed for visual question answering, image captioning, and multi-turn conversations about images. The Apache 2.0 license makes it a solid choice if you need GPT-4V-level vision capabilities without proprietary restrictions. Note that it failed the Gen Agent Trust Hub audit while passing Socket, so review the security findings before production use. Best fit for chatbots that need to discuss images, or for document understanding workflows where you want local control.
npx skills add https://github.com/davila7/claude-code-templates --skill llava