This is a multimodal image recognition tool that takes a public image URL and returns an AI-generated analysis of what's in the picture. Pass a natural-language requirement parameter to ask for a general description, text extraction (OCR), product identification, or an answer to a specific question about the image. It supports common formats like JPG, PNG, GIF, and WebP. The documentation is thorough, with clear examples for e-commerce use cases such as analyzing Amazon product listings or A+ content. One limitation: it only accepts public URLs, so local images have to be uploaded first using the Python script included with the skill. The documented pattern for handling large responses is overkill for most use cases, but the core functionality is straightforward and well explained.
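Since the two constraints above (public URL, supported format) are easy to trip over, here is a minimal pre-flight check you might run before handing a URL to the tool. It is only a sketch and not part of the skill: `check_image_url` and `SUPPORTED_EXTENSIONS` are hypothetical names, treating `.jpeg` as a JPG variant is an assumption, and judging the format by file extension is a heuristic, since some image URLs carry no extension at all.

```python
import ipaddress
import os
from urllib.parse import urlparse

# Formats named in the review; ".jpeg" is assumed to be accepted as a JPG variant.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def check_image_url(url: str) -> list:
    """Return a list of problems with the URL; an empty list means it looks usable."""
    parsed = urlparse(url)

    # Local paths and file:// URLs are not public; they need uploading first.
    if parsed.scheme not in ("http", "https"):
        return ["not an http(s) URL; local images must be uploaded first"]

    problems = []
    host = parsed.hostname or ""
    if not host:
        problems.append("URL has no host")
    elif host == "localhost":
        problems.append("host is not publicly reachable")
    else:
        try:
            ip = ipaddress.ip_address(host)
            if ip.is_private or ip.is_loopback:
                problems.append("host is not publicly reachable")
        except ValueError:
            pass  # not an IP literal, so treat it as an ordinary domain name

    # Heuristic only: the format is guessed from the path's extension.
    ext = os.path.splitext(parsed.path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append("unsupported or missing file extension: %r" % ext)

    return problems


if __name__ == "__main__":
    for candidate in ("https://example.com/listing/main.webp", "/home/me/photo.png"):
        print(candidate, "->", check_image_url(candidate) or "ok")
```

A check like this cannot confirm that the URL actually resolves or serves an image; it only catches the obvious cases (local paths, private hosts, unsupported extensions) before a request is made.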
npx skills add https://github.com/linkfox-ai/linkfox-skills --skill linkfox-multimodal-recognize-image