If you've ever wasted an afternoon downloading a 70B model only to find it won't run on your GPU, this is for you. llmfit detects your actual hardware (RAM, CPU, GPU, even multi-GPU setups), scores hundreds of LLMs on fit, quality, and speed, and tells you exactly which ones will run. Ships as both a TUI and a CLI, handles quantization selection automatically, and works with Ollama, llama.cpp, MLX, and Docker runners. The REST API is a particularly clever touch for cluster scheduling. You can also flip the question around with the plan command and ask "what hardware do I need for model X at 8K context?" Honest take: the override flags for broken nvidia-smi and VM passthrough scenarios show this was built by someone who's actually deployed models in messy real-world environments.
npx skills add https://github.com/aradotso/trending-skills --skill llmfit-hardware-model-matcher
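Once installed, a typical session runs in both directions: scan your machine and read the ranked list, or go the other way with the plan command mentioned above. A minimal sketch, assuming the binary is named llmfit and guessing at flag names (--model and --context are illustrative, not confirmed; check llmfit --help for the real spelling):

llmfit                                             # assumed: detect hardware, rank runnable models
llmfit plan --model llama-3.1-70b --context 8192   # assumed flags: what hardware does this model need at 8K context?

The first form covers the "will it run on my GPU" question from the blurb; the second is the reverse query the plan command exists for.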