This skill handles the setup for quantizing large language models with the bitsandbytes library, which is what you need when a model won't fit in your GPU memory. It gives you ready-to-use code for both 8-bit quantization (roughly halves weight memory relative to 16-bit weights) and 4-bit quantization (roughly a 75% reduction relative to 16-bit weights), with minimal accuracy loss according to the docs. It's a straightforward implementation helper rather than anything fancy. If you're running local LLMs and hitting out-of-memory (OOM) errors, this gets you the boilerplate to compress models down to size. The skill has passed most security audits and has decent traction with 342 installs.
npx skills add https://github.com/davila7/claude-code-templates --skill quantizing-models-bitsandbytes
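
To make the memory figures above concrete, here is a minimal sketch. It is not the code the skill ships. It assumes the Hugging Face `transformers` integration of bitsandbytes (`BitsAndBytesConfig` passed to `from_pretrained`), and the helper names `weight_memory_gb`, `reduction_vs_fp16`, and `load_quantized` are hypothetical. The estimate covers weights only and ignores activations, the KV cache, and quantization constants, so real usage will be somewhat higher.

```python
# Assumed baseline: 16-bit weights. Values are bytes per parameter.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "nf4": 0.5}


def weight_memory_gb(n_params: float, dtype: str) -> float:
    """Approximate memory for the weights alone, in GB (1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


def reduction_vs_fp16(dtype: str) -> float:
    """Fraction of weight memory saved relative to 16-bit weights."""
    return 1.0 - BYTES_PER_PARAM[dtype] / BYTES_PER_PARAM["fp16"]


def load_quantized(model_id: str, bits: int = 4):
    """Hypothetical helper: load a causal LM in 8-bit or 4-bit.

    Needs transformers, accelerate, bitsandbytes, and a CUDA GPU at call time.
    """
    if bits not in (4, 8):
        raise ValueError("bits must be 4 or 8")

    # Imported lazily so the estimate helpers work without these packages.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    if bits == 8:
        config = BitsAndBytesConfig(load_in_8bit=True)
    else:
        config = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",              # 4-bit NormalFloat
            bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for matmuls
        )
    return AutoModelForCausalLM.from_pretrained(
        model_id, quantization_config=config, device_map="auto"
    )
```

For a 7-billion-parameter model, `weight_memory_gb(7e9, "fp16")` gives about 14 GB, `"int8"` about 7 GB, and `"nf4"` about 3.5 GB, which matches the halving and 75% reduction described above.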