This skill wraps bitsandbytes so you can quantize large language models directly in your Claude Code workflow. 8-bit quantization cuts weight memory by about 50% and 4-bit by about 75% (relative to 16-bit weights), both with less than 1% accuracy loss according to the docs. That is useful when you want to run models locally or fit larger ones into limited VRAM. The skill includes the standard transformers integration setup, so you get the same quantization approach that has become common in the open-source LLM space. Originally from ovachiever/droid-tings, it is now maintained under orchestra-research. It passed the Gen Agent Trust Hub and Socket audits but carries a Snyk warning, so check the dependencies if that matters for your project.
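To see where the 50% and 75% figures come from, here is a back-of-the-envelope sketch. It assumes a 16-bit baseline (2 bytes per parameter) and counts weights only, ignoring activations, the KV cache, and the small overhead of quantization constants. The function names and the 7B parameter count are illustrative, not part of the skill.

```python
# Approximate bytes needed to store one weight at each precision.
# Assumption: the savings quoted above are measured against 16-bit weights.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "4bit": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Estimate weight memory in GB (1 GB = 1e9 bytes), weights only."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


def savings_vs_fp16(precision: str) -> float:
    """Fraction of weight memory saved relative to the 16-bit baseline."""
    return 1.0 - BYTES_PER_PARAM[precision] / BYTES_PER_PARAM["fp16"]


if __name__ == "__main__":
    num_params = 7e9  # hypothetical 7B-parameter model
    for precision in BYTES_PER_PARAM:
        gb = weight_memory_gb(num_params, precision)
        saved = savings_vs_fp16(precision)
        print(f"{precision}: ~{gb:.1f} GB weights, {saved:.0%} saved vs fp16")
```

Real usage will land somewhat above these numbers, since some layers are usually kept in higher precision and inference needs working memory on top of the weights.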
npx skills add https://github.com/orchestra-research/ai-research-skills --skill quantizing-models-bitsandbytes
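For orientation, the transformers integration mentioned above usually looks something like the sketch below. This is not taken from the skill's own files; it uses the public `BitsAndBytesConfig` and `AutoModelForCausalLM.from_pretrained` APIs from transformers, and assumes a CUDA GPU with `torch`, `transformers`, `accelerate`, and `bitsandbytes` installed. The helper names and the choice of NF4 with double quantization are illustrative defaults, and the model ID is whatever you supply.

```python
def quantization_kwargs(bits: int) -> dict:
    """Build keyword arguments for transformers.BitsAndBytesConfig."""
    if bits == 8:
        return {"load_in_8bit": True}
    if bits == 4:
        return {
            "load_in_4bit": True,
            "bnb_4bit_quant_type": "nf4",       # NormalFloat4 data type
            "bnb_4bit_use_double_quant": True,  # also quantize the quantization constants
        }
    raise ValueError("bits must be 4 or 8")


def load_quantized(model_id: str, bits: int = 4):
    """Load a causal LM with bitsandbytes quantization applied at load time."""
    # Imported lazily so quantization_kwargs stays usable without a GPU stack.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    kwargs = quantization_kwargs(bits)
    if bits == 4:
        # Matrix multiplies run in bf16 while weights are stored in 4-bit.
        kwargs["bnb_4bit_compute_dtype"] = torch.bfloat16

    return AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=BitsAndBytesConfig(**kwargs),
        device_map="auto",  # requires accelerate; places layers on available devices
    )
```

Calling `load_quantized("<your-model-id>", bits=4)` would download the model and quantize its weights as they are loaded; pass `bits=8` when accuracy matters more than memory.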