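PEFT is Hugging Face's official library for fine-tuning large language models without breaking the bank on GPU memory. Instead of updating all 8 billion parameters of a model like Llama 3.1 8B, LoRA freezes the base weights and trains small low-rank adapter matrices, typically less than 1% of the parameter count, so the job fits on a single consumer GPU. The QLoRA variant also quantizes the frozen base model to 4 bits, cutting weight memory to roughly a quarter of 16-bit. That brings much larger models within reach, though not without limits: a 70B model still needs about 35GB for its 4-bit weights alone, so it will not fit on a single 24GB card (the QLoRA paper reports fine-tuning a 65B model on one 48GB GPU). The skill covers the full workflow from configuration to multi-adapter serving, where you swap between task-specific adapters at runtime. If you're doing any serious LLM fine-tuning and don't have a datacenter budget, this is the standard approach.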
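To make the parameter and memory figures concrete, here is a rough back-of-the-envelope sketch in plain Python. It does not call PEFT. The layer shapes are assumed from the published Llama 3.1 8B configuration, the rank of 16 and the choice to adapt all seven linear projections are illustrative assumptions, and the helper names `lora_params` and `weight_memory_gb` are hypothetical.

```python
# Assumed Llama 3.1 8B shapes (taken from the public model config; treat as assumptions).
HIDDEN = 4096        # model width
KV_DIM = 1024        # 8 key/value heads x 128 dims (grouped-query attention)
MLP_DIM = 14336      # feed-forward width
NUM_LAYERS = 32
TOTAL_PARAMS = 8.03e9  # approximate total parameter count

# (input_dim, output_dim) of each linear layer that receives an adapter.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP_DIM),
    "up_proj": (HIDDEN, MLP_DIM),
    "down_proj": (MLP_DIM, HIDDEN),
}


def lora_params(rank, shapes=TARGET_SHAPES, num_layers=NUM_LAYERS):
    """Count trainable LoRA parameters.

    Each adapted layer gets two matrices, A (rank x d_in) and B (d_out x rank),
    so it adds rank * (d_in + d_out) trainable parameters.
    """
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * num_layers


def weight_memory_gb(num_params, bits):
    """Memory for the weights alone, in GB (10^9 bytes).

    Ignores activations, optimizer state, and quantization overhead.
    """
    return num_params * bits / 8 / 1e9


if __name__ == "__main__":
    trainable = lora_params(rank=16)
    print(f"LoRA trainable parameters: {trainable:,}")
    print(f"Share of the base model:   {trainable / TOTAL_PARAMS:.2%}")
    print(f"8B weights at 16-bit:  {weight_memory_gb(TOTAL_PARAMS, 16):.1f} GB")
    print(f"8B weights at 4-bit:   {weight_memory_gb(TOTAL_PARAMS, 4):.1f} GB")
    print(f"70B weights at 4-bit:  {weight_memory_gb(70e9, 4):.1f} GB")
```

Under these assumptions, rank 16 gives about 42 million trainable parameters, roughly 0.5% of the base model. The memory helper counts weights only, so real training needs headroom on top of these numbers for activations, adapter gradients, and optimizer state.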
npx skills add https://github.com/orchestra-research/ai-research-skills --skill peft-fine-tuning