Karpathy's minimal harness for training your own GPT-2 grade model on a single 8×H100 node in about 2 hours for ~$48. The entire pipeline is here: tokenization, pretraining, finetuning, evaluation against the DCLM CORE metric, and a chat UI. The depth dial is the clever part: a single parameter auto-configures width, heads, learning rate, and training horizon for compute-optimal runs, so depth 12 gets you a GPT-1 scale model in 5 minutes for fast iteration, while depth 26 reproduces GPT-2 capability (a sketch of the idea follows the install command below). It's genuinely hackable: explicit dtype management instead of autocast, and code readable enough to fork and modify. Good for anyone who wants to understand LLM training end to end without framework abstractions getting in the way.
npx skills add https://github.com/aradotso/trending-skills --skill nanochat-llm-training
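To make the depth dial concrete, here is a minimal sketch of how a single depth parameter can derive the rest of a compute-optimal configuration. This is an illustration under stated assumptions, not nanochat's actual code: the per-layer width of 64, the head dimension of 64, the inverse-width learning-rate scaling, and the ~20 tokens-per-parameter horizon are hypothetical values chosen to show the shape of the idea.

```python
# Minimal sketch of a "depth dial", NOT nanochat's actual code.
# Assumptions (hypothetical): width grows linearly with depth, heads come
# from a fixed head_dim, LR shrinks as the model widens, and the token
# budget follows a Chinchilla-style ~20 tokens-per-parameter heuristic.
from dataclasses import dataclass

@dataclass
class GPTConfig:
    depth: int
    width: int
    n_heads: int
    lr: float
    train_tokens: int

def config_from_depth(depth: int,
                      width_per_layer: int = 64,   # assumed scaling rule
                      head_dim: int = 64,          # assumed per-head dim
                      base_lr: float = 6e-4,       # assumed LR at base_width
                      base_width: int = 768,
                      tokens_per_param: int = 20) -> GPTConfig:
    width = depth * width_per_layer              # depth 12 -> 768, depth 26 -> 1664
    n_heads = width // head_dim                  # keep per-head dimension fixed
    lr = base_lr * base_width / width            # scale LR down with width
    n_params = 12 * depth * width ** 2           # rough transformer param count
    train_tokens = tokens_per_param * n_params   # compute-optimal-ish horizon
    return GPTConfig(depth, width, n_heads, lr, train_tokens)

print(config_from_depth(12))
print(config_from_depth(26))
```

The payoff of this design is that one knob pins down every other hyperparameter, so scaling a run up or down is a single flag rather than a retuning exercise.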