Karpathy's minimal harness for training your own GPT-2 grade model on a single 8×H100 node in about 2 hours for ~$48. The entire pipeline is here: tokenization, pretraining, finetuning, evaluation against the DCLM CORE metric, and a chat UI. The depth dial is the clever part: a single parameter auto-configures width, heads, learning rate, and training horizon for compute-optimal runs, so depth 12 gets you a GPT-1 scale model in 5 minutes for fast iteration, while depth 26 reproduces GPT-2 capability (a sketch of the idea follows the install command below). It's genuinely hackable: explicit dtype management instead of autocast, and code readable enough to fork and modify. Good for anyone who wants to understand LLM training end to end without framework abstractions getting in the way.
npx skills add https://github.com/aradotso/trending-skills --skill nanochat-llm-training
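To make the depth dial concrete, here is a minimal sketch of how a single depth parameter can derive the rest of a compute-optimal configuration. This is an illustration under stated assumptions, not nanochat's actual code: the per-layer width of 64, the head dimension of 64, the inverse-width learning-rate scaling, and the ~20 tokens-per-parameter horizon are hypothetical values chosen to show the shape of the idea.

```python
# Minimal sketch of a "depth dial", NOT nanochat's actual code.
# Assumptions (hypothetical): width grows linearly with depth, heads come
# from a fixed head_dim, LR shrinks as the model widens, and the token
# budget follows a Chinchilla-style ~20 tokens-per-parameter heuristic.
from dataclasses import dataclass

@dataclass
class GPTConfig:
    depth: int
    width: int
    n_heads: int
    lr: float
    train_tokens: int

def config_from_depth(depth: int,
                      width_per_layer: int = 64,   # assumed scaling rule
                      head_dim: int = 64,          # assumed per-head dim
                      base_lr: float = 6e-4,       # assumed LR at base_width
                      base_width: int = 768,
                      tokens_per_param: int = 20) -> GPTConfig:
    width = depth * width_per_layer              # depth 12 -> 768, depth 26 -> 1664
    n_heads = width // head_dim                  # keep per-head dimension fixed
    lr = base_lr * base_width / width            # scale LR down with width
    n_params = 12 * depth * width ** 2           # rough transformer param count
    train_tokens = tokens_per_param * n_params   # compute-optimal-ish horizon
    return GPTConfig(depth, width, n_heads, lr, train_tokens)

print(config_from_depth(12))
print(config_from_depth(26))
```

The payoff of this design is that one knob pins down every other hyperparameter, so scaling a run up or down is a single flag rather than a retuning exercise.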