If you're tired of writing the same GPU handling, distributed training, and logging boilerplate in PyTorch, PyTorch Lightning wraps it all in a clean LightningModule structure. You define training_step and configure_optimizers, then the Trainer handles device placement, DDP/FSDP/DeepSpeed, checkpointing, and TensorBoard logging. The same model code runs on your laptop or on 8 GPUs; only the Trainer arguments change. It's opinionated about code organization, which is either helpful structure or an annoying constraint depending on your project. It works well for standard training loops, though you'll fight it if you need unusual control flow. The callback system is genuinely useful for adding early stopping and learning rate monitoring without cluttering your main code.
npx skills add https://github.com/orchestra-research/ai-research-skills --skill pytorch-lightning