Generated from the official PyTorch documentation, this skill pulls in reference material for Fully Sharded Data Parallel (FSDP) training. You'll want it when you are distributing large-model training across multiple GPUs and need to handle uneven inputs or debug FSDP configurations. The included docs cover the Join context manager API, which prevents hangs when some processes finish their inputs before others during distributed training. It focuses on the algorithmic side of FSDP rather than basic setup, so it is most useful when you're already working with distributed PyTorch and hitting edge cases. Because the skill is auto-generated documentation, expect API references and class definitions rather than high-level guidance.
npx skills add https://github.com/davila7/claude-code-templates --skill pytorch-fsdp
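
To make the uneven-inputs problem concrete, here is a small pure-Python model of it. It is only a sketch under stated assumptions, not PyTorch code: `simulate` and its arguments are hypothetical names invented for illustration, and each "round" stands in for one collective call (such as a gradient all-reduce) that every rank must enter together. Without a join mechanism, the first rank to run out of batches stops calling the collective and the remaining ranks block. With one, finished ranks keep "shadowing" the collective until every rank is done.

```python
def simulate(batches_per_rank, use_join):
    """Toy model of a training loop with one collective call per batch.

    Returns (rounds_completed, hung, shadowed_rounds_per_rank).
    All names here are illustrative; nothing below calls PyTorch.
    """
    world_size = len(batches_per_rank)
    remaining = list(batches_per_rank)
    shadowed = [0] * world_size  # rounds each rank joined without real data
    rounds = 0

    while any(r > 0 for r in remaining):
        active = [r > 0 for r in remaining]
        if not all(active) and not use_join:
            # A finished rank has left the loop and will never enter the
            # collective again, so the still-active ranks wait forever.
            return rounds, True, shadowed
        for rank in range(world_size):
            if active[rank]:
                remaining[rank] -= 1   # real batch: compute, then collective
            else:
                shadowed[rank] += 1    # joined rank: shadow the collective
        rounds += 1

    return rounds, False, shadowed


# Three ranks with 5, 7, and 6 batches respectively.
print(simulate([5, 7, 6], use_join=False))  # (5, True, [0, 0, 0])
print(simulate([5, 7, 6], use_join=True))   # (7, False, [2, 0, 1])
```

In real PyTorch code the equivalent protection comes from wrapping the training loop in `with Join([model]):`, where `Join` is imported from `torch.distributed.algorithms.join`. The PyTorch tutorial on uneven inputs demonstrates this with a `DistributedDataParallel`-wrapped model; check the bundled reference docs for which classes can be passed to `Join` in your PyTorch version.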