If you're reverse-engineering how transformers work internally, this skill gives you TransformerLens expertise for mechanistic interpretability research. TransformerLens is the standard library for inspecting model internals through HookPoints: you can cache activations, patch them between runs to test causality, and decompose outputs by attention head or layer. The workflow walkthroughs cover activation patching for causal tracing, circuit analysis in the style of the Indirect Object Identification (IOI) paper, and induction head detection. It works with 50+ models, including GPT-2, LLaMA, and Mistral. The main limitation is that it's transformer-only, so you'll need nnsight or pyvene for other architectures. If you're doing SAE work specifically, SAELens is purpose-built for that.
npx skills add https://github.com/orchestra-research/ai-research-skills --skill transformer-lens-interpretability
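
To give a feel for the cache-and-patch workflow described above, here is a minimal sketch of activation patching with TransformerLens. It is not taken from the skill itself. The prompts, the choice of the residual stream (`resid_pre`) as the patching site, and the helper names `logit_diff`, `patch_position`, and `run_patching_sweep` are all assumptions made for illustration. The idea is to cache activations from a clean prompt, rerun a corrupted prompt while overwriting one position in one layer with the clean activation, and see how much of the correct answer's logit advantage comes back.

```python
from functools import partial


def logit_diff(logits, correct_id, wrong_id):
    """Final-position logit of the correct token minus that of the wrong one."""
    return logits[0, -1, correct_id] - logits[0, -1, wrong_id]


def patch_position(activation, hook, clean_cache, pos):
    """Forward hook: overwrite one sequence position with the clean activation.

    `activation` has shape [batch, seq, d_model]; `hook.name` identifies the
    HookPoint, so the same function can be reused at every layer.
    """
    activation[:, pos, :] = clean_cache[hook.name][:, pos, :]
    return activation


def run_patching_sweep():
    # Imported here so the helpers above can be read and reused on their own.
    import torch
    from transformer_lens import HookedTransformer, utils

    torch.set_grad_enabled(False)  # inference only, no gradients needed
    model = HookedTransformer.from_pretrained("gpt2")

    # Illustrative IOI-style prompts (an assumption, not from the skill).
    clean_prompt = "When John and Mary went to the store, John gave a bottle of milk to"
    corrupt_prompt = "When John and Mary went to the store, Mary gave a bottle of milk to"
    correct_id = model.to_single_token(" Mary")
    wrong_id = model.to_single_token(" John")

    clean_tokens = model.to_tokens(clean_prompt)
    corrupt_tokens = model.to_tokens(corrupt_prompt)
    # Position-wise patching only makes sense if the prompts align token by token.
    assert clean_tokens.shape == corrupt_tokens.shape

    # The first position where the two prompts differ (the second subject name).
    pos = (clean_tokens[0] != corrupt_tokens[0]).nonzero()[0].item()

    # Cache every activation on the clean run; get baselines for both runs.
    clean_logits, clean_cache = model.run_with_cache(clean_tokens)
    corrupt_logits = model(corrupt_tokens)
    print("clean logit diff:  ", logit_diff(clean_logits, correct_id, wrong_id).item())
    print("corrupt logit diff:", logit_diff(corrupt_logits, correct_id, wrong_id).item())

    # Patch the residual stream at `pos`, one layer at a time.
    for layer in range(model.cfg.n_layers):
        hook_name = utils.get_act_name("resid_pre", layer)
        hook_fn = partial(patch_position, clean_cache=clean_cache, pos=pos)
        patched_logits = model.run_with_hooks(
            corrupt_tokens, fwd_hooks=[(hook_name, hook_fn)]
        )
        diff = logit_diff(patched_logits, correct_id, wrong_id).item()
        print(f"layer {layer:2d} patched logit diff: {diff:.3f}")


# Call run_patching_sweep() to execute; it downloads GPT-2 weights on first use.
```

Layers where the patched logit difference moves back toward the clean value are the ones where that position still carries the information the model needs. The same hook pattern should extend to per-head patching by targeting attention hook points instead of `resid_pre`, though that variant is not shown here.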