Walks you through the full two-stage PGO (profile-guided optimization) workflow for C/C++ projects: build an instrumented binary, run representative workloads to collect profile data, then rebuild with optimizations tuned to the observed runtime behavior. Covers both the GCC and Clang approaches, plus BOLT for post-link binary layout optimization when you need roughly another 5-15% on top of PGO. Most useful when standard -O3 optimization has plateaued and you're working with branch-heavy code or large binaries such as compilers and databases. The CMake integration examples are solid, and the SamplePGO section is handy if you need to profile in production without instrumentation overhead. This is what you reach for when performance actually matters and you've exhausted the easy wins.
npx skills add https://github.com/mohitmishra786/low-level-dev-skills --skill pgo