If your Spark jobs are crawling or crashing, this skill gives you concrete patterns to fix them. It covers the critical pieces: calculating optimal partition sizes, choosing between broadcast joins and sort-merge joins, setting up caching with appropriate storage levels, and tuning executor memory configuration. The salting technique for handling data skew is especially useful when hot keys are killing performance. It includes actual configuration values and memory breakdowns, not just theory, and is most helpful for production workloads that need to scale beyond toy datasets.
npx skills add https://github.com/wshobson/agents --skill spark-optimization
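As a rough illustration of the partition-size arithmetic the skill covers: a commonly cited rule of thumb is to target around 128 MB per partition, so the partition count is just a ceiling division. This is a plain-Python sketch with hypothetical numbers, not the skill's actual implementation:

```python
# Rule-of-thumb partition sizing for Spark (illustrative only).
# Common guidance: aim for roughly 128 MB of data per partition.

TARGET_PARTITION_BYTES = 128 * 1024 * 1024  # 128 MB

def optimal_partitions(dataset_bytes: int,
                       target: int = TARGET_PARTITION_BYTES) -> int:
    """Return a partition count that keeps each partition near the target size."""
    return max(1, -(-dataset_bytes // target))  # ceiling division

# Hypothetical 50 GB dataset:
print(optimal_partitions(50 * 1024 ** 3))  # -> 400
```

The result would then feed into a `repartition(n)` call or the `spark.sql.shuffle.partitions` setting.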
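The salting idea itself is independent of Spark: append a random suffix to each hot key so its rows hash across multiple partitions instead of one, then include the salt in the join or group key (the broadcast/small side must be replicated once per salt value). The key names and bucket count below are hypothetical, and this is a plain-Python sketch of the concept rather than the skill's code:

```python
import random

SALT_BUCKETS = 8  # how many sub-keys to split each hot key into

def salt_key(key: str, hot_keys: set) -> str:
    """Append a random salt to hot keys so their rows spread across partitions."""
    if key in hot_keys:
        return f"{key}_{random.randrange(SALT_BUCKETS)}"
    return key

hot = {"user_123"}                             # a key that dominates the data
rows = ["user_123"] * 1000 + ["user_456"] * 10
salted = [salt_key(k, hot) for k in rows]

# The hot key is now spread over up to SALT_BUCKETS distinct keys,
# so no single partition receives all 1000 of its rows.
print(len({k for k in salted if k.startswith("user_123")}))
```

In Spark proper, the same move is typically done with a `rand()`-derived salt column before the skewed join or aggregation.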
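Executor memory and join settings of the kind the skill discusses are typically passed at submit time. The flags below use real Spark configuration keys, but the values and the `your_job.py` script name are placeholders, not recommendations:

```shell
# Illustrative spark-submit flags; tune values per cluster and workload.
spark-submit \
  --conf spark.executor.memory=8g \
  --conf spark.executor.memoryOverhead=2g \
  --conf spark.executor.cores=4 \
  --conf spark.sql.shuffle.partitions=400 \
  --conf spark.sql.autoBroadcastJoinThreshold=64m \
  your_job.py
```

`spark.sql.autoBroadcastJoinThreshold` is what flips a join from sort-merge to broadcast: tables smaller than the threshold get shipped to every executor instead of shuffled.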