This one generates Airflow DAGs, validates data quality with Great Expectations, and reviews SQL and Spark queries with specific optimization recommendations. It's built around three Python tools that handle pipeline orchestration, quality checks, and performance tuning. The workflows cover batch ETL with dbt and Snowflake, real-time streaming with Kafka and Spark, and a data contract validation framework. The decision trees for batch vs streaming and warehouse vs lakehouse are helpful when you're choosing an architecture. If you're wiring together a modern data stack and need generated boilerplate plus validation hooks, this covers the repetitive parts. The SQL optimization tool also estimates query costs, which is nice for keeping BigQuery bills in check.
npx skills add https://github.com/borghei/claude-skills --skill senior-data-engineer
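
To give a feel for what data contract validation means in practice, here is a minimal sketch in plain Python. It is not the skill's implementation and does not use Great Expectations; `ColumnRule`, `validate_rows`, and the orders contract are hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ColumnRule:
    """One column in a hypothetical data contract."""
    name: str
    dtype: type
    nullable: bool = False


def validate_rows(rows: list[dict[str, Any]], contract: list[ColumnRule]) -> list[str]:
    """Return one human-readable message per contract violation."""
    violations = []
    for i, row in enumerate(rows):
        for rule in contract:
            if rule.name not in row:
                violations.append(f"row {i}: missing column '{rule.name}'")
                continue
            value = row[rule.name]
            if value is None:
                # Nulls are only a violation when the column is non-nullable.
                if not rule.nullable:
                    violations.append(f"row {i}: '{rule.name}' is null")
            elif not isinstance(value, rule.dtype):
                violations.append(
                    f"row {i}: '{rule.name}' expected {rule.dtype.__name__}, "
                    f"got {type(value).__name__}"
                )
    return violations


# Hypothetical contract for an orders feed (illustrative only).
ORDERS_CONTRACT = [
    ColumnRule("order_id", int),
    ColumnRule("customer_email", str),
    ColumnRule("discount_code", str, nullable=True),
]

sample = [
    {"order_id": 1, "customer_email": "a@example.com", "discount_code": None},
    {"order_id": "2", "customer_email": None},
]

result = validate_rows(sample, ORDERS_CONTRACT)
for message in result:
    print(message)
```

The first sample row satisfies the contract; the second breaks all three rules (wrong type, disallowed null, missing column). In a real pipeline a check like this would typically run as a gate before loading, with the rules expressed in whatever validation framework the stack already uses.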