A proper scraping cascade: it tries trafilatura first, falls back to requests with rotated user agents, then escalates to Playwright with stealth mode if the site needs JavaScript rendering or runs basic anti-bot checks. The code is clean and records which method succeeded. The anti-bot landscape section is honest about what playwright-stealth actually handles (navigator.webdriver patches, browser fingerprint evasion) versus what it doesn't (TLS fingerprinting, Cloudflare Turnstile). The async Playwright variant for Jupyter notebooks is a nice touch, since sync Playwright breaks inside a notebook's already-running event loop. This won't beat DataDome or other sophisticated bot management, but it covers the 80% case where you just need content extraction with reasonable resilience.
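For a sense of what such a cascade can look like, here is a minimal sketch in Python. It is not the skill's actual code: the names `ScrapeResult`, `scrape`, `USER_AGENTS`, and the `min_chars` threshold are hypothetical, chosen only for illustration. The third-party imports (trafilatura, requests, playwright) sit inside each fetcher so the module loads even when a library is missing. The playwright-stealth call is left out because its API differs between versions.

```python
import random
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical user-agent pool; a real one would be longer and kept current.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 14_4) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.4 Safari/605.1.15",
]


@dataclass
class ScrapeResult:
    url: str
    method: Optional[str]  # name of the tier that succeeded, or None
    text: Optional[str]


Fetcher = Callable[[str], Optional[str]]


def fetch_trafilatura(url: str) -> Optional[str]:
    """Tier 1: trafilatura downloads and extracts the main text."""
    import trafilatura
    html = trafilatura.fetch_url(url)
    return trafilatura.extract(html) if html else None


def fetch_requests(url: str) -> Optional[str]:
    """Tier 2: plain HTTP GET with a randomly chosen user agent."""
    import requests
    import trafilatura
    headers = {"User-Agent": random.choice(USER_AGENTS)}
    resp = requests.get(url, headers=headers, timeout=15)
    resp.raise_for_status()
    return trafilatura.extract(resp.text)


def fetch_playwright(url: str) -> Optional[str]:
    """Tier 3: render the page in headless Chromium, then extract."""
    import trafilatura
    from playwright.sync_api import sync_playwright
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            # Stealth patches (playwright-stealth) would be applied here.
            page.goto(url, wait_until="networkidle", timeout=30_000)
            html = page.content()
        finally:
            browser.close()
    return trafilatura.extract(html)


DEFAULT_FETCHERS: Sequence[Tuple[str, Fetcher]] = (
    ("trafilatura", fetch_trafilatura),
    ("requests", fetch_requests),
    ("playwright", fetch_playwright),
)


def scrape(
    url: str,
    fetchers: Sequence[Tuple[str, Fetcher]] = DEFAULT_FETCHERS,
    min_chars: int = 200,
) -> ScrapeResult:
    """Try each tier in order; return the first with enough extracted text."""
    for name, fetch in fetchers:
        try:
            text = fetch(url)
        except Exception:
            continue  # any failure (block, timeout, missing library) escalates
        if text and len(text.strip()) >= min_chars:
            return ScrapeResult(url, name, text)
    return ScrapeResult(url, None, None)
```

Calling `scrape("https://example.com/article")` returns a `ScrapeResult` whose `method` field says which tier produced the text, or `None` if every tier failed. Treating a too-short extraction as a failure is one simple way to catch block pages and empty JavaScript shells. In a Jupyter notebook the Playwright tier would need the `playwright.async_api` equivalent.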
npx skills add https://github.com/jamditis/claude-skills-journalism --skill web-scraping