ideabrowser.com — find trending startup ideas with real demand
Try itnpx skills add https://github.com/jimliu/baoyu-skills --skill baoyu-url-to-markdownFetches any URL via Chrome CDP, saves the rendered HTML snapshot, and converts it to clean markdown.
Important: All scripts are located in the scripts/ subdirectory of this skill.
Agent Execution Instructions:
{baseDir}{baseDir}/scripts/<script-name>.ts${BUN_X} runtime: if bun installed → bun; if npx available → npx -y bun; else suggest installing bun{baseDir} and ${BUN_X} in this document with actual valuesScript Reference:
| Script | Purpose |
|---|---|
scripts/main.ts | CLI entry point for URL fetching |
scripts/html-to-markdown.ts | Markdown conversion entry point and converter selection |
scripts/parsers/index.ts | Unified parser entry: dispatches URL-specific rules before generic converters |
scripts/parsers/types.ts | Unified parser interface shared by all rule files |
scripts/parsers/rules/*.ts | One file per URL rule, for example X status and X article |
scripts/defuddle-converter.ts | Defuddle-based conversion |
scripts/legacy-converter.ts | Pre-Defuddle legacy extraction and markdown conversion |
scripts/markdown-conversion-shared.ts | Shared metadata parsing and markdown document helpers |
Check EXTEND.md existence (priority order):
# macOS, Linux, WSL, Git Bash
test -f .baoyu-skills/baoyu-url-to-markdown/EXTEND.md && echo "project"
test -f "${XDG_CONFIG_HOME:-$HOME/.config}/baoyu-skills/baoyu-url-to-markdown/EXTEND.md" && echo "xdg"
test -f "$HOME/.baoyu-skills/baoyu-url-to-markdown/EXTEND.md" && echo "user"
# PowerShell (Windows)
if (Test-Path .baoyu-skills/baoyu-url-to-markdown/EXTEND.md) { "project" }
$xdg = if ($env:XDG_CONFIG_HOME) { $env:XDG_CONFIG_HOME } else { "$HOME/.config" }
if (Test-Path "$xdg/baoyu-skills/baoyu-url-to-markdown/EXTEND.md") { "xdg" }
if (Test-Path "$HOME/.baoyu-skills/baoyu-url-to-markdown/EXTEND.md") { "user" }
┌────────────────────────────────────────────────────────┬───────────────────┐ │ Path │ Location │ ├────────────────────────────────────────────────────────┼───────────────────┤ │ .baoyu-skills/baoyu-url-to-markdown/EXTEND.md │ Project directory │ ├────────────────────────────────────────────────────────┼───────────────────┤ │ $HOME/.baoyu-skills/baoyu-url-to-markdown/EXTEND.md │ User home │ └────────────────────────────────────────────────────────┴───────────────────┘
┌───────────┬───────────────────────────────────────────────────────────────────────────┐ │ Result │ Action │ ├───────────┼───────────────────────────────────────────────────────────────────────────┤ │ Found │ Read, parse, apply settings │ ├───────────┼───────────────────────────────────────────────────────────────────────────┤ │ Not found │ MUST run first-time setup (see below) — do NOT silently create defaults │ └───────────┴───────────────────────────────────────────────────────────────────────────┘
EXTEND.md Supports: Download media by default | Default output directory | Default capture mode | Timeout settings
CRITICAL: When EXTEND.md is not found, you MUST use AskUserQuestion to ask the user for their preferences before creating EXTEND.md. NEVER create EXTEND.md with defaults without asking. This is a BLOCKING operation — do NOT proceed with any conversion until setup is complete.
Use AskUserQuestion with ALL questions in ONE call:
Question 1 — header: "Media", question: "How to handle images and videos in pages?"
Question 2 — header: "Output", question: "Default output directory?"
Question 3 — header: "Save", question: "Where to save preferences?"
After user answers, create EXTEND.md at the chosen location, confirm "Preferences saved to [path]", then continue.
Full reference: references/config/first-time-setup.md
| Key | Default | Values | Description |
|---|---|---|---|
download_media | ask | ask / 1 / 0 | ask = prompt each time, 1 = always download, 0 = never |
default_output_dir | empty | path or empty | Default output directory (empty = ./url-to-markdown/) |
EXTEND.md → CLI mapping:
| EXTEND.md key | CLI argument | Notes |
|---|---|---|
download_media: 1 | --download-media | |
default_output_dir: ./posts/ | --output-dir ./posts/ | Directory path. Do NOT pass to -o (which expects a file path) |
Value priority:
--download-media, -o, --output-dir)-captured.html filex.com / twitter.comarchive.ph / related archive mirrors can restore the original URL from input[name=q] and prefer #CONTENT before falling back to the page bodydefuddle.md/<url> and still save markdown# Auto mode (default) - capture when page loads
${BUN_X} {baseDir}/scripts/main.ts <url>
# Wait mode - wait for user signal before capture
${BUN_X} {baseDir}/scripts/main.ts <url> --wait
# Save to specific file
${BUN_X} {baseDir}/scripts/main.ts <url> -o output.md
# Save to a custom output directory (auto-generates filename)
${BUN_X} {baseDir}/scripts/main.ts <url> --output-dir ./posts/
# Download images and videos to local directories
${BUN_X} {baseDir}/scripts/main.ts <url> --download-media
| Option | Description |
|---|---|
<url> | URL to fetch |
-o <path> | Output file path — must be a file path, not directory (default: auto-generated) |
--output-dir <dir> | Base output directory — auto-generates {dir}/{domain}/{slug}.md (default: ./url-to-markdown/) |
--wait | Wait for user signal before capturing |
--timeout <ms> | Page load timeout (default: 30000) |
--download-media | Download image/video assets to local imgs/ and videos/, and rewrite markdown links to local relative paths |
| Mode | Behavior | Use When |
|---|---|---|
| Auto (default) | Capture on network idle | Public pages, static content |
Wait (--wait) | User signals when ready | Login-required, lazy loading, paywalls |
Wait mode workflow:
--wait → script outputs "Press Enter when ready"Each run saves two files side by side:
url, title, description, author, published, optional coverImage, and captured_at, followed by converted markdown content*-captured.html, containing the rendered page HTML captured from ChromeWhen Defuddle or page metadata provides a language hint, the markdown front matter also includes language.
The HTML snapshot is saved before any markdown media localization, so it stays a faithful capture of the page DOM used for conversion.
If the hosted defuddle.md API fallback is used, markdown is still saved, but there is no local -captured.html snapshot for that run.
Default: url-to-markdown/<domain>/<slug>.md
With --output-dir ./posts/: ./posts/<domain>/<slug>.md
HTML snapshot path uses the same basename:
url-to-markdown/<domain>/<slug>-captured.html
./posts/<domain>/<slug>-captured.html
<slug>: From page title or URL path (kebab-case, 2-6 words)
Conflict resolution: Append timestamp <slug>-YYYYMMDD-HHMMSS.md
When --download-media is enabled:
imgs/ next to the markdown filevideos/ next to the markdown fileConversion order:
https://defuddle.md/<url> API and save its markdown output directlyCLI output will show:
Converter: parser:... when a URL-specific parser succeededConverter: defuddle when Defuddle succeedsConverter: legacy:... plus Fallback used: ... when fallback was neededConverter: defuddle-api when local browser capture failed and the hosted API was used insteadBased on download_media setting in EXTEND.md:
| Setting | Behavior |
|---|---|
1 (always) | Run script with --download-media flag |
0 (never) | Run script without --download-media flag |
ask (default) | Follow the ask-each-time flow below |
--download-media → markdown savedhttps:// in image/video links)AskUserQuestion:
--download-media (overwrites markdown with localized links)| Variable | Description |
|---|---|
URL_CHROME_PATH | Custom Chrome executable path |
URL_DATA_DIR | Custom data directory |
URL_CHROME_PROFILE_DIR | Custom Chrome profile directory |
Troubleshooting: Chrome not found → set URL_CHROME_PATH. Timeout → increase --timeout. Complex pages → try --wait mode. If markdown quality is poor, inspect the saved -captured.html and check whether the run logged a legacy fallback.
--wait and capture after the watch page is fully hydrated.https://defuddle.md/<url>. In shell form: curl https://defuddle.md/stephango.comCustom configurations via EXTEND.md. See Preferences section for paths and supported options.