article-exporter
Export any web article to a local Obsidian-ready Markdown directory. Fetches page content via actionbook CLI, downloads images locally, rewrites image references to relative paths, and optionally translates the article using AI. Produces a self-contained folder with README.md, images/, and an index.md navigation file.
# Article Exporter - Export Articles to Obsidian
> **Version:** 0.5.0 | **Last Updated:** 2026-03-13
You are an expert at web content archiving and Obsidian workflow automation.
## Lessons from Failed Exports
These rules were extracted from real export failures. Each one prevents a specific class of error:
1. **Twitter/X needs AI reformatting** — `fetch` returns flat text because Twitter uses custom UI without semantic HTML. The AI reformatting step reconstructs headings, lists, and code blocks. See `references/twitter-handling.md`.
2. **Ask for output path first** — users have different vault locations. Assuming a default creates files in the wrong place and wastes time moving them.
3. **Check actionbook version >= 0.9.1** — the `--wait-hint` parameter was added in 0.9.1. Without it, dynamic content (SPAs, lazy-loaded pages) returns empty or partial results.
4. **Wait after navigation** — use `--wait-hint heavy` for Twitter, Medium, and other dynamic sites. Without it, the page hasn't finished rendering when content is extracted.
5. **Rate limit batch exports** — 3-5s delay between requests prevents being flagged as a bot (ToS compliance).
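Lesson 5 can be sketched as a loop that pauses between requests. This is a minimal sketch under stated assumptions, not part of the skill itself: `export_one` is a hypothetical placeholder for the per-article workflow, and `MIN_DELAY` / `MAX_DELAY` are illustrative knobs that default to the 3-5s range above.

```bash
# Hypothetical placeholder for the full per-article workflow.
export_one() {
  echo "exporting $1"
}

# Export each URL in turn, sleeping MIN_DELAY..MAX_DELAY seconds between
# requests (no sleep before the first one). In shells without $RANDOM the
# delay falls back to MIN_DELAY.
export_batch() {
  local min_delay="${MIN_DELAY:-3}" max_delay="${MAX_DELAY:-5}"
  local first=1 url
  for url in "$@"; do
    if [ "$first" -eq 0 ]; then
      sleep $(( min_delay + ${RANDOM:-0} % (max_delay - min_delay + 1) ))
    fi
    first=0
    export_one "$url"
  done
}
```

Usage might look like `export_batch "https://example.com/a" "https://example.com/b"`.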
## Quick Reference
| Task | Command | Success Criteria |
|------|---------|------------------|
| Check deps | `actionbook --version` | Shows version >= 0.9.1 |
| Fetch article | `actionbook browser fetch <url> --wait-hint heavy` | Returns plain text (AI reformats to Markdown in Step 1b) |
| Translate | AI session directly | README_CN.md created |
| Open in Obsidian | `obsidian-cli open "path/index.md"` | File opens in Obsidian |
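The dependency check in the first row could be automated along these lines. This is a sketch, assuming `actionbook --version` prints a dotted `x.y.z` number somewhere in its output (the exact output format is not documented here); `version_ge` is an illustrative helper, not an actionbook feature, and it only handles purely numeric fields.

```bash
# version_ge A B: succeed when dotted version A is >= B.
# Assumes every field is a plain non-negative integer.
version_ge() {
  local a="$1" b="$2" x y
  while [ -n "$a" ] || [ -n "$b" ]; do
    x="${a%%.*}"; y="${b%%.*}"
    # Advance each version string to its next field.
    case "$a" in *.*) a="${a#*.}" ;; *) a="" ;; esac
    case "$b" in *.*) b="${b#*.}" ;; *) b="" ;; esac
    x="${x:-0}"; y="${y:-0}"
    if [ "$x" -gt "$y" ]; then return 0; fi
    if [ "$x" -lt "$y" ]; then return 1; fi
  done
  return 0
}

if command -v actionbook >/dev/null 2>&1; then
  # Assumption: the first x.y.z match in the output is the version.
  installed=$(actionbook --version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)
  if version_ge "${installed:-0}" "0.9.1"; then
    echo "actionbook $installed OK"
  else
    echo "actionbook ${installed:-unknown} is too old; need >= 0.9.1" >&2
  fi
else
  echo "actionbook not found on PATH" >&2
fi
```

Comparing field by field avoids the string-comparison trap where `0.10.0` would sort before `0.9.1`.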
---
## Complete Export Workflow
**Goal:** Export web article to Obsidian directory with images and optional translation
**Success criteria:**
- Article directory created with README.md
- All images downloaded to images/
- index.md navigation file created
- Optional: README_CN.md translation
- Opened in Obsidian (if obsidian-cli available)
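The last criterion is conditional on `obsidian-cli` being installed. One way to express that guard is sketched below; `open_in_obsidian` is a hypothetical wrapper name, and the fallback message is only an illustration.

```bash
# Open the index file in Obsidian when obsidian-cli is available,
# otherwise tell the user to open it manually.
open_in_obsidian() {
  local index_file="$1"
  if command -v obsidian-cli >/dev/null 2>&1; then
    obsidian-cli open "$index_file"
  else
    echo "obsidian-cli not found; open $index_file manually"
  fi
}
```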
---
### Step 1: Fetch Article Content
**Execution:** Direct (Bash)
```bash
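# Fetch article as readability text (with log cleaning).
# $'...' lets Bash expand \x1b to a literal ESC, so the filter does not
# depend on sed understanding \x escapes.
actionbook browser fetch "$URL" --wait-hint heavy 2>/dev/null | \
  sed $'/^[[:space:]]*$/d;/^\x1b\\[/d;/^INFO/d' > /tmp/article_raw.txt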
```
**Success criteria:**
- `/tmp/article_raw.txt` exists and size > 0 bytes
- Content contains the article's main text
The fetch command returns readability-extracted **plain text** (not Markdown).
AI reformatting in Step 1b is always needed to produce proper Markdown.
**Rules:**
- Use `--wait-hint heavy` for Twitter, Medium, dynamic content
- Use `--wait-hint light` for static blogs
- `2>/dev/null` suppresses stderr logs
- `sed` drops empty lines, lines that begin with an ANSI escape sequence, and lines that begin with `INFO`
**Twitter/X Special Handling**
Twitter uses non-semantic HTML, so `fetch` output loses all structure (headings become flat text, code blocks disappear). If the URL contains `x.com` or `twitter.com`, pay extra attention to structure reconstruction in Step 1b. See `references/twitter-handling.md`.
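The choice of `--wait-hint` value can be derived from the URL's host. The sketch below is an assumption-laden illustration: `url_host` and `wait_hint_for` are hypothetical helpers, the host list only covers the sites named in this document, and defaulting unknown sites to `heavy` (overridable through the illustrative `DEFAULT_WAIT_HINT` variable) is a conservative guess rather than documented behavior.

```bash
# Print the host part of a URL without scheme, path, userinfo, port or "www.".
url_host() {
  local rest="${1#*://}"   # drop scheme
  rest="${rest%%/*}"       # drop path
  rest="${rest%%\?*}"      # drop query when there is no path
  rest="${rest##*@}"       # drop userinfo
  rest="${rest%%:*}"       # drop port
  printf '%s\n' "${rest#www.}"
}

# Pick a wait hint: heavy for the dynamic sites named above, otherwise
# DEFAULT_WAIT_HINT (heavy unless overridden, e.g. with light for static blogs).
wait_hint_for() {
  case "$(url_host "$1")" in
    x.com|*.x.com|twitter.com|*.twitter.com|medium.com|*.medium.com) echo heavy ;;
    *) echo "${DEFAULT_WAIT_HINT:-heavy}" ;;
  esac
}
```

Matching on the extracted host rather than on a substring of the URL avoids treating a host such as `linux.com` as `x.com`.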
---
### Step 1b: AI Reformat to Markdown
**Execution:** Direct (AI session)
Read `/tmp/article_raw.txt` and convert the plain text into well-structured Markdown. Save the result to `/tmp/article.md`.
**Reformatting rules:**
- Reconstruct headings (`#`, `##`, `###`) from the text structure
- Preserve original image URLs as `![alt](url)` references
- Format code blocks, lists, tables, and blockquotes
- Keep the original article title as the first `# H1` heading
**Success criteria:**
- `/tmp/article.md` exists and starts with `# <Title>`
- Image URLs are preserved as Markdown image syntax
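These criteria can be checked mechanically before moving on. The following is a sketch only: `validate_article_md` is a hypothetical helper, and it treats the first non-blank line as the start of the file, which is slightly more lenient than the criterion above.

```bash
# Check that a reformatted article starts with an H1 and report how many
# Markdown image references it contains.
validate_article_md() {
  local file="$1" first n
  if [ ! -s "$file" ]; then
    echo "missing or empty: $file" >&2
    return 1
  fi
  # The first non-blank line must be an H1 heading.
  first=$(grep -m 1 -v '^[[:space:]]*$' "$file")
  case "$first" in
    "# "*) ;;
    *) echo "first line is not an H1: $first" >&2; return 1 ;;
  esac
  # Count image references of the form ![alt](url).
  n=$(grep -oE '!\[[^]]*\]\([^)]*\)' "$file" | wc -l | tr -d ' ')
  echo "images: $n"
}
```

Running `validate_article_md /tmp/article.md` and comparing the reported count with the source page gives a quick signal that images survived the reformatting step.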
---
### Step 2: Extract Metadata
**Execution:** Direct (Bash)
```bash
# Extract title (first H1 heading from AI-reformatted markdown)
TITLE=$(grep -m 1 "^# " /tmp/article.md | sed 's/^# //')
# Extract image URLs (filter out data: URLs)
IMAGE_URLS=$(grep -o '!\[[^]]*\]([^)]*)' /tmp/article.md | \
sed -E 's/!\[[^]]*\]\(([^)]*)\)/\1/' | \
grep -v '^data:')
```
**Success criteria:**
- `$TITLE` is non-empty
- `$IMAGE_URLS` count matches expected (use `wc -l`)
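To see what the extraction produces, here is a small worked example that runs the same `grep`/`sed` pipeline on a throwaway sample file. The sample content and the `IMAGE_COUNT` variable are illustrative; counting with `grep -c .` is a swapped-in alternative to `wc -l` that reports 0 when no images are found.

```bash
sample=$(mktemp)
cat > "$sample" <<'EOF'
# Sample Title

![diagram](https://example.com/a/diagram.png)
![inline](data:image/png;base64,AAAA)
Text with ![photo](https://example.com/photo.jpg?w=800) inline.
EOF

# Same extraction as above, pointed at the sample file.
TITLE=$(grep -m 1 "^# " "$sample" | sed 's/^# //')
IMAGE_URLS=$(grep -o '!\[[^]]*\]([^)]*)' "$sample" | \
  sed -E 's/!\[[^]]*\]\(([^)]*)\)/\1/' | \
  grep -v '^data:')

# Count non-empty lines so an empty result yields 0 rather than 1.
IMAGE_COUNT=$(printf '%s\n' "$IMAGE_URLS" | grep -c .)

echo "$TITLE"        # prints: Sample Title
echo "$IMAGE_COUNT"  # prints: 2 (the data: URL is filtered out)
rm -f "$sample"
```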
---
### Step 3: Ask Output Directory
**Execution:** [human]
**Human checkpoint:** Confirm output location before creating files
Ask user: "Where should I save the exported article?"
Suggested paths:
- `~/Work/Write/Articles` (default)
- `~/Documents/Obsidian/Articles`
- `~/Notes/Imported`
- (or custom path from `$output_dir` argument)
**Success criteria:** User confirms output directory
**Artifacts:** `$OUTPUT_DIR` variable set
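When the workflow runs as a plain script rather than inside an AI session, the checkpoint could be approximated with a prompt. This is a sketch under that assumption: `expand_tilde` and `ask_output_dir` are hypothetical helpers, and the fallback path is simply the first suggestion above. A path typed by the user arrives as a literal string, so a leading `~` has to be expanded by hand.

```bash
# Expand a leading "~" or "~/" to $HOME; leave other paths untouched.
expand_tilde() {
  case "$1" in
    "~") printf '%s\n' "$HOME" ;;
    "~/"*) printf '%s\n' "$HOME/${1#"~/"}" ;;
    *) printf '%s\n' "$1" ;;
  esac
}

# Prompt on stderr, read one line from stdin, fall back to the given default.
ask_output_dir() {
  local default="${1:-$HOME/Work/Write/Articles}" answer
  printf 'Where should I save the exported article? [%s] ' "$default" >&2
  IFS= read -r answer || answer=""
  expand_tilde "${answer:-$default}"
}
```

A caller might then set `USER_CONFIRMED_PATH=$(ask_output_dir)`.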
---
### Step 4: Create Directory Structure
**Execution:** Direct (Bash)
```bash
# Use argument if provided, otherwise use confirmed path
OUTPUT_DIR="${output_dir:-$USER_CONFIRMED_PATH}"
# Sanitize title for directory name
SAFE_TITLE=$(echo "$TITLE" | sed 's/[/:*?"<>|]//g' | cut -c1-100 | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
# Create output directory
ARTICLE_DIR="$OUTPUT_DIR/$SAFE_TITLE"
mkdir -p "$ARTICLE_DIR/images"
```
**Success criteria:**
- Directory `$ARTICLE_DIR` exists
- Subdirectory `images/` exists
- Directory is writable
**Rules:**
- Remove special characters: `/ : * ? " < > |`
- Limit title length to 100 characters
- Trim leading/trailing whitespace
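The three rules can be wrapped in a function and exercised on a sample title. This is a sketch: `sanitize_title` and `make_article_dir` are hypothetical names, and `tr -d` is used here as an equivalent way to delete the listed characters.

```bash
# Apply the rules above: strip special characters, cap at 100 characters,
# trim surrounding whitespace.
sanitize_title() {
  printf '%s\n' "$1" \
    | tr -d '/:*?"<>|' \
    | cut -c1-100 \
    | sed 's/^[[:space:]]*//;s/[[:space:]]*$//'
}

# Create <output_dir>/<safe title>/images and print the article directory.
make_article_dir() {
  local output_dir="$1" title="$2" safe_title article_dir
  safe_title=$(sanitize_title "$title")
  # Refuse an empty name so files never land directly in the output directory.
  [ -n "$safe_title" ] || return 1
  article_dir="$output_dir/$safe_title"
  mkdir -p "$article_dir/images" || return 1
  # Success criteria: both directories exist and the article dir is writable.
  [ -d "$article_dir/images" ] && [ -w "$article_dir" ] || return 1
  printf '%s\n' "$article_dir"
}
```

For instance, `sanitize_title '  What is "AI"? A/B: test  '` yields `What is AI AB test`.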
---
### Step 5: Download Images (Parallel if possible)
**Execution:** Direct (Bash)
```bash
counter=1
for url in $IMAGE_URLS; do
  # Use the first recognized extension in the URL; default to .jpg
  ext=$(echo "$url" | grep -oE '\.(jpg|jpeg|png|gif|webp|svg)' | head -n 1)
  ext="${ext:-.jpg}"
  target="$ARTICLE_DIR/images/image_${counter}${ext}"
  curl -L -s "$url" -o "$target"

  # Check file size (detect 0-byte failures)
  if [ ! -s "$target" ]; then
    # Try alternative format (Twitter). Any existing query string is dropped
    # first, and the same filename is reused so Step 6 references still match.
    curl -L -s "${url%%\?*}?format=jpg&name=orig" -o "$target"
  fi

  counter=$((counter + 1))
done
```
**Success criteria:**
- All image files exist and size > 0 bytes
- File count matches `$IMAGE_URLS` count
**Rules:**
- Use `curl -L` to follow redirects
- Check file size after download
- Try alternative formats for Twitter images
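The heading mentions parallel downloads, and the success criteria lend themselves to an automated check. Both are sketched below under stated assumptions: `download_images_parallel` and `verify_images` are hypothetical helpers, the parallel variant starts one background `curl` per image with no concurrency limit and no Twitter fallback, and files are assumed to follow the `image_N.ext` naming used in this step.

```bash
# Start one background curl per URL, then wait for all of them.
# Usage: download_images_parallel "$ARTICLE_DIR/images" $IMAGE_URLS
download_images_parallel() {
  local dir="$1" counter=1 url ext
  shift
  for url in "$@"; do
    ext=$(printf '%s\n' "$url" | grep -oE '\.(jpg|jpeg|png|gif|webp|svg)' | head -n 1)
    curl -L -s "$url" -o "$dir/image_${counter}${ext:-.jpg}" &
    counter=$((counter + 1))
  done
  wait
}

# Succeed only when the directory holds exactly the expected number of
# image_* files and none of them is empty.
verify_images() {
  local dir="$1" expected="$2" found=0 f
  for f in "$dir"/image_*; do
    [ -e "$f" ] || continue
    if [ ! -s "$f" ]; then
      echo "empty file: $f" >&2
      return 1
    fi
    found=$((found + 1))
  done
  if [ "$found" -ne "$expected" ]; then
    echo "expected $expected images, found $found" >&2
    return 1
  fi
}
```

For articles with many images, a bounded approach may be kinder to the remote host than launching every download at once.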
---
### Step 6: Update Image References
**Execution:** Direct (Bash)
```bash
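# Replace remote URLs with local paths
counter=1
for url in $IMAGE_URLS; do
  # Must match the extension chosen in Step 5: first recognized one, default .jpg
  ext=$(echo "$url" | grep -oE '\.(jpg|jpeg|png|gif|webp|svg)' | head -n 1)
  ext="${ext:-.jpg}"
  # Escape regex metacharacters so sed matches the URL literally
  escaped_url=$(printf '%s\n' "$url" | sed 's/[][\.*^$]/\\&/g')
  sed -i.bak "s|$escaped_url|./images/image_${counter}${ext}|g" /tmp/article.md
  counter=$((counter + 1))
done

# Save updated markdown
cp /tmp/article.md "$ARTICLE_DIR/README.md"
# -f: no .bak file exists when the article has no images
rm -f /tmp/article.md.bak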
```
**Success criteria:**
- `README.md` contains `./images/image_N.*` references
- No remote URLs remain in image links
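The second criterion can be verified with a single `grep`. In this sketch, `remaining_remote_images` is a hypothetical helper that prints every image reference still pointing at an `http://` or `https://` URL and returns non-zero when none are left.

```bash
# List image references that still use a remote URL.
# Exit status follows grep: 0 when at least one is found, 1 when none remain.
remaining_remote_images() {
  grep -oE '!\[[^]]*\]\(https?://[^)]*\)' "$1"
}
```

A caller could treat any output from `remaining_remote_images "$ARTICLE_DIR/README.md"` as a sign that a URL was not rewritten, for example because its download failed earlier.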
---
### Step 7: AI Translation (Optional)
**Execution:** Direct (AI session)
**Human checkpoint:** Ask user: "Do you want to translate the article? (y/n)"
If yes:
1. Read `$ARTICLE_DIR/README.md`
2. Translate using AI capabilities (no external API)
3. Write to `$ARTICLE_DIR/README_CN.md` (or other language code)
**Translation Prompt Template:**
```
Translate the following Markdown article to [LANGUAGE] while preserving:
- All Markdown formatting (headings, lists, code blocks, tables)
- Image references exactly as-is: ![alt](path)
- Links and URLs unchanged
- Code blocks and technical terms in original language
Only output the translated Markdown content.
---
[Paste README.md content]
```
**Success criteria:** Translation file exists and size ≈ original ± 20%
**Supported languages:** en, zh, es, fr, de, ja, ko
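The size criterion can be checked by comparing byte counts. The sketch below assumes that "size" means file size in bytes; `size_within_pct` is a hypothetical helper, and the tolerance is a parameter so it can be loosened for language pairs whose encoded sizes differ more.

```bash
# size_within_pct ORIGINAL TRANSLATED [PCT]
# Succeed when the translated file's byte size is within PCT percent
# (default 20) of the original's.
size_within_pct() {
  local orig="$1" translated="$2" pct="${3:-20}" o t diff
  o=$(wc -c < "$orig" | tr -d ' ')
  t=$(wc -c < "$translated" | tr -d ' ')
  # Both files must be non-empty.
  if [ "$o" -le 0 ] || [ "$t" -le 0 ]; then return 1; fi
  if [ "$t" -gt "$o" ]; then diff=$((t - o)); else diff=$((o - t)); fi
  # Integer form of: diff / o <= pct / 100
  [ $((diff * 100)) -le $((o * pct)) ]
}
```

A call such as `size_within_pct "$ARTICLE_DIR/README.md" "$ARTICLE_DIR/README_CN.md"` would flag translations that were truncated or padded.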
---
### Step 8: Create Navigation Index
**Execution:** Direct (Bash)
```bash
# Auto-detect source from URL
case "$URL" in
# Anchor x.com to the host so URLs such as linux.com are not labeled "X"
*://x.com/*|*://*.x.com/*|*twitter.com*) SOURCE="X" ;;
*medium.com*) SOURCE="Medium" ;;
*dev.to*) SOURCE="Dev.to" ;;
*openai.com*) SOURCE="OpenAI Blog" ;;
*substack.com*) SOURCE="Substack" ;;
*github.com*) SOURCE="GitHub" ;;
*) SOURCE=$(echo "$URL" | sed -E 's|^https?://||' | cut -d/ -f1) ;;
esac
# Create index.md
cat > "$ARTICLE_DIR/index.md" <<EOF
# $TITLE
> **Export Date**: $(date +%Y-%m-%d)
> **Original URL**: $URL
> **Source**: $SOURCE
## 📚 Lan