doc-snapshot-agent

Automatically illustrate Markdown documents by turning image markers into browser screenshots or AI-generated images, then writing an image-enriched Markdown output. Use when a document needs screenshots, generated visuals, semantic image placement, or end-to-end document illustration automation.

Designscripts

What this skill does


## When to Use

Load this skill when a Markdown document needs real images (screenshots of live web pages or AI-generated editorial illustrations), or when an already-processed file needs a rerun that only fixes image placement.

Use it when the user asks to:

- add images to a Markdown article
- process a case file with image markers
- capture screenshots for documentation
- generate article visuals and insert them into a document
- rerun or fix image placement in an already processed document

Do not use it for pure text editing, proofreading, or translation — those tasks do not benefit from browser automation or image generation and should be handled directly.

## Architecture

This skill has a single entry point (this file) plus four sibling references for depth. It does not create hidden memory folders, does not persist browser state, and does not send any data beyond what the target workflow requires.

All paths (input cases, output images, illustrated Markdown, cache) resolve under one `{project-root}` the user names at the start of the run. Browser work routes exclusively through the Playwright MCP server; generated images route through a bundled Python script that calls OpenRouter.

## Quick Start

1. **Check Playwright MCP tools** — confirm `mcp__playwright__browser_navigate` and other `mcp__playwright__*` tools are available. If missing, send the user the install snippet from `references/mcp-setup.md` and stop.
2. **Confirm the project root** — ask once; default to `/tmp/doc-snapshot-agent` if the user has no preference.
3. **Inspect existing artifacts** — reuse anything already on disk (see Incremental Execution).
4. **Parse the case file** — merge markers from heading form, HTML-comment form, and the Image Summary table.
5. **Capture, generate, place, write README** — follow the Workflow section.

## Quick Reference

| Topic | File |
|-------|------|
| Install Playwright MCP for each client, grant permissions, runtime setup | `references/mcp-setup.md` |
| Navigate, snapshot, login, capture, verify — full browser loop and tool patterns | `references/browser-capture.md` |
| Build and maintain site-specific navigation knowledge | `references/site-explorer.md` |
| Prompt construction and script usage for generated images | `references/image-generation.md` |
| Image generation CLI | `scripts/generate_image.py` |

## Approach Selection

| Situation | Best path | Why |
|-----------|-----------|-----|
| Article already has screenshots in `output/{article-id}/raw/` and the user only wants the Markdown rebuilt | Skip capture, rerun Step 5 (Illustrated Markdown) | Browser work is expensive; Markdown regeneration is cheap |
| Marker type is `screenshot` and the page is publicly reachable | Playwright MCP navigate → snapshot → capture | Reliable, inspectable, handles JS rendering |
| Marker type is `screenshot` and the page is behind auth | Playwright MCP with `PLAYWRIGHT_CRED_*` env vars | Keeps secrets out of prompts and the transcript |
| Marker type is `generated` (editorial, hero, conceptual) | `scripts/generate_image.py` via OpenRouter | Screenshots cannot render conceptual imagery |
| Marker landed on the wrong paragraph | Reparse case file, reapply semantic placement | Re-capturing won't fix placement bugs |
| Required MCP tools are missing in the runtime | Stop, point user to `references/mcp-setup.md` | Workflow cannot proceed without MCP |
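
The table can be read as a small decision function. The sketch below is one possible encoding, not the skill's actual logic: the function name, the keyword flags, the returned labels, and the order in which the rows are checked are all assumptions made for illustration. The missing-MCP check comes first only because the workflow requires Playwright MCP to be verified at the start of every execution.

```python
def select_approach(marker_type, *, mcp_available=True, raw_exists=False,
                    rebuild_only=False, behind_auth=False, misplaced=False):
    """Map a situation from the table to a hypothetical path label."""
    if not mcp_available:
        # Required MCP tools are missing: stop and point to references/mcp-setup.md.
        return "stop-and-setup-mcp"
    if rebuild_only and raw_exists:
        # Screenshots already exist in raw/: skip capture, rebuild the Markdown only.
        return "rebuild-markdown"
    if misplaced:
        # Wrong paragraph: re-capturing will not help, so reparse and re-place.
        return "reparse-and-replace"
    if marker_type == "generated":
        return "generate-image"
    if marker_type == "screenshot":
        return "capture-with-credentials" if behind_auth else "capture-public"
    raise ValueError(f"unknown marker type: {marker_type!r}")
```

For example, `select_approach("screenshot", behind_auth=True)` selects the credentialed capture path, while `select_approach("generated")` routes to the image generation script.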

## Workflow

### Step 0: Verify Playwright MCP

Run this check at the start of **every** execution, not just the first time.

1. Detect tools whose names start with `mcp__playwright__`. Required (after the prefix): `browser_navigate`, `browser_snapshot`, `browser_take_screenshot`.
2. If they are missing, stop and hand the user the matching install snippet from `references/mcp-setup.md` (Claude Code, Codex, VS Code/Cursor/Kiro, Claude Desktop, or standalone). Include the `permissions.allow: ["mcp__playwright__*"]` note for Claude Code and Codex.
3. After the user installs and restarts the client, resume from here rather than restarting the run.

Do **not** substitute direct Playwright library calls or any browser tool that lacks the `mcp__playwright__` prefix. If the prefix is missing, the call does not go through the MCP server.
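
The prefix check can be sketched as follows. How a given runtime exposes its tool list is not specified here, so `available_tools` is a hypothetical iterable of tool-name strings; only the prefix and the three required tool names come from the steps above.

```python
PREFIX = "mcp__playwright__"
REQUIRED_TOOLS = ("browser_navigate", "browser_snapshot", "browser_take_screenshot")


def missing_playwright_tools(available_tools):
    """Return the fully prefixed required tools that are not available.

    `available_tools` is an assumed input: any iterable of tool names
    reported by the client. An empty result means the check passed.
    """
    available = set(available_tools)
    return [PREFIX + name for name in REQUIRED_TOOLS if PREFIX + name not in available]
```

A tool named `browser_navigate` without the prefix does not satisfy the check, which matches the rule that unprefixed browser tools must not be substituted.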

### Step 0.5: Confirm the project root

Ask once:

> Which directory should I use as the project root for this run?

- If the user provides a path, use it as `{project-root}`.
- If the user says "no preference", skips, or does not answer, default to `/tmp/doc-snapshot-agent`.
- Create the directory if it does not exist.

All subsequent paths (`cases/`, `output/`, `.cache/`, `scripts/`, `references/`) resolve under `{project-root}/`.

Recommended layout inside `{project-root}/`:

```text
{project-root}/
├── cases/
│   └── {article-id}.md
├── output/
│   ├── {article-id}/
│   │   ├── raw/
│   │   │   ├── A1_example.png
│   │   │   └── A2_example.png
│   │   ├── A1_example.png
│   │   ├── A2_example.png
│   │   └── README.md
│   └── markdowns/
│       └── {article-id}.md
└── .cache/
    └── screenshots/
        └── {article-id}/
```

Conventions:
- `cases/` holds the source Markdown.
- `output/{article-id}/raw/` holds original browser screenshots — **never overwrite** files here.
- `output/{article-id}/` holds post-processed assets that the final Markdown references.
- `output/markdowns/` holds the final illustrated Markdown.
- `.cache/screenshots/` holds reusable screenshot cache entries.

If the user specifies a different layout, follow their instruction.
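
A minimal sketch of creating the recommended layout, assuming the default root from above. The function name `prepare_layout` and the keys of the returned dictionary are illustrative and not part of the skill.

```python
from pathlib import Path

DEFAULT_ROOT = Path("/tmp/doc-snapshot-agent")


def prepare_layout(article_id, project_root=None):
    """Create the recommended directories and return their paths."""
    root = Path(project_root) if project_root else DEFAULT_ROOT
    paths = {
        "cases": root / "cases",
        "assets": root / "output" / article_id,
        "raw": root / "output" / article_id / "raw",
        "markdowns": root / "output" / "markdowns",
        "cache": root / ".cache" / "screenshots" / article_id,
    }
    for path in paths.values():
        # exist_ok=True only creates missing directories; files already
        # present (including originals under raw/) are left untouched.
        path.mkdir(parents=True, exist_ok=True)
    return paths
```

Because directory creation is idempotent, calling this at the start of every run is safe and does not disturb artifacts from earlier runs.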

### Step 1: Parse the case file

Merge image requirements from three sources:

1. inline heading-based screenshot markers
2. inline `<!-- IMAGE: ... -->` markers
3. the `Image Summary` table

For each image, record: type (`screenshot` or `generated`), filename, marker id if present, description or purpose, source URL if present, post-processing instruction if present, exact inline location if present, and whether semantic placement is still required. Also detect the target websites referenced by the article.
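
As an illustration of the second source, the sketch below locates `<!-- IMAGE: ... -->` comments and records where each one sits. The field syntax inside a marker is not defined in this section, so the code returns the raw payload text instead of guessing at its structure; the function name and the dictionary keys are assumptions.

```python
import re

# Matches <!-- IMAGE: ... --> and captures the text between the colon and "-->".
IMAGE_MARKER = re.compile(r"<!--\s*IMAGE:\s*(.*?)\s*-->", re.DOTALL)


def find_image_markers(markdown_text):
    """Return one record per HTML-comment image marker, in document order."""
    markers = []
    for match in IMAGE_MARKER.finditer(markdown_text):
        # 1-based line number of the marker's first character.
        line_number = markdown_text.count("\n", 0, match.start()) + 1
        markers.append({"line": line_number, "payload": match.group(1)})
    return markers
```

The recorded line number gives the exact inline location, so a marker found this way would not need semantic placement later.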

### Step 2: Prepare the environment

- create output directories
- check the screenshot cache for reusable entries
- load credentials from environment variables (pattern: `PLAYWRIGHT_CRED_{SERVICE}_{FIELD}`)
- re-confirm Playwright MCP tools are present
- if the Chromium runtime is missing, run `npx playwright install chromium` (see `references/mcp-setup.md`)
- if the target flow needs login/signup/invite/verification and the required information is not already supplied, pause and ask the user before taking any account-specific action
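
The credential pattern can be read from the environment with a short helper. This is a sketch under two assumptions that the pattern itself does not settle: the field is taken to be the last underscore-separated segment, and names are lowercased for lookup. The service name `FELO` in the usage note is only an example.

```python
import os

CRED_PREFIX = "PLAYWRIGHT_CRED_"


def load_credentials(environ=None):
    """Group PLAYWRIGHT_CRED_{SERVICE}_{FIELD} variables by service."""
    env = os.environ if environ is None else environ
    credentials = {}
    for name, value in env.items():
        if not name.startswith(CRED_PREFIX):
            continue
        # Assumption: everything before the last underscore is the service.
        service, separator, field = name[len(CRED_PREFIX):].rpartition("_")
        if not separator or not service or not field:
            continue  # malformed name, e.g. PLAYWRIGHT_CRED_FELO
        credentials.setdefault(service.lower(), {})[field.lower()] = value
    return credentials
```

With `PLAYWRIGHT_CRED_FELO_USERNAME` and `PLAYWRIGHT_CRED_FELO_PASSWORD` set, the result contains a `felo` entry with `username` and `password` keys, and the values never need to appear in a prompt.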

### Step 2.5: Understand the target site

Bad screenshots usually come from landing on the wrong page, not from the wrong capture command. Before capturing:

1. Check for existing site knowledge under `$IMAGE_AGENT_SITE_KNOWLEDGE_DIR/` and `$IMAGE_AGENT_SITE_LEARNING_DIR/`.
2. Derive a stable `site-key` from the domain (`memclaw.me` → `memclaw`, `app.felo.ai` → `felo`).
3. If `{site-key}.md` exists and is recent, read it before browsing.
4. If knowledge is missing or stale, run a structured site exploration — see `references/site-explorer.md` — and save findings for reuse.
5. Map every screenshot description to a specific page or UI state: target URL or click path, required visible elements, scroll/tab/expand actions needed.
6. Append new knowledge to the site knowledge files whenever browsing discovers something worth remembering.
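
One way to derive the `site-key` in step 2 is to take the label just before the top-level domain. This reproduces both examples above but is only a heuristic: it is an assumption about the intended rule, and it would give the wrong answer for multi-part suffixes such as `co.uk`.

```python
from urllib.parse import urlparse


def derive_site_key(url_or_domain):
    """Return the second-to-last host label, e.g. app.felo.ai -> felo."""
    # Accept both full URLs and bare domains; a bare domain has no netloc
    # until it is prefixed with "//".
    host = urlparse(url_or_domain).hostname or urlparse("//" + url_or_domain).hostname
    if not host:
        raise ValueError(f"cannot extract a host from {url_or_domain!r}")
    labels = host.split(".")
    return labels[-2] if len(labels) >= 2 else labels[0]
```

The key would then select the matching `{site-key}.md` knowledge file before any browsing starts.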

### Step 3: Capture browser screenshots

Follow `references/browser-capture.md` for the full navigate → snapshot → act → wait → capture → verify loop and the concrete tool patterns.

Typical flow:
- open the target website
- log in if required (credentials from env)
- navigate to the correct page or UI state
- wait for key content to load
- resize the viewport if the requested layout needs it
- save screenshots to `{project-root}/output/{article-id}/raw/`

Naming rule:
- if a marker id exists, save as `{marker-id}_{filename}` (e.g. `A1_workspace-dashboard.png`)
- otherwise use the origina
Files: 9
Size: 57.0 KB
Complexity: 73/100
Category: Design
