dev-browser
Browser automation and scraping with persistent page state. Use when users ask to navigate websites, fill forms, take screenshots, extract web data, test web apps, or automate browser workflows. Trigger phrases include "go to [url]", "click on", "fill out the form", "take a screenshot", "scrape", "automate", "test the website", "log into", or any browser interaction request.
What this skill does
# Dev Browser
Browser automation that maintains page state across script executions. Write small, focused scripts to accomplish tasks incrementally.
## Setup
Two modes available. Use Standalone Mode by default. Use Extension Mode only
when the user intentionally wants their existing Chrome profile, logged-in
sessions, or installed extensions.
### Standalone Mode (Default)
Launches a new Chromium browser. Start the server and verify readiness:
```bash
cd skills/dev-browser && ./server.sh &
# Verify server is ready (poll HTTP endpoint — do NOT watch stdout)
for i in $(seq 1 30); do
curl -s http://localhost:9222 > /dev/null 2>&1 && echo "Server ready" && break
sleep 1
done
```
Add `--headless` to `./server.sh --headless &` for headless mode.
If the server is already running from a previous session, the curl check succeeds immediately — no restart needed.
### Extension Mode
Connects to the user's existing Chrome browser (preserving logged-in sessions):
```bash
cd skills/dev-browser && npm i && npm run start-extension &
```
Wait for `Extension connected` in the output. If the extension hasn't connected, tell the user to install it from: https://github.com/SawyerHood/dev-browser/releases
For detailed server lifecycle management, see [Server Management](references/server-management.md).
## Choosing Your Approach
- **Local/source-available sites**: Read the source code first to write selectors directly
- **Unknown page layouts**: Use `getAISnapshot()` to discover elements — see [ARIA Snapshots](references/aria-snapshots.md)
- **Visual feedback**: Take screenshots to see what the user sees
## Writing Scripts
> **Run all scripts from `skills/dev-browser/` directory.** The `@/` import alias requires this.
Execute scripts inline using heredocs:
```bash
cd skills/dev-browser && npx tsx <<'EOF'
import { connect, waitForPageLoad } from "@/client.js";
const client = await connect();
const page = await client.page("example", { viewport: { width: 1920, height: 1080 } });
await page.goto("https://example.com");
await waitForPageLoad(page);
console.log({ title: await page.title(), url: page.url() });
await client.disconnect();
EOF
```
Write to `tmp/` files only when the script needs reuse or is complex.
### Key Principles
1. **Small scripts**: Each script does ONE thing (navigate, click, fill, check)
2. **Evaluate state**: Log/return state at the end to decide next steps
3. **Descriptive page names**: Use `"checkout"`, `"login"`, not `"main"`
4. **Always disconnect**: `await client.disconnect()` — pages persist on server
5. **Plain JS in evaluate**: `page.evaluate()` runs in the browser — no TypeScript syntax
```typescript
// Correct: plain JavaScript in browser context
const text = await page.evaluate(() => {
return document.body.innerText;
});
// Wrong: TypeScript syntax breaks at runtime
const text = await page.evaluate(() => {
const el: HTMLElement = document.body; // Type annotation fails in browser!
return el.innerText;
});
```
## Workflow Loop
Follow this pattern for complex tasks:
1. **Write a script** to perform one action
2. **Run it** and observe the output
3. **Evaluate** — did it work? What's the current state?
4. **Decide** — is the task complete or do we need another script?
5. **Repeat** until done
## Inspecting Page State
### Screenshots
```typescript
await page.screenshot({ path: "tmp/screenshot.png" });
await page.screenshot({ path: "tmp/full.png", fullPage: true });
```
### ARIA Snapshots
Use `getAISnapshot()` to discover page elements without knowing selectors. Returns a YAML accessibility tree with `[ref=eN]` markers for interactive elements:
```typescript
const snapshot = await client.getAISnapshot("mypage");
console.log(snapshot);
// Interact with a discovered element
const element = await client.selectSnapshotRef("mypage", "e5");
await element.click();
```
For the full guide on interpreting snapshots, refs, and the discover-then-interact pattern, see [ARIA Snapshots](references/aria-snapshots.md).
## Waiting
```typescript
import { waitForPageLoad } from "@/client.js";
await waitForPageLoad(page); // After navigation
await page.waitForSelector(".results"); // For specific elements
await page.waitForURL("**/success"); // For specific URL
```
## Error Recovery
Page state persists after failures. Debug with:
```bash
cd skills/dev-browser && npx tsx <<'EOF'
import { connect } from "@/client.js";
const client = await connect();
const page = await client.page("mypage");
await page.screenshot({ path: "tmp/debug.png" });
console.log({
url: page.url(),
title: await page.title(),
bodyText: await page.textContent("body").then((t) => t?.slice(0, 200)),
});
await client.disconnect();
EOF
```
## Client API Quick Reference
```typescript
const client = await connect();
const page = await client.page("name"); // Get or create page
const page = await client.page("name", { viewport: {} }); // With viewport
const pages = await client.list(); // List all pages
await client.close("name"); // Close a page
await client.disconnect(); // Disconnect (pages persist)
const snapshot = await client.getAISnapshot("name"); // Accessibility tree
const el = await client.selectSnapshotRef("name", "e5"); // Element by ref
const info = await client.getServerInfo(); // Server mode info
```
The `page` object is a standard Playwright `Page`. For the full API reference with all parameters, return types, and options, see [Client API](references/client-api.md).
## Scraping
For large datasets (followers, posts, search results), intercept and replay network requests rather than scrolling the DOM. See [Scraping Guide](references/scraping.md).
## Reference Documentation
- **[Server Management](references/server-management.md)** — Starting, stopping, verifying readiness, troubleshooting
- **[Client API](references/client-api.md)** — Full API reference with all methods, parameters, and types
- **[ARIA Snapshots](references/aria-snapshots.md)** — Element discovery, ref system, interaction patterns
- **[Scraping Guide](references/scraping.md)** — Network interception, API replay, pagination
Related in Code Review
gstack
IncludedFast headless browser for QA testing and site dogfooding. Navigate pages, interact with elements, verify state, diff before/after, take annotated screenshots, test responsive layouts, forms, uploads, dialogs, and capture bug evidence. Use when asked to open or test a site, verify a deployment, dogfood a user flow, or file a bug with screenshots. (gstack)
startup-due-diligence
IncludedLegal due diligence review for seed-stage and Series A startups (US, Delaware C-Corp focus). Supports both investor and founder perspectives. Capabilities include: (1) Interactive document review and issue spotting; (2) Document request list generation; (3) Cap table and SAFE/convertible note analysis; (4) Red flag identification with severity ratings; (5) Diligence report generation. TRIGGERS: due diligence, DD, startup investment, cap table review, Series A, seed round, investor diligence, legal review startup, SAFE analysis, convertible note, 409A, founder vesting.
interview-master
IncludedThis skill should be used when the user asks to "generate interview questions", "prepare for interview", "optimize resume", "conduct mock interview", "analyze git commits for resume", "generate resume from code", "review my resume", or mentions interview preparation, career assistance, or extracting project experience from git history. Provides comprehensive interview and career development guidance for both job seekers and interviewers.
fix-issue
IncludedFixes GitHub issues using parallel analysis agents for root cause investigation, code exploration, and regression detection. Reads issue context from gh CLI, searches codebase and memory for related patterns, generates a fix with tests, and links the resolution back to the issue via PR. Includes prevention analysis to avoid recurrence. Use when debugging errors, resolving regressions, fixing bugs, or triaging issues.
sf-apex
IncludedGenerates and reviews Salesforce Apex code with 150-point scoring. TRIGGER when: user writes, reviews, or fixes Apex classes, triggers, test classes, batch/queueable/schedulable jobs, or touches .cls/.trigger files. DO NOT TRIGGER when: LWC JavaScript (use sf-lwc), Flow XML (use sf-flow), SOQL-only queries (use sf-soql), or non-Salesforce code.
swift-development
IncludedComprehensive Swift development for building, testing, and deploying iOS/macOS applications. Use when Claude needs to: (1) Build Swift packages or Xcode projects from command line, (2) Run tests with XCTest or Swift Testing framework, (3) Manage iOS simulators with simctl, (4) Handle code signing, provisioning profiles, and app distribution, (5) Format or lint Swift code with SwiftFormat/SwiftLint, (6) Work with Swift Package Manager (SPM), (7) Implement Swift 6 concurrency patterns (async/await, actors, Sendable), (8) Create SwiftUI views with MVVM architecture, (9) Set up Core Data or SwiftData persistence, or any other Swift/iOS/macOS development tasks.