Claude
Skills
Sign in
Back

browsing

Included with Lifetime
$97 forever

Use when you need direct browser control - teaches Chrome DevTools Protocol for controlling existing browser sessions, multi-tab management, form automation, and content extraction via use_browser MCP tool

AI Agents

What this skill does


# Browsing with Chrome Direct

## Overview

Control Chrome via DevTools Protocol using the `use_browser` MCP tool. Single unified interface with auto-starting Chrome.

**Announce:** "I'm using the browsing skill to control Chrome."

## When to Use

**Use this when:**
- Controlling authenticated sessions
- Managing multiple tabs in running browser
- Playwright MCP unavailable or excessive

**Use Playwright MCP when:**
- Need fresh browser instances
- Generating screenshots/PDFs
- Prefer higher-level abstractions

## Auto-Capture

Every DOM action (navigate, click, type, select, eval, keyboard_press, hover, drag_drop, double_click, right_click, file_upload) automatically saves:
- `{prefix}.png` — viewport screenshot
- `{prefix}.md` — page content as structured markdown
- `{prefix}.html` — full rendered DOM
- `{prefix}-console.txt` — browser console messages

Files are saved to the session directory with sequential prefixes (001-navigate, 002-click, etc.). You must check these before using extract or screenshot actions.

## The use_browser Tool

Single MCP tool with action-based interface. Chrome auto-starts on first use.

**Parameters:**
- `action` (required): Operation to perform
- `selector` (optional): CSS or XPath selector for element operations
- `payload` (optional): Action-specific data (string or object)
- `timeout` (optional): Timeout in ms for await operations (default: 5000)

**Active tab**: Every action operates on the current `activeTab`. Use `switch_tab` to change it.

## Actions Reference

### Navigation
- **navigate**: Navigate to URL
  - `payload`: URL string
  - Example: `{action: "navigate", payload: "https://example.com"}`

- **await_element**: Wait for element to appear
  - `selector`: CSS selector
  - `timeout`: Max wait time in ms
  - Example: `{action: "await_element", selector: ".loaded", timeout: 10000}`

- **await_text**: Wait for text to appear
  - `payload`: Text to wait for
  - Example: `{action: "await_text", payload: "Welcome"}`

### Interaction
- **click**: Click element
  - `selector`: CSS selector
  - Example: `{action: "click", selector: "button.submit"}`

- **type**: Text input
  - `selector`: Optional — clicks to focus first
  - `payload`: Text to type (`\t`=Tab, `\n`=Enter)
  - Example: `{action: "type", selector: "#email", payload: "[email protected]"}`

- **double_click**: Double-click element (fires dblclick event)
  - `selector`: CSS selector
  - Example: `{action: "double_click", selector: ".item"}`

- **right_click**: Right-click element (fires contextmenu event)
  - `selector`: CSS selector
  - Example: `{action: "right_click", selector: ".row"}`

- **select**: Select dropdown option
  - `selector`: CSS selector
  - `payload`: Option value(s)
  - Example: `{action: "select", selector: "select[name=state]", payload: "CA"}`

- **keyboard_press**: Press special keys (Tab, Enter, Escape, Arrow keys, F1-F12)
  - `payload`: Key name (string) or `{"key": "Tab", "modifiers": {"shift": true, "ctrl": false, "alt": false, "meta": false}}`
  - Example: `{action: "keyboard_press", payload: "Tab"}`
  - Example with modifiers: `{action: "keyboard_press", payload: {"key": "Tab", "modifiers": {"shift": true}}}`

### Mouse Actions (CDP-Level)
These use CDP Input.dispatchMouseEvent, bypassing synthetic event restrictions.

- **hover**: Move mouse over element (CSS :hover, tooltips, menus)
  - `selector`: CSS selector
  - Example: `{action: "hover", selector: ".menu-trigger"}`

- **drag_drop**: Drag element to target (native drag-and-drop via CDP)
  - `selector`: Source element
  - `payload`: Target selector or JSON coordinates `{"x":N,"y":N}`
  - Example: `{action: "drag_drop", selector: ".card", payload: ".column-2"}`

- **mouse_move**: Move mouse to coordinates
  - `payload`: JSON `{"x":N,"y":N}` (optional: `steps`, `fromX`, `fromY` for smooth movement)
  - Example: `{action: "mouse_move", payload: "{\"x\":100,\"y\":200}"}`

- **scroll**: Scroll via mouse wheel events
  - `payload`: Direction (up/down/left/right) or JSON `{"deltaX":N,"deltaY":N}`
  - `selector`: Optional — scroll within element
  - Example: `{action: "scroll", payload: "down"}`

### File Upload
- **file_upload**: Set files on input[type=file] elements (can't be done via JavaScript)
  - `selector`: File input element
  - `payload`: File path or JSON `{"files":["/path/a.pdf","/path/b.jpg"]}`
  - Example: `{action: "file_upload", selector: "#upload", payload: "/tmp/doc.pdf"}`

### Extraction
- **extract**: Get page content
  - `payload`: Format ('markdown'|'text'|'html')
  - `selector`: Optional - limit to element
  - Example: `{action: "extract", payload: "markdown"}`
  - Example: `{action: "extract", payload: "text", selector: "h1"}`

- **attr**: Get element attribute
  - `selector`: CSS selector
  - `payload`: Attribute name
  - Example: `{action: "attr", selector: "a.download", payload: "href"}`

- **eval**: Execute JavaScript
  - `payload`: JavaScript code
  - Example: `{action: "eval", payload: "document.title"}`

### Export
- **screenshot**: Capture screenshot of a specific element
  - `payload`: Filename
  - `selector`: Optional - screenshot specific element
  - Viewport screenshots are auto-captured after every DOM action. Use this only when you need a specific element.
  - Example: `{action: "screenshot", payload: "/tmp/chart.png", selector: ".chart"}`

### Tab Management
- **list_tabs**: List all open tabs
  - Example: `{action: "list_tabs"}`

- **new_tab**: Create new tab
  - Example: `{action: "new_tab"}`

- **close_tab**: Close the active tab
  - Example: `{action: "close_tab"}`

- **switch_tab**: Switch the active tab (sticky — stays until changed)
  - `payload`: Tab index (number), URL substring, or title substring
  - Example: `{action: "switch_tab", payload: 1}` (by index)
  - Example: `{action: "switch_tab", payload: "example.com"}` (by URL substring)
  - Example: `{action: "switch_tab", payload: "GitHub"}` (by title substring)

### Browser Mode Control
- **show_browser**: Make browser window visible (headed mode)
  - Example: `{action: "show_browser"}`
  - ⚠️ **WARNING**: Restarts Chrome, reloads pages via GET, loses POST state

- **hide_browser**: Switch to headless mode (invisible browser)
  - Example: `{action: "hide_browser"}`
  - ⚠️ **WARNING**: Restarts Chrome, reloads pages via GET, loses POST state

- **browser_mode**: Check current browser mode, port, and profile
  - Example: `{action: "browser_mode"}`
  - Returns: `{"headless": true|false, "mode": "headless"|"headed", "running": true|false, "port": 9222, "profile": "name", "profileDir": "/path"}`

### Profile Management
- **set_profile**: Change Chrome profile (must kill Chrome first)
  - Example: `{action: "set_profile", "payload": "browser-user"}`
  - ⚠️ **WARNING**: Chrome must be stopped first
  - **Side effect**: marks the profile as explicit, opting out of auto-disambiguation (see below)

- **get_profile**: Get current profile name and directory
  - Example: `{action: "get_profile"}`
  - Returns: `{"profile": "name", "profileDir": "/path"}`

**Default behavior**: Chrome starts in **headless mode** with **"superpowers-chrome" profile** on a **dynamically allocated port** (range 9222-12111). Override the port with `CHROME_WS_PORT`; override the profile with `CHROME_WS_PROFILE`.

**Auto-disambiguation across parallel MCPs**:
When two MCP servers start on the same host with the default profile, the first claims `superpowers-chrome` (port 9222) and later ones silently fall through to `superpowers-chrome-2` (port 9223), `superpowers-chrome-3`, etc. Each MCP drives its own Chrome with its own profile dir; they don't fight over `activeTab`. The bridge tracks ownership via a lock file at `~/.cache/superpowers/browser-profiles/<profile>.mcp.lock`; stale locks (dead PIDs) are reclaimed automatically.

To opt **out** of disambiguation — e.g., to intentionally share Chrome between a long-lived `chrome-ws` CLI session and your MCP — set the profile name explicitly:
- Env var: `CHROME_WS
Files: 51
Size: 292.9 KB
Complexity: 63/100
Category: AI Agents

Related in AI Agents