atr-browser
Control browser, automate browser interactions, navigate to URLs, click on webpages, fill forms, take screenshots, inspect webpages, web scraping with browser, test websites manually, or interact with web pages programmatically using ATR browser server mode. Pairs with the atr-computer skill when a workflow needs the OS desktop too (drag-and-drop from the file manager, system dialogs, native apps).
What this skill does
> **Companion skill — `atr-computer`** controls the OS desktop (mouse, keyboard, screen, windows, app launch, plus an in-process LLM agent via `atr computer ask`). Load both when: > > - The browser action needs an OS file picker, drag from the file manager, or anything outside the page (e.g. authentication via 1Password desktop app). > - You want a single high-level instruction to span web + desktop — `atr computer ask "<instruction>"` can call `atr browser` along the way. > > Browser daemon listens on port 9333; computer daemon on 9334. Both can run simultaneously. # ATR Browser Automation Skill This skill provides browser automation capabilities through ATR's browser server mode. The browser server runs as a daemon process and accepts CLI commands for browser control. ## Architecture ``` Claude Code --> atr CLI (client) --> ATR Server + Browser ``` The browser runs in visible (non-headless) mode by default for debugging and verification. ## Getting Started ### Step 1: Check Browser Status and Start if Needed Before any browser operations, verify the browser server is running: ```bash atr browser status ``` If the server is not running, start it: ```bash atr browser start ``` The server stores state at `~/.atr/browser.state` which allows subsequent commands to discover the endpoint automatically. ### Step 2: Navigate and Interact Once running, use navigation and interaction commands to control the browser. ## Command Categories ### Lifecycle Commands | Command | Description | |---------|-------------| | `atr browser start [--port PORT]` | Start browser daemon (default port: 9333) | | `atr browser stop` | Stop browser daemon | | `atr browser status` | Check if browser is running | ### Navigation Commands | Command | Description | |---------|-------------| | `atr browser navigate <url>` | Navigate to URL | | `atr browser back` | Go back in history | | `atr browser forward` | Go forward in history | | `atr browser reload` | Reload current page | ### Page Management Commands | Command | Description | |---------|-------------| | `atr browser new-page [url]` | Open new tab | | `atr browser list-pages` | List all tabs | | `atr browser select-page <index>` | Switch to tab (0-based) | | `atr browser close-page <index>` | Close tab | ### Interaction Commands | Command | Description | |---------|-------------| | `atr browser click <target> [--double]` | Click element (use --double for double-click) | | `atr browser fill <target> <value>` | Type into input field | | `atr browser hover <target>` | Hover over element | | `atr browser press-key <key>` | Press keyboard key (e.g., Enter, Tab, Control+A) | | `atr browser drag <from> <to>` | Drag element | | `atr browser wait <selector> [--timeout] [--visible]` | Wait for element to appear | | `atr browser scroll --selector "<sel>" [--y N] [--to-bottom]` | Scroll inside an element | | `atr browser download-images "<sel>" [--output-dir] [--fallback-screenshot]` | Download/screenshot images within elements | | `atr browser viewport [W H] [--preset mobile\|tablet\|desktop\|wide]` | Get or set viewport size | | `atr browser batch [--file F] [--on-error stop\|continue\|retry:N]` | Execute multiple commands from stdin/file | ### Recording Commands | Command | Description | |---------|-------------| | `atr browser record [--url URL] [-o FILE]` | Record browser interactions as a behavior test | ### Inspection Commands | Command | Description | |---------|-------------| | `atr browser snapshot [--verbose]` | Get page elements with UIDs | | `atr browser screenshot --file [--full] [-s SELECTOR]` | Capture screenshot (saves to /tmp/) | | `atr browser screenshot --file --selector-all "<sel>"` | Screenshot all matching elements | | `atr browser computed-styles "<selector>" [--properties]` | Get computed CSS styles for single element | | `atr browser computed-styles --selector-all "<sel>"` | Get computed CSS styles for all matching elements | | `atr browser computed-styles-diff "<sel>" --against N` | Compare styles between pages | | `atr browser text "<selector>" [--flat\|--links\|--headings]` | Extract text content | | `atr browser font-check "<font-family>"` | Check if font is loaded and rendering | | `atr browser clean-snapshot "<selector>" [--depth N] [--max-length N]` | Get cleaned DOM subtree (no noise/tracking attrs) | | `atr browser computed-styles --selector "h1" --selector "p"` | Batch computed styles for multiple selectors | | `atr browser computed-styles-diff --selector "h1" --selector "p" --against 0` | Batch style diff with overall score | | `atr browser html` | Get page HTML | | `atr browser url` | Get current URL | | `atr browser title` | Get page title | | `atr browser eval <script>` | Execute JavaScript | | `atr browser ask "<question>"` | Ask a question about the current page | **Screenshot Note:** Use `--file` to save screenshots to `/tmp/` with a timestamped filename (e.g., `/tmp/atr-screenshot-20240105-103045.png`). Add `--full` for full-page screenshots. Use `--selector` / `-s` to screenshot a specific element by CSS selector (e.g., `-s "header"`, `-s "#nav"`, `-s "main > section:nth-child(2)"`). Combine `--selector` with `--full` to capture an element's full scrollable height. Use `--selector-all` to screenshot every matching element as numbered PNGs — elements that fail or timeout are skipped (use `--timeout <ms>` to control per-element timeout, default 30s). Without `--file`, returns base64-encoded image data. ### Debugging Commands | Command | Description | |---------|-------------| | `atr browser console [--limit N]` | Get console messages (default: 50) | | `atr browser network [--limit N]` | Get network requests (default: 50) | | `atr browser errors` | Get failed requests | ## Workflow Pattern Follow this workflow for browser automation tasks: 1. **Ensure Server Running** ```bash atr browser status || atr browser start ``` 2. **Navigate to Target** ```bash atr browser navigate https://example.com ``` 3. **Inspect Page Elements** ```bash atr browser snapshot ``` This returns elements with unique IDs (UIDs) like `e0`, `e1`, etc. 4. **Interact with Elements** Target elements by: - Text content: `"Sign In"` - UID from snapshot: `e5` - CSS selector: `.submit-button` 5. **Verify Results** ```bash atr browser url atr browser title atr browser screenshot --file ``` 6. **Cleanup When Done** ```bash atr browser stop ``` ## Asking Questions About a Page Use `atr browser ask` when you need specific information from a page without flooding your context with raw HTML or snapshot data. A lightweight sub-agent inspects the page using multiple tools and returns a concise text answer. ```bash atr browser ask "What is the main heading on this page?" atr browser ask "How many items are in the navigation menu?" atr browser ask "Is there a login form on this page?" ``` Prefer `ask` over `html` or `snapshot` when: - You need a specific fact, not the full page structure - You want to keep your context clean for subsequent reasoning - The answer can be expressed as a short text response ## Recording Browser Interactions Use `atr browser record` to capture user interactions and output a `.test.txt` behavior test file. A floating overlay appears in the browser showing recorded steps in real time. ```bash # Record interactions on a page atr browser record --url https://example.com -o repro-steps.test.txt # Record on the already-open page atr browser record -o flow.test.txt ``` Stop recording with Ctrl+C in the terminal or the "Stop" button in the browser overlay. The recorder captures clicks, form fills, keyboard shortcuts, navigation, scrolling, and select changes. It handles shadow DOM (web components) and generates stable CSS selectors. Password fields are automatically masked. The output `.test.txt` file can be replayed with `atr run --behavior repro-steps.test.txt`. ## Using JSON Output Add `--json` flag for
Related in Code Review
gstack
IncludedFast headless browser for QA testing and site dogfooding. Navigate pages, interact with elements, verify state, diff before/after, take annotated screenshots, test responsive layouts, forms, uploads, dialogs, and capture bug evidence. Use when asked to open or test a site, verify a deployment, dogfood a user flow, or file a bug with screenshots. (gstack)
startup-due-diligence
IncludedLegal due diligence review for seed-stage and Series A startups (US, Delaware C-Corp focus). Supports both investor and founder perspectives. Capabilities include: (1) Interactive document review and issue spotting; (2) Document request list generation; (3) Cap table and SAFE/convertible note analysis; (4) Red flag identification with severity ratings; (5) Diligence report generation. TRIGGERS: due diligence, DD, startup investment, cap table review, Series A, seed round, investor diligence, legal review startup, SAFE analysis, convertible note, 409A, founder vesting.
interview-master
IncludedThis skill should be used when the user asks to "generate interview questions", "prepare for interview", "optimize resume", "conduct mock interview", "analyze git commits for resume", "generate resume from code", "review my resume", or mentions interview preparation, career assistance, or extracting project experience from git history. Provides comprehensive interview and career development guidance for both job seekers and interviewers.
fix-issue
IncludedFixes GitHub issues using parallel analysis agents for root cause investigation, code exploration, and regression detection. Reads issue context from gh CLI, searches codebase and memory for related patterns, generates a fix with tests, and links the resolution back to the issue via PR. Includes prevention analysis to avoid recurrence. Use when debugging errors, resolving regressions, fixing bugs, or triaging issues.
sf-apex
IncludedGenerates and reviews Salesforce Apex code with 150-point scoring. TRIGGER when: user writes, reviews, or fixes Apex classes, triggers, test classes, batch/queueable/schedulable jobs, or touches .cls/.trigger files. DO NOT TRIGGER when: LWC JavaScript (use sf-lwc), Flow XML (use sf-flow), SOQL-only queries (use sf-soql), or non-Salesforce code.
swift-development
IncludedComprehensive Swift development for building, testing, and deploying iOS/macOS applications. Use when Claude needs to: (1) Build Swift packages or Xcode projects from command line, (2) Run tests with XCTest or Swift Testing framework, (3) Manage iOS simulators with simctl, (4) Handle code signing, provisioning profiles, and app distribution, (5) Format or lint Swift code with SwiftFormat/SwiftLint, (6) Work with Swift Package Manager (SPM), (7) Implement Swift 6 concurrency patterns (async/await, actors, Sendable), (8) Create SwiftUI views with MVVM architecture, (9) Set up Core Data or SwiftData persistence, or any other Swift/iOS/macOS development tasks.