just-scrape

Included with Lifetime

$97 forever

Search, scrape, crawl, extract structured data, and monitor web pages via the ScrapeGraph AI CLI. Use when the user asks to search the web, scrape a webpage, grab content from a URL, extract JSON from a site, crawl documentation or site sections, monitor a page for changes, inspect request history, check ScrapeGraph credits, or validate API setup.

Backend & APIs

What this skill does


# just-scrape CLI

Search, scrape, crawl, extract structured JSON, and monitor page changes using the just-scrape CLI.

Run `just-scrape --help` or `just-scrape <command> --help` for full option details.

If the task is to integrate ScrapeGraph AI into application code, add `SGAI_API_KEY` to a project, or choose endpoint usage in product code, inspect the project first and use the ScrapeGraph AI SDK/API docs directly instead of this CLI skill.

## Prerequisites

Must be installed and authenticated. Check with `just-scrape validate` and `just-scrape credits`.

```bash
command -v just-scrape >/dev/null 2>&1 || npm install -g just-scrape@latest
just-scrape validate
just-scrape credits
```

- **API key**: Set `SGAI_API_KEY`, use a `.env` file, use `~/.scrapegraphai/config.json`, or complete the interactive prompt.
- **Credits**: Remaining ScrapeGraph AI credits. Each operation consumes credits.

Before doing real work, verify the setup with one small request:

```bash
mkdir -p .just-scrape
just-scrape scrape "https://example.com" --json > .just-scrape/install-check.json
```

```bash
just-scrape search "query" --num-results 3 --json > .just-scrape/search-check.json
```

## Workflow

Follow this escalation pattern:

1. **Search** - No specific URL yet. Find pages, answer questions, discover sources.
2. **Scrape** - Have a URL. Extract markdown, html, screenshots, links, images, summaries, or branding.
3. **Extract** - Need structured JSON from a known URL with an AI prompt and optional schema.
4. **Crawl** - Need bulk content from an entire site section.
5. **Monitor** - Need scheduled page-change tracking with optional webhook notifications.

| Need                        | Command    | When                                       |
| --------------------------- | ---------- | ------------------------------------------ |
| Find pages on a topic       | `search`   | No specific URL yet                        |
| Get a page's content        | `scrape`   | Have a URL, need one or more page formats  |
| AI-powered data extraction  | `extract`  | Need structured data from a known URL      |
| Bulk extract a site section | `crawl`    | Need many pages or docs sections           |
| Track changes over time     | `monitor`  | Need recurring scraping and webhooks       |
| Inspect prior requests      | `history`  | Need past request IDs, status, or payloads |
| Check credit balance        | `credits`  | Need remaining API credits                 |
| Validate API setup          | `validate` | Need health check and API key validation   |

For detailed command reference, run `just-scrape <command> --help`.

**Scrape vs extract:**

- Use `scrape` for raw page formats: `markdown`, `html`, `screenshot`, `branding`, `links`, `images`, `summary`.
- Use `scrape -f json -p "<prompt>"` or `extract -p "<prompt>"` for AI-structured output.
- Use `extract` when the task is only structured data. Use `scrape` when mixed formats are needed in one call.

**Avoid redundant fetches:**

- `search -p` can extract structured data from search results. Do not re-scrape those URLs unless results are incomplete.
- `crawl` already fetches per-page formats. Do not re-scrape every crawled URL unless a second pass is required.
- Check `.just-scrape/` for existing data before fetching again.

## Commands

### Search

```bash
just-scrape search "query"
just-scrape search "query" --num-results 10
just-scrape search "query" -p "Extract provider names and prices"
just-scrape search "query" -p "Extract provider names and prices" --schema '<json-schema>'
just-scrape search "query" --format html
just-scrape search "query" --country us
just-scrape search "query" --time-range past_week
```

Time ranges: `past_hour`, `past_24_hours`, `past_week`, `past_month`, `past_year`.

### Scrape

```bash
just-scrape scrape "<url>"
just-scrape scrape "<url>" -f markdown
just-scrape scrape "<url>" -f html
just-scrape scrape "<url>" -f markdown,html,links --json
just-scrape scrape "<url>" -f screenshot
just-scrape scrape "<url>" -f branding
just-scrape scrape "<url>" -f summary
just-scrape scrape "<url>" -f json -p "Extract all products"
just-scrape scrape "<url>" -f json -p "Extract all products" --schema '<json-schema>'
just-scrape scrape "<url>" --html-mode reader
just-scrape scrape "<url>" --mode js --stealth --scrolls 5
just-scrape scrape "<url>" --country DE
```

Formats: `markdown`, `html`, `screenshot`, `branding`, `links`, `images`, `summary`, `json`.

### Extract

```bash
just-scrape extract "<url>" -p "Extract product names and prices"
just-scrape extract "<url>" -p "Extract headlines and dates" --schema '<json-schema>'
just-scrape extract "<url>" -p "Extract visible items" --scrolls 5
just-scrape extract "<url>" -p "Extract account stats" --cookies "{\"session\":\"$SESSION_COOKIE\"}" --stealth
just-scrape extract "<url>" -p "Extract table rows" --headers "{\"Authorization\":\"Bearer $API_TOKEN\"}"
just-scrape extract "<url>" -p "Extract article data" --html-mode reader
just-scrape extract "<url>" -p "Extract localized prices" --country DE
```

Use `--schema` for a strict output shape.

### Crawl

```bash
just-scrape crawl "<url>"
just-scrape crawl "<url>" -f markdown,links
just-scrape crawl "<url>" --max-pages 50 --max-depth 3
just-scrape crawl "<url>" --max-links-per-page 20
just-scrape crawl "<url>" --allow-external
just-scrape crawl "<url>" --include-patterns '["^https://example\\.com/docs/.*"]'
just-scrape crawl "<url>" --exclude-patterns '[".*\\.pdf$"]'
just-scrape crawl "<url>" --mode js --stealth
```

Set `--max-pages`, `--max-depth`, and include/exclude patterns before broad crawls.

### Monitor

```bash
just-scrape monitor create --url "<url>" --interval 1h --name "Pricing tracker" -f markdown
just-scrape monitor create --url "<url>" --interval "0 * * * *" --webhook-url "$WEBHOOK_URL"
just-scrape monitor list
just-scrape monitor get --id <cronId>
just-scrape monitor update --id <cronId> --interval 30m
just-scrape monitor activity --id <cronId> --limit 50
just-scrape monitor pause --id <cronId>
just-scrape monitor resume --id <cronId>
just-scrape monitor delete --id <cronId>
```

Intervals accept cron expressions or shorthands such as `30m`, `1h`, and `1d`.

### History

```bash
just-scrape history
just-scrape history scrape
just-scrape history extract --json
just-scrape history crawl --page-size 100 --json
just-scrape history scrape <request-id> --json
```

Services: `scrape`, `extract`, `search`, `crawl`, `monitor`.

### Credits and Validate

```bash
just-scrape credits
just-scrape credits --json
just-scrape validate
just-scrape validate --json
```

## When to Load References

- **Searching the web or finding sources first** -> use `just-scrape search`
- **Scraping a known URL** -> use `just-scrape scrape`
- **AI-powered structured extraction from a known URL** -> use `just-scrape extract`
- **Bulk extraction from a docs section or site** -> use `just-scrape crawl`
- **Recurring page-change tracking** -> use `just-scrape monitor`
- **Install, auth, or setup problems** -> run `just-scrape validate` and inspect `SGAI_API_KEY`
- **Output handling and safe file-reading patterns** -> use `.just-scrape/` and incremental reads
- **Integrating ScrapeGraph AI into an app, adding `SGAI_API_KEY` to `.env`, or choosing endpoint usage in product code** -> use SDK/API docs, not this CLI flow

## Output & Organization

Unless the user specifies to return in context, write results to `.just-scrape/` with shell redirection. Add `.just-scrape/` to `.gitignore`. Always quote URLs - shell interprets `?` and `&` as special characters.

```bash
just-scrape search "react hooks" --json > .just-scrape/search-react-hooks.json
just-scrape scrape "<url>" --json > .just-scrape/page.json
just-scrape extract "<url>" -p "Extract title and author" --json > .just-scrape/extract-title-author.json
```

Naming conventions:

```text
.just-scrape/search-{query}.json
.just-scrape/{site}-{path}-scrape.json
.just-scrape/{site}-{path}-ext

Files: 1

Size: 11.8 KB

Complexity: 22/100

Category: Backend & APIs

Source: https://github.com/scrapegraphai/just-scrape/tree/main/skills/just-scrape

Related in Backend & APIs

jfrog

Included

Interact with the JFrog Platform via the JFrog CLI and REST/GraphQL APIs. Use this skill when the user wants to manage Artifactory repositories, upload or download artifacts, manage builds, configure permissions, manage users and groups, work with access tokens, configure JFrog CLI servers, search artifacts, manage properties, set up replication, manage JFrog Projects, run security audits or scans, look up CVE details, query exposures scan results from JFrog Advanced Security, manage release bundles and lifecycle operations, aggregate or export platform data, or perform any JFrog Platform administration task. Also use when the user mentions jf, jfrog, artifactory, xray, distribution, evidence, apptrust, onemodel, graphql, workers, mission control, curation, advanced security, exposures, or any JFrog product name.

Backend & APIsscripts

cupynumeric-migration-readiness

Included

Pre-migration readiness assessor for porting NumPy to cuPyNumeric. Use BEFORE substantial porting work begins when the user asks whether code will scale on GPU, whether they should migrate to cuPyNumeric, which NumPy patterns transfer cleanly, what must be refactored before porting, or mentions pre-port assessment, scaling analysis, or refactor planning. Inspect the user's source code, look up NumPy usage, cross-reference the cuPyNumeric API support manifest, and distinguish distributed-scaling-friendly patterns from blockers such as unsupported APIs, scalar synchronization, host round-trips, Python/object-heavy control flow, shape/data-dependent branching, and in-place mutation hazards. Produce a verdict of READY, LIGHT REFACTOR, SIGNIFICANT REFACTOR, or NOT RECOMMENDED, with concrete refactor pointers.

Backend & APIsscripts

alibabacloud-data-agent-skill

Included

Invoke Alibaba Cloud Apsara Data Agent for Analytics via CLI to perform natural language-driven data analysis on enterprise databases. Data Agent for Analytics is an intelligent data analysis agent developed by Alibaba Cloud Database team for enterprise users. It automatically completes requirement analysis, data understanding, analysis insights, and report generation based on natural language descriptions. This tool supports: discovering data resources (instances/databases/tables) managed in DMS, initiating query or deep analysis sessions, real-time progress tracking, and retrieving analysis conclusions and generated reports. Use this Skill when users need to query databases, analyze data trends, generate data reports, ask questions in natural language, or mention "Data Agent", "data analysis", "database query", "SQL analysis", "data insights".

Backend & APIsscripts

token-optimizer

Included

Reduce OpenClaw token usage and API costs through smart model routing, heartbeat optimization, budget tracking, and native 2026.2.15 features (session pruning, bootstrap size limits, cache TTL alignment). Use when token costs are high, API rate limits are being hit, or hosting multiple agents at scale. The 4 executable scripts (context_optimizer, model_router, heartbeat_optimizer, token_tracker) are local-only — no network requests, no subprocess calls, no system modifications. Reference files (PROVIDERS.md, config-patches.json) document optional multi-provider strategies that require external API keys and network access if you choose to use them. See SECURITY.md for full breakdown.

Backend & APIsscripts

resend-cli

Included

Use this skill when the task is specifically about operating Resend from an AI agent, terminal session, or CI job via the official resend CLI: installing/authenticating the CLI, sending/listing/updating/cancelling emails, batch sends, domains and DNS, webhooks and local listeners, inbound receiving, contacts, topics, segments, broadcasts, templates, API keys, profiles, or debugging Resend CLI/API failures. Trigger on mentions of Resend CLI, `resend`, `resend doctor`, `resend emails send`, `resend domains`, `resend webhooks listen`, `resend emails receiving`, or agent-friendly terminal automation.

Backend & APIsscripts

alibabacloud-odps-maxframe-coding

Included

Use this skill for MaxFrame SDK development and documentation navigation on Alibaba Cloud MaxCompute (ODPS). Helps answer MaxFrame API, concept, official example, and supported pandas API questions; create data processing programs; read/write MaxCompute tables; debug jobs (remote or local); and build custom DPE runtime images. Trigger when users mention MaxFrame, MaxCompute with MaxFrame, ODPS table processing, DPE runtime, MaxFrame docs/examples, DataFrame/Tensor operations, or GPU runtime setup. Works for both English and Chinese queries about Alibaba Cloud data processing with MaxFrame.

Backend & APIsscripts