wandb-plot
Download and generate plots from Weights & Biases runs. Use when you need to: - List projects you have access to - List runs in a W&B project - Inspect available metrics for a run - Download existing plot images from a run - Generate line plots from metric history (loss, accuracy, etc.)
What this skill does
# W&B Plot Skill
## MANDATORY Setup (Run First)
**IMPORTANT:** Before running ANY script, you MUST execute this setup block to ensure the correct working directory and virtual environment.
```bash
# Determine skill directory (Claude Code plugin or Codex/local)
if [ -n "${CLAUDE_PLUGIN_ROOT}" ]; then
SKILL_DIR="${CLAUDE_PLUGIN_ROOT}/skills/wandb-plot"
elif [ -d "${HOME}/.codex/wandb-plot-skill/skills/wandb-plot" ]; then
SKILL_DIR="${HOME}/.codex/wandb-plot-skill/skills/wandb-plot"
else
SKILL_DIR="$(pwd)"
fi
cd "$SKILL_DIR"
# Create/activate venv and install deps (uv preferred, pip fallback)
if [ ! -d ".venv" ]; then
if command -v uv &> /dev/null; then
uv venv .venv && . .venv/bin/activate && uv pip install -e .
else
python3 -m venv .venv && . .venv/bin/activate && pip install -e .
fi
else
. .venv/bin/activate
fi
```
After setup completes, all `python3 scripts/*.py` commands will work correctly from this directory.
## Prereqs
- Auth: set `WANDB_API_KEY` environment variable (recommended) or run `wandb login`.
## Tools (Scripts)
### `scripts/list_projects.py`
**Inputs**
- `--entity <entity>` (optional; defaults to current user or org)
- `--limit <n>` (optional, default: 100)
- `--json` (optional)
**Output**
- Stdout table (default) or JSON list (with `--json`), where each item includes:
- `name`, `entity`, `description`, `created_at`, `url`
### `scripts/list_runs.py`
**Inputs**
- `<entity/project>` (required)
- `--state <state>` (optional)
- `--limit <n>` (optional, default: 100)
- `--json` (optional)
**Output**
- Stdout table (default) or JSON list (with `--json`), where each item includes:
- `id`, `name`, `state`, `created_at`, `summary_metrics`, `tags`
### `scripts/list_metrics.py`
**Inputs**
- `<entity/project>` (required)
- `<run_id>` (required; run id or run name)
- `--include-system` (optional; include `_step`, `_timestamp`, etc.)
- `--json` (optional)
**Output**
- Stdout table (default) or JSON dict (with `--json`) keyed by metric name.
- Each metric entry includes `type`, `count`, `non_null_count`, and for numeric metrics: `min`, `max`, `mean`, `std`.
### `scripts/download_plots.py`
**Inputs**
- `<entity/project>` (required)
- `<run_id>` (required)
- `--pattern "<glob>"` (optional; defaults to common image paths)
- `--output <dir>` (optional; overrides default output location)
- `--force` (optional; re-download if file exists)
**Output**
- Writes downloaded images to the output directory (flat filenames).
- Updates/creates `metadata.json` in the same directory.
- Stdout lists downloaded/skipped files; returns an empty list (and prints “No plot files found…”) when nothing matches.
### `scripts/generate_plots.py`
**Inputs**
- `<entity/project>` (required)
- `<run_id>` (required; comma-separated for multiple runs)
- `--metrics "<m1,m2,...>"` (required; metric names as shown by `list_metrics.py`)
- `--all-metrics` (optional; plot all metrics)
- `--full-res` (optional; uses full `scan_history`)
- `--smooth <n>` (optional; rolling average window)
- `--output <dir>` (optional)
- `--ema-weight <w>` (optional; default: 0.99)
- `--viewport-scale <n>` (optional; default: 1000)
- `--no-ema` (optional; disable EMA smoothing)
- `--group-by-prefix` (optional; group outputs by metric prefix)
- `--include-system` (optional; include system metrics like `_step` and `system/*` with `--all-metrics`)
**Output**
- Writes `<metric>.png` for each generated plot plus `metadata.json` to the output directory.
- Stdout lists generated files; missing metrics raise an error listing available metrics.
## Workflow
```bash
python3 scripts/list_projects.py --limit 10
python3 scripts/list_runs.py <entity/project> --limit 10
python3 scripts/list_metrics.py <entity/project> <run_id>
python3 scripts/download_plots.py <entity/project> <run_id>
python3 scripts/generate_plots.py <entity/project> <run_id> --metrics loss,accuracy
python3 scripts/generate_plots.py <entity/project> run1,run2 --metrics loss --ema-weight 0.99 --viewport-scale 1000
python3 scripts/generate_plots.py <entity/project> run1,run2 --metrics rewards/total_mean,rewards/total_std --output /path/to/folder --group-by-prefix
python3 scripts/generate_plots.py <entity/project> run1,run2 --all-metrics --output /path/to/folder --group-by-prefix
```
## Outputs
Default output directory:
```
wandb_plots/<entity>_<project>/<run_name>_<run_id>/
- *.png
- metadata.json
```
If `download_plots.py` finds no images, fall back to `generate_plots.py`.
Related in Data & Analytics
clawarr-suite
IncludedComprehensive management for self-hosted media stacks (Sonarr, Radarr, Lidarr, Readarr, Prowlarr, Bazarr, Overseerr, Plex, Tautulli, SABnzbd, Recyclarr, Unpackerr, Notifiarr, Maintainerr, Kometa, FlareSolverr). Deep library exploration, analytics, dashboard generation, content management, request handling, subtitle management, indexer control, download monitoring, quality profile sync, library cleanup automation, notification routing, collection/overlay management, and media tracker integration (Trakt, Letterboxd, Simkl).
querying-soql
IncludedSOQL query generation, optimization, and analysis with 100-point scoring. Use this skill when the user needs SOQL/SOSL authoring or optimization: natural-language-to-query generation, relationship queries, aggregates, query-plan analysis, and performance or safety improvements for Salesforce queries. TRIGGER when: user writes, optimizes, or debugs SOQL/SOSL queries, touches .soql files, or asks about relationship queries, aggregates, or query performance. DO NOT TRIGGER when: bulk data operations (use handling-sf-data), Apex DML logic (use generating-apex), or report/dashboard queries.
app-store-optimization
IncludedApp Store Optimization (ASO) toolkit for researching keywords, analyzing competitor rankings, generating metadata suggestions, and improving app visibility on Apple App Store and Google Play Store. Use when the user asks about ASO, app store rankings, app metadata, app titles and descriptions, app store listings, app visibility, or mobile app marketing on iOS or Android. Supports keyword research and scoring, competitor keyword analysis, metadata optimization, A/B test planning, launch checklists, and tracking ranking changes.
habit-flow
IncludedAI-powered atomic habit tracker with natural language logging, streak tracking, smart reminders, and coaching. Use for creating habits, logging completions naturally ("I meditated today"), viewing progress, and getting personalized coaching.
app-store-optimization
IncludedApp Store Optimization (ASO) toolkit for researching keywords, analyzing competitor rankings, generating metadata suggestions, and improving app visibility on Apple App Store and Google Play Store. Use when the user asks about ASO, app store rankings, app metadata, app titles and descriptions, app store listings, app visibility, or mobile app marketing on iOS or Android. Supports keyword research and scoring, competitor keyword analysis, metadata optimization, A/B test planning, launch checklists, and tracking ranking changes.
visualizing-data
IncludedBuilds dashboards, reports, and data-driven interfaces requiring charts, graphs, or visual analytics. Provides systematic framework for selecting appropriate visualizations based on data characteristics and analytical purpose. Includes 24+ visualization types organized by purpose (trends, comparisons, distributions, relationships, flows, hierarchies, geospatial), accessibility patterns (WCAG 2.1 AA compliance), colorblind-safe palettes, and performance optimization strategies. Use when creating visualizations, choosing chart types, displaying data graphically, or designing data interfaces.