
physical-ai-defect-image-generation


Use when the user wants to orchestrate defect image generation, run associated setup, or handle outputs on OSMO. The Day 0 path handles cold-start with USD-to-ROI, image-edit augmentation, and AnomalyGen to create initial PCBA datasets. The Day 1 path performs inference and labeling on real images. This skill helps with first-time asset setup, creation of finetuning checkpoints, and configuring deployment. Trigger keywords: defect image generation, dig workflow, dig pipeline, defect image detection workflow, aoi pipeline, aoi anomalygen, usd2roi anomalygen, day 0 pcba, day 1 pcba, day 1 real-photo alignment, day 1 manual roi, metal surface anomaly, glass defect, anomalygen finetune, setup_pcb, setup_metal, setup_glass, setup_pretrained, dig setup, dig datasets, dig pretrained checkpoint, dig image-edit endpoint.



# Physical AI Defect Image Generation Workflow Orchestrator


## Table of Contents

- [Supported Flows](#supported-flows)
- [Disambiguation](#disambiguation-handle-vague-requests-before-committing) (full table in `references/disambiguation.md`)
- [Step 0: Select Flow, Cookbook, and Gather Inputs](#step-0-select-flow-cookbook-and-gather-inputs)
- [Common Preconditions](#common-preconditions-all-flows) (long-form in `references/preconditions.md`)
- [Flow walkthroughs](#flow-walkthroughs) (one entry per flow; details in `references/flows/`)
- [OSMO Monitoring](#osmo-monitoring)
- [Supporting files](#supporting-files)

End-to-end orchestration of defect image generation, augmentation, and labeling pipelines for AOI (Automated Optical Inspection) datasets. Every flow has a canonical OSMO workflow YAML in `assets/configs/` that chains all steps non-interactively. Use-case cookbooks in `assets/cookbooks/` provide PCBA usd2roi/image-edit configs and AnomalyGen training configs for PCBA, metal surface, and glass inspection. This skill governs flow selection, data handoffs, and submit commands; component internals live in each component's `SKILL.md`.

## Supported Flows

| Flow | Entry point | OSMO YAML | Steps | Use cases |
|------|-------------|-----------|-------|-----------|
| **Day 0 — Texture Defects** | CAD scene USD (`pcba_target.yaml` ships in the cookbook) | `texture_defect_generation_day0.yaml` | usd2roi (scan_grid + per-cell ROI crops) → image-edit augmentation (`nvidia/Qwen-Image-Edit-NVPCB-OVSL2SL`) → finetune-or-passthrough → infer (anomalygen labels inline, **including missing-component**) | PCBA |
| **Day 0 — Good Image** *(usd2roi + Image-Edit)* | CAD scene USD + per-board `pcba_target.yaml` / `day0_image.yaml` / `day0_crop.yaml` | `good_image_generation.yaml` | usd2roi-render (scan_grid + per-cell ROI crop) → Qwen Image-Edit (OVSL2SL appearance transfer) | PCBA clean-image set (ChangeNet golden halves, finetune positives, real-photo pairing) |
| **Day 0 — Structural Defects** | CAD scene USD + per-board `pcba_target.yaml` | `structural_defect_generation.yaml` | isaac-render (pose defects: shift / tombstone / sideflip) + per-component crop (single pod) → Qwen Image-Edit (OVSL2SL lighting transfer; pose geometry preserved) | PCBA pose-defect set; ChangeNet defect halves |
| **Day 1 — Infer + Label (real-photo alignment, DEFAULT)** | CAD-derived USD + real PCBA photo (both ship in `datasets/pcb/assets`) | `texture_defect_generation_day1_real_alignment.yaml` | usd2roi day-1 render → MI register → per-ROI crop → yq-render config → finetune-or-passthrough → infer (anomalygen labels inline) | **Default PCBA Day 1.** Raw AOI screenshot of any usd2roi-supported board |
| **Day 1 — Infer + Label (manual ROI)** | Pre-captured clean images + ROI masks (NGC artifact or user upload) | `texture_defect_generation_day1_manual_roi.yaml` | yq-render config → finetune-or-passthrough → infer (anomalygen labels inline) | Metal surface, glass (no USD/real-photo flow); PCBA **only when user explicitly asks** for pre-captured ROI experimentation |
| **Finetune Only** | Labeled anomaly URL artifact | `finetune.yaml` | yq-render config → finetune (validate_dataset → prep_testcase → torchrun) | Any use case; produces checkpoint for Day 0 or Day 1. Requires raw training data under `<dig_url_root>/datasets/<usecase>/raw` (see `assets/configs/setup/setup_<usecase>.yaml`). |

All flows run on OSMO. Day 0 flows require `image_edit_endpoint` (Qwen Image-Edit OVSL2SL — existing URL or local deploy from `references/nim/`); Finetune Only has no external endpoints.

### Pick the right workflow for the user's defect class

| Defect class | Workflow | Mechanism |
|---|---|---|
| Clean / good / scan-grid / `normal_img + cad_mask` pairs | `good_image_generation.yaml` | usd2roi-render + Qwen Image-Edit |
| Texture defects (solder bridge, scratch, discoloration) **AND missing-component** (handled natively by AnomalyGen, NOT structural) | `texture_defect_generation_day0.yaml` | Qwen Image-Edit + AnomalyGen AMP/SDG |
| Structural / pose defects (tombstone, shift, sideflip) | `structural_defect_generation.yaml` | IsaacSim pose perturbation |
| Day 1 inference + labeling on a real image | `texture_defect_generation_day1_real_alignment.yaml` (PCBA default) or `texture_defect_generation_day1_manual_roi.yaml` (metal/glass; PCBA only when user explicitly asks for pre-captured ROI / skip-alignment) | usd2roi day-1 registration (real-alignment) or direct inference (manual-ROI) |

ChangeNet golden/defect pairs: submit `good_image_generation.yaml` + `structural_defect_generation.yaml` with the same `--set name=` (two-submission pairing convention).
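
The workflow-selection table and the ChangeNet pairing convention above can be sketched as a small lookup. This is only an illustration: the workflow YAML names come from the table, while the dictionary keys, the use-case strings (`pcb`, `metal`, `glass`), and the helper names are hypothetical and not part of the skill.

```python
# Workflow YAML names are taken from the table above. The keys and helper
# names below are hypothetical, chosen only for this sketch.
WORKFLOW_BY_DEFECT_CLASS = {
    "good": "good_image_generation.yaml",
    "texture": "texture_defect_generation_day0.yaml",
    # Missing-component is handled by AnomalyGen, not the structural flow.
    "missing_component": "texture_defect_generation_day0.yaml",
    "structural": "structural_defect_generation.yaml",
}


def day1_workflow(usecase: str, manual_roi_requested: bool = False) -> str:
    """PCBA defaults to real-photo alignment; everything else uses manual ROI."""
    if usecase == "pcb" and not manual_roi_requested:
        return "texture_defect_generation_day1_real_alignment.yaml"
    return "texture_defect_generation_day1_manual_roi.yaml"


def changenet_pair(name: str) -> list[tuple[str, str]]:
    """Return (workflow, --set argument) for the golden and defect submissions.

    Both submissions share the same name value, per the pairing convention.
    """
    set_arg = f"name={name}"
    return [
        (WORKFLOW_BY_DEFECT_CLASS["good"], set_arg),
        (WORKFLOW_BY_DEFECT_CLASS["structural"], set_arg),
    ]
```

The sketch deliberately stops at the `--set` argument and does not build a full submit command, since the exact command line is defined by the flow walkthroughs rather than by this section.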

> **Day 0 and Day 1 share the same downstream shape**: a Jinja-gated `finetune-job` (omitted when `use_pretrained_checkpoint=true`) feeding `anomaly-infer`. Day 0 prepends `usd2roi-render` + `augment-image-edit`; Day 1 starts from `<dig_url_root>/datasets/<usecase>/raw`. Per-stage detail: each flow's walkthrough.
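
As a rough illustration of the shared downstream shape described in the note above, the stage list can be modelled as a prefix plus a common tail. The job names are the ones quoted in the note; the function name and the `day0`/`day1` flow labels are hypothetical, and the sketch ignores the extra registration and crop steps of the real-alignment flow.

```python
def stage_chain(flow: str, use_pretrained_checkpoint: bool = True) -> list[str]:
    """Return the ordered job names for a texture flow (simplified sketch).

    finetune-job is omitted when use_pretrained_checkpoint is true, which
    mirrors the Jinja gate described in the note.
    """
    tail = ["anomaly-infer"]
    if not use_pretrained_checkpoint:
        tail = ["finetune-job"] + tail

    if flow == "day0":
        # Day 0 prepends rendering and image-edit augmentation.
        return ["usd2roi-render", "augment-image-edit"] + tail
    if flow == "day1":
        # Day 1 starts from existing raw data, so only the shared tail runs.
        return tail
    raise ValueError(f"unknown flow label: {flow!r}")
```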

### User intent → knob mapping

**Every OV flow is two-stage**: `crop_max_emit=N` caps the *final* per-cell crops (stage 2); `render_patches=N` caps *raw* scan-grid patches (stage 1, each yielding multiple crops). **DO NOT auto-map "generate N images" → `render_patches=N`** (wrong stage). `crop_max_emit` does not exist on `structural_defect_generation.yaml` (one crop per component — use `render_patches`) or `texture_defect_generation_day1_real_alignment.yaml` (narrow via the cookbook's `crop.classes` whitelist). Full knob table, smoke-test recipes, defaults, caveats: `references/knob_mapping.md`.
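
The rule above can be sketched as a guard that maps a requested number of final crops to a knob only where that knob exists. The helper name is hypothetical. The text only names the two flows that lack `crop_max_emit`, so treating the other two usd2roi-based workflows as supporting it is an inference here; `references/knob_mapping.md` remains the authority.

```python
# Assumption: these two workflows accept crop_max_emit. The source only
# states which flows do NOT have the knob.
SUPPORTS_CROP_MAX_EMIT = {
    "good_image_generation.yaml",
    "texture_defect_generation_day0.yaml",
}


def knob_for_final_crops(workflow: str, n: int) -> dict[str, int]:
    """Map 'N final crops' to a knob, refusing where no direct mapping exists."""
    if workflow == "structural_defect_generation.yaml":
        raise ValueError(
            "no crop_max_emit here: size via render_patches using the calibration table"
        )
    if workflow == "texture_defect_generation_day1_real_alignment.yaml":
        raise ValueError(
            "no crop_max_emit here: narrow via the cookbook's crop.classes whitelist"
        )
    if workflow not in SUPPORTS_CROP_MAX_EMIT:
        raise ValueError(f"unknown workflow for this sketch: {workflow!r}")
    # Stage 2 cap on final per-cell crops. Never render_patches, which caps stage 1.
    return {"crop_max_emit": n}
```

Raising instead of guessing keeps the ambiguous cases visible, so they can be routed to `AskUserQuestion` rather than silently mapped to the wrong stage.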

### Structural-defect sizing (no `crop_max_emit` knob exists)

Structural output is **non-linear in `render_patches`**: doubling the frame count yields roughly 1.6–1.7× as many crops, not 2×. Don't use `crop_max_emit` (no effect) or `render_patches=0` (fails). Validated yield table + target-size formula: `references/flows/structural_defect_generation.md` §"Sizing the output". For ambiguous "generate N images", surface the calibration table via `AskUserQuestion`.
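
To get a feel for what this sub-linear scaling implies, here is a back-of-the-envelope sketch. It assumes the per-doubling ratio stays constant, i.e. a power law `crops ∝ render_patches**p` with `p = log2(ratio)`. That model, the helper names, and the 1.65 default (midpoint of the quoted range) are assumptions of this sketch, not the validated formula from the reference file, which takes precedence.

```python
import math


def doubling_exponent(doubling_ratio: float) -> float:
    """Exponent p such that doubling render_patches multiplies crops by the ratio."""
    if not 1.0 < doubling_ratio <= 2.0:
        raise ValueError("expected a ratio in (1, 2]")
    return math.log2(doubling_ratio)


def estimate_render_patches(
    target_crops: float,
    ref_patches: float,
    ref_crops: float,
    doubling_ratio: float = 1.65,  # assumed midpoint of the ~1.6-1.7x range
) -> float:
    """Extrapolate from one observed (render_patches, crops) calibration point."""
    if ref_patches <= 0 or ref_crops <= 0 or target_crops <= 0:
        raise ValueError("all counts must be positive (render_patches=0 fails)")
    p = doubling_exponent(doubling_ratio)
    # Invert crops = ref_crops * (patches / ref_patches) ** p for patches.
    return ref_patches * (target_crops / ref_crops) ** (1.0 / p)
```

Under this assumption the exponent lands between about 0.68 and 0.77, so doubling the crop target needs noticeably more than twice the frames. Round the estimate up, and treat it as a starting point to check against the calibration table.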

---

## Disambiguation: handle vague requests before committing

Underspecified prompts ("generate me some images", "run the PCBA flow", "give me defects") **must not** be resolved by silently assuming a flow / usecase / knob mapping. When intent is ambiguous, pause and present candidate interpretations via `AskUserQuestion` (2–4 mutually exclusive options) before submitting. Disambiguate the load-bearing choices: **which flow, which use case, what stage a count refers to, finetune vs. passthrough**.

Settled defaults you should NOT disambiguate: PCBA Day 1 → real-alignment; board → `0603_H100`; image-edit endpoint → local cluster service (`references/nim/`); `use_pretrained_checkpoint=true`; Day 1 real-alignment `default_spatial_dependency=cad` (fall back to `free` only when CAD masks are unavailable, see `references/flows/texture_defect_generation_day1_real_alignment.md`).

**`dig_url_root` is the one exception: NO silent default.** On first use (no memory entry), you MUST elicit it via `AskUserQuestion` before any submit, `osmo data upload`, or `preflight_urls.sh` call. `s3://osmo-workflows/dig` is a *suggestion to confirm*, never auto-picked (~80 GB+ lands there). Later runs may reuse the remembered value silently. See Step 0 + memory rules (§4).
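
The contrast between settled defaults and the `dig_url_root` exception can be sketched as follows. The default values are the ones listed above; the `memory` dictionary, the `board` key, the exception class, and the function name are hypothetical stand-ins for however the agent actually stores and resolves preferences.

```python
# Settled defaults: applied silently. Values are from the list above; the
# "board" key name is an assumption of this sketch.
SETTLED_DEFAULTS = {
    "board": "0603_H100",
    "use_pretrained_checkpoint": True,
    "default_spatial_dependency": "cad",  # fall back to "free" without CAD masks
}

# A suggestion to confirm with the user, never a value to apply automatically.
SUGGESTED_DIG_URL_ROOT = "s3://osmo-workflows/dig"


class NeedsUserConfirmation(Exception):
    """Signals that the caller must ask the user before proceeding."""


def resolve_dig_url_root(memory: dict) -> str:
    """Reuse a remembered value silently. Otherwise force an explicit question."""
    remembered = memory.get("dig_url_root")
    if remembered:
        return remembered
    raise NeedsUserConfirmation(
        f"dig_url_root is not set; ask the user (suggestion: {SUGGESTED_DIG_URL_ROOT})"
    )
```

Raising on a missing entry, rather than returning the suggestion, makes it hard to reach a submit or upload step without the user having confirmed where the data lands.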

**Full trigger table, prompt construction, and when-NOT-to-ask exceptions: `references/disambiguation.md`** — load before assembling `AskUserQuestion` options for any vague request.

---

## Step 0: Select Flow, Cookbook, and Gather Inputs

**Before this step**, if the request is vague (e.g. "generate me images", "run the PCBA flow", "give me defects"), pause and work through the disambiguation section above: present candidate interpretations via `AskUserQuestion` and let the user pick. Don't auto-pick a load-bearing default the user didn't actually choose.

### First-time gate

If memory has no entries for this user, ASK the up-front preference questions in ONE `AskUserQuestion` c
