deepstream-import-vision-model
NVIDIA DeepStream model-import guidance for bringing vision models from Hugging Face or NVIDIA NGC into DeepStream pipelines with export, TensorRT build, and benchmark steps.
What this skill does
# DeepStream Import Vision Model
When this skill is active, **read the relevant reference document before starting each phase**. Do not rely on memory — reference documents contain exact script paths, bash variable conventions, log filename contracts, and critical parsing rules.
**Current scope:** Object detection models only. Fail fast on classification, segmentation, or other architectures detected in `config.json`.
## Pipeline Overview
| Step | Phase | Reference | What it does |
|------|-------|-----------|--------------|
| 1–3 | Model Acquire | [references/model-acquire.md](references/model-acquire.md) | Browse HF/NGC, detect format, download ONNX or export SafeTensors |
| 4–5 | Engine Build | [references/engine-build.md](references/engine-build.md) | Build dynamic TRT engine, run trtexec BS=1 and BS=MAX_BS |
| 6–7 | DS Pipeline | [references/pipeline-run.md](references/pipeline-run.md) | Custom bbox parser, nvinfer config, single-stream + multi-stream benchmarks |
| 8 | Report | [references/report-generation.md](references/report-generation.md) | 5 charts, HTML, PDF benchmark report |
Run the full pipeline autonomously without pausing for confirmation at each step.
## Pre-flight Checks
Run before starting:
```bash
# 1. GPU and drivers
nvidia-smi
# 2. TensorRT version match (must match between builder and DS runtime)
trtexec 2>&1 | head -3
dpkg -l | grep libnvinfer-bin
# 3. Shared Python venv — create once, reuse across all models
mkdir -p build
VENV=build/.venv_optimum
if [ ! -x "$VENV/bin/python3" ]; then
python3 -m venv "$VENV"
"$VENV/bin/pip" install --upgrade pip -q
"$VENV/bin/pip" install "optimum[exporters]>=1.20,<2.0" "torch<2.12" \
transformers onnxruntime matplotlib numpy markdown -q
fi
# 4. System tools
which wkhtmltopdf || apt-get install -y wkhtmltopdf
which mediainfo || apt-get install -y mediainfo
which deepstream-app # required for KITTI dump (Step 6g) and benchmark perf-measurement (Step 7c); shipped with DeepStream SDK
# 5. Sample video — only check default path when user has not provided a custom DS_VIDEO
if [ -z "$DS_VIDEO" ]; then
[ -f /opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.mp4 ] || \
echo "WARNING: sample_720p.mp4 not found. Install DeepStream samples or set DS_VIDEO=/path/to/your.mp4"
fi
```
## Mandatory Output Structure
Create once `MODEL_NAME` is known (Step 1). Never dump files flat.
```
models/{model_name}/
model/ <- ONNX file(s)
parser/ <- .cpp, Makefile, .so
config/ <- nvinfer config, ds-app config, labels.txt
scripts/ <- run helper scripts
benchmarks/
engines/ <- _dynamic_b{MAX_BS}.engine, timing.cache, build logs
b1/ <- trtexec BS=1 log
b{MAX_BS}/ <- trtexec BS=MAX_BS log
ds/ <- DS benchmark logs
reports/ <- benchmark_report.md, .html, .pdf, benchmark_data.json
charts/ <- chart_*.png (5 charts)
samples/ <- output .mp4 or .ogv (theoraenc fallback), test frames
kitti_output/ <- KITTI detection .txt files
```
```bash
mkdir -p models/$MODEL_NAME/{model,parser,config,scripts,benchmarks/engines,benchmarks/ds,reports/charts,samples/kitti_output}
```
## Critical Rules
1. **Engine naming** — always `{model}_dynamic_b{MAX_BS}.engine`. Never bare `model_dynamic.engine`.
2. **batch_size == num_streams** — in DS runs, `batch-size` and stream count are always equal.
3. **Log filenames are fixed** — `trtexec_b1.log`, `trtexec_b${MAX_BS}.log`, `ds_s${N}_run1.log`, `ds_s${N}_run2.log`. No timestamps. Report generation reads exact paths.
4. **Parser zero-init** — always `NvDsInferObjectDetectionInfo obj = {};`. Required for DS 9.0 OBB support; bare `obj;` leaves `rotation_angle` uninitialized, causing tilted bounding boxes.
5. **KITTI validation gate** — do NOT proceed to Step 7 if KITTI frame count is zero or detection rate < 90%.
6. **Shared venv** — `build/.venv_optimum` reused across all models. Never create per-model venvs.
7. **trtexec `--noDataTransfers`** — GPU-only compute matches DeepStream's GPU-to-GPU data flow.
8. **Report HTML+PDF** — always use `skills/deepstream-import-vision-model/scripts/report/md-to-html-pdf.py`. Never write a custom HTML generator or call `wkhtmltopdf` directly.
9. **Object detection only** — reject non-detection architectures from `config.json` before building anything.
10. **Encoder fallback (MANDATORY)** — `x264enc` and `openh264enc` are **prohibited**. On NVENC-unavailable systems, use `theoraenc + oggmux` (LGPL; ships in gst-plugins-base; output is `.ogv`). If `theoraenc`/`oggmux` are absent, skip video creation (`DS_SINGLE_STREAM_MODE=skipped`). Report which mode was used: `nvv4l2h264enc` / `theoraenc-fallback` / `skipped`.
11. **Video source (MANDATORY)** — default is always `sample_720p.mp4` (1280×720). Never autonomously substitute `sample_1080p_h264.mp4` or any other file. Only use a different video when the user explicitly provides a path (via `DS_VIDEO` env var or script argument).
## Pipeline Timing
Wrap every step:
```bash
STEP_START=$(date +%s.%N)
# ... step commands ...
STEP_END=$(date +%s.%N)
STEP_DURATION=$(echo "$STEP_END - $STEP_START" | bc)
echo "[Step N] completed in ${STEP_DURATION}s"
```
Track `PIPELINE_START` (before Step 1) and `PIPELINE_END` (after Step 8). Report all durations in the benchmark report.
## Report Output (MANDATORY — all 3 formats)
1. `benchmark_report.md` — markdown source (12 mandatory sections)
2. `benchmark_report.html` — styled HTML (charts base64-inlined, no local file access)
3. `benchmark_report_{model_name}.pdf` — via `md-to-html-pdf.py`; verify charts are embedded by counting `data:image/png` occurrences in the HTML output: `grep -o 'data:image/png' benchmark_report.html | wc -l` should equal 5
Run charts and report scripts with the shared venv active: `source build/.venv_optimum/bin/activate`.
## Reference Documents
**IMPORTANT**: Read the relevant reference before starting each phase. Do NOT generate code from memory.
| Document | Use When |
|----------|----------|
| [references/model-acquire.md](references/model-acquire.md) | Steps 1–3: HF/NGC URL parsing, format detection, ONNX download, SafeTensors export, label extraction |
| [references/engine-build.md](references/engine-build.md) | Steps 4–5: trtexec engine build, benchmarks, PEAK_GPU_STREAMS derivation, iterative scaling |
| [references/pipeline-run.md](references/pipeline-run.md) | Steps 6–7: custom bbox parser, nvinfer config, single-stream validation, KITTI dump, multi-stream benchmark |
| [references/report-generation.md](references/report-generation.md) | Step 8: benchmark_data.json, 5 charts, 12-section markdown report, HTML + PDF |
## Scripts
Located in `scripts/`.
| Script | Phase | Purpose |
|--------|-------|---------|
| `model/hf-list-files.sh` | 1–3 | List HuggingFace repo files |
| `model/hf-download-config.sh` | 1–3 | Download config.json from HF |
| `model/ngc-list-files.sh` | 1–3 | List NGC model files |
| `model/ngc-download.sh` | 1–3 | Download NGC model archive |
| `model/safetensors-to-onnx.sh` | 1–3 | Export SafeTensors → ONNX via optimum-cli |
| `model/inspect-onnx.py` | 1–5 | Inspect ONNX input/output shapes |
| `model/make-static-batch-onnx.py` | 4–5 | Bake batch dim into ONNX |
| `model/cleanup.sh` | Any | Remove staging dirs, preserve shared venv |
| `engine/benchmark-trtexec.sh` | 4–5 | Run trtexec with standard flags |
| `deepstream/ds-single-stream.sh` | 6–7 | Single-stream visual validation (NVENC primary; theoraenc+oggmux fallback; skip if neither) |
| `deepstream/ds-sweep.sh` | 6–7 | 2-phase batch size sweep |
| `deepstream/benchmark-ds.sh` | 6–7 | Fixed-stream DS benchmark |
| `deepstream/ds-kitti-dump.sh` | 6–7 | KITTI detection dump via deepstream-app |
| `deepstream/ds-perf-run.sh` | 7 | Step 7c two-run benchmark — wraps `deepstream-app` with `enable-perf-measurement=1`, writes fixed-name log for the report parser |
| `deepstream/extractRelated in AI Agents
skill-development
IncludedComprehensive meta-skill for creating, managing, validating, auditing, and distributing Claude Code skills and slash commands (unified in v2.1.3+). Provides skill templates, creation workflows, validation patterns, audit checklists, naming conventions, YAML frontmatter guidance, progressive disclosure examples, and best practices lookup. Use when creating new skills, validating existing skills, auditing skill quality, understanding skill architecture, needing skill templates, learning about YAML frontmatter requirements, progressive disclosure patterns, tool restrictions (allowed-tools), skill composition, skill naming conventions, troubleshooting skill activation issues, creating custom slash commands, configuring command frontmatter, using command arguments ($ARGUMENTS, $1, $2), bash execution in commands, file references in commands, command namespacing, plugin commands, MCP slash commands, Skill tool configuration, or deciding between skills vs slash commands. Delegates to docs-management skill for official documentation.
reprompter
IncludedTransform messy prompts into well-structured, effective prompts — single or multi-agent. Use when: "reprompt", "reprompt this", "clean up this prompt", "structure my prompt", rough text needing XML tags and best practices, "reprompter teams", "repromptception", "run with quality", "smart run", "smart agents", multi-agent tasks, audits, parallel work, anything going to agent teams. Don't use when: simple Q&A, pure chat, immediate execution-only tasks. See "Don't Use When" section for details. Outputs: Structured XML/Markdown prompt, quality score (before/after), optional team brief + per-agent sub-prompts, agent team output files. Success criteria: Single mode quality score ≥ 7/10; Repromptception per-agent prompt quality score 8+/10; all required sections present, actionable and specific.
adaptive-compaction
IncludedAdaptive add-on policy and recovery layer that decides WHEN to compact, prune, snapshot, or fork -- replacing fixed-percent auto-compaction across Claude Code, Codex, and MCP-capable hosts. Trigger on auto-compact timing or damage: "when should I compact", "is it safe to compact now or start a fresh session", "auto-compact fires too early/mid-task", "switching to an unrelated task but the window still has space", "context rot", "answers get worse the longer the session runs", "the agent forgot the plan or my decisions after it summarized", "add a layer on top that manages context without changing the agent", raising autoCompactWindow to give the policy room, or installing/tuning a cross-tool compaction policy or PreCompact hook -- even when "compaction" is never said but the problem is context-window pressure or post-summarization memory loss. Do NOT use to summarize a conversation, build RAG, write a summarization prompt (decides WHEN not HOW), or answer max-context-length trivia.
agent-skill-creator
IncludedCreate cross-platform agent skills from workflow descriptions. Activates when users ask to create an agent, automate a repetitive workflow, create a custom skill, or need advanced agent creation. Triggers on phrases like create agent for, automate workflow, create skill for, every day I have to, daily I need to, turn process into agent, need to automate, create a cross-platform skill, validate this skill, export this skill, migrate this skill. Supports single skills, multi-agent suites, transcript processing, template-based creation, interactive configuration, cross-platform export, and spec validation.
llm-wiki
IncludedUse when building or maintaining a persistent personal knowledge base (second brain) in Obsidian where an LLM incrementally ingests sources, updates entity/concept pages, maintains cross-references, and keeps a synthesis current. Triggers include "second brain", "Obsidian wiki", "personal knowledge management", "ingest this paper/article/book", "build a research wiki", "compound knowledge", "Memex", or whenever the user wants knowledge to accumulate across sessions instead of being re-derived by RAG on every query.
skill-master
IncludedAgent Skills authoring, evaluation, and optimization. Create, edit, validate, benchmark, and improve skills following the agentskills.io specification. Use when designing SKILL.md files, structuring skill folders (references, scripts, assets), ingesting external documentation into skills, running trigger evals, benchmarking skill quality, optimizing descriptions, or performing blind A/B comparisons. Keywords: agentskills.io, SKILL.md, skill authoring, eval, benchmark, trigger optimization.