hyperframes
Create video compositions, animations, title cards, overlays, captions, voiceovers, audio-reactive visuals, and scene transitions in HyperFrames HTML. Use when asked to build any HTML-based video content, add captions or subtitles synced to audio, generate text-to-speech narration, create audio-reactive animation (beat sync, glow, pulse driven by music), add animated text highlighting (marker sweeps, hand-drawn circles, burst lines, scribble, sketchout), or add transitions between scenes (crossfades, wipes, reveals, shader transitions). Covers composition authoring, timing, media, and the full video production workflow. For CLI commands (init, lint, preview, render, transcribe, tts) see the hyperframes-cli skill.
# HyperFrames
HTML is the source of truth for video. A composition is an HTML file with `data-*` attributes for timing, a GSAP timeline for animation, and CSS for appearance. The framework handles clip visibility, media playback, and timeline sync.
## Open Design integration (load-bearing for this surface)
When this skill runs inside Open Design (i.e. `$OD_PROJECT_DIR` is set), the
output flow is fixed: only the rendered `.mp4` should land in the project
root. Composition source files (`hyperframes.json`, `meta.json`,
`index.html`, assets) belong inside a hidden cache directory so they don't
clutter the user's FileViewer or the chat's "produced files" chips.
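A minimal sketch of the detection step, assuming only what the paragraph above states: a non-empty `$OD_PROJECT_DIR` means the skill is running inside Open Design. The function name `hf_run_mode` is hypothetical and not part of HyperFrames or OD.

```bash
# Hypothetical helper: report which output flow applies.
# Assumption: a non-empty OD_PROJECT_DIR means we are inside Open Design.
hf_run_mode() {
  if [ -n "${OD_PROJECT_DIR:-}" ]; then
    echo "od"          # only the rendered .mp4 lands in the project root
  else
    echo "standalone"  # regular HyperFrames project layout
  fi
}

hf_run_mode
```

Branching on this value once, at the start of a task, keeps the cache-directory rules below from leaking into standalone projects.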
**Render workflow inside OD — fast path**:
For most OD requests ("test video", "5s product reveal", "demo clip"),
do NOT write the composition HTML from scratch. Use HyperFrames'
built-in scaffold and edit only what the prompt actually changes. The
"author from scratch" path costs minutes of model output and silent
chat-tool time; the scaffold path costs seconds.
```bash
# 1. Pick a hidden cache slot. Dotfile prefix → OD's project file
# listing skips it, so the source files never clutter the chat.
COMP_REL=".hyperframes-cache/$(date +%s)-$(openssl rand -hex 2)"
COMP="$OD_PROJECT_DIR/$COMP_REL"
# 2. Get an immediately-renderable scaffold (hyperframes.json,
# meta.json, index.html with GSAP CDN + window.__timelines.main
# already registered). This runs in your shell — pure file copy,
# no Chrome, no network beyond the npx cache.
npx hyperframes init "$COMP" --example blank --skip-skills --non-interactive
# 3. Edit ONLY $COMP/index.html — change `data-duration` on the root
# if you need a non-default length, swap the placeholder palette
# in <style>, add 1–3 clip <div>s for text/imagery, and append the
# matching GSAP tweens inside the existing
# `window.__timelines["main"] = gsap.timeline({paused:true})` block.
# Keep edits minimal; the scaffold is already valid HF.
# 4. Dispatch render through the OD daemon. Do NOT run `npx hyperframes
# render` from this shell — the daemon runs it for you in an
# unsandboxed process. (Many agent CLIs, Claude Code in particular,
# wrap Bash in macOS sandbox-exec under which puppeteer's Chrome
# subprocess hangs partway through frame capture. The daemon process
# is unsandboxed, so renders complete reliably.)
#
# The dispatcher returns within ~1s with a {taskId}; drive the
# render to completion by looping `"$OD_NODE_BIN" "$OD_BIN" media wait <taskId>` calls.
# Each call long-polls up to 25s (well under your shell tool's
# default 30s cap) and exits 0/2/5 to signal done/running/failed.
out=$("$OD_NODE_BIN" "$OD_BIN" media generate \
  --project "$OD_PROJECT_ID" \
  --surface video \
  --model hyperframes-html \
  --output "<descriptive-name>.mp4" \
  --composition-dir "$COMP_REL")
ec=$?
task_id=$(printf '%s\n' "$out" | tail -n 1 | jq -r '.taskId // empty')
since=$(printf '%s\n' "$out" | tail -n 1 | jq -r '.nextSince // 0')
# Exit code 2 means "still running": keep long-polling until it changes.
while [ "$ec" -eq 2 ] && [ -n "$task_id" ]; do
  out=$("$OD_NODE_BIN" "$OD_BIN" media wait "$task_id" --since "$since")
  ec=$?
  since=$(printf '%s\n' "$out" | tail -n 1 | jq -r '.nextSince // '"$since")
done
# Use an explicit `if` so a successful render leaves exit status 0
# (a bare `[ ... ] && { ... }` would end the script with status 1).
if [ "$ec" -ne 0 ]; then
  echo "$out" >&2
  exit "$ec"
fi
```
Each `generate` and each `wait` call lasts at most ~25s, so the agent
shell tool's default ~30s cap never fires. Progress lines from HF
(`Capturing frame N/M`) stream to stderr live throughout the loop.
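The exit-code contract noted in the script comments (0, 2, and 5 for done, running, and failed) could be made explicit with a small helper. This is a sketch only: `describe_wait_status` is a hypothetical name, and any code other than those three is reported as unknown rather than guessed at.

```bash
# Hypothetical helper: translate a `media wait` exit code into a label.
# Only 0/2/5 are documented above; everything else is left as "unknown".
describe_wait_status() {
  case "$1" in
    0) echo "done" ;;
    2) echo "running" ;;
    5) echo "failed" ;;
    *) echo "unknown" ;;
  esac
}

describe_wait_status 2   # prints: running
```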
When the render finishes, the last stdout line is
`{"file": { "name": "<output>", "size": …, "kind": "video", … }}` —
quote `file.name` in your reply so the user knows what was produced.
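Pulling `file.name` out of that last line might look like the sketch below. It assumes `jq` is available (the render script already relies on it). The sample output is invented for illustration, and `rendered_file_name` is a hypothetical helper.

```bash
# Hypothetical helper: extract file.name from the last stdout line.
# Prints nothing if the last line has no .file.name field.
rendered_file_name() {
  printf '%s\n' "$1" | tail -n 1 | jq -r '.file.name // empty'
}

# Invented sample of captured stdout (two JSON lines, newest last).
sample_out='{"taskId":"t-123","nextSince":0}
{"file":{"name":"demo.mp4","size":12345,"kind":"video"}}'

rendered_file_name "$sample_out"   # prints: demo.mp4
```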
**Skip the Visual Identity Gate inside OD.** The HARD-GATE section
below (under "Approach") tells you to read DESIGN.md / visual-style.md
or stop and ask 3 mood questions before writing any composition. That
gate is for standalone HF projects. **OD projects already have their
own design-system layer** — the user picked their visual direction at
project creation time. For an OD test render, default to: dark canvas
(#0b0b0f), one warm accent (#ffb76b), one cool accent (#7da4ff),
restrained motion. Only ask for stylistic input if the user's prompt
is too vague to even pick a subject (very rare).
When to skip the scaffold and write from scratch: only when the user
explicitly asks for something the blank template clearly can't host
(e.g. multi-composition timelines, audio-reactive overlays, captions
synced to a TTS track they've already generated). For everything else,
init + edit is the default path.
The lighter HF subcommands you CAN still run from your own shell
(they don't need to spawn Chrome):
- `npx hyperframes lint "$COMP"` — validate composition before dispatch
- `npx hyperframes transcribe <audio>` — generate captions
- `npx hyperframes tts <text>` — generate narration
Reserve the daemon dispatch for `render`/`inspect`/`preview` (anything
Chrome-bound). After authoring the composition under `.hyperframes-cache/`,
render it by calling `"$OD_NODE_BIN" "$OD_BIN" media generate --surface video --model hyperframes-html --composition-dir <rel>`.
The daemon runs the Chrome-bound HyperFrames render outside your shell
sandbox and streams progress back to you. Do not run `npx hyperframes render`
yourself.
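The split between shell-safe and Chrome-bound subcommands can be summarized as a routing table. The sketch below encodes only the subcommands named in this section; `hf_route` is a hypothetical helper, and it neither runs `npx hyperframes` nor contacts the daemon.

```bash
# Hypothetical helper: decide where a HyperFrames subcommand should run.
#   shell  -> safe to call via `npx hyperframes <cmd>` from your own shell
#   daemon -> Chrome-bound, dispatch through the OD daemon instead
hf_route() {
  case "$1" in
    lint|transcribe|tts)    echo "shell" ;;
    render|inspect|preview) echo "daemon" ;;
    *)                      echo "unknown" ;;  # not covered by this section
  esac
}

hf_route lint     # prints: shell
hf_route render   # prints: daemon
```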
**Do NOT** drop `hyperframes.json` / `meta.json` / `index.html` in the
project root; OD's file listing scans recursively and the user would see
three unrelated files appear in the chat.
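One way to guard against this is to check the relative composition path before dispatching. The sketch assumes the cache directory is always named `.hyperframes-cache`, as in the script above. `is_cache_rel_path` is a hypothetical helper and the sample path is made up.

```bash
# Hypothetical guard: succeed only for paths inside .hyperframes-cache/.
is_cache_rel_path() {
  case "$1" in
    .hyperframes-cache/?*) return 0 ;;
    *)                     return 1 ;;
  esac
}

COMP_REL=".hyperframes-cache/1700000000-ab12"   # example value only
if is_cache_rel_path "$COMP_REL"; then
  echo "ok: composition sources stay in the hidden cache"
else
  echo "refusing: $COMP_REL is outside .hyperframes-cache/" >&2
fi
```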
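For the CLI subcommands that don't launch Chrome (lint, transcribe, tts),
call them directly from your shell tool when the task warrants it (e.g.,
generate TTS audio into the cache before referencing it from the
composition). Chrome-bound subcommands (render, inspect, preview) go
through the daemon, as described above. Treat `benchmark` as Chrome-bound
too if it captures frames.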
## Approach
Before writing HTML, think at a high level:
1. **What** — what should the viewer experience? Identify the narrative arc, key moments, and emotional beats.
2. **Structure** — how many compositions, which are sub-compositions vs inline, what tracks carry what (video, audio, overlays, captions).
3. **Timing** — which clips drive the duration, where do transitions land, what's the pacing.
4. **Layout** — build the end-state first. See "Layout Before Animation" below.
5. **Animate** — then add motion using the rules below.
For small edits (fix a color, adjust timing, add one element), skip straight to the rules.
### Visual Identity Gate
<HARD-GATE>
Before writing ANY composition HTML, you MUST have a visual identity defined. Do NOT write compositions with default or generic colors.
Check in this order:
1. **DESIGN.md exists in the project?** → Read it. Use its exact colors, fonts, motion rules, and "What NOT to Do" constraints.
2. **visual-style.md exists?** → Read it. Apply its `style_prompt_full` and structured fields. (Note: `visual-style.md` is a project-specific file. `visual-styles.md` is the style library with 8 named presets — different files.)
3. **User named a style** (e.g., "Swiss Pulse", "dark and techy", "luxury brand")? → Read [visual-styles.md](./visual-styles.md) for the 8 named presets. Generate a minimal DESIGN.md with: `## Style Prompt` (one paragraph), `## Colors` (3-5 hex values with roles), `## Typography` (1-2 font families), `## What NOT to Do` (3-5 anti-patterns).
4. **None of the above?** → Ask 3 questions before writing any HTML:
   - What's the mood? (explosive / cinematic / fluid / technical / chaotic / warm)
   - Light or dark canvas?
   - Any specific brand colors, fonts, or visual references?

   Then generate a minimal DESIGN.md from the answers.
Every composition must trace its palette and typography back to a DESIGN.md, visual-style.md, or explicit user direction. If you're reaching for `#333`, `#3b82f6`, or `Roboto` — you skipped this step.
</HARD-GATE>
For motion defaults, sizing, entrance patterns, and easing — follow [house-style.md](./house-style.md). The house style handles HOW things move. The DESIGN.md handles WHAT things look like.
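The file checks at the start of the Visual Identity Gate (steps 1 and 2) can be sketched as a small lookup. This assumes both files sit at the top level of the project directory, which the gate does not spell out. `visual_identity_source` is a hypothetical helper, and a result of `none` means falling through to steps 3 and 4.

```bash
# Hypothetical helper: report which identity file the gate should read first.
# Assumption: DESIGN.md and visual-style.md live at the top of the project dir.
visual_identity_source() {
  if [ -f "$1/DESIGN.md" ]; then
    echo "DESIGN.md"
  elif [ -f "$1/visual-style.md" ]; then
    echo "visual-style.md"
  else
    echo "none"   # no file found: continue with steps 3 and 4 of the gate
  fi
}

visual_identity_source .
```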
## Layout Before Animation