build-zoom-video-sdk-app
Reference skill for Zoom Video SDK. Use after routing to a custom-session workflow when the user needs full control over the video experience rather than an actual Zoom meeting.
# /build-zoom-video-sdk-app
Background reference for fully custom video-session products. Prefer `plan-zoom-product` first when the boundary between Meeting SDK and Video SDK is still unclear.
Build custom video experiences powered by Zoom's infrastructure.
## Hard Routing Guardrail (Read First)
- If the user asks for custom real-time video app behavior (topic/session join, custom rendering, attach/detach), route to Video SDK.
- Do not switch to REST meeting endpoints for Video SDK join flows.
- Video SDK does not use Meeting IDs, `join_url`, or Meeting SDK join payload fields (`meetingNumber`, `passWord`).
## Meeting SDK vs Video SDK
| Feature | Meeting SDK | Video SDK |
|---------|-------------|-----------|
| UI | Default Zoom UI or Custom UI | **Fully custom UI** (you build it) |
| Experience | Zoom meetings | Video sessions |
| Branding | Limited customization | **Full branding control** |
| Features | Full Zoom features | Core video features |
## UI Options (Web)
Video SDK gives you **full control over the UI**:
| Option | Description |
|--------|-------------|
| **UI Toolkit** | Pre-built React components (low-code) |
| **Custom UI** | Build your own UI using the SDK APIs |
## Prerequisites
- Zoom Video SDK credentials from Marketplace
- SDK Key and Secret
- Web development environment
> **Need help with OAuth or signatures?** See the **[zoom-oauth](../oauth/SKILL.md)** skill for authentication flows.
> **Need pre-join diagnostics on web?** Use **[probe-sdk](../probe-sdk/SKILL.md)** before Video SDK `join()` to reduce first-minute failures.
> **Start troubleshooting fast:** Use the **[5-Minute Runbook](RUNBOOK.md)** before deep debugging.
## Quick Start (Web)
### NPM Usage (Bundler like Vite/Webpack)
```javascript
import ZoomVideo from '@zoom/videosdk';
const client = ZoomVideo.createClient();
await client.init('en-US', 'Global', { patchJsMedia: true });
await client.join(topic, signature, userName, password);
// IMPORTANT: getMediaStream() ONLY works AFTER join()
const stream = client.getMediaStream();
await stream.startVideo();
await stream.startAudio();
```
### CDN Usage (No Bundler)
> **WARNING: Ad blockers block `source.zoom.us`**. Self-host the SDK to avoid issues.
```bash
# Download SDK locally
curl "https://source.zoom.us/videosdk/zoom-video-1.12.0.min.js" -o js/zoom-video-sdk.min.js
```
```html
<script src="js/zoom-video-sdk.min.js"></script>
```
```javascript
// CDN exports as WebVideoSDK, NOT ZoomVideo
// Must use .default property
const ZoomVideo = WebVideoSDK.default;
const client = ZoomVideo.createClient();
await client.init('en-US', 'Global', { patchJsMedia: true });
await client.join(topic, signature, userName, password);
// IMPORTANT: getMediaStream() ONLY works AFTER join()
const stream = client.getMediaStream();
await stream.startVideo();
await stream.startAudio();
```
### ES Module with CDN (Race Condition Fix)
When you use `<script type="module">` together with the CDN build, the SDK global may not be defined yet when your module code runs:
```javascript
// Wait for SDK to load before using
function waitForSDK(timeout = 10000) {
  return new Promise((resolve, reject) => {
    if (typeof WebVideoSDK !== 'undefined') {
      resolve();
      return;
    }
    const start = Date.now();
    const check = setInterval(() => {
      if (typeof WebVideoSDK !== 'undefined') {
        clearInterval(check);
        resolve();
      } else if (Date.now() - start > timeout) {
        clearInterval(check);
        reject(new Error('SDK failed to load'));
      }
    }, 100);
  });
}
// Usage
await waitForSDK();
const ZoomVideo = WebVideoSDK.default;
const client = ZoomVideo.createClient();
```
## SDK Lifecycle (CRITICAL ORDER)
The SDK has a strict lifecycle. Violating it causes silent failures.
```
1. Create client: client = ZoomVideo.createClient()
2. Initialize: await client.init('en-US', 'Global', options)
3. Join session: await client.join(topic, signature, userName, password)
4. Get stream: stream = client.getMediaStream() ← ONLY AFTER JOIN
5. Start media: await stream.startVideo() / await stream.startAudio()
```
**Common Mistake (Silent Failure):**
```javascript
// ❌ WRONG: Getting stream before joining
const client = ZoomVideo.createClient();
await client.init('en-US', 'Global');
const stream = client.getMediaStream(); // Returns undefined!
await client.join(...);
// ✅ CORRECT: Get stream after joining
const client = ZoomVideo.createClient();
await client.init('en-US', 'Global');
await client.join(...);
const stream = client.getMediaStream(); // Works!
```
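Because an out-of-order call fails silently, it can help to make the order explicit in your own code. The following is a minimal sketch of a guard that throws when a step runs too early. `createLifecycleGuard` and the step names are hypothetical helpers for illustration, not part of the Video SDK.

```javascript
// Step order taken from the lifecycle list above.
// The guard itself is a hypothetical helper, not an SDK API.
const LIFECYCLE_STEPS = ['create', 'init', 'join', 'getMediaStream', 'startMedia'];

function createLifecycleGuard() {
  let completed = 0; // number of leading steps that have been reached so far
  return {
    // Call right before each SDK step; throws if an earlier step was skipped.
    advance(step) {
      const index = LIFECYCLE_STEPS.indexOf(step);
      if (index === -1) {
        throw new Error(`Unknown lifecycle step: ${step}`);
      }
      if (index > completed) {
        throw new Error(`Cannot run "${step}" before "${LIFECYCLE_STEPS[completed]}"`);
      }
      // Repeating an already-reached step (e.g. starting media twice) is allowed.
      completed = Math.max(completed, index + 1);
    },
  };
}
```

One way to use it: call `guard.advance('join')` immediately before `client.join(...)`, and `guard.advance('getMediaStream')` before `client.getMediaStream()`, so a misplaced call surfaces as a thrown error instead of an undefined stream.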
## Video Rendering (Event-Driven)
**The SDK is event-driven.** You must listen for events and render videos accordingly.
### Use `attachVideo()` NOT `renderVideo()`
```javascript
import { VideoQuality } from '@zoom/videosdk';
// Start your camera
await stream.startVideo();
// Attach video - returns element to append to DOM
const element = await stream.attachVideo(userId, VideoQuality.Video_360P);
container.appendChild(element);
// Detach when done
await stream.detachVideo(userId);
```
### Required Events
```javascript
// When other participant's video turns on/off
client.on('peer-video-state-change', async (payload) => {
  const { action, userId } = payload;
  if (action === 'Start') {
    const el = await stream.attachVideo(userId, VideoQuality.Video_360P);
    container.appendChild(el);
  } else {
    await stream.detachVideo(userId);
  }
});
// When participants join/leave (the payload is an array of participants)
client.on('user-added', (payload) => { /* check bVideoOn for each user */ });
client.on('user-removed', (payload) => {
  payload.forEach((user) => stream.detachVideo(user.userId));
});
```
See [web/references/web.md](web/references/web.md) for complete event handling patterns.
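Event handlers only cover changes that happen after you subscribe, so participants whose video is already on when you join need a one-time sync. A minimal sketch of that reconciliation step follows. `diffVideoTiles` is a hypothetical helper, and it assumes each participant object carries `userId` and `bVideoOn` as in the events above.

```javascript
// Hypothetical helper: decide which video tiles to attach or detach.
// `participants`: array of objects with `userId` and `bVideoOn` (assumed shape).
// `attachedIds`: user IDs whose video element is currently in the DOM.
function diffVideoTiles(participants, attachedIds) {
  const wanted = new Set(
    participants.filter((p) => p.bVideoOn).map((p) => p.userId)
  );
  const attached = new Set(attachedIds);
  return {
    // Video is on but no tile exists yet.
    toAttach: [...wanted].filter((id) => !attached.has(id)),
    // A tile exists but the user left or turned video off.
    toDetach: [...attached].filter((id) => !wanted.has(id)),
  };
}
```

A typical use would be to run this once after `join()` with the current participant list, then call `stream.attachVideo()` for each ID in `toAttach` and `stream.detachVideo()` for each ID in `toDetach`.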
## Key Concepts
| Concept | Description |
|---------|-------------|
| Session | Video session (not a meeting) |
| Topic | Session identifier (any string you choose) |
| Signature | JWT for authorization |
| MediaStream | Audio/video stream control |
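Since the signature is a JWT signed with your SDK Secret, it should be generated on your server, never in the browser. Below is a minimal sketch using only Node's built-in `crypto` module. The payload field names (`app_key`, `tpc`, `role_type`, `version`) and the role values reflect my understanding of Zoom's Video SDK authorization format and should be verified against the official docs. `generateVideoSdkSignature` and its parameters are hypothetical names.

```javascript
const crypto = require('node:crypto');

// Base64url-encode a string (JWT segments use this encoding).
function base64url(input) {
  return Buffer.from(input).toString('base64url');
}

// Sketch: build an HS256-signed JWT for a Video SDK session.
// ASSUMPTION: payload field names and role values (1 = host, 0 = participant)
// follow Zoom's Video SDK auth docs; confirm before relying on them.
function generateVideoSdkSignature({
  sdkKey,
  sdkSecret,
  topic,
  role = 0,
  ttlSeconds = 7200,
  now = Date.now(),
}) {
  const iat = Math.floor(now / 1000) - 30; // backdate slightly to tolerate clock skew
  const exp = iat + ttlSeconds;
  const header = { alg: 'HS256', typ: 'JWT' };
  const payload = { app_key: sdkKey, tpc: topic, role_type: role, version: 1, iat, exp };

  const signingInput =
    `${base64url(JSON.stringify(header))}.${base64url(JSON.stringify(payload))}`;
  const signature = crypto
    .createHmac('sha256', sdkSecret)
    .update(signingInput)
    .digest('base64url');
  return `${signingInput}.${signature}`;
}
```

The `topic` passed here must match the `topic` string the client later passes to `client.join()`, because the token is issued for that specific session.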
## Session Creation Model
**Important**: Video SDK sessions are created **just-in-time**, not in advance.
| Aspect | Video SDK | Meeting SDK |
|--------|-----------|-------------|
| Pre-creation | NOT required | Create meeting via API first |
| Session start | First participant joins with topic | Join existing meeting ID |
| Topic | Any string (you define it) | Meeting ID from API |
| Scheduling | N/A - sessions are ad-hoc | Meetings can be scheduled |
### How Sessions Work
1. **No pre-creation needed**: Sessions don't exist until someone joins
2. **Topic = Session ID**: All participants who join with the same `topic` string land in the same session
3. **First join creates it**: The session is created when the first participant joins
4. **No meeting ID**: There's no numeric meeting ID like in Zoom Meetings
```javascript
// Session is created on-the-fly when first user joins
// Any string can be the topic - it becomes the session identifier
await client.join('my-custom-session-123', signature, 'User Name');
// Other participants join the SAME session by using the SAME topic
await client.join('my-custom-session-123', signature, 'Another User');
```
### Signature Endpoint Setup
The signature endpoint must be accessible from your frontend without CORS issues.
**Option 1: Same-Origin Proxy (Recommended)**
```nginx
# Nginx config
location /api/ {
    proxy_pass http://YOUR_BACKEND_HOST:3005/api/;
    proxy_http_version 1.1;
    proxy_set_header Host $host;
}
```
```javascript
// Frontend uses relative URL (same origin)
const response = await fetch('/api/signature', { ... });
```
**Option 2: CORS Configuration**
```javascript
// Express.js backend
const cors = require('cors');
app.use(cors({
  origin: ['https://your-domain.com'],
  credentials: true
}));
```
**WARNING:** Mixed content (HTTPS page → HTTP API) will be blocked by browsers.
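To catch the mixed-content case early, a startup check on the configured signature URL can help. This is a minimal sketch built on the standard `URL` class. `isMixedContent` is a hypothetical helper, and it deliberately ignores browser-specific exceptions such as loopback addresses.

```javascript
// Hypothetical helper: true when an HTTPS page would call a plain-HTTP API.
// Relative API URLs resolve against the page, so they inherit its protocol.
// ASSUMPTION: browser exceptions (e.g. loopback hosts) are not modeled here.
function isMixedContent(pageUrl, apiUrl) {
  const page = new URL(pageUrl);
  const api = new URL(apiUrl, page);
  return page.protocol === 'https:' && api.protocol === 'http:';
}
```

In the browser you could call `isMixedContent(window.location.href, signatureUrl)` once at startup and log a clear error. A relative URL such as `/api/signature` never trips the check, which is one more reason to prefer the same-origin proxy.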
## Use Cases
| Use Case | Description |
|----------|-------------|
| [Video SDK BYOS (Bring Your Own Storage)](../general/use-cases/video-sdk-bring-your-own-storage.