video-sdk/web
Zoom Video SDK for Web - JavaScript/TypeScript integration for browser-based video sessions, real-time communication, screen sharing, recording, and live transcription
What this skill does
# Zoom Video SDK - Web Development
Expert guidance for developing with the Zoom Video SDK on Web. This SDK enables custom video applications in the browser with real-time video/audio, screen sharing, cloud recording, live streaming, chat, and live transcription.
This skill is for **custom video sessions**, not embedded Zoom meetings.
If the user wants a custom UI for a real Zoom meeting, route to
[../../meeting-sdk/web/component-view/SKILL.md](../../meeting-sdk/web/component-view/SKILL.md).
**Official Documentation**: https://developers.zoom.us/docs/video-sdk/web/
**API Reference**: https://marketplacefront.zoom.us/sdk/custom/web/modules.html
**Sample Repository**: https://github.com/zoom/videosdk-web-sample
## Quick Links
**New to Video SDK? Follow this path:**
1. **[SDK Architecture Pattern](concepts/sdk-architecture-pattern.md)** - Universal 3-step pattern for ANY feature
2. **[Session Join Pattern](examples/session-join-pattern.md)** - Complete working code to join a session
3. **[Video Rendering](examples/video-rendering.md)** - Display video with attachVideo()
4. **[Event Handling](examples/event-handling.md)** - Required events for video/audio
**Reference:**
- **[Singleton Hierarchy](concepts/singleton-hierarchy.md)** - 4-level SDK navigation map
- **[API Reference](references/web-reference.md)** - Methods, events, error codes
- **[SKILL.md](SKILL.md)** - Complete documentation navigation
- **[../../probe-sdk/SKILL.md](../../probe-sdk/SKILL.md)** - Optional browser/device/network readiness diagnostics before join
**Having issues?**
- Video not showing → [Video Rendering](examples/video-rendering.md) (use attachVideo, not renderVideo)
- getMediaStream() returns undefined → Call AFTER join() completes
- Quick diagnostics → [Common Issues](troubleshooting/common-issues.md)
## SDK Overview
The Zoom Video SDK for Web is a JavaScript library that provides:
- **Session Management**: Join/leave video SDK sessions
- **Video/Audio**: Start/stop camera and microphone
- **Screen Sharing**: Share screens or browser tabs
- **Cloud Recording**: Record sessions to Zoom cloud
- **Live Streaming**: Stream to RTMP endpoints
- **Chat**: In-session messaging
- **Command Channel**: Custom command messaging
- **Live Transcription**: Real-time speech-to-text
- **Subsessions**: Breakout room support
- **Whiteboard**: Collaborative whiteboard features
- **Virtual Background**: Blur or custom image backgrounds
## Prerequisites
### System Requirements
- **Modern Browser**: Chrome 80+, Firefox 75+, Safari 14+, Edge 80+
- **Video SDK Credentials**: SDK Key and Secret from [Marketplace](https://marketplace.zoom.us/)
- **JWT Token**: Server-side generated signature
### Browser Feature Requirements
```javascript
// Check browser compatibility before init
const compatibility = ZoomVideo.checkSystemRequirements();
console.log('Audio:', compatibility.audio);
console.log('Video:', compatibility.video);
console.log('Screen:', compatibility.screen);
// Check feature support
const features = ZoomVideo.checkFeatureRequirements();
console.log('Supported:', features.supportFeatures);
console.log('Unsupported:', features.unSupportFeatures);
```
### Optional Pre-Join Diagnostics (Recommended for Reliability)
Use Probe SDK as a readiness gate before `client.join(...)` when you need to reduce failed starts:
1. Run diagnostics with [../../probe-sdk/SKILL.md](../../probe-sdk/SKILL.md).
2. Evaluate policy (`allow`, `warn`, `block`).
3. Start Video SDK join only when policy allows.
Cross-skill flow: [../../general/use-cases/probe-sdk-preflight-readiness-gate.md](../../general/use-cases/probe-sdk-preflight-readiness-gate.md)
## Installation
### NPM (Recommended)
```bash
npm install @zoom/videosdk
```
```javascript
import ZoomVideo from '@zoom/videosdk';
```
### CDN (Fallback Strategy Recommended)
> **Note**: Some networks/ad blockers can block `source.zoom.us`. If you see flaky loads, first try allowlisting the domain in your environment. If needed, consider a fallback (mirror/self-host) only if it's permitted for your use case and you can keep versions in sync.
```bash
# Download SDK locally
curl "https://source.zoom.us/videosdk/zoom-video-2.3.12.min.js" -o public/js/zoom-video-sdk.min.js
```
```html
<!-- Use local copy instead of CDN -->
<script src="js/zoom-video-sdk.min.js"></script>
```
```javascript
// CDN exports as WebVideoSDK, NOT ZoomVideo
const ZoomVideo = WebVideoSDK.default;
```
## Quick Start
```javascript
import ZoomVideo from '@zoom/videosdk';
// 1. Create client (singleton - returns same instance)
const client = ZoomVideo.createClient();
// 2. Initialize SDK
await client.init('en-US', 'Global', { patchJsMedia: true });
// 3. Join session
await client.join(topic, signature, userName, password);
// 4. CRITICAL: Get stream AFTER join
const stream = client.getMediaStream();
// 5. Start media
await stream.startVideo();
await stream.startAudio();
// 6. Attach video to DOM
const videoElement = await stream.attachVideo(userId, VideoQuality.Video_360P);
document.getElementById('video-container').appendChild(videoElement);
```
## SDK Lifecycle (CRITICAL ORDER)
The SDK has a strict lifecycle. Violating it causes **silent failures**.
```
1. Create client: client = ZoomVideo.createClient()
2. Initialize: await client.init('en-US', 'Global', options)
3. Join session: await client.join(topic, signature, userName, password)
4. Get stream: stream = client.getMediaStream() ← ONLY AFTER JOIN
5. Start media: await stream.startVideo() / await stream.startAudio()
```
**Common Mistake:**
```javascript
// WRONG: Getting stream before joining
const stream = client.getMediaStream(); // Returns undefined!
await client.join(...);
// CORRECT: Get stream after joining
await client.join(...);
const stream = client.getMediaStream(); // Works!
```
## Critical Gotchas and Best Practices
### getMediaStream() ONLY Works After join()
The #1 issue that causes video/audio to fail:
```javascript
// WRONG
const stream = client.getMediaStream(); // undefined!
await client.join(...);
// CORRECT
await client.join(...);
const stream = client.getMediaStream(); // Works
```
### Use attachVideo() NOT renderVideo()
`renderVideo()` is **deprecated**. Use `attachVideo()` which returns a VideoPlayer element:
```javascript
import { VideoQuality } from '@zoom/videosdk';
// CORRECT: attachVideo returns element to append
const videoElement = await stream.attachVideo(userId, VideoQuality.Video_360P);
document.getElementById('video-container').appendChild(videoElement);
// WRONG: renderVideo is deprecated
await stream.renderVideo(canvas, userId, ...); // Don't use!
```
### Video Rendering is Event-Driven (CRITICAL)
You MUST listen for events to properly render participant videos:
```javascript
// When another participant's video state changes
client.on('peer-video-state-change', async (payload) => {
const { action, userId } = payload;
if (action === 'Start') {
// Participant turned on video - attach it
const element = await stream.attachVideo(userId, VideoQuality.Video_360P);
container.appendChild(element);
} else if (action === 'Stop') {
// Participant turned off video - detach it
await stream.detachVideo(userId);
}
});
// When participants join/leave
client.on('user-added', (payload) => {
// New participant joined - check if their video is on
const users = client.getAllUser();
// Render videos for users with bVideoOn === true
});
client.on('user-removed', (payload) => {
// Participant left - clean up their video element
stream.detachVideo(payload[0].userId);
});
```
### Peer Video on Mid-Session Join
**Existing participants' videos won't auto-render when you join mid-session.**
```javascript
// After joining, render existing participants' videos
const renderExistingVideos = async () => {
await new Promise(resolve => setTimeout(resolve, 500));
const users = client.getAllUser();
const curRelated in Image & Video
watch
IncludedWatch a video (URL or local path). Downloads with yt-dlp, extracts auto-scaled frames with ffmpeg, pulls the transcript from captions (or Whisper API fallback), and hands the result to Claude so it can answer questions about what's in the video.
physical-ai-defect-image-generation
IncludedUse when the user wants to orchestrate defect image generation, run associated setup, or handle outputs on OSMO. The Day 0 path handles cold-start with USD-to-ROI, image-edit augmentation, and AnomalyGen to create initial PCBA datasets. The Day 1 path performs inference and labeling on real images. This skill helps with first-time asset setup, creation of finetuning checkpoints, and configuring deployment. Trigger keywords: defect image generation, dig workflow, dig pipeline, defect image detection workflow, aoi pipeline, aoi anomalygen, usd2roi anomalygen, day 0 pcba, day 1 pcba, day 1 real-photo alignment, day 1 manual roi, metal surface anomaly, glass defect, anomalygen finetune, setup_pcb, setup_metal, setup_glass, setup_pretrained, dig setup, dig datasets, dig pretrained checkpoint, dig image-edit endpoint.
accelint-react-best-practices
IncludedReact performance optimization and best practices. ALWAYS use this skill when working with any React code - writing components, hooks, JSX; refactoring; optimizing re-renders, memoization, state management; reviewing for performance; fixing hydration mismatches; debugging infinite re-renders, stale closures, input focus loss, animations restarting; preventing remounting; implementing transitions, lazy initialization, effect dependencies. Even simple React tasks benefit from these patterns. Covers React 19+ (useEffectEvent, Activity, ref props). Triggers - useEffect, useState, useMemo, useCallback, memo, inline components, nested components, components inside components, re-render, performance, hydration, SSR, Next.js, useDeferredValue, combined hooks.
elevenlabs-agents
IncludedBuild conversational AI voice agents with ElevenLabs Platform using React, JavaScript, React Native, or Swift SDKs. Configure agents, tools (client/server/MCP), RAG knowledge bases, multi-voice, and Scribe real-time STT. Use when: building voice chat interfaces, implementing AI phone agents with Twilio, configuring agent workflows or tools, adding RAG knowledge bases, testing with CLI "agents as code", or troubleshooting deprecated @11labs packages, Android audio cutoff, CSP violations, dynamic variables, or WebRTC config. Keywords: ElevenLabs Agents, ElevenLabs voice agents, AI voice agents, conversational AI, @elevenlabs/react, @elevenlabs/client, @elevenlabs/react-native, @elevenlabs/elevenlabs-js, @elevenlabs/agents-cli, elevenlabs SDK, voice AI, TTS, text-to-speech, ASR, speech recognition, turn-taking model, WebRTC voice, WebSocket voice, ElevenLabs conversation, agent system prompt, agent tools, agent knowledge base, RAG voice agents, multi-voice agents, pronunciation dictionary, voice speed control, elevenlabs scribe, @11labs deprecated, Android audio cutoff, CSP violation elevenlabs, dynamic variables elevenlabs, case-sensitive tool names, webhook authentication
humanizer
IncludedHumanize AI-generated text by detecting and removing patterns typical of LLM output. Rewrites text to sound natural, specific, and human. Uses 28 pattern detectors, 560+ AI vocabulary terms across 3 tiers, and statistical analysis (burstiness, type-token ratio, readability) for comprehensive detection. Use when asked to humanize text, de-AI writing, make content sound more natural/human, review writing for AI patterns, score text for AI detection, or improve AI-generated drafts. Covers content, language, style, communication, and filler categories.
generating-mermaid-diagrams
IncludedSalesforce architecture diagrams using Mermaid with ASCII fallback. Use this skill when generating text-based diagrams for Salesforce architecture, OAuth flows, ERDs, integration sequences, or Agentforce structure. TRIGGER when: user says "diagram", "visualize", "ERD", or asks for sequence diagrams, flowcharts, class diagrams, or architecture visualizations in Mermaid. DO NOT TRIGGER when: user wants PNG/SVG image output (use generating-visual-diagrams), or asks about non-Salesforce systems.