Expert guidance for building browser-based video sessions with the Zoom Video SDK for Web (@zoom/videosdk v2.4.0) in React, Vue, Angular, Svelte, or vanilla TypeScript. Use this skill whenever the user is implementing or debugging any in-browser real-time communication feature — joining/leaving a session, capturing or rendering audio/video, gallery or active-speaker views, virtual backgrounds, screen sharing with annotation, in-session chat or command channel, recording, subsessions, live streaming, PSTN/SIP dial-out, PTZ cameras, quality stats, WebAssembly/SharedArrayBuffer setup, CSP/COOP/COEP headers, JWT session tokens, or resolving SDK error codes. Trigger even when the user doesn't explicitly say "Zoom" — signals include `@zoom/videosdk`, `ZoomVideo.createClient`, `client.getMediaStream`, `stream.startVideo`, `attachVideo`, "video conferencing", "video call app", "video SDK", "render remote video", or debugging black/green video tiles, audio that won't start, or `OperationBlockedByBrowserPolicy` errors. Prefer this skill over generic WebRTC advice whenever `@zoom/videosdk` is in play.
# Zoom Video SDK Web (v2.4.0)
Expert guidance for building video sessions with Zoom Video SDK for Web.
This skill is for **custom video sessions**, not embedded Zoom meetings.
If the user wants a custom UI for a real Zoom meeting, route to [../../meeting-sdk/web/component-view/SKILL.md](../../meeting-sdk/web/component-view/SKILL.md).
Use [../../probe-sdk/SKILL.md](../../probe-sdk/SKILL.md) as an optional browser/device/network readiness gate before `client.join(...)`.
## Quick Start
### Installation
```bash
bun install @zoom/videosdk --save
# or
npm install @zoom/videosdk --save
```
### Basic Session Setup
```typescript
import ZoomVideo from "@zoom/videosdk";
// 1. Create client
const client = ZoomVideo.createClient();
// 2. Initialize SDK
await client.init("en-US", "Global", { patchJsMedia: true });
// 3. Join session (requires JWT token from your server)
await client.join(sessionName, jwtToken, userName, sessionPassword);
// 4. Get media stream for audio/video control
const stream = client.getMediaStream();
```
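The four steps above can be wrapped in one helper. The following is a minimal sketch, not the SDK's own API: `SessionClientLike` is a local structural stand-in for the object returned by `ZoomVideo.createClient()`, and `fetchToken` is a hypothetical caller-supplied function that asks your own backend for the session JWT.

```typescript
// Structural stand-in for the SDK client: only the calls used in the setup above.
// In a real app, pass the object returned by ZoomVideo.createClient().
interface SessionClientLike<S> {
  init(
    language: string,
    dependentAssets: string,
    options?: { patchJsMedia?: boolean },
  ): Promise<unknown>;
  join(topic: string, token: string, userName: string, password?: string): Promise<unknown>;
  getMediaStream(): S;
}

interface JoinRequest {
  topic: string;
  userName: string;
  password?: string;
}

// fetchToken is hypothetical: it stands for a request to your own server,
// which signs the session JWT. The SDK does not provide it.
async function joinSession<S>(
  client: SessionClientLike<S>,
  fetchToken: (topic: string) => Promise<string>,
  request: JoinRequest,
): Promise<S> {
  if (request.topic.trim() === "" || request.userName.trim() === "") {
    throw new Error("topic and userName must be non-empty");
  }
  const token = await fetchToken(request.topic);
  await client.init("en-US", "Global", { patchJsMedia: true });
  await client.join(request.topic, token, request.userName, request.password);
  return client.getMediaStream();
}
```

Keeping the token fetch outside the SDK calls means the SDK secret never reaches the browser, and validation failures surface before any network traffic.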
## Core API Reference
### ZoomVideo (Static Methods)
| Method | Description |
| ---------------------------------- | -------------------------------------------------------- |
| `createClient()` | Create VideoClient instance (singleton) |
| `checkSystemRequirements()` | Check browser compatibility → `{ audio, video, screen }` |
| `checkFeatureRequirements()` | Get supported/unsupported features list |
| `getDevices(skipPermissionCheck?)` | Enumerate media devices |
| `createLocalAudioTrack(deviceId?)` | Create local audio track for preview |
| `createLocalVideoTrack(deviceId?)` | Create local video track for preview |
| `destroyClient()` | Destroy client instance |
| `preloadDependentAssets(path?)` | Preload WebAssembly/Worker assets |
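As a rough illustration of how the `{ audio, video, screen }` result of `checkSystemRequirements()` might gate the join flow, the sketch below lists the capabilities an app needs but the browser lacks. `SystemRequirements` and `missingCapabilities` are local names introduced for this example, not SDK exports.

```typescript
// Shape of the compatibility result described in the table above.
interface SystemRequirements {
  audio: boolean;
  video: boolean;
  screen: boolean;
}

type Capability = keyof SystemRequirements;

// Returns the capabilities the app requires but the browser does not support.
function missingCapabilities(
  result: SystemRequirements,
  required: readonly Capability[] = ["audio", "video"],
): Capability[] {
  return required.filter((capability) => !result[capability]);
}

// Hard-coded result for illustration; in the browser, pass
// ZoomVideo.checkSystemRequirements() instead.
const missing = missingCapabilities(
  { audio: true, video: true, screen: false },
  ["audio", "video", "screen"],
);
console.log(missing); // ["screen"]
```

An empty array means the session can proceed; otherwise the list can drive a targeted message such as hiding the screen-share button.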
### VideoClient Methods
#### Session Management
```typescript
// Initialize before joining
await client.init(language, dependentAssets, options?);
// Join session
await client.join(topic, token, userName, password?, idleTimeoutMins?);
// Leave or end session
await client.leave(end?); // end=true ends for all (host only)
```
#### User Management
```typescript
client.getCurrentUserInfo(): Participant;
client.getAllUser(): Participant[];
client.getUser(userId): Participant | undefined;
client.getSessionHost(): Participant | undefined;
// Host/Manager actions
client.makeHost(userId); // Transfer host
client.makeManager(userId); // Promote to manager
client.revokeManager(userId); // Revoke manager
client.removeUser(userId); // Remove participant
client.changeName(name, userId?);
```
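A participant list UI usually wants a stable order. The sketch below sorts the array returned by `client.getAllUser()` with the host first, managers next, and everyone else alphabetically. The field names `isHost`, `isManager`, and `displayName` are assumed to match the SDK's `Participant` type; verify them against the typings of your SDK version.

```typescript
// Subset of participant fields used for ordering (assumed field names).
interface RosterEntry {
  userId: number;
  displayName: string;
  isHost: boolean;
  isManager: boolean;
}

// Lower rank sorts first: host, then managers, then everyone else.
function roleRank(entry: RosterEntry): number {
  if (entry.isHost) return 0;
  if (entry.isManager) return 1;
  return 2;
}

// Returns a new sorted array and leaves the input untouched.
function sortRoster<T extends RosterEntry>(participants: readonly T[]): T[] {
  return [...participants].sort(
    (a, b) => roleRank(a) - roleRank(b) || a.displayName.localeCompare(b.displayName),
  );
}
```

For example, `sortRoster(client.getAllUser())` could feed a sidebar component directly.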
#### Feature Clients
```typescript
client.getMediaStream(); // Audio/Video/Screen share
client.getChatClient(); // In-session chat
client.getCommandClient(); // Custom signaling
client.getRecordingClient(); // Cloud recording
client.getSubsessionClient(); // Breakout rooms
client.getLiveTranscriptionClient(); // Captions
client.getLiveStreamClient(); // RTMP streaming
client.getWhiteboardClient(); // Whiteboard
```
### MediaStream (Audio/Video Control)
#### Audio
```typescript
const stream = client.getMediaStream();
// Start audio (requires user gesture)
await stream.startAudio({
  mute?: boolean,
  speakerOnly?: boolean,
  backgroundNoiseSuppression?: boolean,
  microphoneId?: string,
  speakerId?: string,
});
// Control
await stream.muteAudio();
await stream.unmuteAudio();
await stream.stopAudio();
// Device management
stream.getMicList(): MediaDevice[];
stream.getSpeakerList(): MediaDevice[];
await stream.switchMicrophone(deviceId);
await stream.switchSpeaker(deviceId);
```
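When devices are plugged in or removed, a previously selected microphone or speaker may no longer exist. The sketch below keeps the preferred device if it is still present and otherwise falls back to the first one in the list. `DeviceInfo` and `pickDevice` are names introduced for this example; the `deviceId` and `label` fields are assumed to match the SDK's `MediaDevice` type.

```typescript
// Minimal device shape (assumed to match the SDK's MediaDevice).
interface DeviceInfo {
  deviceId: string;
  label: string;
}

// Prefer the saved device when it is still connected; otherwise fall back to
// the first available device, or undefined when the list is empty.
function pickDevice(
  devices: readonly DeviceInfo[],
  preferredId?: string,
): DeviceInfo | undefined {
  return devices.find((device) => device.deviceId === preferredId) ?? devices[0];
}
```

A typical use would be `const mic = pickDevice(stream.getMicList(), savedMicId)` followed by `await stream.switchMicrophone(mic.deviceId)` when `mic` is defined, where `savedMicId` is whatever your app persisted earlier.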
#### Video
```typescript
// Start video (simple call, no options needed)
await stream.startVideo();
// Or with options
await stream.startVideo({
  cameraId?: string,
  hd?: boolean, // 720p
  fullHd?: boolean, // 1080p
  mirrored?: boolean,
  virtualBackground?: { imageUrl: string | 'blur' | undefined, cropped?: boolean },
});
// CRITICAL: Attach video to DOM
// 1. Container MUST be a <video-player-container> custom element
// 2. attachVideo returns an element - append it to the container
// 3. Do NOT pass container as third parameter
// VideoQuality is a named export: import { VideoQuality } from "@zoom/videosdk";
const videoElement = await stream.attachVideo(userId, VideoQuality.Video_720P);
container.appendChild(videoElement);
// Stop video
await stream.stopVideo();
// Detach video (cleanup)
stream.detachVideo(userId);
// Device management
stream.getCameraList(): MediaDevice[];
await stream.switchCamera(deviceId);
```
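Attach and detach calls are easy to leave unbalanced when components mount and unmount. The following sketch pairs them by returning a cleanup function. `VideoStreamLike` and `ContainerLike` are local structural stand-ins for the media stream and the `<video-player-container>` element, and the sketch assumes `attachVideo` resolves to an element; check the SDK typings for the exact return type before relying on it.

```typescript
// Structural stand-ins so the sketch compiles without the SDK or the DOM.
interface Removable {
  remove(): void;
}
interface VideoStreamLike<E> {
  attachVideo(userId: number, quality: number): Promise<E>;
  detachVideo(userId: number): unknown;
}
interface ContainerLike<E> {
  appendChild(element: E): unknown;
}

// Attaches a user's video, appends it to the container, and returns an
// idempotent cleanup that detaches the video and removes the element.
async function mountUserVideo<E extends Removable>(
  stream: VideoStreamLike<E>,
  container: ContainerLike<E>,
  userId: number,
  quality: number, // pass VideoQuality.Video_720P (or another level) from the SDK
): Promise<() => Promise<void>> {
  const element = await stream.attachVideo(userId, quality);
  container.appendChild(element);
  let disposed = false;
  return async () => {
    if (disposed) return;
    disposed = true;
    await stream.detachVideo(userId);
    element.remove();
  };
}
```

In React the returned function fits an effect cleanup; in Vue or Svelte it can be called from the unmount hook.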
#### Screen Share
```typescript
// Start sharing; the first argument is the <video> or <canvas> element that renders the local share
await stream.startShareScreen(shareElement, {
  broadcastToSubsession?: boolean,
  optimizedForSharedVideo?: boolean,
  secondaryCameraId?: string, // Share secondary camera
});
// Stop sharing
await stream.stopShareScreen();
// View others' share
await stream.startShareView(canvas, userId);
await stream.stopShareView();
```
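Viewing a remote share is normally driven by the `active-share-change` event listed under Events. The sketch below maps that payload onto `startShareView` / `stopShareView`. `ShareViewStreamLike` is a local stand-in for the media stream, and the payload shape (`state`, `userId`) is taken from the Events section of this document.

```typescript
// Structural stand-in for the share-view part of the media stream.
interface ShareViewStreamLike<C> {
  startShareView(canvas: C, userId: number): Promise<unknown>;
  stopShareView(): Promise<unknown>;
}

// Payload of the "active-share-change" event as described in this document.
interface ActiveShareChange {
  state: "Active" | "Inactive";
  userId: number;
}

// Starts rendering when a share becomes active and stops when it ends.
async function syncShareView<C>(
  stream: ShareViewStreamLike<C>,
  canvas: C,
  payload: ActiveShareChange,
): Promise<"started" | "stopped"> {
  if (payload.state === "Active") {
    await stream.startShareView(canvas, payload.userId);
    return "started";
  }
  await stream.stopShareView();
  return "stopped";
}
```

It could be wired up as `client.on("active-share-change", (payload) => syncShareView(stream, canvas, payload))`.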
### Events
```typescript
// Connection
client.on("connection-change", (payload) => {
  // payload.state: 'Connected' | 'Reconnecting' | 'Closed' | 'Fail'
});
// Users
client.on("user-added", (participants: Participant[]) => {});
client.on("user-updated", (participants: Participant[]) => {});
client.on("user-removed", (participants: Participant[]) => {});
// Audio
client.on("current-audio-change", (payload) => {
  // payload.action: 'join' | 'leave' | 'muted' | 'unmuted'
});
client.on("active-speaker", (payload) => {
  // payload: { userId: number, displayName?: string }[]
});
// Video
client.on("video-active-change", (payload) => {
  // payload.userId, payload.state: 'Active' | 'Inactive'
});
client.on("peer-video-state-change", (payload) => {
  // payload.userId, payload.action: 'Start' | 'Stop'
});
// Screen Share
client.on("active-share-change", (payload) => {
  // payload.userId, payload.state: 'Active' | 'Inactive'
});
client.on("peer-share-state-change", (payload) => {
  // payload.userId, payload.action: 'Start' | 'Stop'
});
// Chat
client.on("chat-on-message", (payload) => {
  // payload.message, payload.sender, payload.timestamp
});
// Device
client.on("device-change", () => {
  // Re-enumerate devices
});
```
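The three user events can be folded into a single roster map that UI state reads from. This is a minimal sketch under the assumption that `user-updated` delivers partial participant objects keyed by `userId`; `ParticipantLike`, `RosterEvent`, and `applyRosterEvent` are names introduced here, and `bVideoOn` / `muted` are assumed field names from the SDK's `Participant` type.

```typescript
// Partial participant shape; only userId is required (assumed field names).
interface ParticipantLike {
  userId: number;
  displayName?: string;
  bVideoOn?: boolean;
  muted?: boolean;
}

interface RosterEvent {
  type: "user-added" | "user-updated" | "user-removed";
  participants: ParticipantLike[];
}

// Pure reducer: returns a new map and never mutates the previous roster.
function applyRosterEvent(
  roster: ReadonlyMap<number, ParticipantLike>,
  event: RosterEvent,
): Map<number, ParticipantLike> {
  const next = new Map(roster);
  for (const participant of event.participants) {
    if (event.type === "user-removed") {
      next.delete(participant.userId);
    } else {
      // Merge so that partial updates keep previously known fields.
      next.set(participant.userId, { ...next.get(participant.userId), ...participant });
    }
  }
  return next;
}
```

Each listener then reduces to one line, for example `client.on("user-updated", (participants) => { roster = applyRosterEvent(roster, { type: "user-updated", participants }); })`.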
## Framework-Specific Implementation Guides
For complete implementation examples with full project setup, hooks/composables/services, and components:
- **[references/react.md](references/react.md)** - React 19 + Vite 7 + TypeScript + shadcn/ui implementation
- **[references/vue.md](references/vue.md)** - Vue 3 + Vite 7 + TypeScript + shadcn-vue implementation
- **[references/angular.md](references/angular.md)** - Angular 21 + Standalone Components + Signals implementation
- **[references/svelte.md](references/svelte.md)** - Svelte 5 + Runes + Vite 7 + TypeScript implementation
### Quick Setup Requirements
**All frameworks MUST configure:**
1. **COOP/COEP Headers** (for SharedArrayBuffer support):
```typescript
// vite.config.ts
import { defineConfig } from "vite";

export default defineConfig({
  server: {
    headers: {
      "Cross-Origin-Opener-Policy": "same-origin",
      "Cross-Origin-Embedder-Policy": "require-corp",
    },
  },
});
```
2. **TypeScript Custom Elements** (for video rendering):
```typescript
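// src/types/zoom-elements.d.ts (React only)
// React 19 typings no longer expose a global JSX namespace, so augment the "react" module.
import type React from "react";

declare module "react" {
  namespace JSX {
    interface IntrinsicElements {
      "video-player-container": React.DetailedHTMLProps<
        React.HTMLAttributes<HTMLElement>,
        HTMLElement
      >;
    }
  }
}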
```
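Whether the COOP/COEP headers actually took effect can be checked at runtime before initializing the SDK. The sketch below is one way to do that using the standard `crossOriginIsolated` and `SharedArrayBuffer` globals; `IsolationEnv` and `sharedMemoryStatus` are names introduced for this example, and the environment is passed in as a parameter so the function can be exercised outside a browser.

```typescript
// The two globals that matter, passed in explicitly for testability.
interface IsolationEnv {
  crossOriginIsolated?: boolean;
  SharedArrayBuffer?: unknown;
}

type SharedMemoryStatus = "ready" | "not-isolated" | "unavailable";

// "ready": headers are in effect and SharedArrayBuffer exists.
// "not-isolated": the browser reports the page is not cross-origin isolated.
// "unavailable": the browser does not expose what is needed.
function sharedMemoryStatus(env: IsolationEnv): SharedMemoryStatus {
  if (env.crossOriginIsolated === true && typeof env.SharedArrayBuffer !== "undefined") {
    return "ready";
  }
  if (env.crossOriginIsolated === false) {
    return "not-isolated";
  }
  return "unavailable";
}
```

In the browser this would be called as `sharedMemoryStatus(globalThis)`; a `"not-isolated"` result usually points at missing or overridden response headers.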
### Core Implementation Pattern
All implementations should follow this pattern:
1. **Client Management** - Singleton client with init/join/leave lifecycle
2. **Media Stream** - Audio/video/screen share controls with state sync
3. **Participants** - User list with video state change listeners
4. **Video Rendering** - Use `video-player-container` + `attachVideo()`
5. **Event Handling** - Connection, audio, video, chat events
6. **Error Handling** - Map SDK
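
For the error-handling step, one possible shape is a small lookup that turns a rejected SDK call into a user-facing message. This is a sketch under stated assumptions: the `{ type, reason }` error shape is assumed rather than confirmed by this document, `describeSdkError` is a name introduced here, and the message table is supplied by the caller. The only error name used in the example, `OperationBlockedByBrowserPolicy`, is taken from this skill's description; whether it arrives in `type` is an assumption to verify.

```typescript
// Assumed shape of an SDK rejection; verify against the SDK's typings.
interface SdkErrorLike {
  type?: string;
  reason?: string;
}

// Looks up a friendly message by error type, then falls back to the SDK's
// own reason text, then to a generic message.
function describeSdkError(
  error: unknown,
  messages: Readonly<Record<string, string>>,
  fallback = "Something went wrong. Please try again.",
): string {
  if (typeof error === "object" && error !== null) {
    const { type, reason } = error as SdkErrorLike;
    if (typeof type === "string" && Object.prototype.hasOwnProperty.call(messages, type)) {
      return messages[type] ?? fallback;
    }
    if (typeof reason === "string" && reason !== "") {
      return reason;
    }
  }
  return fallback;
}

// Caller-supplied table; the message text is illustrative.
const friendlyMessages: Record<string, string> = {
  OperationBlockedByBrowserPolicy: "Click anywhere on the page to enable audio.",
};
```

A call site might look like `try { await stream.startAudio(); } catch (error) { showToast(describeSdkError(error, friendlyMessages)); }`, where `showToast` is a hypothetical UI helper.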