video-sdk/web

Expert guidance for building browser-based video sessions with the Zoom Video SDK for Web (@zoom/videosdk v2.4.0) in React, Vue, Angular, Svelte, or vanilla TypeScript. Use this skill whenever the user is implementing or debugging any in-browser real-time communication feature — joining/leaving a session, capturing or rendering audio/video, gallery or active-speaker views, virtual backgrounds, screen sharing with annotation, in-session chat or command channel, recording, subsessions, live streaming, PSTN/SIP dial-out, PTZ cameras, quality stats, WebAssembly/SharedArrayBuffer setup, CSP/COOP/COEP headers, JWT session tokens, or resolving SDK error codes. Trigger even when the user doesn't explicitly say "Zoom" — signals include `@zoom/videosdk`, `ZoomVideo.createClient`, `client.getMediaStream`, `stream.startVideo`, `attachVideo`, "video conferencing", "video call app", "video SDK", "render remote video", or debugging black/green video tiles, audio that won't start, or `OperationBlockedByBrowserPolicy` errors. Prefer this skill over generic WebRTC advice whenever `@zoom/videosdk` is in play.

What this skill does


# Zoom Video SDK Web (v2.4.0)

Expert guidance for building video sessions with Zoom Video SDK for Web.

This skill is for **custom video sessions**, not embedded Zoom meetings.
If the user wants a custom UI for a real Zoom meeting, route to [../../meeting-sdk/web/component-view/SKILL.md](../../meeting-sdk/web/component-view/SKILL.md).
Use [../../probe-sdk/SKILL.md](../../probe-sdk/SKILL.md) as an optional browser/device/network readiness gate before `client.join(...)`.

## Quick Start

### Installation

```bash
bun add @zoom/videosdk
# or
npm install @zoom/videosdk --save
```

### Basic Session Setup

```typescript
import ZoomVideo from "@zoom/videosdk";

// 1. Create client
const client = ZoomVideo.createClient();

// 2. Initialize SDK
await client.init("en-US", "Global", { patchJsMedia: true });

// 3. Join session (requires JWT token from your server)
await client.join(sessionName, jwtToken, userName, sessionPassword);

// 4. Get media stream for audio/video control
const stream = client.getMediaStream();
```
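The four steps above can be wrapped in a single function so callers get the media stream only after `init` and `join` have both resolved. This is a minimal sketch under stated assumptions: `SessionClientLike`, `JoinParams`, and `joinSession` are hypothetical names, and the interface mirrors only the calls shown in the quick start instead of importing the SDK's own types.

```typescript
// Hypothetical structural type: only the client calls used in the quick start.
interface SessionClientLike<TStream> {
  init(
    language: string,
    dependentAssets: string,
    options?: { patchJsMedia?: boolean },
  ): Promise<unknown>;
  join(topic: string, token: string, userName: string, password?: string): Promise<unknown>;
  getMediaStream(): TStream;
}

// Hypothetical parameter bag; the JWT must be minted by your own server.
interface JoinParams {
  sessionName: string;
  jwtToken: string;
  userName: string;
  sessionPassword?: string;
}

// Runs init -> join -> getMediaStream in order and returns the stream.
// Any rejection from init or join propagates to the caller unchanged.
async function joinSession<TStream>(
  client: SessionClientLike<TStream>,
  params: JoinParams,
): Promise<TStream> {
  await client.init("en-US", "Global", { patchJsMedia: true });
  await client.join(
    params.sessionName,
    params.jwtToken,
    params.userName,
    params.sessionPassword,
  );
  return client.getMediaStream();
}
```

In an app, the object returned by `ZoomVideo.createClient()` would be passed as `client`; the sketch leaves error reporting to the caller.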

## Core API Reference

### ZoomVideo (Static Methods)

| Method                             | Description                                              |
| ---------------------------------- | -------------------------------------------------------- |
| `createClient()`                   | Create VideoClient instance (singleton)                  |
| `checkSystemRequirements()`        | Check browser compatibility → `{ audio, video, screen }` |
| `checkFeatureRequirements()`       | Get supported/unsupported features list                  |
| `getDevices(skipPermissionCheck?)` | Enumerate media devices                                  |
| `createLocalAudioTrack(deviceId?)` | Create local audio track for preview                     |
| `createLocalVideoTrack(deviceId?)` | Create local video track for preview                     |
| `destroyClient()`                  | Destroy client instance                                  |
| `preloadDependentAssets(path?)`    | Preload WebAssembly/Worker assets                        |

### VideoClient Methods

#### Session Management

```typescript
// Initialize before joining
await client.init(language, dependentAssets, options?);

// Join session
await client.join(topic, token, userName, password?, idleTimeoutMins?);

// Leave or end session
await client.leave(end?); // end=true ends for all (host only)
```
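Because `leave(true)` is a host-only action, a leave button can check the current user's role before deciding which variant to call. The sketch below assumes the `Participant` returned by `getCurrentUserInfo()` carries an `isHost` flag; `LeavableClient` and `leaveOrEnd` are hypothetical names.

```typescript
// Hypothetical structural type covering only the calls used here.
interface LeavableClient {
  getCurrentUserInfo(): { isHost: boolean } | undefined;
  leave(end?: boolean): Promise<unknown>;
}

// Ends the session for everyone only when requested AND the current user
// is the host; in every other case the user simply leaves.
async function leaveOrEnd(
  client: LeavableClient,
  endForAll: boolean,
): Promise<"ended" | "left"> {
  const isHost = client.getCurrentUserInfo()?.isHost === true;
  if (endForAll && isHost) {
    await client.leave(true);
    return "ended";
  }
  await client.leave();
  return "left";
}
```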

#### User Management

```typescript
client.getCurrentUserInfo(): Participant;
client.getAllUser(): Participant[];
client.getUser(userId): Participant | undefined;
client.getSessionHost(): Participant | undefined;

// Host/Manager actions
client.makeHost(userId);      // Transfer host
client.makeManager(userId);   // Promote to manager
client.revokeManager(userId); // Revoke manager
client.removeUser(userId);    // Remove participant
client.changeName(name, userId?);
```
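The array from `getAllUser()` often needs ordering before it is shown in a roster. The following sketch sorts the host first, then managers, then everyone else by name. It assumes each participant exposes `displayName`, `isHost`, and `isManager`; `ParticipantLike` and `sortForRoster` are hypothetical names.

```typescript
// Hypothetical subset of the Participant fields this helper relies on.
interface ParticipantLike {
  userId: number;
  displayName: string;
  isHost?: boolean;
  isManager?: boolean;
}

// Orders host first, then managers, then everyone else by display name.
// Returns a new array; the input is not mutated.
function sortForRoster<T extends ParticipantLike>(users: readonly T[]): T[] {
  const rank = (u: ParticipantLike): number => (u.isHost ? 0 : u.isManager ? 1 : 2);
  return [...users].sort(
    (a, b) => rank(a) - rank(b) || a.displayName.localeCompare(b.displayName),
  );
}
```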

#### Feature Clients

```typescript
client.getMediaStream(); // Audio/Video/Screen share
client.getChatClient(); // In-session chat
client.getCommandClient(); // Custom signaling
client.getRecordingClient(); // Cloud recording
client.getSubsessionClient(); // Breakout rooms
client.getLiveTranscriptionClient(); // Captions
client.getLiveStreamClient(); // RTMP streaming
client.getWhiteboardClient(); // Whiteboard
```
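Each feature client is obtained from the session client on demand. As an illustration, the sketch below broadcasts a chat message while skipping blank input. It assumes the chat client exposes `sendToAll(text)`, which should be checked against the SDK reference for your version; `ChatCapableClient` and `broadcastChat` are hypothetical names.

```typescript
// Hypothetical structural type; sendToAll is assumed, not shown in this document.
interface ChatCapableClient {
  getChatClient(): { sendToAll(text: string): Promise<unknown> };
}

// Sends a trimmed message to everyone in the session.
// Returns false without calling the SDK when the message is blank.
async function broadcastChat(client: ChatCapableClient, text: string): Promise<boolean> {
  const trimmed = text.trim();
  if (trimmed.length === 0) return false;
  await client.getChatClient().sendToAll(trimmed);
  return true;
}
```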

### MediaStream (Audio/Video Control)

#### Audio

```typescript
const stream = client.getMediaStream();

// Start audio (requires user gesture)
await stream.startAudio({
  mute?: boolean,
  speakerOnly?: boolean,
  backgroundNoiseSuppression?: boolean,
  microphoneId?: string,
  speakerId?: string,
});

// Control
await stream.muteAudio();
await stream.unmuteAudio();
stream.stopAudio();

// Device management
stream.getMicList(): MediaDevice[];
stream.getSpeakerList(): MediaDevice[];
await stream.switchMicrophone(deviceId);
await stream.switchSpeaker(deviceId);
```
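A mute button usually needs a single toggle in place of separate mute and unmute calls. The sketch below assumes the stream also exposes `isAudioMuted()`, which is not listed above; `AudioStreamLike` and `toggleMute` are hypothetical names.

```typescript
// Hypothetical structural type; isAudioMuted is an assumption about the stream.
interface AudioStreamLike {
  isAudioMuted(): boolean;
  muteAudio(): Promise<unknown>;
  unmuteAudio(): Promise<unknown>;
}

// Flips the local mute state and resolves to the new state (true = muted).
async function toggleMute(stream: AudioStreamLike): Promise<boolean> {
  if (stream.isAudioMuted()) {
    await stream.unmuteAudio();
    return false;
  }
  await stream.muteAudio();
  return true;
}
```

Audio still has to be started with `startAudio()` from a user gesture before either call has any effect.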

#### Video

```typescript
// Start video (simple call, no options needed)
await stream.startVideo();

// Or with options
await stream.startVideo({
  cameraId?: string,
  hd?: boolean,           // 720p
  fullHd?: boolean,       // 1080p
  mirrored?: boolean,
  virtualBackground?: { imageUrl: string | 'blur' | undefined, cropped?: boolean },
});

// CRITICAL: Attach video to DOM
// 1. Container MUST be a <video-player-container> custom element
// 2. attachVideo returns an element - append it to the container
// 3. Do NOT pass container as third parameter
// VideoQuality is a named export: import { VideoQuality } from "@zoom/videosdk"
const videoElement = await stream.attachVideo(userId, VideoQuality.Video_720P);
container.appendChild(videoElement);

// Stop video
await stream.stopVideo();

// Detach video (cleanup)
stream.detachVideo(userId);

// Device management
stream.getCameraList(): MediaDevice[];
await stream.switchCamera(deviceId);
```
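The attach, append, and detach steps above pair naturally into a render function that returns its own cleanup. This is a sketch under assumptions: `VideoStreamLike`, `ContainerLike`, and `renderUserVideo` are hypothetical names, and the code does not inspect the resolved value of `attachVideo`, which in the real SDK may be a failure object that should be checked.

```typescript
// Hypothetical structural types so the sketch does not depend on DOM or SDK typings.
interface VideoStreamLike<TElement> {
  attachVideo(userId: number, quality: number): Promise<TElement>;
  detachVideo(userId: number): unknown;
}

interface ContainerLike<TElement> {
  appendChild(element: TElement): unknown;
}

// Attaches a user's video, appends the returned element to the
// <video-player-container>, and returns a function that detaches it again.
async function renderUserVideo<TElement>(
  stream: VideoStreamLike<TElement>,
  container: ContainerLike<TElement>,
  userId: number,
  quality: number,
): Promise<() => void> {
  const element = await stream.attachVideo(userId, quality);
  container.appendChild(element);
  return () => {
    stream.detachVideo(userId);
  };
}
```

In a component, the returned function fits the unmount hook (for example the cleanup of a React effect), with `VideoQuality.Video_720P` passed as `quality`.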

#### Screen Share

```typescript
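// Start sharing. The first argument is the element that renders your own
// share preview (<canvas> or <video>); every option is optional.
await stream.startShareScreen(shareElement, {
  broadcastToSubsession?: boolean,
  optimizedForSharedVideo?: boolean,
  secondaryCameraId?: string, // Share secondary camera
});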

// Stop sharing
await stream.stopShareScreen();

// View others' share
await stream.startShareView(canvas, userId);
stream.stopShareView();
```
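Viewing a remote share is normally driven by the `peer-share-state-change` event, whose payload carries `userId` and an `action` of `'Start'` or `'Stop'`. The sketch below maps that payload onto `startShareView` and `stopShareView`; `ShareViewStream`, `PeerShareEvent`, and `syncShareView` are hypothetical names.

```typescript
// Hypothetical structural type covering the two share-view calls above.
interface ShareViewStream<TCanvas> {
  startShareView(canvas: TCanvas, userId: number): Promise<unknown>;
  stopShareView(): unknown;
}

// Payload shape as described for "peer-share-state-change".
interface PeerShareEvent {
  userId: number;
  action: "Start" | "Stop";
}

// Starts rendering the remote share on "Start" and stops it on "Stop".
async function syncShareView<TCanvas>(
  stream: ShareViewStream<TCanvas>,
  canvas: TCanvas,
  event: PeerShareEvent,
): Promise<void> {
  if (event.action === "Start") {
    await stream.startShareView(canvas, event.userId);
  } else {
    await stream.stopShareView();
  }
}
```

It would typically be registered as `client.on("peer-share-state-change", (payload) => syncShareView(stream, canvas, payload))`.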

### Events

```typescript
// Connection
client.on("connection-change", (payload) => {
  // payload.state: 'Connected' | 'Reconnecting' | 'Closed' | 'Fail'
});

// Users
client.on("user-added", (participants: Participant[]) => {});
client.on("user-updated", (participants: Participant[]) => {});
client.on("user-removed", (participants: Participant[]) => {});

// Audio
client.on("current-audio-change", (payload) => {
  // payload.action: 'join' | 'leave' | 'muted' | 'unmuted'
});
client.on("active-speaker", (payload) => {
  // payload: { userId: number, displayName?: string }[]
});

// Video
client.on("video-active-change", (payload) => {
  // payload.userId, payload.state: 'Active' | 'Inactive'
});
client.on("peer-video-state-change", (payload) => {
  // payload.userId, payload.action: 'Start' | 'Stop'
});

// Screen Share
client.on("active-share-change", (payload) => {
  // payload.userId, payload.state: 'Active' | 'Inactive'
});
client.on("peer-share-state-change", (payload) => {
  // payload.userId, payload.action: 'Start' | 'Stop'
});

// Chat
client.on("chat-on-message", (payload) => {
  // payload.message, payload.sender, payload.timestamp
});

// Device
client.on("device-change", () => {
  // Re-enumerate devices
});
```
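The three user events can keep a local roster in sync without polling `getAllUser()`. The sketch below merges `user-added` and `user-updated` payloads by `userId` and deletes entries on `user-removed`. It assumes update payloads may be partial and that participants carry `bVideoOn` and `muted` flags; `RosterEntry`, `RosterEvents`, and `trackRoster` are hypothetical names.

```typescript
// Hypothetical subset of Participant fields kept in the local roster.
interface RosterEntry {
  userId: number;
  displayName?: string;
  bVideoOn?: boolean;
  muted?: boolean;
}

type RosterHandler = (participants: RosterEntry[]) => void;

// Hypothetical structural type for the event-emitting part of the client.
interface RosterEvents {
  on(event: "user-added" | "user-updated" | "user-removed", handler: RosterHandler): void;
}

// Subscribes to the user events and returns a Map that stays in sync.
function trackRoster(client: RosterEvents): Map<number, RosterEntry> {
  const roster = new Map<number, RosterEntry>();
  // Merge so a partial update does not erase fields received earlier.
  const upsert: RosterHandler = (participants) => {
    for (const p of participants) {
      roster.set(p.userId, { ...roster.get(p.userId), ...p });
    }
  };
  client.on("user-added", upsert);
  client.on("user-updated", upsert);
  client.on("user-removed", (participants) => {
    for (const p of participants) roster.delete(p.userId);
  });
  return roster;
}
```

A framework store (React state, Vue ref, Svelte rune) would wrap the Map so that each mutation triggers a re-render.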

## Framework-Specific Implementation Guides

For complete implementation examples with full project setup, hooks/composables/services, and components:

- **[references/react.md](references/react.md)** - React 19 + Vite 7 + TypeScript + shadcn/ui implementation
- **[references/vue.md](references/vue.md)** - Vue 3 + Vite 7 + TypeScript + shadcn-vue implementation
- **[references/angular.md](references/angular.md)** - Angular 21 + Standalone Components + Signals implementation
- **[references/svelte.md](references/svelte.md)** - Svelte 5 + Runes + Vite 7 + TypeScript implementation

### Quick Setup Requirements

**All frameworks MUST configure:**

1. **COOP/COEP Headers** (for SharedArrayBuffer support):

```typescript
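// vite.config.ts
import { defineConfig } from "vite";

export default defineConfig({
  server: {
    // Applies to the Vite dev server only; set the same headers on your production host.
    headers: {
      "Cross-Origin-Opener-Policy": "same-origin",
      "Cross-Origin-Embedder-Policy": "require-corp",
    },
  },
});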
```

2. **TypeScript Custom Elements** (for video rendering):

```typescript
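// src/types/zoom-elements.d.ts (React/TSX projects)
// React 19 removed the global JSX namespace, so augment the one inside "react".
import type React from "react";

declare module "react" {
  namespace JSX {
    interface IntrinsicElements {
      "video-player-container": React.DetailedHTMLProps<
        React.HTMLAttributes<HTMLElement>,
        HTMLElement
      >;
    }
  }
}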
```
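Whether the COOP/COEP headers from item 1 actually took effect can be verified at runtime: browsers set `crossOriginIsolated` to `true` only when both headers are served, and `SharedArrayBuffer` is exposed only in that state. The check below is a sketch; `IsolationEnv` and `canUseSharedArrayBuffer` are hypothetical names, and the environment is passed in so the function can be exercised outside a browser.

```typescript
// Hypothetical view of the two globals this check reads.
interface IsolationEnv {
  crossOriginIsolated?: boolean;
  SharedArrayBuffer?: unknown;
}

// True when the page is cross-origin isolated and SharedArrayBuffer is defined.
function canUseSharedArrayBuffer(env: IsolationEnv): boolean {
  return env.crossOriginIsolated === true && typeof env.SharedArrayBuffer !== "undefined";
}

// In a browser this would be called as: canUseSharedArrayBuffer(globalThis)
```

A `false` result before `client.init(...)` usually points at missing or overridden headers on the page or on one of its embedded resources.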

### Core Implementation Pattern

All implementations should follow this pattern:

1. **Client Management** - Singleton client with init/join/leave lifecycle
2. **Media Stream** - Audio/video/screen share controls with state sync
3. **Participants** - User list with video state change listeners
4. **Video Rendering** - Use `video-player-container` + `attachVideo()`
5. **Event Handling** - Connection, audio, video, chat events
6. **Error Handling** - Map SDK

Files: 31

Size: 451.3 KB

Complexity: 77/100

Category: Image & Video

Source: https://github.com/zoom/zoom-plugin/tree/main/skills/video-sdk/web
