# Zoom Video SDK - Linux Development

Expert guidance for developing with the Zoom Video SDK on Linux. Build headless bots, raw media capture/injection applications, and custom UI integrations with Qt/GTK.

**Official Documentation**: https://developers.zoom.us/docs/video-sdk/linux/
**API Reference**: https://marketplacefront.zoom.us/sdk/custom/linux/
**Sample Repository**: https://github.com/zoom/videosdk-linux-raw-recording-sample

## Quick Links

**New to Video SDK? Follow this path:**

1. **[SDK Architecture Pattern](concepts/sdk-architecture-pattern.md)** - Universal 3-step pattern for ANY feature
2. **[Session Join Pattern](examples/session-join-pattern.md)** - Complete working code to join a session
3. **[Raw Data vs Canvas](concepts/raw-data-vs-canvas.md)** - **CRITICAL**: Linux has NO Canvas API - raw data ONLY
4. **[Raw Video Capture](examples/raw-video-capture.md)** - Capture and process YUV420 frames

**Reference:**
- **[Singleton Hierarchy](concepts/singleton-hierarchy.md)** - 5-level SDK navigation map
- **[API Reference](references/linux-reference.md)** - Complete API documentation
- **[Qt/GTK Integration](examples/qt-gtk-integration.md)** - UI framework patterns
- **[Troubleshooting](troubleshooting/common-issues.md)** - Quick diagnostics
- **[SKILL.md](SKILL.md)** - Complete documentation navigation

**Having issues?**
- PulseAudio setup → [PulseAudio Guide](troubleshooting/pulseaudio-setup.md)
- Qt dependencies → [Qt Dependencies](troubleshooting/qt-dependencies.md)
- Build errors → [Build Errors Guide](troubleshooting/build-errors.md)

## Key Differences from Windows/macOS

| Feature | Linux | Windows/macOS |
|---------|-------|-------------|
| **Canvas API** | ❌ Not available | ✅ Available |
| **Raw Data Pipe** | ✅ **ONLY option** | ✅ Available |
| **UI Integration** | Qt, GTK, SDL2, OpenGL | Win32/WinForms/WPF, Cocoa |
| **Headless Support** | ✅ Excellent (Docker) | Limited |
| **Audio** | PulseAudio required | Native |
| **Virtual Devices** | ✅ Required for headless | Optional |

## SDK Overview

The Zoom Video SDK for Linux is a C++ library optimized for:
- **Headless Bots**: Docker/WSL support, no display required
- **Raw Data Access**: Capture YUV420 video, PCM audio
- **Raw Data Injection**: Virtual camera/mic for custom media
- **Screen Sharing**: Capture or inject share data
- **Cloud Recording**: Record sessions to Zoom cloud
- **Live Streaming**: Stream to RTMP endpoints
- **Live Transcription**: Real-time speech-to-text
- **Qt/GTK Integration**: Full UI framework support
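
YUV420 frames like the ones listed above are commonly stored as three planes (I420): a full-resolution Y plane followed by quarter-resolution U and V planes. The following is a minimal sketch of the size arithmetic, assuming tightly packed planes with even dimensions and no row padding. `I420Layout` and `i420_layout` are hypothetical helper names, not SDK types, and the buffers the SDK actually delivers may use different strides.

```cpp
#include <cassert>
#include <cstddef>

// Byte layout of one tightly packed I420 (YUV420 planar) frame.
// Assumption: no row padding; real SDK buffers may differ.
struct I420Layout {
    std::size_t y_size;    // bytes in the Y plane
    std::size_t uv_size;   // bytes in each of the U and V planes
    std::size_t u_offset;  // offset of the U plane from the frame start
    std::size_t v_offset;  // offset of the V plane from the frame start
    std::size_t total;     // total bytes for the frame
};

inline I420Layout i420_layout(std::size_t width, std::size_t height) {
    // Chroma is subsampled 2x2, so this sketch requires even dimensions.
    assert(width % 2 == 0 && height % 2 == 0);
    I420Layout layout{};
    layout.y_size = width * height;
    layout.uv_size = (width / 2) * (height / 2);
    layout.u_offset = layout.y_size;
    layout.v_offset = layout.y_size + layout.uv_size;
    layout.total = layout.y_size + 2 * layout.uv_size;
    return layout;
}
```

For a 640x360 frame this gives a 230400-byte Y plane and two 57600-byte chroma planes, 345600 bytes in total.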

## Prerequisites

### System Requirements

- **OS**: Ubuntu 20.04+, Debian 11+, or compatible
- **Architecture**: x64 (recommended), ARM64
- **Compiler**: GCC 9+, Clang 10+
- **CMake**: 3.14 or later
- **Qt5**: Bundled with SDK (do NOT install system Qt5)

### Dependencies

```bash
sudo apt update
sudo apt install -y build-essential gcc cmake libglib2.0-dev liblzma-dev \
    libxcb-image0 libxcb-keysyms1 libxcb-xfixes0 libxcb-xkb1 libxcb-shape0 \
    libxcb-shm0 libxcb-randr0 libxcb-xtest0 libgbm1 libxtst6 libgl1 libnss3 \
    libasound2 libpulse0

# For headless Linux
sudo apt install -y pulseaudio

# PulseAudio configuration (CRITICAL for audio)
mkdir -p ~/.config
echo "[General]" > ~/.config/zoomus.conf
echo "system.audio.type=default" >> ~/.config/zoomus.conf

# Log directory
mkdir -p ~/.zoom/logs
```

## Quick Start

```cpp
#include "zoom_video_sdk_api.h"
#include "zoom_video_sdk_interface.h"
#include "zoom_video_sdk_delegate_interface.h"

USING_ZOOM_VIDEO_SDK_NAMESPACE

// 1. Create SDK
IZoomVideoSDK* sdk = CreateZoomVideoSDKObj();

// 2. Initialize (heap memory mode for all raw data, see Critical Gotchas)
ZoomVideoSDKInitParams init_params;
init_params.domain = "https://zoom.us";
init_params.enableLog = true;
init_params.logFilePrefix = "bot";
init_params.videoRawDataMemoryMode = ZoomVideoSDKRawDataMemoryModeHeap;
init_params.shareRawDataMemoryMode = ZoomVideoSDKRawDataMemoryModeHeap;
init_params.audioRawDataMemoryMode = ZoomVideoSDKRawDataMemoryModeHeap;

sdk->initialize(init_params);

// 3. Add delegate
// myDelegate is your own IZoomVideoSDKDelegate implementation (not shown).
sdk->addListener(myDelegate);

// 4. Join session
ZoomVideoSDKSessionContext ctx;
ctx.sessionName = "my-session";
ctx.userName = "Linux Bot";
ctx.token = "jwt-token";  // placeholder: supply a real session JWT
ctx.audioOption.connect = true;
ctx.audioOption.mute = false;
ctx.videoOption.localVideoOn = false;

// For headless: virtual audio speaker
// VirtualSpeaker is an application-defined class (not part of the SDK)
// that implements IZoomVideoSDKVirtualAudioSpeaker.
ctx.virtualAudioSpeaker = new VirtualSpeaker();

IZoomVideoSDKSession* session = sdk->joinSession(ctx);
if (!session) {
    // Join failed: check the logs under ~/.zoom/logs.
}
```

See **[Session Join Pattern](examples/session-join-pattern.md)** for complete code.
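
The Quick Start hard-codes the token for brevity. A headless bot running in Docker would more plausibly read it from the environment. The helper below is a small sketch of that idea; `require_env` and the variable name `ZOOM_SESSION_JWT` are hypothetical and not part of the SDK.

```cpp
#include <cassert>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Returns the value of an environment variable, or throws if it is
// unset or empty. Hypothetical helper, not an SDK function.
inline std::string require_env(const char* name) {
    assert(name != nullptr);
    const char* value = std::getenv(name);
    if (value == nullptr || *value == '\0') {
        throw std::runtime_error(
            std::string("missing environment variable: ") + name);
    }
    return std::string(value);
}
```

Because the Quick Start assigns a C string to `ctx.token`, keep the returned `std::string` alive until the join call has returned, for example `const std::string jwt = require_env("ZOOM_SESSION_JWT"); ctx.token = jwt.c_str();`.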

## Key Features

| Feature | Linux Support | Guide |
|---------|---------------|-------|
| **Session Management** | ✅ Full | [Session Join](examples/session-join-pattern.md) |
| **Raw Video (YUV420)** | ✅ ONLY rendering option | [Raw Video](examples/raw-video-capture.md) |
| **Raw Audio (PCM)** | ✅ Full | [Raw Audio](examples/raw-audio-capture.md) |
| **Virtual Camera/Mic** | ✅ Full | [Virtual Devices](examples/virtual-audio-video.md) |
| **Cloud Recording** | ✅ Full | [Recording](examples/cloud-recording.md) |
| **Live Streaming** | ✅ Full | [Live Stream](examples/live-streaming.md) |
| **Live Transcription** | ✅ Full | [Transcription](examples/transcription.md) |
| **Command Channel** | ✅ Full | [Commands](examples/command-channel.md) |
| **Chat** | ✅ Full | [Chat](examples/chat.md) |
| **Qt Integration** | ✅ Recommended | [Qt/GTK](examples/qt-gtk-integration.md) |
| **GTK Integration** | ✅ Supported | [Qt/GTK](examples/qt-gtk-integration.md) |
| **Docker/Headless** | ✅ Excellent | [Virtual Devices](examples/virtual-audio-video.md) |

## Critical Gotchas

### ⚠️ CRITICAL #1: No Canvas API on Linux

**Problem**: The Linux SDK does NOT provide the Canvas API that the Windows and macOS SDKs offer.

**Solution**: You MUST use Raw Data Pipe and implement your own rendering.

See: **[Raw Data vs Canvas](concepts/raw-data-vs-canvas.md)**
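
Implementing your own rendering usually starts with converting YUV420 frames to RGB before handing them to Qt, GTK, SDL2 or OpenGL. The sketch below shows one way to do that on the CPU. It assumes tightly packed I420 planes and BT.601 limited-range color, neither of which is confirmed by this guide, and `Rgb`, `yuv_to_rgb` and `i420_to_rgb24` are hypothetical names rather than SDK functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rgb { std::uint8_t r, g, b; };

inline std::uint8_t clamp_u8(int v) {
    return static_cast<std::uint8_t>(std::min(255, std::max(0, v)));
}

// Integer BT.601 limited-range conversion for a single pixel.
// Assumption: the frames use this color matrix.
inline Rgb yuv_to_rgb(std::uint8_t y, std::uint8_t u, std::uint8_t v) {
    const int c = static_cast<int>(y) - 16;
    const int d = static_cast<int>(u) - 128;
    const int e = static_cast<int>(v) - 128;
    return Rgb{clamp_u8((298 * c + 409 * e + 128) / 256),
               clamp_u8((298 * c - 100 * d - 208 * e + 128) / 256),
               clamp_u8((298 * c + 516 * d + 128) / 256)};
}

// Converts one tightly packed I420 frame to interleaved RGB24.
inline std::vector<std::uint8_t> i420_to_rgb24(const std::uint8_t* y_plane,
                                               const std::uint8_t* u_plane,
                                               const std::uint8_t* v_plane,
                                               std::size_t width,
                                               std::size_t height) {
    assert(width % 2 == 0 && height % 2 == 0);
    std::vector<std::uint8_t> rgb(width * height * 3);
    for (std::size_t row = 0; row < height; ++row) {
        for (std::size_t col = 0; col < width; ++col) {
            // Each 2x2 block of luma samples shares one U and one V sample.
            const std::size_t chroma = (row / 2) * (width / 2) + col / 2;
            const Rgb p = yuv_to_rgb(y_plane[row * width + col],
                                     u_plane[chroma], v_plane[chroma]);
            const std::size_t out = (row * width + col) * 3;
            rgb[out] = p.r;
            rgb[out + 1] = p.g;
            rgb[out + 2] = p.b;
        }
    }
    return rgb;
}
```

The resulting buffer can be uploaded as a texture or wrapped in an image type of your UI toolkit. For real-time rendering, a GPU shader or an optimized library is likely a better fit than this per-pixel loop.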

### ⚠️ CRITICAL #2: PulseAudio Required for Audio

**Problem**: The SDK requires PulseAudio for its raw audio functions.

**Solution**:
```bash
sudo apt install -y pulseaudio
mkdir -p ~/.config
echo "[General]" > ~/.config/zoomus.conf
echo "system.audio.type=default" >> ~/.config/zoomus.conf
```

See: **[PulseAudio Setup](troubleshooting/pulseaudio-setup.md)**
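
A bot can check this configuration at startup instead of failing silently later. The following sketch parses the file contents from a stream and reports whether the `[General]` section sets `system.audio.type=default`. The parsing is deliberately simplistic, and `trim` and `has_default_audio_type` are hypothetical helpers, not SDK functions.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Strips leading and trailing whitespace.
inline std::string trim(const std::string& s) {
    const std::string::size_type first = s.find_first_not_of(" \t\r\n");
    if (first == std::string::npos) return "";
    const std::string::size_type last = s.find_last_not_of(" \t\r\n");
    assert(last != std::string::npos);  // holds whenever first was found
    return s.substr(first, last - first + 1);
}

// Returns true if the [General] section contains
// system.audio.type=default.
inline bool has_default_audio_type(std::istream& conf) {
    std::string line;
    std::string section;
    while (std::getline(conf, line)) {
        line = trim(line);
        if (line.empty()) continue;
        if (line.front() == '[' && line.back() == ']') {
            section = line;  // remember the current section header
            continue;
        }
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos) continue;
        if (section == "[General]" &&
            trim(line.substr(0, eq)) == "system.audio.type" &&
            trim(line.substr(eq + 1)) == "default") {
            return true;
        }
    }
    return false;
}
```

In a real bot the stream would be a `std::ifstream` opened on `~/.config/zoomus.conf` (with `~` expanded from `HOME`), and a `false` result would be logged before joining a session.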

### ⚠️ CRITICAL #3: Qt5 Dependencies

**Problem**: The SDK requires the Qt5 libraries bundled in its package, NOT the system Qt5.

**Solution**:
```bash
# Copy from SDK package
cp -r samples/qt_libs/Qt/lib/* lib/zoom_video_sdk/

# Create symlinks
cd lib/zoom_video_sdk
for lib in libQt5*.so.5; do ln -sf "$lib" "${lib%.5}"; done
```

See: **[Qt Dependencies](troubleshooting/qt-dependencies.md)**

### ⚠️ CRITICAL #4: Heap Memory Mode

Always use heap mode for raw data:

```cpp
init_params.videoRawDataMemoryMode = ZoomVideoSDKRawDataMemoryModeHeap;
init_params.shareRawDataMemoryMode = ZoomVideoSDKRawDataMemoryModeHeap;
init_params.audioRawDataMemoryMode = ZoomVideoSDKRawDataMemoryModeHeap;
```

### ⚠️ CRITICAL #5: Virtual Audio for Headless

**Problem**: Docker/headless environments have no audio devices.

**Solution**: Use virtual audio speaker and mic.

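```cpp
// VirtualSpeaker and VirtualMic are application-defined classes that
// implement IZoomVideoSDKVirtualAudioSpeaker and IZoomVideoSDKVirtualAudioMic.
// ctx is the ZoomVideoSDKSessionContext from the Quick Start.
ctx.virtualAudioSpeaker = new VirtualSpeaker();
ctx.virtualAudioMic = new VirtualMic();
```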

See: **[Virtual Audio/Video](examples/virtual-audio-video.md)**
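
One thing a virtual speaker implementation might do with the PCM it receives is write it to a WAV file. The class below is an SDK-independent sketch of that step, assuming 16-bit signed little-endian PCM. `PcmWavWriter` is a hypothetical name, and the sample rate and channel count should come from whatever the SDK reports for the audio it delivers.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Streams 16-bit PCM into a WAV file; the header sizes are patched on close().
class PcmWavWriter {
public:
    PcmWavWriter(const std::string& path, std::uint32_t sample_rate,
                 std::uint16_t channels)
        : file_(std::fopen(path.c_str(), "wb")),
          sample_rate_(sample_rate),
          channels_(channels) {
        assert(channels > 0);
        if (file_) write_header();  // sizes are zero for now
    }
    ~PcmWavWriter() { close(); }
    PcmWavWriter(const PcmWavWriter&) = delete;
    PcmWavWriter& operator=(const PcmWavWriter&) = delete;

    bool ok() const { return file_ != nullptr; }
    std::uint32_t data_bytes() const { return data_bytes_; }

    // Appends raw PCM bytes, e.g. from an audio callback.
    void append(const char* data, std::uint32_t len) {
        if (!file_ || !data) return;
        data_bytes_ += static_cast<std::uint32_t>(std::fwrite(data, 1, len, file_));
    }

    // Rewrites the header with the final sizes and closes the file.
    void close() {
        if (!file_) return;
        std::fseek(file_, 0, SEEK_SET);
        write_header();
        std::fclose(file_);
        file_ = nullptr;
    }

private:
    void put_u16(std::uint16_t v) {
        const unsigned char b[2] = {static_cast<unsigned char>(v & 0xFF),
                                    static_cast<unsigned char>((v >> 8) & 0xFF)};
        std::fwrite(b, 1, 2, file_);
    }
    void put_u32(std::uint32_t v) {
        put_u16(static_cast<std::uint16_t>(v & 0xFFFF));
        put_u16(static_cast<std::uint16_t>((v >> 16) & 0xFFFF));
    }
    void write_header() {  // canonical 44-byte PCM WAV header
        const std::uint16_t bits = 16;
        const std::uint16_t block_align =
            static_cast<std::uint16_t>(channels_ * (bits / 8));
        std::fwrite("RIFF", 1, 4, file_);
        put_u32(36 + data_bytes_);
        std::fwrite("WAVE", 1, 4, file_);
        std::fwrite("fmt ", 1, 4, file_);
        put_u32(16);  // fmt chunk size
        put_u16(1);   // format tag 1 = PCM
        put_u16(channels_);
        put_u32(sample_rate_);
        put_u32(sample_rate_ * block_align);  // byte rate
        put_u16(block_align);
        put_u16(bits);
        std::fwrite("data", 1, 4, file_);
        put_u32(data_bytes_);
    }

    std::FILE* file_;
    std::uint32_t sample_rate_;
    std::uint16_t channels_;
    std::uint32_t data_bytes_ = 0;
};
```

A virtual speaker class could own one `PcmWavWriter`, call `append` from its audio callback, and call `close` when the session ends. Since SDK callbacks may arrive on a different thread, guard the writer with a mutex if it is shared.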

## Sample Repositories

### Official Samples

| Repository | Description |
|-----------|-------------|
| **[raw-recording-sample](https://github.com/zoom/videosdk-linux-raw-recording-sample)** | Raw audio/video capture |
| **[qt-quickstart](https://github.com/tanchunsiong/videosdk-linux-qt-quickstart)** | Qt6 UI integration |
| **[gtk-quickstart](https://github.com/tanchunsiong/videosdk-linux-gtk-quickstart)** | GTK3 UI integration |

### Sample Architecture

```
Headless Bot (Docker):
┌──────────────────────────────────┐
│  Virtual Audio Speaker/Mic       │
├──────────────────────────────────┤
│  Raw Data Processing             │
│  - YUV420 → File/Stream          │
└──────────────────────────────────┘
```

Source: https://github.com/anthropics/knowledge-work-plugins/tree/main/partner-built/zoom-plugin/skills/video-sdk/linux
