livekit-voice-agent

Guide for building production-ready LiveKit voice AI agents with multi-agent workflows and intelligent handoffs. Use when creating real-time voice agents that need to transfer control between specialized agents, implement supervisor escalation, or build complex conversational systems.

# LiveKit Voice Agent with Multi-Agent Handoffs

Build production-ready voice AI agents using the LiveKit Agents framework, with support for multi-agent workflows, intelligent handoffs, and specialized agent capabilities.

---

## Overview

LiveKit Agents enables building real-time multimodal AI agents with voice capabilities. This skill helps you create sophisticated voice systems where multiple specialized agents can seamlessly hand off conversations based on context, user needs, or business logic.

### Key Capabilities

- **Multi-Agent Workflows**: Chain multiple specialized agents with different instructions, tools, and models
- **Intelligent Handoffs**: Transfer control between agents using function tools
- **Context Preservation**: Maintain conversation state and user data across agent transitions
- **Flexible Architecture**: Support for lateral handoffs (peer agents), escalations (human operators), and returns
- **Production Ready**: Built-in testing, Docker deployment, and monitoring support

---

## Architecture Patterns

### Core Components

1. **AgentSession**: Orchestrates the overall interaction, manages shared services (VAD, STT, LLM, TTS), and holds shared userdata
2. **Agent Classes**: Individual agents with specific instructions, function tools, and optional model overrides
3. **Handoff Mechanism**: Function tools that return new agent instances to transfer control
4. **Shared Context**: UserData dataclass that persists information across agent handoffs

### Workflow Structure

```
  ┌─────────────────────────────────────────────────┐
  │           AgentSession (Orchestrator)           │
  │  ├─ Shared VAD, STT, TTS, LLM services          │
  │  ├─ Shared UserData context                     │
  │  └─ Agent lifecycle management                  │
  └─────────────────────────────────────────────────┘
                           │
        ┌──────────────────┼──────────────────┐
        ▼                  ▼                  ▼
┌───────────────┐  ┌───────────────┐  ┌───────────────┐
│    Agent A    │  │    Agent B    │  │    Agent C    │
│ ├─Instructions│  │ ├─Instructions│  │ ├─Instructions│
│ ├─Tools       │  │ ├─Tools       │  │ ├─Tools       │
│ └─Handoff     │  │ └─Handoff     │  │ └─Handoff     │
└───────────────┘  └───────────────┘  └───────────────┘
```
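
**Handoff flow in miniature:**

The sketch below models the control flow described above in plain Python, without importing LiveKit. `ToySession`, `ToyAgent`, `GreeterAgent`, `BillingAgent`, and `SharedData` are hypothetical names invented for illustration and are not part of the LiveKit Agents API. The idea it shows is that a handler returning a new agent instance causes the session to swap the active agent while the shared data object stays the same.

```python
from dataclasses import dataclass, field


@dataclass
class SharedData:
    """Stands in for the shared userdata held by the session (hypothetical)."""
    notes: list[str] = field(default_factory=list)


class ToyAgent:
    """Hypothetical stand-in for an agent; NOT the LiveKit Agent class."""
    name = "base"

    def handle(self, text: str, data: SharedData) -> "ToyAgent | None":
        """Return a new agent to hand off, or None to keep control."""
        return None


class BillingAgent(ToyAgent):
    name = "billing"

    def handle(self, text, data):
        data.notes.append(f"billing handled: {text}")
        return None  # keeps control


class GreeterAgent(ToyAgent):
    name = "greeter"

    def handle(self, text, data):
        data.notes.append(f"greeter heard: {text}")
        # Handoff trigger: returning a new agent instance transfers control
        if "invoice" in text.lower():
            return BillingAgent()
        return None


class ToySession:
    """Owns the active agent and the shared data, like the orchestrator box."""

    def __init__(self, agent: ToyAgent, data: SharedData):
        self.agent = agent
        self.data = data

    def on_user_text(self, text: str) -> str:
        next_agent = self.agent.handle(text, self.data)
        if next_agent is not None:
            self.agent = next_agent  # swap agents, keep the same data object
        return self.agent.name


session = ToySession(GreeterAgent(), SharedData())
first = session.on_user_text("Hi there")
second = session.on_user_text("I have a question about my invoice")
third = session.on_user_text("It was charged twice")
```

After the second utterance the billing agent is active, and `session.data.notes` still contains what the greeter recorded. In the real framework, the handoff is expressed through function tools, as described under Core Components.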

---

## Implementation Process

### Phase 1: Research and Planning

#### 1.1 Study LiveKit Documentation

**Load core documentation (use WebFetch):**
- LiveKit Agents Overview: `https://docs.livekit.io/agents/`
- Building Voice Agents: `https://docs.livekit.io/agents/build/`
- Workflows Guide: `https://docs.livekit.io/agents/build/workflows/`
- Testing Framework: `https://docs.livekit.io/agents/build/testing/`

**Study example implementations:**
- Agent Starter Template: `https://github.com/livekit-examples/agent-starter-python`
- Multi-Agent Example: `https://github.com/livekit-examples/multi-agent-python`
- Voice Agent Examples: `https://github.com/livekit/agents/tree/main/examples/voice_agents`

**Load reference documentation:**
- [📋 Agent Best Practices](./reference/agent_best_practices.md)
- [🏗️ Multi-Agent Patterns](./reference/multi_agent_patterns.md)
- [🧪 Testing Guide](./reference/testing_guide.md)

#### 1.2 Define Your Use Case

Determine your agent workflow:

**Customer Support Pattern:**
```
Greeting Agent → Triage Agent → Technical Support → Escalation Agent
```

**Sales Pipeline Pattern:**
```
Intro Agent → Qualification Agent → Demo Agent → Account Executive Handoff
```

**Service Workflow Pattern:**
```
Reception Agent → Information Gathering → Specialist Agent → Confirmation Agent
```

**Plan your agents:**
- List each agent needed
- Define the role and instructions for each
- Identify handoff triggers and conditions
- Specify tools needed per agent
- Determine if agents need different models (STT/LLM/TTS)
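
**Sanity-check the plan:**

One way to make the planning list concrete is to write it down as data and check it before implementing anything. The sketch below is a planning aid only, assuming a hypothetical `AgentPlan` record and `find_unknown_targets` helper. Neither is part of LiveKit, and the agent and tool names are placeholders based on the Customer Support Pattern above.

```python
from dataclasses import dataclass, field


@dataclass
class AgentPlan:
    """Hypothetical planning record for one agent (not a LiveKit type)."""
    role: str
    tools: list[str] = field(default_factory=list)
    handoff_to: list[str] = field(default_factory=list)


def find_unknown_targets(plan: dict[str, AgentPlan]) -> list[tuple[str, str]]:
    """Return (agent, target) pairs whose handoff target is not in the plan."""
    return [
        (name, target)
        for name, spec in plan.items()
        for target in spec.handoff_to
        if target not in plan
    ]


# Placeholder plan following the customer support pattern
support_plan = {
    "greeting": AgentPlan(role="Welcome the caller", handoff_to=["triage"]),
    "triage": AgentPlan(role="Classify the issue", handoff_to=["technical_support"]),
    "technical_support": AgentPlan(
        role="Resolve technical issues",
        tools=["lookup_ticket"],  # illustrative tool name
        handoff_to=["escalation"],
    ),
    "escalation": AgentPlan(role="Bring in a human operator"),
}

unknown = find_unknown_targets(support_plan)
```

An empty `unknown` list means every handoff points at an agent you have actually planned. A non-empty list names the dangling transitions to resolve first.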

#### 1.3 Design Shared Context

Create a dataclass to store information that persists across agents:

```python
from dataclasses import dataclass, field

@dataclass
class ConversationData:
    """Shared context across all agents"""
    user_name: str = ""
    user_email: str = ""
    issue_category: str = ""
    collected_details: list[str] = field(default_factory=list)
    escalation_needed: bool = False
    # Add fields relevant to your use case
```
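
**Why `field(default_factory=list)` matters:**

The short example below repeats the dataclass so that it runs on its own. It shows that each session gets its own `collected_details` list, and adds a hypothetical `summarize_for_handoff` helper (an assumption for illustration, not something the framework provides) that turns the filled-in fields into a one-line summary the next agent could be given.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class ConversationData:
    """Shared context across all agents"""
    user_name: str = ""
    user_email: str = ""
    issue_category: str = ""
    collected_details: list[str] = field(default_factory=list)
    escalation_needed: bool = False


def summarize_for_handoff(data: ConversationData) -> str:
    """Hypothetical helper: list only the fields that have been filled in."""
    filled = {key: value for key, value in asdict(data).items() if value}
    return "; ".join(f"{key}={value}" for key, value in filled.items())


first_call = ConversationData()
second_call = ConversationData()

first_call.user_name = "Ada"
first_call.collected_details.append("router restarts hourly")

summary = summarize_for_handoff(first_call)

# default_factory builds a fresh list per instance, so the second
# call's details are unaffected by what the first call collected.
details_are_isolated = second_call.collected_details == []
```

A plain `= []` default is rejected by `dataclasses` with a `ValueError` precisely because a shared mutable default would leak data between instances.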

---

### Phase 2: Implementation

#### 2.1 Set Up Project Structure

Use the provided template as a starting point:

```
your-agent-project/
├── src/
│   ├── agent.py              # Main entry point
│   ├── agents/
│   │   ├── __init__.py
│   │   ├── intro_agent.py    # Initial agent
│   │   ├── specialist_agent.py
│   │   └── escalation_agent.py
│   ├── models/
│   │   └── shared_data.py    # UserData dataclass
│   └── tools/
│       └── custom_tools.py   # Business-specific tools
├── tests/
│   └── test_agent.py         # pytest tests
├── pyproject.toml            # Dependencies with uv
├── .env.example              # Environment variables template
├── Dockerfile                # Container definition
└── README.md
```

**Use the quick start script or copy template files:**
- See [⚡ Quick Start Script](./scripts/quickstart.sh) for automated setup
- Or manually copy files from `./templates/` directory

#### 2.2 Initialize Project

**Install uv package manager:**
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

**Create project with dependencies:**
```bash
# Initialize project
uv init your-agent-project
cd your-agent-project

# Add dependencies
uv add "livekit-agents>=1.3.3"
uv add "livekit-plugins-openai"      # For OpenAI LLM & TTS
uv add "livekit-plugins-deepgram"    # For Deepgram STT
uv add "livekit-plugins-silero"      # For Silero VAD
uv add "python-dotenv"               # For environment variables

# Add testing dependencies
uv add --dev "pytest"
uv add --dev "pytest-asyncio"
```

**Set up environment variables:**
```bash
# Copy from template
cp .env.example .env

# Edit with your credentials
# LIVEKIT_URL=wss://your-livekit-server.com
# LIVEKIT_API_KEY=your-api-key
# LIVEKIT_API_SECRET=your-api-secret
# OPENAI_API_KEY=your-openai-key
# DEEPGRAM_API_KEY=your-deepgram-key
```
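
**Check credentials at startup (optional):**

A missing key tends to surface later as an opaque connection or authentication error, so it can help to fail early. The sketch below checks the five variables listed above. `missing_env_vars` and `REQUIRED_VARS` are hypothetical names for illustration, and the example values are placeholders.

```python
import os
from typing import Mapping, Optional

# Variable names taken from the .env template above
REQUIRED_VARS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "OPENAI_API_KEY",
    "DEEPGRAM_API_KEY",
)


def missing_env_vars(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Demonstration with an explicit mapping instead of the real environment
example_env = {
    "LIVEKIT_URL": "wss://example.invalid",  # placeholder value
    "LIVEKIT_API_KEY": "placeholder-key",
}
missing = missing_env_vars(example_env)
```

In a real entry point you would call `missing_env_vars()` with no arguments after `load_dotenv()` and stop with a clear message if the result is non-empty.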

#### 2.3 Implement Core Infrastructure

**Create main entry point (src/agent.py):**

Load the complete template: [🚀 Main Entry Point Template](./templates/main_entry_point.py)

Key patterns:
- Use `prewarm()` to load static resources (VAD models) before sessions start
- Initialize `AgentSession[YourDataClass]` with shared services
- Start with your initial agent in the entrypoint
- Register the main handler either with the `@server.rtc_session()` decorator or, as in the example below, by passing it to `WorkerOptions(entrypoint_fnc=...)`

**Example structure:**
```python
import logging

from dotenv import load_dotenv
from livekit.agents import (
    AgentSession,
    JobContext,
    JobProcess,
    WorkerOptions,
    cli,
)
from livekit.plugins import deepgram, openai, silero

from agents.intro_agent import IntroAgent
from models.shared_data import ConversationData

load_dotenv()
logger = logging.getLogger("voice-agent")


def prewarm(proc: JobProcess):
    """Load static resources before sessions start"""
    # Load VAD model once and reuse across sessions
    proc.userdata["vad"] = silero.VAD.load()


async def entrypoint(ctx: JobContext):
    """Main agent entry point"""
    logger.info("Starting voice agent session")

    # Get prewarmed VAD
    vad = ctx.proc.userdata["vad"]

    # Initialize session with shared services
    session = AgentSession[ConversationData](
        vad=vad,
        stt=deepgram.STT(model="nova-2-general"),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=openai.TTS(voice="alloy"),
        userdata=ConversationData(),
    )

    # Connect to room
    await ctx.connect()

    # Start with intro agent
    intro_agent = IntroAgent()

    # Run session (handles all handoffs automatically)
    await session.start(agent=intro_agent, room=ctx.room)


if __name__ == "__main__":
    cli.run_app(
        WorkerOptions(
            entrypoint_fnc=entrypoint,
            prewarm_fnc=prewarm,
        )
    )
```

#### 2.4 Implement Agent Classes

**Agent structure:**

Eac
