elevenlabs-agents

Included with Lifetime

$97 forever

Build conversational AI voice agents with ElevenLabs Platform using React, JavaScript, React Native, or Swift SDKs. Configure agents, tools (client/server/MCP), RAG knowledge bases, multi-voice, and Scribe real-time STT. Use when: building voice chat interfaces, implementing AI phone agents with Twilio, configuring agent workflows or tools, adding RAG knowledge bases, testing with CLI "agents as code", or troubleshooting deprecated @11labs packages, Android audio cutoff, CSP violations, dynamic variables, or WebRTC config. Keywords: ElevenLabs Agents, ElevenLabs voice agents, AI voice agents, conversational AI, @elevenlabs/react, @elevenlabs/client, @elevenlabs/react-native, @elevenlabs/elevenlabs-js, @elevenlabs/agents-cli, elevenlabs SDK, voice AI, TTS, text-to-speech, ASR, speech recognition, turn-taking model, WebRTC voice, WebSocket voice, ElevenLabs conversation, agent system prompt, agent tools, agent knowledge base, RAG voice agents, multi-voice agents, pronunciation dictionary, voice speed control, elevenlabs scribe, @11labs deprecated, Android audio cutoff, CSP violation elevenlabs, dynamic variables elevenlabs, case-sensitive tool names, webhook authentication

Image & Videoscriptsassets

What this skill does


# ElevenLabs Agents Platform

## Overview

ElevenLabs Agents Platform is a comprehensive solution for building production-ready conversational AI voice agents. The platform coordinates four core components:

1. **ASR (Automatic Speech Recognition)** - Converts speech to text (32+ languages, sub-second latency)
2. **LLM (Large Language Model)** - Reasoning and response generation (GPT, Claude, Gemini, custom models)
3. **TTS (Text-to-Speech)** - Converts text to speech (5000+ voices, 31 languages, low latency)
4. **Turn-Taking Model** - Proprietary model that handles conversation timing and interruptions

### 🚨 Package Updates (November 2025)

ElevenLabs migrated to new scoped packages in August 2025:

**DEPRECATED (Do not use):**
- `@11labs/react` → **DEPRECATED**
- `@11labs/client` → **DEPRECATED**

**Current packages:**
```bash
npm install @elevenlabs/[email protected]        # React SDK
npm install @elevenlabs/[email protected]       # JavaScript SDK
npm install @elevenlabs/[email protected] # React Native SDK
npm install @elevenlabs/[email protected] # Base SDK
npm install -g @elevenlabs/[email protected]  # CLI
```

If you have old packages installed, uninstall them first:
```bash
npm uninstall @11labs/react @11labs/client
```

### When to Use This Skill

Use this skill when:
- Building voice-enabled customer support agents
- Creating interactive voice response (IVR) systems
- Developing conversational AI applications
- Integrating telephony (Twilio, SIP trunking)
- Implementing voice chat in web/mobile apps
- Configuring agents via CLI ("agents as code")
- Setting up RAG/knowledge bases for agents
- Integrating MCP (Model Context Protocol) servers
- Building HIPAA/GDPR-compliant voice systems
- Optimizing LLM costs with caching strategies

### Platform Capabilities

**Design & Configure**:
- Multi-step workflows with visual builder
- System prompt engineering (6-component framework)
- 5000+ voices across 31 languages
- Pronunciation dictionaries (IPA/CMU formats)
- Speed control (0.7x-1.2x)
- RAG-powered knowledge bases
- Dynamic variables and personalization

**Connect & Deploy**:
- React SDK (`@elevenlabs/react`)
- JavaScript SDK (`@elevenlabs/client`)
- React Native SDK (`@elevenlabs/react-native`)
- Swift SDK (iOS/macOS)
- Embeddable widget
- Telephony integration (Twilio, SIP)
- Scribe (Real-Time Speech-to-Text) - Beta

**Operate & Optimize**:
- Automated testing (scenario, tool call, load)
- Conversation analysis and evaluation
- Analytics dashboard (resolution rates, sentiment, compliance)
- Privacy controls (GDPR, HIPAA, SOC 2)
- Cost optimization (LLM caching, model swapping, burst pricing)
- CLI for "agents as code" workflow

---

## 1. Quick Start (3 Integration Paths)

### Path A: React SDK (Embedded Voice Chat)

For building voice chat interfaces in React applications.

**Installation**:
```bash
npm install @elevenlabs/react zod
```

**Basic Example**:
```typescript
import { useConversation } from '@elevenlabs/react';
import { z } from 'zod';

export default function VoiceChat() {
  const { startConversation, stopConversation, status } = useConversation({
    // Public agent (no API key needed)
    agentId: 'your-agent-id',

    // OR private agent (requires API key)
    apiKey: process.env.NEXT_PUBLIC_ELEVENLABS_API_KEY,

    // OR signed URL (server-generated, most secure)
    signedUrl: '/api/elevenlabs/auth',

    // Client-side tools (browser functions)
    clientTools: {
      updateCart: {
        description: "Update the shopping cart",
        parameters: z.object({
          item: z.string(),
          quantity: z.number()
        }),
        handler: async ({ item, quantity }) => {
          console.log('Updating cart:', item, quantity);
          return { success: true };
        }
      }
    },

    // Event handlers
    onConnect: () => console.log('Connected'),
    onDisconnect: () => console.log('Disconnected'),
    onEvent: (event) => {
      switch (event.type) {
        case 'transcript':
          console.log('User said:', event.data.text);
          break;
        case 'agent_response':
          console.log('Agent replied:', event.data.text);
          break;
      }
    },

    // Regional compliance (GDPR, data residency)
    serverLocation: 'us' // 'us' | 'global' | 'eu-residency' | 'in-residency'
  });

  return (
    <div>
      <button onClick={startConversation}>Start Conversation</button>
      <button onClick={stopConversation}>Stop</button>
      <p>Status: {status}</p>
    </div>
  );
}
```

### Path B: CLI ("Agents as Code")

For managing agents via code with version control and CI/CD.

**Installation**:
```bash
npm install -g @elevenlabs/agents-cli
# or
pnpm install -g @elevenlabs/agents-cli
```

**Workflow**:
```bash
# 1. Authenticate
elevenlabs auth login

# 2. Initialize project (creates agents.json, tools.json, tests.json)
elevenlabs agents init

# 3. Create agent from template
elevenlabs agents add "Support Agent" --template customer-service

# 4. Configure in agent_configs/support-agent.json

# 5. Push to platform
elevenlabs agents push --env dev

# 6. Test
elevenlabs agents test "Support Agent"

# 7. Deploy to production
elevenlabs agents push --env prod
```

**Project Structure Created**:
```
your_project/
├── agents.json              # Agent registry
├── tools.json               # Tool configurations
├── tests.json               # Test configurations
├── agent_configs/           # Individual agent files
├── tool_configs/            # Tool configuration files
└── test_configs/            # Test configuration files
```

### Path C: API (Programmatic Agent Management)

For creating agents dynamically (multi-tenant, SaaS platforms).

**Installation**:
```bash
npm install elevenlabs
```

**Example**:
```typescript
import { ElevenLabsClient } from 'elevenlabs';

const client = new ElevenLabsClient({
  apiKey: process.env.ELEVENLABS_API_KEY
});

// Create agent
const agent = await client.agents.create({
  name: 'Support Bot',
  conversation_config: {
    agent: {
      prompt: {
        prompt: "You are a helpful customer support agent.",
        llm: "gpt-4o",
        temperature: 0.7
      },
      first_message: "Hello! How can I help you today?",
      language: "en"
    },
    tts: {
      model_id: "eleven_turbo_v2_5",
      voice_id: "your-voice-id"
    }
  }
});

console.log('Agent created:', agent.agent_id);
```

---

## 2. Agent Configuration

### System Prompt Architecture (6 Components)

ElevenLabs recommends structuring agent prompts using 6 components:

#### 1. Personality
Define the agent's identity, role, and character traits.

**Example**:
```
You are Alex, a friendly and knowledgeable customer support specialist at TechCorp.
You have 5 years of experience helping customers solve technical issues.
You're patient, empathetic, and always maintain a positive attitude.
```

#### 2. Environment
Describe the communication context (phone, web chat, video call).

**Example**:
```
You're speaking with customers over the phone. Communication is voice-only.
Customers may have background noise or poor connection quality.
Speak clearly and occasionally use thoughtful pauses for emphasis.
```

#### 3. Tone
Specify formality, speech patterns, humor, and verbosity.

**Example**:
```
Tone: Professional yet warm. Use contractions ("I'm" instead of "I am") to sound natural.
Avoid jargon unless the customer uses it first. Keep responses concise (2-3 sentences max).
Use encouraging phrases like "I'll be happy to help with that" and "Let's get this sorted for you."
```

#### 4. Goal
Define objectives and success criteria.

**Example**:
```
Primary Goal: Resolve customer technical issues on the first call.
Secondary Goals:
- Verify customer identity securely
- Document issue details accurately
- Offer proactive solutions
- End calls with confirmation that the issue is resolved

Success Criteria: Customer verbally confirms their issue is resolved.
```

#### 5. Guardrails
Set boundaries, prohibited topics,

Files: 23

Size: 138.2 KB

Complexity: 100/100

Category: Image & Video

Source: https://github.com/ovachiever/droid-tings/tree/main/skills/elevenlabs-agents

Related in Image & Video

watch

Included

Watch a video (URL or local path). Downloads with yt-dlp, extracts auto-scaled frames with ffmpeg, pulls the transcript from captions (or Whisper API fallback), and hands the result to Claude so it can answer questions about what's in the video.

Image & Videoscriptsfeatured

physical-ai-defect-image-generation

Included

Use when the user wants to orchestrate defect image generation, run associated setup, or handle outputs on OSMO. The Day 0 path handles cold-start with USD-to-ROI, image-edit augmentation, and AnomalyGen to create initial PCBA datasets. The Day 1 path performs inference and labeling on real images. This skill helps with first-time asset setup, creation of finetuning checkpoints, and configuring deployment. Trigger keywords: defect image generation, dig workflow, dig pipeline, defect image detection workflow, aoi pipeline, aoi anomalygen, usd2roi anomalygen, day 0 pcba, day 1 pcba, day 1 real-photo alignment, day 1 manual roi, metal surface anomaly, glass defect, anomalygen finetune, setup_pcb, setup_metal, setup_glass, setup_pretrained, dig setup, dig datasets, dig pretrained checkpoint, dig image-edit endpoint.

Image & Videoscripts

accelint-react-best-practices

Included

React performance optimization and best practices. ALWAYS use this skill when working with any React code - writing components, hooks, JSX; refactoring; optimizing re-renders, memoization, state management; reviewing for performance; fixing hydration mismatches; debugging infinite re-renders, stale closures, input focus loss, animations restarting; preventing remounting; implementing transitions, lazy initialization, effect dependencies. Even simple React tasks benefit from these patterns. Covers React 19+ (useEffectEvent, Activity, ref props). Triggers - useEffect, useState, useMemo, useCallback, memo, inline components, nested components, components inside components, re-render, performance, hydration, SSR, Next.js, useDeferredValue, combined hooks.

Image & Videoscripts

humanizer

Included

Humanize AI-generated text by detecting and removing patterns typical of LLM output. Rewrites text to sound natural, specific, and human. Uses 28 pattern detectors, 560+ AI vocabulary terms across 3 tiers, and statistical analysis (burstiness, type-token ratio, readability) for comprehensive detection. Use when asked to humanize text, de-AI writing, make content sound more natural/human, review writing for AI patterns, score text for AI detection, or improve AI-generated drafts. Covers content, language, style, communication, and filler categories.

Image & Videoscripts

generating-mermaid-diagrams

Included

Salesforce architecture diagrams using Mermaid with ASCII fallback. Use this skill when generating text-based diagrams for Salesforce architecture, OAuth flows, ERDs, integration sequences, or Agentforce structure. TRIGGER when: user says "diagram", "visualize", "ERD", or asks for sequence diagrams, flowcharts, class diagrams, or architecture visualizations in Mermaid. DO NOT TRIGGER when: user wants PNG/SVG image output (use generating-visual-diagrams), or asks about non-Salesforce systems.

Image & Videoscripts

ponyflash

Included

Generate images, videos, speech audio, and music using the PonyFlash Python SDK. Also handle local media editing with FFmpeg, including clip, concat, transcode, extract audio, frame capture, subtitle capability checks, and ASS subtitle prep. Use when the user asks to create, generate, produce, edit, trim, merge, concatenate, transcode, subtitle, or render AI-generated media content.

Image & Videoscripts