elevenlabs-agents
Build conversational AI voice agents with ElevenLabs Platform using React, JavaScript, React Native, or Swift SDKs. Configure agents, tools (client/server/MCP), RAG knowledge bases, multi-voice, and Scribe real-time STT. Use when: building voice chat interfaces, implementing AI phone agents with Twilio, configuring agent workflows or tools, adding RAG knowledge bases, testing with CLI "agents as code", or troubleshooting deprecated @11labs packages, Android audio cutoff, CSP violations, dynamic variables, or WebRTC config. Keywords: ElevenLabs Agents, ElevenLabs voice agents, AI voice agents, conversational AI, @elevenlabs/react, @elevenlabs/client, @elevenlabs/react-native, @elevenlabs/elevenlabs-js, @elevenlabs/agents-cli, elevenlabs SDK, voice AI, TTS, text-to-speech, ASR, speech recognition, turn-taking model, WebRTC voice, WebSocket voice, ElevenLabs conversation, agent system prompt, agent tools, agent knowledge base, RAG voice agents, multi-voice agents, pronunciation dictionary, voice speed control, elevenlabs scribe, @11labs deprecated, Android audio cutoff, CSP violation elevenlabs, dynamic variables elevenlabs, case-sensitive tool names, webhook authentication
What this skill does
# ElevenLabs Agents Platform

## Overview

ElevenLabs Agents Platform is a comprehensive solution for building production-ready conversational AI voice agents. The platform coordinates four core components:

1. **ASR (Automatic Speech Recognition)** - Converts speech to text (32+ languages, sub-second latency)
2. **LLM (Large Language Model)** - Reasoning and response generation (GPT, Claude, Gemini, custom models)
3. **TTS (Text-to-Speech)** - Converts text to speech (5000+ voices, 31 languages, low latency)
4. **Turn-Taking Model** - Proprietary model that handles conversation timing and interruptions

### 🚨 Package Updates (November 2025)

ElevenLabs migrated to new scoped packages in August 2025:

**DEPRECATED (Do not use):**

- `@11labs/react` → **DEPRECATED**
- `@11labs/client` → **DEPRECATED**

**Current packages:**

```bash
npm install @elevenlabs/react          # React SDK
npm install @elevenlabs/client         # JavaScript SDK
npm install @elevenlabs/react-native   # React Native SDK
npm install @elevenlabs/elevenlabs-js  # Base SDK
npm install -g @elevenlabs/agents-cli  # CLI
```

If you have old packages installed, uninstall them first:

```bash
npm uninstall @11labs/react @11labs/client
```

### When to Use This Skill

Use this skill when:

- Building voice-enabled customer support agents
- Creating interactive voice response (IVR) systems
- Developing conversational AI applications
- Integrating telephony (Twilio, SIP trunking)
- Implementing voice chat in web/mobile apps
- Configuring agents via CLI ("agents as code")
- Setting up RAG/knowledge bases for agents
- Integrating MCP (Model Context Protocol) servers
- Building HIPAA/GDPR-compliant voice systems
- Optimizing LLM costs with caching strategies

### Platform Capabilities

**Design & Configure**:

- Multi-step workflows with visual builder
- System prompt engineering (6-component framework)
- 5000+ voices across 31 languages
- Pronunciation dictionaries (IPA/CMU formats)
- Speed control (0.7x-1.2x)
- RAG-powered knowledge bases
- Dynamic variables and personalization

**Connect & Deploy**:

- React SDK (`@elevenlabs/react`)
- JavaScript SDK (`@elevenlabs/client`)
- React Native SDK (`@elevenlabs/react-native`)
- Swift SDK (iOS/macOS)
- Embeddable widget
- Telephony integration (Twilio, SIP)
- Scribe (Real-Time Speech-to-Text) - Beta

**Operate & Optimize**:

- Automated testing (scenario, tool call, load)
- Conversation analysis and evaluation
- Analytics dashboard (resolution rates, sentiment, compliance)
- Privacy controls (GDPR, HIPAA, SOC 2)
- Cost optimization (LLM caching, model swapping, burst pricing)
- CLI for "agents as code" workflow

---

## 1. Quick Start (3 Integration Paths)

### Path A: React SDK (Embedded Voice Chat)

For building voice chat interfaces in React applications.

**Installation**:

```bash
npm install @elevenlabs/react zod
```

**Basic Example**:

```typescript
import { useConversation } from '@elevenlabs/react';
import { z } from 'zod';

export default function VoiceChat() {
  // The hook exposes startSession/endSession to open and close the conversation
  const { startSession, endSession, status } = useConversation({
    // Public agent (no API key needed)
    agentId: 'your-agent-id',

    // OR private agent (requires API key)
    // apiKey: process.env.NEXT_PUBLIC_ELEVENLABS_API_KEY,

    // OR signed URL (server-generated, most secure)
    // signedUrl: '/api/elevenlabs/auth',

    // Client-side tools (browser functions)
    clientTools: {
      updateCart: {
        description: "Update the shopping cart",
        parameters: z.object({
          item: z.string(),
          quantity: z.number()
        }),
        handler: async ({ item, quantity }) => {
          console.log('Updating cart:', item, quantity);
          return { success: true };
        }
      }
    },

    // Event handlers
    onConnect: () => console.log('Connected'),
    onDisconnect: () => console.log('Disconnected'),
    onEvent: (event) => {
      switch (event.type) {
        case 'transcript':
          console.log('User said:', event.data.text);
          break;
        case 'agent_response':
          console.log('Agent replied:', event.data.text);
          break;
      }
    },

    // Regional compliance (GDPR, data residency)
    serverLocation: 'us' // 'us' | 'global' | 'eu-residency' | 'in-residency'
  });

  return (
    <div>
      {/* Wrap the calls so the click event is not passed as session options */}
      <button onClick={() => startSession()}>Start Conversation</button>
      <button onClick={() => endSession()}>Stop</button>
      <p>Status: {status}</p>
    </div>
  );
}
```

### Path B: CLI ("Agents as Code")

For managing agents via code with version control and CI/CD.

**Installation**:

```bash
npm install -g @elevenlabs/agents-cli
# or
pnpm install -g @elevenlabs/agents-cli
```

**Workflow**:

```bash
# 1. Authenticate
elevenlabs auth login

# 2. Initialize project (creates agents.json, tools.json, tests.json)
elevenlabs agents init

# 3. Create agent from template
elevenlabs agents add "Support Agent" --template customer-service

# 4. Configure in agent_configs/support-agent.json

# 5. Push to platform
elevenlabs agents push --env dev

# 6. Test
elevenlabs agents test "Support Agent"

# 7. Deploy to production
elevenlabs agents push --env prod
```

**Project Structure Created**:

```
your_project/
├── agents.json      # Agent registry
├── tools.json       # Tool configurations
├── tests.json       # Test configurations
├── agent_configs/   # Individual agent files
├── tool_configs/    # Tool configuration files
└── test_configs/    # Test configuration files
```

### Path C: API (Programmatic Agent Management)

For creating agents dynamically (multi-tenant, SaaS platforms).

**Installation**:

```bash
npm install @elevenlabs/elevenlabs-js
```

**Example**:

```typescript
import { ElevenLabsClient } from '@elevenlabs/elevenlabs-js';

const client = new ElevenLabsClient({
  apiKey: process.env.ELEVENLABS_API_KEY
});

// Create agent
const agent = await client.conversationalAi.agents.create({
  name: 'Support Bot',
  conversationConfig: {
    agent: {
      prompt: {
        prompt: "You are a helpful customer support agent.",
        llm: "gpt-4o",
        temperature: 0.7
      },
      firstMessage: "Hello! How can I help you today?",
      language: "en"
    },
    tts: {
      modelId: "eleven_turbo_v2_5",
      voiceId: "your-voice-id"
    }
  }
});

console.log('Agent created:', agent.agentId);
```

---

## 2. Agent Configuration

### System Prompt Architecture (6 Components)

ElevenLabs recommends structuring agent prompts using 6 components:

#### 1. Personality

Define the agent's identity, role, and character traits.

**Example**:

```
You are Alex, a friendly and knowledgeable customer support specialist at TechCorp.
You have 5 years of experience helping customers solve technical issues.
You're patient, empathetic, and always maintain a positive attitude.
```

#### 2. Environment

Describe the communication context (phone, web chat, video call).

**Example**:

```
You're speaking with customers over the phone. Communication is voice-only.
Customers may have background noise or poor connection quality.
Speak clearly and occasionally use thoughtful pauses for emphasis.
```

#### 3. Tone

Specify formality, speech patterns, humor, and verbosity.

**Example**:

```
Tone: Professional yet warm. Use contractions ("I'm" instead of "I am") to sound natural.
Avoid jargon unless the customer uses it first. Keep responses concise (2-3 sentences max).
Use encouraging phrases like "I'll be happy to help with that" and "Let's get this sorted for you."
```

#### 4. Goal

Define objectives and success criteria.

**Example**:

```
Primary Goal: Resolve customer technical issues on the first call.

Secondary Goals:
- Verify customer identity securely
- Document issue details accurately
- Offer proactive solutions
- End calls with confirmation that the issue is resolved

Success Criteria: Customer verbally confirms their issue is resolved.
```

#### 5. Guardrails

Set boundaries, prohibited topics,