agents
Build voice AI agents with ElevenLabs. Use when creating voice assistants, customer service bots, interactive voice characters, or any real-time voice conversation experience.
What this skill does
# ElevenLabs Agents Platform
Build voice AI agents with natural conversations, multiple LLM providers, custom tools, and easy web embedding.
> **Setup:** See [Installation Guide](references/installation.md) for CLI and SDK setup.
## Quick Start with CLI
The ElevenLabs CLI is the recommended way to create and manage agents:
```bash
# Install CLI and authenticate
npm install -g @elevenlabs/cli
elevenlabs auth login
# Initialize project and create an agent
elevenlabs agents init
elevenlabs agents add "My Assistant" --template complete
# Push to ElevenLabs platform
elevenlabs agents push
```
**Available templates:** `complete`, `minimal`, `voice-only`, `text-only`, `customer-service`, `assistant`
### Python
```python
from elevenlabs import ElevenLabs
client = ElevenLabs()
agent = client.conversational_ai.agents.create(
name="My Assistant",
conversation_config={
"agent": {
"first_message": "Hello! How can I help?",
"language": "en",
"prompt": {
"prompt": "You are a helpful assistant. Be concise and friendly.",
"llm": "gemini-2.0-flash",
"temperature": 0.7
}
},
"tts": {"voice_id": "JBFqnCBsd6RMkjVDRZzb"}
}
)
```
### JavaScript
```javascript
import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";
const client = new ElevenLabsClient();
const agent = await client.conversationalAi.agents.create({
name: "My Assistant",
conversationConfig: {
agent: {
firstMessage: "Hello! How can I help?",
language: "en",
prompt: {
prompt: "You are a helpful assistant.",
llm: "gemini-2.0-flash",
temperature: 0.7
}
},
tts: { voiceId: "JBFqnCBsd6RMkjVDRZzb" }
}
});
```
### cURL
```bash
curl -X POST "https://api.elevenlabs.io/v1/convai/agents/create" \
-H "xi-api-key: $ELEVENLABS_API_KEY" -H "Content-Type: application/json" \
-d '{"name": "My Assistant", "conversation_config": {"agent": {"first_message": "Hello!", "language": "en", "prompt": {"prompt": "You are helpful.", "llm": "gemini-2.0-flash"}}, "tts": {"voice_id": "JBFqnCBsd6RMkjVDRZzb"}}}'
```
## Starting Conversations
### Temporary LiveKit WebSocket Pin
Until the ElevenLabs LiveKit server supports `/rtc/v1`, browser clients using WebRTC can fail or stall during the underlying LiveKit WebSocket handshake with `livekit-client` versions newer than `2.16.1`. For React, Next.js, Electron, or other `@elevenlabs/client` / `@elevenlabs/react` integrations that use `connectionType: "webrtc"` or hit `wss://livekit.rtc.elevenlabs.io/rtc/v1`, add this temporary pin to `package.json`:
```json
{
"overrides": {
"livekit-client": "2.16.1"
}
}
```
Use the pin when the app logs `/rtc/v1` 404s, `v1 RTC path not found`, or `could not establish pc connection` during session startup. This is a LiveKit server compatibility workaround for WebRTC sessions, not the ElevenLabs `connectionType: "websocket"` transport. Remove it after the upstream LiveKit server or SDK issue is fixed.
**Server-side (Python):** Get signed URL for client connection:
```python
signed_url = client.conversational_ai.conversations.get_signed_url(
agent_id="your-agent-id",
environment="staging",
)
```
**Client-side (JavaScript):**
```javascript
import { Conversation } from "@elevenlabs/client";
const conversation = await Conversation.startSession({
agentId: "your-agent-id",
environment: "staging",
onMessage: (msg) => console.log("Agent:", msg.message),
onUserTranscript: (t) => console.log("User:", t.message),
onError: (e) => console.error(e)
});
```
**React Hook:** Wrap hook consumers in `ConversationProvider`. Prefer granular hooks such as
`useConversationControls` and `useConversationStatus` for session controls and UI state;
`useConversation` remains available as the convenience all-in-one hook. Pass provider-level
callbacks such as `onError` when you want React to handle conversation errors in one place.
```typescript
import {
ConversationProvider,
useConversationControls,
useConversationStatus,
} from "@elevenlabs/react";
function Agent({ signedUrl }: { signedUrl: string }) {
const { startSession, endSession } = useConversationControls();
const { status } = useConversationStatus();
if (status === "connected") {
return <button onClick={endSession}>End conversation</button>;
}
return (
<button onClick={() => startSession({ signedUrl })}>
Start conversation
</button>
);
}
function App({ signedUrl }: { signedUrl: string }) {
return (
<ConversationProvider
onError={(error) => console.error("Conversation error:", error)}
>
<Agent signedUrl={signedUrl} />
</ConversationProvider>
);
}
```
## Configuration
| Provider | Models |
|----------|--------|
| OpenAI | `gpt-5.5`, `gpt-5.5-2026-04-23`, `gpt-5.4`, `gpt-5.4-mini`, `gpt-5.4-nano`, `gpt-5.4-2026-03-05`, `gpt-5.4-mini-2026-03-17`, `gpt-5.4-nano-2026-03-17`, `gpt-5`, `gpt-5-mini`, `gpt-5-nano`, `gpt-4.1`, `gpt-4.1-mini`, `gpt-4.1-nano`, `gpt-4o`, `gpt-4o-mini`, `gpt-4-turbo` |
| Anthropic | `claude-opus-4-7`, `claude-sonnet-4-6`, `claude-sonnet-4-5`, `claude-sonnet-4`, `claude-haiku-4-5`, `claude-3-7-sonnet`, `claude-3-5-sonnet`, `claude-3-haiku` |
| Google | `gemini-3.1-flash-lite-preview`, `gemini-3.1-pro-preview`, `gemini-3-pro-preview`, `gemini-3-flash-preview`, `gemini-2.5-flash`, `gemini-2.5-flash-lite`, `gemini-2.0-flash`, `gemini-2.0-flash-lite` |
| ElevenLabs | `glm-45-air-fp8`, `qwen3-30b-a3b`, `qwen36-35b-a3b`, `qwen35-35b-a3b`, `qwen35-397b-a17b`, `gpt-oss-120b` |
| Custom | `custom-llm` (bring your own endpoint) |
Use `GET /v1/convai/llm/list` to inspect the current model catalog, including deprecation state, token/context limits, capability flags such as image-input support, and model-specific reasoning effort support.
**Popular voices:** `JBFqnCBsd6RMkjVDRZzb` (George), `EXAVITQu4vr4xnSDxMaL` (Sarah), `onwK4e9ZLuTAKqWW03F9` (Daniel), `XB0fDUnXU5powFXDhCwa` (Charlotte)
**Turn eagerness:** `patient` (waits longer for user to finish), `normal`, or `eager` (responds quickly)
See [Agent Configuration](references/agent-configuration.md) for all options.
## System Prompt Structure
Section the prompt with markdown headings — the model prioritizes and interprets instructions more reliably ([prompting guide](https://elevenlabs.io/docs/eleven-agents/best-practices/prompting-guide)):
```
# Personality – named character, 2-3 traits
# Environment – where they work, who they talk to
# Tone – vocal style as 4-5 bullets
# Goal – what success looks like (numbered for multi-step flows)
```
Keep instructions short and action-based. Mark critical steps with "This step is important." For critical refusal/safety rules, include concise instructions in the prompt and also configure independent custom Guardrails via `platform_settings.guardrails` (see [Guardrails](#guardrails)).
## Tools
Extend agents with webhook, client, or built-in system tools. Tools are defined inside `conversation_config.agent.prompt`:
Workspace environment variables can resolve per-environment server tool URLs, headers, and auth connections, and runtime system variables such as `{{system__conversation_history}}` can pass full conversation context into tool calls when needed.
```python
"prompt": {
"prompt": "You are a helpful assistant that can check the weather.",
"llm": "gemini-2.0-flash",
"tools": [
# Webhook: server-side API call
{"type": "webhook", "name": "get_weather", "description": "Get weather",
"api_schema": {"url": "https://api.example.com/weather", "method": "POST",
"request_body_schema": {"type": "object", "properties": {"location": {"type": "string"}}, "required": ["location"]}}},
# Client: runs in the browser
{"type": "client", "name": "show_product", "description": "Display a product",
"parameters": {"type": "object"Related in Image & Video
watch
IncludedWatch a video (URL or local path). Downloads with yt-dlp, extracts auto-scaled frames with ffmpeg, pulls the transcript from captions (or Whisper API fallback), and hands the result to Claude so it can answer questions about what's in the video.
physical-ai-defect-image-generation
IncludedUse when the user wants to orchestrate defect image generation, run associated setup, or handle outputs on OSMO. The Day 0 path handles cold-start with USD-to-ROI, image-edit augmentation, and AnomalyGen to create initial PCBA datasets. The Day 1 path performs inference and labeling on real images. This skill helps with first-time asset setup, creation of finetuning checkpoints, and configuring deployment. Trigger keywords: defect image generation, dig workflow, dig pipeline, defect image detection workflow, aoi pipeline, aoi anomalygen, usd2roi anomalygen, day 0 pcba, day 1 pcba, day 1 real-photo alignment, day 1 manual roi, metal surface anomaly, glass defect, anomalygen finetune, setup_pcb, setup_metal, setup_glass, setup_pretrained, dig setup, dig datasets, dig pretrained checkpoint, dig image-edit endpoint.
accelint-react-best-practices
IncludedReact performance optimization and best practices. ALWAYS use this skill when working with any React code - writing components, hooks, JSX; refactoring; optimizing re-renders, memoization, state management; reviewing for performance; fixing hydration mismatches; debugging infinite re-renders, stale closures, input focus loss, animations restarting; preventing remounting; implementing transitions, lazy initialization, effect dependencies. Even simple React tasks benefit from these patterns. Covers React 19+ (useEffectEvent, Activity, ref props). Triggers - useEffect, useState, useMemo, useCallback, memo, inline components, nested components, components inside components, re-render, performance, hydration, SSR, Next.js, useDeferredValue, combined hooks.
elevenlabs-agents
IncludedBuild conversational AI voice agents with ElevenLabs Platform using React, JavaScript, React Native, or Swift SDKs. Configure agents, tools (client/server/MCP), RAG knowledge bases, multi-voice, and Scribe real-time STT. Use when: building voice chat interfaces, implementing AI phone agents with Twilio, configuring agent workflows or tools, adding RAG knowledge bases, testing with CLI "agents as code", or troubleshooting deprecated @11labs packages, Android audio cutoff, CSP violations, dynamic variables, or WebRTC config. Keywords: ElevenLabs Agents, ElevenLabs voice agents, AI voice agents, conversational AI, @elevenlabs/react, @elevenlabs/client, @elevenlabs/react-native, @elevenlabs/elevenlabs-js, @elevenlabs/agents-cli, elevenlabs SDK, voice AI, TTS, text-to-speech, ASR, speech recognition, turn-taking model, WebRTC voice, WebSocket voice, ElevenLabs conversation, agent system prompt, agent tools, agent knowledge base, RAG voice agents, multi-voice agents, pronunciation dictionary, voice speed control, elevenlabs scribe, @11labs deprecated, Android audio cutoff, CSP violation elevenlabs, dynamic variables elevenlabs, case-sensitive tool names, webhook authentication
humanizer
IncludedHumanize AI-generated text by detecting and removing patterns typical of LLM output. Rewrites text to sound natural, specific, and human. Uses 28 pattern detectors, 560+ AI vocabulary terms across 3 tiers, and statistical analysis (burstiness, type-token ratio, readability) for comprehensive detection. Use when asked to humanize text, de-AI writing, make content sound more natural/human, review writing for AI patterns, score text for AI detection, or improve AI-generated drafts. Covers content, language, style, communication, and filler categories.
generating-mermaid-diagrams
IncludedSalesforce architecture diagrams using Mermaid with ASCII fallback. Use this skill when generating text-based diagrams for Salesforce architecture, OAuth flows, ERDs, integration sequences, or Agentforce structure. TRIGGER when: user says "diagram", "visualize", "ERD", or asks for sequence diagrams, flowcharts, class diagrams, or architecture visualizations in Mermaid. DO NOT TRIGGER when: user wants PNG/SVG image output (use generating-visual-diagrams), or asks about non-Salesforce systems.