skill-elevenlabs-tts-tool

ElevenLabs text-to-speech CLI tool guide

What this skill does


# When to use
- Converting text to speech with the ElevenLabs API
- Exploring available voices and models
- Managing TTS subscriptions and usage
- Integrating TTS into workflows and pipelines

# ElevenLabs TTS Tool Skill

## Purpose

A guide to the `elevenlabs-tts-tool` CLI, a command-line interface for ElevenLabs text-to-speech synthesis. The tool supports both direct audio playback and file output, with 42+ premium voices and multiple models.

## When to Use This Skill

**Use this skill when:**
- Converting text to speech for notifications, audiobooks, or content creation
- Exploring and comparing different voice characteristics
- Managing ElevenLabs subscription quotas and usage
- Building voice-enabled workflows and automation
- Integrating TTS into Claude Code hooks or other tools

**Do NOT use this skill for:**
- Direct ElevenLabs API programming (use SDK docs instead)
- Custom voice cloning (requires ElevenLabs web interface)
- Real-time streaming TTS (tool focuses on file/playback generation)

## CLI Tool: elevenlabs-tts-tool

Professional text-to-speech CLI tool built with Python 3.13+, uv, and the ElevenLabs SDK.

### Installation

```bash
# Clone repository
git clone https://github.com/dnvriend/elevenlabs-tts-tool.git
cd elevenlabs-tts-tool

# Install globally with uv
uv tool install .

# Verify installation
elevenlabs-tts-tool --version
```

### Prerequisites

- **Python**: 3.13 or higher
- **API Key**: ElevenLabs API key (get from https://elevenlabs.io/app/settings/api-keys)
- **Environment Variable**: `export ELEVENLABS_API_KEY='your-api-key'`
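
Both prerequisites can be verified before the first call, so a missing key or a missing install shows up as a clear message rather than an API error. The sketch below is one possible preflight check; `check_tts_env` is a hypothetical helper name and is not part of the tool.

```bash
# Preflight check (hypothetical helper, not shipped with elevenlabs-tts-tool).
# Returns 1 if the API key is missing, 2 if the command is not on PATH.
check_tts_env() {
    tts_cmd="${1:-elevenlabs-tts-tool}"   # command to look for on PATH
    if [ -z "${ELEVENLABS_API_KEY:-}" ]; then
        echo "ELEVENLABS_API_KEY is not set" >&2
        return 1
    fi
    if ! command -v "$tts_cmd" >/dev/null 2>&1; then
        echo "$tts_cmd not found on PATH" >&2
        return 2
    fi
    return 0
}

# Usage: check_tts_env && elevenlabs-tts-tool synthesize "Hello world"
```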

### Quick Start

```bash
# Set API key
export ELEVENLABS_API_KEY='your-api-key'

# Basic text-to-speech
elevenlabs-tts-tool synthesize "Hello world"

# Use different voice
elevenlabs-tts-tool synthesize "Hello" --voice adam

# Save to file
elevenlabs-tts-tool synthesize "Text" --output speech.mp3
```

## Progressive Disclosure

<details>
<summary><strong>📖 Core Commands (Click to expand)</strong></summary>

### synthesize - Convert Text to Speech

Convert text to speech using the ElevenLabs API. Supports direct playback or file output.

**Usage:**
```bash
elevenlabs-tts-tool synthesize [TEXT] [OPTIONS]
```

**Arguments:**
- `TEXT`: Text to synthesize (optional when `--stdin` is used)
- `--stdin, -s`: Read text from stdin instead of argument
- `--voice, -v NAME`: Voice name or ID (default: rachel)
- `--model, -m ID`: Model ID (default: eleven_turbo_v2_5)
- `--output, -o PATH`: Save to audio file instead of playing
- `--format, -f FORMAT`: Output format (default: mp3_44100_128)

**Examples:**
```bash
# Basic usage - play through speakers
elevenlabs-tts-tool synthesize "Hello world"

# Use different voice
elevenlabs-tts-tool synthesize "Hello" --voice adam

# Use specific model
elevenlabs-tts-tool synthesize "Hello" --model eleven_multilingual_v2

# Emotional expression (requires eleven_v3 model)
elevenlabs-tts-tool synthesize "[happy] Welcome to our service!" --model eleven_v3

# Multiple emotions
elevenlabs-tts-tool synthesize "[excited] Great news! [cheerfully] Your project is approved!" --model eleven_v3

# Add pauses with SSML
elevenlabs-tts-tool synthesize "Point one <break time=\"0.5s\" /> Point two <break time=\"0.5s\" /> Point three."

# Read from stdin
echo "Text from pipeline" | elevenlabs-tts-tool synthesize --stdin

# Save to file
elevenlabs-tts-tool synthesize "Text" --output speech.mp3

# Pipeline integration
cat document.txt | elevenlabs-tts-tool synthesize --stdin --output audiobook.mp3
```

**Output:**
Plays audio through the default speakers, or saves it to the specified file in the chosen format.
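
For long documents it may help to synthesize one paragraph at a time, since each model has a character limit (reported by `list-models`). The sketch below builds on the `--stdin` and `--output` options shown above. `synth_paragraphs` and the `TTS_CMD` override are hypothetical names introduced here for illustration, and the `part-NNN.mp3` naming scheme is an assumption.

```bash
# synth_paragraphs INPUT OUTDIR (hypothetical helper).
# Splits INPUT on blank lines and synthesizes each paragraph to its own
# numbered MP3 file. Set TTS_CMD=echo for a dry run that only prints the
# commands instead of calling the API.
synth_paragraphs() {
    input="$1"; outdir="$2"
    tts="${TTS_CMD:-elevenlabs-tts-tool}"
    mkdir -p "$outdir"
    # awk paragraph mode: RS="" treats blank-line-separated blocks as records
    total=$(awk 'BEGIN { RS = "" } END { print NR }' "$input")
    n=1
    while [ "$n" -le "$total" ]; do
        out=$(printf '%s/part-%03d.mp3' "$outdir" "$n")
        # Emit only paragraph n and pipe it to the synthesizer
        awk -v want="$n" 'BEGIN { RS = "" } NR == want' "$input" |
            $tts synthesize --stdin --output "$out"
        n=$((n + 1))
    done
}

# Usage: synth_paragraphs document.txt audiobook-parts
```

Paragraph splitting is only one possible chunking rule; a single paragraph can still exceed a model's limit.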

**Available Formats:**
- `mp3_44100_128` (default): MP3, 44.1kHz, 128kbps
- `mp3_44100_64`: MP3, 44.1kHz, 64kbps
- `mp3_22050_32`: MP3, 22.05kHz, 32kbps
- `pcm_44100`: PCM WAV, 44.1kHz (requires Pro tier)
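
The format identifiers listed above appear to follow a `<codec>_<sample-rate>[_<bitrate>]` pattern. As a minimal sketch under that assumption, the hypothetical helper `describe_format` (not part of the tool) decodes an identifier with plain shell parameter expansion:

```bash
# describe_format FORMAT (hypothetical helper).
# Decodes identifiers of the assumed form <codec>_<sample-rate>[_<bitrate>].
describe_format() {
    fmt="$1"
    codec="${fmt%%_*}"        # text before the first underscore
    rest="${fmt#*_}"          # text after the first underscore
    rate="${rest%%_*}"        # sample rate in Hz
    case "$rest" in
        *_*) bitrate="${rest#*_} kbps" ;;   # third field present (MP3)
        *)   bitrate="n/a" ;;               # no bitrate field (PCM)
    esac
    echo "codec=$codec rate=${rate}Hz bitrate=$bitrate"
}

describe_format mp3_44100_128   # codec=mp3 rate=44100Hz bitrate=128 kbps
describe_format pcm_44100       # codec=pcm rate=44100Hz bitrate=n/a
```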

---

### list-voices - Show Available Voices

List all available ElevenLabs voices with characteristics.

**Usage:**
```bash
elevenlabs-tts-tool list-voices
```

**Examples:**
```bash
# List all voices
elevenlabs-tts-tool list-voices

# Filter by gender
elevenlabs-tts-tool list-voices | grep female
elevenlabs-tts-tool list-voices | grep -w male   # -w matches whole words, so "female" is excluded

# Filter by accent
elevenlabs-tts-tool list-voices | grep British
elevenlabs-tts-tool list-voices | grep American

# Filter by age
elevenlabs-tts-tool list-voices | grep young
elevenlabs-tts-tool list-voices | grep middle_aged

# Combine filters
elevenlabs-tts-tool list-voices | grep "female.*young.*British"
```

**Output:**
```
Voice           Gender     Age          Accent          Description
====================================================================================================
rachel          female     young        American        Calm and friendly American voice...
adam            male       middle_aged  American        Deep, authoritative American male...
charlotte       female     middle_aged  British         Smooth, professional British voice...
...
====================================================================================================
Total: 42 voices available
```

**Popular Voices:**
- **rachel**: Calm, friendly American female (default)
- **adam**: Deep, authoritative American male
- **charlotte**: Professional British female
- **josh**: Young, casual American male
- **bella**: Expressive Italian female
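
Because `grep` matches anywhere on a line, including the description text, filtering by column can be more precise. The sketch below uses `awk` on the table layout shown above. It assumes each of the first four columns is a single word, as in the sample rows; `sample_voices` and `filter_voices` are hypothetical helpers, and the sample rows are copied from the output above so the filter can be tried without an API key.

```bash
# Sample rows copied from the documented output. In practice, pipe
# `elevenlabs-tts-tool list-voices` into filter_voices instead.
sample_voices() {
    cat <<'EOF'
Voice           Gender     Age          Accent          Description
rachel          female     young        American        Calm and friendly American voice...
adam            male       middle_aged  American        Deep, authoritative American male...
charlotte       female     middle_aged  British         Smooth, professional British voice...
EOF
}

# filter_voices GENDER AGE ACCENT (hypothetical helper).
# Prints voice names whose columns match exactly; pass "" to skip a field.
filter_voices() {
    awk -v g="$1" -v a="$2" -v c="$3" '
        # Skip the header, separator, ellipsis, and "Total:" lines
        NF < 4 || $1 == "Voice" || $1 == "Total:" { next }
        (g == "" || $2 == g) &&
        (a == "" || $3 == a) &&
        (c == "" || $4 == c) { print $1 }'
}

sample_voices | filter_voices male "" ""           # adam
sample_voices | filter_voices female "" British    # charlotte
```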

---

### list-models - Show TTS Models

List all available ElevenLabs TTS models with characteristics and use cases.

**Usage:**
```bash
elevenlabs-tts-tool list-models
```

**Examples:**
```bash
# List all models
elevenlabs-tts-tool list-models

# Filter by status
elevenlabs-tts-tool list-models | grep stable
elevenlabs-tts-tool list-models | grep deprecated

# Find low-latency models
elevenlabs-tts-tool list-models | grep -i "ultra-low"

# Find multilingual models
elevenlabs-tts-tool list-models | grep -i "multilingual"
```

**Output:**
Comprehensive model information including:
- Model ID and version
- Quality and latency characteristics
- Language support (mono vs multilingual)
- Character limits
- Best use cases
- Special features (emotions, etc.)

**Key Models:**
- **eleven_turbo_v2_5**: Fast, high-quality (default, best value)
- **eleven_flash_v2_5**: Ultra-low latency (real-time applications)
- **eleven_multilingual_v2**: 29 languages, production quality
- **eleven_v3**: Most expressive with emotion tags (alpha, 2x cost)

**Cost Multipliers:**
- Turbo/Flash models: 1x cost
- Multilingual v2: 1x cost
- v3 models: 2x cost (the same quota covers half as much text)
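
As a worked example of the multipliers above, the sketch below estimates how many quota characters a request would consume. `quota_cost` is a hypothetical helper, not a tool command. It assumes billing is proportional to the character count of the input text; whether emotion tags such as `[happy]` count toward the quota is not stated here.

```bash
# quota_cost MODEL TEXT (hypothetical helper).
# Estimates quota characters as character count times the model multiplier.
quota_cost() {
    model="$1"; text="$2"
    # wc -m counts characters in the current locale; $(( )) strips padding
    chars=$(( $(printf '%s' "$text" | wc -m) ))
    case "$model" in
        eleven_v3*) mult=2 ;;   # v3 models: 2x cost
        *)          mult=1 ;;   # Turbo, Flash, Multilingual v2: 1x cost
    esac
    echo $(( chars * mult ))
}

quota_cost eleven_turbo_v2_5 "Hello world"   # 11
quota_cost eleven_v3 "Hello world"           # 22
```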

---

### info - Show Subscription Info

Display subscription tier, character usage, quota limits, and historical usage.

**Usage:**
```bash
elevenlabs-tts-tool info [--days N]
```

**Arguments:**
- `--days, -d N`: Number of days of historical usage to display (default: 7)

**Examples:**
```bash
# View subscription with last 7 days of usage
elevenlabs-tts-tool info

# View last 30 days of usage
elevenlabs-tts-tool info --days 30

# Quick quota check (1 day)
elevenlabs-tts-tool info --days 1

# Check usage before long generation
elevenlabs-tts-tool info --days 1 && elevenlabs-tts-tool synthesize "Long text..."
```

**Output Information:**
- Subscription tier and status
- Character usage (used/limit/remaining)
- Quota reset date
- Historical usage breakdown by day
- Average daily usage
- Projected monthly usage
- Warnings when approaching quota limits

**Use Cases:**
- Monitor character quota consumption
- Track usage patterns over time
- Plan when to upgrade subscription tier
- Avoid hitting quota limits unexpectedly
- Identify high-usage periods
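
The average and projection figures can be reproduced by hand for planning purposes. How the tool itself computes them is not documented here, so the sketch below assumes the simplest rule: average the daily totals, then multiply by 30. `project_usage` is a hypothetical helper and the input numbers are made up for illustration.

```bash
# project_usage DAY1 DAY2 ... (hypothetical helper).
# Takes per-day character counts and prints the average and a 30-day projection.
project_usage() {
    total=0; days=0
    for n in "$@"; do
        total=$(( total + n ))
        days=$(( days + 1 ))
    done
    [ "$days" -gt 0 ] || { echo "no data" >&2; return 1; }
    avg=$(( total / days ))          # integer division, rounds down
    echo "avg=$avg projected_30d=$(( avg * 30 ))"
}

# Seven days of made-up usage figures
project_usage 1200 800 0 450 2300 900 650   # avg=900 projected_30d=27000
```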

---

### update-voices - Update Voice Table

Fetch the latest voices from the ElevenLabs API and update the local lookup table.

**Usage:**
```bash
elevenlabs-tts-tool update-voices [--output PATH]
```

**Arguments:**
- `--output, -o PATH`: Output file path (default: ~/.config/elevenlabs-tts-tool/voices_lookup.json)

**Examples:**
```bash
# Update default voice lookup (user config directory)
elevenlabs-tts-tool update-voices

# Save to
```

Source: https://github.com/dnvriend/elevenlabs-tts-tool/tree/main/plugins/elevenlabs-tts-tool/skills/elevenlabs-tts-tool
