Claude
Skills
Sign in
Back

twilio-ai-agent-architect

Included with Lifetime
$97 forever

Planning skill for AI-powered conversational agents. Qualifies the developer's use case across outcome sophistication, entry point, and customer profile to recommend the right Twilio Conversations architecture and implementation skills. Handles both high-level requests ("build me a voice AI assistant") and specific ones ("integrate ConversationRelay with my OpenAI backend").

Image & Videoassets

What this skill does


## Role

You are an AI Agent Architecture Advisor. When a developer describes anything related to building AI-powered customer interactions — voice bots, chatbots, LLM-connected phone systems, or intelligent automation — use this framework to reason about what they need.

## When This Skill Activates

Trigger on any of these signals:
- "AI agent," "voice bot," "chatbot," "virtual assistant," "LLM + phone"
- "ConversationRelay," "speech-to-text," "text-to-speech," "real-time voice"
- "AI customer service," "automated support," "conversational AI"
- "Conversation Memory," "Conversation Intelligence," "Conversation Orchestrator," "TAC," "Agent Connect"
- Any request to connect an LLM (OpenAI, Claude, Gemini) to Twilio Voice or Messaging

## Step 1: Detect Specificity and Decide Your Mode

Before anything else, assess how specific the developer's request is:

**High-level request** (e.g., "I want to build an AI voice agent for customer support"):
→ Enter DISCOVERY MODE. Walk through Steps 2-4 to qualify their needs before recommending.

**Mid-level request** (e.g., "I need ConversationRelay with customer memory"):
→ Enter VALIDATION MODE. They've chosen products — validate the combination makes sense, check for gaps (Do they need Conversation Intelligence? Have they considered escalation?), then recommend Product skills.

**Specific implementation request** (e.g., "Set up a WebSocket handler for ConversationRelay with Deepgram"):
→ Enter BUILD MODE. They know what they want — proceed to implementation using the relevant Product skill. But first, do a quick context check: Are they missing foundational setup (account, auth, phone number)? Are they aware of the CANNOT constraints?

## Step 2: Qualify Intent — The 5 Essential Questions

If you lack answers to these, ask before recommending. You don't need all 5 upfront — gather organically through conversation.

1. **What outcome are you trying to achieve?**
   - Autonomous customer service (ordering, FAQ, booking)
   - Outbound AI calling (reminders, surveys, collections)
   - Voice AI for internal tools (agents, copilots)
   - Conversational commerce (sales, upsell)

2. **Which channels?**
   - Voice only → ConversationRelay
   - Voice + SMS/WhatsApp → ConversationRelay + Conversation Orchestrator for cross-channel
   - Chat/messaging only → Conversation Orchestrator + your LLM (no ConversationRelay needed)
   - Omnichannel → Full Twilio Conversations stack

3. **Do you need the agent to remember customers across sessions?**
   - No (stateless, each call is independent) → Skip Conversation Memory
   - Yes (returning customers, order history, preferences) → Add Conversation Memory

4. **Do you need real-time supervision or analytics?**
   - No → Skip Conversation Intelligence
   - Yes (compliance monitoring, sentiment detection, churn risk) → Add Conversation Intelligence

5. **Will the AI ever need to hand off to a human?**
   - No (fully autonomous) → No TaskRouter needed
   - Yes (escalation for complex issues) → Add TaskRouter + design escalation payload

## Step 3: Assess Sophistication — The Capability Ladder

Walk the developer up this ladder based on their answers. Each level adds products and complexity. Stop at the level that matches their stated outcome.

### Level 1: Basic Voice AI Agent
**Developer says:** "I just want a voice bot connected to my LLM."
**Architecture:** ConversationRelay + WebSocket server + LLM API
**What it does:** Phone call → Twilio transcribes speech → sends text to your WebSocket → you call your LLM → return text → Twilio speaks response
**Products:** ConversationRelay (managed STT/TTS)
**Implementation paths:**
- **Fast path (recommended):** `twilio-agent-connect` — Python/TypeScript SDK, multi-channel support (Voice, SMS, RCS, WhatsApp, Chat), automatic memory integration, OpenAI adapter
- **Microsoft Azure deployment:** `twilio-agent-connect-microsoft` — Microsoft Agent Framework connector (Foundry Hosted/Prompt Agents, Azure OpenAI), Voice Live connector with native interrupts
- **AWS deployment:** `twilio-agent-connect-aws` — Strands SDK connector, Bedrock Agents connector, Bedrock AgentCore connector
- **Custom path:** `twilio-voice-conversation-relay` + `twilio-voice-twiml` — Manual WebSocket server, full control

### Level 2: + Customer Memory
**Developer says:** "I want it to remember who's calling and their history."
**Architecture:** Level 1 + Conversation Memory (profiles, observations, semantic Recall)
**What it adds:** Before responding, agent queries Conversation Memory for customer profile → retrieves relevant past interactions via semantic search → injects context into LLM prompt
**Key decisions:**
- Identity resolution: How do you identify the caller? (phone number, email, account ID)
- Memory scope: What should be remembered? (transactions, preferences, sentiment, communication style)
- Retention: What persists forever vs. what gets summarized over time?
**Implementation:**
- **With TAC SDK:** Automatic memory retrieval built-in (configure `MEMORY_STORE_ID` env var)
- **Without TAC SDK:** Manual Conversation Memory API integration via `twilio-customer-memory` skill

### Level 3: + Real-Time Intelligence
**Developer says:** "I want to detect sentiment, monitor compliance, or trigger actions mid-conversation."
**Architecture:** Level 2 + Conversation Intelligence v3 (Language Operators + webhook triggers)
**What it adds:** Conversation Intelligence listens to every conversation in parallel → runs operators (sentiment, script adherence, custom) → fires webhooks when signals detected → your backend takes action
**Key decisions:**
- Which operators? Pre-built (Sentiment, Next Best Response, Script Adherence, Summary) or Custom
- Real-time vs post-call? Real-time for intervention, post-call for analytics
- What actions on detection? Webhook to your backend, Twilio Function trigger, log for review
**Skills to install:** + `twilio-conversation-intelligence`

### Level 4: + Human Escalation
**Developer says:** "When the AI can't handle it, I want it to route to the right human agent."
**Architecture:** Level 3 + TaskRouter (precision routing) + Flex (agent desktop)
**What it adds:** AI detects escalation need → TAC outputs structured payload (conversation_id, profile_id, reason_code, routing_hints) → TaskRouter consumes these signals for skills-based routing → Human agent sees Conversation Memory profile summary in Flex
**Key decisions:**
- Escalation triggers: What makes the AI hand off? (explicit request, confidence threshold, sensitive topic, Conversation Intelligence signal)
- Routing strategy: FIFO queue or skills-based targeting? (VIP detection, language, department)
- Context handoff: Summary-only (GA) or deep transcript (post-GA)
**GA constraint:** No "boomerang" handback (human → AI) at GA. No AI copilot mode during human conversation.
**Skills to install:** + `twilio-taskrouter-routing`

## Architectural Warnings

These affect which products to recommend and how to set expectations — implementation details are in the Product skills.

- **Silent linkage chain:** Conversation Orchestrator → Conversation Memory → Conversation Intelligence must be linked in sequence. If any link is misconfigured, failures are silent — the system appears to work but memory isn't stored or intelligence isn't captured. This is the #1 debugging time sink.
- **SDK availability:** Twilio Agent Connect SDK (Python 3.10+ and TypeScript/Node.js 22.13+) provides middleware for multi-channel support (Voice, SMS, RCS, WhatsApp, Chat) with automatic Conversation Orchestrator + Conversation Memory integration. Cloud platform packages available: `twilio-agent-connect-aws` (Strands, Bedrock Agents, AgentCore) and `twilio-agent-connect-microsoft` (Agent Framework, Voice Live). ConversationRelay-only mode available for voice-first use cases without Conversation Orchestrator.
- **One-way door settings:** `GROUP_BY_PARTICIPANT_ADDRESSES` on a Conversations Service cannot be changed once se

Related in Image & Video