speak-core-workflow-a
Execute Speak primary workflow: AI Conversation Practice with real-time feedback. Use when implementing conversation practice features, building AI tutor interactions, or core language learning dialogue systems. Trigger with phrases like "speak conversation practice", "speak AI tutor", "speak dialogue", "primary speak workflow".
What this skill does
# Speak Core Workflow A: AI Conversation Practice
## Overview
Primary workflow for Speak: AI-powered conversation practice with real-time pronunciation feedback and adaptive tutoring. Speak uses GPT-4o for conversation generation and OpenAI's Realtime API for speech processing, delivering sub-second response times.
## Prerequisites
- Completed `speak-install-auth` setup
- Valid API credentials configured
- Audio handling capabilities (microphone or pre-recorded files)
## Instructions
### Step 1: Start a Conversation Session
```typescript
import { SpeakClient } from '@speak/language-sdk';
const client = new SpeakClient({
apiKey: process.env.SPEAK_API_KEY!,
appId: process.env.SPEAK_APP_ID!,
language: 'es',
});
// Start a restaurant ordering scenario in Spanish
const session = await client.startConversation({
scenario: 'ordering-food',
language: 'es',
level: 'intermediate',
nativeLanguage: 'en',
maxTurns: 10,
feedbackDetail: 'phoneme', // 'word' or 'phoneme'
});
console.log('Session started:', session.id);
console.log('AI Tutor:', session.firstPrompt.text);
// "Bienvenido al restaurante. Soy tu camarero. Que le gustaria ordenar?"
```
### Step 2: Send Student Responses
```typescript
// Submit audio for pronunciation scoring
const turn1 = await client.sendTurn(session.id, {
audioPath: './recordings/student-response-1.wav',
});
console.log('Tutor:', turn1.tutorText);
console.log('Pronunciation:', turn1.pronunciationScore); // 0-100
console.log('Grammar:', turn1.corrections);
// [{original: "yo quiero", suggestion: "quisiera", note: "More polite form for ordering"}]
console.log('Vocabulary:', turn1.vocabularyNotes);
// ["camarero = waiter", "ordenar = to order"]
// Or submit text (skips pronunciation scoring)
const turn2 = await client.sendTurn(session.id, {
text: 'Quisiera una ensalada y un vaso de agua, por favor.',
});
```
### Step 3: Conversation Loop with Progress Tracking
```typescript
async function runConversationLesson(
client: SpeakClient,
scenario: string,
language: string,
level: string,
) {
const session = await client.startConversation({
scenario, language, level, nativeLanguage: 'en',
});
const turns: TurnResult[] = [];
let isComplete = false;
while (!isComplete && turns.length < 10) {
// Display tutor prompt
const prompt = turns.length === 0
? session.firstPrompt.text
: turns[turns.length - 1].tutorText;
console.log(`\nTutor: ${prompt}`);
// Get student audio (mic input or file)
const audioPath = await recordStudentAudio();
// Submit and get feedback
const turn = await client.sendTurn(session.id, { audioPath });
turns.push(turn);
// Show feedback
if (turn.pronunciationScore < 60) {
console.log(`Pronunciation needs work: ${turn.pronunciationScore}/100`);
console.log('Try again with this phrase.');
}
if (turn.corrections.length > 0) {
console.log('Grammar notes:', turn.corrections.map(c => c.note).join('; '));
}
isComplete = turn.sessionComplete;
}
// End session and get summary
const summary = await client.endSession(session.id);
return summary;
}
```
### Step 4: Multi-Topic Session
```typescript
const topics = ['greetings', 'directions', 'ordering-food', 'shopping'];
const results: SessionSummary[] = [];
for (const topic of topics) {
console.log(`\n=== ${topic.toUpperCase()} ===`);
const summary = await runConversationLesson(client, topic, 'es', 'intermediate');
results.push(summary);
console.log(`Score: ${summary.avgPronunciationScore}/100`);
}
// Overall progress report
console.log('\n=== Session Report ===');
console.table(results.map(r => ({
topic: r.scenario,
pronunciation: r.avgPronunciationScore,
grammar: r.grammarAccuracy + '%',
newWords: r.newWords.length,
duration: r.durationMinutes + 'min',
})));
```
### Topic Categories
| Category | Scenarios | Level |
|----------|-----------|-------|
| Daily Life | greetings, introductions, weather | Beginner |
| Travel | directions, hotel, airport, transport | Beginner-Intermediate |
| Food & Drink | ordering-food, grocery, cooking | Intermediate |
| Business | meeting, presentation, negotiation | Intermediate-Advanced |
| Social | party, dating, opinions, debate | Advanced |
## Output
- Conversation session with AI tutor
- Real-time pronunciation feedback (0-100 score)
- Grammar corrections and suggestions
- Vocabulary notes for new words
- Session summary with progress metrics
## Error Handling
| Error | Cause | Solution |
|-------|-------|----------|
| Session timeout | Exceeded 30 min | Auto-end with summary, start new session |
| Audio processing failed | Invalid format | Convert to WAV 16kHz mono |
| Tutor not responding | API latency | Implement 10s timeout with retry |
| Recognition failed | Poor audio quality | Prompt user to re-record in quiet environment |
## Resources
- [Speak Website](https://speak.com)
- [Speak GPT-4 Blog](https://speak.com/blog/speak-gpt-4)
- [OpenAI Realtime API](https://platform.openai.com/docs/guides/realtime)
- [Live Roleplays](https://speak.com/blog/live-roleplays)
## Next Steps
For pronunciation-focused training, see `speak-core-workflow-b`.
## Examples
**Quick test**: Start a `greetings` scenario with `level: 'beginner'`, send 3 text responses, end session, and review the summary scores.
**Full lesson**: Run 4 topics in sequence, track pronunciation improvement across topics, and generate a progress report.
Related in General
modeling-omnistudio-epc-catalog
IncludedSalesforce Industries CME EPC product-modeling skill for Product2-based catalog creation. Use when creating EPC products, configuring product attributes, building offer bundles with Product Child Items, or reviewing EPC DataPack JSON metadata for product catalog changes. TRIGGER when: user creates or updates Product2 EPC records, AttributeAssignment payloads, AttributeMetadata/AttributeDefaultValues, Offer bundles, or ProductChildItem relationships. DO NOT TRIGGER when: designing OmniScripts/FlexCards/Integration Procedures (use building-omnistudio-omniscript, building-omnistudio-flexcard, or building-omnistudio-integration-procedure), implementing Apex business logic (use generating-apex), or troubleshooting deployment pipelines (use deploying-metadata).
relationship-science-coach
IncludedUse this skill for direct, practical adult relationship coaching: couples conflict, repair, trust, marriage, dating, flirting, attachment patterns, emotional connection, sex, desire differences, eroticism, kink negotiation, affection, love languages, breakups, and long-term passion. Draw on Gottman, EFT and Hold Me Tight, attachment science, modern sex research, Perel, Nagoski, Kerner, Schnarch, Love and Stosny, and flexible love-language tools. Be concrete and low-hedge. Redirect only for imminent danger, abuse, coercive control, minors, non-consent, self-harm, stalking, or medical/legal/psychiatric decisions.
building-sf-integrations
IncludedSalesforce integration architecture and runtime plumbing with 120-point scoring. Use this skill to set up Named Credentials, External Credentials, External Services, REST/SOAP callout patterns, Platform Events, and Change Data Capture. TRIGGER when: user sets up Named Credentials, External Services, REST/SOAP callouts, Platform Events, CDC, or touches .namedCredential-meta.xml files. DO NOT TRIGGER when: Connected App/OAuth config (use configuring-connected-apps), Apex-only logic (use generating-apex), or data import/export (use handling-sf-data).
venue-templates
IncludedAccess comprehensive LaTeX templates, formatting requirements, and submission guidelines for major scientific publication venues (Nature, Science, PLOS, IEEE, ACM), academic conferences (NeurIPS, ICML, CVPR, CHI), research posters, and grant proposals (NSF, NIH, DOE, DARPA). This skill should be used when preparing manuscripts for journal submission, conference papers, research posters, or grant proposals and need venue-specific formatting requirements and templates.
let-fate-decide
IncludedDraws the 12 Houses of the Zodiac Tarot spread to inject entropy into planning when prompts are vague, ambiguous, or casually delegated. Interprets the spread to guide next steps. Use when the user says 'let fate decide', 'YOLO', 'whatever', 'idk', or other nonchalant phrases, makes Yu-Gi-Oh references, or when you are about to arbitrarily pick between multiple reasonable approaches. Prefer over ask-questions-if-underspecified when the user's tone is casual or playful rather than precision-seeking.
net-ops
IncludedCross-platform network troubleshooting (Windows, macOS, Linux) via local or remote shell. Use for: DNS broken, can't resolve hostnames, nslookup/dig works but apps fail, NRPT, WFP, scutil, /etc/resolver, systemd-resolved, /etc/resolv.conf, NetworkManager, VPN DNS leak residue (ProtonVPN/Mullvad/WireGuard/AnyConnect), AV/firewall blocking DNS or DoH, Tailscale DNS interaction, intermittent connectivity, remote diagnostics over SSH.