deepgram-data-handling
Implement audio data handling best practices for Deepgram integrations. Use when managing audio file storage, implementing data retention, or ensuring GDPR/HIPAA compliance for transcription data. Trigger: "deepgram data", "audio storage", "transcription data", "deepgram GDPR", "deepgram HIPAA", "deepgram privacy", "PII redaction".
What this skill does
# Deepgram Data Handling
## Overview
Best practices for handling audio and transcript data with Deepgram. Covers Deepgram's built-in `redact` parameter for PII, secure audio upload with encryption, transcript storage patterns, data retention policies, and GDPR/HIPAA compliance workflows.
## Data Privacy Quick Reference
| Deepgram Feature | What It Does | Enable |
|-------------------|-------------|--------|
| `redact: ['pci']` | Masks credit card numbers in transcript | Query param |
| `redact: ['ssn']` | Masks Social Security numbers | Query param |
| `redact: ['numbers']` | Masks all numeric sequences | Query param |
| Data retention | Deepgram does NOT store audio or transcripts | Default behavior |
**Deepgram's data policy:** Audio is processed in real-time and not stored. Transcripts are not retained unless you use Deepgram's optional storage features.
## Instructions
### Step 1: Deepgram Built-in PII Redaction
```typescript
import { createClient } from '@deepgram/sdk';
const deepgram = createClient(process.env.DEEPGRAM_API_KEY!);
// Deepgram redacts PII directly during transcription
const { result } = await deepgram.listen.prerecorded.transcribeUrl(
{ url: audioUrl },
{
model: 'nova-3',
smart_format: true,
redact: ['pci', 'ssn'], // Credit cards + SSNs -> [REDACTED]
}
);
// Output: "My card is [REDACTED] and SSN is [REDACTED]"
console.log(result.results.channels[0].alternatives[0].transcript);
// For maximum privacy, redact all numbers:
// redact: ['pci', 'ssn', 'numbers']
```
### Step 2: Application-Level PII Redaction
```typescript
// Additional redaction patterns beyond Deepgram's built-in
const piiPatterns: Array<{ name: string; pattern: RegExp; replacement: string }> = [
{ name: 'email', pattern: /\b[\w.-]+@[\w.-]+\.\w{2,}\b/g, replacement: '[EMAIL]' },
{ name: 'phone', pattern: /\b(\+?\d{1,3}[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b/g, replacement: '[PHONE]' },
{ name: 'dob', pattern: /\b(0[1-9]|1[0-2])\/.-\/.-\d{2}\b/g, replacement: '[DOB]' },
{ name: 'address', pattern: /\b\d{1,5}\s[\w\s]+(?:Street|St|Avenue|Ave|Road|Rd|Drive|Dr|Lane|Ln|Boulevard|Blvd)\b/gi, replacement: '[ADDRESS]' },
];
function redactPII(text: string): { redacted: string; found: string[] } {
let redacted = text;
const found: string[] = [];
for (const { name, pattern, replacement } of piiPatterns) {
const matches = text.match(pattern);
if (matches) {
found.push(`${name}: ${matches.length} occurrence(s)`);
redacted = redacted.replace(pattern, replacement);
}
}
return { redacted, found };
}
// Usage after Deepgram transcription:
const transcript = result.results.channels[0].alternatives[0].transcript;
const { redacted, found } = redactPII(transcript);
if (found.length > 0) console.log('PII found and redacted:', found);
```
### Step 3: Secure Audio Upload and Storage
```typescript
import { S3Client, PutObjectCommand, GetObjectCommand } from '@aws-sdk/client-s3';
import { getSignedUrl } from '@aws-sdk/s3-request-presigner';
import { createHash, randomUUID } from 'crypto';
import { readFileSync } from 'fs';
const s3 = new S3Client({ region: process.env.AWS_REGION ?? 'us-east-1' });
const BUCKET = process.env.AUDIO_BUCKET!;
async function uploadAudio(filePath: string, metadata: Record<string, string> = {}) {
const audio = readFileSync(filePath);
const checksum = createHash('sha256').update(audio).digest('hex');
const key = `audio/${randomUUID()}-${checksum.substring(0, 8)}.wav`;
await s3.send(new PutObjectCommand({
Bucket: BUCKET,
Key: key,
Body: audio,
ContentType: 'audio/wav',
ServerSideEncryption: 'aws:kms', // Encrypt at rest
Metadata: {
...metadata,
checksum,
uploadedAt: new Date().toISOString(),
},
}));
// Generate presigned URL for Deepgram to fetch (expires in 1 hour)
const presignedUrl = await getSignedUrl(s3,
new GetObjectCommand({ Bucket: BUCKET, Key: key }),
{ expiresIn: 3600 }
);
return { key, checksum, presignedUrl };
}
// Upload -> Get presigned URL -> Send to Deepgram
const { presignedUrl } = await uploadAudio('./recording.wav', { source: 'call-center' });
const { result } = await deepgram.listen.prerecorded.transcribeUrl(
{ url: presignedUrl },
{ model: 'nova-3', smart_format: true, redact: ['pci', 'ssn'] }
);
```
### Step 4: Transcript Storage Pattern
```typescript
interface StoredTranscript {
id: string;
audioKey: string; // S3 reference
requestId: string; // Deepgram request_id
transcript: string; // Redacted text
confidence: number;
duration: number; // Audio duration in seconds
model: string;
speakers: number;
utterances?: Array<{
speaker: number;
text: string;
start: number;
end: number;
}>;
metadata: {
redacted: boolean;
piiTypesFound: string[];
createdAt: string;
retentionPolicy: 'standard' | 'legal_hold' | 'hipaa';
expiresAt: string;
};
}
function buildTranscriptRecord(
audioKey: string,
result: any,
retentionDays = 90
): StoredTranscript {
const alt = result.results.channels[0].alternatives[0];
const { redacted, found } = redactPII(alt.transcript);
return {
id: randomUUID(),
audioKey,
requestId: result.metadata.request_id,
transcript: redacted,
confidence: alt.confidence,
duration: result.metadata.duration,
model: Object.keys(result.metadata.model_info ?? {})[0] ?? 'unknown',
speakers: new Set(alt.words?.map((w: any) => w.speaker).filter(Boolean)).size,
utterances: result.results.utterances?.map((u: any) => ({
speaker: u.speaker,
text: u.transcript,
start: u.start,
end: u.end,
})),
metadata: {
redacted: found.length > 0,
piiTypesFound: found,
createdAt: new Date().toISOString(),
retentionPolicy: 'standard',
expiresAt: new Date(Date.now() + retentionDays * 86400000).toISOString(),
},
};
}
```
### Step 5: Data Retention Policies
```typescript
const retentionPolicies = {
standard: { days: 90, description: 'Default retention' },
legal_hold: { days: 2555, description: '7 years for legal' },
hipaa: { days: 2190, description: '6 years per HIPAA' },
temp: { days: 7, description: 'Temporary processing' },
};
async function enforceRetention(db: any, s3Client: S3Client, bucket: string) {
const now = new Date();
// Find expired transcripts
const expired = await db.query(
'SELECT id, audio_key FROM transcripts WHERE expires_at < $1 AND retention_policy != $2',
[now.toISOString(), 'legal_hold']
);
console.log(`Found ${expired.rows.length} expired transcripts`);
for (const row of expired.rows) {
// Delete audio from S3
try {
await s3Client.send(new DeleteObjectCommand({
Bucket: bucket, Key: row.audio_key,
}));
} catch (err: any) {
console.error(`S3 delete failed for ${row.audio_key}:`, err.message);
}
// Delete transcript from database
await db.query('DELETE FROM transcripts WHERE id = $1', [row.id]);
console.log(`Deleted: ${row.id}`);
}
return expired.rows.length;
}
```
### Step 6: GDPR Right to Erasure
```typescript
async function processErasureRequest(userId: string, db: any, s3Client: S3Client, bucket: string) {
console.log(`Processing GDPR erasure request for user: ${userId}`);
// 1. Find all user transcripts
const transcripts = await db.query(
'SELECT id, audio_key FROM transcripts WHERE user_id = $1', [userId]
);
// 2. Delete audio files from S3
for (const row of transcripts.rows) {
if (row.audio_key) {
await s3Client.send(new DeleteObjectCommand({ Bucket: bucket, Key: row.audio_key }));
}
}
// 3. Delete transcripts from database
const deleted = await db.query('DELETE FROM transcripts WHERE user_id = $1', [userId]);
// 4. Delete user metadata
await db.query('DELETE FROM user_metadata WHERE user_id = $1', [userId]);
// 5. ARelated in Image & Video
watch
IncludedWatch a video (URL or local path). Downloads with yt-dlp, extracts auto-scaled frames with ffmpeg, pulls the transcript from captions (or Whisper API fallback), and hands the result to Claude so it can answer questions about what's in the video.
physical-ai-defect-image-generation
IncludedUse when the user wants to orchestrate defect image generation, run associated setup, or handle outputs on OSMO. The Day 0 path handles cold-start with USD-to-ROI, image-edit augmentation, and AnomalyGen to create initial PCBA datasets. The Day 1 path performs inference and labeling on real images. This skill helps with first-time asset setup, creation of finetuning checkpoints, and configuring deployment. Trigger keywords: defect image generation, dig workflow, dig pipeline, defect image detection workflow, aoi pipeline, aoi anomalygen, usd2roi anomalygen, day 0 pcba, day 1 pcba, day 1 real-photo alignment, day 1 manual roi, metal surface anomaly, glass defect, anomalygen finetune, setup_pcb, setup_metal, setup_glass, setup_pretrained, dig setup, dig datasets, dig pretrained checkpoint, dig image-edit endpoint.
accelint-react-best-practices
IncludedReact performance optimization and best practices. ALWAYS use this skill when working with any React code - writing components, hooks, JSX; refactoring; optimizing re-renders, memoization, state management; reviewing for performance; fixing hydration mismatches; debugging infinite re-renders, stale closures, input focus loss, animations restarting; preventing remounting; implementing transitions, lazy initialization, effect dependencies. Even simple React tasks benefit from these patterns. Covers React 19+ (useEffectEvent, Activity, ref props). Triggers - useEffect, useState, useMemo, useCallback, memo, inline components, nested components, components inside components, re-render, performance, hydration, SSR, Next.js, useDeferredValue, combined hooks.
elevenlabs-agents
IncludedBuild conversational AI voice agents with ElevenLabs Platform using React, JavaScript, React Native, or Swift SDKs. Configure agents, tools (client/server/MCP), RAG knowledge bases, multi-voice, and Scribe real-time STT. Use when: building voice chat interfaces, implementing AI phone agents with Twilio, configuring agent workflows or tools, adding RAG knowledge bases, testing with CLI "agents as code", or troubleshooting deprecated @11labs packages, Android audio cutoff, CSP violations, dynamic variables, or WebRTC config. Keywords: ElevenLabs Agents, ElevenLabs voice agents, AI voice agents, conversational AI, @elevenlabs/react, @elevenlabs/client, @elevenlabs/react-native, @elevenlabs/elevenlabs-js, @elevenlabs/agents-cli, elevenlabs SDK, voice AI, TTS, text-to-speech, ASR, speech recognition, turn-taking model, WebRTC voice, WebSocket voice, ElevenLabs conversation, agent system prompt, agent tools, agent knowledge base, RAG voice agents, multi-voice agents, pronunciation dictionary, voice speed control, elevenlabs scribe, @11labs deprecated, Android audio cutoff, CSP violation elevenlabs, dynamic variables elevenlabs, case-sensitive tool names, webhook authentication
humanizer
IncludedHumanize AI-generated text by detecting and removing patterns typical of LLM output. Rewrites text to sound natural, specific, and human. Uses 28 pattern detectors, 560+ AI vocabulary terms across 3 tiers, and statistical analysis (burstiness, type-token ratio, readability) for comprehensive detection. Use when asked to humanize text, de-AI writing, make content sound more natural/human, review writing for AI patterns, score text for AI detection, or improve AI-generated drafts. Covers content, language, style, communication, and filler categories.
generating-mermaid-diagrams
IncludedSalesforce architecture diagrams using Mermaid with ASCII fallback. Use this skill when generating text-based diagrams for Salesforce architecture, OAuth flows, ERDs, integration sequences, or Agentforce structure. TRIGGER when: user says "diagram", "visualize", "ERD", or asks for sequence diagrams, flowcharts, class diagrams, or architecture visualizations in Mermaid. DO NOT TRIGGER when: user wants PNG/SVG image output (use generating-visual-diagrams), or asks about non-Salesforce systems.