abridge-performance-tuning

Included with Lifetime

$97 forever

Optimize Abridge clinical AI integration performance for high-volume deployments. Use when reducing note generation latency, optimizing audio streaming throughput, improving FHIR push performance, or scaling for multi-site health systems. Trigger: "abridge performance", "abridge latency", "abridge optimization", "abridge slow", "abridge scale".

Image & Videosaashealthcareaiabridgeperformance

What this skill does

# Abridge Performance Tuning

## Overview

Performance optimization for high-volume Abridge deployments. Large health systems process thousands of encounters daily — latency in note generation directly impacts clinical workflow throughput.

## Performance Targets

| Metric | Target | Critical Threshold |
|--------|--------|--------------------|
| Audio stream → first transcript | < 2s | > 5s |
| Encounter → completed note | < 30s | > 60s |
| Note → EHR push | < 3s | > 10s |
| Patient summary generation | < 10s | > 30s |
| Concurrent sessions per org | 100+ | < 50 |

## Instructions

### Step 1: Audio Streaming Optimization

```typescript
// src/performance/audio-optimizer.ts
// Optimize audio chunk size and streaming for lowest latency

interface AudioStreamMetrics {
  chunkSize: number;
  sendInterval: number;
  bufferUtilization: number;
  latencyP50: number;
  latencyP99: number;
}

class OptimizedAudioStream {
  private buffer: Buffer[] = [];
  private metrics: AudioStreamMetrics = {
    chunkSize: 3200,       // 100ms at 16kHz 16-bit mono = 3200 bytes
    sendInterval: 100,     // Send every 100ms
    bufferUtilization: 0,
    latencyP50: 0,
    latencyP99: 0,
  };

  constructor(
    private ws: WebSocket,
    private sampleRate: number = 16000,
  ) {}

  // Optimal chunk size: 100ms for low latency, 500ms for bandwidth efficiency
  processAudioChunk(chunk: Buffer): void {
    this.buffer.push(chunk);

    const totalSize = this.buffer.reduce((sum, b) => sum + b.length, 0);
    if (totalSize >= this.metrics.chunkSize) {
      const combined = Buffer.concat(this.buffer);
      this.buffer = [];

      if (this.ws.readyState === WebSocket.OPEN) {
        const start = performance.now();
        this.ws.send(combined);
        this.recordLatency(performance.now() - start);
      }
    }
  }

  private recordLatency(ms: number): void {
    // Track P50/P99 for monitoring
    this.metrics.latencyP50 = ms; // Simplified — use histogram in production
  }

  getMetrics(): AudioStreamMetrics {
    return { ...this.metrics };
  }
}
```

### Step 2: Note Generation Pipeline Optimization

```typescript
// src/performance/note-pipeline.ts
// Pre-warm note generation and parallelize post-processing

interface PipelineStage {
  name: string;
  durationMs: number;
  parallel: boolean;
}

async function optimizedNotePipeline(
  api: any,
  sessionId: string,
): Promise<{ note: any; metrics: PipelineStage[] }> {
  const stages: PipelineStage[] = [];

  // Stage 1: Finalize session (triggers AI processing)
  const t1 = performance.now();
  await api.post(`/encounters/sessions/${sessionId}/finalize`);
  stages.push({ name: 'finalize', durationMs: performance.now() - t1, parallel: false });

  // Stage 2: Poll with exponential backoff (adaptive polling)
  const t2 = performance.now();
  let pollInterval = 500;  // Start fast
  let note = null;

  for (let i = 0; i < 30; i++) {
    const { data } = await api.get(`/encounters/sessions/${sessionId}/note`);
    if (data.status === 'completed') {
      note = data.note;
      break;
    }
    await new Promise(r => setTimeout(r, pollInterval));
    pollInterval = Math.min(pollInterval * 1.5, 3000); // Back off gradually
  }
  stages.push({ name: 'note_generation', durationMs: performance.now() - t2, parallel: false });

  if (!note) throw new Error('Note generation timed out');

  // Stage 3: Parallel post-processing
  const t3 = performance.now();
  const [patientSummary, ehrResult] = await Promise.allSettled([
    api.post(`/encounters/sessions/${sessionId}/patient-summary`, { language: 'en' }),
    pushNoteToEhr(note),
  ]);
  stages.push({ name: 'post_processing', durationMs: performance.now() - t3, parallel: true });

  return { note, metrics: stages };
}
```

### Step 3: Connection Pooling for FHIR Push

```typescript
// src/performance/connection-pool.ts
import axios from 'axios';
import https from 'https';

// Reuse TCP connections for FHIR endpoint
const fhirAgent = new https.Agent({
  keepAlive: true,
  keepAliveMsecs: 30000,
  maxSockets: 20,          // Max concurrent FHIR connections
  maxFreeSockets: 5,
  minVersion: 'TLSv1.3',
});

const fhirClient = axios.create({
  baseURL: process.env.EPIC_FHIR_BASE_URL,
  httpsAgent: fhirAgent,
  timeout: 10000,
});

// Batch FHIR pushes for multi-encounter processing
async function batchFhirPush(notes: Array<{ docRef: any }>): Promise<void> {
  // FHIR Bundle for batch operations
  const bundle = {
    resourceType: 'Bundle',
    type: 'batch',
    entry: notes.map(n => ({
      resource: n.docRef,
      request: { method: 'POST', url: 'DocumentReference' },
    })),
  };

  await fhirClient.post('/', bundle, {
    headers: { 'Content-Type': 'application/fhir+json' },
  });
}
```

### Step 4: Performance Monitoring Dashboard

```typescript
// src/performance/monitor.ts
interface PerformanceSnapshot {
  timestamp: string;
  activeSessions: number;
  avgNoteLatencyMs: number;
  p99NoteLatencyMs: number;
  fhirPushSuccessRate: number;
  audioStreamDropRate: number;
}

class PerformanceMonitor {
  private noteLatencies: number[] = [];
  private fhirPushResults: boolean[] = [];

  recordNoteLatency(ms: number): void {
    this.noteLatencies.push(ms);
    if (this.noteLatencies.length > 1000) this.noteLatencies.shift();
  }

  recordFhirPush(success: boolean): void {
    this.fhirPushResults.push(success);
    if (this.fhirPushResults.length > 1000) this.fhirPushResults.shift();
  }

  getSnapshot(activeSessions: number): PerformanceSnapshot {
    const sorted = [...this.noteLatencies].sort((a, b) => a - b);
    return {
      timestamp: new Date().toISOString(),
      activeSessions,
      avgNoteLatencyMs: sorted.length ? sorted.reduce((a, b) => a + b, 0) / sorted.length : 0,
      p99NoteLatencyMs: sorted.length ? sorted[Math.floor(sorted.length * 0.99)] : 0,
      fhirPushSuccessRate: this.fhirPushResults.length
        ? this.fhirPushResults.filter(Boolean).length / this.fhirPushResults.length
        : 1,
      audioStreamDropRate: 0, // Populated by audio stream metrics
    };
  }
}
```

## Output

- Optimized audio streaming with 100ms chunking
- Adaptive polling for note generation (500ms → 3s backoff)
- Connection-pooled FHIR batch pushes
- Real-time performance monitoring with P50/P99 latency tracking

## Error Handling

| Issue | Cause | Solution |
|-------|-------|----------|
| High note latency | Complex encounter | Pre-segment long encounters |
| FHIR push timeout | EHR server overloaded | Use connection pool; batch pushes |
| Audio drops | Network jitter | Buffer 500ms; reconnect on drop |

## Resources

- [Abridge Platform](https://www.abridge.com/product)
- [Node.js HTTPS Agent](https://nodejs.org/api/https.html#class-httpsagent)

## Next Steps

For cost optimization, see `abridge-cost-tuning`.

Files: 1

Size: 7.3 KB

Complexity: 18/100

Category: Image & Video

Source: https://github.com/jeremylongshore/claude-code-plugins-plus-skills/tree/main/plugins/saas-packs/abridge-pack/skills/abridge-performance-tuning

Related in Image & Video

watch

Included

Watch a video (URL or local path). Downloads with yt-dlp, extracts auto-scaled frames with ffmpeg, pulls the transcript from captions (or Whisper API fallback), and hands the result to Claude so it can answer questions about what's in the video.

Image & Videoscriptsfeatured

physical-ai-defect-image-generation

Included

Use when the user wants to orchestrate defect image generation, run associated setup, or handle outputs on OSMO. The Day 0 path handles cold-start with USD-to-ROI, image-edit augmentation, and AnomalyGen to create initial PCBA datasets. The Day 1 path performs inference and labeling on real images. This skill helps with first-time asset setup, creation of finetuning checkpoints, and configuring deployment. Trigger keywords: defect image generation, dig workflow, dig pipeline, defect image detection workflow, aoi pipeline, aoi anomalygen, usd2roi anomalygen, day 0 pcba, day 1 pcba, day 1 real-photo alignment, day 1 manual roi, metal surface anomaly, glass defect, anomalygen finetune, setup_pcb, setup_metal, setup_glass, setup_pretrained, dig setup, dig datasets, dig pretrained checkpoint, dig image-edit endpoint.

Image & Videoscripts

accelint-react-best-practices

Included

React performance optimization and best practices. ALWAYS use this skill when working with any React code - writing components, hooks, JSX; refactoring; optimizing re-renders, memoization, state management; reviewing for performance; fixing hydration mismatches; debugging infinite re-renders, stale closures, input focus loss, animations restarting; preventing remounting; implementing transitions, lazy initialization, effect dependencies. Even simple React tasks benefit from these patterns. Covers React 19+ (useEffectEvent, Activity, ref props). Triggers - useEffect, useState, useMemo, useCallback, memo, inline components, nested components, components inside components, re-render, performance, hydration, SSR, Next.js, useDeferredValue, combined hooks.

Image & Videoscripts

elevenlabs-agents

Included

Build conversational AI voice agents with ElevenLabs Platform using React, JavaScript, React Native, or Swift SDKs. Configure agents, tools (client/server/MCP), RAG knowledge bases, multi-voice, and Scribe real-time STT. Use when: building voice chat interfaces, implementing AI phone agents with Twilio, configuring agent workflows or tools, adding RAG knowledge bases, testing with CLI "agents as code", or troubleshooting deprecated @11labs packages, Android audio cutoff, CSP violations, dynamic variables, or WebRTC config. Keywords: ElevenLabs Agents, ElevenLabs voice agents, AI voice agents, conversational AI, @elevenlabs/react, @elevenlabs/client, @elevenlabs/react-native, @elevenlabs/elevenlabs-js, @elevenlabs/agents-cli, elevenlabs SDK, voice AI, TTS, text-to-speech, ASR, speech recognition, turn-taking model, WebRTC voice, WebSocket voice, ElevenLabs conversation, agent system prompt, agent tools, agent knowledge base, RAG voice agents, multi-voice agents, pronunciation dictionary, voice speed control, elevenlabs scribe, @11labs deprecated, Android audio cutoff, CSP violation elevenlabs, dynamic variables elevenlabs, case-sensitive tool names, webhook authentication

Image & Videoscripts

humanizer

Included

Humanize AI-generated text by detecting and removing patterns typical of LLM output. Rewrites text to sound natural, specific, and human. Uses 28 pattern detectors, 560+ AI vocabulary terms across 3 tiers, and statistical analysis (burstiness, type-token ratio, readability) for comprehensive detection. Use when asked to humanize text, de-AI writing, make content sound more natural/human, review writing for AI patterns, score text for AI detection, or improve AI-generated drafts. Covers content, language, style, communication, and filler categories.

Image & Videoscripts

generating-mermaid-diagrams

Included

Salesforce architecture diagrams using Mermaid with ASCII fallback. Use this skill when generating text-based diagrams for Salesforce architecture, OAuth flows, ERDs, integration sequences, or Agentforce structure. TRIGGER when: user says "diagram", "visualize", "ERD", or asks for sequence diagrams, flowcharts, class diagrams, or architecture visualizations in Mermaid. DO NOT TRIGGER when: user wants PNG/SVG image output (use generating-visual-diagrams), or asks about non-Salesforce systems.

Image & Videoscripts