podium-call-transcript-pipeline

Durable, idempotent ingest pipeline for Podium call transcripts — the layer between Podium's transcript webhook and a downstream RAG/LLM queue. Survives minutes-to-hours transcript latency, partial-then-final overwrites, non-English callers, PII leakage to downstream consumers, queue-write failures, and speaker-diarization loss. Use when wiring Podium phone-call transcripts into an AI-assist queue, hardening an existing ingest against transcript drift, redacting PII before transcripts reach an LLM, or building the durable ack/replay layer that feeds podium-rag-context-bridge. Trigger with "podium call transcripts", "podium transcript webhook", "podium transcript ingest", "podium transcript pii redact", "podium transcript chunking", "podium transcript queue".

AI Agents, podium, call-transcripts, webhooks, pii-redaction, rag-pipeline, language-detection, scripts

What this skill does


# Podium Call Transcript Pipeline

## Overview

Ingest Podium phone-call transcripts and stage them on a downstream queue so an LLM (with RAG context) can assist the team answering the phone. This is not a real-time transcription tool — Podium emits transcripts on a webhook minutes-to-hours after the call ends, and the design assumes that asynchrony. The skill is the durable layer between Podium and the RAG bridge: webhook lands, transcript is verified and de-duplicated, PII is redacted on ingest, speaker structure is preserved, language is detected, and a chunked record is enqueued for the next stage.

The six production failures this skill prevents:

1. **Assuming transcripts arrive in real time** — they don't. Transcripts land on the webhook minutes to hours after the call ends. A pipeline that blocks on the call-ended event until the transcript is available either times out or holds an HTTP request open for hours. The ingest must be webhook-driven and ack-decoupled.
2. **Partial-transcript update events overwrite the final transcript** — Podium can emit `call.transcript.partial` before `call.transcript.completed`. Naive handlers store the partial as final, and the LLM downstream sees a truncated transcript. The ingest must key on `(transcript_id, event_type)` and only promote a record to "final" on a `completed` event.
3. **No language detection on ingest** — non-English transcripts sent to an English-only LLM produce nonsense answers that the on-phone agent reads to the customer. Detection on ingest routes non-English transcripts to a separate handling path before they reach the RAG layer.
4. **PII leakage to downstream consumers** — call transcripts contain credit card numbers, full phone numbers, addresses, and dates of birth. Once these reach a third-party LLM or RAG vector store they are effectively un-redactable. Redaction must happen on ingest, before the queue write, with an auditable per-redaction log.
5. **Queueing failures lose transcripts permanently** — the webhook handler returns 200 to Podium but the downstream queue write fails. The transcript is gone with no replay path. The ingest must persist the raw transcript to a local durable store **before** acking the webhook; the queue write happens from that durable store with retries.
6. **Missing speaker diarization fields** — Podium's transcript JSON tags each segment with a speaker role (caller vs agent). Flat ingest that concatenates segments destroys the structure the LLM needs. The chunker must be speaker-aware and never split a segment across speakers.

## Authentication

This skill does not authenticate to Podium directly. Two distinct auth paths are involved and **both are consumed by reference** from sibling skills — never re-implemented:

- **Inbound webhook auth** — HMAC signature verification is delegated to `podium-webhook-reliability::verify_webhook(raw, signature)`. The webhook secret lives in the verifier's config. The handler fails closed if the verifier is not importable.
- **Outbound Podium API auth** — the fallback poller acquires an OAuth bearer token via `podium-auth::PodiumAuth.get_token()`. Credentials live in `podium-auth`'s secret store.
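
The fail-closed behavior for a missing verifier can be sketched as follows. This is a minimal illustration, not the skill's actual loader: `load_verifier` and `deny_all` are hypothetical names, and the only details taken from this document are the module name `podium_webhook_reliability` and the function name `verify_webhook`.

```python
import importlib
from typing import Callable

Verifier = Callable[[bytes, str], bool]

def load_verifier(module_name: str = "podium_webhook_reliability") -> Verifier:
    """Return the sibling skill's verify_webhook, or a deny-all stand-in (hypothetical helper)."""
    try:
        module = importlib.import_module(module_name)
        return module.verify_webhook
    except (ImportError, AttributeError):
        # Fail closed: with no verifier available, no payload is ever trusted.
        def deny_all(raw: bytes, signature: str) -> bool:
            return False
        return deny_all
```

With this shape, a broken install rejects every webhook with 401 instead of silently accepting unsigned payloads.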

The pipeline inherits the auth posture of both skills. Operator checklist when installing:

1. Verify that `podium-auth` and `podium-webhook-reliability` are both installed and configured.
2. Configure `.gitignore` to exclude the inbox database (`*.db`) and the redaction audit log (`redactions.jsonl`).
3. Run a regex grep across the host repo for Podium client-secret formats and Stripe-style live keys (the canonical patterns are listed in `references/implementation.md`) to confirm no inline credentials leaked.
4. Set `PODIUM_TRANSCRIPT_INBOX_PATH` to a writable path with file mode 0600 (read and write for the owning service user only).
5. Configure the downstream queue backend (Redis Streams, SQS, or SQLite-as-queue) before enabling webhook traffic.
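
Step 4 of the checklist can be enforced in code rather than by hand. The sketch below assumes a POSIX host; `ensure_private_file` and `is_private` are hypothetical helper names and are not part of the skill.

```python
import os
import stat

def ensure_private_file(path: str) -> None:
    """Create the file if missing and restrict it to owner read/write (0600)."""
    # Creating with mode 0o600 avoids a window where a new file is world-readable.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    os.close(fd)
    # chmod also tightens a file that already existed with looser permissions.
    os.chmod(path, 0o600)

def is_private(path: str) -> bool:
    """True when the permission bits are exactly 0600."""
    return stat.S_IMODE(os.stat(path).st_mode) == 0o600
```

A startup check such as `ensure_private_file(os.environ["PODIUM_TRANSCRIPT_INBOX_PATH"])` would then run before any webhook traffic is accepted.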

## Prerequisites

- Python 3.10+
- `podium-auth` skill installed (consumed for outbound API auth)
- `podium-webhook-reliability` skill installed (consumed for HMAC verification)
- `podium-rate-limit-survival` skill installed (consumed by the fallback poller)
- A durable inbox store — SQLite default; Postgres or DynamoDB are drop-in replacements
- A downstream queue — Redis Streams (default), AWS SQS, or local SQLite-as-queue for dev
- `langdetect` (default) or `fasttext-langdetect` for language detection
- `presidio-analyzer` + `presidio-anonymizer` for high-recall PII, or the bundled regex layer alone

## Instructions

Build in this order. Each section neutralizes one production failure mode.

### 1. Webhook-driven, ack-decoupled ingest

The webhook handler returns 200 quickly and defers all transcript work to an asynchronous processor. Its only synchronous job is verify-and-store-raw; everything else happens out-of-band.

```python
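import json
import time

from fastapi import FastAPI, HTTPException, Request
from podium_webhook_reliability import verify_webhook   # consumed by reference
from podium_call_transcript_pipeline import inbox

app = FastAPI()

@app.post("/podium/transcripts")
async def transcript_webhook(request: Request):
    # Verify the signature against the exact raw bytes, before any parsing.
    raw = await request.body()
    sig = request.headers.get("podium-signature", "")
    if not verify_webhook(raw, sig):
        raise HTTPException(401, "invalid signature")

    # A signed but malformed payload will not parse on retry either, so reject it with 400.
    try:
        event = json.loads(raw)
        event_type = event.get("type", "")
    except (ValueError, AttributeError):
        raise HTTPException(400, "malformed payload")
    if not event_type.startswith("call.transcript."):
        return {"status": "ignored"}
    data = event.get("data")
    transcript_id = data.get("transcript_id") if isinstance(data, dict) else None
    if not transcript_id:
        raise HTTPException(400, "missing transcript_id")

    # Durable write happens BEFORE returning 200. If it raises, the request fails with a 5xx so Podium retries.
    inbox.insert(
        transcript_id=transcript_id,
        event_type=event_type,
        received_at=time.time(),
        raw_payload=raw,
    )
    return {"status": "accepted"}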
```
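
The other half of the ack-decoupled design is the processor that drains the durable store into the queue with retries. The following is a rough sketch under stated assumptions: SQLite as the inbox, a hypothetical schema with `attempts` and `delivered` columns, and an `enqueue` callable standing in for whichever queue backend is configured. It is not the skill's actual `inbox` module, whose `insert` signature differs.

```python
import sqlite3
from typing import Callable

def open_inbox(path: str = ":memory:") -> sqlite3.Connection:
    """Open the inbox, keyed on (transcript_id, event_type). Schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS inbox (
               transcript_id TEXT NOT NULL,
               event_type    TEXT NOT NULL,
               received_at   REAL NOT NULL,
               raw_payload   BLOB NOT NULL,
               attempts      INTEGER NOT NULL DEFAULT 0,
               delivered     INTEGER NOT NULL DEFAULT 0,
               PRIMARY KEY (transcript_id, event_type))"""
    )
    return conn

def insert(conn, transcript_id: str, event_type: str, received_at: float, raw_payload: bytes) -> bool:
    """Idempotent insert: a webhook redelivery of the same event is a no-op."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO inbox (transcript_id, event_type, received_at, raw_payload) "
        "VALUES (?, ?, ?, ?)",
        (transcript_id, event_type, received_at, raw_payload),
    )
    conn.commit()
    return cur.rowcount == 1

def drain_once(conn, enqueue: Callable[[str, str, bytes], None], max_attempts: int = 5) -> int:
    """Try to enqueue every undelivered row once; return how many were delivered."""
    rows = conn.execute(
        "SELECT transcript_id, event_type, raw_payload FROM inbox "
        "WHERE delivered = 0 AND attempts < ? ORDER BY received_at",
        (max_attempts,),
    ).fetchall()
    delivered = 0
    for transcript_id, event_type, payload in rows:
        key = (transcript_id, event_type)
        try:
            enqueue(transcript_id, event_type, payload)
        except Exception:
            # Queue write failed: keep the row and count the attempt for a later retry.
            conn.execute(
                "UPDATE inbox SET attempts = attempts + 1 "
                "WHERE transcript_id = ? AND event_type = ?", key)
        else:
            conn.execute(
                "UPDATE inbox SET delivered = 1 "
                "WHERE transcript_id = ? AND event_type = ?", key)
            delivered += 1
        conn.commit()
    return delivered
```

Because the row is marked delivered only after `enqueue` returns, a crash between the two steps can produce a duplicate queue message but never a lost transcript, so downstream consumers should tolerate redelivery.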

### 2. Partial-vs-completed de-duplication

Podium emits these event types for a single call:

| Event type | Meaning | Handling |
|---|---|---|
| `call.ended` | Audio capture complete | Note arrival; no transcript yet |
| `call.transcript.partial` | Best-effort transcript while final generates | Store as partial; never promote to final |
| `call.transcript.completed` | Final transcript ready | Promote to final; supersedes any partial |
| `call.transcript.failed` | Transcription failed | Record failure; alert if call duration was material |

The inbox table is keyed on `(transcript_id, event_type)`. A separate `transcripts` table is keyed on `transcript_id` alone. A `completed` event always supersedes a `partial` for the same `transcript_id`. A late-arriving `partial` after `completed` is ignored — the processor checks current status before writing.
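
The promotion rule above can be sketched as a small state check. This is a minimal illustration assuming a SQLite `transcripts` table with hypothetical `status` and `body` columns; `apply_event` is an illustrative name, and failure events are left out for brevity.

```python
import sqlite3

def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the transcripts table, keyed on transcript_id alone. Schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS transcripts (
               transcript_id TEXT PRIMARY KEY,
               status        TEXT NOT NULL,
               body          TEXT NOT NULL)"""
    )
    return conn

def apply_event(conn, transcript_id: str, event_type: str, body: str) -> str:
    """Apply one transcript event; return the resulting action ('partial', 'final', 'ignored')."""
    row = conn.execute(
        "SELECT status FROM transcripts WHERE transcript_id = ?", (transcript_id,)
    ).fetchone()
    current = row[0] if row else None

    if event_type == "call.transcript.completed":
        new_status = "final"          # completed always supersedes a partial
    elif event_type == "call.transcript.partial":
        if current == "final":
            return "ignored"          # late partial must not overwrite the final
        new_status = "partial"
    else:
        return "ignored"              # other event types are out of scope here

    conn.execute(
        "INSERT OR REPLACE INTO transcripts (transcript_id, status, body) VALUES (?, ?, ?)",
        (transcript_id, new_status, body),
    )
    conn.commit()
    return new_status
```

Checking the current status before writing is what makes out-of-order delivery safe: the outcome is the same whether the partial arrives before or after the completed event.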

### 3. Language detection on ingest

Detect language before redaction, because the redaction patterns applied downstream are language-aware. The default policy: English transcripts proceed to the standard RAG queue; non-English transcripts go to a separate queue with a translation step inserted.

```python
from langdetect import detect_langs, DetectorFactory
DetectorFactory.seed = 0   # deterministic detection across runs

def detect_transcript_language(text: str) -> tuple[str, float]:
    if len(text.strip()) < 20:
        return ("und", 0.0)   # too short to detect reliably
    try:
        top = detect_langs(text)[0]
        return (top.lang, top.prob)
    except Exception:
        return ("und", 0.0)
```

Routing rule: English with confidence ≥ 0.85 → `queue:rag.transcripts.en`; confidence < 0.50 → `queue:rag.transcripts.review` (human review); otherwise → per-language queue.
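
Read literally, the routing rule is a three-way branch, sketched below. The thresholds and the first two queue names come from the rule above; the per-language queue naming (`queue:rag.transcripts.lang.<code>`) and the function name `route_transcript` are assumptions made for illustration.

```python
def route_transcript(lang: str, confidence: float) -> str:
    """Map a (language, confidence) pair to a queue name."""
    if lang == "en" and confidence >= 0.85:
        return "queue:rag.transcripts.en"        # standard RAG queue
    if confidence < 0.50:
        return "queue:rag.transcripts.review"    # human review, includes "und"
    # Hypothetical naming scheme for per-language queues.
    return f"queue:rag.transcripts.lang.{lang}"
```

Under this reading, English detected with confidence between 0.50 and 0.85 falls into the "otherwise" branch; a deployment may prefer to send that band to human review instead.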

### 4. PII redaction on ingest

Redaction is **non-optional** and happens before the transcript is written to the outbound queue. The redaction is auditable: for every redaction the system records the category, the character offsets, and the rule's id. A conservative regex layer handles the high-precision categories; presidio/spaCy covers the categories where recall matters more than precision (names, addresses).
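
As a rough illustration of the regex layer and its audit trail, the sketch below redacts two categories and records category, rule id, and character offsets into the original text. The patterns, rule ids, and the names `RedactionRecord`, `RULES`, and `redact` are hypothetical; the bundled layer's real rules are not shown in this document, and these simplified patterns are not production-grade.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class RedactionRecord:
    category: str
    rule_id: str
    start: int   # offset into the ORIGINAL text
    end: int

# Hypothetical rules, ordered from most to least specific.
RULES = [
    ("credit_card", "cc-basic-v1", re.compile(r"\b\d(?:[ -]?\d){12,15}\b")),
    ("phone", "phone-us-v1", re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b")),
]

def redact(text: str) -> tuple[str, list[RedactionRecord]]:
    """Return the redacted text plus one audit record per redaction."""
    records: list[RedactionRecord] = []
    for category, rule_id, pattern in RULES:
        for match in pattern.finditer(text):
            # Skip spans already claimed by an earlier, more specific rule.
            if any(match.start() < r.end and r.start < match.end() for r in records):
                continue
            records.append(RedactionRecord(category, rule_id, match.start(), match.end()))

    records.sort(key=lambda r: r.start)
    pieces, cursor = [], 0
    for record in records:
        pieces.append(text[cursor:record.start])
        pieces.append(f"[{record.category.upper()}]")
        cursor = record.end
    pieces.append(text[cursor:])
    return "".join(pieces), records
```

Only the records (never the matched text) would be appended to the audit log, so the log itself does not become a second copy of the PII.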

```python
import re
from dataclasses import dataclass

@dataclass
class Redaction:
    category: str
    rule_id:
```