podium-call-transcript-pipeline
Durable, idempotent ingest pipeline for Podium call transcripts — the layer between Podium's transcript webhook and a downstream RAG/LLM queue. Survives minutes-to-hours transcript latency, partial-then-final overwrites, non-English callers, PII leakage to downstream consumers, queue-write failures, and speaker-diarization loss. Use when wiring Podium phone-call transcripts into an AI-assist queue, hardening an existing ingest against transcript drift, redacting PII before transcripts reach an LLM, or building the durable ack/replay layer that feeds podium-rag-context-bridge. Trigger with "podium call transcripts", "podium transcript webhook", "podium transcript ingest", "podium transcript pii redact", "podium transcript chunking", "podium transcript queue".
What this skill does
# Podium Call Transcript Pipeline
## Overview
Ingest Podium phone-call transcripts and stage them on a downstream queue so an LLM (with RAG context) can assist the team answering the phone. This is not a real-time transcription tool — Podium emits transcripts on a webhook minutes-to-hours after the call ends, and the design assumes that asynchrony. The skill is the durable layer between Podium and the RAG bridge: webhook lands, transcript is verified and de-duplicated, PII is redacted on ingest, speaker structure is preserved, language is detected, and a chunked record is enqueued for the next stage.
The six production failures this skill prevents:
1. **Assuming transcripts arrive in real-time** — they don't. Transcripts land on the webhook minutes to hours after the call ends. Pipelines that block on the call-ended event until the transcript is available either time out or hold an HTTP request open for hours. The ingest must be webhook-driven and ack-decoupled.
2. **Partial-transcript update events overwrite the final transcript** — Podium can emit `call.transcript.partial` before `call.transcript.completed`. Naive handlers store the partial as final, and the LLM downstream sees a truncated transcript. The ingest must key on `(transcript_id, event_type)` and only promote a record to "final" on a `completed` event.
3. **No language detection on ingest** — non-English transcripts sent to an English-only LLM produce nonsense answers that the on-phone agent reads to the customer. Detection on ingest routes non-English transcripts to a separate handling path before they reach the RAG layer.
4. **PII leakage to downstream consumers** — call transcripts contain credit card numbers, full phone numbers, addresses, and dates of birth. Once these reach a third-party LLM or RAG vector store they are effectively un-redactable. Redaction must happen on ingest, before the queue write, with an auditable per-redaction log.
5. **Queueing failures lose transcripts permanently** — the webhook handler returns 200 to Podium but the downstream queue write fails. The transcript is gone with no replay path. The ingest must persist the raw transcript to a local durable store **before** acking the webhook; the queue write happens from that durable store with retries.
6. **Missing speaker diarization fields** — Podium's transcript JSON tags each segment with a speaker role (caller vs agent). Flat ingest that concatenates segments destroys the structure the LLM needs. The chunker must be speaker-aware and never split a segment across speakers.
## Authentication
This skill does not authenticate to Podium directly. Two distinct auth paths are involved and **both are consumed by reference** from sibling skills — never re-implemented:
- **Inbound webhook auth** — HMAC signature verification is delegated to `podium-webhook-reliability::verify_webhook(raw, signature)`. The webhook secret lives in the verifier's config. The handler fails closed if the verifier is not importable.
- **Outbound Podium API auth** — the fallback poller acquires an OAuth bearer token via `podium-auth::PodiumAuth.get_token()`. Credentials live in `podium-auth`'s secret store.
The pipeline inherits the auth posture of both skills. Operator checklist when installing:
1. Verify that `podium-auth` and `podium-webhook-reliability` are both installed and configured.
2. Configure `.gitignore` to exclude the inbox database (`*.db`) and the redaction audit log (`redactions.jsonl`).
3. Run a regex grep across the host repo for Podium client-secret formats and Stripe-style live keys (the canonical patterns are listed in `references/implementation.md`) to confirm no inline credentials leaked.
4. Set `PODIUM_TRANSCRIPT_INBOX_PATH` to a writable path whose file mode is 0600 (read/write for the owning service user only).
5. Configure the downstream queue backend (Redis Streams, SQS, or SQLite-as-queue) before enabling webhook traffic.
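Checklist step 4 can be checked from code as well as by hand. The following is a minimal sketch, assuming a POSIX host; `ensure_private_inbox` is a hypothetical helper name and is not part of the skill.

```python
import os
import stat


def ensure_private_inbox(path: str) -> int:
    """Create the inbox file if it is missing and force mode 0600.

    Returns the resulting permission bits so the caller can assert on them.
    """
    # O_CREAT without O_TRUNC leaves an existing inbox file untouched.
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    os.close(fd)
    # The mode passed to os.open only applies when the file is created,
    # so set it explicitly to cover pre-existing files too.
    os.chmod(path, 0o600)
    return stat.S_IMODE(os.stat(path).st_mode)
```

A startup check could call `ensure_private_inbox(os.environ["PODIUM_TRANSCRIPT_INBOX_PATH"])` and refuse to serve webhook traffic if the result is not `0o600`.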
## Prerequisites
- Python 3.10+
- `podium-auth` skill installed (consumed for outbound API auth)
- `podium-webhook-reliability` skill installed (consumed for HMAC verification)
- `podium-rate-limit-survival` skill installed (consumed by the fallback poller)
- A durable inbox store — SQLite default; Postgres or DynamoDB are drop-in replacements
- A downstream queue — Redis Streams (default), AWS SQS, or local SQLite-as-queue for dev
- `langdetect` (default) or `fasttext-langdetect` for language detection
- `presidio-analyzer` + `presidio-anonymizer` for high-recall PII, or the bundled regex layer alone
## Instructions
Build in this order. Each section neutralizes one production failure mode.
### 1. Webhook-driven, ack-decoupled ingest
The webhook handler returns 200 fast and does all transcript work asynchronously. The handler's only synchronous job is verify-and-store-raw; everything else happens out-of-band in the processor.
```python
import json
import time

from fastapi import FastAPI, HTTPException, Request
from podium_webhook_reliability import verify_webhook  # consumed by reference
from podium_call_transcript_pipeline import inbox

app = FastAPI()


@app.post("/podium/transcripts")
async def transcript_webhook(request: Request):
    raw = await request.body()
    sig = request.headers.get("podium-signature", "")
    if not verify_webhook(raw, sig):
        raise HTTPException(401, "invalid signature")
    event = json.loads(raw)
    if not event.get("type", "").startswith("call.transcript."):
        return {"status": "ignored"}
    # Durable write happens BEFORE returning 200. If this fails, return 5xx so Podium retries.
    inbox.insert(
        transcript_id=event["data"]["transcript_id"],
        event_type=event["type"],
        received_at=time.time(),
        raw_payload=raw,
    )
    return {"status": "accepted"}
```
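The handler above relies on a durable `inbox.insert`. As a rough illustration of what such a store could look like with the SQLite default, here is a minimal sketch. The table layout, the `open_inbox` helper, and the explicit `conn` argument are assumptions made for this example and are not the skill's actual `inbox` module.

```python
import sqlite3

# Hypothetical schema: one row per (transcript_id, event_type).
SCHEMA = """
CREATE TABLE IF NOT EXISTS inbox (
    transcript_id TEXT NOT NULL,
    event_type    TEXT NOT NULL,
    received_at   REAL NOT NULL,
    raw_payload   BLOB NOT NULL,
    PRIMARY KEY (transcript_id, event_type)
)
"""


def open_inbox(path: str) -> sqlite3.Connection:
    """Open (or create) the inbox database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    conn.commit()
    return conn


def insert(conn: sqlite3.Connection, transcript_id: str, event_type: str,
           received_at: float, raw_payload: bytes) -> bool:
    """Persist a raw event. Returns True for a new row, False for a redelivery."""
    with conn:  # commits on success, rolls back if the statement raises
        cur = conn.execute(
            "INSERT OR IGNORE INTO inbox "
            "(transcript_id, event_type, received_at, raw_payload) "
            "VALUES (?, ?, ?, ?)",
            (transcript_id, event_type, received_at, raw_payload),
        )
    return cur.rowcount == 1
```

With `INSERT OR IGNORE`, a webhook redelivery becomes a no-op instead of an error, so the handler can still return 200. Any other database failure raises, which surfaces as a 5xx and leaves the retry to Podium.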
### 2. Partial-vs-completed de-duplication
Podium emits these event types on a single call:
| Event type | Meaning | Handling |
|---|---|---|
| `call.ended` | Audio capture complete | Note arrival; no transcript yet |
| `call.transcript.partial` | Best-effort transcript while final generates | Store as partial; never promote to final |
| `call.transcript.completed` | Final transcript ready | Promote to final; supersedes any partial |
| `call.transcript.failed` | Transcription failed | Record failure; alert if call duration was material |
The inbox table is keyed on `(transcript_id, event_type)`. A separate `transcripts` table is keyed on `transcript_id` alone. A `completed` event always supersedes a `partial` for the same `transcript_id`. A late-arriving `partial` after `completed` is ignored — the processor checks current status before writing.
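The promotion rules above can be sketched as a small pure function plus an in-memory store. The status names (`"partial"`, `"final"`, `"failed"`), the dict-based store, and the choice to ignore a `failed` event that arrives after a final transcript are assumptions for illustration only.

```python
from typing import Optional

PARTIAL = "call.transcript.partial"
COMPLETED = "call.transcript.completed"
FAILED = "call.transcript.failed"


def next_status(current: Optional[str], event_type: str) -> Optional[str]:
    """Return the status to write, or None when the event must be ignored."""
    if event_type == COMPLETED:
        return "final"  # completed always supersedes a partial
    if event_type == PARTIAL:
        # A late partial must never overwrite a final transcript.
        return None if current == "final" else "partial"
    if event_type == FAILED:
        # Assumption: a failure after a final transcript is ignored.
        return None if current == "final" else "failed"
    return None  # e.g. call.ended carries no transcript


def apply_event(transcripts: dict, transcript_id: str,
                event_type: str, text: str) -> bool:
    """Apply one event to the store. Returns True if a record was written."""
    current = transcripts.get(transcript_id, {}).get("status")
    new_status = next_status(current, event_type)
    if new_status is None:
        return False
    transcripts[transcript_id] = {"status": new_status, "text": text}
    return True
```

Keeping the decision in `next_status` makes the ordering rules testable without a database. A real processor would run the status check and the write inside one transaction.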
### 3. Language detection on ingest
Detect language before redaction (redaction patterns are language-aware downstream). The default policy: English transcripts proceed to the standard RAG queue; non-English transcripts go to a separate queue with a translation step inserted.
```python
from langdetect import DetectorFactory, detect_langs

DetectorFactory.seed = 0  # deterministic detection across runs


def detect_transcript_language(text: str) -> tuple[str, float]:
    if len(text.strip()) < 20:
        return ("und", 0.0)  # too short to detect reliably
    try:
        top = detect_langs(text)[0]
        return (top.lang, top.prob)
    except Exception:
        return ("und", 0.0)
```
Routing rule: English with confidence ≥ 0.85 → `queue:rag.transcripts.en`; confidence < 0.50 → `queue:rag.transcripts.review` (human review); otherwise → per-language queue.
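One possible reading of the routing rule, as a minimal sketch. The per-language queue name pattern `queue:rag.transcripts.lang.<code>` is a hypothetical naming choice. The rule does not say where English with confidence between 0.50 and 0.85 should go, so this sketch applies the "otherwise" branch literally and sends it to the per-language queue.

```python
REVIEW_QUEUE = "queue:rag.transcripts.review"
ENGLISH_QUEUE = "queue:rag.transcripts.en"


def route_queue(lang: str, confidence: float) -> str:
    """Map a (language, confidence) pair to a queue name."""
    if confidence < 0.50:
        # Includes the ("und", 0.0) result for short or undetectable text.
        return REVIEW_QUEUE
    if lang == "en" and confidence >= 0.85:
        return ENGLISH_QUEUE
    # Hypothetical naming pattern for per-language queues.
    return f"queue:rag.transcripts.lang.{lang}"
```

The low-confidence check runs first, so a transcript is never routed by language when the detector itself is unsure.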
### 4. PII redaction on ingest
Redaction is **non-optional** and happens before the transcript is written to the outbound queue. The redaction is auditable: for every redaction the system records the category, the character offsets, and the rule's id. Use a conservative regex layer for the high-precision categories, and presidio/spaCy for the lower-precision, recall-oriented categories (names, addresses).
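To make the audit requirement concrete, here is a minimal sketch of a regex-only layer that returns both the redacted text and one audit entry per redaction. The `AuditEntry` fields, the rule ids, and both patterns are assumptions for illustration. They are not the skill's bundled rules and are not taken from its own `Redaction` dataclass. Offsets refer to the original, unredacted text.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditEntry:  # hypothetical audit record
    category: str
    rule_id: str
    start: int  # offset into the ORIGINAL text
    end: int


# Illustrative rules only: (category, rule id, compiled pattern).
RULES = [
    ("card_number", "regex.card.v1", re.compile(r"\b\d(?:[ -]?\d){12,15}\b")),
    ("phone", "regex.phone_us.v1", re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b")),
]


def redact(text: str) -> tuple[str, list[AuditEntry]]:
    """Replace matches with [CATEGORY] tags and return the audit entries."""
    found = [
        AuditEntry(category, rule_id, m.start(), m.end())
        for category, rule_id, pattern in RULES
        for m in pattern.finditer(text)
    ]
    # Earliest match first; on a tie, prefer the longer span.
    found.sort(key=lambda e: (e.start, -(e.end - e.start)))
    kept, last_end = [], 0
    for entry in found:
        if entry.start >= last_end:  # drop matches that overlap a kept one
            kept.append(entry)
            last_end = entry.end
    parts, pos = [], 0
    for entry in kept:
        parts.append(text[pos:entry.start])
        parts.append(f"[{entry.category.upper()}]")
        pos = entry.end
    parts.append(text[pos:])
    return "".join(parts), kept
```

Each entry can be serialized with `dataclasses.asdict` and appended as one JSON line to the audit log. The entries hold offsets and rule ids only, never the matched value, so the log itself does not become a second copy of the PII.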
```python
import re
from dataclasses import dataclass
@dataclass
class Redaction:
    category: str
    rule_id:
```