podium-rate-limit-survival

Survive the rate-limit failure modes that crater production Podium integrations — cascading 429s that burn the daily quota by lunch, ignored `Retry-After` hints, silent daily-quota breaches, per-endpoint budget exhaustion, end-of-day review-request bursts, and webhook-driven outbound amplification. Use when building the outbound API layer, instrumenting quota monitoring, smoothing end-of-day review-request bursts, or recovering from a 429 cascade. Trigger with "podium rate limit", "podium 429", "podium token bucket", "podium quota monitor", "podium burst smoothing", "podium retry-after".

Backend & APIs, podium, rate-limits, token-bucket, quota-monitoring, resilience, burst-control, scripts

# Podium Rate Limit Survival

## Overview

Make the outbound side of a Podium integration survive a real production day. This is not a "just retry on 429" walkthrough — it is the rate-limiting code your integration runs when Shopify ships 80 orders at 5pm AEST and KombiLife fires 80 review-request POSTs in 30 seconds, when an inbound webhook burst fans out 5x outbound, and when a junior engineer's naive retry loop has already eaten 92% of the daily quota by 10:30am.

The six production failures this skill prevents:

1. **Cascading 429s burn the whole day** — a naive `while status == 429: retry` loop stampedes the per-minute window for the rest of the minute, then the next minute, etc. By 11am you've consumed the 24-hour quota and every endpoint is hard-down until UTC midnight.
2. **`Retry-After` header ignored** — clients that retry on a fixed delay (or worse, no delay) miss Podium's server-side hint and hit the same rate wall again. The header supports both integer seconds and HTTP-date form; many clients parse one and crash on the other.
3. **No daily-quota monitor** — the 24-hour envelope quota is silent until you breach it. Operations discover the wall on a Friday afternoon when review-request automation collapses and the on-call has no leading indicator.
4. **No per-endpoint isolation** — the `conversations.write` endpoint blows its budget on a chatty inbound webhook; `contacts.read` also fails because the client treats the API as a single bucket. One endpoint family taking down siblings is a multiplier on every other failure mode.
5. **End-of-day burst overflow** — Shopify orders ship in a 5pm cluster, KombiLife fires ~80 review-request POSTs in 30 seconds, the per-minute ceiling rejects half. The integration "works" 23 hours a day and silently drops 30-50% of review requests during the only hour that matters commercially.
6. **Webhook-driven amplification** — one inbound webhook triggers 5 outbound API calls; 100 inbound webhooks in a burst = 500 outbound = quota collapse. The amplification factor is invisible until the cascade fires.

## Authentication

This skill **does not** mint, refresh, or hold Podium credentials — those concerns live in the sibling `podium-auth` skill. Every wrapped HTTP call in this skill calls `auth.get_token()` immediately after the bucket releases, where `auth` is a `PodiumAuth` instance constructed by the consumer per the [podium-auth SKILL.md](../podium-auth/SKILL.md) instructions (OAuth2 refresh-token grant against `https://accounts.podium.com/oauth/token`). The bearer token is passed in the `Authorization: Bearer {token}` header on every `api.podium.com` request. If `auth.get_token()` raises, this skill propagates the auth error to the caller without retry — auth recovery is `podium-auth`'s responsibility, not this skill's.

## Prerequisites

- A working `podium-auth` integration (this skill assumes a `PodiumAuth` instance is available — see the [podium-auth skill](../podium-auth/SKILL.md) in this pack)
- Python 3.10+ with `asyncio` (the patterns translate to Node.js; see references/implementation.md)
- A token-bucket library — `aiolimiter` recommended, or hand-rolled on `asyncio.sleep`
- A daily-quota counter store — Redis preferred (atomic INCR + TTL), local SQLite acceptable for single-process integrations
- Knowledge of which Podium endpoint families your integration hits (conversations, contacts, reviews, locations, webhooks) — bucket isolation is per-family
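
For the single-process SQLite option, the counter store could look roughly like the sketch below. It is illustrative only: the table name `daily_calls`, the helpers `open_counter` and `record_call`, and the choice to key the counter by UTC calendar day are assumptions, not part of Podium's API or of this skill's shipped scripts. The daily quota figure is not stated here, so the caller compares the returned count against whatever limit applies to their account.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Optional


def open_counter(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the counter database. Hypothetical schema: one row per UTC day."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_calls ("
        "day TEXT PRIMARY KEY, n INTEGER NOT NULL)"
    )
    return conn


def record_call(conn: sqlite3.Connection, now: Optional[datetime] = None) -> int:
    """Increment today's counter and return the new total for the UTC day."""
    now = now or datetime.now(timezone.utc)
    day = now.strftime("%Y-%m-%d")
    with conn:  # commits on success, rolls back on error
        # UPSERT needs SQLite 3.24+, which ships with current Python builds.
        conn.execute(
            "INSERT INTO daily_calls (day, n) VALUES (?, 1) "
            "ON CONFLICT(day) DO UPDATE SET n = n + 1",
            (day,),
        )
        (n,) = conn.execute(
            "SELECT n FROM daily_calls WHERE day = ?", (day,)
        ).fetchone()
    return n


if __name__ == "__main__":
    conn = open_counter()
    for _ in range(3):
        used = record_call(conn)
    print(f"calls recorded for the current UTC day: {used}")
```

Calling `record_call` once per outbound request gives a running total that an alerting job can compare against a threshold. A multi-process deployment would need the Redis variant instead, since this sketch assumes a single writer.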

## Instructions

Build in this order. Each section neutralizes one of the six production failures.

### 1. Token-bucket rate limiter (neutralizes cascading 429s)

The Podium API's documented ceiling is 60 requests per minute per OAuth app. Treat it as a hard ceiling and stay under it by construction, never by reacting to 429s. Hand the hot path a token-bucket gate that paces requests at the documented rate; concurrent callers serialize on the bucket, so a retry storm cannot form.

```python
import asyncio
import time

class TokenBucket:
    """Async token-bucket limiter. Pace = rate tokens per second, max burst = capacity."""

    def __init__(self, rate_per_minute: int, capacity: int):
        self.rate_per_sec = rate_per_minute / 60.0
        self.capacity = capacity
        self._tokens = float(capacity)
        self._last_refill = time.monotonic()
        self._lock = asyncio.Lock()

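    async def acquire(self, tokens: float = 1.0) -> None:
        # A request larger than the bucket can never be satisfied; fail fast instead of looping forever.
        if tokens > self.capacity:
            raise ValueError(f"requested {tokens} tokens but capacity is {self.capacity}")
        while True: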
            async with self._lock:
                self._refill()
                if self._tokens >= tokens:
                    self._tokens -= tokens
                    return
                deficit = tokens - self._tokens
                wait_s = deficit / self.rate_per_sec
            # Sleep OUTSIDE the lock so other callers can refill-and-check in parallel
            await asyncio.sleep(wait_s)

    def _refill(self) -> None:
        now = time.monotonic()
        elapsed = now - self._last_refill
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate_per_sec)
        self._last_refill = now
```

Wire it into the outbound HTTP path:

```python
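import httpx

# `auth` is the PodiumAuth instance from the podium-auth skill (see Authentication above).
PODIUM_LIMIT_PER_MIN = 60          # documented ceiling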
PODIUM_BURST_CAPACITY = 10         # conservative burst headroom; tune per endpoint

bucket = TokenBucket(rate_per_minute=PODIUM_LIMIT_PER_MIN, capacity=PODIUM_BURST_CAPACITY)

async def podium_call(method: str, path: str, **kwargs) -> httpx.Response:
    await bucket.acquire()
    token = await auth.get_token()
    async with httpx.AsyncClient(timeout=10) as c:
        return await c.request(
            method,
            f"https://api.podium.com{path}",
            headers={"Authorization": f"Bearer {token}"},
            **kwargs,
        )
```

The bucket converts what would be a 429 cascade into queueing delay. Latency goes up during the burst, but **requests are delayed instead of rejected**, so the 429s caused by your own bursts largely disappear.
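
To put numbers on that queueing, the sketch below models when each request in a simultaneous burst would clear a bucket that starts full. It is a simplified model, not the limiter itself: `burst_release_times` is a hypothetical helper, and it assumes every request arrives at t=0 and no other traffic shares the bucket.

```python
def burst_release_times(n_requests: int, rate_per_minute: int, capacity: int) -> list[float]:
    """Seconds after burst start at which each request clears a bucket that began full."""
    rate_per_sec = rate_per_minute / 60.0
    times = []
    for i in range(n_requests):
        if i < capacity:
            # Served immediately from the stored burst tokens.
            times.append(0.0)
        else:
            # Each later request waits for one more token to refill.
            times.append((i - capacity + 1) / rate_per_sec)
    return times


# The 80-request end-of-day burst against the settings used above.
times = burst_release_times(80, rate_per_minute=60, capacity=10)
print(f"last request released after {times[-1]:.0f}s")
print(f"requests released in the first 60s: {sum(t < 60 for t in times)}")
```

Under these assumptions the last of the 80 requests leaves after 70 seconds, and 69 requests leave within the first 60 seconds, because the burst capacity adds to the steady refill. How Podium windows its counter is not stated here; if it counts a fixed 60-second window, a rate slightly below the ceiling or a smaller capacity keeps the first minute under 60.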

### 2. `Retry-After` parsing for the residual 429s (neutralizes ignored hints)

Even with a bucket, residual 429s still happen: clock drift between your process and Podium's edge, multiple processes sharing a quota, an inbound webhook fan-out that the bucket sees but the server already counted. When a 429 does occur, Podium returns a `Retry-After` header. Honor it. Support both forms:

- `Retry-After: 30` — integer seconds to wait
- `Retry-After: Wed, 21 Oct 2026 07:28:00 GMT` — HTTP-date (RFC 7231)

```python
from email.utils import parsedate_to_datetime
from datetime import datetime, timezone

def parse_retry_after(header_value: str) -> float:
    """Return seconds to wait. Supports int-seconds and HTTP-date forms."""
    header_value = header_value.strip()
    # Try integer seconds first — most common form Podium returns
    try:
        seconds = int(header_value)
        return max(0.0, float(seconds))
    except ValueError:
        pass
    # HTTP-date form — RFC 7231
    try:
        retry_at = parsedate_to_datetime(header_value)
        if retry_at.tzinfo is None:
            retry_at = retry_at.replace(tzinfo=timezone.utc)
        delta = (retry_at - datetime.now(timezone.utc)).total_seconds()
        return max(0.0, delta)
    except (TypeError, ValueError):
        # Malformed header — fall back to a safe default rather than crash
        return 60.0
```
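
A quick way to see why the two branches are needed is to run the stdlib calls the parser relies on against the HTTP-date example. The sketch below uses only the standard library; `fixed_now` is a made-up reference time standing in for the current clock so the result is reproducible.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

header = "Wed, 21 Oct 2026 07:28:00 GMT"

# int() rejects the HTTP-date form, which is what routes it to the date branch.
try:
    int(header)
    is_integer_form = True
except ValueError:
    is_integer_form = False

# parsedate_to_datetime returns a timezone-aware datetime for a "GMT" date.
retry_at = parsedate_to_datetime(header)

# Hypothetical current time, 30 seconds before the header's timestamp.
fixed_now = datetime(2026, 10, 21, 7, 27, 30, tzinfo=timezone.utc)
wait_s = max(0.0, (retry_at - fixed_now).total_seconds())

print(is_integer_form, retry_at.isoformat(), wait_s)
```

A client that only calls `int()` crashes on this header, and one that only parses dates fails on `Retry-After: 30`, which is the failure mode described in the overview.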

Wire it into the retry wrapper:

```python
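class PodiumRateLimitError(Exception):
    """Raised when a 429 persists after all retry attempts."""

# `_raw_call` is assumed to be the un-gated request from section 1
# (podium_call without the bucket.acquire() step).
async def podium_call_with_retry(method: str, path: str, max_attempts: int = 4, **kwargs):
    for attempt in range(1, max_attempts + 1):
        await bucket.acquire()
        r = await _raw_call(method, path, **kwargs)
        if r.status_code != 429:
            return r
        if attempt == max_attempts:
            break  # no point sleeping before giving up
        wait_s = parse_retry_after(r.headers.get("Retry-After", "60"))
        # Cap the wait so a misconfigured server can't pin us indefinitely
        wait_s = min(wait_s, 120.0)
        await asyncio.sleep(wait_s)
    raise PodiumRateLimitError(f"429 persisted after {max_attempts} attempts on {path}")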
```

Two things make this correct: parse both 
