sentry-load-scale

Scale Sentry for high-traffic applications handling millions of events per day. Use when optimizing SDK performance at high volume, implementing adaptive sampling, managing quotas and costs at scale, or deploying Sentry across multi-region infrastructure. Trigger with phrases like "sentry high traffic", "scale sentry", "sentry millions events", "sentry high volume", "sentry quota management", "sentry load test".

# Sentry Load & Scale

Configure Sentry for applications processing 1M+ requests/day without sacrificing error visibility, burning through quota, or adding measurable SDK overhead. Covers adaptive sampling, connection pooling, multi-region tagging, quota management, SDK benchmarking, batch submission, load testing, and self-hosted deployment considerations.

## Prerequisites

- Application handling sustained high traffic (>10K requests/min or >1M events/day)
- Sentry organization with quota and billing access (Settings > Subscription)
- `@sentry/node` v8+ installed (`npm ls @sentry/node`)
- Performance baseline established (p50/p95/p99 latency without Sentry)
- Event volume estimates calculated per category (errors, transactions, replays, attachments)
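The latency baseline in the prerequisites can be captured with a small helper. This is a minimal sketch using the nearest-rank percentile method; the function names (`percentile`, `summarize`, `overheadMs`) and the sample values are hypothetical and not part of any Sentry API.

```typescript
// Nearest-rank percentile over latency samples (milliseconds).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('no samples');
  if (p <= 0 || p > 100) throw new Error('p must be in (0, 100]');
  const sorted = [...samples].sort((a, b) => a - b);
  // 1-based rank; multiply before dividing to avoid fractional rounding surprises
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

interface LatencySummary {
  p50: number;
  p95: number;
  p99: number;
}

function summarize(samples: number[]): LatencySummary {
  return {
    p50: percentile(samples, 50),
    p95: percentile(samples, 95),
    p99: percentile(samples, 99),
  };
}

// Difference between a run with the SDK enabled and the baseline without it.
function overheadMs(withoutSdk: LatencySummary, withSdk: LatencySummary): LatencySummary {
  return {
    p50: withSdk.p50 - withoutSdk.p50,
    p95: withSdk.p95 - withoutSdk.p95,
    p99: withSdk.p99 - withoutSdk.p99,
  };
}

// Illustrative samples only; real baselines need far more data points.
const baseline = summarize([12, 15, 11, 40, 13, 14, 90, 12, 13, 16]);
const instrumented = summarize([13, 15, 12, 41, 14, 15, 93, 13, 13, 17]);
console.log(overheadMs(baseline, instrumented));
```

Recording the same three percentiles before and after enabling the SDK gives a concrete number to compare against whatever overhead budget the team has agreed on.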

## Instructions

### Step 1 — Implement Adaptive Sampling

Static `tracesSampleRate` wastes quota at scale because it treats a health check the same as a checkout. Replace it with a traffic-aware `tracesSampler` that adjusts rates based on endpoint criticality and current load.

**Traffic-aware tracesSampler:**

```typescript
import * as Sentry from '@sentry/node';

// Track request volume per endpoint for adaptive rate adjustment
const endpointVolume = new Map<string, { count: number; resetAt: number }>();
const WINDOW_MS = 60_000;

function getAdaptiveRate(name: string, baseRate: number): number {
  const now = Date.now();
  let entry = endpointVolume.get(name);

  if (!entry || now > entry.resetAt) {
    entry = { count: 0, resetAt: now + WINDOW_MS };
    endpointVolume.set(name, entry);
  }
  entry.count++;

  // Scale down sampling as volume increases within window
  // 0-100 req/min: full base rate
  // 100-1000: halve it
  // 1000+: quarter it
  if (entry.count > 1000) return baseRate * 0.25;
  if (entry.count > 100) return baseRate * 0.5;
  return baseRate;
}

Sentry.init({
  dsn: process.env.SENTRY_DSN,

  tracesSampler: (samplingContext) => {
    const { name, parentSampled } = samplingContext;

    // Always respect parent decision for distributed tracing consistency
    if (parentSampled !== undefined) return parentSampled ? 1.0 : 0;

    // Tier 0: Never sample — high-frequency, zero diagnostic value
    if (name?.match(/\/(health|ready|alive|ping|metrics|favicon)/)) return 0;
    if (name?.match(/\.(css|js|png|jpg|svg|woff2?|ico)$/)) return 0;

    // Tier 1: Business-critical paths. Payments and checkout are always sampled;
    // logins are sampled at up to 50%, scaled down adaptively under load
    if (name?.includes('/payment') || name?.includes('/checkout')) return 1.0;
    if (name?.includes('/auth/login')) return getAdaptiveRate('auth', 0.5);

    // Tier 2: Moderate sampling — API mutations (higher signal)
    if (name?.startsWith('POST /api/')) return getAdaptiveRate(name, 0.05);
    if (name?.startsWith('PUT /api/'))  return getAdaptiveRate(name, 0.05);
    if (name?.startsWith('DELETE /api/')) return getAdaptiveRate(name, 0.05);

    // Tier 3: Light sampling — API reads
    if (name?.startsWith('GET /api/')) return getAdaptiveRate(name, 0.02);

    // Tier 4: Background jobs — sample sparingly
    if (name?.startsWith('job:') || name?.startsWith('queue:')) {
      return getAdaptiveRate(name, 0.01);
    }

    // Tier 5: Everything else — minimal baseline
    return getAdaptiveRate(name || 'default', 0.005);
  },
});
```
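The windowed rate logic above is easier to reason about when the clock is injectable. The sketch below restates `getAdaptiveRate` as a factory, `createAdaptiveRate`, which is a hypothetical name, as are the `Clock` type and the `GET /api/items` endpoint. It keeps the thresholds from the original and has no Sentry dependency, so it can run in a unit test.

```typescript
type Clock = () => number;

interface VolumeWindow {
  count: number;
  resetAt: number;
}

// Same thresholds as the original: >100 calls in a window halves the rate, >1000 quarters it.
function createAdaptiveRate(windowMs: number, now: Clock = Date.now) {
  const volume = new Map<string, VolumeWindow>();

  return function adaptiveRate(name: string, baseRate: number): number {
    const t = now();
    let entry = volume.get(name);
    if (!entry || t > entry.resetAt) {
      entry = { count: 0, resetAt: t + windowMs };
      volume.set(name, entry);
    }
    entry.count++;

    if (entry.count > 1000) return baseRate * 0.25;
    if (entry.count > 100) return baseRate * 0.5;
    return baseRate;
  };
}

// Simulate 1001 calls inside one window with a fake clock.
let fakeNow = 0;
const rateFor = createAdaptiveRate(60_000, () => fakeNow);
const observed: number[] = [];
for (let i = 0; i < 1001; i++) {
  observed.push(rateFor('GET /api/items', 0.02));
}

// Advance past the window: the counter resets and the base rate applies again.
fakeNow = 60_001;
const afterReset = rateFor('GET /api/items', 0.02);
console.log(observed[0], observed[100], observed[1000], afterReset);
```

Note that the counter is kept per process, so in a horizontally scaled deployment each instance adapts to its own share of traffic rather than to the global volume.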

**Adaptive error deduplication with `beforeSend`:**

```typescript
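// Throttle duplicate errors while always sending the first occurrence in each window
// (100 identical errors in one window produce 7 events, about a 93% reduction)
const errorCounts = new Map<string, number>();
const ERROR_WINDOW_MS = 60_000;

// Reset counts every window; unref() so this timer never keeps the process alive
setInterval(() => errorCounts.clear(), ERROR_WINDOW_MS).unref();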

Sentry.init({
  dsn: process.env.SENTRY_DSN,

  beforeSend(event, hint) {
    const error = hint?.originalException;
    const key = error instanceof Error
      ? `${error.name}:${error.message?.substring(0, 100)}`
      : `unknown:${String(event.message || '').substring(0, 100)}`;

    const count = (errorCounts.get(key) || 0) + 1;
    errorCounts.set(key, count);

    // First occurrence: always send with full context
    if (count === 1) return event;

    // 2-10: send every 5th (capture ramp-up pattern)
    if (count <= 10) return count % 5 === 0 ? event : null;

    // 11-100: send every 25th (confirm still happening)
    if (count <= 100) return count % 25 === 0 ? event : null;

    // 101+: send every 100th (volume indicator only)
    return count % 100 === 0 ? event : null;
  },
});
```
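To see how much the throttling schedule above actually reduces volume, the decision can be pulled out into a pure function and counted. This is a sketch only: `shouldSend` and `forwardedCount` are hypothetical helper names that mirror the thresholds in the `beforeSend` hook.

```typescript
// Mirrors the beforeSend schedule: 1st always, then every 5th up to 10,
// every 25th up to 100, and every 100th after that.
function shouldSend(count: number): boolean {
  if (count === 1) return true;
  if (count <= 10) return count % 5 === 0;
  if (count <= 100) return count % 25 === 0;
  return count % 100 === 0;
}

// Number of events forwarded when the same error fires `duplicates` times in one window.
function forwardedCount(duplicates: number): number {
  let sent = 0;
  for (let n = 1; n <= duplicates; n++) {
    if (shouldSend(n)) sent++;
  }
  return sent;
}

for (const n of [10, 100, 1000]) {
  const sent = forwardedCount(n);
  const reduction = (100 * (1 - sent / n)).toFixed(1);
  console.log(`${n} duplicates -> ${sent} sent (${reduction}% reduction)`);
}
```

Under this schedule 10 duplicates yield 3 events, 100 yield 7, and 1,000 yield 16, so the reduction only passes 90% once an error repeats a few dozen times within the window.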

### Step 2 — Optimize SDK for Minimal Overhead

At high throughput, every byte and every millisecond of SDK processing matters. This configuration reduces memory footprint, payload size, and CPU time.

**Lean SDK initialization:**

```typescript
import * as Sentry from '@sentry/node';
import os from 'node:os';

Sentry.init({
  dsn: process.env.SENTRY_DSN,
  environment: process.env.NODE_ENV || 'production',
  release: `${process.env.SERVICE_NAME}@${process.env.VERSION || 'unknown'}`,

  // --- Memory reduction ---
  maxBreadcrumbs: 15,          // Down from 100 default; saves ~85KB/scope
  maxValueLength: 200,         // Truncate long string values

  // --- Disable high-overhead integrations ---
  integrations: (defaults) => defaults.filter(i =>
    !['Console', 'ContextLines'].includes(i.name)
  ),

  // --- No profiling at high scale (use dedicated APM if needed) ---
  profilesSampleRate: 0,

  // --- Transport tuning for high-throughput ---
  transportOptions: {
    bufferSize: 100,           // Default 64; absorbs traffic spikes
  },

  // --- Context size limiter ---
  beforeSend(event) {
    // Truncate oversized contexts to prevent payload bloat
    if (event.contexts) {
      for (const [key, ctx] of Object.entries(event.contexts)) {
        const str = JSON.stringify(ctx);
        if (str.length > 2000) {
          event.contexts[key] = { _truncated: true, originalSize: str.length };
        }
      }
    }

    // Strip headers that add bulk without diagnostic value
    if (event.request?.headers) {
      const keep = ['content-type', 'accept', 'user-agent', 'x-request-id'];
      event.request.headers = Object.fromEntries(
        Object.entries(event.request.headers)
          .filter(([k]) => keep.includes(k.toLowerCase()))
      );
    }

    return event;
  },

  // --- Multi-region tags for infrastructure visibility ---
  serverName: process.env.HOSTNAME || process.env.POD_NAME || os.hostname(),
  initialScope: {
    tags: {
      region: process.env.AWS_REGION || process.env.GCP_REGION || 'unknown',
      cluster: process.env.K8S_CLUSTER || 'default',
      pod: process.env.POD_NAME || 'unknown',
      service: process.env.SERVICE_NAME || 'unknown',
    },
  },
});
```
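The payload-trimming rules inside `beforeSend` can also be written as pure functions, which makes them testable without initializing the SDK. The following is a sketch under that assumption; `trimContexts`, `filterHeaders`, and the simplified `Contexts` and `Headers` types are hypothetical stand-ins, not Sentry types.

```typescript
type Contexts = Record<string, Record<string, unknown>>;
type Headers = Record<string, string>;

const MAX_CONTEXT_CHARS = 2000;
const KEEP_HEADERS = ['content-type', 'accept', 'user-agent', 'x-request-id'];

// Replace any context whose JSON form exceeds the limit with a small marker object.
function trimContexts(contexts: Contexts, maxChars: number = MAX_CONTEXT_CHARS): Contexts {
  const out: Contexts = {};
  for (const [key, ctx] of Object.entries(contexts)) {
    const size = JSON.stringify(ctx).length;
    out[key] = size > maxChars ? { _truncated: true, originalSize: size } : ctx;
  }
  return out;
}

// Keep only allow-listed headers, matching names case-insensitively.
function filterHeaders(headers: Headers, keep: string[] = KEEP_HEADERS): Headers {
  return Object.fromEntries(
    Object.entries(headers).filter(([name]) => keep.includes(name.toLowerCase())),
  );
}

const trimmed = trimContexts({
  runtime: { name: 'node' },
  cart: { blob: 'x'.repeat(3000) },
});
const kept = filterHeaders({
  'Content-Type': 'application/json',
  Cookie: 'session=abc',
  'X-Request-Id': 'req-123',
});
console.log(trimmed, kept);
```

Unlike the in-place mutation in the hook, these helpers return new objects, so a `beforeSend` that uses them would assign the results back onto the event before returning it.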

**Graceful shutdown ensuring event delivery:**

```typescript
import * as Sentry from '@sentry/node';

async function shutdown(signal: string) {
  console.log(`${signal} received — flushing Sentry events`);

  // Stop accepting new connections; `server` is assumed to be the app's
  // http.Server instance, created elsewhere
  server.close();

  // Flush all pending events (2s timeout prevents hanging deploys)
  const flushed = await Sentry.close(2000);
  if (!flushed) {
    console.warn('Sentry flush timed out — some events may be lost');
  }

  process.exit(0);
}

process.on('SIGTERM', () => shutdown('SIGTERM'));
process.on('SIGINT',  () => shutdown('SIGINT'));
```

### Step 3 — Manage Quotas, Test Under Load, and Plan for Scale
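The quota arithmetic for this step can be scripted so it is easy to rerun as traffic changes. This is a minimal sketch: the `Tier` shape and function names are hypothetical, the traffic figures are the illustrative ones from the worked example in this step, and the included volume and per-event price are the values quoted there, which should be checked against current Sentry pricing.

```typescript
interface Tier {
  name: string;
  requestsPerDay: number;
  sampleRate: number; // 0..1
}

// Sampled transactions per day across all tiers.
function sampledPerDay(tiers: Tier[]): number {
  return tiers.reduce((sum, t) => sum + t.requestsPerDay * t.sampleRate, 0);
}

// Cost of events beyond the included volume; never negative.
function monthlyOverageCost(eventsPerMonth: number, included: number, pricePerEvent: number): number {
  return Math.max(0, eventsPerMonth - included) * pricePerEvent;
}

const tiers: Tier[] = [
  { name: 'health/static', requestsPerDay: 4_000_000, sampleRate: 0 },
  { name: 'payment (T1)', requestsPerDay: 5_000, sampleRate: 1 },
  { name: 'POST API (T2)', requestsPerDay: 500_000, sampleRate: 0.05 },
  { name: 'GET API (T3)', requestsPerDay: 5_000_000, sampleRate: 0.02 },
  { name: 'other (T5)', requestsPerDay: 500_000, sampleRate: 0.005 },
];

const perDay = sampledPerDay(tiers);
const perMonth = perDay * 30; // assumes a 30-day month
const overage = monthlyOverageCost(perMonth, 100_000, 0.000025);

console.log(`${Math.round(perDay)} transactions/day, ${Math.round(perMonth)}/month`);
console.log(`estimated overage: $${overage.toFixed(2)}/month`);
```

With these inputs the script gives roughly 132,500 sampled transactions per day and an overage of about $97 per month, in line with the rounded figures in the worked example.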

**Quota management and reserved volume pricing:**

```
Application: 10M requests/day, 0.1% error rate, @sentry/node v8

Error events (with adaptive beforeSend):
  Raw errors:     10M x 0.001 = 10,000/day
  After dedup:    ~1,000/day (90% reduction)        = 30K/month

Transaction events (with tiered tracesSampler):
  Health/static:  0% of 4M    = 0
  Payment (T1):   100% of 5K  = 5,000/day
  POST API (T2):  5% of 500K  = 25,000/day
  GET API (T3):   2% of 5M    = 100,000/day
  Other (T5):     0.5% of 500K = 2,500/day
  Total:                        ~132K/day            = 4M/month

Sentry Business plan ($26/mo base):
  Errors:       30K included in base plan
  Transactions: 100K included, overage 3.9M x $0.000025 = ~$97/mo
  Estimated total: ~$123/
```

Source: https://github.com/jeremylongshore/claude-code-plugins-plus-skills/tree/main/plugins/saas-packs/sentry-pack/skills/sentry-load-scale
