defensive-coding

This skill MUST be used before writing any implementation code: feature work, bug fixes, pipeline stages, data processing, API handlers, K8s manifests, or integration code. It enforces fail-loud patterns, input/output validation, connection verification, and pre-commit gates. Triggered automatically on any code-writing task. Also use it when the user says "defensive", "fail-fast", "validate", "check failures", or "harden".

# Defensive Coding

Stop writing only the happy path. Every piece of code must answer: **"What happens when this fails?"**

## The Rule

**Before writing any code, complete the Defensive Checklist for that code.** Do not skip items — each exists because a real production incident was caused by its absence.

## Defensive Checklist

Run through these checks BEFORE writing implementation code. For each check, either address it in the code or explicitly note why it doesn't apply.

### 1. Input Validation

Every function that receives external data must validate it before processing.

- **Numeric data**: Check for NaN, Infinity, negative values where only positive values are expected, and zero where a division follows
- **String data**: Check for empty strings, unexpected formats, encoding issues
- **API responses**: Check HTTP status, check response body is non-empty, check expected fields exist
- **Database results**: Check row counts, check for NULL in required fields
- **File/stream data**: Check file exists, is non-empty, has expected format/headers

```rust
// BAD — trusts API response blindly
let price = response.price;
// GOOD — validates before use
let price = response.price;
if price <= 0.0 || price.is_nan() || price.is_infinite() {
    return Err(anyhow!("invalid price {} for {}", price, ticker));
}
```
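
The API-response bullet above can be sketched in the same spirit. This is a minimal illustration under stated assumptions, not a prescribed implementation: `ApiResponse`, `validate_response`, and the flat `key=value` body format are hypothetical names and shapes invented for the example, and a real handler would parse its own wire format.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified view of an HTTP response (assumption for this sketch).
pub struct ApiResponse {
    pub status: u16,
    pub body: String,
}

/// Checks status, body, and required fields before any value is used.
/// Assumes a flat `key=value` body, one pair per line, purely for illustration.
pub fn validate_response(
    resp: &ApiResponse,
    required: &[&str],
) -> Result<HashMap<String, String>, String> {
    // A non-2xx status is a failure regardless of what the body says.
    if !(200..300).contains(&resp.status) {
        return Err(format!("unexpected HTTP status {}", resp.status));
    }
    // A 2xx status with an empty body is still a failure.
    if resp.body.trim().is_empty() {
        return Err("response body is empty".to_string());
    }
    let mut fields = HashMap::new();
    for line in resp.body.lines() {
        let (key, value) = line
            .split_once('=')
            .ok_or_else(|| format!("malformed line in body: {line:?}"))?;
        fields.insert(key.trim().to_string(), value.trim().to_string());
    }
    // Every required field must be present and non-empty.
    for &name in required {
        match fields.get(name) {
            Some(v) if !v.is_empty() => {}
            _ => return Err(format!("required field {name:?} is missing or empty")),
        }
    }
    Ok(fields)
}
```

Returning the parsed fields only on success means callers cannot reach a value that skipped validation.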

### 2. Output Assertions

After producing output (writing files, publishing messages, inserting rows), verify the result.

- **Parquet/CSV files**: Check row count > 0, check key columns have no NULLs, spot-check value ranges
- **Database writes**: Verify affected row count matches expectation
- **NATS/message publishing**: Confirm publish acknowledgment
- **API responses**: Validate response shape before returning to caller
- **GCS uploads**: Verify object exists after upload

```typescript
// BAD — writes and moves on
await writeParquet(rows, path);
// GOOD — verifies what was written
await writeParquet(rows, path);
const stats = await readParquetMetadata(path);
if (stats.rowCount === 0) throw new Error(`Empty parquet at ${path}`);
if (stats.rowCount !== rows.length) throw new Error(`Row count mismatch: wrote ${rows.length}, file has ${stats.rowCount}`);
```
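
The same write-then-verify idea can be sketched in Rust with only the standard library. The function name `write_and_verify` and the one-row-per-line text format are assumptions for illustration; a Parquet or database writer would read back its own metadata instead.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Writes `rows` as lines to `path`, then reads the file back and checks the row count.
/// Hypothetical helper for illustration; assumes one row per line.
pub fn write_and_verify(rows: &[String], path: &Path) -> io::Result<usize> {
    // Refuse to produce an empty output file in the first place.
    if rows.is_empty() {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "refusing to write zero rows",
        ));
    }
    let mut contents = rows.join("\n");
    contents.push('\n');
    fs::write(path, contents)?;

    // Read back what actually landed on disk instead of trusting the write call.
    let written = fs::read_to_string(path)?;
    let count = written.lines().count();
    if count != rows.len() {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            format!("row count mismatch: wrote {}, file has {}", rows.len(), count),
        ));
    }
    Ok(count)
}
```

A row that contains an embedded newline also trips the count check, which surfaces a formatting bug at write time rather than in a downstream reader.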

### 3. Fail-Loud (No Silent Failures)

Never swallow errors. Never log-and-continue on critical paths. If something fails, make it visible.

- **No empty catch blocks** — every catch must either re-throw, return an error, or crash
- **No `try { } catch { log.warn(...) }`** on data paths — if data is missing, that's an error, not a warning
- **No `unwrap_or_default()`** on critical data — a default value hides the bug
- **No `if let Some(x)` without handling `None`** when None means broken state
- **HTTP fetches**: A 200 with empty/garbage body is still a failure — check the content

```rust
// BAD — silently returns empty on failure
let injuries = fetch_injuries().unwrap_or_default();
// GOOD — fails loudly
let injuries = fetch_injuries()
    .map_err(|e| anyhow!("injury fetch failed: {e}"))?;
if injuries.is_empty() {
    return Err(anyhow!("injury API returned empty response — expected data for today's games"));
}
```
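
For the `if let Some(x)` bullet, one possible shape is `let ... else`, which forces the `None` branch to be written out. The `Quote` struct and `require_last_price` function below are hypothetical and exist only to illustrate the pattern.

```rust
/// Hypothetical quote record; the field names are illustrative only.
pub struct Quote {
    pub ticker: String,
    pub last_price: Option<f64>,
}

/// Returns the last price, or an error naming the ticker when it is absent or invalid.
pub fn require_last_price(quote: &Quote) -> Result<f64, String> {
    // `let ... else` makes the missing case explicit instead of silently skipping it.
    let Some(price) = quote.last_price else {
        return Err(format!("no last price for {}", quote.ticker));
    };
    // A present value can still be garbage, so validate it as well.
    if !price.is_finite() || price <= 0.0 {
        return Err(format!("invalid last price {price} for {}", quote.ticker));
    }
    Ok(price)
}
```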

### 4. Connection & Configuration Verification

On startup or first use, verify you're connected to the right thing.

- **NATS streams**: After connecting, verify stream name and subjects match expectations. Log the stream config at startup.
- **Database**: Verify schema version or expected tables exist on connect
- **Redis**: Verify connectivity AND that expected key patterns are accessible
- **API endpoints**: Make a health check or test request on startup
- **WebSocket**: After connect, verify subscription acknowledgment for expected channels
- **K8s ConfigMaps/env vars**: Validate required env vars are set and non-empty at startup, not at first use

```rust
// BAD — connects and hopes
let stream = nats.subscribe("prod.kalshi.*.json.ticker.>").await?;
// GOOD — verifies the subscription is on the right stream
let stream_info = js.stream_info("PROD_KALSHI_CRYPTO").await?;
info!("connected to stream {} with {} messages, subjects: {:?}",
    stream_info.config.name, stream_info.state.messages, stream_info.config.subjects);
if !stream_info.config.subjects.iter().any(|s| s.contains("ticker")) {
    return Err(anyhow!("stream {} has no ticker subjects — wrong stream?", stream_info.config.name));
}
```
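
The env-var bullet can be sketched along these lines. The function `load_required_env` and the variable names in the usage note are assumptions for illustration. The lookup is injected as a closure so the check can be exercised without touching the real process environment.

```rust
use std::collections::HashMap;

/// Collects every required variable through `lookup`, or reports all that are
/// missing or empty in a single error. Hypothetical helper for illustration.
pub fn load_required_env<F>(
    names: &[&str],
    lookup: F,
) -> Result<HashMap<String, String>, String>
where
    F: Fn(&str) -> Option<String>,
{
    let mut values = HashMap::new();
    let mut problems = Vec::new();
    for &name in names {
        match lookup(name) {
            Some(v) if !v.trim().is_empty() => {
                values.insert(name.to_string(), v);
            }
            // Set-but-empty is treated as a failure, not as a usable value.
            Some(_) => problems.push(format!("{name} is set but empty")),
            None => problems.push(format!("{name} is not set")),
        }
    }
    if problems.is_empty() {
        Ok(values)
    } else {
        // Report every problem at once so one restart fixes them all.
        Err(format!("invalid configuration: {}", problems.join("; ")))
    }
}
```

At startup this might be called as `load_required_env(&["NATS_URL", "DATABASE_URL"], |k| std::env::var(k).ok())?`, where the two variable names are placeholders for whatever the service actually requires.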

### 5. Pre-Commit Gates

Before committing code, verify it builds and passes tests locally.

- **Always run `make all`** (or the relevant build command) before committing
- **Run the specific test file** for code you changed
- **Check for compiler and linter warnings** with `cargo clippy`, `deno lint`, etc.
- **If the project has CI, mirror it locally** — don't push and hope
- **Check for `is_multiple_of`**: it is unstable on the toolchain in the CI Docker images, so use `% N == 0` instead
- **Check image tag format** — verify against `.github/workflows/build-*.yaml` trigger patterns

```bash
# Before every commit
make all           # or: cargo clippy && cargo test && deno lint && deno test
```

### 6. Boundary Assumptions

Document and verify assumptions at system boundaries.

- **Time zones**: Explicitly convert and label (UTC vs EST vs local). Kalshi uses EST for ticker naming.
- **Units**: Price in cents vs dollars? Quantity in contracts vs lots? Document at the boundary.
- **Encoding**: UTF-8? JSON? Cap'n Proto? Verify format at deserialization, not downstream.
- **Ordering**: Don't assume messages arrive in order unless the transport guarantees it.
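
Two of the boundary checks above, units and ordering, can be encoded in types. The sketch below is one possible approach, not something the checklist prescribes: `Cents` and `SequenceGuard` are hypothetical names, and it assumes each message carries a sequence number that increases by one, which the transport may or may not provide.

```rust
/// A price in whole cents. The newtype keeps cents and dollars from being mixed silently.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Cents(pub i64);

impl Cents {
    /// Converts a dollar amount received at the boundary into cents, rejecting bad input.
    pub fn from_dollars(dollars: f64) -> Result<Cents, String> {
        if !dollars.is_finite() || dollars < 0.0 {
            return Err(format!("invalid dollar amount {dollars}"));
        }
        // Round to the nearest cent to absorb floating-point representation error.
        Ok(Cents((dollars * 100.0).round() as i64))
    }
}

/// Tracks the last sequence number seen and rejects gaps or reordering.
/// Assumes sequence numbers increase by exactly one per message.
pub struct SequenceGuard {
    last: Option<u64>,
}

impl SequenceGuard {
    pub fn new() -> Self {
        SequenceGuard { last: None }
    }

    /// Accepts `seq` only if it directly follows the previous one.
    pub fn accept(&mut self, seq: u64) -> Result<(), String> {
        if let Some(prev) = self.last {
            if seq != prev + 1 {
                return Err(format!(
                    "out-of-order message: expected {}, got {seq}",
                    prev + 1
                ));
            }
        }
        self.last = Some(seq);
        Ok(())
    }
}
```

Doing the conversion once at the boundary means downstream code only ever sees `Cents`, so the unit question does not have to be re-asked at every call site.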

## When to Apply Each Check

| Writing... | Must Apply |
|-----------|-----------|
| Data processing / ETL | Input validation, Output assertions, Fail-loud |
| API handler | Input validation, Fail-loud, Boundary assumptions |
| Connector / subscriber | Connection verification, Fail-loud |
| Pipeline stage / CronJob | All six checks |
| K8s manifest / deployment | Connection verification, Pre-commit gates |
| Library / shared code | Input validation, Fail-loud, Boundary assumptions |
| Tests | Input validation (test data), Output assertions |

## Anti-Patterns to Reject

Reject these patterns in code review and never write them:

| Anti-Pattern | Why It's Dangerous | Write Instead |
|-------------|-------------------|---------------|
| `catch (e) { log.warn(e) }` | Hides failure, process continues with bad state | Rethrow with context, e.g. `catch (e) { throw new Error("fetch failed", { cause: e }) }`, or crash |
| `.unwrap_or_default()` on data | Produces empty/zero instead of surfacing the bug | `.map_err(\|e\| ...)?` with context |
| `if (data) { process(data) }` (no else) | Silently skips when data is missing | Add `else { throw }` |
| Writing output without checking it | Corrupt/empty files go undetected | Read back and validate |
| Connecting without verifying target | Wrong stream/DB/endpoint for days | Health check + log target on startup |
| Committing without building | CI catches what you could have caught in 30 seconds | `make all` first |

## Additional Resources

### Reference Files

For real production incidents that motivated each check:
- **`references/failure-catalog.md`** — Catalog of real failures from this project, mapped to which checklist item would have caught them
