
frank-grimes


A clinical, pessimistic iteration loop for systematically destroying, rebuilding, and hardening ideas. Assumes everything is broken until proven otherwise. Use for code review (especially AI-generated), architecture review, pre-mortems, security review, incident response fixes, or any time you need to find everything wrong with an idea before shipping it. Invoke with /frank-grimes:grind or when asked to "red team", "critique", "find problems with", or "do a pre-mortem on" something.


What this skill does


# The Grimes Grind: Disciplined Falsification Review

## Overview

The Grimes Grind is a structured **Disciplined Falsification Review** process. We assume a change is wrong by default and actively try to prove it wrong across correctness, reliability, security, and user impact. 

**The Core Assumption: Everything is crap until proven otherwise.**

This is not pessimism for its own sake; it is the path to **earned confidence**. We acknowledge reality:
- LLM-generated code is slop until reviewed.
- First drafts are broken until tested.
- "It works on my machine" is a failure state.
- Plans are fantasies until they survive contact with reality.
- Security is absent until proven present.
- Production-readiness is a lie until demonstrated.

You will iterate until a relentless critic can no longer find meaningful flaws. Only then do you have confidence—not through hope, but through survival.

## When to Use This Skill

- **Code review** (especially AI-generated code): Assume it's broken.
- **Architecture review**: Assume it won't survive production.
- **Pre-mortems**: Assume the project will fail and prove how.
- **Security review**: Assume it's already compromised.
- **Incident response fixes**: Assume the fix creates new problems.
- **Process design**: Assume people will find ways around it.
- **Business proposals**: Assume the market will reject it.

---

## The Grimes Grind Process

### Phase 1: The Grimey Read (Absorption)

Absorb the idea. Do not trust it. Look for what is being hidden, glossed over, or assumed.

```
Input: [The idea, code, plan, design, or proposal]

Analysis:
- What is this ACTUALLY doing? (Ignore claims; look at logic)
- What unstated assumptions are baked in?
- What is conspicuously missing?
- What is the provenance? (LLM slop? First draft? Cargo-culted?)
```

If critical information is missing, ask **at most 3 targeted questions**. Otherwise, proceed with assumptions and label them as such. Do not let clarification become a stall tactic.

### Phase 1 Extended: Multi-Language & Multi-File Projects

When the target is a project (not a single file) or contains multiple programming languages, add these analysis questions:

**Language Inventory:**
- What languages are present in this codebase? (Go, Python, TypeScript, etc.)
- Are there language-specific frameworks or patterns? (gRPC, async/await, middleware)
- Are there shared interfaces or contracts between languages?
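
As a starting point for the language inventory, a reviewer could count source files by extension. This is a minimal sketch: `inventory_languages` and the `EXTENSION_LANGUAGES` map are hypothetical names introduced here, and the mapping is deliberately incomplete.

```python
from collections import Counter
from pathlib import Path

# Hypothetical extension-to-language map; extend it for the codebase under review.
EXTENSION_LANGUAGES = {
    ".go": "Go",
    ".py": "Python",
    ".ts": "TypeScript",
    ".tsx": "TypeScript",
    ".proto": "Protobuf",
}


def inventory_languages(root: str) -> Counter:
    """Count files per language under root, keyed by file extension."""
    counts: Counter = Counter()
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        # Lowercase the suffix so "util.PY" is counted as Python.
        language = EXTENSION_LANGUAGES.get(path.suffix.lower())
        if language is not None:
            counts[language] += 1
    return counts


if __name__ == "__main__":
    for language, count in inventory_languages(".").most_common():
        print(f"{language}: {count} files")
```

An extension count only shows which languages are present. It says nothing about shared contracts between them, which still have to be read by hand.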

**Configuration & Constants:**
- Are configuration values (timeouts, limits, URLs, ports) defined in one place or scattered?
- Are configuration values hard-coded in multiple files? Where?
- Are there environment-specific configurations? How are they managed?

**Error Handling Consistency:**
- How does each language handle errors? (Go: explicit returns, Python: exceptions)
- Are errors propagated consistently across language boundaries?
- Are there silent failures (nil/None returns, swallowed exceptions)?
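
The silent-failure question can be illustrated with a small Python sketch. Both loader functions and the `ConfigError` type are hypothetical and are not part of the skill itself. The first version swallows the error, while the second surfaces it with context.

```python
import json


class ConfigError(Exception):
    """Hypothetical error type that carries context to the caller."""


def load_config_silently(text: str):
    """Anti-pattern: swallows the parse error and returns None."""
    try:
        return json.loads(text)
    except Exception:
        # The caller cannot tell "malformed input" from a literal JSON null.
        return None


def load_config_strict(text: str) -> dict:
    """Propagates failures explicitly instead of hiding them."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ConfigError(f"invalid config JSON: {exc}") from exc
    if not isinstance(config, dict):
        raise ConfigError(
            f"config must be a JSON object, got {type(config).__name__}"
        )
    return config
```

During the Grimey Read, the first pattern is the one to hunt for: any `except` that returns a default without logging or re-raising is a candidate finding.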

**Code Duplication Across Languages:**
- What logic is duplicated between implementations?
- Are there shared data structures that should be in a schema (protobuf, API specs)?
- What is the single source of truth for this behavior?

**Resource Management:**
- Are there file handles, network connections, or memory allocations that need cleanup?
- Are context managers, finalizers, or defer statements used consistently?
- Could there be leaks in error paths?
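
A leak on an error path might look like the sketch below. `Connection` is a hypothetical stand-in for a file handle or network connection. The safe version uses the standard library's `contextlib.closing`.

```python
from contextlib import closing


class Connection:
    """Hypothetical resource that must be closed after use."""

    def __init__(self) -> None:
        self.closed = False

    def send(self, payload: str) -> None:
        if not payload:
            raise ValueError("empty payload")

    def close(self) -> None:
        self.closed = True


def send_leaky(conn: Connection, payload: str) -> None:
    """Anti-pattern: if send() raises, close() is never reached."""
    conn.send(payload)
    conn.close()


def send_safe(conn: Connection, payload: str) -> None:
    """closing() calls conn.close() on both the success and the error path."""
    with closing(conn) as active:
        active.send(payload)
```

The same check applies to `defer` in Go or `try`/`finally` in TypeScript: trace every early exit and confirm the release still runs.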

**Testing Coverage:**
- Is there test coverage for both happy-path AND error conditions?
- Are language-specific edge cases tested? (Unicode in Python, nil in Go)
- Do integration tests exercise cross-language interactions?

### Phase 2: Default Assumptions (The Falsification Baseline)

Before starting the Grind, assume the subject suffers from these core failure modes:

| Assumption | Rationale |
|------------|-----------|
| **LLM Slop** | AI hallucinations, context blindness, and confident nonsense. |
| **Unreliable** | Happy-path only, zero error handling, silent failures. |
| **Insecure** | Injection points, hardcoded secrets, missing auth/authz. |
| **Poorly Planned** | Scope creep, missing requirements, no success criteria. |
| **Non-Production Ready** | No logging, no monitoring, no rollback, no tests. |
| **Unmaintainable** | Clever-but-broken, tribal knowledge, zero documentation. |
| **Fragile** | Works at 10 users; catches fire at 1,000. |
| **Edge-Case Blind** | Null, empty, Unicode, timezones, leap years—all broken. |
| **Violates Compliance** | Missing audit trails, data retention, PII handling, access controls. |
| **Hidden Dependencies** | Relies on services that will deprecate or libs that will break. |

**Your objective is to prove these assumptions WRONG with evidence, not to prove the idea right. Any assumption you cannot refute stands as a finding.**

### Phase 3: The Grind (Destruction Cycle)

Systematically attack the subject across all categories. Do not stop at the first flaw; find the terminal ones. Prioritize by **Severity × Likelihood × Blast Radius**.
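
One way to make the Severity × Likelihood × Blast Radius ordering concrete is a simple multiplicative score. The numeric weights below are assumptions chosen for illustration, as is the 1 to 3 blast-radius scale. The skill names the P0 to P3 and High/Medium/Low labels but does not assign numbers to them.

```python
from dataclasses import dataclass

# Assumed weights; the skill defines the labels, not these numbers.
SEVERITY_WEIGHT = {"P0": 4, "P1": 3, "P2": 2, "P3": 1}
LIKELIHOOD_WEIGHT = {"High": 3, "Medium": 2, "Low": 1}


@dataclass(frozen=True)
class Finding:
    name: str
    severity: str      # "P0" .. "P3"
    likelihood: str    # "High" | "Medium" | "Low"
    blast_radius: int  # assumed scale: 1 (local) .. 3 (system-wide)


def priority_score(finding: Finding) -> int:
    """Multiply the three factors so any single low factor lowers the rank."""
    if not 1 <= finding.blast_radius <= 3:
        raise ValueError("blast_radius must be between 1 and 3")
    return (
        SEVERITY_WEIGHT[finding.severity]
        * LIKELIHOOD_WEIGHT[finding.likelihood]
        * finding.blast_radius
    )


def rank_findings(findings: list[Finding]) -> list[Finding]:
    """Return findings ordered from highest to lowest priority."""
    return sorted(findings, key=priority_score, reverse=True)
```

A score like this is only a tie-breaker for ordering the report. A P0 with low likelihood may still deserve to lead if its failure is irreversible.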

**Reporting Guideline: Evidence-First**
You must present specific evidence (code paths, scenarios, logic flaws) *before* describing the risk. Force the user to confront the "wrongness" immediately.

**Mandatory Critique Categories:**

| Category | Grimey Questions |
|----------|------------------|
| **LLM Slop Check** | Hallucinated APIs? Cargo-culted patterns? Confident nonsense? |
| **Correctness** | Does it actually do what it claims? Are invariants enforced? |
| **Reliability** | Graceful failure or silent crash? Retry logic? Timeouts? OOM? |
| **Security** | Input validation? AuthZ? Secrets? Injection? Malicious intent? |
| **Error Handling** | Swallowed exceptions? Inaccurate logs? Missing telemetry? |
| **Edge Cases** | Null/Empty/One/Many/Negative. Unicode/Emoji. SQLi/Path Traversal. |
| **Scalability** | 10x/100x bottlenecks? Database/Memory/Network saturation? |
| **Observability** | Is it a black box? Can we detect failure before the user does? |
| **Maintainability** | Tech debt? Cleverness over clarity? Missing documentation? |
| **Testability** | Are there tests? Do they test the right things? Coverage on error paths? |
| **Deployment** | Rollback plan? Feature flags? Blue-green? Or YOLO push to main? |
| **Privacy & Data** | PII handling? Retention policies? Logging sensitive data? GDPR? |
| **Compliance** | Audit logs? Access control? SOC 2? Domain-specific requirements? |
| **Cost** | Operational burden? Maintenance costs? Hidden infrastructure costs? |
| **Human Factors** | Misuse potential? Training requirements? UX traps? |
| **Failure Modes** | Blast radius? Silent corruption? Cascading failures? |
| **Code Quality & Formatting** | Malformed syntax? Incorrect indentation? Unused imports? Dead code branches? |
| **Code Duplication** | Same logic in multiple places? Configuration/constants repeated? Extraction opportunities? |
| **Input Validation** | Does user input get validated BEFORE use? Can it bypass validation? Injection vectors? |
| **Language-Specific Patterns** | Anti-patterns specific to the language? Misuse of language features? Unconventional patterns? |
| **Configuration Management** | Are values hard-coded that should be configurable? Are secrets handled through proper secret management? |
| **Resource Lifecycle** | Are resources (files, connections, memory) properly acquired and released? Leak vectors? |

**Output Format for Each Issue (Evidence-First):**

```markdown
### Issue: [Short Name]

**Grime ID:** grime-[a-z0-9]{3} (base36 lowercase, e.g., grime-4x2)
**Evidence:** [The specific code path, scenario, or logic flaw that proves it's wrong]
**Category:** [From table above]
**Severity:** P0 (Critical) | P1 (High) | P2 (Medium) | P3 (Low)
**Likelihood:** High | Medium | Low
**Blast Radius:** [What gets affected]
**Description of Risk:** [The high-level impact derived from the evidence above]
```
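
The base Grime ID format and the evidence-first field order can be checked mechanically. The sketch below assumes IDs are generated randomly, which the skill does not specify. `new_grime_id`, `is_valid_grime_id`, and `render_issue` are hypothetical helper names.

```python
import re
import secrets
import string

# Base format from the template: "grime-" plus three base36 lowercase characters.
GRIME_ID_PATTERN = re.compile(r"grime-[a-z0-9]{3}")
BASE36_ALPHABET = string.digits + string.ascii_lowercase


def new_grime_id() -> str:
    """Generate a random ID such as grime-4x2 (random generation is an assumption)."""
    suffix = "".join(secrets.choice(BASE36_ALPHABET) for _ in range(3))
    return f"grime-{suffix}"


def is_valid_grime_id(value: str) -> bool:
    """Accept only an exact match of the base format."""
    return GRIME_ID_PATTERN.fullmatch(value) is not None


def render_issue(name: str, grime_id: str, evidence: str, category: str,
                 severity: str, likelihood: str, blast_radius: str,
                 risk: str) -> str:
    """Render one issue with the evidence placed before the risk description."""
    if not is_valid_grime_id(grime_id):
        raise ValueError(f"malformed Grime ID: {grime_id!r}")
    lines = [
        f"### Issue: {name}",
        "",
        f"**Grime ID:** {grime_id}",
        f"**Evidence:** {evidence}",
        f"**Category:** {category}",
        f"**Severity:** {severity}",
        f"**Likelihood:** {likelihood}",
        f"**Blast Radius:** {blast_radius}",
        f"**Description of Risk:** {risk}",
    ]
    return "\n".join(lines)
```

With only three base36 characters there are 46,656 possible IDs, so a long review that generates them randomly should check new IDs against those already issued.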

**Enhanced Grime ID Naming (v2.0+):**

For greater specificity, use category-specific prefixes:
- `grime-fmt-[a-z0-9]{3}`: Code formatting/quality issues (syntax errors, unused impor
