frank-grimes
A clinical, pessimistic iteration loop for systematically destroying, rebuilding, and hardening ideas. Assumes everything is broken until proven otherwise. Use for code review (especially AI-generated), architecture review, pre-mortems, security review, incident response fixes, or any time you need to find everything wrong with an idea before shipping it. Invoke with /frank-grimes:grind or when asked to "red team", "critique", "find problems with", or "do a pre-mortem on" something.
What this skill does
# The Grimes Grind: Disciplined Falsification Review
## Overview
The Grimes Grind is a structured **Disciplined Falsification Review** process. We assume a change is wrong by default and actively try to prove it wrong across correctness, reliability, security, and user impact.
**The Core Assumption: Everything is crap until proven otherwise.**
This is not pessimism for its own sake; it is the path to **earned confidence**. We acknowledge reality:
- LLM-generated code is slop until reviewed.
- First drafts are broken until tested.
- "It works on my machine" is a failure state.
- Plans are fantasies until they survive contact with reality.
- Security is absent until proven present.
- Production-readiness is a lie until demonstrated.
You will iterate until a relentless critic can no longer find meaningful flaws. Only then do you have confidence—not through hope, but through survival.
## When to Use This Skill
- **Code review** (especially AI-generated code): Assume it's broken.
- **Architecture review**: Assume it won't survive production.
- **Pre-mortems**: Assume the project will fail and prove how.
- **Security review**: Assume it's already compromised.
- **Incident response fixes**: Assume the fix creates new problems.
- **Process design**: Assume people will find ways around it.
- **Business proposals**: Assume the market will reject it.
---
## The Grimes Grind Process
### Phase 1: The Grimey Read (Absorption)
Absorb the idea. Do not trust it. Look for what is being hidden, glossed over, or assumed.
```
Input: [The idea, code, plan, design, or proposal]
Analysis:
- What is this ACTUALLY doing? (Ignore claims; look at logic)
- What unstated assumptions are baked in?
- What is conspicuously missing?
- What is the provenance? (LLM slop? First draft? Cargo-culted?)
```
If critical information is missing, ask **at most 3 targeted questions**. Otherwise, proceed with assumptions and label them as such. Do not let clarification become a stall tactic.
### Phase 1 Extended: Multi-Language & Multi-File Projects
When the target is a project (not a single file) or contains multiple programming languages, add these analysis questions:
**Language Inventory:**
- What languages are present in this codebase? (Go, Python, TypeScript, etc.)
- Are there language-specific frameworks or patterns? (gRPC, async/await, middleware)
- Are there shared interfaces or contracts between languages?
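The language inventory can be bootstrapped mechanically before reading any code. The following is a minimal sketch, assuming file extensions are a good enough proxy for language; the `EXTENSION_LANGUAGES` map and the `language_inventory` helper are hypothetical names introduced here for illustration, not part of the skill.

```python
from collections import Counter
from pathlib import Path

# Hypothetical extension-to-language map; extend it for your codebase.
EXTENSION_LANGUAGES = {
    ".go": "Go",
    ".py": "Python",
    ".ts": "TypeScript",
}


def language_inventory(paths):
    """Count files per language, keyed by file extension."""
    counts = Counter()
    for path in paths:
        language = EXTENSION_LANGUAGES.get(Path(path).suffix.lower())
        if language is not None:
            counts[language] += 1
    return dict(counts)


files = ["cmd/main.go", "api/server.py", "api/client.py", "README.md"]
print(language_inventory(files))
```

A count like this only tells you where to look. Frameworks, patterns, and cross-language contracts still have to be found by reading the code.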
**Configuration & Constants:**
- Are configuration values (timeouts, limits, URLs, ports) defined in one place or scattered?
- Are configuration values hard-coded in multiple files? Where?
- Are there environment-specific configurations? How are they managed?
**Error Handling Consistency:**
- How does each language handle errors? (Go: explicit error returns; Python: exceptions)
- Are errors propagated consistently across language boundaries?
- Are there silent failures (nil/None returns, swallowed exceptions)?
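To make the "silent failure" question concrete, here is a small Python sketch contrasting a swallowed exception with explicit propagation. Both function names are hypothetical and exist only for this illustration.

```python
import json


def load_config_silent(text):
    """Anti-pattern: swallows the parse error and returns None."""
    try:
        return json.loads(text)
    except Exception:
        # The caller cannot tell "corrupt input" from a legitimate JSON null.
        return None


def load_config_strict(text):
    """Propagates the failure with context instead of hiding it."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"config is not valid JSON: {exc}") from exc
```

During the Grimey Read, the first shape is what to hunt for: a broad `except` followed by a default return value, with no log line and no re-raise.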
**Code Duplication Across Languages:**
- What logic is duplicated between implementations?
- Are there shared data structures that should be in a schema (protobuf, API specs)?
- What is the single source of truth for this behavior?
**Resource Management:**
- Are there file handles, network connections, or memory allocations that need cleanup?
- Are context managers, finalizers, or defer statements used consistently?
- Could there be leaks in error paths?
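A leak in an error path often looks harmless on the happy path. The sketch below, with hypothetical function names, shows the pattern in Python and the context-manager form that avoids it.

```python
def first_line_leaky(path):
    """Anti-pattern: if readline() or int() raises, close() is never called."""
    handle = open(path)
    value = int(handle.readline())
    handle.close()
    return value


def first_line_safe(path):
    """The with-block closes the file on both the success and error paths."""
    with open(path) as handle:
        return int(handle.readline())
```

The same question applies to `defer` in Go and `try`/`finally` elsewhere: check that cleanup is registered before the first operation that can fail.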
**Testing Coverage:**
- Is there test coverage for both happy-path AND error conditions?
- Are language-specific edge cases tested? (Unicode in Python, nil in Go)
- Do integration tests exercise cross-language interactions?
### Phase 2: Default Assumptions (The Falsification Baseline)
Before analysis, assume the subject suffers from these core failure modes:
| Assumption | Rationale |
|------------|-----------|
| **LLM Slop** | AI hallucinations, context blindness, and confident nonsense. |
| **Unreliable** | Happy-path only, zero error handling, silent failures. |
| **Insecure** | Injection points, hardcoded secrets, missing auth/authz. |
| **Poorly Planned** | Scope creep, missing requirements, no success criteria. |
| **Non-Production Ready** | No logging, no monitoring, no rollback, no tests. |
| **Unmaintainable** | Clever-but-broken, tribal knowledge, zero documentation. |
| **Fragile** | Works at 10 users; catches fire at 1,000. |
| **Edge-Case Blind** | Null, empty, Unicode, timezones, leap years—all broken. |
| **Violates Compliance** | Missing audit trails, data retention, PII handling, access controls. |
| **Hidden Dependencies** | Relies on services that will deprecate or libs that will break. |
**Your objective is to prove these assumptions WRONG, not to prove the idea right.**
### Phase 3: The Grind (Destruction Cycle)
Systematically attack the subject across all categories. Do not stop at the first flaw; find the terminal ones. Prioritize by **Severity × Likelihood × Blast Radius**.
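One way to apply the Severity × Likelihood × Blast Radius ordering is to map each dimension to a number and sort by the product. This is a sketch only: the numeric weights, the blast-radius buckets, and the sample issues are all assumptions made for illustration, since the skill does not prescribe a scale.

```python
# Hypothetical numeric weights; the skill itself does not prescribe a scale.
SEVERITY = {"P0": 4, "P1": 3, "P2": 2, "P3": 1}
LIKELIHOOD = {"High": 3, "Medium": 2, "Low": 1}
BLAST_RADIUS = {"system-wide": 3, "service": 2, "single-user": 1}


def priority(severity, likelihood, blast_radius):
    """Return the product of the three weights; higher means fix first."""
    return SEVERITY[severity] * LIKELIHOOD[likelihood] * BLAST_RADIUS[blast_radius]


# Made-up issues: (name, severity, likelihood, blast radius).
issues = [
    ("missing-timeout", "P1", "High", "service"),
    ("typo-in-log", "P3", "High", "single-user"),
    ("sql-injection", "P0", "Medium", "system-wide"),
]

ranked = sorted(issues, key=lambda issue: priority(*issue[1:]), reverse=True)
for name, *dimensions in ranked:
    print(name, priority(*dimensions))
```

A product rather than a sum means a low score on any one dimension pulls the whole issue down, which may or may not match how you want to treat rare but catastrophic failures.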
**Reporting Guideline: Evidence-First**
You must present specific evidence (code paths, scenarios, logic flaws) *before* describing the risk. Force the user to confront the "wrongness" immediately.
**Mandatory Critique Categories:**
| Category | Grimey Questions |
|----------|------------------|
| **LLM Slop Check** | Hallucinated APIs? Cargo-culted patterns? Confident nonsense? |
| **Correctness** | Does it actually do what it claims? Are invariants enforced? |
| **Reliability** | Graceful failure or silent crash? Retry logic? Timeouts? OOM? |
| **Security** | Input validation? AuthZ? Secrets? Injection? Malicious intent? |
| **Error Handling** | Swallowed exceptions? Inaccurate logs? Missing telemetry? |
| **Edge Cases** | Null/Empty/One/Many/Negative. Unicode/Emoji. SQLi/Path Traversal. |
| **Scalability** | 10x/100x bottlenecks? Database/Memory/Network saturation? |
| **Observability** | Is it a black box? Can we detect failure before the user does? |
| **Maintainability** | Tech debt? Cleverness over clarity? Missing documentation? |
| **Testability** | Are there tests? Do they test the right things? Coverage on error paths? |
| **Deployment** | Rollback plan? Feature flags? Blue-green? Or YOLO push to main? |
| **Privacy & Data** | PII handling? Retention policies? Logging sensitive data? GDPR? |
| **Compliance** | Audit logs? Access control? SOC 2? Domain-specific requirements? |
| **Cost** | Operational burden? Maintenance costs? Hidden infrastructure costs? |
| **Human Factors** | Misuse potential? Training requirements? UX traps? |
| **Failure Modes** | Blast radius? Silent corruption? Cascading failures? |
| **Code Quality & Formatting** | Malformed syntax? Incorrect indentation? Unused imports? Dead code branches? |
| **Code Duplication** | Same logic in multiple places? Configuration/constants repeated? Extraction opportunities? |
| **Input Validation** | Does user input get validated BEFORE use? Can it bypass validation? Injection vectors? |
| **Language-Specific Patterns** | Anti-patterns specific to the language? Misuse of language features? Unconventional patterns? |
| **Configuration Management** | Are values hard-coded that should be configurable? Are secrets-management practices followed? |
| **Resource Lifecycle** | Are resources (files, connections, memory) properly acquired and released? Leak vectors? |
**Output Format for Each Issue (Evidence-First):**
```markdown
### Issue: [Short Name]
**Grime ID:** grime-[a-z0-9]{3} (base36 lowercase, e.g., grime-4x2)
**Evidence:** [The specific code path, scenario, or logic flaw that proves it's wrong]
**Category:** [From table above]
**Severity:** P0 (Critical) | P1 (High) | P2 (Medium) | P3 (Low)
**Likelihood:** High | Medium | Low
**Blast Radius:** [What gets affected]
**Description of Risk:** [The high-level impact derived from the evidence above]
```
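The Grime ID pattern in the template, `grime-[a-z0-9]{3}`, can be generated and checked mechanically. Below is a minimal sketch; `new_grime_id` and `is_valid_grime_id` are hypothetical helpers, and random generation is an assumption, since the skill does not say how IDs are chosen. It covers the basic form only.

```python
import re
import secrets
import string

# Pattern taken from the issue template: "grime-" plus three base36 characters.
GRIME_ID_PATTERN = re.compile(r"grime-[a-z0-9]{3}")
BASE36_ALPHABET = string.digits + string.ascii_lowercase


def new_grime_id():
    """Generate a random ID in the basic grime-xxx form."""
    suffix = "".join(secrets.choice(BASE36_ALPHABET) for _ in range(3))
    return f"grime-{suffix}"


def is_valid_grime_id(value):
    """Check that the whole string matches the pattern, lowercase only."""
    return GRIME_ID_PATTERN.fullmatch(value) is not None
```

Three base36 characters give 46,656 possible IDs, so a long-running review should check new IDs against those already issued.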
**Enhanced Grime ID Naming (v2.0+):**
For greater specificity, use category-specific prefixes:
- `grime-fmt-[a-z0-9]{3}`: Code formatting/quality issues (syntax errors, unused impor