justify
Audits changes for additive bias and Iron Law compliance. Use when reviewing completed work before merging or after AI-assisted implementation.
What this skill does
> The simplest change that fixes the problem is the safest change to merge.

> Adding code is easy. Removing the need for code is engineering.

# Justify

## The Additive Bias Problem

AI models are trained to be helpful, which creates a systematic bias toward *adding* code rather than *fixing* root causes:

| AI Default Behavior | Correct Behavior |
|---------------------|------------------|
| Add a workaround | Fix the root cause |
| Modify test expectations | Fix the implementation |
| Create a new helper | Use an existing one |
| Add error handling | Prevent the error |
| Add a compatibility shim | Remove the old code |
| Wrap in try/catch | Fix the exception source |

This skill audits changes for these patterns and requires explicit justification for each.

## When to Use

- After completing implementation work
- Before committing or creating PRs
- When reviewing your own changes for quality
- When scope-guard flags RED/YELLOW zone

## Audit Protocol

### Step 1: Gather the Delta

```bash
# Determine base branch
base=$(git merge-base master HEAD 2>/dev/null \
  || git merge-base main HEAD 2>/dev/null)

# Get change statistics
git diff "$base" --stat
git diff "$base" --shortstat
git diff "$base" --diff-filter=A --name-only  # new files
git diff "$base" --diff-filter=M --name-only  # modified files
git diff "$base" --diff-filter=D --name-only  # deleted files
```

### Step 2: Compute Additive Bias Score

Score each dimension 0-3 (0 = clean, 3 = high bias):

| Signal | Weight | How to Measure |
|--------|--------|----------------|
| Line ratio | 2x | `additions / max(deletions, 1)` |
| New files | 2x | Count of `--diff-filter=A` |
| Test logic changes | 3x | Test assertion/expectation diffs |
| New abstractions | 1x | New classes, functions, modules |
| Workaround patterns | 2x | Try/catch, if/else guards added |

**Line Ratio Scoring:**

| Ratio | Score | Interpretation |
|-------|-------|----------------|
| < 2:1 | 0 | Balanced change |
| 2:1 to 5:1 | 1 | Mildly additive |
| 5:1 to 10:1 | 2 | Additive bias likely |
| > 10:1 | 3 | Strong additive bias |

**Aggregate Score:**

```
bias_score = sum(signal_score * weight) / sum(weights)
```

| Aggregate | Zone | Action |
|-----------|------|--------|
| 0.0 - 0.5 | GREEN | Proceed |
| 0.5 - 1.5 | YELLOW | Justify each signal |
| 1.5 - 2.5 | RED | Rethink approach |
| 2.5+ | STOP | Likely wrong approach |

### Step 3: Iron Law Compliance Check

The Iron Law states: tests drive implementation, not the other way around. Check for violations:

```bash
# Find test files that were modified
git diff "$base" --name-only | rg "test_|_test\.|spec\." \
  || git diff "$base" --name-only | grep -E "test_|_test\.|spec\."

# For each modified test file, check what changed
# (set test_file to one of the paths listed above)
git diff "$base" -- "$test_file" | rg "^[-+].*assert|^[-+].*expect|^[-+].*should"
```

**Violation patterns (test logic was tampered with):**

- Assertion values changed (expected output modified)
- Test cases removed or commented out
- `@skip` or `@pytest.mark.skip` added
- Error expectations weakened (broad exception types)
- Mock return values changed to match new behavior
- Test renamed to no longer describe original behavior

**Each violation requires explicit justification:**

> "I changed this test assertion because the *requirement* changed, not because my implementation couldn't meet the original requirement."

If the requirement didn't change, the test should not change. Fix the implementation instead.

### Step 4: Minimal Intervention Analysis

For each changed file, answer:

1. **Was this change necessary?** Could the goal be achieved without touching this file?
2. **Was this the minimal change?** Could fewer lines achieve the same result?
3. **Did this change add or remove complexity?** New functions, classes, or control flow = added complexity that needs justification.
4. **Is there a subtraction-first alternative?** Could removing code fix the problem instead of adding code?
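The aggregate formula and zone table from Step 2 can be sketched in Python. This is a minimal illustration and not part of the skill itself: the function and dictionary key names are hypothetical, and because the tables leave boundary values ambiguous, the sketch assumes a value that sits exactly on a shared boundary (a 5:1 ratio, or an aggregate of 0.5, 1.5, or 2.5) falls into the higher band.

```python
from typing import Dict

# Weights copied from the signal table in Step 2.
# The key names are hypothetical labels chosen for this sketch.
WEIGHTS: Dict[str, int] = {
    "line_ratio": 2,
    "new_files": 2,
    "test_logic_changes": 3,
    "new_abstractions": 1,
    "workaround_patterns": 2,
}


def line_ratio_score(additions: int, deletions: int) -> int:
    """Map additions / max(deletions, 1) onto the 0-3 line-ratio scale."""
    ratio = additions / max(deletions, 1)
    if ratio < 2:
        return 0
    if ratio < 5:  # assumption: exactly 5:1 counts as the higher band
        return 1
    if ratio <= 10:  # the table reserves score 3 for ratios above 10:1
        return 2
    return 3


def bias_score(signal_scores: Dict[str, int]) -> float:
    """Weighted mean: sum(signal_score * weight) / sum(weights)."""
    total = sum(WEIGHTS[name] * signal_scores[name] for name in WEIGHTS)
    return total / sum(WEIGHTS.values())


def zone(score: float) -> str:
    """Classify an aggregate score; shared boundaries go to the higher zone."""
    if score < 0.5:
        return "GREEN"
    if score < 1.5:
        return "YELLOW"
    if score < 2.5:
        return "RED"
    return "STOP"


if __name__ == "__main__":
    scores = {
        "line_ratio": line_ratio_score(additions=120, deletions=10),
        "new_files": 1,
        "test_logic_changes": 0,
        "new_abstractions": 2,
        "workaround_patterns": 1,
    }
    aggregate = bias_score(scores)
    print(f"{aggregate:.1f} ({zone(aggregate)})")
```

In this made-up example the 12:1 line ratio scores 3 on its own, but the untouched tests (weight 3x, score 0) pull the aggregate down into the YELLOW zone, so each signal would still need a justification.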

### Step 4.5: Invariant Impact Analysis

Changes can be minimal and still catastrophically wrong if they silently revise a load-bearing design decision. For each changed file, check whether it touches a design invariant:

**What counts as an invariant:**

- Architectural patterns (module boundaries, layer separation, data flow direction)
- Data structure choices (why a map vs list, why normalized vs denormalized)
- API contracts (public interfaces, protocol formats)
- Error handling strategies (fail-fast vs recovery)
- Concurrency models (single-threaded assumption, actor model, shared-nothing)

**Detection heuristic:**

```bash
# Check for structural changes (new modules, moved
# boundaries, changed interfaces)
git diff "$base" --name-only | rg "(interface|abstract|base|core|types|schema|model)" \
  || git diff "$base" --name-only | grep -E "(interface|abstract|base|core|types|schema|model)"

# Check for pattern-breaking changes
git diff "$base" -U5 | rg "(TODO.*refactor|HACK|WORKAROUND|XXX)" \
  || git diff "$base" -U5 | grep -E "(TODO.*refactor|HACK|WORKAROUND|XXX)"
```

**When an invariant conflict is detected:** Do NOT silently pick a resolution.
Present the three options to the human:

| Option | Description | When Right |
|--------|-------------|------------|
| **Preserve** | Don't add the feature; the invariant pays dividends | Invariant simplifies many things; feature is marginal |
| **Layer** | Add feature inelegantly on top | Feature is needed; invariant is still valuable; imperfection is acceptable |
| **Revise** | Change the invariant itself | Genuine new learning invalidates the original decision |

**Add to Justification Report:**

```markdown
### Invariant Impact: NONE / DETECTED

[If DETECTED:]
- **Invariant**: [name the design decision]
- **Conflict**: [what change clashes with it]
- **Option chosen**: Preserve / Layer / Revise
- **Justification**: [why this option, not the others]
- **Human reviewed**: YES / NO (if NO, flag as requiring review before merge)
```

**Compounding risk warning:** Bad invariant decisions accumulate. If this branch has multiple invariant revisions, flag the entire branch for architectural review. Each silent invariant change multiplies the probability of an unsalvageable codebase.
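
The Step 4.5 detection heuristic and the compounding-risk rule can also be expressed as plain functions, which is convenient when the audit is driven from a script. The sketch below is illustrative only: the function names are hypothetical, the regular expressions are copied from the `rg`/`grep` commands above, and treating "multiple invariant revisions" as two or more `Revise` decisions is an assumption. The inputs are assumed to be the output of `git diff "$base" --name-only` and `git diff "$base" -U5`.

```python
import re
from typing import Iterable, List

# Same patterns as the rg/grep heuristic in Step 4.5.
INVARIANT_PATH = re.compile(r"(interface|abstract|base|core|types|schema|model)")
PATTERN_BREAK = re.compile(r"(TODO.*refactor|HACK|WORKAROUND|XXX)")


def invariant_candidates(changed_paths: Iterable[str]) -> List[str]:
    """Return changed paths whose names suggest a structural file."""
    return [path for path in changed_paths if INVARIANT_PATH.search(path)]


def pattern_break_lines(diff_text: str) -> List[str]:
    """Return diff lines that carry a pattern-breaking marker."""
    return [line for line in diff_text.splitlines() if PATTERN_BREAK.search(line)]


def needs_architectural_review(options_chosen: Iterable[str]) -> bool:
    """Flag the branch when it revises more than one invariant.

    Assumption: "multiple" in the compounding risk warning means >= 2.
    """
    revisions = sum(1 for option in options_chosen if option == "Revise")
    return revisions >= 2


if __name__ == "__main__":
    paths = ["src/core/router.py", "README.md", "db/schema.sql"]
    print(invariant_candidates(paths))
    print(needs_architectural_review(["Revise", "Preserve", "Revise"]))
```

Like the shell version, the path pattern is deliberately broad (for example, `database.py` matches `base`), so a hit is a prompt to look for an invariant conflict, not proof that one exists.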

### Step 5: Generate Justification Report

Output a structured report:

```markdown
## Justification Report

**Branch**: feature/xyz
**Base**: master
**Delta**: +N/-M lines, X files changed

### Additive Bias Score: X.X (ZONE)

| Signal | Score | Detail |
|--------|-------|--------|
| Line ratio | N | +A/-D = R:1 |
| New files | N | [list] |
| Test changes | N | [list] |
| New abstractions | N | [list] |
| Workarounds | N | [list] |

### Iron Law Compliance: PASS/FAIL

[List any test logic modifications with justification]

### Change-by-Change Justification

#### file.py (+N/-M)
- **What**: [description]
- **Why**: [root cause this addresses]
- **Alternatives considered**: [what else could work]
- **Why this is minimal**: [why fewer changes won't work]

#### test_file.py (+N/-M)
- **What**: [description]
- **Justification**: [why test logic changed, if it did]
- **Iron Law status**: PASS/VIOLATION

### Risk Assessment

| Factor | Rating |
|--------|--------|
| Lines changed | LOW/MED/HIGH |
| Files touched | LOW/MED/HIGH |
| Test modifications | NONE/JUSTIFIED/VIOLATION |
| New abstractions | NONE/JUSTIFIED/UNNECESSARY |
| Overall merge risk | LOW/MED/HIGH |

### Recommendations

[List any changes that should be reconsidered, simpler alternatives, or unnecessary additions]
```

## Decision Weights

When evaluating competing approaches, weight these factors:

| Factor | Weight | Rationale |
|--------|--------|-----------|
| Fewer lines changed | HIGH | Less risk, easier review |
| No new files | HIGH | No new maintenance burden |
| No test logic changes | HIGH | Iron Law compliance |
| Root cause fix | HIGH | Prevents recurrence |
| Removes code | BONUS | Reduces maintenance surface |
| Adds abstraction |