tdd


Use this skill whenever the user wants to write code using TDD, test-driven development, or test-first methodology. Triggers on: "/tdd", "let's do TDD", "write a failing test", "red green refactor", "test-first", "start a TDD cycle", or any request to implement a feature by writing tests before code. Adaptive five-step cycle (RED-DOMAIN-GREEN-DOMAIN-COMMIT) that detects harness capabilities and routes to guided (/tdd red|domain|green|commit) or automated (/tdd) mode. NOT for running existing tests, debugging test failures, or reviewing code -- only for the disciplined test-first cycle.


# TDD

**Value:** Feedback -- short cycles with verifiable evidence keep AI-generated
code honest and the human in control. Tests express intent; evidence confirms
progress.

## Purpose

Teaches a five-step TDD cycle (RED, DOMAIN, GREEN, DOMAIN, COMMIT) that
adapts to whatever harness runs it. Detects available delegation primitives
and routes to guided mode (human drives each phase) or automated mode
(system orchestrates phases). Prevents primitive obsession, skipped reviews,
and untested complexity regardless of mode.

## Practices

### The Five-Step Cycle

Every feature is built by repeating: RED -> DOMAIN -> GREEN -> DOMAIN -> COMMIT.

1. **RED** -- Write one failing test with one assertion. Only edit test files.
   Write the code you wish you had -- reference types and functions that do not
   exist yet. Run the test. Paste the failure output. Stop.
   Done when: tests run and FAIL (compilation error OR assertion failure).

2. **DOMAIN (after RED)** -- Review the test for primitive obsession and
   invalid-state risks. Create type definitions with stub bodies (`todo!()`,
   `raise NotImplementedError`, etc.). Do not implement logic. Stop.
   Done when: tests COMPILE but still FAIL (assertion/panic, not compilation error).

3. **GREEN** -- Address only the immediate error; NEVER "make the test pass" in
   one go. Scope check before every change: can this be fixed with
   function-scope work (~20 lines, one file)? YES → make the change, run
   tests, check the next error. NO → drill down by writing a failing unit
   test for the smallest piece needed, then route it through a standard TDD
   cycle with swapped roles. Only edit production files (except when drilling
   down). Paste output after each change.
   Done when: tests PASS with minimal implementation.

4. **DOMAIN (after GREEN)** -- Review the implementation for domain violations:
   anemic models, leaked validation, primitive obsession that slipped through.
   If violations found, raise a concern and propose a revision.
   Done when: types are clean and tests still pass.

5. **COMMIT** -- Run the full test suite. Stage all changes and create a git
   commit referencing the GWT (Given/When/Then) scenario. Run `git status`
   after committing to verify no uncommitted files remain. This is a **hard
   gate**: no new RED phase may begin until this commit exists and the
   working tree is clean.
   Done when: git commit created, all tests passing, working tree clean.

After step 5, either start the next RED phase or tidy the code (structural
changes only, separate commit).
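
The clean-tree half of the COMMIT hard gate can be checked mechanically. The following is a minimal sketch, assuming `git` is on the PATH; the helper names `is_clean` and `working_tree_is_clean` are illustrative and not part of this skill. It relies on `git status --porcelain`, which prints one line per changed or untracked path and nothing when the tree is clean.

```python
import subprocess


def is_clean(porcelain_output: str) -> bool:
    """True when `git status --porcelain` reported no changed or untracked paths."""
    return porcelain_output.strip() == ""


def working_tree_is_clean(repo_dir: str = ".") -> bool:
    """Run `git status --porcelain` in repo_dir and interpret its output."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise if git itself fails, e.g. repo_dir is not a repository
    )
    return is_clean(result.stdout)
```

Keeping the parsing in `is_clean` separate from the subprocess call means the gate logic can itself be tested without a repository.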

A compilation failure IS a test failure. Do not pre-create types to avoid
compilation errors. Types flow FROM tests, never precede them.
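
As an illustration of how the first three phases change the test outcome, here is a small Python sketch. Everything in it is hypothetical: `EmailAddress`, the `production` namespace (a stand-in for the production module), and the `run` helper are invented for this example. In Python the closest analogue of a compilation error is a missing name, which surfaces here as `AttributeError`.

```python
from dataclasses import dataclass
from types import SimpleNamespace

# Stand-in for the production module; it starts empty because types flow FROM tests.
production = SimpleNamespace()


def test_parse_keeps_the_address():
    # RED: one test, one assertion, written against a type that does not exist yet.
    assert production.EmailAddress.parse("a@example.com").value == "a@example.com"


def run(test) -> str:
    """Run one test and classify the outcome."""
    try:
        test()
    except AttributeError:
        return "FAIL (missing type)"  # analogue of a compilation error
    except NotImplementedError:
        return "FAIL (stub not implemented)"
    except AssertionError:
        return "FAIL (assertion)"
    return "PASS"


red = run(test_parse_keeps_the_address)


class EmailAddressStub:
    """DOMAIN (after RED): the type exists, but its body is only a stub."""

    @classmethod
    def parse(cls, raw: str) -> "EmailAddressStub":
        raise NotImplementedError


production.EmailAddress = EmailAddressStub
domain = run(test_parse_keeps_the_address)


@dataclass(frozen=True)
class EmailAddress:
    """GREEN: the minimal implementation that satisfies the single assertion."""

    value: str

    @classmethod
    def parse(cls, raw: str) -> "EmailAddress":
        return cls(raw)


production.EmailAddress = EmailAddress
green = run(test_parse_keeps_the_address)

print(red, domain, green, sep="\n")
```

Note that the GREEN implementation performs no validation: no test demands it yet, so adding it would have to start with a new RED phase.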

Domain review has veto power over primitive obsession and invalid-state
representability. Debate continues until resolved or escalated to the human;
there is no round limit.

### User-Facing Modes

**Guided mode** (`/tdd red`, `/tdd domain`, `/tdd green`, `/tdd commit`):
Each phase loads `references/{phase}.md` with detailed instructions for that
step. For experienced engineers who want explicit phase control. Works on
any harness -- no delegation primitives required. The human decides when to
advance phases.

**Automated mode** (`/tdd` or `/tdd auto`):
The system detects harness capabilities, selects an execution strategy, and
orchestrates the full cycle. The user sees working code, not sausage-making.
For verbose output showing phase transitions and evidence, use `/tdd auto --verbose`.

### Capability Detection (Automated Mode)

When automated mode activates, detect available primitives in this order:

1. **Subagents available?** Check for the Agent tool. If present, use the
   **subagents** strategy with focused per-phase agents.
2. **Fallback.** Use the **chaining** strategy -- role-switch internally between
   phases within a single context.

Select the most capable strategy available. Do not attempt a higher strategy
when its primitives are missing.

**You are the orchestrator.** The agent reading this file performs capability
detection and dispatches directly. Do NOT spawn a single "orchestrator"
subagent to do it for you -- that hides work, bypasses strategy detection,
and pre-selects the wrong strategy. Whether you were invoked by `/tdd`, by
the pipeline, or by any other caller: you detect capabilities, you choose
the strategy, you spawn the phase agents yourself.

**After determining your strategy, read ONLY the entry-point file for that
strategy:**

| Strategy | Entry-point file |
|----------|-----------------|
| Subagents | `references/orchestrator.md` |
| Chaining | (no entry file -- follow the chaining section below) |

Do NOT read `orchestrator.md` when using chaining.
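
The detection order and the entry-point table can be summarised in a few lines. This is a sketch only: how a harness exposes its tool names is harness-specific, so the `available_tools` argument and the function name `select_strategy` are assumptions made for illustration.

```python
from typing import Iterable, Optional

# Entry-point file per strategy, transcribed from the table above.
ENTRY_POINTS = {
    "subagents": "references/orchestrator.md",
    "chaining": None,  # no entry file; follow the chaining section instead
}


def select_strategy(available_tools: Iterable[str]) -> str:
    """Pick the most capable strategy whose primitives are actually present."""
    if "Agent" in set(available_tools):
        return "subagents"
    return "chaining"  # fallback: role-switch within a single context


def entry_point_for(available_tools: Iterable[str]) -> Optional[str]:
    """Return the one file to read for the selected strategy, or None."""
    return ENTRY_POINTS[select_strategy(available_tools)]
```

For example, `entry_point_for(["Bash", "Read"])` returns `None`, reflecting that `orchestrator.md` is not read when chaining.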

`orchestrator.md` references `references/shared-rules.md` for rules that
apply to all strategies (domain veto, outside-in progression, pipeline
integration, pre-implementation context checklist). Read `shared-rules.md`
when directed by your strategy's entry-point file.

### Execution Strategy: Chaining (Fallback)

Used when no delegation primitives are available. The agent plays each role
sequentially:

1. Load `references/red.md`. Execute the RED phase.
2. Load `references/domain.md`. Execute DOMAIN review of the test.
3. Load `references/green.md`. Execute the GREEN phase.
4. Load `references/domain.md`. Execute DOMAIN review of the implementation.
5. Load `references/commit.md`. Execute the COMMIT phase.
6. Repeat.

Role boundaries are advisory in this mode: nothing structural enforces them,
so the agent must self-enforce phase boundaries and only edit file types
permitted by the current phase (see `references/phase-boundaries.md`).
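
The chaining sequence above can be pictured as a repeating list of (phase, reference file) pairs. A minimal sketch; the names `CHAIN` and `chaining_steps` are illustrative, and actually loading a file and executing a phase is left to the agent.

```python
from itertools import islice
from typing import Iterator, Tuple

# (phase label, reference file) in the order the chaining strategy loads them.
CHAIN = [
    ("RED", "references/red.md"),
    ("DOMAIN (after RED)", "references/domain.md"),
    ("GREEN", "references/green.md"),
    ("DOMAIN (after GREEN)", "references/domain.md"),
    ("COMMIT", "references/commit.md"),
]


def chaining_steps() -> Iterator[Tuple[str, str]]:
    """Yield phases endlessly: after COMMIT the cycle starts again at RED."""
    while True:
        yield from CHAIN


# The sixth step is RED again, i.e. the start of the next cycle.
first_six = list(islice(chaining_steps(), 6))
```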

### Execution Strategy: Subagents

Used when the Agent tool is available for spawning focused subagents. Each
phase runs in an isolated subagent with constrained scope.

- Spawn each phase agent using `Agent(subagent_type="<agent-name>", prompt="...")`
  with the prompt template in `references/{phase}-prompt.md`.
- The orchestrator follows `references/orchestrator.md` for coordination rules.
- **Structural handoff schema** (`references/handoff-schema.md`): every phase
  agent must return evidence fields (test output, file paths changed, domain
  concerns). Missing evidence fields = handoff blocked. The orchestrator does
  not proceed to the next phase until the schema is satisfied.
- Context isolation provides structural enforcement: each subagent receives
  only the files relevant to its phase.
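
The handoff gate can be sketched as a simple presence check. The field names below are hypothetical stand-ins for the evidence fields mentioned above; the actual schema is defined in `references/handoff-schema.md`. The sketch also assumes that an empty list of domain concerns counts as valid evidence ("none found"), so only absent fields block the handoff.

```python
from typing import Any, List, Mapping

# Hypothetical field names; the real schema lives in references/handoff-schema.md.
REQUIRED_EVIDENCE = ("test_output", "files_changed", "domain_concerns")


def missing_evidence(handoff: Mapping[str, Any]) -> List[str]:
    """Return the required fields that are absent or None."""
    return [field for field in REQUIRED_EVIDENCE if handoff.get(field) is None]


def may_proceed(handoff: Mapping[str, Any]) -> bool:
    """The orchestrator advances only when no evidence field is missing."""
    return not missing_evidence(handoff)
```

For example, a handoff that reports test output and changed files but omits `domain_concerns` yields `["domain_concerns"]` from `missing_evidence`, and the next phase is not started.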

### Named Team Member Personas (Subagent Strategy)

When `.claude/agents/` definitions exist (from the `ensemble-team` skill), the
subagent strategy uses named personas for ping and pong roles. The orchestrator
selects team members based on slice context, spawns them as subagents using
`Agent(subagent_type="<agent-name>", prompt="...")`, and collects results to pass
as context to the next subagent.

See `references/orchestrator.md` for coordination rules and
`references/ping-pong-pairing.md` for persona selection, rotation, and
pairing history.

### Phase Boundary Rules

Each phase edits only its own file types. This prevents drift. See
`references/phase-boundaries.md` for the complete file-type matrix.

| Phase | Can Edit | Cannot Edit |
|-------|----------|-------------|
| RED | Test files | Production code, type definitions |
| DOMAIN | Type definitions (stubs) | Test logic, implementation bodies |
| GREEN | Implementation bodies | Test files, type signatures |
| COMMIT | Nothing -- git operations only | All source files |

If blocked by a boundary, stop and return to the orchestrator (automated) or
report to the user (guided). Never circumvent boundaries.
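
The table above translates directly into a lookup. This is a simplified sketch: the `FileKind` categories and `can_edit` are illustrative names, the GREEN drill-down exception is not modelled, and the complete matrix remains the one in `references/phase-boundaries.md`.

```python
from enum import Enum


class FileKind(Enum):
    TEST = "test file"
    TYPE_DEFINITION = "type definition"
    IMPLEMENTATION = "implementation body"


# Editable kinds per phase, transcribed from the table above.
EDITABLE = {
    "RED": {FileKind.TEST},
    "DOMAIN": {FileKind.TYPE_DEFINITION},
    "GREEN": {FileKind.IMPLEMENTATION},
    "COMMIT": set(),  # git operations only
}


def can_edit(phase: str, kind: FileKind) -> bool:
    """True when the given phase is allowed to edit that kind of file."""
    return kind in EDITABLE[phase]
```

A `False` result corresponds to being blocked by a boundary: stop and hand back to the orchestrator or the user rather than editing anyway.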

### Walking Skeleton First

The first vertical slice must be a walking skeleton: the thinnest end-to-end
path proving all architectural layers connect. It may use hardcoded values or
stubs. Build it before any other slice. It de-risks the architecture and gives
subsequent slices a proven wiring path to extend.
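
To make the idea concrete, here is a sketch of a walking skeleton for an invented three-layer greeting feature (handler, service, repository). All names are hypothetical; the point is only that every layer exists and is wired together, while the data is hardcoded.

```python
class InMemoryGreetingRepository:
    """Persistence layer: hardcoded for now; a real store comes in a later slice."""

    def find_greeting(self) -> str:
        return "hello"


class GreetingService:
    """Domain layer: depends on the repository, not on its implementation."""

    def __init__(self, repository):
        self._repository = repository

    def greet(self, name: str) -> str:
        return f"{self._repository.find_greeting()}, {name}"


def handle_request(path: str) -> str:
    """Application boundary: wires the layers together for one request."""
    service = GreetingService(InMemoryGreetingRepository())
    return service.greet(path.strip("/"))
```

An acceptance test at the boundary, such as checking that `handle_request("/ada")` returns `"hello, ada"`, then proves the wiring that later slices extend.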

### Outside-In TDD

Start from an acceptance test at the application boundary --

Files: 17

Size: 89.3 KB

Complexity: 71/100

Category: Writing & Docs

Source: https://github.com/jwilger/agent-skills/tree/main/skills/tdd
