test-review

Included with Lifetime

$97 forever

Evaluates test suites for coverage gaps, TDD/BDD compliance, and anti-patterns. Use when auditing test quality or before a major release.

testingtestingtddbddcoveragequalityfixtures

What this skill does

## Table of Contents

- [Quick Start](#quick-start)
- [When to Use](#when-to-use)
- [Required TodoWrite Items](#required-todowrite-items)
- [Progressive Loading](#progressive-loading)
- [Workflow](#workflow)
- [Step 1: Detect Languages (`test-review:languages-detected`)](#step-1:-detect-languages-(test-review:languages-detected))
- [Step 2: Inventory Coverage (`test-review:coverage-inventoried`)](#step-2:-inventory-coverage-(test-review:coverage-inventoried))
- [Step 3: Assess Scenario Quality (`test-review:scenario-quality`)](#step-3:-assess-scenario-quality-(test-review:scenario-quality))
- [Step 4: Plan Remediation (`test-review:gap-remediation`)](#step-4:-plan-remediation-(test-review:gap-remediation))
- [Step 5: Log Evidence (`test-review:evidence-logged`)](#step-5:-log-evidence-(test-review:evidence-logged))
- [Test Quality Checklist (Condensed)](#test-quality-checklist-(condensed))
- [Output Format](#output-format)
- [Summary](#summary)
- [Framework Detection](#framework-detection)
- [Coverage Analysis](#coverage-analysis)
- [Quality Issues](#quality-issues)
- [Remediation Plan](#remediation-plan)
- [Recommendation](#recommendation)
- [Integration Notes](#integration-notes)
- [Exit Criteria](#exit-criteria)


# Test Review Workflow

Evaluate and improve test suites with TDD/BDD rigor.

## Quick Start

```bash
/test-review
```
**Verification:** Run `pytest -v` to verify tests pass.

## When To Use

- Reviewing test suite quality
- Analyzing coverage gaps
- Before major releases
- After test failures
- Planning test improvements

## When NOT To Use

- Writing new tests - use parseltongue:python-testing
- Updating existing tests - use sanctum:test-updates

## Required TodoWrite Items

1. `test-review:languages-detected`
2. `test-review:coverage-inventoried`
3. `test-review:scenario-quality`
4. `test-review:invariant-preservation`
5. `test-review:gap-remediation`
6. `test-review:evidence-logged`
7. `test-review:findings-verified`

## Progressive Loading

Load modules as needed based on review depth:

- **Basic review**: Core workflow (this file)
- **Framework detection**: Load `modules/framework-detection.md`
- **Coverage analysis**: Load `modules/coverage-analysis.md`
- **Quality assessment**: Load `modules/scenario-quality.md`
- **Remediation planning**: Load `modules/remediation-planning.md`

## Workflow

### Step 1: Detect Languages (`test-review:languages-detected`)

Identify testing frameworks and version constraints.
→ **See**: `modules/framework-detection.md`

Quick check:
```bash
find . -maxdepth 2 -name "Cargo.toml" -o -name "pyproject.toml" -o -name "package.json" -o -name "go.mod"
```
**Verification:** Run the command with `--help` flag to verify availability.

### Step 2: Inventory Coverage (`test-review:coverage-inventoried`)

Run coverage tools and identify gaps.
→ **See**: `modules/coverage-analysis.md`

Quick check:
```bash
git diff --name-only | rg 'tests|spec|feature'
```
**Verification:** Run `pytest -v` to verify tests pass.

### Step 3: Assess Scenario Quality (`test-review:scenario-quality`)

Evaluate test quality using BDD patterns and assertion checks.
→ **See**: `modules/scenario-quality.md`

Focus on:
- Given/When/Then clarity
- Assertion specificity
- Anti-patterns (dead waits, mocking internals, repeated boilerplate)

### Step 4: Plan Remediation (`test-review:gap-remediation`)

Create concrete improvement plan with owners and dates.
→ **See**: `modules/remediation-planning.md`

### Step 5: Log Evidence (`test-review:evidence-logged`)

Record executed commands, outputs, and recommendations.
→ **See**: `imbue:proof-of-work`

## Test Quality Checklist (Condensed)

- [ ] Clear test structure (Arrange-Act-Assert)
- [ ] Critical paths covered (auth, validation, errors)
- [ ] Specific assertions with context
- [ ] No flaky tests (dead waits, order dependencies)
- [ ] Reusable fixtures/factories
- [ ] Invariant-encoding tests intact (see below)

### Invariant-Encoding Tests

Tests encode design invariants as well as verifying behavior.
A test that asserts "module A never imports from module B"
encodes a layer boundary. A test that
asserts "this function is pure" encodes a concurrency
model. These tests are load-bearing in ways that
coverage metrics cannot capture.

**During review, check:**

1. **Were invariant-encoding tests removed or weakened?**
   A test that enforced an architectural boundary,
   data structure constraint, or API contract should
   not be deleted without naming the invariant being
   abandoned and escalating to human judgment.

2. **Were test expectations changed to match a broken
   implementation?** If an assertion value changed, ask:
   did the *requirement* change, or did the agent change
   the test to make its code pass? The latter is the
   single most dangerous form of test tampering.

3. **Are new invariants encoded as tests?** When a design
   decision is made (choice of data structure, module
   boundary, error strategy), there should be at least
   one test whose failure would signal that the
   invariant was violated.

**Red flag patterns:**

| Pattern | Risk |
|---------|------|
| `@pytest.mark.skip` added to a passing test | Invariant being silently dropped |
| Assertion changed from specific to broad | Constraint being relaxed |
| Test renamed to describe new behavior | Old invariant erased from history |
| Test deleted "because it tested old code" | Invariant removed without replacement |

**When invariant erosion is detected:**

Do NOT approve. Flag as a BLOCKING quality issue and
present the three options to the human:

1. **Preserve**: Revert the test change, fix the
   implementation to satisfy the invariant
2. **Layer**: Keep the invariant test, add the new
   behavior alongside it (accepting inelegance)
3. **Revise**: The invariant is genuinely wrong; remove
   the old test AND write a new test encoding the
   replacement invariant

This is a judgment call that models get wrong far too
often. Default to option 1 (preserve) when no human is
available.

## Output Format

```markdown
## Summary
[Brief assessment]

## Framework Detection
- Languages: [list] | Frameworks: [list] | Versions: [constraints]

## Coverage Analysis
- Overall: X% | Critical: X% | Gaps: [list]

## Quality Issues
[Q1] [Issue] - Location - Anchor: `verbatim source text at file:line` - Fix

## Remediation Plan
1. [Action] - Owner - Date

## Recommendation
Approve / Approve with actions / Block
```
**Verification:** Run the command with `--help` flag to verify availability.

## Integration Notes

- Use `imbue:proof-of-work` for reproducible evidence capture
- Reference `imbue:diff-analysis` for risk assessment
- Format output using `imbue:structured-output` patterns

## Verify Findings Are Grounded (`test-review:findings-verified`)

Every finding must cite a real location and a verbatim anchor. Write
findings to `.review/findings.json` and confirm each citation resolves:

```bash
python plugins/imbue/scripts/citation_verifier.py \
  --findings .review/findings.json --repo-root .
```

Drop or label `UNVERIFIED` any finding the verifier fails (exit `1`); only
verified findings enter the report. See `Skill(imbue:review-core)` Step 5
and `Skill(imbue:structured-output)` for the schema.

## Exit Criteria

- Frameworks detected and documented
- Coverage analyzed and gaps identified
- Scenario quality assessed
- Remediation plan created with owners and dates
- Evidence logged with citations
- Every reported finding carries a `Location` + verbatim `Anchor` confirmed
  by `citation_verifier.py` (exit `0`), or unverified findings were dropped
  or labeled `UNVERIFIED`
## Troubleshooting

### Common Issues

**Tests not discovered**
Ensure test files match pattern `test_*.py` or `*_test.py`. Run `pytest --collect-only` to verify.

**Import errors**
Check that the module being tested is in `PYTHONPATH` or install with `pip install -e .`

**Async tests failing**
Install pytest-asyncio and decorate test functions with `@pytest.

Files: 6

Size: 26.4 KB

Complexity: 40/100

Category: testing

Source: https://github.com/athola/claude-night-market/tree/main/plugins/pensive/skills/test-review

Related in testing

python-testing

Included

Python testing patterns with pytest, fixtures, TDD, mocking, async and integration tests. Use when writing or auditing a Python test suite.

testing

Skill Validator

Included

Validates that a skill or MCP implementation matches its manifest by running Codex-powered semantic comparisons across descriptions, preconditions, effects, and API surface.

testing

unit-test-vue-pinia

Included

Write and review unit tests for Vue 3 + TypeScript + Vitest + Pinia codebases. Use when creating or updating tests for components, composables, and stores; mocking Pinia with createTestingPinia; applying Vue Test Utils patterns; and enforcing black-box assertions over implementation details.

testing

subagent-testing

Included

Test skills via TDD in fresh subagents. Use when validating behavior or preventing bias.

testing

code-review

Included

Systematic code review patterns covering security, performance, maintainability, correctness, and testing — with severity levels, structured feedback guidance, review process, and anti-patterns to avoid. Use when reviewing PRs, establishing review standards, or improving review quality.

testing

playwright-browser

Included

Use when automating browsers, testing pages, taking screenshots, checking UI, verifying login flows, or testing responsive behavior

testing

python-testing

Included

Python testing patterns with pytest, fixtures, TDD, mocking, async and integration tests. Use when writing or auditing a Python test suite.

testing

Skill Validator

Included

Validates that a skill or MCP implementation matches its manifest by running Codex-powered semantic comparisons across descriptions, preconditions, effects, and API surface.

testing

unit-test-vue-pinia

Included

testing

subagent-testing

Included

Test skills via TDD in fresh subagents. Use when validating behavior or preventing bias.

testing

code-review

Included

testing

playwright-browser

Included

Use when automating browsers, testing pages, taking screenshots, checking UI, verifying login flows, or testing responsive behavior

testing