ultra-debug

Spawn debugger and critic agents to collaboratively investigate a bug's root cause through adversarial hypothesis testing. Produces a grounded markdown report with no speculation.

Writing & Docs

What this skill does


# Ultra Debug

Orchestrate a team of debugger agents and a critic agent to investigate the root cause of a production issue through adversarial hypothesis testing.

## Input

The user provides: `$ARGUMENTS`

This may be a Sentry issue URL/ID, a problem description, relevant shop IDs, timestamps, error messages, or any combination.

## Orchestration Workflow

### Phase 0: Gather Context (keep it brief)

Spend minimal time here — just enough to frame the problem and form hypotheses. Quick searches are fine; deep investigation is the debuggers' job.

1. **Sentry**: If a Sentry URL or issue ID is provided, fetch the issue summary (title, stack trace, first/last seen)
2. **Quick checks**: A few log or code searches to understand the affected area are okay
3. **Summarize in one paragraph**: WHAT is broken, WHEN it started, WHO is affected, WHERE in the system

Do NOT spend more than a few tool calls here. Move on to hypothesis formation quickly.

### Phase 1: Form Hypotheses

Based on the summary and your general knowledge of the system, generate **5 distinct, testable hypotheses**. Each hypothesis must be:

- **Specific**: names a concrete mechanism (code path, data state, timing condition, external dependency)
- **Testable**: can be confirmed or refuted with available evidence (logs, code, data, errors)
- **Independent**: does not overlap significantly with other hypotheses

### Phase 2: Create Team

1. Create a team. The `<short-kebab-description>` suffix is the same value written as `<name>` in the later phases:

```
TeamCreate with team_name: "ultra-debug-<short-kebab-description>"
```

2. Create one task per hypothesis using TaskCreate:
   - Subject: `Investigate: <hypothesis summary>`
   - Description: full hypothesis details and suggested investigation approach

3. Create one task for the critic:
   - Subject: `Critique all hypotheses`
   - Description: list of all hypotheses, debugger names, and the evidence standards
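The naming conventions in this phase can be sketched in Python. This is only an illustration: `kebab_slug`, `team_name`, and `task_specs` are hypothetical helpers, not part of the TeamCreate or TaskCreate tools, and the four-word cap on the slug is an assumption.

```python
import re


def kebab_slug(description: str, max_words: int = 4) -> str:
    """Lowercase the text, keep alphanumeric words, join the first few with hyphens."""
    words = re.findall(r"[a-z0-9]+", description.lower())
    return "-".join(words[:max_words])


def team_name(description: str) -> str:
    """Build the team name in the ultra-debug-<short-kebab-description> form."""
    return f"ultra-debug-{kebab_slug(description)}"


def task_specs(hypotheses: list[str]) -> list[dict[str, str]]:
    """One task per hypothesis, followed by a single task for the critic."""
    tasks = [
        # A real description would also carry the suggested investigation approach.
        {"subject": f"Investigate: {h}", "description": h}
        for h in hypotheses
    ]
    listing = "\n".join(
        f"{i}. {h} (debugger-{i})" for i, h in enumerate(hypotheses, 1)
    )
    tasks.append({"subject": "Critique all hypotheses", "description": listing})
    return tasks
```

The returned dictionaries only describe the tasks; the orchestrator still issues the actual TaskCreate calls.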

### Phase 3: Spawn Agents

Spawn **all agents in parallel** (single message with multiple Agent tool calls).

For **each hypothesis**, spawn a debugger:

```
Agent tool:
  subagent_type: "debugger-teammate"
  team_name: "ultra-debug-<name>"
  name: "debugger-N"
  prompt: |
    You are debugger-N in team ultra-debug-<name>.

    YOUR HYPOTHESIS:
    <hypothesis description>

    PROBLEM CONTEXT:
    <gathered context from Phase 0>

    YOUR TEAMMATES:
    - Other debuggers: <list names and their hypotheses>
    - Critic: critic-1

    YOUR TASK ID: <task-id>

    Begin investigation. Share findings via SendMessage with critic-1
    and relevant debuggers.
```

Spawn **one critic**:

```
Agent tool:
  subagent_type: "critic-teammate"
  team_name: "ultra-debug-<name>"
  name: "critic-1"
  prompt: |
    You are critic-1 in team ultra-debug-<name>.

    PROBLEM CONTEXT:
    <gathered context from Phase 0>

    HYPOTHESES UNDER INVESTIGATION:
    1. <hypothesis 1> (investigated by debugger-1)
    2. <hypothesis 2> (investigated by debugger-2)
    3. <hypothesis 3> (investigated by debugger-3)
    4. <hypothesis 4> (investigated by debugger-4)
    5. <hypothesis 5> (investigated by debugger-5)

    YOUR TASK ID: <task-id>

    Wait for debuggers to share initial findings, then begin your
    adversarial review. Continue for up to 5 rounds, stopping early
    once every hypothesis is either proved or disproved.
```
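Filling in the two prompt templates for every hypothesis is mechanical, so it can be sketched as a function that prepares all agent specifications before they are spawned in a single message. This is a sketch under assumptions: `build_agent_specs` and its parameters are hypothetical, and the returned dictionaries simply mirror the fields shown in the templates above.

```python
def build_agent_specs(
    team: str,
    context: str,
    hypotheses: list[str],
    task_ids: list[str],
    critic_task_id: str,
) -> list[dict[str, str]]:
    """Return one debugger spec per hypothesis, followed by the critic spec."""
    names = [f"debugger-{i}" for i in range(1, len(hypotheses) + 1)]
    specs = []
    for name, hypothesis, task_id in zip(names, hypotheses, task_ids):
        # Every debugger is told about the other debuggers, never itself.
        others = "\n".join(
            f"  - {n}: {h}" for n, h in zip(names, hypotheses) if n != name
        )
        prompt = (
            f"You are {name} in team {team}.\n\n"
            f"YOUR HYPOTHESIS:\n{hypothesis}\n\n"
            f"PROBLEM CONTEXT:\n{context}\n\n"
            f"YOUR TEAMMATES:\n- Other debuggers:\n{others}\n- Critic: critic-1\n\n"
            f"YOUR TASK ID: {task_id}\n\n"
            "Begin investigation. Share findings via SendMessage with critic-1 "
            "and relevant debuggers."
        )
        specs.append({
            "subagent_type": "debugger-teammate",
            "team_name": team,
            "name": name,
            "prompt": prompt,
        })

    assignments = "\n".join(
        f"{i}. {h} (investigated by {n})"
        for i, (n, h) in enumerate(zip(names, hypotheses), 1)
    )
    critic_prompt = (
        f"You are critic-1 in team {team}.\n\n"
        f"PROBLEM CONTEXT:\n{context}\n\n"
        f"HYPOTHESES UNDER INVESTIGATION:\n{assignments}\n\n"
        f"YOUR TASK ID: {critic_task_id}\n\n"
        "Wait for debuggers to share initial findings, then begin your "
        "adversarial review."
    )
    specs.append({
        "subagent_type": "critic-teammate",
        "team_name": team,
        "name": "critic-1",
        "prompt": critic_prompt,
    })
    return specs
```

Building the full list first makes it easier to issue every Agent tool call in one message, as this phase requires.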

### Phase 4: Monitor & Facilitate

1. **Wait** for agents to send findings — messages from teammates arrive automatically
2. If a debugger discovers evidence relevant to another's hypothesis, relay it if they haven't communicated directly
3. If agents are stuck (idle without progress), send guidance or additional context
4. Allow **up to 5 rounds** of challenge-response between debuggers and critic, stopping early once every hypothesis is either proved or disproved
5. If a new hypothesis emerges during investigation, spawn an additional debugger if warranted
6. Track progress via TaskList

### Phase 5: Synthesize Report

After all agents report their final assessments:

1. Collect all findings, evidence, and debate outcomes
2. Determine the consensus:
   - If exactly one hypothesis **SURVIVED** the critic's challenges: status = **CONFIRMED**
   - If more than one survived, or none did: status = **INCONCLUSIVE**
3. Write the report to `.claude-works/ultra-debug-<name>/report.md`
4. Present the report to the user
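The consensus rule in step 2 can be written as a small function. A minimal sketch, assuming each hypothesis's final assessment has already been reduced to the string `SURVIVED` or `DISPROVED`; `report_status` is a hypothetical name.

```python
from typing import Optional


def report_status(verdicts: dict[str, str]) -> tuple[str, Optional[str]]:
    """Map per-hypothesis verdicts to the report status and the confirmed hypothesis.

    `verdicts` maps a hypothesis name to "SURVIVED" or "DISPROVED" (assumed labels).
    """
    survivors = [name for name, verdict in verdicts.items() if verdict == "SURVIVED"]
    if len(survivors) == 1:
        return "CONFIRMED", survivors[0]
    # Zero survivors or several survivors both leave the root cause unresolved.
    return "INCONCLUSIVE", None
```

For example, `report_status({"stale cache": "SURVIVED", "race": "DISPROVED"})` yields `("CONFIRMED", "stale cache")`, while two survivors or none yield `("INCONCLUSIVE", None)`.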

## Report Template

```markdown
# <title>

**Date**: YYYY-MM-DD
**Status**: CONFIRMED | INCONCLUSIVE

## Problem Statement

<What is broken. When it started. Who is affected. How it manifests.>

## Root Cause

<If CONFIRMED: Single definitive statement of the root cause, with primary evidence reference.>
<If INCONCLUSIVE: What was eliminated and what remains unresolved.>

## Evidence

| # | Finding | Source |
|---|---------|--------|
| 1 | <concrete finding> | <file:line / log timestamp / query / Sentry event ID> |
| 2 | ... | ... |

## Investigation Timeline

| Hypothesis | Verdict | Summary |
|------------|---------|---------|
| <hypothesis 1> | CONFIRMED / DISPROVED | <one line> |
| <hypothesis 2> | CONFIRMED / DISPROVED | <one line> |

## Debate Log

### Hypothesis 1: <name>
- **Debugger finding**: <summary with evidence refs>
- **Critic challenge**: <the challenge and its basis>
- **Resolution**: <how it was resolved, with evidence>

### Hypothesis 2: <name>
...

## Recommendations

<Concrete, actionable next steps based on findings. Reference specific code locations.>
```
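The Evidence table in the template can be generated from collected findings rather than typed by hand. The helper below is a hypothetical sketch, assuming findings arrive as `(finding, source)` pairs; it escapes pipe characters so a finding cannot break the table layout.

```python
def evidence_table(rows: list[tuple[str, str]]) -> str:
    """Render (finding, source) pairs as the numbered Evidence table."""

    def cell(text: str) -> str:
        # Pipes would split the cell and newlines would end the row.
        return text.replace("|", "\\|").replace("\n", " ")

    lines = ["| # | Finding | Source |", "|---|---------|--------|"]
    for number, (finding, source) in enumerate(rows, 1):
        lines.append(f"| {number} | {cell(finding)} | {cell(source)} |")
    return "\n".join(lines)
```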

## Strict Rules

### Language Rules

The final report MUST NOT contain any of these hedge words or phrases:

> likely, unlikely, maybe, perhaps, possibly, probably, might, could (expressing uncertainty),
> appears to, seems to, seems like, it looks like, we think, we believe,
> should be (expressing uncertainty), in theory

Every statement must be either:
- A **fact** backed by cited evidence in the Evidence table, OR
- Explicitly labeled as **[UNVERIFIED]** with a note on what evidence is missing
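The banned-phrase list lends itself to an automated check before the report is presented. The following is a minimal sketch, not part of the skill: `find_hedges` is a hypothetical helper, and because a regular expression cannot tell whether "could" or "should be" expresses uncertainty, it flags every occurrence for manual review.

```python
import re

# Phrases taken from the list above.
HEDGES = [
    "likely", "unlikely", "maybe", "perhaps", "possibly", "probably",
    "might", "could", "appears to", "seems to", "seems like",
    "it looks like", "we think", "we believe", "should be", "in theory",
]

# Longest phrases first so multi-word phrases win over shorter alternatives.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(h) for h in sorted(HEDGES, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def find_hedges(report: str) -> list[tuple[int, str]]:
    """Return (line number, phrase) for every hedge phrase found in the report."""
    hits = []
    for lineno, line in enumerate(report.splitlines(), 1):
        for match in _PATTERN.finditer(line):
            hits.append((lineno, match.group(1).lower()))
    return hits
```

An empty result means the report passes the word-level check. It does not show that every statement is backed by evidence, so the Evidence table still needs a manual review.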

### Evidence Rules

- Code references: markdown link with `file:line` as link text and GitHub permalink with commit SHA as URL (e.g., `[/path/to/file.ts:142](https://github.com/org/repo/blob/<commit-sha>/path/to/file.ts#L142)`). Run `git rev-parse HEAD` to get the current commit SHA.
- Log references: timestamp and log source
- Database findings: the exact query used (reproducible)
- Sentry references: event ID or issue ID with link
- Git references: commit hash
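The code-reference format can be produced programmatically. This sketch assumes a GitHub-hosted repository; `current_commit_sha` and `code_reference` are hypothetical helper names, and the `org/repo` value must be supplied by the caller.

```python
import subprocess


def current_commit_sha(repo_dir: str = ".") -> str:
    """Return the commit SHA of HEAD by running `git rev-parse HEAD`."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def code_reference(org_repo: str, sha: str, path: str, line: int) -> str:
    """Build a markdown link with file:line as text and a pinned permalink as URL."""
    relative = path.lstrip("/")
    url = f"https://github.com/{org_repo}/blob/{sha}/{relative}#L{line}"
    # The link text keeps a leading slash, matching the example in the rule above.
    return f"[/{relative}:{line}]({url})"
```

Pinning the URL to a commit SHA rather than a branch name keeps the line number valid after later commits change the file.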

### Scope Rules

- Do NOT fix the bug — only identify the root cause
- Do NOT modify any source files
- Do NOT speculate about fixes beyond the Recommendations section
- The report is the sole deliverable

Files: 1

Size: 6.6 KB

Complexity: 15/100

Category: Writing & Docs

Source: https://github.com/swen128/claude-plugins/tree/main/ultra-debug/skills/ultra-debug
