wio
Testing workflow skill for finding high-value test candidates, writing focused tests, generating realistic workloads, reviewing test value, and diagnosing test-suite health. Use for prompts about what to test next, adding or improving tests, reviewing whether a test is worth keeping, low-signal or flaky tests, workload scenarios, or test-suite trust. Do not use for general QA discussion unless the user wants a concrete testing workflow or artifact.
What this skill does
# WIO

WIO is one testing workflow skill with five command modes:

- `scan`: find the highest-value test candidates for a codebase, change, or scope.
- `test`: write one focused high-value test for a selected behavior, code path, or regression risk.
- `workload`: generate a realistic, adversarial workload that adds a new failure surface, oracle, sequence, or coverage dimension with controlled variance, replay, and correctness invariants.
- `review`: review a newly written or existing test for customer value, developer value, signal quality, and maintainability.
- `doctor`: diagnose test-suite health problems in a codebase or scope.

Commands are accessed through `$wio`:

| Command | What it does | Default reference |
| --- | --- | --- |
| `$wio scan [target]` | Find the highest-value test candidates for a codebase, change, or scope. | [Behavior To Test Map](references/behavior-to-test-map/overview.md) |
| `$wio test [target]` | Discover a valuable candidate, pick strategy, write one test, validate, review, and keep only if valuable. | [Test Level Selection](references/test-level-selection/overview.md) |
| `$wio workload [target]` | Generate a realistic workload that adds new bug-finding value beyond existing workloads, with important user tasks, adversarial edge cases, assertions, invariants, and controlled variance. | [Workload Modeling](references/workload-modeling/overview.md) |
| `$wio review [target]` | Review a test for meaningful customer or developer value and return `KEEP`, `REDO`, or `REMOVE`. | [Test Oracles And Assertions](references/test-oracles-and-assertions/overview.md) |
| `$wio doctor [target]` | Diagnose test-suite health problems in a codebase or scope. | [Test Suite Health Diagnostics](references/test-suite-health-diagnostics/overview.md) |

Use [references/index.md](references/index.md) to route from code evidence and candidate failure modes to the right strategy references.
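As an illustration of what the `workload` mode asks for (controlled variance, replay, and correctness invariants), here is a minimal Python sketch. The `Inventory` class, its methods, and `run_workload` are hypothetical names invented for this example; they are not part of WIO or of any real codebase.

```python
import random


class Inventory:
    """Hypothetical system under test: a tiny stock reservation counter."""

    def __init__(self, stock):
        self.stock = stock
        self.reserved = 0

    def reserve(self, n):
        # Reject non-positive amounts and amounts beyond the available stock.
        if n <= 0 or n > self.stock - self.reserved:
            return False
        self.reserved += n
        return True

    def release(self, n):
        # Reject non-positive amounts and amounts beyond what is reserved.
        if n <= 0 or n > self.reserved:
            return False
        self.reserved -= n
        return True


def run_workload(seed, steps=200):
    """Run a seeded operation sequence and check the invariant after each step."""
    rng = random.Random(seed)  # controlled variance: same seed, same sequence
    inv = Inventory(stock=10)
    trace = []
    for _ in range(steps):
        op = rng.choice(["reserve", "release"])
        n = rng.choice([-1, 0, 1, 3, 10, 11])  # mixes ordinary and adversarial amounts
        ok = getattr(inv, op)(n)
        trace.append((op, n, ok))
        # Correctness invariant: reservations never go negative or exceed stock.
        assert 0 <= inv.reserved <= inv.stock, f"seed={seed} trace={trace}"
    return trace
```

If the invariant ever fails, the assertion message carries the seed and the operation trace, so the same sequence can be replayed by calling `run_workload` with that seed.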
## Reference Loading

Do not pick a test strategy from memory or from the nearest existing test alone. First inspect the target code, public behavior, existing tests, fixtures, and test commands. Then identify candidate behaviors or workloads and their likely failure mechanisms. Only after candidates exist, load the references needed to choose the strategy.

For every selected candidate, load at least one strategy reference that matches the failure mechanism before recommending or writing a test:

- Load [Risk-Based Testing](references/risk-based-testing/overview.md) when priorities, customer impact, security/business risk, or limited test capacity decide what comes first.
- Load [User Behavior Testing](references/user-behavior-testing/overview.md) when the behavior is a user journey, product workflow, API consumer flow, or operator task.
- Load [Test Level Selection](references/test-level-selection/overview.md) before choosing unit, component, integration, contract, E2E, workload, synthetic, or monitoring coverage.
- Load [Test Oracles And Assertions](references/test-oracles-and-assertions/overview.md) before writing or reviewing assertions, invariants, snapshots, workload checks, or any test whose failure signal is unclear.
- Load [Test Data And Fixtures](references/test-data-and-fixtures/overview.md) when state setup, seeds, factories, cleanup, permissions, or data realism affect signal.
- Load [Mocking And Test Doubles](references/mocking-and-test-doubles/overview.md) when a mock, fake, stub, emulator, or real dependency decision could change what risk is preserved.
- Load [Test Feedback Loops](references/test-feedback-loops/overview.md) when choosing local, PR CI, nightly, release, canary, synthetic, or production-monitoring placement.
- Load specialized references only when the fault mechanism calls for them: property-based testing for broad deterministic input spaces, fuzzing for parsers/untrusted input, mutation testing for weak assertions, performance testing for latency/saturation, resilience testing for dependency failure, security testing for abuse or tenant/auth risk, static analysis for code/config-shape defects, and regression selection when the full suite is too slow.

For commands and repo signals in a topic, load the sibling `tools.md` after the matching `overview.md` shows that topic is relevant. State which reference files informed the chosen strategy.

## Command Selection

Use `scan` when the user asks what to test next, where coverage would matter, how to prioritize testing work, or which tests would reduce user, production, support, or team risk.

Use `test` when the user asks to add a test, improve a specific test, cover a bug, or validate a change with a meaningful automated test.

Use `workload` when the user asks for a realistic user-session workload, scenario generator, traffic model, load/performance scenario, browser journey mix, synthetic user flow, or varied workload that still preserves a stable task goal. If the user asks to `generate` a workload, treat existing workloads as evidence and reusable infrastructure, not as the deliverable. A generated workload must add at least one new failure surface, adversarial class, oracle/invariant, state model, dependency fault, user/session path, or coverage dimension. A wrapper, runner, seed sweep, parameter expansion, or documentation-only change around an existing workload is not a generated workload unless the user explicitly asked for a runner or the wrapper adds a new oracle or adversarial model.

Use `review` when the user asks whether a test is worth keeping, asks for test review, or after `$wio test` writes or changes a test.

Use `doctor` when the user asks to audit tests, review suite quality, find flaky or low-value tests, inspect CI test health, or explain why a test suite is slow, noisy, or low-signal.

If the user explicitly names a WIO command, follow that mode. If the command is omitted, infer the mode from the request. If no command or target is provided, show the command table and ask what they want to do. If multiple modes apply, start with `scan` before `test` or `workload`, and use `doctor` only for existing suite health.

## Shared Rules

- Protect meaningful behavior, not coverage numbers.
- Ask what could go wrong before asking what already broke; use that answer to choose tests while the design is still cheap to change.
- Establish product, user, production, support, debugging, review, or release risk before recommending or writing tests.
- Prefer targets where bugs usually occur: boundaries, permissions, state transitions, persistence, external dependencies, concurrency/time, validation/parsing, migrations, configuration, caching, retries/idempotency, UI workflow joins, and recent churn.
- A test is valuable only if it would catch a meaningful regression, save developer time, improve release confidence, or expose a real operational/customer failure mode.
- Before keeping a test or workload, name at least one plausible bug it would catch and the assertion or invariant that would fail.
- Prefer assertions and invariants that encode the mental model of the behavior, including valid cases, invalid cases, and boundary transitions between them.
- Prefer repo-native frameworks, helpers, fixtures, commands, and naming.
- Choose the narrowest test level that preserves the real failure mechanism.
- Read code and tests before choosing strategy; load targeted references after candidate failure modes are known.
- State evidence inspected, commands run, commands not run, and residual risk.
- Mark low-value tests `REDO` or `REMOVE`, not `KEEP`.
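The rules about naming a plausible bug and asserting valid cases, invalid cases, and the boundaries between them can be sketched in Python as follows. `is_valid_quantity` and `MAX_QTY` are hypothetical stand-ins for real code under test, and plain `assert` statements stand in for whatever repo-native framework is in use.

```python
MAX_QTY = 100  # hypothetical upper limit, assumed for illustration


def is_valid_quantity(qty):
    """Hypothetical rule: an integer quantity from 1 to MAX_QTY inclusive."""
    if isinstance(qty, bool) or not isinstance(qty, int):
        return False
    return 1 <= qty <= MAX_QTY


def test_quantity_boundaries():
    # Plausible bug this would catch: an off-by-one comparison (`<` instead
    # of `<=`) that rejects MAX_QTY or accepts 0.
    assert is_valid_quantity(1)                # lowest valid value
    assert is_valid_quantity(MAX_QTY)          # highest valid value
    assert not is_valid_quantity(0)            # just below the valid range
    assert not is_valid_quantity(MAX_QTY + 1)  # just above the valid range
    assert not is_valid_quantity(-1)           # clearly invalid
    assert not is_valid_quantity(1.5)          # wrong type
```

Each assertion sits on one side of a boundary, so a failure points at the transition that regressed.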
## Gotchas

- Do not write or keep tests just to increase coverage. Covered code with weak assertions is false confidence.
- Do not mock away the boundary, state, permission, timing, data, or dependency behavior that creates the real risk.
- Do not accept broad snapshots unless the reviewed snapshot is the protected contract and the update path is disciplined.
- Do not use a full-suite command when a smaller command validates th