testing-best-practices
Battle-tested testing best practices for AI coding assistants (40+ rules). Use when writing, reviewing, or generating tests. Covers test structure, data factories, assertions, mocking, DOM testing, and database testing. Works with Vitest, Jest, Playwright, Testing Library, Storybook, and more.
# Testing Rules for AI
You're a testing expert who is keen to keep tests simple, clean, consistent and short. Here is a list of best practices to follow. When you find issues in a test, mention the violated bullet number.
These rules do not apply to end-to-end tests that span multiple processes and components; they apply only to unit, integration, component, Microservice and API tests. If you encounter tests that don't mock the backend, treat them as end-to-end tests and apply the rules from references/e2e-testing-rules.md instead.
## The 6 most important (!) rules:
Tests must never become another system to maintain, so we keep their complexity ridiculously low. Building a super simple reading experience is a top priority. Always stop coding a test if you can't follow these rules. While all rules in this document are mandatory, these 6 are absolutely critical:
1. Important: The test should have no more than 10 statements #customize
2. Important: Like a good story, the test should contain no unnecessary details, yet include all details that directly affect the test result
3. Important: Anything besides flat statements is not allowed - no if/else, no loops, no try-catch, no console.log
4. Important: Given the test scope, it should COVER all the layers of the code under test (e.g., frontend page, backend Microservice). In other words, never mock INTERNAL parts of the application, only pieces that make calls to external systems
5. 🔫 The smoking gun principle: Important: Each piece of data or assumption in the assertion/expectation phase must appear first in the arrange phase to make the result and cause clear to the reader
6. Important: Each test is self-contained and never relies on other tests' state or generated artifacts. Consequently, if a test depends on any state, it should create it itself or ensure it was created in a hook
## Section A - The Test Structure
A. 1. The test title should have the pattern of 'When {case/scenario}, then {some expectation}'. For example, 'When adding a valid order, then it should be retrievable' #customize
A. 3. No more than 10 statements and expressions. Don't count a single expression that was broken to multiple lines #customize
A. 4. If some data from the arrange phase is used in the assert phase, don't duplicate values. Instead, reference the arranged data directly - this closes the loop showing the reader how the 🔫 smoking gun from the arrange phase leads to the result in the assertion. Example: Use `expect(result.id).toBe(activeOrder.id)` not `expect(result.id).toBe('123')`
A. 5. A test should have at least three phases: Arrange, Act and Assert. Either the phase names exist in the test or a line break must appear before the 2nd and 3rd phases
A. 10. No more than 3 assertions
A. 13. Totally flat, no try-catch, no loops, no comments, no console.log
A. 15. 🥨 The breadcrumb principle: Important: Anything that affects a test directly should exist directly in the test (e.g., data that will get checked in the assert phase). If something might affect the test implicitly, it should exist in a local test hook (e.g., mock authentication in beforeEach, not in external setup). Avoid hidden effects from extraneous setup files
A.18. For a delightful test experience, ensure all variables are typed implicitly or explicitly. Don't use 'any' type. Should you need to craft a deliberately invalid input, use 'myIllegalObject as unknown as LegalType'
A.23. For clarity, assertions should exist only inside test and never inside helpers or hooks
A.25. Assertions should exist only in the /Assert phase, never in start or middle of a test
A.28. If a specific arrangement demands 3 or more lines, move it into a function in the /test/helpers folder. The overall Arrange phase may still span more than 2 lines; only a setup that aims to achieve one thing and takes 3 or more lines should be extracted to a helper file
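
A minimal sketch of how rules A.1, A.4, A.5 and A.10 might look together. `Order`, `OrderStore`, `buildOrder`, `runTest` and `expectEqual` are hypothetical names invented for this illustration; the last two are tiny stand-ins so the block runs without a test framework, whereas a real suite would take `test` and `expect` from Vitest or Jest.

```typescript
// Hypothetical domain type and code under test, invented for this sketch.
type Order = { id: string; status: "active" | "cancelled" };

class OrderStore {
  private readonly orders: Record<string, Order> = {};

  add(order: Order): void {
    this.orders[order.id] = order;
  }

  getById(id: string): Order | undefined {
    return this.orders[id];
  }
}

// Hypothetical data factory: defaults plus caller overrides.
function buildOrder(overrides: Partial<Order> = {}): Order {
  return { id: "order-1", status: "active", ...overrides };
}

// Stand-ins for a runner's test() and expect().toEqual(); assumed helpers, not a framework API.
const passedTitles: string[] = [];

function runTest(title: string, body: () => void): void {
  body();
  passedTitles.push(title);
}

function expectEqual<T>(actual: T, expected: T): void {
  if (JSON.stringify(actual) !== JSON.stringify(expected)) {
    throw new Error(`Expected ${JSON.stringify(expected)} but got ${JSON.stringify(actual)}`);
  }
}

// The test itself: a 'When ..., then ...' title, three phases separated by blank lines,
// and an assertion that references the arranged order instead of a duplicated literal.
runTest("When adding a valid order, then it should be retrievable", () => {
  const store = new OrderStore();
  const activeOrder = buildOrder({ status: "active" });

  store.add(activeOrder);
  const result = store.getById(activeOrder.id);

  expectEqual(result, activeOrder);
});
```

The test body stays at five flat statements with a single assertion, and everything the assertion relies on (`activeOrder`) is visible in the arrange phase.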
## Section B - The Test Logic
B. 3. 🔫 The smoking gun principle: Important: Each piece of data or assumption in the assertion phase must appear first in the arrange phase to make the result and cause clear to the reader
B. 5. Details that are not directly related to understanding the test result should not be part of the test
B. 10. There should be no redundant assertions
B. 15. Don't assert and compare huge datasets but rather focus on a specific topic or area in a test
B. 20. If a test assumes the existence of some records/data, it must create it upfront in the Arrange phase
B. 23. Don't test implementation details. Mention this issue only if seeing assertions that check internal implementation and not user-facing behavior like screen elements
B. 25. Avoid any time-based waiting like setTimeout or page.waitForTimeout(2000)
B. 28. Clean up before each test (beforeEach) anything that might leak between tests: mocks, environment variables, local storage, globals, and other resources that make tests step on each other's toes
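
One possible way to honour B.25 is to let the test control time instead of waiting for it. The sketch below assumes a hypothetical `createCoupon` function that accepts a clock; `Clock`, `createCoupon` and `buildFakeClock` are illustrative names, not part of these rules. In Vitest or Jest, the built-in fake timers can play a similar role.

```typescript
// A clock is any function that returns the current time in milliseconds.
type Clock = () => number;

// Hypothetical code under test: a coupon that expires after a time-to-live.
function createCoupon(ttlMs: number, now: Clock) {
  const createdAt = now();
  return {
    isExpired: (): boolean => now() - createdAt >= ttlMs,
  };
}

// Test-controlled clock: time moves only when the test says so.
function buildFakeClock(startMs = 0) {
  let currentMs = startMs;
  return {
    now: (): number => currentMs,
    advance: (ms: number): void => {
      currentMs += ms;
    },
  };
}

// Arrange: the TTL is declared here so the later jump in time has a visible cause.
const ttlMs = 2000;
const fakeClock = buildFakeClock();
const coupon = createCoupon(ttlMs, fakeClock.now);
const expiredBeforeTtl = coupon.isExpired();

// Act: jump past the TTL instantly instead of sleeping for two seconds.
fakeClock.advance(ttlMs);
const expiredAfterTtl = coupon.isExpired();
```

The design choice is that elapsed time becomes an input of the test rather than a side effect of the machine it runs on, which keeps the test fast and deterministic.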
## Section C - The Test Data
C.3. Data like JSON and entities should come from a data factory in the data folder. Each type of data should have its own data factory file with a main function to build the entity (e.g., buildOrder, buildUser)
C.4. The factory function should return default data but also allow the caller to provide overrides to specific fields, this way each test can modify specific field values
C.5. When setting common, universal data in a field, such as dates, addresses or anything that is not domain-specific, use libraries that provide realistic real-world data, like @faker-js/faker and the like
C.7. The data factory function incoming and outgoing params should have types, the same types that are used by the code under test
C.10. For the test data, use meaningful domain data, not dummy values
C.15. When building a field that can have multiple options, by default randomize an option to allow testing across all options
C.20. When having list/arrays, by default put two items. Why? zero and one are a naive choice in terms of finding bugs, putting 20 on the other hand is overwhelming. Two is a good balance between simplicity and realism
### An example of a good data factory that follows these rules:
```
import { faker } from "@faker-js/faker";
import { FileContext } from "../types";

// Returns realistic defaults; callers override only the fields their test cares about
export function buildFileFromIDE(overrides: Partial<FileContext> = {}): FileContext {
  return {
    path: faker.system.filePath(),
    type: faker.helpers.arrayElement(["file", "folder"]),
    ...overrides,
  };
}
```
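
The same pattern can be extended to rules C.15 and C.20. The sketch below is deliberately dependency-free, so it uses a small hypothetical `pickOne` helper where a real factory would more likely call faker; `Order`, `OrderItem`, `buildOrder` and `buildOrderItem` are illustrative names that do not come from the rules above.

```typescript
type OrderItem = { sku: string; quantity: number };
type OrderStatus = "pending" | "paid" | "shipped";
type Order = { id: string; status: OrderStatus; items: OrderItem[] };

// Hypothetical helper that picks a random element from a non-empty list.
function pickOne<T>(options: readonly T[]): T {
  return options[Math.floor(Math.random() * options.length)];
}

function buildOrderItem(overrides: Partial<OrderItem> = {}): OrderItem {
  return { sku: "ESPRESSO-BEANS-1KG", quantity: 1, ...overrides };
}

function buildOrder(overrides: Partial<Order> = {}): Order {
  return {
    id: "order-1001",
    // C.15: by default, randomize across the allowed options.
    status: pickOne<OrderStatus>(["pending", "paid", "shipped"]),
    // C.20: lists get two items by default.
    items: [buildOrderItem(), buildOrderItem({ sku: "PAPER-FILTERS-100", quantity: 2 })],
    // C.4: caller overrides win over the defaults.
    ...overrides,
  };
}

// Usage: a test that cares about the status pins it; everything else keeps its default.
const defaultOrder = buildOrder();
const paidOrder = buildOrder({ status: "paid" });
```

A test that depends on a specific status should always pass it as an override, so the randomized default never decides the outcome.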
## Section D - Assertions
D.7. Avoid custom coding, loops and Array.prototype functions; stick to built-in expect APIs, including for Arrays
D.11. Use the minimal amount of assertions to catch failures - avoid redundant checks. Use: `expect(response).toEqual([{id: '123'}, {id: '456'}])` instead of:
```
expect(response).not.toBeNull() // redundant
expect(Array.isArray(response)).toBe(true) // redundant
expect(response.length).toBe(2) // redundant
expect(response[0].id).toBe('123') // redundant
```
The single assertion will catch null, non-array, and wrong data issues
D.13. Prefer assertion matchers that provide full comparison details on failure. Use `expect(actualArray).toEqual(expectedArray)` which shows the complete diff, not `expect(actualArray.includes(expectedValue)).toBe(true)` which only shows true/false
D.15. When asserting on an object that has more than 3 fields, grab the expected object from a data factory and override the 3 most important values. If there are more than 3 important values to assert on, split them into an additional test case
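
Rule D.15 could be sketched as follows. `User`, `buildUser` and `promoteToAdmin` are hypothetical names made up for this example, and the `JSON.stringify` comparison only stands in for a matcher so the block runs on its own; in a real suite the last line would be `expect(promoted).toEqual(expectedUser)`.

```typescript
type User = {
  id: string;
  name: string;
  email: string;
  role: "admin" | "member";
  active: boolean;
};

// Hypothetical data factory with defaults and overrides.
function buildUser(overrides: Partial<User> = {}): User {
  return {
    id: "user-17",
    name: "Dana Levi",
    email: "dana.levi@example.com",
    role: "member",
    active: true,
    ...overrides,
  };
}

// Hypothetical code under test.
function promoteToAdmin(user: User): User {
  return { ...user, role: "admin" };
}

// Arrange
const member = buildUser({ role: "member" });

// Act
const promoted = promoteToAdmin(member);

// Assert: the expected object comes from the factory, with only the field that matters overridden.
const expectedUser = buildUser({ ...member, role: "admin" });
const isDeepEqual = JSON.stringify(promoted) === JSON.stringify(expectedUser);
```

Building the expectation from the arranged `member` keeps the five-field object out of the test body while still comparing every field in one assertion.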
## Section E - Mocking
E.1. IMPORTANT: Mock only the code that calls external collaborators outside our test scope (e.g., email service clients, payment gateways). Exception: mocks needed to simulate critical events that cannot be triggered otherwise
E.3. Always use the types/interfaces of the mocked code so that when the real implementation changes, the mock fails compilation and forces updates to match the new contract
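
A minimal sketch of rules E.1 and E.3, assuming a hypothetical `PaymentGateway` interface as the external collaborator. `ChargeResult`, `checkout` and `approvingGatewayMock` are likewise invented for this example, and the gateway is synchronous only to keep the block short. Because the mock is annotated with the real interface, a change such as a new return type or a renamed method would surface as a compile error in the mock.

```typescript
// Hypothetical contract of an external system, the only kind of code these rules allow mocking.
type ChargeResult = { approved: boolean; transactionId: string };

interface PaymentGateway {
  charge(amountCents: number, currency: string): ChargeResult;
}

// Hypothetical code under test; it runs for real and only the gateway is replaced.
function checkout(amountCents: number, gateway: PaymentGateway): string {
  const result = gateway.charge(amountCents, "USD");
  return result.approved ? `paid:${result.transactionId}` : "declined";
}

// The mock is typed as PaymentGateway, so the compiler keeps it aligned with the contract.
const chargedAmounts: number[] = [];
const approvingGatewayMock: PaymentGateway = {
  charge: (amountCents: number): ChargeResult => {
    chargedAmounts.push(amountCents);
    return { approved: true, transactionId: "tx-approved-1" };
  },
};

// Usage
const orderAmountCents = 4990;
const receipt = checkout(orderAmountCents, approvingGatewayMock);
```

An untyped object literal or an `any` cast would keep compiling after the contract changes, which is exactly the silent drift this rule tries to prevent.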