Claude
Skills
Sign in
Back

adding-dbt-unit-test

Included with Lifetime
$97 forever

Creates unit test YAML definitions that mock upstream model inputs and validate expected outputs. Use when adding unit tests for a dbt model or practicing test-driven development (TDD) in dbt.

Code Review

What this skill does


# Add unit test for a dbt model

## Additional Resources

- [Spec Reference](references/spec.md) - All required and optional YAML keys for unit tests
- [Examples](references/examples.md) - Unit test examples across formats (dict, csv, sql)
- [Incremental Models](references/special-cases-incremental-model.md) - Unit testing incremental models
- [Ephemeral Dependencies](references/special-cases-ephemeral-dependency.md) - Unit testing models depending on ephemeral models
- [Special Case Overrides](references/special-cases-special-case-overrides.md) - Introspective macros, project variables, environment variables
- [Versioned Models](references/special-cases-versioned-model.md) - Unit testing versioned SQL models
- [BigQuery Caveats](references/warehouse-bigquery-caveats.md) - BigQuery-specific caveats
- [BigQuery Data Types](references/warehouse-bigquery-data-types.md) - BigQuery data type handling
- [Postgres Data Types](references/warehouse-postgres-data-types.md) - Postgres data type handling
- [Redshift Caveats](references/warehouse-redshift-caveats.md) - Redshift-specific caveats
- [Redshift Data Types](references/warehouse-redshift-data-types.md) - Redshift data type handling
- [Snowflake Data Types](references/warehouse-snowflake-data-types.md) - Snowflake data type handling
- [Spark Data Types](references/warehouse-spark-data-types.md) - Spark data type handling

## What are unit tests in dbt

dbt unit tests validate SQL modeling logic on static inputs before materializing in production. If any unit test for a model fails, dbt will not materialize that model.

## When to use

You should unit test a model:
- Adding Model-Input-Output scenarios for the intended functionality of the model as well as edge cases to prevent regressions if the model logic is changed at a later date.
- Verifying that a bug fix solves a bug report for an existing dbt model.

More examples:
- When your SQL contains complex logic:
    - Regex
    - Date math
    - Window functions
    - `case when` statements when there are many `when`s
    - Truncation
    - Complex joins (multiple joins, self-joins, or joins with non-trivial conditions)
- When you're writing custom logic to process input data, similar to creating a function.
- Logic for which you had bugs reported before.
- Edge cases not yet seen in your actual data that you want to be confident you are handling properly.
- Prior to refactoring the transformation logic (especially if the refactor is significant).
- Models with high "criticality" (public, contracted models or models directly upstream of an exposure).

## When not to use

Cases we don't recommend creating unit tests for:
- Built-in functions that are tested extensively by the warehouse provider. If an unexpected issue arises, it's more likely a result of issues in the underlying data rather than the function itself. Therefore, fixture data in the unit test won't provide valuable information.
    - common SQL spec functions like `min()`, etc.

## General format

dbt unit test uses a trio of the model, given inputs, and expected outputs (Model-Inputs-Outputs):

1. `model` - when building this model
2. `given` inputs - given a set of source, seeds, and models as preconditions
3. `expect` output - then expect this row content of the model as a postcondition

### Workflow

### 1. Choose the model to test

Self explanatory -- the title says it all!

### 2. Mock the inputs

- Create an input for each of the nodes the model depends on.
- Specify the mock data it should use.
- Specify the `format` if different than the default (YAML `dict`).
  - See the "Data `format`s for unit tests" section below to determine which `format` to use.
- The mock data only needs include the subset of columns used within this test case.

**Tip:** Use `dbt show` to explore existing data from upstream models or sources. This helps you understand realistic input structures. However, always sanitize the sample data to remove any sensitive or PII information before using it in your unit test fixtures.

```shell
# Preview upstream model data
dbt show --select upstream_model --limit 5
```

### 3. Mock the output

- Specify the data that you expect the model to create given those inputs.
- Specify the `format` if different than the default (YAML `dict`).
  - See the "Data `format`s for unit tests" section below to determine which `format` to use.
- The mock data only needs include the subset of columns used within this test case.

### 4. Ensure upstream models exist before running

**Unit tests require direct parent models to exist in the warehouse.** Before running unit tests standalone (`dbt test`), verify that upstream models already exist first:

```bash
# Check if upstream models exist in the warehouse
dbt list --select +my_model --exclude my_model --resource-type model
# Then verify the tables/views actually exist in the warehouse via dbt show or your SQL client
dbt show --select upstream_model --limit 1
```

If upstream models **do not exist**, or **exist but have been modified and not yet refreshed**, build them using `--empty` to create schema-only versions:

```bash
# Build upstream models cheaply (schema only, no data read)
dbt run --select +my_model --exclude my_model --empty
```

> **Warning:** `--empty` overwrites existing models with schema-only (zero-row) versions. Only use it when models don't exist yet, or when schema changes need to be applied. Do not use it if upstream models contain data you want to preserve — it will wipe that data.

Skip this step if using `dbt build --select my_model` (recommended) — it handles the full pipeline including unit tests.

## Minimal unit test

Suppose you have this model:

```sql
-- models/hello_world.sql

select 'world' as hello
```

Minimal unit test for that model:

```yaml
# models/_properties.yml

unit_tests:
  - name: test_hello_world

    # Always only one transformation to test
    model: hello_world

    # No inputs needed this time!
    # Most unit tests will have inputs -- see the "real world example" section below
    given: []

    # Expected output can have zero to many rows
    expect:
      rows:
        - {hello: world}
```

## Executing unit tests

Run the unit tests, build the model, and run the data tests for the `hello_world` model:

```shell
dbt build --select hello_world
```

This saves on warehouse spend as the model will only be materialized and move on to the data tests if the unit tests pass successfully.

Or only run the unit tests without building the model or running the data tests:

```shell
dbt test --select "hello_world,test_type:unit"
```

Or choose a specific unit test by name:

```shell
dbt test --select test_is_valid_email_address
```

### Excluding unit tests from production builds

dbt Labs strongly recommends only running unit tests in development or CI environments. Since the inputs of the unit tests are static, there's no need to use additional compute cycles running them in production. Use them when doing development for a test-driven approach and CI to ensure changes don't break them.

Use the `--resource-type` flag `--exclude-resource-type` or the `DBT_EXCLUDE_RESOURCE_TYPES` environment variable to exclude unit tests from your production builds and save compute. 

## More realistic example

```yaml
unit_tests:

  - name: test_order_items_count_drink_items_with_zero_drinks
    description: >
      Scenario: Order without any drinks
        When the `order_items_summary` table is built
        Given an order with nothing but 1 food item
        Then the count of drink items is 0

    # Model
    model: order_items_summary

    # Inputs
    given:
      - input: ref('order_items')
        rows:
          - {
              order_id: 76,
              order_item_id: 3,
              is_drink_item: false,
            }
      - input: ref('stg_orders')
        rows:
          - { order_id: 76 }

    # Output
    expect:
      rows:
        - {
            order_id: 76,
            count_drink_item

Related in Code Review