Introduction
dbt provides the ability to test models. Tests are used to detect the inclusion of low-quality data and to maintain data quality.
There are two ways to define tests
- Singular tests
- Generic tests
After defining a test, run the command dbt test
to execute the test.
Singular tests
Singular tests are to write test code as a .sql
file in the directory specified in test-paths
in dbt_project.yml
(/tests
by default).
The test code should contain SELECT
statements that search for rows that should not exist.
For example, if you have a model called fct_payments
and you want to test if amount
is a positive value, you would write the following SQL.
-- Refunds have a negative amount, so the total amount should always be >= 0.
-- Therefore return records where this isn't true to make the test fail
select
order_id,
sum(amount)
from {{ ref('fct_payments')}}
group by 1
having not(amount >= 0)
The dbt test
command executes the test code, and if a row is retrieved, the test fails. If the test fails, it will look like this.
$ dbt test
14:19:07 Running with dbt=1.0.4
14:19:07 Found 1 model, 1 test, 0 snapshots, 0 analyses, 165 macros, 0 operations, 0 seed files, 2 sources, 0 exposures, 0 metrics
14:19:07
14:19:07 Concurrency: 1 threads (target='dev')
14:19:07
14:19:07 1 of 1 START test assert_amount_is_positive........................ [RUN]
14:19:07 1 of 1 FAIL 1 assert_amount_is_positive............................ [FAIL 1 in 0.05s]
14:19:07
14:19:07 Finished running 1 test in 0.23s.
14:19:07
14:19:07 Completed with 1 error and 0 warnings:
14:19:07
14:19:07 Failure in test assert_fct_payments (tests/assert_amount_is_positive.sql)
14:19:07 Got 1 result, configured to fail if != 0
14:19:07
14:19:07 compiled SQL at target/compiled/my_dbt_proj/tests/assert_amount_is_positive.sql
14:19:07
14:19:07 Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1
A successful test will result in the following.
$ dbt test
14:21:33 Running with dbt=1.0.4
14:21:33 Found 1 model, 1 test, 0 snapshots, 0 analyses, 165 macros, 0 operations, 0 seed files, 2 sources, 0 exposures, 0 metrics
14:21:33
14:21:33 Concurrency: 1 threads (target='dev')
14:21:33
14:21:33 1 of 1 START test assert_amount_is_positive....................... [RUN]
14:21:33 1 of 1 PASS assert_amount_is_positive............................. [PASS in 0.05s]
14:21:33
14:21:33 Finished running 1 test in 0.23s.
14:21:33
14:21:33 Completed successfully
14:21:33
14:21:33 Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
Generic Tests
Generic tests use the four standard schema tests provided by dbt.
unique
whether the column is unique or not.not_null
whether the column has no null valuesaccepted_values
Whether the column has the specified values or not.relationships
Whether referential integrity is satisfied.
For example, in a .yml
file as follows.
version: 2
models:
- name: customers
columns:
- name: customer_id
tests:
- unique
- not_null
- name: stg_customers
columns:
- name: customer_id
tests:
- unique
- not_null
- name: stg_orders
columns:
- name: order_id
tests:
- unique
- not_null
- name: status
tests:
- accepted_values:
values: ['placed', 'shipped', 'completed', 'return_pending', 'returned']
- name: customer_id
tests:
- not_null
- relationships:
to: ref('stg_customers')
field: customer_id
Running dbt test
will run a test of the contents of the above yml
file.
The schema test can also be extended by yourself by using external packages such as dbt-utils.
Custom Generic tests
Custom Generic tests can be created if the standard Generic tests lack functionality.
A custom Generic test is a SQL file with a test
block in the tests/generic
or macros/
directory.
{% test is_even(model, column_name) %}
with validation as (
select
{{ column_name }} as even_field
from {{ model }}
),
validation_errors as (
select even_field
from validation
where (even_field % 2) = 1
)
select *
from validation_errors
{% endtest %}
Create a .yml
file on the same level as the data model file and put the test names in the tests
section as follows.
version: 2
models:
- name: users
columns:
- name: favorite_number
tests:
- is_even
References