Meltano Transform
In Meltano, the T (Transform) part of ELT is handled by dbt. In this article, I will show how to run transformations with Meltano.
Installing dbt
To list the dbt transformer plugins that are available, use the following command:
$ meltano discover transformers
dbt-bigquery
dbt-postgres
dbt-redshift
dbt-snowflake
dbt
While it is possible to install the plain dbt plugin, it is recommended to install an adapter-specific plugin such as dbt-snowflake. For this demonstration, we will use dbt-bigquery to execute the transform. Install dbt-bigquery with the following command:
$ meltano add transformer dbt-bigquery
List the settings available for dbt-bigquery and their current values:
$ meltano config dbt-bigquery list
Alternatively, you can refer to the official documentation for the plugin.
Configure the settings for dbt-bigquery, including project, dataset, auth_method, and keyfile, using the following commands. (You can also edit the meltano.yml file directly.)
$ meltano config dbt-bigquery set project "YOUR_PROJECT_ID"
$ meltano config dbt-bigquery set dataset "YOUR_DATASET"
$ meltano config dbt-bigquery set auth_method "service-account"
$ meltano config dbt-bigquery set keyfile "PATH/TO/YOUR/KEYFILE"
Your meltano.yml file should then look like this:
.
.
.
environments:
- name: dev
  config:
    plugins:
      transformers:
      - name: dbt-bigquery
        config:
          project: YOUR_PROJECT_ID
          dataset: YOUR_DATASET
          auth_method: service-account
          keyfile: PATH/TO/YOUR/KEYFILE
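As a possible third option, the same four settings can be supplied through environment variables, which keeps environment-specific values out of meltano.yml. The sketch below assumes Meltano's usual naming scheme of the plugin name plus the setting name, upper-cased with hyphens turned into underscores. Treat the variable names as an assumption and confirm them against the output of meltano config dbt-bigquery list before relying on them.

```shell
# Assumed variable names: <PLUGIN>_<SETTING>, upper-cased, hyphens -> underscores.
# Confirm the exact names with `meltano config dbt-bigquery list`.
export DBT_BIGQUERY_PROJECT="YOUR_PROJECT_ID"
export DBT_BIGQUERY_DATASET="YOUR_DATASET"
export DBT_BIGQUERY_AUTH_METHOD="service-account"
export DBT_BIGQUERY_KEYFILE="PATH/TO/YOUR/KEYFILE"
```

Values exported this way apply only to the current shell session, so they need to be set again (or loaded from a file) wherever the pipeline runs.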
Coding dbt Processing
The /transform directory contains the following files:
$ ls /transform
dbt_project.yml
profile
models
Write your SQL statements in the /transform/models/ directory and optionally edit the /transform/dbt_project.yml file.
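To make this concrete, here is a minimal sketch that creates one placeholder model. It assumes the command is run from the directory that holds the transform directory, and the file name example_model.sql and its query are hypothetical; replace them with your own model, typically a select over the tables loaded by the EL step.

```shell
# Create the models directory if it does not exist yet.
mkdir -p transform/models

# Write a trivial placeholder model. dbt treats each .sql file under models/
# as one model and materializes the result of its select statement.
cat > transform/models/example_model.sql <<'SQL'
-- Hypothetical placeholder model; replace with a real query.
select
    1 as id,
    'example' as label
SQL
```

The quoted heredoc delimiter keeps the shell from expanding anything inside the SQL, which matters once a model contains dbt's Jinja expressions.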
Running dbt
You can run dbt in two ways:
- Execute the run command as part of the ELT pipeline, such as meltano run tap-github target-bigquery dbt-bigquery:run.
- Execute the invoke command as a standalone operation, like meltano invoke dbt-bigquery run.
Additionally, you can create custom commands by editing the meltano.yml file as follows:
.
.
.
plugins:
  transformers:
  - name: dbt-bigquery
    commands:
      my_models:
        args: run --select +YOUR_MODEL_NAME
        description: Run dbt, selecting model `YOUR_MODEL_NAME` and all upstream models.
$ meltano run tap-github target-bigquery dbt-bigquery:my_models
$ meltano invoke dbt-bigquery:my_models
Referencing a dbt Project from an External Repository
In addition to writing dbt code directly in the /transform/ directory, you can keep the Meltano project and the dbt project separate by creating a packages.yml file in the /transform/ directory. With the following configuration, you can reference a dbt project from an external repository:
packages:
  - git: https://github.com/your_repo/your-dbt-project.git
    revision: 1.0.0
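dbt only downloads the packages declared in packages.yml when its deps command runs, so that step is needed before the external models can be used. The helper below is a sketch rather than part of Meltano: install_dbt_packages is a hypothetical function name, and it passes deps straight through to dbt in the same way as the meltano invoke dbt-bigquery run example above.

```shell
# Hypothetical helper: download the dbt packages listed in packages.yml.
install_dbt_packages() {
    # Fail early with a clear message if the Meltano CLI is not available.
    if ! command -v meltano >/dev/null 2>&1; then
        echo "meltano not found on PATH" >&2
        return 127
    fi
    # `deps` is handed to the dbt executable unchanged.
    meltano invoke dbt-bigquery deps
}
```

Calling install_dbt_packages from the Meltano project directory before the first run, and again whenever the revision in packages.yml changes, should be enough to keep the referenced project in sync.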