Meltano Transform
In Meltano, the T (Transform) step of ELT is handled by dbt. This article shows how to run transforms with Meltano.
Installing dbt
To list the available dbt transformer plugins, use the following command:
$ meltano discover transformers
dbt-bigquery
dbt-postgres
dbt-redshift
dbt-snowflake
dbt
Although you can install the generic dbt plugin, installing an adapter-specific plugin such as dbt-snowflake is recommended. This demonstration uses dbt-bigquery to run the transform. Install it with the following command:
$ meltano add transformer dbt-bigquery
Verify the configuration for dbt-bigquery:
$ meltano config dbt-bigquery list
Alternatively, you can refer to the official documentation.
Configure the settings for dbt-bigquery, including project, dataset, auth_method, and keyfile, using the following commands. (You can also directly edit the meltano.yml file.)
$ meltano config dbt-bigquery set project "YOUR_PROJECT_ID"
$ meltano config dbt-bigquery set dataset "YOUR_DATASET"
$ meltano config dbt-bigquery set auth_method "service-account"
$ meltano config dbt-bigquery set keyfile "PATH/TO/YOUR/KEYFILE"
Your meltano.yml file should then look like this:
.
.
.
environments:
- name: dev
  config:
    plugins:
      transformers:
      - name: dbt-bigquery
        config:
          project: YOUR_PROJECT_ID
          dataset: YOUR_DATASET
          auth_method: service-account
          keyfile: PATH/TO/YOUR/KEYFILE
Writing dbt Models
The /transform directory contains the following files:
$ ls /transform
dbt_project.yml
profile
models
Write your dbt models as SQL files (each model is a single SELECT statement) in the /transform/models/ directory, and edit /transform/dbt_project.yml if you need to change project-level settings.
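As an illustration of what such a model file might look like, here is a minimal sketch that creates one from the Meltano project root. The model name commit_counts, the table name commits, and the column author_login are hypothetical placeholders; substitute names that actually exist in the dataset your loader writes to.

```shell
# Run from the Meltano project root.
mkdir -p transform/models

# Hypothetical model: counts rows per author in a raw table loaded by the EL step.
# The quoted heredoc delimiter keeps the BigQuery backticks from being
# interpreted by the shell.
cat > transform/models/commit_counts.sql <<'SQL'
-- Assumed table and column names; replace them with your own.
select
    author_login,
    count(*) as commit_count
from `YOUR_PROJECT_ID.YOUR_DATASET.commits`
group by author_login
SQL
```

dbt names the resulting model after the file, so this one would be selectable as commit_counts.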
Running dbt
You can run dbt in two ways:
- Execute the run command as part of the ELT pipeline, such as meltano run tap-github target-bigquery dbt-bigquery:run.
- Execute the invoke command as a standalone operation, like meltano invoke dbt-bigquery run.
Additionally, you can create custom commands by editing the meltano.yml file as follows:
.
.
.
plugins:
  transformers:
  - name: dbt-bigquery
    commands:
      my_models:
        args: run --select +YOUR_MODEL_NAME
        description: Run dbt, selecting model `YOUR_MODEL_NAME` and all upstream models.
The custom command can then be used with both run and invoke:
$ meltano run tap-github target-bigquery dbt-bigquery:my_models
$ meltano invoke dbt-bigquery:my_models
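To make the relationship between the two invocation styles concrete, the following is a small sketch of a wrapper function. The function name run_transform and the DRY_RUN variable are hypothetical conveniences, not part of Meltano; by default the function only prints the command it would run.

```shell
# Hypothetical helper: builds the Meltano command for a dbt-bigquery command
# name, prints it, and executes it only when DRY_RUN is set to 0.
run_transform() {
  mode="$1"                 # "pipeline" or "standalone"
  command_name="${2:-run}"  # a built-in or custom command, e.g. my_models
  case "$mode" in
    pipeline)
      set -- meltano run tap-github target-bigquery "dbt-bigquery:${command_name}"
      ;;
    standalone)
      set -- meltano invoke "dbt-bigquery:${command_name}"
      ;;
    *)
      echo "unknown mode: $mode" >&2
      return 1
      ;;
  esac
  echo "$*"
  # Dry run by default; export DRY_RUN=0 to actually execute the command.
  [ "${DRY_RUN:-1}" = "1" ] || "$@"
}

run_transform pipeline my_models
run_transform standalone my_models
```

Keeping the dry run as the default makes it easy to check which pipeline a scheduled job would trigger before it touches the warehouse.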
Referencing a dbt Project from an External Repository
In addition to writing dbt code directly in the /transform/ directory, you can keep the Meltano project and the dbt project separate by creating a packages.yml file in the /transform/ directory. With the following configuration, you can reference a dbt project hosted in an external repository:
packages:
  - git: https://github.com/your_repo/your-dbt-project.git
    revision: 1.0.0
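dbt reads package definitions from packages.yml and downloads them when dbt deps is run. The sketch below writes the file from the Meltano project root and then fetches the package. It assumes that the dbt-bigquery plugin exposes a deps command wrapping dbt deps; check the plugin's command list if it is not available in your version.

```shell
# Run from the Meltano project root.
mkdir -p transform

# Write the package definition shown above.
cat > transform/packages.yml <<'YAML'
packages:
  - git: https://github.com/your_repo/your-dbt-project.git
    revision: 1.0.0
YAML

# Assumption: dbt-bigquery provides a `deps` command that runs `dbt deps`.
if command -v meltano >/dev/null 2>&1; then
  meltano invoke dbt-bigquery:deps || echo "dbt deps failed" >&2
else
  echo "meltano not found; skipping dbt deps"
fi
```

Pinning revision to a tag or commit keeps the external dbt project from changing underneath the Meltano pipeline.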