You can easily use SQLMesh's open source IDE to explore column level lineage in your existing dbt project. SQLMesh is a powerful alternative to dbt, but is compatible with existing dbt projects and can be used to expose column-level lineage.
This guide will use jaffle shop DuckDB, but you can use this on your own dbt project instead.
Install jaffle shop DuckDB
You can skip to the next step if you already have a dbt project.
Clone the repository and enter it.
git clone https://github.com/TobikoData/jaffle_shop_duckdb.git
cd jaffle_shop_duckdb
Setup a virtual env and install dependencies.
python3 -m venv .env
source .env/bin/activate
pip3 install --upgrade pip
pip3 install -r requirements.txt
Setup SQLMesh
Setup your dbt project to work with SQLMesh. This will not affect your original dbt project. You can learn more about our dbt integration in our documentation.
pip3 install "sqlmesh[web]"
sqlmesh init -t dbt
Add start
to your dbt_project.yml. Start is a required variable in SQLMesh that indicates the earliest date that your data warehouse needs to consider.
diff --git a/dbt_project.yml b/dbt_project.yml
index 8bfaa9c..8bd285d 100644
--- a/dbt_project.yml
+++ b/dbt_project.yml
@@ -19,11 +19,12 @@ clean-targets:
require-dbt-version: [">=1.0.0", "<2.0.0"]
seeds:
+docs:
node_color: '#cd7f32'
models:
+ start: "2020-01-01"
jaffle_shop:
materialized: table
staging:
Start the SQLMesh UI
sqlmesh ui
And click this link to launch the UI.
Click on orders.sql
in the left hand file selector pane.
Click on bank_transfer_amount
in the main.orders box.
You can now see column level lineage for bank_transfer_amount
, including all CTEs.
There's a lot of stuff to unravel here, but you can see bank_transfer_amount
is derived from payments.payment_method
and payments.amount
.
We're actively working on improving column level lineage and have a big update coming soon which should make this an even better experience. If you run into any issues or have any suggestions, please join us on Slack. We are committed towards collaborating with the community to make working with and understanding data better.
If you're interested in learning more about SQLMesh, here are some additional links to check out!