DBT Demo
===

## What is dbt
![image](https://hackmd.io/_uploads/ByM0CyAS6.png)
- data build tool
- EL**T**
    - dbt covers the Transformation part only
    - Prerequisite: the data has already been loaded into your data lake / DWH
- Connectors
    - https://docs.getdbt.com/docs/supported-data-platforms
    - Supporting several platforms does not mean dbt can move data from connector A to connector B
    - ![image](https://hackmd.io/_uploads/S10tpxAr6.png)
    - ![image](https://hackmd.io/_uploads/HyD4FgCBa.png)
        - Airbyte (https://airbyte.com/)
        - Airflow (https://airflow.apache.org/)

## Features
![image](https://hackmd.io/_uploads/rJ881g0Ha.png)
- Frequently used
    - Transform (build models/tables)
        - You only write the `SELECT` part; dbt generates the statements around it
            - Create table (DDL)
            ```SQL
            CREATE TABLE employee (
                id INT COMMENT 'employee ID',
                name STRING COMMENT 'employee name'
            )
            COMMENT 'Employee table for some department';
            ```
            - Insert
            ```SQL
            -- INSERT INTO appends rows; INSERT OVERWRITE replaces the table contents
            INSERT OVERWRITE TABLE employee
            SELECT id, name
            FROM TABLE2;
            ```
                - Overwrite (full rebuild on every run, the default behavior of the `table` materialization)
                - Incremental
                    - https://docs.getdbt.com/docs/build/incremental-models
                    - CR_BI_DEV
    - Test (data quality tests)
        - https://docs.getdbt.com/docs/build/tests
        - Generic test
        - Custom test
    - Seed
        - Load a CSV file into a table (ad hoc analysis)
        - `dbt seed`
    - Documentation
        - `dbt docs generate`
        - `dbt docs serve`

## Demo Part
![image](https://hackmd.io/_uploads/HkjvSxCrT.png)
- **Demo Agenda**
    1. Create a dbt project
    2. Define the connection
    3. Build models
    4. Run quality tests with `dbt test`
    5. Load CSV data into a table with `dbt seed`
    6. Manage documentation with `dbt docs`
    7. Git integration (1 min)
- [DBT in Paris](https://drive.google.com/drive/folders/1kpJGlAZUEpUdwaCEN_kze4SIqGr_Rdh1)
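
As an illustration of the "only write the select part" and incremental points above, here is a minimal sketch of what an incremental model file could look like. It is a Jinja-SQL template that dbt compiles, not a statement to run directly against the warehouse. The file name, the upstream model `stg_employee`, and the `updated_at` column are assumptions made for this example; they do not come from the demo project.

```SQL
-- models/employee.sql (hypothetical file name)
-- Assumption: an upstream model named stg_employee exists and exposes
-- id, name and an updated_at timestamp column.
{{ config(materialized='incremental', unique_key='id') }}

select
    id,
    name,
    updated_at
from {{ ref('stg_employee') }}

{% if is_incremental() %}
-- Applied only when the target table already exists and the run is not a
-- full refresh: pick up rows newer than what is already loaded.
where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```

On the first `dbt run` the table is created from the full `select`; later runs should only process the newer rows. Running `dbt run --full-refresh` rebuilds the table from scratch, which corresponds to the overwrite behavior listed above.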