## duck_q.py ( update: 2024-03-09 07:47) ``` import sys import duckdb con = duckdb.connect("dev.duckdb") sql_cmd = " ".join(sys.argv[1:]) con.sql(sql_cmd).show() ``` - python duck_q.py show tables - python duck_q.py select * from my_first_dbt_model - python duck_q.py select * from p01 ---- # dbt Workshop 101 @ Taichung py ### 報名頁面 2024.03.09 (六) 10:00 - 12:00 https://www.accupass.com/event/2402290318542860946270 - 2024.03.09 (六) 10:00 - 12:00 (GMT+8)加入行事曆 ### 本頁面短網址: https://tinyurl.com/3yb5u9cf https://pse.is/5nsuub ### Taichung py https://www.facebook.com/groups/taichung.py ### 講者: KuRin_J (クリーン_J) 聯絡方式: jumpmancoding@gmail.com Senior Engineer, 擅長協助企業選擇適宜策略,提升資料素養、降低溝通成本。 願景是遇見不同的人,拓展視野,在各種細節中發現美好。 2024年的研究興趣在於(1)優化資料工程 (2)Microsoft Fabric (3)流行服飾文化 (4)日式餐飲。 ### 活動前準備: - 下載範例資料集: 擇一下載 只差別在資料筆數 - Download Files: dbt101_small.zip ( 613KB ) - https://drive.google.com/file/d/1kOEYP_HUqZAPQScQ1jLc6gzJYHJw0F7G/view?usp=drive_link - Download Files: dbt101_big.zip ( 26MB ) - https://drive.google.com/file/d/1mTWJfUyhfb2NpYdRgQ04Q3PYnf4a6JQH/view?usp=drive_link - 自備筆電且安裝Python 3.8.0 以上版本 - 建議筆電先行充電或自備延長線 - 建議先行在家完成套件安裝 - 現場不提供 Wifi,若需現場安裝,請自行準備上網環境 - 了解如何使用建立及管理python虛擬環境(以下擇一) - pip - conda ### dbt 套件安裝: - Python 3.8.0 以上版本 - 本次教學套件以 dbt-duckdb 1.6.1 為主 - 以下為安裝方式(以下擇一) - 套件安裝方式1→ pip (winddows) - 套件安裝方式2→ pip (Mac) - 套件安裝方式3→ Conda (Winddows or Mac) ### 套件安裝方式1→ pip (Winddows): - 如果不理解以下指令,建議先行上網查詢相關背景知識 - mkdir workshop_202403 - cd workshop_202403 - python -m venv py38_dbt - .\py38_dbt\Scripts\activate.bat - pip install dbt-duckdb==1.6.1 - dbt --version ### 套件安裝方式2→ pip (Mac): - 如果不理解以下指令,建議先行上網查詢相關背景知識 - mkdir workshop_202403 - cd workshop_202403 - python -m venv py38_dbt - source py38_dbt/bin/activate - pip install dbt-duckdb==1.6.1 - dbt --version ### 套件安裝方式3→ Conda (Winddows or Mac): - 如果不理解以下指令,建議先行上網查詢相關背景知識 - mkdir workshop_202403 - cd workshop_202403 - conda create --name py38_dbt python==3.8 - conda activate py38_dbt - conda install conda-forge::dbt-duckdb==1.6.1 - dbt --version --- ### Conda 安裝異常? - pip install wheel ---- # dbt workshop Command ### dbt project name workshop101 ### dbt Command dbt init dbt debug dbt run : Compile SQL and execute against the current target database. ``` dbt run --select sql_model ``` dbt seed dbt docs ``` dbt docs generate - Generate the documentation website for your project dbt serve - Serve the documentation website for your project ``` ### dbt project name : workshop101 dbt init ``` Enter a name for your project (letters, digits, underscore): workshop101 01:13:28 Your new dbt project "workshop101" was created! For more information on how to configure the profiles.yml file, please consult the dbt documentation here: https://docs.getdbt.com/docs/configure-your-profile One more thing: Need help? Don't hesitate to reach out to us via GitHub issues or on Slack: https://community.getdbt.com/ Happy modeling! 01:14:07 Setting up your profile. Which database would you like to use? [1] duckdb (Don't see the one you want? https://docs.getdbt.com/docs/available-adapters) Enter a number: 1 ``` ### Project Main Foler ``` Your_python_Path ├─py38_dbt └─workshop101 --------- Your first dtb project Name ├─models ---------- Put Your SQL fils │ └─example ├─seeds ---------- Put Your CSV fils ``` ### Adding profiles.yml to workshop101 folder( Config: Connect to Database ) ``` workshop101: outputs: dev: type: duckdb path: dev.duckdb threads: 1 target: dev ``` ### dbt debug ``` 13:17:15 Running with dbt=1.6.10 13:17:15 dbt version: 1.6.10 13:17:15 python version: 3.10.7 13:17:15 adapter type: duckdb 13:17:15 adapter version: 1.6.1 13:17:15 Configuration: 13:17:15 profiles.yml file [OK found and valid] 13:17:15 dbt_project.yml file [OK found and valid] 13:17:15 Required dependencies: 13:17:15 - git [OK found] 13:17:15 Connection: 13:17:15 database: dev 13:17:15 schema: main 13:17:15 path: dev.duckdb 13:17:15 config_options: None 13:17:15 extensions: None 13:17:15 settings: None 13:17:15 external_root: . 13:17:15 use_credential_provider: None 13:17:15 attach: None 13:17:15 filesystems: None 13:17:15 remote: None 13:17:15 plugins: None 13:17:15 disable_transactions: False 13:17:15 Registered adapter: duckdb=1.6.1 13:17:15 Connection test: [OK connection ok] 13:17:15 All checks passed! ``` dbt run ``` 13:18:38 1 of 2 START sql table model main.my_first_dbt_model ........................... [RUN] 13:18:38 1 of 2 OK created sql table model main.my_first_dbt_model ...................... [OK in 13:18:38 2 of 2 START sql view model main.my_second_dbt_model ........................... [RUN] 13:18:38 2 of 2 OK created sql view model main.my_second_dbt_model ...................... [OK in ``` ### write your SQL model ``` {{ config(materialized='table') }} select random() as num_1 , random() as num_2 , random() as num_3 ``` ``` dbt run --select your_SQL ``` ### SQL Temple ``` {{ config(materialized = 'table') }} select * from {{ ref('random900') }} where num_1 > 10 ``` ``` {{ config(materialized = 'view') }} select * from {{ ref('p01') }} where num_2 > 30 ``` ``` {{ config(materialized = 'view') }} select * from {{ ref('p02') }} where num_3 > 70 ``` ``` {{ config(materialized = 'table') }} select * from {{ ref('p03') }} limit 700 ``` ``` {{ config(materialized = 'view') }} select *, (num_3 * 2.5 + num_2 * 7.2 + num_1 * 2.31) as num_4 from {{ ref('p01') }} ``` ``` {{ config(materialized = 'table') }} select num_4 from {{ ref('p05') }} order by num_4 ``` ---- # sample dataset ### coffee_sales.csv ``` Columns Name: 'p_date', 'c_id', 'p_id', 'p_q', 'xx1', 'xx2', 'xx3', 'xx4', 'xx5', 'xx6', 'p_price', 'sales', 'day_name', 'first_day_of_week', 'month_of_year', 'day_of_month', 'month_name', 'year_key' p_date c_id p_id p_q xx1 xx2 xx3 xx4 xx5 xx6 p_price sales day_name first_day_of_week month_of_year day_of_month month_name year_key 0 2018-09-04 45 5 3 5JhL ZVTxK SF883ot UQwZ o4c jJ 118 354 Tuesday 2018-09-03 9 4 September 2018 1 2018-07-03 38 5 8 vr64 iXAvX V3zsSkr yXD0 0KD W4 118 944 Tuesday 2018-07-02 7 3 July 2018 2 2015-06-11 41 3 2 oovz CIHlT 6HDzXaf VmJ8 spC DF 80 160 Thursday 2015-06-08 6 11 June 2015 3 2016-10-21 42 3 6 cyKp sXiEl D2PsGhX Aar8 i48 7r 80 480 Friday 2016-10-17 10 21 October 2016 4 2013-04-06 38 2 6 WNoc 2STlO M8sdmva Mrt8 Vk8 xa 50 300 Saturday 2013-04-01 4 6 April 2013 ``` ### coffee_customer.csv ``` Columns Name: 'customer_name', 'address', 'phone', 'email', 'birthday', 'c_id' customer_name address phone email birthday c_id 0 Raymond Gomez 5397 Ronald Keys Suite 813\nLeestad, TX 46384 7557206326 jeffreyswanson@example.com 1980-09-19 1 1 Christine Miller 194 Joseph Mountain Apt. 791\nFloresville, ME ... +1-503-558-1030x9385 donna54@example.org 1947-04-25 2 2 Mark Nelson 76129 Michelle Mews Suite 600\nSinghport, WA 7... 001-541-702-5291x3501 vvasquez@example.org 1935-04-12 3 3 Kirk Thompson USNS Roy\nFPO AP 85218 456-803-6514 scardenas@example.org 1967-04-03 4 4 Amy Todd 246 James Streets\nWest Christina, GA 98801 394-971-6024 brownkelsey@example.org 1986-04-13 5 ``` ### coffee_product.csv ``` Columns Name: 'p_id', 'p_name_zh', 'p_name_en', 'p_area' p_id p_name_zh p_name_en p_area 0 1 美式咖啡 American Coffee Colombia 1 2 拿鐵 Latte Colombia 2 3 卡布奇諾 Cappuccino Brazil 3 4 摩卡咖啡 Mocha Brazil 4 5 義式濃縮咖啡 Espresso Brazil ```