dbt-部署及上手尝试 - 人人从众

相关文章推荐

鼻子大的松鼠 · django中ajax的GET, ...· 2 天前 ·

不爱学习的番茄 · 关于图像分割的标签文件格式 - 叶罅 · 昨天 ·

强悍的牛肉面 · Qt 添加 QtNetwork 库文件 - ...· 23 小时前 ·

慷慨的烤面包 · 一个接口多个实现类，如何调用哪个实现类的方法 ...· 14 小时前 ·

淡定的松鼠 · Ubuntu安装Edge浏览器，好用的浏览器 ...· 12 小时前 ·

阳光的充值卡 · SAS-RTF的合并【完善篇】-腾讯云开发者 ...· 4 月前 ·

温暖的西红柿 · h5 设置剪切板内容 ...· 7 月前 ·

爱看书的乌龙茶 · 快速掌握-jqgrid - 简书· 1 年前 ·

气宇轩昂的春卷 · C 库函数 – rename() | 菜鸟教程· 1 年前 ·

曾深爱过的眼镜 · 效果爆炸的开源漫画变身 ...· 1 年前 ·

Your version of dbt-core is out of date! You can find instructions for upgrading here: https://docs.getdbt.com/docs/installation Plugins: - postgres: 1.5.0 - Update available! At least one plugin is out of date or incompatible with dbt-core. You can find instructions for upgrading here: https://docs.getdbt.com/docs/installation

使用以下命令启动项目 :

dbt init jaffle_shop
// 如果windows的话，需要在用户目录下建一个 .dbt 目录 
// mkdir .dbt
//  C:\Users\用户名\.dbt
(venv) PS D:\workspace\dbt> dbt init jaffle_shop
00:48:43  Running with dbt=1.5.0
Which database would you like to use?
[1] postgres
(Don't see the one you want? https://docs.getdbt.com/docs/available-adapters)
Enter a number: 1
00:48:47  Profile jaffle_shop written to C:\Users\UOS\.dbt\profiles.yml using target's sample configuration. Once updated, you'll be able to start developing with dbt.
00:48:47
Your new dbt project "jaffle_shop" was created!
For more information on how to configure the profiles.yml file,
please consult the dbt documentation here:
  https://docs.getdbt.com/docs/configure-your-profile
One more thing:
Need help? Don't hesitate to reach out to us via GitHub issues or on Slack:
  https://community.getdbt.com/
Happy modeling!
修改配置，链接数据库：
-- C:\Users\UOS\.dbt\profiles.yml
jaffle_shop:
  outputs:
      type: postgres
      threads: 1
      host: 127.0.0.1
      port: 5432
      user: postgres
      pass: haosql
      dbname: postgres
      schema: udo
    prod:
      type: postgres
      threads: 1
      host: 127.0.0.1
      port: 5432
      user: postgres
      pass: haosql
      dbname: postgres
      schema: udo
  target: dev
检测配置： dbt debug
(venv) PS D:\workspace\dbt\jaffle_shop> dbt debug
02:27:38  Running with dbt=1.5.0
02:27:38  dbt version: 1.5.0
02:27:38  python version: 3.8.8
02:27:38  python path: d:\workspace\dbt\venv\scripts\python.exe
02:27:38  os info: Windows-10-10.0.17763-SP0
02:27:38  Using profiles.yml file at C:\Users\UOS\.dbt\profiles.yml
02:27:38  Using dbt_project.yml file at D:\workspace\dbt\jaffle_shop\dbt_project.yml
02:27:38  Configuration:
02:27:38    profiles.yml file [OK found and valid]
02:27:38    dbt_project.yml file [OK found and valid]
02:27:38  Required dependencies:
02:27:38   - git [OK found]
02:27:38  Connection:
02:27:38    host: 127.0.0.1
02:27:38    port: 5432
02:27:38    user: postgres
02:27:38    database: postgres
02:27:38    schema: udo
02:27:38    search_path: None
02:27:38    keepalives_idle: 0
02:27:38    sslmode: None
02:27:38    Connection test: [OK connection ok]
02:27:38  All checks passed!
执行示例模型：dbt run
(venv) PS D:\workspace\dbt\jaffle_shop> dbt run
02:29:14  Running with dbt=1.5.0
02:29:14  Unable to do partial parsing because saved manifest not found. Starting full parse.
02:29:15  Found 2 models, 4 tests, 0 snapshots, 0 analyses, 307 macros, 0 operations, 0 seed files, 0 sources, 0 exposures, 0 metrics, 0 groups
02:29:15
02:29:15  Concurrency: 1 threads (target='dev')
02:29:15
02:29:15  1 of 2 START sql table model udo.my_first_dbt_model ............................ [RUN]
02:29:15  1 of 2 OK created sql table model udo.my_first_dbt_model ....................... [SELECT 2 in 0.17s]
02:29:15  2 of 2 START sql view model udo.my_second_dbt_model ............................ [RUN]
02:29:16  2 of 2 OK created sql view model udo.my_second_dbt_model ....................... [CREATE VIEW in 0.10s]
02:29:16
02:29:16  Finished running 1 table model, 1 view model in 0 hours 0 minutes and 0.56 seconds (0.56s).
02:29:16
02:29:16  Completed successfully
02:29:16
02:29:16  Done. PASS=2 WARN=0 ERROR=0 SKIP=0 TOTAL=2
Build DBT Project
资源Sources
可以暂时理解为是我们各个主要的业务数据库。
数据库模型 model
SQL 模型
run command
ref function
SQL模型是一个语句，模型在文件中定义（通常在目录中）：
每个文件包含一个模型/语句： .sql select
模型名称继承自文件名
模型可以嵌套在目录内的子目录中
当执行dbt run，dbt将构建这个数据模型。
这里新建了一个简单的模型：
-- 计算书籍阅读人数排行
{{ config(materialized='table') }}
with book_ranks as(
    select u_bookid ,count(distinct u_userid) as cnt_rs from udo.t_read_history 
    group by u_bookid order  by cnt_rs desc 
select u_bookid,cnt_rs,t2.u_bookname,t2.u_bookdesc,t2.u_booktypecode,t2.u_author
from book_ranks t1 
udo.t_bookstore t2
on t1.u_bookid=t2.u_id
dbt这里会对应的构建一个脚本：
  create  table "postgres"."udo"."task3__dbt_tmp"  
    -- 计算书籍阅读人数排行
with book_ranks as(
    select u_bookid ,count(distinct u_userid) as cnt_rs from udo.t_read_history 
    group by u_bookid order  by cnt_rs desc 
select u_bookid,cnt_rs,t2.u_bookname,t2.u_bookdesc,t2.u_booktypecode,t2.u_author
from book_ranks t1 
udo.t_bookstore t2
on t1.u_bookid=t2.u_id
执行完成后，数据库里会新增一个task3的表，存储上面的结果。
note：
在哪里可以看到执行的sql：
dbt CLI:
目录： target/compiled/select
目录： target/run/create
日志文件： logs/dbt.log
是否需要提前创建结果表：
dbt会自动检测，如果不存在就创建目录。
可以在dbt_project.yml 和具体的模型SQL中配置。
改变模型的持久化方式
将模型构建为单独的架构？？
创建模型的标签
dbt_project.yml 示例：
name: jaffle_shop
config-version: 2
models:
  jaffle_shop: # this matches the `name:`` config
    +materialized: view # this applies to all models in the current project
    marts:
      +materialized: table # this applies to all models in the `marts/` directory
      marketing:
        +schema: marketing # this applies to all models in the `marts/marketing/`` directory
在具体的模型里配置：
{{ config(
    materialized="view",
    schema="marketing"
with customer_orders as ...
请务必注意，配置是分层应用的 — 应用于子目录的配置将覆盖任何常规配置。
note:
dbt模型结果输出方式： view,table,incremental,ephemeral
模型依赖关系
可以使用 ref 函数代替查询中的表名来构建模型之间的依赖关系
with customers as (
    select * from {{ ref('stg_customers') }}
orders as (
    select * from {{ ref('stg_orders') }}
note:
通过创建DAG来确定运行模型的顺序
推荐模块化转换，重用模型减少复用
dbt docs
通过dbt docs可以生成项目的资料。
首先执行：dbt docs generate
启动服务： dbt docs serve
可以在浏览器中查看：http://localhost:8080/#!/overview