Your version of dbt-core is out of date!
You can find instructions for upgrading here:
https://docs.getdbt.com/docs/installation
Plugins:
- postgres: 1.5.0 - Update available!
At least one plugin is out of date or incompatible with dbt-core.
You can find instructions for upgrading here:
https://docs.getdbt.com/docs/installation
使用以下命令启动项目 :
dbt init jaffle_shop
// 如果windows的话,需要在用户目录下建一个 .dbt 目录
// mkdir .dbt
// C:\Users\用户名\.dbt
(venv) PS D:\workspace\dbt> dbt init jaffle_shop
00:48:43 Running with dbt=1.5.0
Which database would you like to use?
[1] postgres
(Don't see the one you want? https://docs.getdbt.com/docs/available-adapters)
Enter a number: 1
00:48:47 Profile jaffle_shop written to C:\Users\UOS\.dbt\profiles.yml using target's sample configuration. Once updated, you'll be able to start developing with dbt.
00:48:47
Your new dbt project "jaffle_shop" was created!
For more information on how to configure the profiles.yml file,
please consult the dbt documentation here:
https://docs.getdbt.com/docs/configure-your-profile
One more thing:
Need help? Don't hesitate to reach out to us via GitHub issues or on Slack:
https://community.getdbt.com/
Happy modeling!
修改配置,链接数据库:
-- C:\Users\UOS\.dbt\profiles.yml
jaffle_shop:
outputs:
type: postgres
threads: 1
host: 127.0.0.1
port: 5432
user: postgres
pass: haosql
dbname: postgres
schema: udo
prod:
type: postgres
threads: 1
host: 127.0.0.1
port: 5432
user: postgres
pass: haosql
dbname: postgres
schema: udo
target: dev
检测配置: dbt debug
(venv) PS D:\workspace\dbt\jaffle_shop> dbt debug
02:27:38 Running with dbt=1.5.0
02:27:38 dbt version: 1.5.0
02:27:38 python version: 3.8.8
02:27:38 python path: d:\workspace\dbt\venv\scripts\python.exe
02:27:38 os info: Windows-10-10.0.17763-SP0
02:27:38 Using profiles.yml file at C:\Users\UOS\.dbt\profiles.yml
02:27:38 Using dbt_project.yml file at D:\workspace\dbt\jaffle_shop\dbt_project.yml
02:27:38 Configuration:
02:27:38 profiles.yml file [OK found and valid]
02:27:38 dbt_project.yml file [OK found and valid]
02:27:38 Required dependencies:
02:27:38 - git [OK found]
02:27:38 Connection:
02:27:38 host: 127.0.0.1
02:27:38 port: 5432
02:27:38 user: postgres
02:27:38 database: postgres
02:27:38 schema: udo
02:27:38 search_path: None
02:27:38 keepalives_idle: 0
02:27:38 sslmode: None
02:27:38 Connection test: [OK connection ok]
02:27:38 All checks passed!
执行示例模型:dbt run
(venv) PS D:\workspace\dbt\jaffle_shop> dbt run
02:29:14 Running with dbt=1.5.0
02:29:14 Unable to do partial parsing because saved manifest not found. Starting full parse.
02:29:15 Found 2 models, 4 tests, 0 snapshots, 0 analyses, 307 macros, 0 operations, 0 seed files, 0 sources, 0 exposures, 0 metrics, 0 groups
02:29:15
02:29:15 Concurrency: 1 threads (target='dev')
02:29:15
02:29:15 1 of 2 START sql table model udo.my_first_dbt_model ............................ [RUN]
02:29:15 1 of 2 OK created sql table model udo.my_first_dbt_model ....................... [SELECT 2 in 0.17s]
02:29:15 2 of 2 START sql view model udo.my_second_dbt_model ............................ [RUN]
02:29:16 2 of 2 OK created sql view model udo.my_second_dbt_model ....................... [CREATE VIEW in 0.10s]
02:29:16
02:29:16 Finished running 1 table model, 1 view model in 0 hours 0 minutes and 0.56 seconds (0.56s).
02:29:16
02:29:16 Completed successfully
02:29:16
02:29:16 Done. PASS=2 WARN=0 ERROR=0 SKIP=0 TOTAL=2
Build DBT Project
资源Sources
可以暂时理解为是我们各个主要的业务数据库。
数据库模型 model
SQL 模型
run command
ref function
SQL模型是一个语句,模型在文件中定义(通常在目录中):
每个文件包含一个模型/语句: .sql select
模型名称继承自文件名
模型可以嵌套在目录内的子目录中
当执行dbt run
,dbt将构建这个数据模型。
这里新建了一个简单的模型:
-- 计算书籍阅读人数排行
{{ config(materialized='table') }}
with book_ranks as(
select u_bookid ,count(distinct u_userid) as cnt_rs from udo.t_read_history
group by u_bookid order by cnt_rs desc
select u_bookid,cnt_rs,t2.u_bookname,t2.u_bookdesc,t2.u_booktypecode,t2.u_author
from book_ranks t1
udo.t_bookstore t2
on t1.u_bookid=t2.u_id
dbt这里会对应的构建一个脚本:
create table "postgres"."udo"."task3__dbt_tmp"
-- 计算书籍阅读人数排行
with book_ranks as(
select u_bookid ,count(distinct u_userid) as cnt_rs from udo.t_read_history
group by u_bookid order by cnt_rs desc
select u_bookid,cnt_rs,t2.u_bookname,t2.u_bookdesc,t2.u_booktypecode,t2.u_author
from book_ranks t1
udo.t_bookstore t2
on t1.u_bookid=t2.u_id
执行完成后,数据库里会新增一个task3的表,存储上面的结果。
note:
在哪里可以看到执行的sql:
dbt CLI:
目录: target/compiled/select
目录: target/run/create
日志文件: logs/dbt.log
是否需要提前创建结果表:
dbt会自动检测,如果不存在就创建目录。
可以在dbt_project.yml 和具体的模型SQL中配置。
改变模型的持久化方式
将模型构建为单独的架构??
创建模型的标签
dbt_project.yml 示例:
name: jaffle_shop
config-version: 2
models:
jaffle_shop: # this matches the `name:`` config
+materialized: view # this applies to all models in the current project
marts:
+materialized: table # this applies to all models in the `marts/` directory
marketing:
+schema: marketing # this applies to all models in the `marts/marketing/`` directory
在具体的模型里配置:
{{ config(
materialized="view",
schema="marketing"
with customer_orders as ...
请务必注意,配置是分层应用的 — 应用于子目录的配置将覆盖任何常规配置。
note:
dbt模型结果输出方式: view,table,incremental,ephemeral
模型依赖关系
可以使用 ref 函数代替查询中的表名来构建模型之间的依赖关系
with customers as (
select * from {{ ref('stg_customers') }}
orders as (
select * from {{ ref('stg_orders') }}
note:
通过创建DAG来确定运行模型的顺序
推荐模块化转换,重用模型减少复用
dbt docs
通过dbt docs可以生成项目的资料。
首先执行:dbt docs generate
启动服务: dbt docs serve
可以在浏览器中查看:http://localhost:8080/#!/overview