相关文章推荐
老实的橙子  ·  Day 11 : psycopg2 操作 ...·  1 周前    · 
紧张的香瓜  ·  odps ...·  1 周前    · 
捣蛋的手套  ·  ABAP内表(internal ...·  4 天前    · 
踏实的长颈鹿  ·  Android ExoPlayer ...·  10 月前    · 
腼腆的茶叶  ·  vue3+ts ...·  1 年前    · 

開始前先介紹什麼是 Left outer join Inner join

Left outer join:

LEFT JOIN 可以用來建立左外部連接(上圖的 A + B 部分),查詢的 SQL 敘述句 LEFT JOIN 左側資料表 (table_1) 的所有記錄都會加入到查詢結果中,最終產生一張新的表。
outer join 會列出主要表格中每一筆資料,無論它的值在另一個表格中有沒有出現
=> 即使右側資料表 (table_2) 有沒有符合的值,左側資料表(table_1)每一筆還是會顯示出來。

LEFT JOIN 查詢用法 (Example)
這是一個客戶資料表 customers:

Address Phone
SELECT customers.Name, orders.Order_No # 挑出要顯示的欄位
FROM customers
LEFT JOIN orders                       # orders 資料表往左接上 customers 資料表
ON customers.C_Id=orders.C_Id;         # 指定兩資料表連接的條件

其中用點號連接之 XXX.YYY 表示 XXX 資料表中的 YYY 欄位。

Order_No

使用前面建立的Product(商品)跟ProductStyle(商品樣式)Model

#app/models/product.rb

class Product < ApplicationRecord
  has_many :styles, class_name: 'ProductStyle' # 關聯取名為styles

#app/models/product_style.rb

class ProductStyle < ApplicationRecord
  belongs_to :product
Table
Model
產生以下 SQL 跟回傳值:
Product Load (0.4ms)  SELECT "products".* FROM "products"
ProductStyle Load (0.2ms)  SELECT  "product_styles".* FROM "product_styles" WHERE "product_styles"."product_id" = $1 ORDER BY "product_styles"."id" ASC LIMIT $2  [["product_id", 3], ["LIMIT", 1]]
ProductStyle Load (0.2ms)  SELECT  "product_styles".* FROM "product_styles" WHERE "product_styles"."product_id" = $1 ORDER BY "product_styles"."id" ASC LIMIT $2  [["product_id", 4], ["LIMIT", 1]]
ProductStyle Load (0.1ms)  SELECT  "product_styles".* FROM "product_styles" WHERE "product_styles"."product_id" = $1 ORDER BY "product_styles"."id" ASC LIMIT $2  [["product_id", 6], ["LIMIT", 1]]
ProductStyle Load (0.1ms)  SELECT  "product_styles".* FROM "product_styles" WHERE "product_styles"."product_id" = $1 ORDER BY "product_styles"."id" ASC LIMIT $2  [["product_id", 7], ["LIMIT", 1]]
ProductStyle Load (0.1ms)  SELECT  "product_styles".* FROM "product_styles" WHERE "product_styles"."product_id" = $1 ORDER BY "product_styles"."id" ASC LIMIT $2  [["product_id", 8], ["LIMIT", 1]]
# => ["香蕉乾", "芒果乾", "商品款式2", "商品款式3", "商品款式4"] 

後面款式都亂取名,懶得想名字了

N+1 Query這個效能殺手,造成呼叫 5 + 1次SQL(第一次呼叫products資料表,每次迴圈呼叫一次product_styles資料表,共5次)

呼叫SQL的成本是巨大的,該怎麼辦?

Rails透過ORM語法把關聯資料表做預加載,如此一來迴圈內呼叫關聯時,不用再呼叫SQL從DB取資料,改從cache取資料。使用預加載,迴圈越多次程式碼效率提升就越顯著。

加載關聯資料表

Rails 提供四種加載關聯資料表的方法

  • preload
  • eager_load
  • includes
  • joins
  • preload

    preload會生成兩條獨立的SQL查詢,所以關聯資料表會一起被 query 出來,但不會建立資料表之間的關聯

    products = Product.preload(:styles).order("products.id DESC")
    
    產生以下 SQL:
    Product Load (0.4ms)  SELECT  "products".* FROM "products" ORDER BY products.id DESC LIMIT $1  [["LIMIT", 11]]
    ProductStyle Load (0.2ms)  SELECT "product_styles".* FROM "product_styles" WHERE "product_styles"."product_id" IN ($1, $2, $3, $4, $5)  [["product_id", 8], ["product_id", 7], ["product_id", 6], ["product_id", 4], ["product_id", 3]]
    

    preload會將products有關的資料表統一加載出來,但不會建立資料表之間的關聯,如果要拿preload生成的SQL語句來做關聯查詢(where)排序(order),會出錯:

    Product.preload(:styles).where("product_styles.title like ?", 'test')
    
    噴錯如下:
    ActiveRecord::StatementInvalid (PG::UndefinedTable: ERROR:  missing FROM-clause entry for table "product_styles")
    

    不過preload確實可以解決N+1 query

    Product.preload(:styles).map do |product|
      product.styles.first.title
    
    只產生以下兩條 SQL 跟回傳值:
    Product Load (0.4ms)  SELECT "products".* FROM "products"
    ProductStyle Load (0.4ms)  SELECT "product_styles".* FROM "product_styles" WHERE "product_styles"."product_id" IN ($1, $2, $3, $4, $5)  [["product_id", 3], ["product_id", 4], ["product_id", 6], ["product_id", 7], ["product_id", 8]]
    # => ["香蕉乾", "芒果乾", "商品款式2", "商品款式3", "商品款式4"] 
    

    eager_load

    eager_load 使用LEFT OUTER JOIN進行單次查詢,並加載所有關聯數據(這樣資料表之間就有關聯了)

    products = Product.eager_load(:styles).order("products.id DESC")
    
    產生以下 SQL:
    SQL (0.6ms)  SELECT  DISTINCT products.id AS alias_0, "products"."id" FROM "products" LEFT OUTER JOIN "product_styles" ON "product_styles"."product_id" = "products"."id" ORDER BY products.id DESC LIMIT $1  [["LIMIT", 11]]
    SQL (0.6ms)  SELECT "products"."id" AS t0_r0, "products"."title" AS t0_r1, "products"."price" AS t0_r2, "products"."description" AS t0_r3, "products"."created_at" AS t0_r4, "products"."updated_at" AS t0_r5, "product_styles"."id" AS t1_r0, "product_styles"."title" AS t1_r1, "product_styles"."price" AS t1_r2, "product_styles"."sku" AS t1_r3, "product_styles"."product_id" AS t1_r4, "product_styles"."created_at" AS t1_r5, "product_styles"."updated_at" AS t1_r6 FROM "products" LEFT OUTER JOIN "product_styles" ON "product_styles"."product_id" = "products"."id" WHERE "products"."id" IN ($1, $2, $3, $4, $5) ORDER BY products.id DESC  [["id", 8], ["id", 7], ["id", 6], ["id", 4], ["id", 3]]
    

    可以看到為了加載關聯,eager_load用了更複雜的SQL語法,雖然速度比preload慢,但是解決了preload不能關聯查詢的問題

    Product.eager_load(:styles).where("product_styles.title like ?", 'test')
    
    產生以下 SQL:
    SQL (0.5ms)  SELECT  DISTINCT "products"."id" FROM "products" LEFT OUTER JOIN "product_styles" ON "product_styles"."product_id" = "products"."id" WHERE (product_styles.title like 'test') LIMIT $1  [["LIMIT", 11]]
    SQL (0.4ms)  SELECT "products"."id" AS t0_r0, "products"."title" AS t0_r1, "products"."price" AS t0_r2, "products"."description" AS t0_r3, "products"."created_at" AS t0_r4, "products"."updated_at" AS t0_r5, "product_styles"."id" AS t1_r0, "product_styles"."title" AS t1_r1, "product_styles"."price" AS t1_r2, "product_styles"."sku" AS t1_r3, "product_styles"."product_id" AS t1_r4, "product_styles"."created_at" AS t1_r5, "product_styles"."updated_at" AS t1_r6 FROM "products" LEFT OUTER JOIN "product_styles" ON "product_styles"."product_id" = "products"."id" WHERE (product_styles.title like 'test') AND "products"."id" = $1  [["id", 4]]
    

    eager_load 同時也可以解決N+1 query

    Product.eager_load(:styles).map do |product|
      product.styles.first.try(:title)
    
    產生以下 SQL:
      SQL (9.4ms)  SELECT "products"."id" AS t0_r0, "products"."title" AS t0_r1, "products"."price" AS t0_r2, "products"."description" AS t0_r3, "products"."created_at" AS t0_r4, "products"."updated_at" AS t0_r5, "products"."order_id" AS t0_r6, "product_styles"."id" AS t1_r0, "product_styles"."title" AS t1_r1, "product_styles"."price" AS t1_r2, "product_styles"."sku" AS t1_r3, "product_styles"."product_id" AS t1_r4, "product_styles"."created_at" AS t1_r5, "product_styles"."updated_at" AS t1_r6 FROM "products" LEFT OUTER JOIN "product_styles" ON "product_styles"."product_id" = "products"."id"
    

    從SQL語法來看精準搜尋

    先看沒有限制條件,單純做eager_load
    Product.eager_load(:options)
    
    SQL (0.7ms)  SELECT  DISTINCT "products"."id" FROM "products" LEFT OUTER JOIN "options" ON "options"."product_id" = "products"."id" LIMIT $1  [["LIMIT", 11]]
    SQL (0.5ms)  SELECT "products"."id" AS t0_r0, "products"."title" AS t0_r1, "products"."price" AS t0_r2, "products"."description" AS t0_r3, "products"."created_at" AS t0_r4, "products"."updated_at" AS t0_r5, "products"."store_id" AS t0_r6, "options"."id" AS t1_r0, "options"."product_id" AS t1_r1, "options"."title" AS t1_r2, "options"."option1" AS t1_r3, "options"."option2" AS t1_r4, "options"."created_at" AS t1_r5, "options"."updated_at" AS t1_r6 FROM "products" LEFT OUTER JOIN "options" ON "options"."product_id" = "products"."id" WHERE "products"."id" IN ($1, $2, $3, $4, $5)  [["id", 4], ["id", 2], ["id", 7], ["id", 8], ["id", 3]]
    

    第二條SQL會把所有product的id都列出來["id", 4], ["id", 2], ["id", 7], ["id", 8], ["id", 3]

    再看有限制條件eager_load
    Product.eager_load(:options).where(options: {title: '尺寸'})
    
    SQL (42.9ms)  SELECT  DISTINCT "products"."id" FROM "products" LEFT OUTER JOIN "options" ON "options"."product_id" = "products"."id" WHERE "options"."title" = $1 LIMIT $2  [["title", "尺寸"], ["LIMIT", 11]]
    SQL (0.6ms)  SELECT "products"."id" AS t0_r0, "products"."title" AS t0_r1, "products"."price" AS t0_r2, "products"."description" AS t0_r3, "products"."created_at" AS t0_r4, "products"."updated_at" AS t0_r5, "products"."store_id" AS t0_r6, "options"."id" AS t1_r0, "options"."product_id" AS t1_r1, "options"."title" AS t1_r2, "options"."option1" AS t1_r3, "options"."option2" AS t1_r4, "options"."created_at" AS t1_r5, "options"."updated_at" AS t1_r6 FROM "products" LEFT OUTER JOIN "options" ON "options"."product_id" = "products"."id" WHERE "options"."title" = $1 AND "products"."id" IN ($2, $3)  [["title", "尺寸"], ["id", 2], ["id", 3]]
    

    可以看到第一條 SQL 有加上條件["title", "尺寸"],篩選出符合的 products (id: 2 跟 3)
    所以第二條 SQL 在查詢時,只有搜尋符合條件的["id", 2], ["id", 3]
    Rails 的 ORM語法雖然方便,但是如果能了解一些SQL,就能更容易找到適當的方法。

    每次都要想何時用preload?何時用eager_load? 很麻煩

    所以Rails 還有includes這個大家最常用的方法

    ps. eager_load在Rails 4是產生一條SQL,這裡用Rails 5 示範

    includes

    includes預設效果是preload,需要做關聯資料表時又會轉成eager_load

    預設效果同preload

    Product.includes(:styles)
    
    產生以下 SQL:
    Product Load (0.3ms)  SELECT  "products".* FROM "products" LIMIT $1  [["LIMIT", 11]]
    ProductStyle Load (0.4ms)  SELECT "product_styles".* FROM "product_styles" WHERE "product_styles"."product_id" IN ($1, $2, $3, $4, $5)  [["product_id", 3], ["product_id", 4], ["product_id", 6], ["product_id", 7], ["product_id", 8]]
    

    精準查詢關聯資料表,效果同eager_load

    Product.includes(:styles).where(product_styles: {title: 'test'})
    
    產生以下 SQL:
    SQL (0.6ms)  SELECT  DISTINCT "products"."id" FROM "products" LEFT OUTER JOIN "product_styles" ON "product_styles"."product_id" = "products"."id" WHERE "product_styles"."title" = $1 LIMIT $2  [["title", "test"], ["LIMIT", 11]]
    SQL (0.4ms)  SELECT "products"."id" AS t0_r0, "products"."title" AS t0_r1, "products"."price" AS t0_r2, "products"."description" AS t0_r3, "products"."created_at" AS t0_r4, "products"."updated_at" AS t0_r5, "product_styles"."id" AS t1_r0, "product_styles"."title" AS t1_r1, "product_styles"."price" AS t1_r2, "product_styles"."sku" AS t1_r3, "product_styles"."product_id" AS t1_r4, "product_styles"."created_at" AS t1_r5, "product_styles"."updated_at" AS t1_r6 FROM "products" LEFT OUTER JOIN "product_styles" ON "product_styles"."product_id" = "products"."id" WHERE "product_styles"."title" = $1 AND "products"."id" = $2  [["title", "test"], ["id", 4]]
    # => #<ActiveRecord::Relation [#<Product id: 4, title: "商品二", price: 1985.0, description: "要買要快", created_at: "2019-09-30 14:38:42", updated_at: "2019-09-30 14:38:42">]> 
    

    如果是模糊搜尋,要搭配references才能達到eager_load的效果

    products = Product.includes(:styles).where("product_styles.title like ?", 'test').references(:styles)
    
    產生以下 SQL 跟回傳值:
    SQL (0.5ms)  SELECT  DISTINCT "products"."id" FROM "products" LEFT OUTER JOIN "product_styles" ON "product_styles"."product_id" = "products"."id" WHERE (product_styles.title like 'test') LIMIT $1  [["LIMIT", 11]]
    SQL (0.5ms)  SELECT "products"."id" AS t0_r0, "products"."title" AS t0_r1, "products"."price" AS t0_r2, "products"."description" AS t0_r3, "products"."created_at" AS t0_r4, "products"."updated_at" AS t0_r5, "product_styles"."id" AS t1_r0, "product_styles"."title" AS t1_r1, "product_styles"."price" AS t1_r2, "product_styles"."sku" AS t1_r3, "product_styles"."product_id" AS t1_r4, "product_styles"."created_at" AS t1_r5, "product_styles"."updated_at" AS t1_r6 FROM "products" LEFT OUTER JOIN "product_styles" ON "product_styles"."product_id" = "products"."id" WHERE (product_styles.title like 'test') AND "products"."id" = $1  [["id", 4]]
    # => #<ActiveRecord::Relation [#<Product id: 4, title: "商品二", price: 1985.0, description: "要買要快", created_at: "2019-09-30 14:38:42", updated_at: "2019-09-30 14:38:42">]> 
    

    ps. 模糊搜尋的includes不搭配references會噴錯,各位看倌可以拿掉試試

    所以includes既可以解決N+1 query的問題,搭配reference又可以建立關聯查詢,提到N+1 query推薦用includes原因就在這裡。

    joins

    joins主要用於建立model之間的關係,用INNER JOIN來關聯數據, “不會”將關聯的資料做預加載,所以對N+1 query並無幫助,常用來建立關聯搜尋,
    以上面的模糊搜尋為例:

    Product.joins(:styles).where("product_styles.title like ?", 'test')
    
    產生以下 SQL 跟回傳值:
    Product Load (0.6ms)  SELECT  "products".* FROM "products" INNER JOIN "product_styles" ON "product_styles"."product_id" = "products"."id" WHERE (product_styles.title like 'test') LIMIT $1  [["LIMIT", 11]]
    # => #<ActiveRecord::Relation [#<Product id: 4, title: "商品二", price: 1985.0, description: "要買要快", created_at: "2019-09-30 14:38:42", updated_at: "2019-09-30 14:38:42">]> 
    

    只產生一條SQL。所以joins可以用來建立資料表關聯

    但如果想要joins處理我們今天第一個問題,會更慘。因為可能產生重複資料,SQL查詢次數比沒加joins還多 w(゚Д゚)w

    原因如下:

    Product.joins(:styles).size #=> 9
    Product.joins(:styles).uniq.size  #=> 5
    

    原來Product.joins(:styles)實際上是:

  • 查詢全部含有product_id的product_style,並回傳該product_style所屬的product
  • 如果有很多筆product_style屬於同一個product,就會回傳大量相同的product,要用.uniq來刪除重複的項目
  • 一對一關係不會發生重複的情況
  • 反過來就沒問題,因為 ProductStyle 對 Product 是多對一:
    ProductStyle.joins(:product).size # => 9 
    ProductStyle.joins(:product).uniq.size # => 9 
    

    Rails提供關聯資料表加載,幫助我們解決N+1 query這個隱形效能殺手,也可以建立資料表的關聯。
    四種該如何使用,以下是簡單的判斷

  • 只需要解決 N+1 query,用 preload 預加載
  • 只需要精準或模糊搜尋,使用joins 建立關聯
  • 需要建立關聯,又需要預加載,使用eager_load
  • 新手不知道該怎麼用,用includes 似乎最方便
  • 上面只是概略的描述,實際執行會因為資料表的行數及列數不同,導致效能也不同,還是要依照情況而定。