天天看點

MySQL中left join相關問題整理【left join的用途】【疑問一的分析】【問題二的分析】

【說明】

本文主要是針對以下兩個問題進行的分析整理:

1、如果SQL中存在left join,是否絕對無法改寫?

2、在left join左邊的表,并不一定就是驅動表,explain執行計劃可能會将left join右邊的表作為驅動表,什麼情況下,會産生這種情況?

【left join的用途】

我們先來了解下left join的實際用途,即它會産生什麼效果。

left join(左聯接),會傳回包括左表中的所有記錄和右表中聯結字段相等的記錄。

這句話的含義是,如果a left join b,那麼傳回的結果集是“a表與b表join關聯得到的結果集”+"a表中,無法與b關聯的其它資料",其中,a與b無法關聯的資料隻會顯示a表的内容,b表對應的内容都為空。是以a left join b得到的資料行數,是大于等于a表的資料行的。

示例:

a、b表的結構、資料如下:
mysql>select * from a
+--------------+--------------+--------------+
| id           | t1           | t2           |
+--------------+--------------+--------------+
|            1 |            1 |            1 |
|            2 |            2 |            2 |
|          111 |            1 |            1 |
|          222 |            2 |            2 |
|           11 |            1 |            1 |
|           22 |            2 |            2 |
+--------------+--------------+--------------+
傳回行數:[6],耗時:8 ms.
mysql>select * from b
+--------------+--------------+--------------+
| id           | t1           | t2           |
+--------------+--------------+--------------+
|            1 |            1 |            1 |
|            2 |            2 |            2 |
|            3 |            2 |            2 |
+--------------+--------------+--------------+
傳回行數:[3],耗時:8 ms.

mysql>select a.*,b1.*
from a 
join b b1 on a.id=b1.t1
+--------------+--------------+--------------+--------------+--------------+--------------+
| id           | t1           | t2           | id           | t1           | t2           |
+--------------+--------------+--------------+--------------+--------------+--------------+
|            1 |            1 |            1 |            1 |            1 |            1 |
|            2 |            2 |            2 |            2 |            2 |            2 |
|            2 |            2 |            2 |            3 |            2 |            2 |
+--------------+--------------+--------------+--------------+--------------+--------------+

mysql>select a.*,b1.*
from a 
left join b b1 on a.id=b1.t1
+--------------+--------------+--------------+--------------+--------------+--------------+
| id           | t1           | t2           | id           | t1           | t2           |
+--------------+--------------+--------------+--------------+--------------+--------------+
|            1 |            1 |            1 |            1 |            1 |            1 |
|            2 |            2 |            2 |            2 |            2 |            2 |
|            2 |            2 |            2 |            3 |            2 |            2 |
|          111 |            1 |            1 |              |              |              |
|          222 |            2 |            2 |              |              |              |
|           11 |            1 |            1 |              |              |              |
|           22 |            2 |            2 |              |              |              |
+--------------+--------------+--------------+--------------+--------------+--------------+
傳回行數:[7],耗時:8 ms.           

可以看到a left join b的上半部分結果就是“a表與b表join關聯得到的結果集”,下半部分的結果集是"a表中,無法與b關聯的其它資料"

【疑問一的分析】

明白了left join的實際作用後,我們再來分析以下幾種情況:

1、問:

A:select a.,c. from a left join (select * from b where b.t1=1) c on a.id=c.id

是否可以改寫為

B:select a.,b. from a left join b on a.id=b.id where b.t1=1

答:否,這兩種情況的意義完全不一樣,所得到的是兩種不同的結果。

在A情況下,a表是與子查詢的結果集進行左表連接配接,雖然自查詢通過where條件進行了過濾,但影響的是b表會有哪些資料會在結果集中顯示,a表中的資料不會有任何缺失,最後的結果行數肯定還是大于等于a表的行數的

在B情況下,a表與b表進行左表連接配接後,得到一個結果集,最後再對整個結果集進行過濾,在對整個結果集過濾的時候,a表中的資料可能也會被過濾掉,過濾後的結果行數可能是大于、等于或小于a表的行數的

注:a、b兩表的結構與資料在第一處代碼中已有展示,這裡不再贅述
mysql>select a.*,c.* from a left join (select * from b where b.t1=1) c on a.id=c.id
+--------------+--------------+--------------+--------------+--------------+--------------+
| id           | t1           | t2           | id           | t1           | t2           |
+--------------+--------------+--------------+--------------+--------------+--------------+
|            1 |            1 |            1 |            1 |            1 |            1 |
|            2 |            2 |            2 |              |              |              |
|          111 |            1 |            1 |              |              |              |
|          222 |            2 |            2 |              |              |              |
|           11 |            1 |            1 |              |              |              |
|           22 |            2 |            2 |              |              |              |
+--------------+--------------+--------------+--------------+--------------+--------------+
傳回行數:[6],耗時:57 ms.
mysql>select a.*,b.* from a left join b  on a.id=b.id where b.t1=1
+--------------+--------------+--------------+--------------+--------------+--------------+
| id           | t1           | t2           | id           | t1           | t2           |
+--------------+--------------+--------------+--------------+--------------+--------------+
|            1 |            1 |            1 |            1 |            1 |            1 |
+--------------+--------------+--------------+--------------+--------------+--------------+
傳回行數:[1],耗時:7 ms.           

2、問:

A:select a.,b. from (select * from a where a.t1=1) left join b on a.id=b.id

B:select a.,b. from a left join b on a.id=b.id where a.t1=1

答:可以,效果是一樣的

在A情況下,a表過濾後的結果集作為左連接配接的左表,最後的結果集肯定是隻包含a.t1=1的

在B情況下,a與b左連接配接先得到一個結果集,然後對結果集中根據a表的t1列進行過濾,最後的結果集肯定也是隻包含a.t1=1的

注:a、b兩表的結構與資料在第一處代碼中已有展示,這裡不再贅述
mysql>select c.*,b.* from (select * from a where a.t1=1) c left join b on c.t1=b.t1
+--------------+--------------+--------------+--------------+--------------+--------------+
| id           | t1           | t2           | id           | t1           | t2           |
+--------------+--------------+--------------+--------------+--------------+--------------+
|            1 |            1 |            1 |            1 |            1 |            1 |
|          111 |            1 |            1 |            1 |            1 |            1 |
|           11 |            1 |            1 |            1 |            1 |            1 |
+--------------+--------------+--------------+--------------+--------------+--------------+
傳回行數:[3],耗時:10 ms.
mysql>select a.*,b.* from a  left join b on a.t1=b.t1 where a.t1=1
+--------------+--------------+--------------+--------------+--------------+--------------+
| id           | t1           | t2           | id           | t1           | t2           |
+--------------+--------------+--------------+--------------+--------------+--------------+
|            1 |            1 |            1 |            1 |            1 |            1 |
|          111 |            1 |            1 |            1 |            1 |            1 |
|           11 |            1 |            1 |            1 |            1 |            1 |
+--------------+--------------+--------------+--------------+--------------+--------------+
傳回行數:[3],耗時:8 ms.           

【小結】

上述兩種情況是對left join中子查詢存在where的情況進行了分析:

  如果left join的右表即被驅動表是一個子查詢,裡面有where條件對被驅動表進行了過濾,這種情況不能直接将where條件放在最外邊,因為子查詢中通過where過濾後還是會與驅動表進行左連接配接,驅動表中的資料不會有任何缺失,但如果将where條件放在最外邊,可能也會過濾掉驅動表的資料,兩種SQL的意義是完全不一樣的;

  如果left join的左表即驅動表是一個子查詢,裡面有where條件對驅動表進行了過濾,那麼可以将where條件放在最外邊。

3、問:下面三種情況是等價的嗎?

B:select a.,b. from a left join b on a.id=b.id and b.t1=1

C:select a.,b. from a left join b on a.id=b.id and a.t1=1

答:A與B是完全等價的,與C是不等價的。

A的含義是,b表過濾後再被驅動表左連接配接,得到的結果集中,a表的内容會全部顯示,b表隻會顯示與a表關聯後符合b.t1=1的内容

B的含義是,a表與b表進行左連接配接關聯,得到的結果集中,a表的内容會全部顯示,但b表的内容需要同時滿足a.id=b.id與b.t1=1才會顯示,即b表隻會顯示與a表關聯後且b.t1=1的内容

C的含義是,a表與b表進行左連接配接關聯,得到的結果集中,a表的内容會全部顯示,b表會顯示的内容需要同時滿足a.id=b.id且a表的a.t1=1

a表資料如下所示
mysql>select * from a
+--------------+--------------+--------------+
| id           | t1           | t2           |
+--------------+--------------+--------------+
|            1 |            1 |            1 |
|            2 |            2 |            2 |
|          111 |            1 |            1 |
|          222 |            2 |            2 |
|           11 |            1 |            1 |
|           22 |            2 |            2 |
+--------------+--------------+--------------+
b表資料如下所示
mysql>select * from b
+--------------+--------------+--------------+
| id           | t1           | t2           |
+--------------+--------------+--------------+
|            1 |            2 |            1 |
|            2 |            2 |            2 |
|            3 |            2 |            2 |
+--------------+--------------+--------------+

mysql>select a.*,c.* from a left join (select * from b where b.t1=1) c on a.id=c.id
+--------------+--------------+--------------+--------------+--------------+--------------+
| id           | t1           | t2           | id           | t1           | t2           |
+--------------+--------------+--------------+--------------+--------------+--------------+
|            1 |            1 |            1 |              |              |              |
|            2 |            2 |            2 |              |              |              |
|          111 |            1 |            1 |              |              |              |
|          222 |            2 |            2 |              |              |              |
|           11 |            1 |            1 |              |              |              |
|           22 |            2 |            2 |              |              |              |
+--------------+--------------+--------------+--------------+--------------+--------------+
傳回行數:[6],耗時:9 ms.
mysql>select a.*,b.* from a left join b on a.id=b.id and b.t1=1
+--------------+--------------+--------------+--------------+--------------+--------------+
| id           | t1           | t2           | id           | t1           | t2           |
+--------------+--------------+--------------+--------------+--------------+--------------+
|            1 |            1 |            1 |              |              |              |
|            2 |            2 |            2 |              |              |              |
|          111 |            1 |            1 |              |              |              |
|          222 |            2 |            2 |              |              |              |
|           11 |            1 |            1 |              |              |              |
|           22 |            2 |            2 |              |              |              |
+--------------+--------------+--------------+--------------+--------------+--------------+
傳回行數:[6],耗時:7 ms.
mysql>select a.*,b.* from a left join b on a.id=b.id and a.t1=1
+--------------+--------------+--------------+--------------+--------------+--------------+
| id           | t1           | t2           | id           | t1           | t2           |
+--------------+--------------+--------------+--------------+--------------+--------------+
|            1 |            1 |            1 |            1 |            2 |            1 |
|            2 |            2 |            2 |              |              |              |
|          111 |            1 |            1 |              |              |              |
|          222 |            2 |            2 |              |              |              |
|           11 |            1 |            1 |              |              |              |
|           22 |            2 |            2 |              |              |              |
+--------------+--------------+--------------+--------------+--------------+--------------+
傳回行數:[6],耗時:7 ms.
           

從這裡可以看出 left join on 中and 的含義,and與where不同,and是與on組合使用的,表示當同時滿足這些條件時,才會顯示被驅動表的對應内容,在left join中and對驅動表要顯示的内容并沒有過濾性

【進一步思考】

那麼在join中,and的作用又是怎麼樣的?

A:select a.,c. from a join (select * from b where b.t1=1) c on a.id=c.id

B:select a.,b. from a join b on a.id=b.id and b.t1=1

C:select a.,b. from a join b on a.id=b.id and a.t1=1

此時A與B還是等價的,與C不是等價的

A表示,b表通過b.t1=1過濾,然後與a表關聯,得到一個結果集,結果集是符合a.id=c.id且b.t1=1的

B表示,a與b join關聯後,再對結果集的b.t1=1進行過濾,最後的結果集也是符合a.id=c.id且b.t1=1的

C表示,a與b join關聯後,再對結果集的b.t1=1進行過濾,最後的結果集是符合a.id=c.id且c.t1=1的

是以,在join情況下,and與where的作用是一樣的,都是對結果集的進一步過濾

總結:

如果帶有where的子查詢是驅動表,可以将其改寫,如果帶有where的子查詢是被驅動表,不能改寫

【問題二的分析】

現象:

mysql>explain
SELECT b.*
FROM  a
LEFT JOIN  b ON b.uid = a.uid
WHERE b.w_id = 2327
+--------------+-----------------------+-----------------+----------------+-------------------------+---------------+-------------------+------------------+----------------+--------------------------+
| id           | select_type           | table           | type           | possible_keys           | key           | key_len           | ref              | rows           | Extra                    |
+--------------+-----------------------+-----------------+----------------+-------------------------+---------------+-------------------+------------------+----------------+--------------------------+
| 1            | SIMPLE                | b               | ref            | w_id,uid                | w_id          | 4                 | const            | 402            | Using where              |
| 1            | SIMPLE                | a               | eq_ref         | PRIMARY,idx_4           | PRIMARY       | 3                 | test.b.uid       | 1              | Using where; Using index |           

問:left join不是一定會以左表作為驅動表嗎?為什麼explain檢視left join的執行計劃時,有時右邊的表會作為驅動表?如果left join右邊的表足夠小,會被轉換為驅動表嗎?

答:并不是右表足夠小就會作為驅動表,而是當left join的效果與join是完全相同的時候,left join會被轉換為join,此時驅動表與被驅動表會依據結果集的大小來判斷,自然會有右邊的表會作為驅動表的情況

問:什麼情況下,left join的效果與join是完全相同的?

答:當最外層對右表進行where條件非空過濾時

即:

A:select a.,b. from a left join b on a.id=b.id where b.t1=2

B:select a.,b. from a join b on a.id=b.id where b.t1=2

A與B對效果是完全相同的,最後的結果都是a與滿足b.t1=1的b表關聯的結果集,A會被優化器轉換為B

mysql>select a.*,b.* from a left join b on a.id=b.id
+--------------+--------------+--------------+--------------+--------------+--------------+
| id           | t1           | t2           | id           | t1           | t2           |
+--------------+--------------+--------------+--------------+--------------+--------------+
|            1 |            1 |            1 |            1 |            2 |            1 |
|            2 |            2 |            2 |            2 |            2 |            2 |
|          111 |            1 |            1 |              |              |              |
|          222 |            2 |            2 |              |              |              |
|           11 |            1 |            1 |              |              |              |
|           22 |            2 |            2 |              |              |              |
+--------------+--------------+--------------+--------------+--------------+--------------+
傳回行數:[6],耗時:9 ms.
mysql>select a.*,b.* from a join b on a.id=b.id
+--------------+--------------+--------------+--------------+--------------+--------------+
| id           | t1           | t2           | id           | t1           | t2           |
+--------------+--------------+--------------+--------------+--------------+--------------+
|            1 |            1 |            1 |            1 |            2 |            1 |
|            2 |            2 |            2 |            2 |            2 |            2 |
+--------------+--------------+--------------+--------------+--------------+--------------+
傳回行數:[2],耗時:8 ms.

結果集相同,可互相轉換:
mysql>select a.*,b.* from a left join b on a.id=b.id where b.t1=2
+--------------+--------------+--------------+--------------+--------------+--------------+
| id           | t1           | t2           | id           | t1           | t2           |
+--------------+--------------+--------------+--------------+--------------+--------------+
|            1 |            1 |            1 |            1 |            2 |            1 |
|            2 |            2 |            2 |            2 |            2 |            2 |
+--------------+--------------+--------------+--------------+--------------+--------------+
傳回行數:[2],耗時:8 ms.
mysql>select a.*,b.* from a join b on a.id=b.id where b.t1=2
+--------------+--------------+--------------+--------------+--------------+--------------+
| id           | t1           | t2           | id           | t1           | t2           |
+--------------+--------------+--------------+--------------+--------------+--------------+
|            1 |            1 |            1 |            1 |            2 |            1 |
|            2 |            2 |            2 |            2 |            2 |            2 |
+--------------+--------------+--------------+--------------+--------------+--------------+
傳回行數:[2],耗時:7 ms.

結果集不同,不可轉換:
mysql>select a.*,b.* from a left join b on a.id=b.id where b.t1 is null
+--------------+--------------+--------------+--------------+--------------+--------------+
| id           | t1           | t2           | id           | t1           | t2           |
+--------------+--------------+--------------+--------------+--------------+--------------+
|          111 |            1 |            1 |              |              |              |
|          222 |            2 |            2 |              |              |              |
|           11 |            1 |            1 |              |              |              |
|           22 |            2 |            2 |              |              |              |
+--------------+--------------+--------------+--------------+--------------+--------------+
傳回行數:[4],耗時:9 ms.
mysql>select a.*,b.* from a join b on a.id=b.id where b.t1 is null
+--------------+--------------+--------------+--------------+--------------+--------------+
| id           | t1           | t2           | id           | t1           | t2           |
+--------------+--------------+--------------+--------------+--------------+--------------+
傳回行數:[0],耗時:8 ms.           

【進一步說明】

如果where條件是對左表的非空過濾,left join是否還等價于join?

A:select a.,b. from a left join b on a.id=b.id where a.t1=1

B:select a.,b. from a join b on a.id=b.id where a.t1=1

答:不等價。

A表示a與b左表關聯後,關聯後的結果集中肯定有a表的全部資料,然後再通過a.k1=1進行過濾

B表示a與b進行關聯後,關聯後的結果集中是a表的部分資料,然後再對這部分資料通過a.k1=1進行過濾

兩者是不等價的,這種情況的left join一定會以a表作為驅動表

mysql>select a.*,b.* from a left join b on a.id=b.id
+--------------+--------------+--------------+--------------+--------------+--------------+
| id           | t1           | t2           | id           | t1           | t2           |
+--------------+--------------+--------------+--------------+--------------+--------------+
|            1 |            1 |            1 |            1 |            2 |            1 |
|            2 |            2 |            2 |            2 |            2 |            2 |
|          111 |            1 |            1 |              |              |              |
|          222 |            2 |            2 |              |              |              |
|           11 |            1 |            1 |              |              |              |
|           22 |            2 |            2 |              |              |              |
+--------------+--------------+--------------+--------------+--------------+--------------+
傳回行數:[6],耗時:7 ms.
mysql>select a.*,b.* from a join b on a.id=b.id
+--------------+--------------+--------------+--------------+--------------+--------------+
| id           | t1           | t2           | id           | t1           | t2           |
+--------------+--------------+--------------+--------------+--------------+--------------+
|            1 |            1 |            1 |            1 |            2 |            1 |
|            2 |            2 |            2 |            2 |            2 |            2 |
+--------------+--------------+--------------+--------------+--------------+--------------+
傳回行數:[2],耗時:8 ms.

mysql>select a.*,b.* from a left join b on a.id=b.id where a.t1=1
+--------------+--------------+--------------+--------------+--------------+--------------+
| id           | t1           | t2           | id           | t1           | t2           |
+--------------+--------------+--------------+--------------+--------------+--------------+
|            1 |            1 |            1 |            1 |            2 |            1 |
|          111 |            1 |            1 |              |              |              |
|           11 |            1 |            1 |              |              |              |
+--------------+--------------+--------------+--------------+--------------+--------------+
傳回行數:[3],耗時:10 ms.
mysql>select a.*,b.* from a join b on a.id=b.id where a.t1=1
+--------------+--------------+--------------+--------------+--------------+--------------+
| id           | t1           | t2           | id           | t1           | t2           |
+--------------+--------------+--------------+--------------+--------------+--------------+
|            1 |            1 |            1 |            1 |            2 |            1 |
+--------------+--------------+--------------+--------------+--------------+--------------+
傳回行數:[1],耗時:8 ms.           

簡單點了解,如果left join最後帶有where條件,這個where是針對右表的,通過where過濾後可能導緻右表的資料小于左表,而此時left join與join也是等價的,這樣就會以右表作為驅動表

如果left join的where條件是針對左表的,在left join中,左表本來就會作為驅動表,通過where過濾後左表的資料可能更小,更适合作為驅動表

【總結】

本文對left join對兩種問題進行了分析與說明,如有不足之處歡迎指正。

繼續閱讀