天天看點

正确了解Left join

1.1 測試資料

root@mysql 11:24:  [db1]> show create table t1\G
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `uid` int(11) NOT NULL AUTO_INCREMENT,
  `uname` varchar(10) DEFAULT NULL,
  `is_delete` int(11) DEFAULT '0',
  PRIMARY KEY (`uid`),
  KEY `idx_uname` (`uname`)
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=utf8mb4
1 row in set (0.00 sec)

root@mysql 11:24:  [db1]> show create table t2\G
*************************** 1. row ***************************
       Table: t2
Create Table: CREATE TABLE `t2` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `uname` varchar(10) DEFAULT NULL,
  `uage` varchar(10) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_uname` (`uname`)
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=utf8mb4
1 row in set (0.00 sec)

root@mysql 11:24:  [db1]> select * from t1;
+-----+-------+-----------+
| uid | uname | is_delete |
+-----+-------+-----------+
|   1 | aa    |         0 |
|   2 | bb    |         0 |
|   3 | cc    |         0 |
|   4 | dd    |         0 |
|   5 | ee    |         0 |
+-----+-------+-----------+
5 rows in set (0.00 sec)

root@mysql 11:24:  [db1]> select * from t2;
+----+-------+------+
| id | uname | uage |
+----+-------+------+
|  1 | aa    | 12   |
|  2 | aa    | 13   |
|  5 | cc    | 12   |
+----+-------+------+
3 rows in set (0.00 sec)           

1.2 條件放在on和where之前的差別

1、從結果顯示來看

從結果顯示來看兩者是完全不同的:

1)當把過濾條件寫在and上時,傳回結果集中會顯示左表全部記錄,右表滿足條件的記錄正常顯示,不滿足條件的記錄顯示為NULL,是我們通俗了解上left join應該顯示的結果;

2)而當把過濾條件寫在where上時,雖然SQL中我們使用了left join去做表關聯,但是實際結果集中并不是我們想要的傳回,結果隻是傳回了滿足條件的所有記錄。

root@mysql 11:24:  [db1]>  select * from t1 left join t2 on t1.uname=t2.uname and t2.uage>12;
+-----+-------+-----------+------+-------+------+
| uid | uname | is_delete | id   | uname | uage |
+-----+-------+-----------+------+-------+------+
|   1 | aa    |         0 |    2 | aa    | 13   |
|   2 | bb    |         0 | NULL | NULL  | NULL |
|   3 | cc    |         0 | NULL | NULL  | NULL |
|   4 | dd    |         0 | NULL | NULL  | NULL |
|   5 | ee    |         0 | NULL | NULL  | NULL |
+-----+-------+-----------+------+-------+------+
5 rows in set (0.00 sec)

root@mysql 11:25:  [db1]>  select * from t1 left join t2 on t1.uname=t2.uname where t2.uage>12;
+-----+-------+-----------+------+-------+------+
| uid | uname | is_delete | id   | uname | uage |
+-----+-------+-----------+------+-------+------+
|   1 | aa    |         0 |    2 | aa    | 13   |
+-----+-------+-----------+------+-------+------+
1 row in set (0.00 sec)
           

2、從執行計劃來看

1)當把過濾條件寫在and上時,執行計劃沒有做過多的改寫,左表t1作為驅動表與t2進行關聯查詢;

2)當把過濾條件寫在where上時,我們發現MySQL對原SQL進行了改寫,最重要的一點是将left join改寫為join,這個動作就導緻SQL在執行計劃中會優先選擇小表作為驅動表,而并不一定是左表t1作為驅動表。

root@mysql 11:26:  [db1]> explain  select * from t1 left join t2 on t1.uname=t2.uname and t2.uage>12;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                              |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
|  1 | SIMPLE      | t1    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    5 |   100.00 | NULL                                               |
|  1 | SIMPLE      | t2    | NULL       | ALL  | idx_uname     | NULL | NULL    | NULL |    3 |   100.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
2 rows in set, 1 warning (0.00 sec)

root@mysql 11:26:  [db1]> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: /* select#1 */ select `db1`.`t1`.`uid` AS `uid`,`db1`.`t1`.`uname` AS `uname`,`db1`.`t1`.`is_delete` AS `is_delete`,`db1`.`t2`.`id` AS `id`,`db1`.`t2`.`uname` AS `uname`,`db1`.`t2`.`uage` AS `uage` from `db1`.`t1` left join `db1`.`t2` on(((`db1`.`t2`.`uname` = `db1`.`t1`.`uname`) and (`db1`.`t2`.`uage` > 12))) where 1
1 row in set (0.00 sec)

root@mysql 11:26:  [db1]>
root@mysql 11:26:  [db1]>
root@mysql 11:26:  [db1]> explain  select * from t1 left join t2 on t1.uname=t2.uname where t2.uage>12;
+----+-------------+-------+------------+------+---------------+-----------+---------+--------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key       | key_len | ref          | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+-----------+---------+--------------+------+----------+-------------+
|  1 | SIMPLE      | t2    | NULL       | ALL  | idx_uname     | NULL      | NULL    | NULL         |    3 |    33.33 | Using where |
|  1 | SIMPLE      | t1    | NULL       | ref  | idx_uname     | idx_uname | 43      | db1.t2.uname |    1 |   100.00 | NULL        |
+----+-------------+-------+------------+------+---------------+-----------+---------+--------------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)

root@mysql 11:26:  [db1]> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: /* select#1 */ select `db1`.`t1`.`uid` AS `uid`,`db1`.`t1`.`uname` AS `uname`,`db1`.`t1`.`is_delete` AS `is_delete`,`db1`.`t2`.`id` AS `id`,`db1`.`t2`.`uname` AS `uname`,`db1`.`t2`.`uage` AS `uage` from `db1`.`t1` join `db1`.`t2` where ((`db1`.`t1`.`uname` = `db1`.`t2`.`uname`) and (`db1`.`t2`.`uage` > 12))
1 row in set (0.00 sec)

           

1.3 left join .. where .. is null 的使用

基于以上的案例,我們可以得出以下結論:

1)left join會傳回左表所有記錄,右表滿足過濾條件記錄正常回報,不滿足記錄傳回NULL處理,join隻會傳回兩表均滿足過濾條件的記錄

2)left join把過濾條件寫在on條件上時才是我們通俗了解上的left join,而left join中将表過濾條件寫在where上時,MySQL會把left join改寫為join。

3)left join關聯查詢時,表的驅動順序是确定的,左表作為驅動表與右表進行關聯查詢,但是若MySQL優化器将left join改寫為join的情況下,MySQL就會優先選擇小表作為驅動表進行關聯查詢,一定程度上提升了SQL的執行效率。

但是,對于第一/二條,将過濾條件放在where條件上的時候,MySQL優化器就一定會将left join 改寫為join嗎?left join一定會傳回左表全部記錄嗎?答案顯然是“不一定的”,以下就為大家展示一個特例。

root@mysql 11:42:  [db1]> select * from t1 left join t2 on t1.uname=t2.uname;
+-----+-------+-----------+------+-------+------+
| uid | uname | is_delete | id   | uname | uage |
+-----+-------+-----------+------+-------+------+
|   1 | aa    |         0 |    1 | aa    | 12   |
|   1 | aa    |         0 |    2 | aa    | 13   |
|   3 | cc    |         0 |    5 | cc    | 12   |
|   2 | bb    |         0 | NULL | NULL  | NULL |
|   4 | dd    |         0 | NULL | NULL  | NULL |
|   5 | ee    |         0 | NULL | NULL  | NULL |
+-----+-------+-----------+------+-------+------+
6 rows in set (0.00 sec)

root@mysql 11:41:  [db1]> explain  select * from t1 left join t2 on t1.uname=t2.uname where t2.uage is null;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                              |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
|  1 | SIMPLE      | t1    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    5 |   100.00 | NULL                                               |
|  1 | SIMPLE      | t2    | NULL       | ALL  | idx_uname     | NULL | NULL    | NULL |    3 |    33.33 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
2 rows in set, 1 warning (0.00 sec)

root@mysql 11:41:  [db1]> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: /* select#1 */ select `db1`.`t1`.`uid` AS `uid`,`db1`.`t1`.`uname` AS `uname`,`db1`.`t1`.`is_delete` AS `is_delete`,`db1`.`t2`.`id` AS `id`,`db1`.`t2`.`uname` AS `uname`,`db1`.`t2`.`uage` AS `uage` from `db1`.`t1` left join `db1`.`t2` on((`db1`.`t2`.`uname` = `db1`.`t1`.`uname`)) where isnull(`db1`.`t2`.`uage`)
1 row in set (0.00 sec)

root@mysql 11:41:  [db1]> select * from t1 left join t2 on t1.uname=t2.uname where t2.uage is null;
+-----+-------+-----------+------+-------+------+
| uid | uname | is_delete | id   | uname | uage |
+-----+-------+-----------+------+-------+------+
|   2 | bb    |         0 | NULL | NULL  | NULL |
|   4 | dd    |         0 | NULL | NULL  | NULL |
|   5 | ee    |         0 | NULL | NULL  | NULL |
+-----+-------+-----------+------+-------+------+
3 rows in set (0.00 sec)           

可以看到,我們使用left join,但是where 過濾條件是右表某些字段is null的查詢時。首先從執行計劃來看,MySQL優化器并沒有将left join改寫為join;然後從結果傳回來看,可以看到該SQL僅僅傳回了滿足過濾條件的記錄,并沒有傳回左表全部記錄。

對于left join,過濾條件是右表某字段is null的情況是一個特例,而且這種寫法經常被DBA同學用來做業務上一些not in/not exists的改寫優化。

1.4 結論

3)left join且where過濾條件為右表某字段is null時屬于一個特例,該情況下MySQL不會将left join改寫為join,進而其表關聯查詢的順序也隻能是左表作為驅動表與右表進行關聯查詢。

4)left join關聯查詢時,表的驅動順序是确定的,左表作為驅動表與右表進行關聯查詢,但是若MySQL優化器将left join改寫為join的情況下,MySQL就會優先選擇小表作為驅動表進行關聯查詢,一定程度上提升了SQL的執行效率。