索引的十大原則

1、正确了解和計算索引字段的區分度，文中有計算規則，區分度高的索引，可以快速得定位資料，區分度太低，無法有效的利用索引，可能需要掃描大量資料頁，和不使用索引沒什麼差别。

2、正确了解和計算字首索引的字段長度，文中有判斷規則，合适的長度要保證高的區分度和最恰當的索引存儲容量，隻有達到最佳狀态，才是保證高效率的索引。

3、聯合索引注意最左比對原則：必須按照從左到右的順序比對，MySQL會一直向右比對索引直到遇到範圍查詢(>、<、between、like)然後停止比對。

如 depno=1 and empname>'' and job=1 ，如果建立(depno,empname,job)順序的索引，job是用不到索引的。

4、應需而取政策，查詢記錄的時候，不要一上來就使用*，隻取需要的資料，可能的話盡量隻利用索引覆寫，可以減少回表操作，提升效率。

5、正确判斷是否使用聯合索引（上面聯合索引的使用那一小節有說明判斷規則），也可以進一步分析到索引下推（IPC），減少回表操作，提升效率。

6、避免索引失效的原則：禁止對索引字段使用函數、運算符操作，會使索引失效。這是實際上就是需要保證索引所對應字段的”幹淨度“。

7、避免非必要的類型轉換，字元串字段使用數值進行比較的時候會導緻索引無效。

8、模糊查詢'%value%'會使索引無效，變為全表掃描，因為無法判斷掃描的區間，但是'value%'是可以有效利用索引。

9、索引覆寫排序字段，這樣可以減少排序步驟，提升查詢效率

10、盡量的擴充索引，非必要不建立索引。比如表中已經有a的索引，現在要加(a,b)的索引，那麼隻需要修改原來的索引即可。

查詢優化分析器 - explain

explain指令大家應該很熟悉，具體用法和字段含義可以參考官網explain-output，這裡需要強調rows是核心名額，絕大部分rows小的語句執行一定很快，因為掃描的内容基數小。

是以優化語句基本上都是在優化降低rows值。

慢查詢優化基本步驟

1.先運作檢視實際耗時，判斷是否真的很慢（注意設定SQL_NO_CACHE）。

2.高區分度優先政策：where條件單表查，鎖定最小傳回記錄表的條件。

就是查詢語句的where都應用到表中傳回的記錄數最小的表開始查起，單表每個字段分别查詢，看哪個字段的區分度最高。區分度高的字段往前排。

3.explain檢視執行計劃，是否與1預期一緻（從鎖定記錄較少的表開始查詢）

4.order by limit 形式的sql語句讓排序的表優先查

5.了解業務方的使用場景，根據使用場景适時調整。

6.加索引時參照建上面索引的十大原則

7.觀察結果，不符合預期繼續從第一步開始分析

查詢案例分析

下面幾個例子詳細解釋了如何分析和優化慢查詢。

複雜查詢條件的分析

一般來說我們編寫SQL的方式是為了是實作功能，在實作功能的基礎上保證MySQL的執行效率也是非常重要的，這要求我們對MySQL的執行計劃和索引規則有非常清晰的了解，分析下面的案例：

1 mysql> select a.*,b.depname,b.memo from emp a left join 
2 dep b on a.depno = b.depno where sal>100 and a.empname like 'ab%'  and a.depno=106 order by a.hiredate desc ;
3 +---------+---------+---------+---------+-----+---------------------+------+------+-------+------------+----------+
4 | id      | empno   | empname | job     | mgr | hiredate            | sal  | comn | depno | depname    | memo     |
5 +---------+---------+---------+---------+-----+---------------------+------+------+-------+------------+----------+
6 | 4976754 | 4976754 | ABijwE  | SALEMAN |   1 | 2021-01-23 16:46:24 | 2000 | 400  |   106 | kDpNWugzcQ | TYlrVEkm |
7 ......
8 +---------+---------+---------+---------+-----+---------------------+------+------+-------+------------+----------+
9 744 rows in set  (4.958 sec)

總共就查詢了744條資料，卻耗費了4.958的時間，我們看一下目前表中現存的索引以及索引使用的情況分析

1 mysql> show index from emp;
 2 +-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
 3 | Table | Non_unique | Key_name      | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
 4 +-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
 5 | emp   |          0 | PRIMARY       |            1 | id          | A         |     4952492 | NULL     | NULL   |      | BTREE      |         |               |
 6 | emp   |          1 | idx_emo_depno |            1 | depno       | A         |          18 | NULL     | NULL   |      | BTREE      |         |               |
 7 +-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
 8 2 rows in set
 9 
10 mysql> explain select a.*,b.depname,b.memo from emp a left join 
11 dep b on a.depno = b.depno where sal>100 and a.empname like 'ab%'  and a.depno=106 order by a.hiredate desc ;
12 +----+-------------+-------+------+---------------+---------------+---------+-------+--------+-----------------------------+
13 | id | select_type | table | type | possible_keys | key           | key_len | ref   | rows   | Extra                       |
14 +----+-------------+-------+------+---------------+---------------+---------+-------+--------+-----------------------------+
15 |  1 | SIMPLE      | a     | ref  | idx_emo_depno | idx_emo_depno | 3       | const | 974898 | Using where; Using filesort |
16 |  1 | SIMPLE      | b     | ref  | idx_dep_depno | idx_dep_depno | 3       | const |      1 | NULL                        |
17 +----+-------------+-------+------+---------------+---------------+---------+-------+--------+-----------------------------+
18 2 rows in set

可以看出，目前在emp表上除了主鍵隻存在一個索引 idx_emo_depno ，作用在部門編号字段上，該索引的目标是過濾出具體部門編号下的資料。

通過explain 分析器可以看到 where條件後面是走了 idx_emo_depno 索引，但是也比較了 97W的資料，說明該字段的區分度并不高，根據高區分度優先原則，我們對這個表的三個查詢字段分别進行區分度計算。

1 mysql> select count(distinct empname)/count(*),count(distinct depno)/count(*),count(distinct sal)/count(*) from emp; 
2 +----------------------------------+--------------------------------+------------------------------+
3 | count(distinct empname)/count(*) | count(distinct depno)/count(*) | count(distinct sal)/count(*) |
4 +----------------------------------+--------------------------------+------------------------------+
5 | 0.1713                           | 0.0000                         | 0.0000                       |
6 +----------------------------------+--------------------------------+------------------------------+
7 1 row in set

這是計算結果，empname的區分度最高，是以合理上是可以建立一個包含這三個字段的聯合索引，順序如下：empname、depno、sal；

并且查詢條件重新調整了順序，符合最左比對原則；另一方面根據應需而取的政策，把b.memo字段去掉了。

1 mysql> select a.*,b.depname from emp a left join 
2 dep b on a.depno = b.depno where  a.empname like 'ab%'  and a.depno=106 and a.sal>100 order by a.hiredate desc ;
3 +---------+---------+---------+---------+-----+---------------------+------+------+-------+------------+
4 | id      | empno   | empname | job     | mgr | hiredate            | sal  | comn | depno | depname    |
5 +---------+---------+---------+---------+-----+---------------------+------+------+-------+------------+
6 | 4976754 | 4976754 | ABijwE  | SALEMAN |   1 | 2021-01-23 16:46:24 | 2000 | 400  |   106 | kDpNWugzcQ |
7 ......
8 +---------+---------+---------+---------+-----+---------------------+------+------+-------+------------+
9 744 rows in set  (0.006 sec)

這邊還有一個問題，那就是聯合索引根據最左比對原則：必須按照從左到右的順序比對，MySQL會一直向右比對索引直到遇到範圍查詢(>、<、between、like)然後停止比對。

是以語句中執行到a.empname 字段，因為使用了like，後面就不再走索引了。在這個場景中，獨立的empname字段上的索引和這個聯合索引效率是差不多的。

另外排序字段hiredate也可以考慮到覆寫到索引中，會相應的提高效率。

無效索引的分析

有一個需求，使用到了使用者表 userinfo 和消費明細表 salinvest ，目的想把2020年每個使用者在四個品類等級（A1、A2、A3、A4）上的消費額度進行統計，是以便下了如下的腳本：

1 select (@rowNO := @rowNo+1) AS id,bdata.* from 
 2 (
 3 select distinct a.usercode,a.username,
 4 @A1:=IFNULL((select sum(c.ltimenum) from `salinvest` c where c.usercode=a.usercode AND c.gravalue='A1' 
 5 and c.logdate between '2020-01-01' and '2020-12-31'),0) as A1,
 6 @A2:=IFNULL((select sum(c.ltimenum) from `salinvest` c where c.usercode=a.usercode AND c.gravalue='A2' 
 7 and c.logdate between '2020-01-01' and '2020-12-31'),0) as A2,
 8 @A3:=IFNULL((select sum(c.ltimenum) from `salinvest` c where c.usercode=a.usercode AND c.gravalue='A3' 
 9 and c.logdate between '2020-01-01' and '2020-12-31'),0) as A3,
10 @A4:=IFNULL((select sum(c.ltimenum) from `salinvest` c where c.usercode=a.usercode AND c.gravalue='A4' 
11 and c.logdate between '2020-01-01' and '2020-12-31'),0) as A4,
12 ,(@A1+@A2+@A3+@A4) as allnum 
13 from userinfo a 
14 inner JOIN `salinvest` b on a.usercode = b.usercode  
15 where b.logdate between '2020-01-01' and '2020-12-31'
16 order by allnum desc
17 ) as bdata,(SELECT @rowNO:=0) b;

這個查詢看起來貌似沒什麼問題，雖然用到了複合查詢、子查詢，但是如果索引做的正确，也不會有什麼問題。那我們來看看索引，有一個聯合索引，符合我們最左比對原則和高區分度優先原則：

1 mysql> show index from salinvest;
 2 +------------+------------+------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
 3 | Table      | Non_unique | Key_name               | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
 4 +------------+------------+------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
 5 | lnuminvest |          0 | PRIMARY                |            1 | autocode    | A         |           5 | NULL     | NULL   |      | BTREE      |         |               |
 6 | lnuminvest |          1 | idx_salinvest_complex |            1 | usercode      | A         |           2 | NULL     | NULL   | YES  | BTREE      |         |               |
 7 | lnuminvest |          1 | idx_salinvest_complex |            2 | gravalue    | A         |           2 | NULL     | NULL   | YES  | BTREE      |         |               |
 8 | lnuminvest |          1 | idx_salinvest_complex |            3 | logdate     | A         |           2 | NULL     | NULL   | YES  | BTREE      |         |               |
 9 +------------+------------+------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
10 4 rows in set

那我們來看看它的執行效率：

mysql> select (@rowNO := @rowNo+1) AS id,bdata.* from 
(
select (@rowNO := @rowNo+1) AS id,bdata.* from 
(
select distinct a.usercode,a.username,
@A1:=IFNULL((select sum(c.ltimenum) from `salinvest` c where c.usercode=a.usercode AND c.gravalue='A1' 
and c.logdate between '2020-01-01' and '2020-12-31'),0) as A1,
@A2:=IFNULL((select sum(c.ltimenum) from `salinvest` c where c.usercode=a.usercode AND c.gravalue='A2' 
and c.logdate between '2020-01-01' and '2020-12-31'),0) as A2,
@A3:=IFNULL((select sum(c.ltimenum) from `salinvest` c where c.usercode=a.usercode AND c.gravalue='A3' 
and c.logdate between '2020-01-01' and '2020-12-31'),0) as A3,
@A4:=IFNULL((select sum(c.ltimenum) from `salinvest` c where c.usercode=a.usercode AND c.gravalue='A4' 
and c.logdate between '2020-01-01' and '2020-12-31'),0) as A4,
,(@A1+@A2+@A3+@A4) as allnum 
from userinfo a 
inner JOIN `salinvest` b on a.usercode = b.usercode  
where b.logdate between '2020-01-01' and '2020-12-31'
order by allnum desc
) as bdata,(SELECT @rowNO:=0) b;
+----+------------+---------+------+------+------+------+------+--------+
| id | usercode     | username | A1     | A2   | A3   | A4   |allnum
+----+------------+---------+------+------+------+------+------+--------+
|  1 | 063105015    | brand    | 789.00 | 1074.50 | 998.00 | 850.00 |  
......
+----+------------+---------+------+------+------+------+------+--------+
6217 rows in set  (12.745 sec)

我這邊省略了查詢結果，實際上結果輸出6000多條資料，在約50W的資料中進行統計與合并，輸出6000多條資料，花費了将近13秒，這明顯是不合理的。

我們來分析下是什麼原因：

1 mysql> explain select (@rowNO := @rowNo+1) AS id,bdata.* from 
 2 (
 3 select distinct a.usercode,a.username,
 4 @A1:=IFNULL((select sum(c.ltimenum) from `salinvest` c where c.usercode=a.usercode AND c.gravalue='A1' 
 5 and c.logdate between '2020-01-01' and '2020-12-31'),0) as A1,
 6 @A2:=IFNULL((select sum(c.ltimenum) from `salinvest` c where c.usercode=a.usercode AND c.gravalue='A2' 
 7 and c.logdate between '2020-01-01' and '2020-12-31'),0) as A2,
 8 @A3:=IFNULL((select sum(c.ltimenum) from `salinvest` c where c.usercode=a.usercode AND c.gravalue='A3' 
 9 and c.logdate between '2020-01-01' and '2020-12-31'),0) as A3,
10 @A4:=IFNULL((select sum(c.ltimenum) from `salinvest` c where c.usercode=a.usercode AND c.gravalue='A4' 
11 and c.logdate between '2020-01-01' and '2020-12-31'),0) as A4,
12 ,(@A1+@A2+@A3+@A4) as allnum 
13 from userinfo a 
14 inner JOIN `salinvest` b on a.usercode = b.usercode  
15 where b.logdate between '2020-01-01' and '2020-12-31'
16 order by allnum desc
17 ) as bdata,(SELECT @rowNO:=0) b;
18 +----+--------------------+------------+------------+--------+------------------------+------------------------+---------+-----------------------+------+----------+-----------------------------------------------------------+
19 | id | select_type        | table      | partitions | type   | possible_keys          | key                    | key_len | ref                   | rows | filtered | Extra                                                     |
20 +----+--------------------+------------+------------+--------+------------------------+------------------------+---------+-----------------------+------+----------+-----------------------------------------------------------+
21 |  1 | PRIMARY            | <derived8> | NULL       | system | NULL                   | NULL                   | NULL    | NULL                  |    1 |      100 | NULL                                                      |
22 |  1 | PRIMARY            | <derived2> | NULL       | ALL    | NULL                   | NULL                   | NULL    | NULL                  |    2 |      100 | NULL                                                      |
23 |  8 | DERIVED            | NULL       | NULL       | NULL   | NULL                   | NULL                   | NULL    | NULL                  | NULL | NULL     | No tables used                                            |
24 |  2 | DERIVED            | b          | NULL       | index  | idx_salinvest_complex | idx_salinvest_complex | 170     | NULL                  |    5 |       20 | Using where; Using index; Using temporary; Using filesort |
25 |  7 | DEPENDENT SUBQUERY | c          | NULL       | ALL    | idx_salinvest_complex | NULL                   | NULL    | NULL                  |    5 |       20 | Using where                                               |
26 |  6 | DEPENDENT SUBQUERY | c          | NULL       | ALL    | idx_salinvest_complex | NULL                   | NULL    | NULL                  |    5 |       20 | Using where                                               |
27 |  5 | DEPENDENT SUBQUERY | c          | NULL       | ALL    | idx_salinvest_complex | NULL                   | NULL    | NULL                  |    5 |       20 | Using where                                               |
28 |  4 | DEPENDENT SUBQUERY | c          | NULL       | ALL    | idx_salinvest_complex | NULL                   | NULL    | NULL                  |    5 |       20 | Using where                                               |
29 +----+--------------------+------------+------------+--------+------------------------+------------------------+---------+-----------------------+------+----------+-----------------------------------------------------------+
30 9 rows in set

看最後四條資料，看他的possible_key和實際的key，預估是走 idx_salinvest_complex 索引，實際是走了空索引，這個是為什麼呢？看前面的select_type 字段，值是 DEPENDENT SUBQUERY，了然了。

官方對 DEPENDENT SUBQUERY 的說明：子查詢中的第一個SELECT，取決于外面的查詢。

什麼意思呢？它意味着兩步：

第一步，MySQL 根據 select distinct a.usercode,a.username 得到一個大結果集 t1，這就是我們上圖提示的6000使用者。

第二步，上面的大結果集 t1 中的每一條記錄，等同于與子查詢 SQL 組成新的查詢語句： select sum(c.ltimenum) from `salinvest` c where c.usercode in (select distinct a.usercode from userinfo a) 。

也就是說，每個子查詢要比較6000次，幾十萬的資料啊……即使這兩步驟查詢都用到了索引，但還是會很慢。

這種情況下，子查詢的執行效率受制于外層查詢的記錄數，還不如拆成兩個獨立查詢順序執行呢。

這種慢查詢的解決辦法，網上有很多方案，最常用的辦法是用聯合查詢代替子查詢，可以自己去查一下。

總結

上面給出了兩種典型的問題，一種是沒有使用索引使用原則進行索引建構，一種是遇到坑導緻索引無效。我們在實際的應用中遇到過很多問題。比如：

1、不多的資料結果集，但是涉及到超多個表join的低效腳本。

2、無效的聯表查詢，就是其中一張聯表沒有任何使用，但是資料基數極大。（這也可能是某個時候業務變更導緻的sql腳本忘了調整）。

3、varchar類型字段等值比較沒有寫單引号，巨量基數笛卡爾積查詢直接把從庫搞死，在索引字段上做計算導緻索引失效的。

所有的這些案例都隻是一些經驗積累，隻有熟悉查詢優化器、索引的内部原理，了解索引優化的政策，才能定位這些問題的原因并加以解決。

為幫助開發者們提升面試技能、有機會入職BATJ等大廠公司，特别制作了這個專輯——這一次整體放出。

大緻内容包括了： Java 集合、JVM、多線程、并發程式設計、設計模式、Spring全家桶、Java、MyBatis、ZooKeeper、Dubbo、Elasticsearch、Memcached、MongoDB、Redis、MySQL、RabbitMQ、Kafka、Linux、Netty、Tomcat等大廠面試題等、等技術棧！

歡迎大家關注公衆号【Java爛豬皮】，回複【666】，擷取以上最新Java後端架構VIP學習資料以及視訊學習教程，然後一起學習，一文在手，面試我有。

每一個專欄都是大家非常關心，和非常有價值的話題，如果我的文章對你有所幫助，還請幫忙點贊、好評、轉發一下，你的支援會激勵我輸出更高品質的文章，非常感謝！

深入解析MySQL：建構高性能索引（案例分析篇）

索引的十大原則

查詢優化分析器 - explain

慢查詢優化基本步驟

查詢案例分析

複雜查詢條件的分析

無效索引的分析

總結

繼續閱讀

2022秋招面試總結（cpp+java+測開）百度測開一面位元組後端一面蝦皮後端一面蝦皮後端二面

資料庫之DDL操作資料庫DDL操作資料庫DDL操作資料表

資料庫之DQL操作資料庫

mysql優化（sql優化）

資料遷移方法資料遷移原則資料遷移之雙寫方案資料遷移之級聯同步方案

redis叢集資料一緻性_RedisRaft為Redis叢集帶來強大的資料一緻性

寶塔面闆mysql恢複2018.1.8更新

Centos7 MySQL 5.7 安裝MySQL 5.7 安裝

查找入職員工時間排名倒數第三的員工所有資訊

Hibernate使用Hibernate的“3個準備，7個步驟”Hibernate API簡介操作實體對象對象識别

雲計算面試題——mysql/存儲引擎/備份

SQL語言基礎：常用的資料查詢語句

Ubuntu16.04安裝Apache+MySQL+PHP1. 安裝Apache2. 安裝MySQL3. 安裝PHP4. 安裝phpMyAdmin

MySQL的4種隔離級别？出現問題

neo4j之cypher使用文檔

mysql使用source指令導入.sql檔案