MySQL 8.0 Server層最新架構詳解

作者 | 道客

來源 | 阿裡技術公衆号

一背景和架構

本文基于MySQL 8.0.25源碼進行分析和總結。這裡MySQL Server層指的是MySQL的優化器、執行器部分。我們對MySQL的了解還建立在5.6和5.7版本的了解之上，更多的是對比PostgreSQL或者傳統資料庫。然而從MySQL 8.0開始，持續每三個月的疊代和重構工作，使得MySQL Server層的整體架構有了質的飛越。下面來看下MySQL最新的架構。

我們可以看到最新的MySQL的分層架構和其他資料庫并沒有太大的差別，另外值得一提的是從圖中可以看出MySQL現在更多的加強InnoDB、NDB叢集和RAPID(HeatWave clusters)記憶體叢集架構的演進。下面我們就看下具體細節，我們這次不随着官方的Feature實作和重構順序進行了解，本文更偏向于從優化器、執行器的流程角度來演進。

二 MySQL 解析器Parser

首先從Parser開始，官方MySQL 8.0使用Bison進行了重寫，生成Parser Tree，同時Parser Tree會contextualize生成MySQL抽象文法樹（Abstract Syntax Tree）。

MySQL抽象文法樹和其他資料庫有些不同，是由比較讓人拗口的SELECT_LEX_UNIT/SELECT_LEX類交替構成的，然而這兩個結構在最新的版本中已經重命名成标準的SELECT_LEX -> Query_block和SELECT_LEX_UNIT -> Query_expression。Query_block是代表查詢塊，而Query_expression是包含多個查詢塊的查詢表達式，包括UNION AND/OR的查詢塊（如SELECT FROM t1 union SELECT FROM t2）或者有多Level的ORDER BY/LIMIT (如SELECT * FROM t1 ORDER BY a LIMIT 10) ORDER BY b LIMIT 5。

例如，來看一個複雜的嵌套查詢：

(SELECT *
   FROM ttt1)
UNION ALL
  (SELECT *
   FROM
     (SELECT *
      FROM ttt2) AS a,
     (SELECT *
      FROM ttt3
      UNION ALL SELECT *
      FROM ttt4) AS b)

在MySQL中就可以用下面方式表達：

經過解析和轉換後的文法樹仍然建立在Query_block和Query_expression的架構下，隻不過有些LEVEL的query block被消除或者合并了，這裡不再詳細展開。

三 MySQL prepare/rewrite階段

接下來我們要經過resolve和transformation過程Query_expression::prepare->Query_block::prepare，這個過程包括（按功能分而非完全按照執行順序）：

1 Setup and Fix

setup_tables：Set up table leaves in the query block based on list of tables.
resolve_placeholder_tables/merge_derived/setup_table_function/setup_materialized_derived：Resolve derived table, view or table function references in query block.
setup_natural_join_row_types：Compute and store the row types of the top-most NATURAL/USING joins.
setup_wild：Expand all '*' in list of expressions with the matching column references.
setup_base_ref_items：Set query_block's base_ref_items.
setup_fields：Check that all given fields exists and fill struct with current data.
setup_conds：Resolve WHERE condition and join conditions.
setup_group：Resolve and set up the GROUP BY list.
m_having_cond->fix_fields：Setup the HAVING clause.
resolve_rollup：Resolve items in SELECT list and ORDER BY list for rollup processing.
resolve_rollup_item：Resolve an item (and its tree) for rollup processing by replacing items matching grouped expressions with Item_rollup_group_items and updating properties (m_nullable, PROP_ROLLUP_FIELD). Also check any GROUPING function for incorrect column.
setup_order：Set up the ORDER BY clause.
resolve_limits：Resolve OFFSET and LIMIT clauses.
Window::setup_windows1：Set up windows after setup_order() and before setup_order_final().
setup_order_final：Do final setup of ORDER BY clause, after the query block is fully resolved.
setup_ftfuncs：Setup full-text functions after resolving HAVING.
resolve_rollup_wfs : Replace group by field references inside window functions with references in the presence of ROLLUP.

2 Transformation

remove_redundant_subquery_clause : Permanently remove redundant parts from the query if 1) This is a subquery 2) Not normalizing a view. Removal should take place when a query involving a view is optimized, not when the view is created.
remove_base_options：Remove SELECT_DISTINCT options from a query block if can skip distinct.
resolve_subquery : Resolve predicate involving subquery, perform early unconditional subquery transformations.
- Convert subquery predicate into semi-join, or
- Mark the subquery for execution using materialization, or
- Perform IN->EXISTS transformation, or
- Perform more/less ALL/ANY -> MIN/MAX rewrite
- Substitute trivial scalar-context subquery with its value
transform_scalar_subqueries_to_join_with_derived：Transform eligible scalar subqueries to derived tables.
flatten_subqueries：Convert semi-join subquery predicates into semi-join join nests. Convert candidate subquery predicates into semi-join join nests. This transformation is performed once in query lifetime and is irreversible.
apply_local_transforms :
- delete_unused_merged_columns : If query block contains one or more merged derived tables/views, walk through lists of columns in select lists and remove unused columns.
- simplify_joins：Convert all outer joins to inner joins if possible
- prune_partitions：Perform partition pruning for a given table and condition.
push_conditions_to_derived_tables：Pushing conditions down to derived tables must be done after validity checks of grouped queries done by apply_local_transforms();
Window::eliminate_unused_objects：Eliminate unused window definitions, redundant sorts etc.

這裡，節省篇幅，我們隻舉例關注下和top_join_list相關的simple_joins這個函數的作用，對于Query_block中嵌套join的簡化過程。

3 對比PostgreSQL

為了更清晰的了解标準資料庫的做法，我們這裡引用了PostgreSQL的這三個過程：

Parser

下圖首先Parser把SQL語句生成parse tree。

testdb=# SELECT id, data FROM tbl_a WHERE id < 300 ORDER BY data;

Analyzer/Analyser

下圖展示了PostgreSQL的analyzer/analyser如何将parse tree通過語義分析後生成query tree。

Rewriter

Rewriter會根據規則系統中的規則把query tree進行轉換改寫。

sampledb=# CREATE VIEW employees_list 
sampledb-#      AS SELECT e.id, e.name, d.name AS department 
sampledb-#            FROM employees AS e, departments AS d WHERE e.department_id = d.id;

下圖的例子就是一個包含view的query tree如何展開成新的query tree。

sampledb=# SELECT * FROM employees_list;

四 MySQL Optimize和Planning階段

接下來我們進入了邏輯計劃生成實體計劃的過程，本文還是注重于結構的解析，而不去介紹生成的細節，MySQL過去在8.0.22之前，主要依賴的結構就是JOIN和QEP_TAB。JOIN是與之對應的每個Query_block，而QEP_TAB對應的每個Query_block涉及到的具體“表”的順序、方法和執行計劃。然而在8.0.22之後，新的基于Hypergraph的優化器算法成功的抛棄了QEP_TAB結構來表達左深樹的執行計劃，而直接使用HyperNode/HyperEdge的圖來表示執行計劃。

舉例可以看到資料結構表達的left deep tree和超圖結構表達的bushy tree對應的不同計劃展現：

| -> Inner hash join (no condition)  (cost=1.40 rows=1)
    -> Table scan on R4  (cost=0.35 rows=1)
    -> Hash
        -> Inner hash join (no condition)  (cost=1.05 rows=1)
            -> Table scan on R3  (cost=0.35 rows=1)
            -> Hash
                -> Inner hash join (no condition)  (cost=0.70 rows=1)
                    -> Table scan on R2  (cost=0.35 rows=1)
                    -> Hash
                        -> Table scan on R1  (cost=0.35 rows=1)

| -> Nested loop inner join  (cost=0.55..0.55 rows=0)
    -> Nested loop inner join  (cost=0.50..0.50 rows=0)
        -> Table scan on R4  (cost=0.25..0.25 rows=1)
        -> Filter: (R4.c1 = R3.c1)  (cost=0.35..0.35 rows=0)
            -> Table scan on R3  (cost=0.25..0.25 rows=1)
    -> Nested loop inner join  (cost=0.50..0.50 rows=0)
        -> Table scan on R2  (cost=0.25..0.25 rows=1)
        -> Filter: (R2.c1 = R1.c1)  (cost=0.35..0.35 rows=0)
            -> Table scan on R1  (cost=0.25..0.25 rows=1)

MySQL8.0.2x為了更好的相容兩種優化器，引入了新的類AccessPath，可以認為這是MySQL為了解耦執行器和不同優化器抽象出來的Plan Tree。

1 老優化器的入口

老優化器仍然走JOIN::optimize來把query block轉換成query execution plan (QEP)。

這個階段仍然做一些邏輯的重寫工作，這個階段的轉換可以了解為基于cost-based優化前做準備，詳細步驟如下：

Logical transformations
- optimize_derived : Optimize the query expression representing a derived table/view.
- optimize_cond : Equality/constant propagation.
- prune_table_partitions : Partition pruning.
- optimize_aggregated_query : COUNT(*), MIN(), MAX() constant substitution in case of implicit grouping.
- substitute_gc : ORDER BY optimization, substitute all expressions in the WHERE condition and ORDER/GROUP lists that match generated columns (GC) expressions with GC fields, if any.
Perform cost-based optimization of table order and access path selection.
- JOIN::make_join_plan() : Set up join order and initial access paths.
Post-join order optimization
- substitute_for_best_equal_field : Create optimal table conditions from the where clause and the join conditions.
- make_join_query_block : Inject outer-join guarding conditions.
- Adjust data access methods after determining table condition (several times).
- optimize_distinct_group_order : Optimize ORDER BY/DISTINCT.
- optimize_fts_query : Perform FULLTEXT search before all regular searches.
- remove_eq_conds : Removes const and eq items. Returns the new item, or nullptr if no condition.
- replace_index_subquery/create_access_paths_for_index_subquery : See if this subquery can be evaluated with subselect_indexsubquery_engine.
- setup_join_buffering : Check whether join cache could be used.
Code generation
- alloc_qep(tables) : Create QEP_TAB array.
- test_skip_sort : Try to optimize away sorting/distinct.
- make_join_readinfo : Plan refinement stage: do various setup things for the executor.
- make_tmp_tables_info : Setup temporary table usage for grouping and/or sorting.
- push_to_engines : Push (parts of) the query execution down to the storage engines if they can provide faster execution of the query, or part of it.
- create_access_paths : generated ACCESS_PATH.

2 新優化器的入口

新優化器預設不打開，必須通過set optimizer_switch="hypergraph_optimizer=on"; 來打開。主要通過FindBestQueryPlan函數來實作，邏輯如下：

先判斷是否屬于新優化器可以支援的Query文法（CheckSupportedQuery），不支援的直接傳回錯誤ER_HYPERGRAPH_NOT_SUPPORTED_YET。

轉化top_join_list變成JoinHypergraph結構。由于Hypergraph是比較獨立的算法層面的實作，JoinHypergraph結構用來更好的把資料庫的結構包裝到Hypergraph的edges和nodes的概念上的。

通過EnumerateAllConnectedPartitions實作論文中的DPhyp算法。

CostingReceiver類包含了過去JOIN planning的主要邏輯，包括根據cost選擇相應的通路路徑，根據DPhyp生成的子計劃進行評估，保留cost最小的子計劃。

得到root_path後，接下來處理group/agg/having/sort/limit的。對于Group by操作，目前Hypergraph使用sorting first + streaming aggregation的方式。

舉例看下Plan（AccessPath）和SQL的關系：

最後生成Iterator執行器架構需要的Iterator執行載體，AccessPath和Iterator是一對一的關系（Access paths are a query planning structure that correspond 1:1 to iterators）。

Query_expression::m_root_iterator = CreateIteratorFromAccessPath(......)

unique_ptr_destroy_only<RowIterator> CreateIteratorFromAccessPath(
     THD *thd, AccessPath *path, JOIN *join, bool eligible_for_batch_mode) {
......
   switch (path->type) {
     case AccessPath::TABLE_SCAN: {
       const auto &param = path->table_scan();
       iterator = NewIterator<TableScanIterator>(
           thd, param.table, path->num_output_rows, examined_rows);
       break;
     }
     case AccessPath::INDEX_SCAN: {
       const auto &param = path->index_scan();
       if (param.reverse) {
         iterator = NewIterator<IndexScanIterator<true>>(
             thd, param.table, param.idx, param.use_order, path->num_output_rows,
             examined_rows);
       } else {
         iterator = NewIterator<IndexScanIterator<false>>(
             thd, param.table, param.idx, param.use_order, path->num_output_rows,
             examined_rows);
       }
       break;
     }
     case AccessPath::REF: {
......
}

testdb=# EXPLAIN SELECT * FROM tbl_a WHERE id < 300 ORDER BY data;
                          QUERY PLAN                           
---------------------------------------------------------------
 Sort  (cost=182.34..183.09 rows=300 width=8)
   Sort Key: data
   ->  Seq Scan on tbl_a  (cost=0.00..170.00 rows=300 width=8)
         Filter: (id < 300)
(4 rows)

五總結

本文主要focus在MySQL最新版本官方的源碼上，重點分析了官方的重構在多階段和各階段結構上的變化和聯系，更多的是為了讓大家了解一個全新的MySQL的發展。

關于我們

PolarDB 是阿裡巴巴自主研發的雲原生分布式關系型資料庫，于2020年進入Gartner全球資料庫Leader象限，并獲得了2020年中國電子學會頒發的科技進步一等獎。PolarDB 基于雲原生分布式資料庫架構，提供大規模線上事務處理能力，兼具對複雜查詢的并行處理能力，在雲原生分布式資料庫領域整體達到了國際領先水準，并且得到了廣泛的市場認可。在阿裡巴巴集團内部的最佳實踐中，PolarDB還全面支撐了2020年天貓雙十一，并重新整理了資料庫處理峰值記錄，高達1.4億TPS。歡迎有志之士加入我們，履歷請投遞到[email protected]，期待與您共同打造世界一流的下一代雲原生分布式關系型資料庫。

資料庫技術圖譜

資料庫技術圖譜由阿裡雲資料庫專家團出品，含7個知識點，17個課程，6個體驗場景，9個公開課，帶你從基礎到進階玩轉常見資料庫，同時深入實戰，學習阿裡雲使用各資料庫的最佳實踐！

點選這裡

，開始學習吧~

MySQL 8.0 Server層最新架構詳解

一背景和架構

二 MySQL 解析器Parser

三 MySQL prepare/rewrite階段

1 Setup and Fix

2 Transformation

3 對比PostgreSQL

四 MySQL Optimize和Planning階段

1 老優化器的入口

2 新優化器的入口

五總結

關于我們

繼續閱讀

GitHub連夜封殺！這份阿裡 10W 字内部 Java 字面試手冊到底有多強？

spark/scala關于【資源檔案】加載方法概述外部檔案加載方案測試資源檔案打包入jar包中小結

NOSQL安全攻擊

mybatis_入門程式Mybatis入門

mysql使用source指令導入.sql檔案

AOP程式設計_Android優雅權限架構(1)概念基礎，2021金三銀四前言正文大綱正文

登入plsql 報錯 the account is locked --使用者被鎖

sqlServer根據經緯查距離

Effective Java 8:通用程式設計

SequoiaDB巨杉資料庫C++驅動概述

OOM三種類型

工廠模式-三種類型

【遞歸】高效率求2的n次幂

win10本地scala和spark安裝安裝scala安裝spark

scala (3) Function 和 Method

hdu7108哈希

MySQL 8.0 Server層最新架構詳解

一 背景和架構

二 MySQL 解析器Parser

三 MySQL prepare/rewrite階段

1 Setup and Fix

2 Transformation

3 對比PostgreSQL

四 MySQL Optimize和Planning階段

1 老優化器的入口

2 新優化器的入口

五 總結

關于我們

繼續閱讀

一背景和架構

五總結