天天看点

【clickhouse】Clickhouse 原生执行计划1.概述2.执行计划:3.AST 语法树:4. syntax语法优化:

文章目录

  • 1.概述
  • 2.执行计划:
  • 3.AST 语法树:
  • 4. syntax语法优化:
【clickhouse】Clickhouse 原生执行计划1.概述2.执行计划:3.AST 语法树:4. syntax语法优化:

1.概述

转载:https://vkingnew.blog.csdn.net/article/details/107390097

在clickhouse 20.6版本之前要查看SQL语句的执行计划需要设置日志级别为trace才能可以看到,在20.6版本引入了原生的执行计划的语法。在20.6.3版本成为正式版本的功能。

Clickhouse> select version();
 
SELECT version()
 
┌─version()───┐
│ 20.6.1.4066 │
└─────────────┘
 
1 rows in set. Elapsed: 0.007 sec. 
 
           

执行计划的语法:

PLAN     用于查看执行计划,默认值。
  header       打印计划中各个步骤的 head 说明,默认关闭,默认值0;
  description  打印计划中各个步骤的描述,默认开启,默认值1;
  actions      打印计划中各个步骤的详细信息,默认关闭,默认值0。
AST      用于查看语法树;
SYNTAX   用于优化语法;
PIPELINE 用于查看 PIPELINE 计划。
  header     打印计划中各个步骤的 head 说明,默认关闭;
  graph     用DOT图形语言描述管道图,默认关闭,需要查看相关的图形需要配合graphviz查看;
  actions   如果开启了graph,紧凑打印打,默认开启。
           

PLAN 和 PIPELINE 还可以进行额外的显示设置,如上所示。

2.执行计划:

简单的执行计划示例:

Clickhouse> explain plan select arrayJoin([1,2,3,null,null]);
 
EXPLAIN
SELECT arrayJoin([1, 2, 3, NULL, NULL])
FORMAT TSV
 
Union
  Expression (Projection)
    Expression (Before ORDER BY and SELECT)
      ReadFromStorage (Read from SystemOne)
 
4 rows in set. Elapsed: 0.005 sec. 
 
           

复杂SQL的执行计划:

Clickhouse> explain select database,table,count(1) cnt from system.parts where database in ('datasets','system') group by database,table order by database,cnt desc limit 2 by database;
 
EXPLAIN
SELECT 
    database,
    table,
    count(1) AS cnt
FROM system.parts
WHERE database IN ('datasets', 'system')
GROUP BY 
    database,
    table
ORDER BY 
    database ASC,
    cnt DESC
LIMIT 2 BY database
FORMAT TSV
 
————————————————
Union
  Expression (Projection)
    LimitBy
      Expression (Before LIMIT BY)
        MergingSorted (Merge sorted streams for ORDER BY)
          MergeSorting (Merge sorted blocks for ORDER BY)
            PartialSorting (Sort each block for ORDER BY)
              Expression (Before ORDER BY and SELECT)
                Aggregating
                  Expression (Before GROUP BY)
                    Filter (WHERE)
                      ReadFromStorage (Read from SystemParts)
 
12 rows in set. Elapsed: 0.015 sec. 
           

可以看到explain 默认等价于explain plan。

打开全部的参数的执行计划:

Clickhouse> EXPLAIN header=1, actions=1,description=1 SELECT number from system.numbers limit 10;;
 
EXPLAIN header = 1, actions = 1, description = 1
SELECT number
FROM system.numbers
LIMIT 10
FORMAT TSV
 
Union
Header: number UInt64 UInt64(size = 0)
  Expression (Projection)
  Header: number UInt64 UInt64(size = 0)
  Actions: PROJECT number
    Limit (preliminary LIMIT)
    Header: number UInt64 UInt64(size = 0)
    Limit 10
    Offset 0
      Expression (Before ORDER BY and SELECT)
      Header: number UInt64 UInt64(size = 0)
        ReadFromStorage (Read from SystemNumbers)
        Header: number UInt64 UInt64(size = 0)
 
13 rows in set. Elapsed: 0.004 sec. 
           

3.AST 语法树:

Clickhouse> EXPLAIN AST SELECT number from system.numbers limit 10;;
 
EXPLAIN AST
SELECT number
FROM system.numbers
LIMIT 10
FORMAT TSV
 
SelectWithUnionQuery (children 1)
 ExpressionList (children 1)
  SelectQuery (children 3)
   ExpressionList (children 1)
    Identifier number
   TablesInSelectQuery (children 1)
    TablesInSelectQueryElement (children 1)
     TableExpression (children 1)
      Identifier system.numbers
   Literal UInt64_10
 
10 rows in set. Elapsed: 0.004 sec. 
           

4. syntax语法优化:

Clickhouse> SELECT number,number%3 = 1 ? 'Clickhouse' : (number%3 = 2 ? 'TiDB' : 'Unkown') stmt FROM numbers(10);
 
SELECT 
    number,
    if((number % 3) = 1, 'Clickhouse', if((number % 3) = 2, 'TiDB', 'Unkown')) AS stmt
FROM numbers(10)
 
┌─number─┬─stmt───────┐
│      0 │ Unkown     │
│      1 │ Clickhouse │
│      2 │ TiDB       │
│      3 │ Unkown     │
│      4 │ Clickhouse │
│      5 │ TiDB       │
│      6 │ Unkown     │
│      7 │ Clickhouse │
│      8 │ TiDB       │
│      9 │ Unkown     │
└────────┴────────────┘
 
10 rows in set. Elapsed: 0.016 sec. 
 
           

语法优化:

Clickhouse> SET optimize_if_chain_to_multiif = 1;
 
SET optimize_if_chain_to_multiif = 1
 
Ok.
 
0 rows in set. Elapsed: 0.011 sec. 
 
Clickhouse> explain SYNTAX  SELECT number,number%3 = 1 ? 'Clickhouse' : (number%3 = 2 ? 'TiDB' : 'Unkown') stmt FROM numbers(10);
 
EXPLAIN SYNTAX
SELECT 
    number,
    if((number % 3) = 1, 'Clickhouse', if((number % 3) = 2, 'TiDB', 'Unkown')) AS stmt
FROM numbers(10)
FORMAT TSV
 
SELECT 
    number,
    multiIf((number % 3) = 1, \'Clickhouse\', (number % 3) = 2, \'TiDB\', \'Unkown\') AS stmt
FROM numbers(10)
 
4 rows in set. Elapsed: 0.013 sec. 
 
 
 
Clickhouse> SELECT     number,     multiIf((number % 3) = 1, 'Clickhouse', (number % 3) = 2, 'TiDB', 'Unkown') AS stmt FROM numbers(10);
 
SELECT 
    number,
    multiIf((number % 3) = 1, 'Clickhouse', (number % 3) = 2, 'TiDB', 'Unkown') AS stmt
FROM numbers(10)
 
┌─number─┬─stmt───────┐
│      0 │ Unkown     │
│      1 │ Clickhouse │
│      2 │ TiDB       │
│      3 │ Unkown     │
│      4 │ Clickhouse │
│      5 │ TiDB       │
│      6 │ Unkown     │
│      7 │ Clickhouse │
│      8 │ TiDB       │
│      9 │ Unkown     │
└────────┴────────────┘
 
10 rows in set. Elapsed: 0.003 sec. 
 
           

可以看到优化后的语法查询速度更加快了

等效的语句:

Clickhouse> SELECT     number,     case number%3 when 1 then 'clickhouse' when 2 then 'TiDB' else 'Unkown' end  AS stmt FROM numbers(10);
 
SELECT 
    number,
    caseWithExpression(number % 3, 1, 'clickhouse', 2, 'TiDB', 'Unkown') AS stmt
FROM numbers(10)
 
优化为:
Clickhouse> explain SYNTAX SELECT     number,     case number%3 when 1 then 'clickhouse' when 2 then 'TiDB' else 'Unkown' end  AS stmt FROM numbers(10);
 
EXPLAIN SYNTAX
SELECT 
    number,
    caseWithExpression(number % 3, 1, 'clickhouse', 2, 'TiDB', 'Unkown') AS stmt
FROM numbers(10)
FORMAT TSV
 
SELECT 
    number,
    caseWithExpression(number % 3, 1, \'clickhouse\', 2, \'TiDB\', \'Unkown\') AS stmt
FROM numbers(10)
 
 4.pipeline:
Clickhouse> EXPLAIN PIPELINE SELECT sum(number) FROM numbers_mt(10000) GROUP BY number%5;
 
EXPLAIN PIPELINE
SELECT sum(number)
FROM numbers_mt(10000)
GROUP BY number % 5
FORMAT TSV
 
(Union)
Converting
  (Expression)
  ExpressionTransform
    (Expression)
    ExpressionTransform
      (Aggregating)
      AggregatingTransform
        (Expression)
        ExpressionTransform
          (ReadFromStorage)
          Limit
            Numbers 0 → 1
 
13 rows in set. Elapsed: 0.004 sec. 
 
           

带参数的:需要结合graphviz 图形工具查看。

Clickhouse> EXPLAIN PIPELINE header=1,graph=1 SELECT sum(number) FROM numbers_mt(10000) GROUP BY number%5;
 
EXPLAIN PIPELINE header = 1, graph = 1
SELECT sum(number)
FROM numbers_mt(10000)
GROUP BY number % 5
FORMAT TSV
 
digraph
{
  rankdir="LR";
  { node [shape = box]
    subgraph cluster_0 {
      label ="Expression";
      style=filled;
      color=lightgrey;
      node [style=filled,color=white];
      { rank = same;
        n5 [label="ExpressionTransform"];
      }
    }
    subgraph cluster_1 {
      label ="ReadFromStorage";
      style=filled;
      color=lightgrey;
      node [style=filled,color=white];
      { rank = same;
        n2 [label="Limit"];
        n1 [label="Numbers"];
      }
    }
    subgraph cluster_2 {
      label ="Aggregating";
      style=filled;
      color=lightgrey;
      node [style=filled,color=white];
      { rank = same;
        n4 [label="AggregatingTransform"];
      }
    }
    subgraph cluster_3 {
      label ="Expression";
      style=filled;
      color=lightgrey;
      node [style=filled,color=white];
      { rank = same;
        n3 [label="ExpressionTransform"];
      }
    }
    subgraph cluster_4 {
      label ="Expression";
      style=filled;
      color=lightgrey;
      node [style=filled,color=white];
      { rank = same;
        n6 [label="ExpressionTransform"];
      }
    }
    subgraph cluster_5 {
      label ="Union";
      style=filled;
      color=lightgrey;
      node [style=filled,color=white];
      { rank = same;
        n7 [label="Converting"];
      }
    }
  }
  n5 -> n6 [label="
modulo(number, 5) UInt8 UInt8(size = 0)
sum(number) UInt64 UInt64(size = 0)"];
  n2 -> n3 [label="
number UInt64 UInt64(size = 0)"];
  n1 -> n2 [label="
number UInt64 UInt64(size = 0)"];
  n4 -> n5 [label="
modulo(number, 5) UInt8 UInt8(size = 0)
sum(number) UInt64 UInt64(size = 0)"];
  n3 -> n4 [label="
number UInt64 UInt64(size = 0)
modulo(number, 5) UInt8 UInt8(size = 0)"];
  n6 -> n7 [label="
sum(number) UInt64 UInt64(size = 0)"];
}
 
76 rows in set. Elapsed: 0.004 sec. 
           

参考:

https://github.com/ClickHouse/ClickHouse/pull/11873

继续阅读