天天看点

Clickhouse Explain

目录

​一、Explain​

​1. 功能版本及语法​

​​1.1 版本​​

​​1.2 语法​​

​2.Explain Types​

​​2.1 Types​​

​2.2.AST​

​​simple query​​

​​complex query​​

​2.3.SYNTAX​

​​多表联查会优化成 CROSS JOIN​​

​​三元运算符优化​​

​2.4.PLAN​

​​open all param​​

​​2.5.PIPLINE​​

​​二、Explain Estimate​​

​三、老版本查看执行计划​

​​执行计划​​

AST — Abstract syntax tree (抽象语法树)

SYNTAX — Query text after AST-level optimizations(在 AST 级优化后查询文本。)

PLAN — Query execution plan(查询执行计划)

PIPLINE — Query execution pipeline(查询执行管道)

simple query

<code>EXPLAIN AST SELECT 1; ┌─explain───────────────────────────┐ │ SelectWithUnionQuery (children 1) │ │ ExpressionList (children 1) │ │ SelectQuery (children 1) │ │ ExpressionList (children 1) │ │ Literal UInt64_1 │ └───────────────────────────────────┘</code>

complex query

<code>EXPLAIN AST ALTER TABLE t1 DELETE WHERE date = today(); ┌─explain──────────────────────────┐ │ AlterQuery t1 (children 1) │ │ ExpressionList (children 1) │ │ AlterCommand 32 (children 1) │ │ Function equals (children 1) │ │ ExpressionList (children 2) │ │ Identifier date │ │ Function today (children 1) │ │ ExpressionList │ └──────────────────────────────────┘</code>

返回语法优化后的查询,这个是我们开发过程中经常用到的,可以看下面两个例子

多表联查会优化成 CROSS JOIN

<code>EXPLAIN SYNTAX SELECT * FROM system.numbers AS a, system.numbers AS b, system.numbers AS c; ┌─explain────────────────────────────┐ │ SELECT │ │ `--a.number` AS `a.number`, │ │ `--b.number` AS `b.number`, │ │ number AS `c.number` │ │ FROM │ │ ( │ │ SELECT │ │ number AS `--a.number`, │ │ b.number AS `--b.number` │ │ FROM system.numbers AS a │ │ CROSS JOIN system.numbers AS b │ │ ) AS `--.s` │ │ CROSS JOIN system.numbers AS c │ └────────────────────────────────────┘</code>

三元运算符优化

<code>#查看优化 EXPLAIN SYNTAX SELECT number = 1 ? 'ck' : (number = 2 ? 'hbase' : 'hive') FROM numbers(10); ┌─explain──────────────────────────────────────────────────────┐ │ SELECT if(number = 1, 'ck', if(number = 2, 'hbase', 'hive')) │ │ FROM numbers(10) │ └──────────────────────────────────────────────────────────────┘ #开启三元运算符优化 SET optimize_if_chain_to_multiif = 1; #再次查看优化后效果 EXPLAIN SYNTAX SELECT number = 1 ? 'ck' : (number = 2 ? 'hbase' : 'hive') FROM numbers(10); ┌─explain───────────────────────────────────────────────────────┐ │ SELECT multiIf(number = 1, 'ck', number = 2, 'hbase', 'hive') │ │ FROM numbers(10) │ └───────────────────────────────────────────────────────────────┘ 可以看到优化后的已经将 多个 `if()`函数替换为 `multiIf()` 函数</code>

查看执行计划,相对hive比较简洁

<code>explain plan select arrayJoin([1,2,3,null,null]); ┌─explain───────────────────────────────────────────────────────────────────┐ │ Expression ((Projection + Before ORDER BY)) │ │ SettingQuotaAndLimits (Set limits and quota after reading from storage) │ │ ReadFromStorage (SystemOne) │ └───────────────────────────────────────────────────────────────────────────┘</code>

<code>explain select database,table,count(1) cnt from system.parts where database in ('datasets','system') group by database,table order by database,cnt desc limit 2 by database; ┌─explain─────────────────────────────────────────────────────────────────────────────────────┐ │ Expression (Projection) │ │ LimitBy │ │ Expression (Before LIMIT BY) │ │ MergingSorted (Merge sorted streams for ORDER BY) │ │ MergeSorting (Merge sorted blocks for ORDER BY) │ │ PartialSorting (Sort each block for ORDER BY) │ │ Expression (Before ORDER BY) │ │ Aggregating │ │ Expression (Before GROUP BY) │ │ Filter (WHERE) │ │ SettingQuotaAndLimits (Set limits and quota after reading from storage) │ │ ReadFromStorage (SystemParts) │ └─────────────────────────────────────────────────────────────────────────────────────────────┘</code>

大体的一个执行路线

limit -&gt; merge sort (order by 归并排序) -&gt; agg(count() 聚合) -&gt; filter(where 过滤 )

​ 另外, explain 默认就是explain plain

open all param

打开其他参数

<code>EXPLAIN header=1, actions=1,description=1 SELECT number from system.numbers limit 10; ┌─explain───────────────────────────────────────────────────────────────────┐ │ Expression ((Projection + Before ORDER BY)) │ │ Header: number UInt64 │ │ Actions: INPUT :: 0 -&gt; number UInt64 : 0 │ │ Positions: 0 │ │ SettingQuotaAndLimits (Set limits and quota after reading from storage) │ │ Header: number UInt64 │ │ Limit (preliminary LIMIT (without OFFSET)) │ │ Header: number UInt64 │ │ Limit 10 │ │ Offset 0 │ │ ReadFromStorage (SystemNumbers) │ │ Header: number UInt64 │ └───────────────────────────────────────────────────────────────────────────┘</code>

查看执行管道

不带参数

<code>EXPLAIN PIPELINE SELECT sum(number) FROM numbers_mt(10000) GROUP BY number % 20; ┌─explain───────────────────────┐ │ (Expression) │ │ ExpressionTransform │ │ (Aggregating) │ │ AggregatingTransform │ │ (Expression) │ │ ExpressionTransform │ │ (SettingQuotaAndLimits) │ │ (ReadFromStorage) │ │ Limit │ │ Numbers 0 → 1 │ └───────────────────────────────┘</code>

其他参数

<code>EXPLAIN PIPELINE header=1,graph=1 SELECT sum(number) FROM numbers_mt(10000) GROUP BY number%20; ┌─explain─────────────────────────────────────┐ │ digraph │ │ { │ │ rankdir="LR"; │ │ { node [shape = rect] │ │ n2 [label="Limit"]; │ │ n1 [label="Numbers"]; │ │ subgraph cluster_0 { │ │ label ="Aggregating"; │ │ style=filled; │ │ color=lightgrey; │ │ node [style=filled,color=white]; │ │ { rank = same; │ │ n4 [label="AggregatingTransform"]; │ │ } │ │ } │ │ subgraph cluster_1 { │ │ label ="Expression"; │ │ style=filled; │ │ color=lightgrey; │ │ node [style=filled,color=white]; │ │ { rank = same; │ │ n3 [label="ExpressionTransform"]; │ │ } │ │ } │ │ subgraph cluster_2 { │ │ label ="Expression"; │ │ style=filled; │ │ color=lightgrey; │ │ node [style=filled,color=white]; │ │ { rank = same; │ │ n5 [label="ExpressionTransform"]; │ │ } │ │ } │ │ } │ │ n2 -&gt; n3 [label=" │ │ number UInt64 UInt64(size = 0)"]; │ │ n1 -&gt; n2 [label=" │ │ number UInt64 UInt64(size = 0)"]; │ │ n4 -&gt; n5 [label=" │ │ modulo(number, 20) UInt8 UInt8(size = 0) │ │ sum(number) UInt64 UInt64(size = 0)"]; │ │ n3 -&gt; n4 [label=" │ │ number UInt64 UInt64(size = 0) │ │ modulo(number, 20) UInt8 UInt8(size = 0)"]; │ │ } │ └─────────────────────────────────────────────┘</code>

​ 显示在处理查询时要从表中读取的估计行数、标记数和部分数,适用于​​MergeTree​​ family.

创建表

<code>CREATE TABLE ttt (i Int64) ENGINE = MergeTree() ORDER BY i SETTINGS index_granularity = 16, write_final_mark = 0; INSERT INTO ttt SELECT number FROM numbers(128); OPTIMIZE TABLE ttt;</code>

查询

<code>EXPLAIN ESTIMATE SELECT * FROM ttt;</code>

结果

<code>┌─database─┬─table─┬─parts─┬─rows─┬─marks─┐ │ default │ ttt │ 1 │ 128 │ 8 │ └──────────┴───────┴───────┴──────┴───────┘</code>

执行计划

<code>clickhouse-client -h 主机名 --send_logs_level=trace &lt;&lt;&lt; "sql" &gt; /dev/null</code>

send_logs_level 设置日志等级,&lt;&lt;&lt;将SQL语句重定向至clickhouse-client进行查询,&gt; /dev/null将查询结果重定向到空设备吞掉,以便观察日志

另外,日志级别除了trace可以,也可以为debug;此外,这种方式只能是在日志中查看,并且只能在SQL语句真正执行的时候,如果线上数据量比较大,建议添加 limit

继续阅读