天天看點

MySQL視窗函數

一、Mysql8.0 視窗函數

1、視窗函數與聚合函數的差別

MySQL從8.0開始支援視窗函數,視窗函數也叫分析函數。視窗函數更多的是業務中需要對資料做排序/分組排序,環比計算,百分比計算等需求。與聚合函數明顯的不同是,聚合函數通過會将所有記錄進行分類聚合;而視窗函數是對所有資料記錄按照指定視窗進行計算,不會進行聚合。

在MySQL8.0之前我們其實也可以通過一定的辦法實作視窗函數的功能,具體如何實作請見下文。

2、視窗函數基本文法

視窗函數主要由視窗函數、over子句、window子句、partition子句、order子句、frame子句。

示例:

SELECT time,subject,val,
    FIRST_VALUE(val) OVER w AS 'first',         //視窗函數以及over子句
    LAST_VALUE(val) OVER w AS 'last', 
    NTH_VALUE(val, 2) OVER w AS 'second', 
    NTH_VALUE(val, 4) OVER w AS 'fourth'
FROM observations 
WINDOW w AS (                                   //window子句  
PARTITION BY subject                            //partition子句
ORDER BY time                                   //order子句
ROWS UNBOUNDED PRECEDING                        //frame子句
);
+----------+---------+------+-------+------+--------+--------+
| time     | subject | val  | first | last | second | fourth |
+----------+---------+------+-------+------+--------+--------+
| 07:00:00 | st113   |   10 |    10 |   10 |   NULL |   NULL |
| 07:15:00 | st113   |    9 |    10 |    9 |      9 |   NULL |
| 07:30:00 | st113   |   25 |    10 |   25 |      9 |   NULL |
| 07:45:00 | st113   |   20 |    10 |   20 |      9 |     20 |
| 07:00:00 | xh458   |    0 |     0 |    0 |   NULL |   NULL |
| 07:15:00 | xh458   |   10 |     0 |   10 |     10 |   NULL |
| 07:30:00 | xh458   |    5 |     0 |    5 |     10 |   NULL |
| 07:45:00 | xh458   |   30 |     0 |   30 |     10 |     30 |
| 08:00:00 | xh458   |   25 |     0 |   25 |     10 |     30 |
+----------+---------+------+-------+------+--------+--------+           
  • 視窗函數

按照具體業務需求指定具體的視窗函數,常見視窗函數如下:

函數 函數意義
row_number() 排序函數,并列記錄也按照+1進行正常排序
rank() row 2 排序函數,并列記錄同序号,并列記錄後的記錄按照總記錄+1進行正常排序
dense_rank() 排序函數,并列記錄同序号,并列記錄後的記錄按照并列記錄+1進行正常排序
percent_rank() 分布函數,計算目前記錄排序的(rank() - 1) 除以 (分組總記錄數 - 1)的百分比
cume_dist() 分布函數,計算目前記錄排序rank()的總記錄數除以分組總記錄數的百分比
lag(col,N) 按照分組排序,顯示該記錄的前N個的col值
lead(col,N) 按照分組排序,顯示該記錄的後N個的col值
first_value(col) 按照分組排序,顯示排序第一的col值
last_value(col) 按照分組排序,顯示截止到該記錄排序最後的col值
nth_value(col,N) 按照分組排序下,截止到目前記錄第N排序的col值
nfile(N) 按照分組排序,将所有記錄分為N份
  • over子句

關鍵字,over() w as col_name中,w為别名,指定了具體的分組排序規則,确定視窗函數的計算範圍,也可以将partion子句、order子句寫在over()中直接使用。

over(partition by aa order by bb) w as col_name
<=>
over() w as col_name
window w as (partition by aa order by bb)           
  • window子句

指定了具體的分組排序規則,确定視窗函數的計算範圍,若SQL中需要多個視窗時可使用不同的别名來區分

  • partition子句
window w as (partition by aa)     //按照aa進行分組,視窗函數每個分組單獨計算           

指定分組規則,若不需要可不寫

  • order子句

    指定排序規則,可以單獨使用,也可以配合partition使用

window w as (partition by aa order by bb) //按照aa進行分組,每組記錄按照bb進行順序排序
window w as (order by bb desc)            //符合where條件的全部記錄按照bb倒叙排序           
  • frame子句

指定視窗函數的計算範圍

文法:ROWS + 邊界關鍵字

CURRENT ROW 以目前記錄為左/右側邊界
UNBOUNDED PRECEDING 以分組中的第一條記錄為左側邊界
UNBOUNDED FOLLOWING 以分組中的最後一條記錄為右側邊界
N PRECEDING  以目前記錄的前N條記錄為左側邊界
N FOLLOWING  以目前記錄的後N條記錄為右側邊界           
window w as (rows between N preceding and M following)    // 視窗範圍是目前記錄的前N條記錄到後M條記錄
window w as (rows unbounded preceding)      //視窗範圍是目前記錄到分組中的最後一條記錄。
window w as (rows between unbounded preceding and unbounded following) //預設該模式,視窗範圍是目前分組中所有記錄範圍進行計算           

二、MySQL8.0之前實作視窗函數的辦法

在MySQL8.0之前時沒有視窗函數的,MySQL也可以實作部分視窗函數的功能,以下是部分示例:

1、 row_number()實作

MySQL8.0

無分組排序:
root@mysql 22:27:  [test]> select row_number() over w as row_num,class_num,user_id,score from score where subject_name='Math' window  w as ( order by score desc) ;
+---------+-----------+---------+--------+
| row_num | class_num | user_id | score  |
+---------+-----------+---------+--------+
|       1 | 001       |       1 | 100.00 |
|       2 | 002       |       5 |  98.00 |
|       3 | 002       |       8 |  98.00 |
|       4 | 001       |       3 |  92.00 |
|       5 | 001       |       2 |  89.00 |
|       6 | 002       |       9 |  89.00 |
|       7 | 001       |      10 |  89.00 |
|       8 | 002       |       4 |  87.00 |
|       9 | 002       |       6 |  77.00 |
|      10 | 001       |       7 |  73.00 |
+---------+-----------+---------+--------+
10 rows in set (0.00 sec)

分組排序:
root@mysql 22:50:  [test]> select row_number() over w as row_num,class_num,user_id,score from score where subject_name='Math' window  w as ( partition by class_num order by score desc) ;
+---------+-----------+---------+--------+
| row_num | class_num | user_id | score  |
+---------+-----------+---------+--------+
|       1 | 001       |       1 | 100.00 |
|       2 | 001       |       3 |  92.00 |
|       3 | 001       |       2 |  89.00 |
|       4 | 001       |      10 |  89.00 |
|       5 | 001       |       7 |  73.00 |
|       1 | 002       |       5 |  98.00 |
|       2 | 002       |       8 |  98.00 |
|       3 | 002       |       9 |  89.00 |
|       4 | 002       |       4 |  87.00 |
|       5 | 002       |       6 |  77.00 |
+---------+-----------+---------+--------+
10 rows in set (0.00 sec)           

MySQL之前

無分組排序:
root@mysql 22:27:  [test]> select @r:=@r+1 as `row_number`,class_num,user_id,score from score,(select @r:=0) temp order by score desc;
+------------+-----------+---------+--------+
| row_number | class_num | user_id | score  |
+------------+-----------+---------+--------+
|          1 | 001       |       1 | 100.00 |
|          2 | 002       |       5 |  98.00 |
|          3 | 002       |       8 |  98.00 |
|          4 | 001       |       3 |  92.00 |
|          5 | 001       |       2 |  89.00 |
|          6 | 002       |       9 |  89.00 |
|          7 | 001       |      10 |  89.00 |
|          8 | 002       |       4 |  87.00 |
|          9 | 002       |       6 |  77.00 |
|         10 | 001       |       7 |  73.00 |
+------------+-----------+---------+--------+
10 rows in set, 2 warnings (0.00 sec)

分組排序:
root@mysql 22:48:  [test]> select `row_number`,class_num,user_id,score from (select if(@class=class_num,@r:=@r+1,@r:=1) as `row_number`,@class:=class_num,class_num,user_id,score from score,(select @r:=0,@class=NULL) temp order by class_num,score desc) tmp2;
+------------+-----------+---------+--------+
| row_number | class_num | user_id | score  |
+------------+-----------+---------+--------+
|          1 | 001       |       1 | 100.00 |
|          2 | 001       |       3 |  92.00 |
|          3 | 001       |       2 |  89.00 |
|          4 | 001       |      10 |  89.00 |
|          5 | 001       |       7 |  73.00 |
|          1 | 002       |       5 |  98.00 |
|          2 | 002       |       8 |  98.00 |
|          3 | 002       |       9 |  89.00 |
|          4 | 002       |       4 |  87.00 |
|          5 | 002       |       6 |  77.00 |
+------------+-----------+---------+--------+
10 rows in set, 4 warnings (0.00 sec)           

2、dense_rank()實作

無分組排序:
root@mysql 22:27:  [test]> select dense_rank() over w as `dense_rank`,class_num,user_id,score from score where subject_name='Math' window  w as ( order by score desc) ;
+------------+-----------+---------+--------+
| dense_rank | class_num | user_id | score  |
+------------+-----------+---------+--------+
|          1 | 001       |       1 | 100.00 |
|          2 | 002       |       5 |  98.00 |
|          2 | 002       |       8 |  98.00 |
|          3 | 001       |       3 |  92.00 |
|          4 | 001       |       2 |  89.00 |
|          4 | 002       |       9 |  89.00 |
|          4 | 001       |      10 |  89.00 |
|          5 | 002       |       4 |  87.00 |
|          6 | 002       |       6 |  77.00 |
|          7 | 001       |       7 |  73.00 |
+------------+-----------+---------+--------+
10 rows in set (0.00 sec)

分組排序:
root@mysql 23:11:  [test]> select dense_rank() over w as `dense_rank`,class_num,user_id,score from score where subject_name='Math' window  w as ( partition by class_num order by score desc) ;
+------------+-----------+---------+--------+
| dense_rank | class_num | user_id | score  |
+------------+-----------+---------+--------+
|          1 | 001       |       1 | 100.00 |
|          2 | 001       |       3 |  92.00 |
|          3 | 001       |       2 |  89.00 |
|          3 | 001       |      10 |  89.00 |
|          4 | 001       |       7 |  73.00 |
|          1 | 002       |       5 |  98.00 |
|          1 | 002       |       8 |  98.00 |
|          2 | 002       |       9 |  89.00 |
|          3 | 002       |       4 |  87.00 |
|          4 | 002       |       6 |  77.00 |
+------------+-----------+---------+--------+
10 rows in set (0.00 sec)           

Mysql8.0之前

無分組排序:
root@mysql 23:09:  [test]> select case when  @s = score then @r WHEN @s := score then @r := @r + 1 end as `dense_rank`,class_num,user_id,score from score,(select @r := 0, @s = NULL) temp where subject_name = 'Math' order by score desc ;
+------------+-----------+---------+--------+
| dense_rank | class_num | user_id | score  |
+------------+-----------+---------+--------+
|          1 | 001       |       1 | 100.00 |
|          2 | 002       |       5 |  98.00 |
|          2 | 002       |       8 |  98.00 |
|          3 | 001       |       3 |  92.00 |
|          4 | 001       |       2 |  89.00 |
|          4 | 002       |       9 |  89.00 |
|          4 | 001       |      10 |  89.00 |
|          5 | 002       |       4 |  87.00 |
|          6 | 002       |       6 |  77.00 |
|          7 | 001       |       7 |  73.00 |
+------------+-----------+---------+--------+
10 rows in set, 3 warnings (0.00 sec)

分組排序:
root@mysql 23:10:  [test]> select `dense_rank`,class_num,user_id,score from (select if(@class=class_num,case when  @s = score then @r when @s := score then @r := @r + 1 end,@r:=1) as `dense_rank`,@class:=class_num,@s:=score,class_num,user_id,score FROM score, ( SELECT @r := 0, @s = NULL,@class=NULL ) temp where subject_name = 'Math' order by  class_num,score desc) temp2;
+------------+-----------+---------+--------+
| dense_rank | class_num | user_id | score  |
+------------+-----------+---------+--------+
|          1 | 001       |       1 | 100.00 |
|          2 | 001       |       3 |  92.00 |
|          3 | 001       |       2 |  89.00 |
|          3 | 001       |      10 |  89.00 |
|          4 | 001       |       7 |  73.00 |
|          1 | 002       |       5 |  98.00 |
|          1 | 002       |       8 |  98.00 |
|          2 | 002       |       9 |  89.00 |
|          3 | 002       |       4 |  87.00 |
|          4 | 002       |       6 |  77.00 |
+------------+-----------+---------+--------+
10 rows in set, 6 warnings (0.00 sec)           

3、rank()實作

root@mysql 22:33:  [test]> select rank() over w as `rank`,class_num,user_id,score from score where subject_name='Math' window  w as ( order by score desc) ;
+------+-----------+---------+--------+
| rank | class_num | user_id | score  |
+------+-----------+---------+--------+
|    1 | 001       |       1 | 100.00 |
|    2 | 002       |       5 |  98.00 |
|    2 | 002       |       8 |  98.00 |
|    4 | 001       |       3 |  92.00 |
|    5 | 001       |       2 |  89.00 |
|    5 | 002       |       9 |  89.00 |
|    5 | 001       |      10 |  89.00 |
|    8 | 002       |       4 |  87.00 |
|    9 | 002       |       6 |  77.00 |
|   10 | 001       |       7 |  73.00 |
+------+-----------+---------+--------+
10 rows in set (0.00 sec)           
root@mysql 23:12:  [test]> select `rank`,class_num,user_id,score from (SELECT  @r:=if(@s = score,@r,@c)  AS `rank`,@s:=score,@c:=@c+1,class_num,user_id,score from score, ( select @r := 0, @s = NULL,@c:=1 ) r where subject_name = 'Math' order by score desc) temp;
+------+-----------+---------+--------+
| rank | class_num | user_id | score  |
+------+-----------+---------+--------+
|    1 | 001       |       1 | 100.00 |
|    2 | 002       |       5 |  98.00 |
|    2 | 002       |       8 |  98.00 |
|    4 | 001       |       3 |  92.00 |
|    5 | 001       |       2 |  89.00 |
|    5 | 002       |       9 |  89.00 |
|    5 | 001       |      10 |  89.00 |
|    8 | 002       |       4 |  87.00 |
|    9 | 002       |       6 |  77.00 |
|   10 | 001       |       7 |  73.00 |
+------+-----------+---------+--------+
10 rows in set, 5 warnings (0.00 sec)