天天看點

MaxCompute SQL使用小技巧之多列轉多行

我們知道SQL中有個explode函數,經常使用的也是該函數,但還有個函數posexplode用法和其類似,使用好這個函數,可以簡化不少工作。話不多說,直接上示例。

案例:有一張表,其資料如下

seat_no bd_time
A,B,C,D 8:00,8:10,8:20,8:30
H,I,J,K 9:10,9:20,9:30,9:40
...

展示如下:

sn bt
A 8:00
B 8:10
C 8:20
D 8:30

分析:看到字段資料,可以想到使用lateral view explode展開,我們先使用explode展開看下

with tb1 as (
    select
        seat_no,
        bd_time
    from values('A,B,C,D','8:00,8:10,8:20,8:30'),
               ('H,I,J,K','9:10,9:20,9:30,9:40') 
               t(seat_no,bd_time)
)
select 
    seat_no,
    bd_time,
    exp_sn
from tb1 
    lateral view explode(split(seat_no,',')) t as exp_sn;
--隻展開seat_no 結果如下
seat_no bd_time                         exp_sn
A,B,C,D 8:00,8:10,8:20,8:30 A
A,B,C,D 8:00,8:10,8:20,8:30 B
A,B,C,D 8:00,8:10,8:20,8:30 C
A,B,C,D 8:00,8:10,8:20,8:30 D
H,I,J,K 9:10,9:20,9:30,9:40 H
H,I,J,K 9:10,9:20,9:30,9:40 I
H,I,J,K 9:10,9:20,9:30,9:40 J
H,I,J,K 9:10,9:20,9:30,9:40 K

select 
    seat_no,
    bd_time,
    exp_sn,
    exp_bt
from tb1 
    lateral view explode(split(seat_no,',')) t as exp_sn
    lateral view explode(split(bd_time,',')) t as exp_bt;
--兩個字段均展開
seat_no bd_time                         exp_sn  exp_bt
A,B,C,D 8:00,8:10,8:20,8:30 A               8:00
A,B,C,D 8:00,8:10,8:20,8:30 A               8:10
A,B,C,D 8:00,8:10,8:20,8:30 A               8:20
A,B,C,D 8:00,8:10,8:20,8:30 A               8:30
A,B,C,D 8:00,8:10,8:20,8:30 B               8:00
A,B,C,D 8:00,8:10,8:20,8:30 B               8:10
A,B,C,D 8:00,8:10,8:20,8:30 B               8:20
A,B,C,D 8:00,8:10,8:20,8:30 B               8:30
A,B,C,D 8:00,8:10,8:20,8:30 C               8:00
A,B,C,D 8:00,8:10,8:20,8:30 C               8:10
A,B,C,D 8:00,8:10,8:20,8:30 C               8:20
A,B,C,D 8:00,8:10,8:20,8:30 C               8:30
A,B,C,D 8:00,8:10,8:20,8:30 D               8:00
A,B,C,D 8:00,8:10,8:20,8:30 D               8:10
A,B,C,D 8:00,8:10,8:20,8:30 D               8:20
A,B,C,D 8:00,8:10,8:20,8:30 D               8:30
H,I,J,K 9:10,9:20,9:30,9:40 H               9:10
H,I,J,K 9:10,9:20,9:30,9:40 H               9:20
H,I,J,K 9:10,9:20,9:30,9:40 H               9:30
H,I,J,K 9:10,9:20,9:30,9:40 H               9:40
H,I,J,K 9:10,9:20,9:30,9:40 I               9:10
H,I,J,K 9:10,9:20,9:30,9:40 I               9:20
H,I,J,K 9:10,9:20,9:30,9:40 I               9:30
H,I,J,K 9:10,9:20,9:30,9:40 I               9:40
H,I,J,K 9:10,9:20,9:30,9:40 J               9:10
H,I,J,K 9:10,9:20,9:30,9:40 J               9:20
H,I,J,K 9:10,9:20,9:30,9:40 J               9:30
H,I,J,K 9:10,9:20,9:30,9:40 J               9:40
H,I,J,K 9:10,9:20,9:30,9:40 K               9:10
H,I,J,K 9:10,9:20,9:30,9:40 K               9:20
H,I,J,K 9:10,9:20,9:30,9:40 K               9:30
H,I,J,K 9:10,9:20,9:30,9:40 K               9:40      

從上述結果可以看到,按兩個字段展開相當于直接笛卡爾積了,在之前的文章中我們也分析過生成索引的posexplode函數,接下來我們加上索引試試看

select 
    seat_no,
    bd_time,
    inx_sn,
    exp_sn
from tb1 
    lateral view posexplode(split(seat_no,',')) t as inx_sn,exp_sn;
--隻展開seat_no 結果如下
seat_no bd_time inx_sn  exp_sn
A,B,C,D 8:00,8:10,8:20,8:30 0   A
A,B,C,D 8:00,8:10,8:20,8:30 1   B
A,B,C,D 8:00,8:10,8:20,8:30 2   C
A,B,C,D 8:00,8:10,8:20,8:30 3   D
H,I,J,K 9:10,9:20,9:30,9:40 0   H
H,I,J,K 9:10,9:20,9:30,9:40 1   I
H,I,J,K 9:10,9:20,9:30,9:40 2   J
H,I,J,K 9:10,9:20,9:30,9:40 3   K

select 
    seat_no,
    bd_time,
    inx_sn,
    exp_sn,
    inx_bt,
    exp_bt
from tb1 
    lateral view posexplode(split(seat_no,',')) t as inx_sn,exp_sn
    lateral view posexplode(split(bd_time,',')) t as inx_bt,exp_bt;
--兩個字段均展開
seat_no bd_time inx_sn  exp_sn  inx_bt  exp_bt
A,B,C,D 8:00,8:10,8:20,8:30 0   A   0   8:00
A,B,C,D 8:00,8:10,8:20,8:30 0   A   1   8:10
A,B,C,D 8:00,8:10,8:20,8:30 0   A   2   8:20
A,B,C,D 8:00,8:10,8:20,8:30 0   A   3   8:30
A,B,C,D 8:00,8:10,8:20,8:30 1   B   0   8:00
A,B,C,D 8:00,8:10,8:20,8:30 1   B   1   8:10
A,B,C,D 8:00,8:10,8:20,8:30 1   B   2   8:20
A,B,C,D 8:00,8:10,8:20,8:30 1   B   3   8:30
A,B,C,D 8:00,8:10,8:20,8:30 2   C   0   8:00
A,B,C,D 8:00,8:10,8:20,8:30 2   C   1   8:10
A,B,C,D 8:00,8:10,8:20,8:30 2   C   2   8:20
A,B,C,D 8:00,8:10,8:20,8:30 2   C   3   8:30
A,B,C,D 8:00,8:10,8:20,8:30 3   D   0   8:00
A,B,C,D 8:00,8:10,8:20,8:30 3   D   1   8:10
A,B,C,D 8:00,8:10,8:20,8:30 3   D   2   8:20
A,B,C,D 8:00,8:10,8:20,8:30 3   D   3   8:30
H,I,J,K 9:10,9:20,9:30,9:40 0   H   0   9:10
H,I,J,K 9:10,9:20,9:30,9:40 0   H   1   9:20
H,I,J,K 9:10,9:20,9:30,9:40 0   H   2   9:30
H,I,J,K 9:10,9:20,9:30,9:40 0   H   3   9:40
H,I,J,K 9:10,9:20,9:30,9:40 1   I   0   9:10
H,I,J,K 9:10,9:20,9:30,9:40 1   I   1   9:20
H,I,J,K 9:10,9:20,9:30,9:40 1   I   2   9:30
H,I,J,K 9:10,9:20,9:30,9:40 1   I   3   9:40
H,I,J,K 9:10,9:20,9:30,9:40 2   J   0   9:10
H,I,J,K 9:10,9:20,9:30,9:40 2   J   1   9:20
H,I,J,K 9:10,9:20,9:30,9:40 2   J   2   9:30
H,I,J,K 9:10,9:20,9:30,9:40 2   J   3   9:40
H,I,J,K 9:10,9:20,9:30,9:40 3   K   0   9:10
H,I,J,K 9:10,9:20,9:30,9:40 3   K   1   9:20
H,I,J,K 9:10,9:20,9:30,9:40 3   K   2   9:30
H,I,J,K 9:10,9:20,9:30,9:40 3   K   3   9:40      

從上述結果看,posexplode的索引的确省下了不少事,可以利用兩個索引值相同進行過濾,可得到想要的結果

select 
    seat_no,
    bd_time,
    exp_sn,
    exp_bt
from tb1 
    lateral view posexplode(split(seat_no,',')) t as inx_sn,exp_sn
    lateral view posexplode(split(bd_time,',')) t as inx_bt,exp_bt
where inx_sn = inx_bt;
--結果如下
seat_no bd_time exp_sn  exp_bt
A,B,C,D 8:00,8:10,8:20,8:30 A   8:00
A,B,C,D 8:00,8:10,8:20,8:30 B   8:10
A,B,C,D 8:00,8:10,8:20,8:30 C   8:20
A,B,C,D 8:00,8:10,8:20,8:30 D   8:30
H,I,J,K 9:10,9:20,9:30,9:40 H   9:10
H,I,J,K 9:10,9:20,9:30,9:40 I   9:20
H,I,J,K 9:10,9:20,9:30,9:40 J   9:30
H,I,J,K 9:10,9:20,9:30,9:40 K   9:40      

關于posexplode具體細節用法,可參考阿裡雲文檔

https://help.aliyun.com/document_detail/293597.html#section-2yc-ymd-p11

拜了個拜