我們知道SQL中有個explode函數,經常使用的也是該函數,但還有個函數posexplode用法和其類似,使用好這個函數,可以簡化不少工作。話不多說,直接上示例。
案例:有一張表,其資料如下
seat_no | bd_time |
A,B,C,D | 8:00,8:10,8:20,8:30 |
H,I,J,K | 9:10,9:20,9:30,9:40 |
... |
展示如下:
sn | bt |
A | 8:00 |
B | 8:10 |
C | 8:20 |
D | 8:30 |
分析:看到字段資料,可以想到使用lateral view explode展開,我們先使用explode展開看下
with tb1 as (
select
seat_no,
bd_time
from values('A,B,C,D','8:00,8:10,8:20,8:30'),
('H,I,J,K','9:10,9:20,9:30,9:40')
t(seat_no,bd_time)
)
select
seat_no,
bd_time,
exp_sn
from tb1
lateral view explode(split(seat_no,',')) t as exp_sn;
--隻展開seat_no 結果如下
seat_no bd_time exp_sn
A,B,C,D 8:00,8:10,8:20,8:30 A
A,B,C,D 8:00,8:10,8:20,8:30 B
A,B,C,D 8:00,8:10,8:20,8:30 C
A,B,C,D 8:00,8:10,8:20,8:30 D
H,I,J,K 9:10,9:20,9:30,9:40 H
H,I,J,K 9:10,9:20,9:30,9:40 I
H,I,J,K 9:10,9:20,9:30,9:40 J
H,I,J,K 9:10,9:20,9:30,9:40 K
select
seat_no,
bd_time,
exp_sn,
exp_bt
from tb1
lateral view explode(split(seat_no,',')) t as exp_sn
lateral view explode(split(bd_time,',')) t as exp_bt;
--兩個字段均展開
seat_no bd_time exp_sn exp_bt
A,B,C,D 8:00,8:10,8:20,8:30 A 8:00
A,B,C,D 8:00,8:10,8:20,8:30 A 8:10
A,B,C,D 8:00,8:10,8:20,8:30 A 8:20
A,B,C,D 8:00,8:10,8:20,8:30 A 8:30
A,B,C,D 8:00,8:10,8:20,8:30 B 8:00
A,B,C,D 8:00,8:10,8:20,8:30 B 8:10
A,B,C,D 8:00,8:10,8:20,8:30 B 8:20
A,B,C,D 8:00,8:10,8:20,8:30 B 8:30
A,B,C,D 8:00,8:10,8:20,8:30 C 8:00
A,B,C,D 8:00,8:10,8:20,8:30 C 8:10
A,B,C,D 8:00,8:10,8:20,8:30 C 8:20
A,B,C,D 8:00,8:10,8:20,8:30 C 8:30
A,B,C,D 8:00,8:10,8:20,8:30 D 8:00
A,B,C,D 8:00,8:10,8:20,8:30 D 8:10
A,B,C,D 8:00,8:10,8:20,8:30 D 8:20
A,B,C,D 8:00,8:10,8:20,8:30 D 8:30
H,I,J,K 9:10,9:20,9:30,9:40 H 9:10
H,I,J,K 9:10,9:20,9:30,9:40 H 9:20
H,I,J,K 9:10,9:20,9:30,9:40 H 9:30
H,I,J,K 9:10,9:20,9:30,9:40 H 9:40
H,I,J,K 9:10,9:20,9:30,9:40 I 9:10
H,I,J,K 9:10,9:20,9:30,9:40 I 9:20
H,I,J,K 9:10,9:20,9:30,9:40 I 9:30
H,I,J,K 9:10,9:20,9:30,9:40 I 9:40
H,I,J,K 9:10,9:20,9:30,9:40 J 9:10
H,I,J,K 9:10,9:20,9:30,9:40 J 9:20
H,I,J,K 9:10,9:20,9:30,9:40 J 9:30
H,I,J,K 9:10,9:20,9:30,9:40 J 9:40
H,I,J,K 9:10,9:20,9:30,9:40 K 9:10
H,I,J,K 9:10,9:20,9:30,9:40 K 9:20
H,I,J,K 9:10,9:20,9:30,9:40 K 9:30
H,I,J,K 9:10,9:20,9:30,9:40 K 9:40
從上述結果可以看到,按兩個字段展開相當于直接笛卡爾積了,在之前的文章中我們也分析過生成索引的posexplode函數,接下來我們加上索引試試看
select
seat_no,
bd_time,
inx_sn,
exp_sn
from tb1
lateral view posexplode(split(seat_no,',')) t as inx_sn,exp_sn;
--隻展開seat_no 結果如下
seat_no bd_time inx_sn exp_sn
A,B,C,D 8:00,8:10,8:20,8:30 0 A
A,B,C,D 8:00,8:10,8:20,8:30 1 B
A,B,C,D 8:00,8:10,8:20,8:30 2 C
A,B,C,D 8:00,8:10,8:20,8:30 3 D
H,I,J,K 9:10,9:20,9:30,9:40 0 H
H,I,J,K 9:10,9:20,9:30,9:40 1 I
H,I,J,K 9:10,9:20,9:30,9:40 2 J
H,I,J,K 9:10,9:20,9:30,9:40 3 K
select
seat_no,
bd_time,
inx_sn,
exp_sn,
inx_bt,
exp_bt
from tb1
lateral view posexplode(split(seat_no,',')) t as inx_sn,exp_sn
lateral view posexplode(split(bd_time,',')) t as inx_bt,exp_bt;
--兩個字段均展開
seat_no bd_time inx_sn exp_sn inx_bt exp_bt
A,B,C,D 8:00,8:10,8:20,8:30 0 A 0 8:00
A,B,C,D 8:00,8:10,8:20,8:30 0 A 1 8:10
A,B,C,D 8:00,8:10,8:20,8:30 0 A 2 8:20
A,B,C,D 8:00,8:10,8:20,8:30 0 A 3 8:30
A,B,C,D 8:00,8:10,8:20,8:30 1 B 0 8:00
A,B,C,D 8:00,8:10,8:20,8:30 1 B 1 8:10
A,B,C,D 8:00,8:10,8:20,8:30 1 B 2 8:20
A,B,C,D 8:00,8:10,8:20,8:30 1 B 3 8:30
A,B,C,D 8:00,8:10,8:20,8:30 2 C 0 8:00
A,B,C,D 8:00,8:10,8:20,8:30 2 C 1 8:10
A,B,C,D 8:00,8:10,8:20,8:30 2 C 2 8:20
A,B,C,D 8:00,8:10,8:20,8:30 2 C 3 8:30
A,B,C,D 8:00,8:10,8:20,8:30 3 D 0 8:00
A,B,C,D 8:00,8:10,8:20,8:30 3 D 1 8:10
A,B,C,D 8:00,8:10,8:20,8:30 3 D 2 8:20
A,B,C,D 8:00,8:10,8:20,8:30 3 D 3 8:30
H,I,J,K 9:10,9:20,9:30,9:40 0 H 0 9:10
H,I,J,K 9:10,9:20,9:30,9:40 0 H 1 9:20
H,I,J,K 9:10,9:20,9:30,9:40 0 H 2 9:30
H,I,J,K 9:10,9:20,9:30,9:40 0 H 3 9:40
H,I,J,K 9:10,9:20,9:30,9:40 1 I 0 9:10
H,I,J,K 9:10,9:20,9:30,9:40 1 I 1 9:20
H,I,J,K 9:10,9:20,9:30,9:40 1 I 2 9:30
H,I,J,K 9:10,9:20,9:30,9:40 1 I 3 9:40
H,I,J,K 9:10,9:20,9:30,9:40 2 J 0 9:10
H,I,J,K 9:10,9:20,9:30,9:40 2 J 1 9:20
H,I,J,K 9:10,9:20,9:30,9:40 2 J 2 9:30
H,I,J,K 9:10,9:20,9:30,9:40 2 J 3 9:40
H,I,J,K 9:10,9:20,9:30,9:40 3 K 0 9:10
H,I,J,K 9:10,9:20,9:30,9:40 3 K 1 9:20
H,I,J,K 9:10,9:20,9:30,9:40 3 K 2 9:30
H,I,J,K 9:10,9:20,9:30,9:40 3 K 3 9:40
從上述結果看,posexplode的索引的确省下了不少事,可以利用兩個索引值相同進行過濾,可得到想要的結果
select
seat_no,
bd_time,
exp_sn,
exp_bt
from tb1
lateral view posexplode(split(seat_no,',')) t as inx_sn,exp_sn
lateral view posexplode(split(bd_time,',')) t as inx_bt,exp_bt
where inx_sn = inx_bt;
--結果如下
seat_no bd_time exp_sn exp_bt
A,B,C,D 8:00,8:10,8:20,8:30 A 8:00
A,B,C,D 8:00,8:10,8:20,8:30 B 8:10
A,B,C,D 8:00,8:10,8:20,8:30 C 8:20
A,B,C,D 8:00,8:10,8:20,8:30 D 8:30
H,I,J,K 9:10,9:20,9:30,9:40 H 9:10
H,I,J,K 9:10,9:20,9:30,9:40 I 9:20
H,I,J,K 9:10,9:20,9:30,9:40 J 9:30
H,I,J,K 9:10,9:20,9:30,9:40 K 9:40
關于posexplode具體細節用法,可參考阿裡雲文檔
https://help.aliyun.com/document_detail/293597.html#section-2yc-ymd-p11拜了個拜