A clever use of UNION: UNION can also combine SQL query results horizontally

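Why is UNION more efficient here? Because Hive optimizes UNION: each branch scans and aggregates a single table on its own, so nothing has to be joined.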

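We all know that UNION stacks query results vertically while JOIN combines them horizontally. But have you ever tried using UNION to combine results horizontally? Here is the unoptimized SQL: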

select ep.productid,productname,count(st.tduserid),count(distinct sl.tduserid),count(distinct sn.tduserid),avg(sl.interval_level)
from(select productid,productname from enterprise.product where productid = '3006090') ep
join(select tduserid,productid from tdanalytics.stg_td_web_page_ex where l_date <= '2019-04-07' and l_date >= date_add('2019-04-07', -6)) st
on ep.productid=st.productid
join(select tduserid,interval_level,productid from tdanalytics.stg_td_web_launch_ex where l_date <= '2019-04-07' and l_date >= date_add('2019-04-07', -6)) sl
on st.productid=sl.productid
join(select tduserid,productid from tdanalytics.stg_td_web_newuser_ex where l_date <= '2019-04-07' and l_date >= date_add('2019-04-07', -6)) sn
on sl.productid=sn.productid
group by ep.productid,productname;
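One reason the join version is expensive is worth spelling out: the tables are joined only on productid, which is not unique in any of them, so every row of one table is paired with every row of the others for that product. The sketch below reproduces the effect on toy data. It uses Python's sqlite3 as a stand-in for Hive, and the table and column names (page_views, launches, product_id, user_id) are made up for the illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Hive (assumption: toy tables, toy data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page_views (product_id TEXT, user_id TEXT);
    CREATE TABLE launches   (product_id TEXT, user_id TEXT);
    INSERT INTO page_views VALUES ('p1', 'a'), ('p1', 'a'), ('p1', 'b');
    INSERT INTO launches   VALUES ('p1', 'a'), ('p1', 'b'), ('p1', 'c'), ('p1', 'c');
""")

# Joining on product_id alone pairs all 3 page views with all 4 launches.
joined_rows, joined_pv, joined_uv = conn.execute("""
    SELECT COUNT(*), COUNT(pv.user_id), COUNT(DISTINCT l.user_id)
    FROM page_views pv
    JOIN launches l ON pv.product_id = l.product_id
""").fetchone()

# The page-view count taken from the table by itself.
true_pv = conn.execute("SELECT COUNT(user_id) FROM page_views").fetchone()[0]

print(joined_rows, joined_pv, joined_uv, true_pv)
```

On this data the join produces 12 intermediate rows, and the plain count over the joined rows is 12 instead of 3, while the distinct count is unaffected. With a week of real logs, the intermediate result grows with the product of the table sizes.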

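Then I switched to UNION and everything clicked: the query ran in only 1m26s. It works remarkably well. It may be a bit more verbose to write, but you can drop the analysts' statements straight in and only need to tweak them slightly. Without further ado, here is the code: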

select '2019-04-07' dates,
       '3006090' productid,
       max(pro) productname,
       sum(pv) pv,
       sum(uv) uv,
       cast(sum(duration) as decimal(10,4)) duration,
       sum(new_uv) new_uv
from 
(select productname pro,
       '0' pv,
       '0' uv,
       '0' duration,
       '0' new_uv
 from enterprise.product where productid = '3006090'
union all
select '0' pro,
       count(tduserid) pv,
       '0' uv,
       '0' duration,
       '0' new_uv
from tdanalytics.stg_td_web_page_ex where l_date <= '2019-04-07' and l_date >= date_add('2019-04-07', -6) and
       productid = '3006090'
union all
select '0' pro,
       '0' pv,
       count(distinct tduserid) uv,
       avg(interval_level) duration,
       '0' new_uv
from tdanalytics.stg_td_web_launch_ex where l_date <= '2019-04-07' and l_date >= date_add('2019-04-07', -6) and
       productid = '3006090'
union all
select '0' pro,
       '0' pv,
       '0' uv,
       '0' duration,
       count(distinct tduserid) new_uv
from tdanalytics.stg_td_web_newuser_ex where l_date <= '2019-04-07' and l_date >= date_add('2019-04-07', -6) and
       productid = '3006090'
) t;
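The pattern in the query above can be reduced to a small runnable sketch: each branch of the UNION ALL fills in its own metric and puts a zero in every other column, and the outer sum collapses the branches into a single wide row. The example uses Python's sqlite3 rather than Hive, and the tables (page_views, launches) and their columns are hypothetical stand-ins for the real log tables.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Hive (assumption: toy tables, toy data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page_views (user_id TEXT);
    CREATE TABLE launches   (user_id TEXT, duration REAL);
    INSERT INTO page_views VALUES ('a'), ('a'), ('b');
    INSERT INTO launches   VALUES ('a', 10.0), ('b', 20.0), ('b', 30.0);
""")

# Each branch computes one metric and pads the other columns with 0;
# the outer SUM then merges the branches into one row.
row = conn.execute("""
    SELECT SUM(pv) AS pv, SUM(uv) AS uv, SUM(duration) AS duration
    FROM (
        SELECT COUNT(user_id) AS pv, 0 AS uv, 0 AS duration FROM page_views
        UNION ALL
        SELECT 0, COUNT(DISTINCT user_id), AVG(duration) FROM launches
    ) t
""").fetchone()

print(row)  # (pv, uv, average duration)
```

The sketch pads with numeric 0 rather than the string '0' used in the query above, so the sums do not depend on implicit string-to-number conversion.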

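So numeric columns can be merged with sum. What about a column that holds Chinese text (or any other string)? max takes care of that. Here is the same query, parameterized with ${n_date}: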

select cast('${n_date}' as date) dates,
       '3006090' productid,
       max(pro) productname,
       sum(pv) pv,
       sum(uv) uv,
       cast(sum(duration) as decimal(10,2)) duration,
       sum(new_uv) new_uv
from 
(select productname pro,
       '0' pv,
       '0' uv,
       '0' duration,
       '0' new_uv
 from enterprise.product where productid = '3006090'
union all
select '0' pro,
       count(tduserid) pv,
       '0' uv,
       '0' duration,
       '0' new_uv
from tdanalytics.stg_td_web_page_ex where l_date <= '${n_date}' and l_date >= date_add('${n_date}', -6) and
       productid = '3006090'
union all
select '0' pro,
       '0' pv,
       count(distinct tduserid) uv,
       avg(interval_level) duration,
       '0' new_uv
from tdanalytics.stg_td_web_launch_ex where l_date <= '${n_date}' and l_date >= date_add('${n_date}', -6) and
       productid = '3006090'
union all
select '0' pro,
       '0' pv,
       '0' uv,
       '0' duration,
       count(distinct tduserid) new_uv
from tdanalytics.stg_td_web_newuser_ex where l_date <= '${n_date}' and l_date >= date_add('${n_date}', -6) and
       productid = '3006090'
) t;
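The max trick works because strings are compared lexicographically, and a Chinese product name sorts after the '0' placeholder. A name that sorts before '0' would lose to the placeholder, so a NULL placeholder, which max ignores, is a safer choice. The sketch below shows the difference. It runs on SQLite through Python's sqlite3, and the tables and the product name '#demo-app' are invented for the illustration; string ordering in Hive should be checked separately.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Hive (assumption: toy tables, toy data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product    (name TEXT);
    CREATE TABLE page_views (user_id TEXT);
    INSERT INTO product    VALUES ('#demo-app');   -- '#' sorts before '0'
    INSERT INTO page_views VALUES ('a'), ('b');
""")

# The same UNION ALL shape, with the text placeholder left as a parameter.
query = """
    SELECT MAX(name) AS name, SUM(pv) AS pv
    FROM (
        SELECT name, 0 AS pv FROM product
        UNION ALL
        SELECT {placeholder} AS name, COUNT(user_id) AS pv FROM page_views
    ) t
"""

with_zero = conn.execute(query.format(placeholder="'0'")).fetchone()
with_null = conn.execute(query.format(placeholder="NULL")).fetchone()

print(with_zero)  # the '0' placeholder wins over '#demo-app'
print(with_null)  # MAX skips NULL, so the real name survives
```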

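One more practical tip: joining a table that has already been filtered is faster than joining first and filtering afterwards, because the subquery lets you select only the columns you need. The code: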

join (select url,displayname from enterprise.pagename where productid = '3006090') p on p.displayname=b.displayname   -- more efficient
versus
   join enterprise.pagename p on p.displayname=b.displayname where p.productid = '3006090'
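For an inner join the two forms return the same rows, so the choice only affects how much data reaches the join. The sketch below checks the equivalence on toy data using Python's sqlite3. The visits table (aliased b) and all sample rows are assumptions, since the original only shows the join clause. Hive's optimizer can often push such a filter down by itself (predicate pushdown), so any speed-up is worth measuring on the real tables rather than taking for granted.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Hive (assumption: toy tables, toy data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pagename (productid TEXT, url TEXT, displayname TEXT);
    CREATE TABLE visits   (displayname TEXT, hits INTEGER);
    INSERT INTO pagename VALUES
        ('3006090', '/home',  'Home'),
        ('3006090', '/cart',  'Cart'),
        ('9999999', '/home2', 'Home');
    INSERT INTO visits VALUES ('Home', 5), ('Cart', 2);
""")

# Form 1: filter and project inside the subquery, then join.
filter_first = conn.execute("""
    SELECT p.url, b.hits
    FROM visits b
    JOIN (SELECT url, displayname FROM pagename WHERE productid = '3006090') p
      ON p.displayname = b.displayname
    ORDER BY p.url
""").fetchall()

# Form 2: join the whole table, then filter in WHERE.
filter_last = conn.execute("""
    SELECT p.url, b.hits
    FROM visits b
    JOIN pagename p ON p.displayname = b.displayname
    WHERE p.productid = '3006090'
    ORDER BY p.url
""").fetchall()

print(filter_first == filter_last, filter_first)
```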
