Rank
1.函數說明
RANK() 排序(排名)相同時會重複,總數不會變
DENSE_RANK() 排序(排名)相同時會重複,總數會減少
ROW_NUMBER() 依次進行排名
2.資料準備
表6-7 資料準備
name | subject | score |
孫悟空 | 國文 | 87 |
孫悟空 | 數學 | 95 |
孫悟空 | 英語 | 68 |
大海 | 國文 | 94 |
大海 | 數學 | 56 |
大海 | 英語 | 84 |
宋宋 | 國文 | 64 |
宋宋 | 數學 | 86 |
宋宋 | 英語 | 84 |
婷婷 | 國文 | 65 |
婷婷 | 數學 | 85 |
婷婷 | 英語 | 78 |
3.需求
計算每門學科成績排名
4.建立本地score.txt,導入資料
vi score.txt
5.建立hive表并導入資料
create table score(
name string,
subject string,
score int)
row format delimited fields terminated by "\t";
load data local inpath '/opt/module/datas/score.txt' into table score;
6.按需求查詢資料
select name,
subject,
score,
rank() over(partition by subject order by score desc) rank,
dense_rank() over(partition by subject order by score desc) dense_rank,
row_number() over(partition by subject order by score desc) row_number
from tmp.score;
擴充:求出每門學科前三名的學生?
select * from
(
select
name,
subject,
score,
rank() over(partition by subject order by score desc ) rank,
dense_rank() over(partition by subject order by score desc) dense_rank,
row_number() over(partition by subject order by score desc) row_number
from tmp.score
) s where row_number<=3
注意:partition後邊有哪個字段隻會影響最後的rank的排名,跟去重沒關系,select後邊字段可以指定多個,不一定跟partition後邊的字段一緻。
舉例:
=================sql1:
select
create_time,creator_id,fullname,expectCity,positionName,keywords,row_number() over(partition by creator_id,fullname order by create_time desc) as rank
from (
select from_unixtime(ceil(c.create_time/1000),'yyyy-MM-dd') create_time,c.creator_id,e.fullname,
c.expectCity,c.positionName,c.keywords,c.degree,c.minage,c.maxage,c.isNationalUnified,c.isFamous,c.companyName,c.expectIndustry,c.workCity,c.gender,c.minCurrentSalary,c.maxCurrentSalary,c.minWorkYear,c.maxWorkYear
--,concat(expectCity,':',positionName,'(',keywords,' ',degree,' ',minage,'-',maxage,')') conditions
from tmp.t_conditions c
--select c.creator_id,explode(split(regexp_replace(regexp_replace(expectCity,'\\[',''),'\\]',''),',')) city from tmp.t_conditions c
left join ods.ods_aimsen_walre_anlle_base_employees e
on c.creator_id=e.number
where c.creator_id not in ('130000062','130000063','130000071','430000003')
and c.tag in(2)
and c.phone = ''
and c.resumeId = ''
and c.name = ''
and from_unixtime(ceil(c.create_time/1000),'yyyy-MM-dd') > '2020-07-20'
--and expectCity !='' and degree !='' and (maxAge!='' or minAge!='') and (positionName !='' or keyWords!='')
and creator_id=100400287
group by create_time,c.creator_id,e.fullname,c.expectCity,c.positionName,c.keywords,c.degree,c.minage,c.maxage,c.isNationalUnified,c.isFamous,c.companyName,c.expectIndustry,c.workCity,c.gender,c.minCurrentSalary,c.maxCurrentSalary,c.minWorkYear,c.maxWorkYear--,concat(expectCity,':',positionName,'(',keywords,' ',degree,' ',minage,'-',maxage,')')
) x
=================sql2:
select
create_time,creator_id,fullname,expectCity,positionName,keywords,row_number() over(partition by creator_id,fullname,expectCity order by create_time desc) as rank
from (
select from_unixtime(ceil(c.create_time/1000),'yyyy-MM-dd') create_time,c.creator_id,e.fullname,
c.expectCity,c.positionName,c.keywords,c.degree,c.minage,c.maxage,c.isNationalUnified,c.isFamous,c.companyName,c.expectIndustry,c.workCity,c.gender,c.minCurrentSalary,c.maxCurrentSalary,c.minWorkYear,c.maxWorkYear
--,concat(expectCity,':',positionName,'(',keywords,' ',degree,' ',minage,'-',maxage,')') conditions
from tmp.t_conditions c
--select c.creator_id,explode(split(regexp_replace(regexp_replace(expectCity,'\\[',''),'\\]',''),',')) city from tmp.t_conditions c
left join ods.ods_aimsen_walre_anlle_base_employees e
on c.creator_id=e.number
where c.creator_id not in ('130000062','130000063','130000071','430000003')
and c.tag in(2)
and c.phone = ''
and c.resumeId = ''
and c.name = ''
and from_unixtime(ceil(c.create_time/1000),'yyyy-MM-dd') > '2020-07-20'
--and expectCity !='' and degree !='' and (maxAge!='' or minAge!='') and (positionName !='' or keyWords!='')
and creator_id=100400287
group by create_time,c.creator_id,e.fullname,c.expectCity,c.positionName,c.keywords,c.degree,c.minage,c.maxage,c.isNationalUnified,c.isFamous,c.companyName,c.expectIndustry,c.workCity,c.gender,c.minCurrentSalary,c.maxCurrentSalary,c.minWorkYear,c.maxWorkYear--,concat(expectCity,':',positionName,'(',keywords,' ',degree,' ',minage,'-',maxage,')')
) x
sql1和sql2差別為:
sql2多個expectCity字段
create_time,creator_id,fullname,expectCity,positionName,keywords,row_number() over(partition by creator_id,fullname,expectCity order by create_time desc) as rank
sql1查詢效果:
sql2查詢效果:
rank排名發生了變化,但最終查詢出的結果條數并沒有改變。