当前位置:   article > 正文

hive:函数:排名函数:Rank(笔记)_hive partion by rank =1

hive partion by rank =1

Rank

1.函数说明

  1. RANK() 排序(排名)相同时会重复,总数不会变
  2. DENSE_RANK() 排序(排名)相同时会重复,总数会减少
  3. ROW_NUMBER() 依次进行排名

2.数据准备

表6-7 数据准备

name

subject

score

孙悟空

语文

87

孙悟空

数学

95

孙悟空

英语

68

大海

语文

94

大海

数学

56

大海

英语

84

宋宋

语文

64

宋宋

数学

86

宋宋

英语

84

婷婷

语文

65

婷婷

数学

85

婷婷

英语

78

3.需求

计算每门学科成绩排名

4.创建本地score.txt,导入数据

 vi score.txt

5.创建hive表并导入数据

  1. create table score(
  2. name string,
  3. subject string,
  4. score int)
  5. row format delimited fields terminated by "\t";
  6. load data local inpath '/opt/module/datas/score.txt' into table score;

6.按需求查询数据

  1. select name,
  2. subject,
  3. score,
  4. rank() over(partition by subject order by score desc) rank,
  5. dense_rank() over(partition by subject order by score desc) dense_rank,
  6. row_number() over(partition by subject order by score desc) row_number
  7. from tmp.score;

 

 

扩展:求出每门学科前三名的学生?

  1. select * from
  2. (
  3. select
  4. name,
  5. subject,
  6. score,
  7. rank() over(partition by subject order by score desc ) rank,
  8. dense_rank() over(partition by subject order by score desc) dense_rank,
  9. row_number() over(partition by subject order by score desc) row_number
  10. from tmp.score
  11. ) s where row_number<=3

 

注意:partition后边有哪个字段只会影响最后的rank的排名,跟去重没关系,select后边字段可以指定多个,不一定跟partition后边的字段一致。

举例:

  1. =================sql1
  2. select
  3. create_time,creator_id,fullname,expectCity,positionName,keywords,row_number() over(partition by creator_id,fullname order by create_time desc) as rank
  4. from (
  5. select from_unixtime(ceil(c.create_time/1000),'yyyy-MM-dd') create_time,c.creator_id,e.fullname,
  6. c.expectCity,c.positionName,c.keywords,c.degree,c.minage,c.maxage,c.isNationalUnified,c.isFamous,c.companyName,c.expectIndustry,c.workCity,c.gender,c.minCurrentSalary,c.maxCurrentSalary,c.minWorkYear,c.maxWorkYear
  7. --,concat(expectCity,':',positionName,'(',keywords,' ',degree,' ',minage,'-',maxage,')') conditions
  8. from tmp.t_conditions c
  9. --select c.creator_id,explode(split(regexp_replace(regexp_replace(expectCity,'\\[',''),'\\]',''),',')) city from tmp.t_conditions c
  10. left join ods.ods_aimsen_walre_anlle_base_employees e
  11. on c.creator_id=e.number
  12. where c.creator_id not in ('130000062','130000063','130000071','430000003')
  13. and c.tag in(2)
  14. and c.phone = ''
  15. and c.resumeId = ''
  16. and c.name = ''
  17. and from_unixtime(ceil(c.create_time/1000),'yyyy-MM-dd') > '2020-07-20'
  18. --and expectCity !='' and degree !='' and (maxAge!='' or minAge!='') and (positionName !='' or keyWords!='')
  19. and creator_id=100400287
  20. group by create_time,c.creator_id,e.fullname,c.expectCity,c.positionName,c.keywords,c.degree,c.minage,c.maxage,c.isNationalUnified,c.isFamous,c.companyName,c.expectIndustry,c.workCity,c.gender,c.minCurrentSalary,c.maxCurrentSalary,c.minWorkYear,c.maxWorkYear--,concat(expectCity,':',positionName,'(',keywords,' ',degree,' ',minage,'-',maxage,')')
  21. ) x
  22. =================sql2
  23. select
  24. create_time,creator_id,fullname,expectCity,positionName,keywords,row_number() over(partition by creator_id,fullname,expectCity order by create_time desc) as rank
  25. from (
  26. select from_unixtime(ceil(c.create_time/1000),'yyyy-MM-dd') create_time,c.creator_id,e.fullname,
  27. c.expectCity,c.positionName,c.keywords,c.degree,c.minage,c.maxage,c.isNationalUnified,c.isFamous,c.companyName,c.expectIndustry,c.workCity,c.gender,c.minCurrentSalary,c.maxCurrentSalary,c.minWorkYear,c.maxWorkYear
  28. --,concat(expectCity,':',positionName,'(',keywords,' ',degree,' ',minage,'-',maxage,')') conditions
  29. from tmp.t_conditions c
  30. --select c.creator_id,explode(split(regexp_replace(regexp_replace(expectCity,'\\[',''),'\\]',''),',')) city from tmp.t_conditions c
  31. left join ods.ods_aimsen_walre_anlle_base_employees e
  32. on c.creator_id=e.number
  33. where c.creator_id not in ('130000062','130000063','130000071','430000003')
  34. and c.tag in(2)
  35. and c.phone = ''
  36. and c.resumeId = ''
  37. and c.name = ''
  38. and from_unixtime(ceil(c.create_time/1000),'yyyy-MM-dd') > '2020-07-20'
  39. --and expectCity !='' and degree !='' and (maxAge!='' or minAge!='') and (positionName !='' or keyWords!='')
  40. and creator_id=100400287
  41. group by create_time,c.creator_id,e.fullname,c.expectCity,c.positionName,c.keywords,c.degree,c.minage,c.maxage,c.isNationalUnified,c.isFamous,c.companyName,c.expectIndustry,c.workCity,c.gender,c.minCurrentSalary,c.maxCurrentSalary,c.minWorkYear,c.maxWorkYear--,concat(expectCity,':',positionName,'(',keywords,' ',degree,' ',minage,'-',maxage,')')
  42. ) x

sql1和sql2区别为:

sql2多个expectCity字段

create_time,creator_id,fullname,expectCity,positionName,keywords,row_number() over(partition by creator_id,fullname,expectCity order by create_time desc) as rank

sql1查询效果:

sql2查询效果:

​rank排名发生了变化,但最终查询出的结果条数并没有改变。

声明:本文内容由网友自发贡献,不代表【wpsshop博客】立场,版权归原作者所有,本站不承担相应法律责任。如您发现有侵权的内容,请联系我们。转载请注明出处:https://www.wpsshop.cn/w/很楠不爱3/article/detail/558588
推荐阅读
相关标签
  

闽ICP备14008679号