赞
踩
将下表中id去重,并把去重后所有字段输出
说道去重相信大部分人脑海中第一反应是:
SELECT id,month,label FROM t4 GROUP BY id;
结果:
但是Hive做同样的操作就会报错:
FAILED: SemanticException [Error 10025]: Line 1:42 Expression not in GROUP BY key 'month','label'
select id,month,flag from (select id,month,flag,row_number()
over (partition by id order by month desc) as rn from view1) t where t.rn = 1;
结果:
+---+------+----+
| id| month|flag|
+---+------+----+
|133|201901| 1|
|134|201812| 1|
+---+------+----+
Hive去重,最好使用row_number()函数
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。