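Database (like a folder)
Table (like an Excel workbook)
Field (like the first row of an Excel sheet; holds the field name, data type, and comment)
Partition field (like a sheet within the workbook; usually a date. Filtering on it speeds up queries, and the partition must be restricted or Hive reports an error)
数据地图 (Data Map; used to look up the tables you need)
KwaiBI (query platform)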
select [all | distinct] select_expr, …
from
[where]
[group by]
[having]
[order by]
[limit [offset,] rows]
select a+b as cnt
from
where
With group by, the select list must contain the group by fields; everything else in it is an aggregate computed per group.
select pic, count(1) as cnt
from
where p_date =
group by pic
having count(1)>1000
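As a runnable sketch of the group by / having pattern, the snippet below uses Python's built-in sqlite3 module as a stand-in for Hive. The table name photo_log, its rows, the date value, and the threshold of 2 are all made up for illustration; only the query shape follows the notes.

```python
import sqlite3

# SQLite stands in for Hive here; photo_log and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("create table photo_log (pic text, p_date text)")
rows = (
    [("a", "20240101")] * 3    # pic a: 3 rows on the queried date
    + [("b", "20240101")] * 1  # pic b: 1 row on the queried date
    + [("a", "20240102")] * 5  # another date, removed by the where clause
)
conn.executemany("insert into photo_log values (?, ?)", rows)

# where filters rows before grouping; having filters groups after aggregation.
query = """
select pic, count(1) as cnt
from photo_log
where p_date = '20240101'
group by pic
having count(1) > 2
order by pic
"""
result = conn.execute(query).fetchall()
print(result)
```

Only pic a survives: it has three rows on the selected date, while pic b has one and is dropped by having.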
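count(*): number of rows, including rows with null
count(expr): number of rows where expr is not null
count(DISTINCT expr): number of distinct values, not counting null
sum(col): sum of the values
sum(DISTINCT col): sum of the distinct values
avg(col): average of the values; avg(DISTINCT col): average of the distinct values
collect_set(col): gathers the values into an array with duplicates removed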
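The null and DISTINCT behavior listed above can be checked on a tiny table. This is a sketch using Python's sqlite3 rather than Hive; the table t and its four rows are invented. collect_set is specific to Hive and has no SQLite counterpart, so it is left out.

```python
import sqlite3

# SQLite stands in for Hive; table t and its values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (col integer)")
conn.executemany("insert into t values (?)", [(1,), (1,), (2,), (None,)])

row = conn.execute("""
select count(*),            -- all rows, including the null one
       count(col),          -- non-null values only
       count(distinct col), -- distinct non-null values
       sum(col),
       sum(distinct col),
       avg(col),
       avg(distinct col)
from t
""").fetchone()
print(row)
```

With the values 1, 1, 2, null, the counts are 4, 3, and 2; the sums are 4 and 3; the averages are 4/3 and 1.5.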
In Hive, find the users in a table who logged in for the first time on a given day:
select a.id
from (
  select id, collect_set(time) as t
  from t_action_login
  where time <= '20150906'
  group by id
) as a
where size(a.t) = 1 and a.t[0] = '20150906';
123@163.com | [“2
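The same first-login question can also be expressed without collect_set. The sketch below is a reformulation, not the query from the notes: count(distinct time) = 1 plays the role of size(a.t) = 1, and min(time) plays the role of a.t[0]. It runs on Python's sqlite3 as a stand-in for Hive, and the user ids and rows are invented.

```python
import sqlite3

# SQLite stands in for Hive; the rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("create table t_action_login (id text, time text)")
conn.executemany(
    "insert into t_action_login values (?, ?)",
    [
        ("u1", "20150906"), ("u1", "20150906"),  # first seen on the target day
        ("u2", "20150905"), ("u2", "20150906"),  # already seen the day before
        ("u3", "20150907"),                      # first seen after the target day
        ("u4", "20150906"),                      # first seen on the target day
    ],
)

# Keep ids whose only login date up to the target day is the target day itself.
first_login = conn.execute("""
select id
from t_action_login
where time <= '20150906'
group by id
having count(distinct time) = 1 and min(time) = '20150906'
order by id
""").fetchall()
print(first_login)
```

Because the where clause already limits rows to dates on or before the target day, min(time) = '20150906' alone would be enough; both conditions are kept so the mapping to the collect_set version stays visible.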