赞
踩
窗口函数属于sql中比较高级的函数
mysql从8.0版本才支持窗口函数,5.6,5.7都没有窗口函数
oracle 里面一直支持窗口函数
hive也支持窗口函数
以下函数才是窗口函数
窗口函数(13个):
LEAD LEAD(col,n, default_val):往后第n行数据 col 列名 n 往后第几行 默认为1 默认值 默认null
LAG LAG(col,n,default_val):往前第n行数据 col 列名 n 往前第几行 默认为1 默认值 默认null
FIRST_VALUE 在当前窗口下的第一个值 FIRST_VALUE (col,true/false) 如果设置为true,则跳过空值。
LAST_VALUE 在当前窗口下的最后一个值 LAST_VALUE (col,true/false)如果设置为true,则跳过空值。
标准聚合函数:
分析排名函数
窗口函数=函数+窗口
窗口:函数在运算时,我们可以指定函数运算的数据范围
[partition by 字段] [order by 字段] [ 窗口语句]:[]的含义是可以省略,这三个都可以省略
partition by 给查出来的结果集按照某个字段分区,分区以后,开窗的大小最大不会超过分区数据的大小
一旦分区之后,我们必须在单个分区内指定窗口。
order by 给分区内的数据按照某个字段排序
(ROWS | RANGE) BETWEEN (UNBOUNDED | [num]) PRECEDING AND ([num] PRECEDING | CURRENT ROW | (UNBOUNDED | [num]) FOLLOWING)
(ROWS | RANGE) BETWEEN CURRENT ROW AND (CURRENT ROW | (UNBOUNDED | [num]) FOLLOWING)
(ROWS | RANGE) BETWEEN [num] FOLLOWING AND (UNBOUNDED | [num]) FOLLOWING
两种特殊情况
当指定ORDER BY缺少WINDOW子句时,WINDOW规范默认为RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW。
如果同时缺少ORDER BY和WINDOW子句,则WINDOW规范默认为ROW BETWEENUND UNBOUNDED PRECEDING和UNBOUNDED FOLLOWING。
以下函数在over()里面只能分区和排序,不能自定义窗口大小了,也就是不能再写window字句
排序分析函数 都不能写 例如: Rank, NTile, DenseRank, CumeDist, PercentRank.
Lead 和 Lag不能写
需求1 查询在2017年4月份购买过的顾客及总人数
select
name,
count(1) over(rows between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING)
from business
where month(orderdate) =4
group by name;
由于窗口语句有两种特殊情况,我们这种刚好符合第二种,因此可以省略掉窗口语句
select
name,
count(1) over()
from business
where month(orderdate) =4
group by name;
需求2 查询顾客的购买明细及月购买总额
SELECT
name,
orderdate,
cost,
sum(cost ) over(partition by name,month(orderdate) )
from business;
需求3 上述的场景, 将每个顾客的cost按照日期进行累加
SELECT
name,
orderdate,
cost,
sum(cost) over(partition by name order by orderdate rows between UNBOUNDED PRECEDING and CURRENT ROW) cost1,
sum(cost) over(partition by name order by orderdate) cost2
from business;
需求4 查询顾客购买明细以及上次的购买时间和下次购买时间
select
name,
orderdate,
cost,
LAG(orderdate,1,'无') over(partition by name order by orderdate) prev_time,
LEAD(orderdate,1,'无') over(partition by name order by orderdate) next_time
from business;
需求5 查询顾客每个月第一次的购买时间 和 每个月的最后一次购买时间
注意:LAST_VALUE和FIRST_VALUE 需要自定义windows字句,否则出现错误
select
name,
orderdate,
cost,
FIRST_VALUE(orderdate)
over(partition by name,month(orderdate) order by orderdate rows between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING) first_time,
LAST_VALUE(orderdate)
over(partition by name,month(orderdate) order by orderdate rows between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING) last_time
from business;
需求6 查询前20%时间的订单信息
select
t1.*
FROM
(
select
name,
orderdate,
cost,
ntile(5) over(order by orderdate ) nsort
from business
) t1
where t1.nsort = 1;
RANK() 排序相同时会重复,会跳号
DENSE_RANK() 排序相同时会重复,不会跳号
ROW_NUMBER() 会根据顺序计算
SELECT
name,
subject,
score,
rank() over(PARTITION by subject order by score desc) rp,
DENSE_RANK() over(PARTITION by subject order by score desc) drp,
ROW_NUMBER() over(PARTITION by subject order by score desc) rowp
from score;
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。