赞
踩
窗口函数分类:聚合函数、排序函数和值函数(偏移函数)。
SQL中的窗口函数是一种特殊的函数,它能够在查询结果中创建一个窗口或者窗口集合,然后对这个窗口进行计算。窗口函数可以用于分析和汇总数据,同时保留原始查询结果的行顺序。
窗口函数基于查询结果的行数据进行计算,窗口函数运行在HAVING子句之后、 ORDER BY子句之前。窗口函数需要特殊的关键字OVER子句来指定窗口即触发一个窗口函数。
AVG:该函数用于计算平均值。
COUNT:该函数用于计算记录数。
MAX:该函数用于计算最大值。
MIN:该函数用于计算最小值。
STDDEV:返回数值的总体标准差。
STDDEV_SAMP:返回一组数值(整数、小数或浮点)的样本标准差。
SUM:该函数用于计算汇总值。
-- 数据准备 WITH t_sales_info AS ( SELECT * FROM ( VALUES('A', '2023-06-01', '2023', '1', 36.0) , ('A', '2023-09-01', '2023', '2', 48.0) , ('B', '2023-09-07', '2023', '2', 48.0) , ('C', '2023-10-10', '2023', '3', 23.0) , ('A', '2023-11-11', '2023', '3', 23.0) , ('C', '2023-12-11', '2023', '3', 23.0) , ('A', '2024-01-01', '2024', '2', 48.0) , ('B', '2024-02-02', '2024', '2', 48.0) , ('C', '2024-02-04', '2024', '1', 36.0) , ('C', '2024-02-11', '2024', '1', 36.0) , ('B', '2024-03-16', '2024', '3', 23.0) , ('B', '2024-04-01', '2024', '3', 23.0) , ('A', '2024-04-14', '2024', '3', 23.0) , ('C', '2024-04-21', '2024', '3', 23.0) , ('B', '2024-05-07', '2024', '3', 23.0) ) AS tbl_name(customer_id, order_date, year, product_id, product_price) )
该函数用于计算平均值。
输入值类型:BIGINT、DOUBLE或FLOAT。
返回值类型:DOUBLE。
-- order by 可以省略
SELECT customer_id, order_date, `year`, product_id, product_price
, AVG(product_price) OVER(PARTITION BY customer_id ORDER BY customer_id) AS price_avg
FROM t_sales_info ;
customer_id | order_date | year | product_id | product_price | price_avg |
---|---|---|---|---|---|
A | 2024-04-14 | 2024 | 3 | 23.0 | 35.6 |
A | 2024-01-01 | 2024 | 2 | 48.0 | 35.6 |
A | 2023-09-01 | 2023 | 2 | 48.0 | 35.6 |
A | 2023-06-01 | 2023 | 1 | 36.0 | 35.6 |
A | 2023-11-11 | 2023 | 3 | 23.0 | 35.6 |
B | 2024-05-07 | 2024 | 3 | 23.0 | 33.0 |
B | 2023-09-07 | 2023 | 2 | 48.0 | 33.0 |
B | 2024-02-02 | 2024 | 2 | 48.0 | 33.0 |
B | 2024-03-16 | 2024 | 3 | 23.0 | 33.0 |
B | 2024-04-01 | 2024 | 3 | 23.0 | 33.0 |
C | 2024-04-21 | 2024 | 3 | 23.0 | 28.2 |
C | 2023-10-10 | 2023 | 3 | 23.0 | 28.2 |
C | 2023-12-11 | 2023 | 3 | 23.0 | 28.2 |
C | 2024-02-04 | 2024 | 1 | 36.0 | 28.2 |
C | 2024-02-11 | 2024 | 1 | 36.0 | 28.2 |
-- 分组字段和排序字段不一致时, 会先分组, 再根据排序字段分组累计求平均值;
-- 如下: 先计算 A+2023 平均值, 到2024时计算 A+2023+2024 平均值
SELECT customer_id, order_date, `year`, product_id, product_price
, AVG(product_price) OVER(PARTITION BY customer_id ORDER BY `year`) AS price_avg
FROM t_sales_info ;
customer_id | order_date | year | product_id | product_price | price_avg |
---|---|---|---|---|---|
A | 2023-09-01 | 2023 | 2 | 48.0 | 35.666666666666664 |
A | 2023-06-01 | 2023 | 1 | 36.0 | 35.666666666666664 |
A | 2023-11-11 | 2023 | 3 | 23.0 | 35.666666666666664 |
A | 2024-01-01 | 2024 | 2 | 48.0 | 35.6 |
A | 2024-04-14 | 2024 | 3 | 23.0 | 35.6 |
B | 2023-09-07 | 2023 | 2 | 48.0 | 48.0 |
B | 2024-05-07 | 2024 | 3 | 23.0 | 33.0 |
B | 2024-02-02 | 2024 | 2 | 48.0 | 33.0 |
B | 2024-03-16 | 2024 | 3 | 23.0 | 33.0 |
B | 2024-04-01 | 2024 | 3 | 23.0 | 33.0 |
C | 2023-10-10 | 2023 | 3 | 23.0 | 23.0 |
C | 2023-12-11 | 2023 | 3 | 23.0 | 23.0 |
C | 2024-04-21 | 2024 | 3 | 23.0 | 28.2 |
C | 2024-02-04 | 2024 | 1 | 36.0 | 28.2 |
C | 2024-02-11 | 2024 | 1 | 36.0 | 28.2 |
该函数用于计算记录数。
输入值类型:数值、字符串类型或BOOLEAN类型。
返回值类型:BIGINT。
-- 2. COUNT:该函数用于计算记录数。
SELECT customer_id, order_date, `year`, product_id, product_price
, COUNT(product_id) OVER(PARTITION BY customer_id, `year` ORDER BY `year`) AS product_ct
FROM t_sales_info ;
customer_id | order_date | year | product_id | product_price | product_ct |
---|---|---|---|---|---|
A | 2023-09-01 | 2023 | 2 | 48.0 | 3 |
A | 2023-06-01 | 2023 | 1 | 36.0 | 3 |
A | 2023-11-11 | 2023 | 3 | 23.0 | 3 |
A | 2024-01-01 | 2024 | 2 | 48.0 | 2 |
A | 2024-04-14 | 2024 | 3 | 23.0 | 2 |
B | 2023-09-07 | 2023 | 2 | 48.0 | 1 |
B | 2024-05-07 | 2024 | 3 | 23.0 | 4 |
B | 2024-02-02 | 2024 | 2 | 48.0 | 4 |
B | 2024-03-16 | 2024 | 3 | 23.0 | 4 |
B | 2024-04-01 | 2024 | 3 | 23.0 | 4 |
C | 2023-10-10 | 2023 | 3 | 23.0 | 2 |
C | 2023-12-11 | 2023 | 3 | 23.0 | 2 |
C | 2024-04-21 | 2024 | 3 | 23.0 | 3 |
C | 2024-02-04 | 2024 | 1 | 36.0 | 3 |
C | 2024-02-11 | 2024 | 1 | 36.0 | 3 |
该函数用于计算最大值。
输入值类型:该函数支持输入任意类型的参数,但是BOOLEAN类型的数据不允许参与运算。
返回值类型:与该函数的输入值类型保持一致。
注意:当列中的值为NULL时,该行不参与计算。
-- 3. MAX:该函数用于计算最大值。
SELECT customer_id, order_date, `year`, product_id, product_price
, MAX(product_price) OVER(PARTITION BY customer_id ORDER BY customer_id) AS product_ct
FROM t_sales_info ;
customer_id | order_date | year | product_id | product_price | product_ct |
---|---|---|---|---|---|
A | 2024-04-14 | 2024 | 3 | 23.0 | 48.0 |
A | 2024-01-01 | 2024 | 2 | 48.0 | 48.0 |
A | 2023-09-01 | 2023 | 2 | 48.0 | 48.0 |
A | 2023-06-01 | 2023 | 1 | 36.0 | 48.0 |
A | 2023-11-11 | 2023 | 3 | 23.0 | 48.0 |
B | 2024-05-07 | 2024 | 3 | 23.0 | 48.0 |
B | 2023-09-07 | 2023 | 2 | 48.0 | 48.0 |
B | 2024-02-02 | 2024 | 2 | 48.0 | 48.0 |
B | 2024-03-16 | 2024 | 3 | 23.0 | 48.0 |
B | 2024-04-01 | 2024 | 3 | 23.0 | 48.0 |
C | 2024-04-21 | 2024 | 3 | 23.0 | 36.0 |
C | 2023-10-10 | 2023 | 3 | 23.0 | 36.0 |
C | 2023-12-11 | 2023 | 3 | 23.0 | 36.0 |
C | 2024-02-04 | 2024 | 1 | 36.0 | 36.0 |
C | 2024-02-11 | 2024 | 1 | 36.0 | 36.0 |
该函数用于计算最小值。
输入值类型:该函数支持输入任意类型的参数,但是BOOLEAN类型的数据不允许参与运算。
返回值类型:与该函数的输入值类型保持一致。
注意:当列中的值为NULL时,该行不参与计算。
-- 4. MIN:该函数用于计算最小值。
SELECT customer_id, order_date, `year`, product_id, product_price
, MIN(product_price) OVER(PARTITION BY customer_id ORDER BY customer_id) AS product_ct
FROM t_sales_info ;
customer_id | order_date | year | product_id | product_price | product_ct |
---|---|---|---|---|---|
A | 2024-04-14 | 2024 | 3 | 23.0 | 23.0 |
A | 2024-01-01 | 2024 | 2 | 48.0 | 23.0 |
A | 2023-09-01 | 2023 | 2 | 48.0 | 23.0 |
A | 2023-06-01 | 2023 | 1 | 36.0 | 23.0 |
A | 2023-11-11 | 2023 | 3 | 23.0 | 23.0 |
B | 2024-05-07 | 2024 | 3 | 23.0 | 23.0 |
B | 2023-09-07 | 2023 | 2 | 48.0 | 23.0 |
B | 2024-02-02 | 2024 | 2 | 48.0 | 23.0 |
B | 2024-03-16 | 2024 | 3 | 23.0 | 23.0 |
B | 2024-04-01 | 2024 | 3 | 23.0 | 23.0 |
C | 2024-04-21 | 2024 | 3 | 23.0 | 23.0 |
C | 2023-10-10 | 2023 | 3 | 23.0 | 23.0 |
C | 2023-12-11 | 2023 | 3 | 23.0 | 23.0 |
C | 2024-02-04 | 2024 | 1 | 36.0 | 23.0 |
C | 2024-02-11 | 2024 | 1 | 36.0 | 23.0 |
STDDEV:返回数值的总体标准差。
STDDEV_SAMP:返回一组数值(整数、小数或浮点)的样本标准差。
样本标准差=方差的算术平方根=s=sqrt(((x1-x)^2 +(x2-x)^2 +......(xn-x)^2)/(n-1))
总体标准差=σ=sqrt(((x1-x)^2 +(x2-x)^2 +......(xn-x)^2)/n )
-- 5. 标准差。
SELECT customer_id, order_date, `year`, product_id, product_price
, STDDEV(product_price) OVER(PARTITION BY customer_id ORDER BY customer_id) AS price_std
, STDDEV_SAMP(product_price) OVER(PARTITION BY customer_id ORDER BY customer_id) AS price_std_samp
FROM t_sales_info ;
customer_id | order_date | year | product_id | product_price | price_std | price_std_samp |
---|---|---|---|---|---|---|
A | 2024-04-14 | 2024 | 3 | 23.0 | 11.18212859879549 | 12.501999840025595 |
A | 2024-01-01 | 2024 | 2 | 48.0 | 11.18212859879549 | 12.501999840025595 |
A | 2023-09-01 | 2023 | 2 | 48.0 | 11.18212859879549 | 12.501999840025595 |
A | 2023-06-01 | 2023 | 1 | 36.0 | 11.18212859879549 | 12.501999840025595 |
A | 2023-11-11 | 2023 | 3 | 23.0 | 11.18212859879549 | 12.501999840025595 |
B | 2024-05-07 | 2024 | 3 | 23.0 | 12.24744871391589 | 13.693063937629153 |
B | 2023-09-07 | 2023 | 2 | 48.0 | 12.24744871391589 | 13.693063937629153 |
B | 2024-02-02 | 2024 | 2 | 48.0 | 12.24744871391589 | 13.693063937629153 |
B | 2024-03-16 | 2024 | 3 | 23.0 | 12.24744871391589 | 13.693063937629153 |
B | 2024-04-01 | 2024 | 3 | 23.0 | 12.24744871391589 | 13.693063937629153 |
C | 2024-04-21 | 2024 | 3 | 23.0 | 6.368673331236264 | 7.12039324756716 |
C | 2023-10-10 | 2023 | 3 | 23.0 | 6.368673331236264 | 7.12039324756716 |
C | 2023-12-11 | 2023 | 3 | 23.0 | 6.368673331236264 | 7.12039324756716 |
C | 2024-02-04 | 2024 | 1 | 36.0 | 6.368673331236264 | 7.12039324756716 |
C | 2024-02-11 | 2024 | 1 | 36.0 | 6.368673331236264 | 7.12039324756716 |
该函数用于计算汇总值。
输入值类型:BIGINT、DOUBLE或FLOAT。
返回值类型:BIGINT。
-- 6. SUM:该函数用于计算汇总值。
SELECT customer_id, order_date, `year`, product_id, product_price
, SUM(product_price) OVER(PARTITION BY customer_id, `year` ORDER BY `year`) AS price_sum
FROM t_sales_info ;
customer_id | order_date | year | product_id | product_price | price_sum |
---|---|---|---|---|---|
A | 2023-09-01 | 2023 | 2 | 48.0 | 107.0 |
A | 2023-06-01 | 2023 | 1 | 36.0 | 107.0 |
A | 2023-11-11 | 2023 | 3 | 23.0 | 107.0 |
A | 2024-01-01 | 2024 | 2 | 48.0 | 71.0 |
A | 2024-04-14 | 2024 | 3 | 23.0 | 71.0 |
B | 2023-09-07 | 2023 | 2 | 48.0 | 48.0 |
B | 2024-05-07 | 2024 | 3 | 23.0 | 117.0 |
B | 2024-02-02 | 2024 | 2 | 48.0 | 117.0 |
B | 2024-03-16 | 2024 | 3 | 23.0 | 117.0 |
B | 2024-04-01 | 2024 | 3 | 23.0 | 117.0 |
C | 2023-10-10 | 2023 | 3 | 23.0 | 46.0 |
C | 2023-12-11 | 2023 | 3 | 23.0 | 46.0 |
C | 2024-04-21 | 2024 | 3 | 23.0 | 95.0 |
C | 2024-02-04 | 2024 | 1 | 36.0 | 95.0 |
C | 2024-02-11 | 2024 | 1 | 36.0 | 95.0 |
CUME_DIST:返回一组数值中每个值的累计分布。
RANK:返回数据集中每个值的排名。
DENSE_RANK:返回一组数值中每个数值的排名。
ROW_NUMBER:根据行在窗口分区内的顺序,为每行数据返回一个唯一的有序行号,行号从1开始。
PERCENT_RANK:返回数据集中每个数据的排名百分比,其结果由(r - 1) / (n - 1)计算得出。其中r为RANK()计算的当前行排名, n为当前窗口分区内总的行数。
NTILE:将每个窗口分区的数据分散到桶号从1到n的n个桶中。
语法
function over (partition by a order by b RANGE|ROWS BETWEEN start AND end)
分区规范:用于将输入行分散到不同的分区中,过程和GROUP BY子句的分散过程相似。
排序规范:决定输入数据行在窗口函数中执行的顺序。
窗口区间:指定计算数据的窗口边界。
窗口区间支持RANGE、ROWS两种模式:
RANGE按照计算列值的范围进行定义。
ROWS按照计算列的行数进行范围定义。
RANGE、ROWS中可以使用BETWEEN start AND end指定边界可取值。BETWEEN start AND end取值为:
CURRENT ROW,当前行。
N PRECEDING,前n行。
UNBOUNDED PRECEDING,直到第1行。
N FOLLOWING,后n行。
UNBOUNDED FOLLOWING,直到最后1行。
返回一组数值中每个值的累计分布。
返回结果:在窗口分区中对窗口进行排序后的数据集,包括当前行和当前行之前的数据行数。排序中任何关联值均会计算成相同的分布值。
返回值类型:DOUBLE。
-- 1. CUME_DIST:返回一组数值中每个值的累计分布。
SELECT customer_id, order_date, `year`, product_id, product_price
, CUME_DIST() OVER(PARTITION BY customer_id ORDER BY order_date) AS cume_dist
FROM t_sales_info ;
customer_id | order_date | year | product_id | product_price | cume_dist |
---|---|---|---|---|---|
A | 2023-06-01 | 2023 | 1 | 36.0 | 0.2 |
A | 2023-09-01 | 2023 | 2 | 48.0 | 0.4 |
A | 2023-11-11 | 2023 | 3 | 23.0 | 0.6 |
A | 2024-01-01 | 2024 | 2 | 48.0 | 0.8 |
A | 2024-04-14 | 2024 | 3 | 23.0 | 1.0 |
B | 2023-09-07 | 2023 | 2 | 48.0 | 0.2 |
B | 2024-02-02 | 2024 | 2 | 48.0 | 0.4 |
B | 2024-03-16 | 2024 | 3 | 23.0 | 0.6 |
B | 2024-04-01 | 2024 | 3 | 23.0 | 0.8 |
B | 2024-05-07 | 2024 | 3 | 23.0 | 1.0 |
C | 2023-10-10 | 2023 | 3 | 23.0 | 0.2 |
C | 2023-12-11 | 2023 | 3 | 23.0 | 0.4 |
C | 2024-02-04 | 2024 | 1 | 36.0 | 0.6 |
C | 2024-02-11 | 2024 | 1 | 36.0 | 0.8 |
C | 2024-04-21 | 2024 | 3 | 23.0 | 1.0 |
返回数据集中每个值的排名: RANK() 会跳过重复值; DENSE_RANK() 不会跳过重复值序号; ROW_NUMBER() 不包含重复值, 相同名次按顺序排列;
排名值是将当前行之前的行数加1,不包含当前行。因此,排序的关联值可能产生顺序上的空隙,而且这个排名会对每个窗口分区进行计算。
DENSE_RANK()与RANK()功能相似,但是DENSE_RANK()关联值不会产生顺序上的空隙。
返回值类型:BIGINT。
SELECT customer_id, product_price
, RANK() OVER(PARTITION BY customer_id ORDER BY product_price) AS rk
, DENSE_RANK() OVER(PARTITION BY customer_id ORDER BY product_price) AS d_rk
, ROW_NUMBER() OVER(PARTITION BY customer_id ORDER BY product_price) AS rn
FROM t_sales_info ;
customer_id | product_price | rk | d_rk | rn |
---|---|---|---|---|
A | 23.0 | 1 | 1 | 1 |
A | 23.0 | 1 | 1 | 2 |
A | 36.0 | 3 | 2 | 3 |
A | 48.0 | 4 | 3 | 4 |
A | 48.0 | 4 | 3 | 5 |
B | 23.0 | 1 | 1 | 1 |
B | 23.0 | 1 | 1 | 2 |
B | 23.0 | 1 | 1 | 3 |
B | 48.0 | 4 | 2 | 4 |
B | 48.0 | 4 | 2 | 5 |
C | 23.0 | 1 | 1 | 1 |
C | 23.0 | 1 | 1 | 2 |
C | 23.0 | 1 | 1 | 3 |
C | 36.0 | 4 | 2 | 4 |
C | 36.0 | 4 | 2 | 5 |
命令说明:返回数据集中每个数据的排名百分比,其结果由(r - 1) / (n - 1)计算得出。其中,r为RANK()计算的当前行排名, n为当前窗口分区内总的行数。
返回值类型:DOUBLE。
SELECT customer_id, order_date, `year`, product_id, product_price
, RANK() OVER(PARTITION BY customer_id ORDER BY product_price) AS rk
, PERCENT_RANK() OVER(PARTITION BY customer_id ORDER BY product_price) AS pr
FROM t_sales_info ;
customer_id | order_date | year | product_id | product_price | rk | pr |
---|---|---|---|---|---|---|
A | 2024-04-14 | 2024 | 3 | 23.0 | 1 | 0.0 |
A | 2023-11-11 | 2023 | 3 | 23.0 | 1 | 0.0 |
A | 2023-06-01 | 2023 | 1 | 36.0 | 3 | 0.5 |
A | 2024-01-01 | 2024 | 2 | 48.0 | 4 | 0.75 |
A | 2023-09-01 | 2023 | 2 | 48.0 | 4 | 0.75 |
B | 2024-05-07 | 2024 | 3 | 23.0 | 1 | 0.0 |
B | 2024-03-16 | 2024 | 3 | 23.0 | 1 | 0.0 |
B | 2024-04-01 | 2024 | 3 | 23.0 | 1 | 0.0 |
B | 2023-09-07 | 2023 | 2 | 48.0 | 4 | 0.75 |
B | 2024-02-02 | 2024 | 2 | 48.0 | 4 | 0.75 |
C | 2024-04-21 | 2024 | 3 | 23.0 | 1 | 0.0 |
C | 2023-10-10 | 2023 | 3 | 23.0 | 1 | 0.0 |
C | 2023-12-11 | 2023 | 3 | 23.0 | 1 | 0.0 |
C | 2024-02-04 | 2024 | 1 | 36.0 | 4 | 0.75 |
C | 2024-02-11 | 2024 | 1 | 36.0 | 4 | 0.75 |
命令说明:将每个窗口分区的数据分散到桶号从1到n的n个桶中。
桶号值最多间隔1,如果窗口分区中的数据行数不能均匀地分散到每一个桶中,则剩余值将从第1个桶开始,每1个桶分1行数据。例如,有6行数据和4个桶, 最终桶号值为1 1 2 2 3 4。
返回值类型:BIGINT。
SELECT customer_id, order_date, `year`, product_id, product_price
, NTILE(4) OVER(PARTITION BY `year` ORDER BY order_date) AS rk
FROM t_sales_info ;
customer_id | order_date | year | product_id | product_price | rk |
---|---|---|---|---|---|
A | 2023-06-01 | 2023 | 1 | 36.0 | 1 |
A | 2023-09-01 | 2023 | 2 | 48.0 | 1 |
B | 2023-09-07 | 2023 | 2 | 48.0 | 2 |
C | 2023-10-10 | 2023 | 3 | 23.0 | 2 |
A | 2023-11-11 | 2023 | 3 | 23.0 | 3 |
C | 2023-12-11 | 2023 | 3 | 23.0 | 4 |
A | 2024-01-01 | 2024 | 2 | 48.0 | 1 |
B | 2024-02-02 | 2024 | 2 | 48.0 | 1 |
C | 2024-02-04 | 2024 | 1 | 36.0 | 1 |
C | 2024-02-11 | 2024 | 1 | 36.0 | 2 |
B | 2024-03-16 | 2024 | 3 | 23.0 | 2 |
B | 2024-04-01 | 2024 | 3 | 23.0 | 3 |
A | 2024-04-14 | 2024 | 3 | 23.0 | 3 |
C | 2024-04-21 | 2024 | 3 | 23.0 | 4 |
B | 2024-05-07 | 2024 | 3 | 23.0 | 4 |
FIRST_VALUE:返回窗口分区第1行的值。
LAST_VALUE返回窗口分区最后1行的值。
LAG:返回窗口内距离当前行之前偏移offset后的值。
LEAD:返回窗口内距离当前行偏移offset后的值。
NTH_VALUE:返回窗口内偏移指定offset后的值,偏移量从1开始。
命令说明:返回窗口分区第一行的值。
返回值类型:与输入参数类型相同。
SELECT customer_id, order_date, `year`, product_id, product_price
, FIRST_VALUE(product_id) OVER(PARTITION BY customer_id ORDER BY order_date) AS fv
FROM t_sales_info ;
customer_id | order_date | year | product_id | product_price | fv |
---|---|---|---|---|---|
A | 2023-06-01 | 2023 | 1 | 36.0 | 1 |
A | 2023-09-01 | 2023 | 2 | 48.0 | 1 |
A | 2023-11-11 | 2023 | 3 | 23.0 | 1 |
A | 2024-01-01 | 2024 | 2 | 48.0 | 1 |
A | 2024-04-14 | 2024 | 3 | 23.0 | 1 |
B | 2023-09-07 | 2023 | 2 | 48.0 | 2 |
B | 2024-02-02 | 2024 | 2 | 48.0 | 2 |
B | 2024-03-16 | 2024 | 3 | 23.0 | 2 |
B | 2024-04-01 | 2024 | 3 | 23.0 | 2 |
B | 2024-05-07 | 2024 | 3 | 23.0 | 2 |
C | 2023-10-10 | 2023 | 3 | 23.0 | 3 |
C | 2023-12-11 | 2023 | 3 | 23.0 | 3 |
C | 2024-02-04 | 2024 | 1 | 36.0 | 3 |
C | 2024-02-11 | 2024 | 1 | 36.0 | 3 |
C | 2024-04-21 | 2024 | 3 | 23.0 | 3 |
命令说明:返回窗口分区最后一行的值。LAST_VALUE默认统计范围是 rows between unbounded preceding and current row,即取当前行数据与当前行之前的数据进行比较。如果像FIRST_VALUE那样直接在每行数据中显示最后一行数据,需要在 order by 条件的后面加上语句:rows between unbounded preceding and unbounded following。
返回值类型:与输入参数类型相同。
SELECT customer_id, order_date, `year`, product_id, product_price
, FIRST_VALUE(product_id) OVER(PARTITION BY customer_id ORDER BY order_date) AS lv
FROM t_sales_info ;
customer_id | order_date | year | product_id | product_price | lv |
---|---|---|---|---|---|
A | 2023-06-01 | 2023 | 1 | 36.0 | 1 |
A | 2023-09-01 | 2023 | 2 | 48.0 | 2 |
A | 2023-11-11 | 2023 | 3 | 23.0 | 3 |
A | 2024-01-01 | 2024 | 2 | 48.0 | 2 |
A | 2024-04-14 | 2024 | 3 | 23.0 | 3 |
B | 2023-09-07 | 2023 | 2 | 48.0 | 2 |
B | 2024-02-02 | 2024 | 2 | 48.0 | 2 |
B | 2024-03-16 | 2024 | 3 | 23.0 | 3 |
B | 2024-04-01 | 2024 | 3 | 23.0 | 3 |
B | 2024-05-07 | 2024 | 3 | 23.0 | 3 |
C | 2023-10-10 | 2023 | 3 | 23.0 | 3 |
C | 2023-12-11 | 2023 | 3 | 23.0 | 3 |
C | 2024-02-04 | 2024 | 1 | 36.0 | 1 |
C | 2024-02-11 | 2024 | 1 | 36.0 | 1 |
C | 2024-04-21 | 2024 | 3 | 23.0 | 3 |
SELECT customer_id, order_date, `year`, product_id, product_price
, LAST_VALUE(product_id) OVER(PARTITION BY customer_id ORDER BY order_date ROWS BETWEEN unbounded preceding AND unbounded following ) AS lv
FROM t_sales_info ;
customer_id | order_date | year | product_id | product_price | lv |
---|---|---|---|---|---|
A | 2023-06-01 | 2023 | 1 | 36.0 | 3 |
A | 2023-09-01 | 2023 | 2 | 48.0 | 3 |
A | 2023-11-11 | 2023 | 3 | 23.0 | 3 |
A | 2024-01-01 | 2024 | 2 | 48.0 | 3 |
A | 2024-04-14 | 2024 | 3 | 23.0 | 3 |
B | 2023-09-07 | 2023 | 2 | 48.0 | 3 |
B | 2024-02-02 | 2024 | 2 | 48.0 | 3 |
B | 2024-03-16 | 2024 | 3 | 23.0 | 3 |
B | 2024-04-01 | 2024 | 3 | 23.0 | 3 |
B | 2024-05-07 | 2024 | 3 | 23.0 | 3 |
C | 2023-10-10 | 2023 | 3 | 23.0 | 3 |
C | 2023-12-11 | 2023 | 3 | 23.0 | 3 |
C | 2024-02-04 | 2024 | 1 | 36.0 | 3 |
C | 2024-02-11 | 2024 | 1 | 36.0 | 3 |
C | 2024-04-21 | 2024 | 3 | 23.0 | 3 |
LAG(x[, offset[, default_value]])
命令说明:返回窗口内距离当前行之前偏移offset后的值。
偏移量起始值是0,也就是当前数据行。偏移量可以是标量表达式,默认offset是1 。
如果偏移量的值是null或者大于窗口长度,则返回default_value;如果没有指定default_value,则返回null。
返回值类型:与输入参数类型相同。
SELECT customer_id, order_date, `year`, product_id, product_price
, LAG(product_id, 1, '无') OVER(PARTITION BY customer_id ORDER BY order_date) AS lag1
FROM t_sales_info ;
customer_id | order_date | year | product_id | product_price | lag1 |
---|---|---|---|---|---|
A | 2023-06-01 | 2023 | 1 | 36.0 | 无 |
A | 2023-09-01 | 2023 | 2 | 48.0 | 1 |
A | 2023-11-11 | 2023 | 3 | 23.0 | 2 |
A | 2024-01-01 | 2024 | 2 | 48.0 | 3 |
A | 2024-04-14 | 2024 | 3 | 23.0 | 2 |
B | 2023-09-07 | 2023 | 2 | 48.0 | 无 |
B | 2024-02-02 | 2024 | 2 | 48.0 | 2 |
B | 2024-03-16 | 2024 | 3 | 23.0 | 2 |
B | 2024-04-01 | 2024 | 3 | 23.0 | 3 |
B | 2024-05-07 | 2024 | 3 | 23.0 | 3 |
C | 2023-10-10 | 2023 | 3 | 23.0 | 无 |
C | 2023-12-11 | 2023 | 3 | 23.0 | 3 |
C | 2024-02-04 | 2024 | 1 | 36.0 | 3 |
C | 2024-02-11 | 2024 | 1 | 36.0 | 1 |
C | 2024-04-21 | 2024 | 3 | 23.0 | 1 |
LEAD(x[,offset[, default_value]])
命令说明:返回窗口内距离当前行偏移offset后的值。
偏移量offset起始值是0,也就是当前数据行。偏移量可以是标量表达式,默认offset是1 。
如果偏移量的值是null或者大于窗口长度,则返回default_value;如果没有指定default_value,则返回null。
返回值类型:与输入参数类型相同。
SELECT customer_id, order_date, `year`, product_id, product_price
, LEAD(product_id, 1) OVER(PARTITION BY customer_id ORDER BY order_date) AS lead1
FROM t_sales_info ;
customer_id | order_date | year | product_id | product_price | lead1 |
---|---|---|---|---|---|
A | 2023-06-01 | 2023 | 1 | 36.0 | 2 |
A | 2023-09-01 | 2023 | 2 | 48.0 | 3 |
A | 2023-11-11 | 2023 | 3 | 23.0 | 2 |
A | 2024-01-01 | 2024 | 2 | 48.0 | 3 |
A | 2024-04-14 | 2024 | 3 | 23.0 | \N |
B | 2023-09-07 | 2023 | 2 | 48.0 | 2 |
B | 2024-02-02 | 2024 | 2 | 48.0 | 3 |
B | 2024-03-16 | 2024 | 3 | 23.0 | 3 |
B | 2024-04-01 | 2024 | 3 | 23.0 | 3 |
B | 2024-05-07 | 2024 | 3 | 23.0 | \N |
C | 2023-10-10 | 2023 | 3 | 23.0 | 3 |
C | 2023-12-11 | 2023 | 3 | 23.0 | 1 |
C | 2024-02-04 | 2024 | 1 | 36.0 | 1 |
C | 2024-02-11 | 2024 | 1 | 36.0 | 3 |
C | 2024-04-21 | 2024 | 3 | 23.0 | \N |
NTH_VALUE(x, offset)
命令说明:返回窗口内偏移指定offset后的值,偏移量从1开始。
如果偏移量offset是null或者大于窗口内值的个数,则返回null;如果偏移量offset为0或者负数,则系统提示报错。
返回值类型:与输入参数类型相同。
SELECT customer_id, order_date, `year`, product_id, product_price
, NTH_VALUE(product_id, 2) OVER(PARTITION BY customer_id ORDER BY order_date) AS nv1
FROM t_sales_info ;
customer_id | order_date | year | product_id | product_price | nv1 |
---|---|---|---|---|---|
A | 2023-06-01 | 2023 | 1 | 36.0 | \N |
A | 2023-09-01 | 2023 | 2 | 48.0 | 2 |
A | 2023-11-11 | 2023 | 3 | 23.0 | 2 |
A | 2024-01-01 | 2024 | 2 | 48.0 | 2 |
A | 2024-04-14 | 2024 | 3 | 23.0 | 2 |
B | 2023-09-07 | 2023 | 2 | 48.0 | \N |
B | 2024-02-02 | 2024 | 2 | 48.0 | 2 |
B | 2024-03-16 | 2024 | 3 | 23.0 | 2 |
B | 2024-04-01 | 2024 | 3 | 23.0 | 2 |
B | 2024-05-07 | 2024 | 3 | 23.0 | 2 |
C | 2023-10-10 | 2023 | 3 | 23.0 | \N |
C | 2023-12-11 | 2023 | 3 | 23.0 | 3 |
C | 2024-02-04 | 2024 | 1 | 36.0 | 3 |
C | 2024-02-11 | 2024 | 1 | 36.0 | 3 |
C | 2024-04-21 | 2024 | 3 | 23.0 | 3 |
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。