当前位置:   article > 正文

Leetcode SQL 解析(Day1)_sql leetcode

sql leetcode

1. 游戏玩法分析2

  • 解析

  1. #方法一:嵌套查询
  2. select player_id,device_id from Activity
  3. where (player_id,event_date) in (
  4. select player_id,min(event_date)
  5. from Activity
  6. group by player_id
  7. )
  8. #方法二:时间窗口查询
  9. select player_id,device_id from
  10. (
  11. select
  12. player_id,
  13. device_id,
  14. row_number() over(partition by player_id order by event_date) as rn
  15.         #按照player_id分组,组内按照event_date排序,将组内每条数据重新排序编号为rn
  16. from Activity
  17. ) as a
  18. where rn=1

ROW_NUMBER(): 求行数,结果为1,2,3,4

RANK(): 有间隔的分级,结果为1,2,2,4

DENSE_RANK(): 无间隔的分级,结果为1,2,2,3

2. 游戏玩法分析3

  • 解析

  1. #方法一:自连接
  2. #step1:自连接,得到图中整体的表(t1将设置为现在的时间戳,t2将为了找出比当前时间戳更早的记录)
  3. select t1.player_id,t1.event_date
  4. from Activity t1, Activity t2
  5. where t1.player_id=t2.player_id
  6. #step2:筛选日期在自身之前的记录,如图中灰色部分
  7. t1.event_date>=t2.event_date
  8. #step3:用group by 对两个主键进行分组并对目标字段进行sum(),如途中棕色虚线圈部分,综合代码为:
  9. select player_id,
  10.        event_date,
  11.       sum(game_played)as game_played_so_far
  12. from Activity t1, Activity t2
  13. where t1.player_id=t2.player_id and t1.event_date>=t2.event_date #拼表+筛选
  14. group by t1.player_id,t1.event_date #分组后用sum聚合
  1. #方法二:时间窗口
  2. select
  3.     player_id,
  4.     event_date,
  5.     sum(game_played)over(partition by player_id order by event_date)as game_played_so_far
  6.     #按照player_id分组,组内按照event_date排序,将每条数据event_date之前的game_played累加
  7. from Activity
  8. group by t1.player_id,t1.event_date

sum(字段1) over(partition by 字段2 order by 字段3 rows between unbounded preceding and current row) as 新字段名

函数说明:

  • sum(tota_amount)的求和是针对后面over()窗口的求和,

  • over中partition by player_id order by event_date 针对player_id这一组按照event_date排序,

  • rows between unbounded preceding and current row 限定了行是按照在当前行不限定的往前处理(累加范围),通俗就是处理当前以及之前的所有行的sum,即3月时sum(amount)求的时1、2、3月的和,2月时sum(amount)求的是1、2月的和。unbounded意思无限的 preceding在之前的,current row当前行。

  • 这里可以改成:

  1. SUM(Salary) OVER (PARTITION BY Id ORDER BY Month ROWS 2 PRECEDING)
  2. #从当前月往前滚动2个月薪水累加(月份连续,如12月)
  3. or
  4. SUM(Salary) OVER (PARTITION BY Id ORDER BY Month range 2 PRECEDING)
  5. #从当前月往前找到2个月薪水累加(月份不连续,如13月)

3. 游戏玩法分析4

  • 解析

  1. # 方法一:用IN是否存在筛选第二天登录的记录
  2. SELECT
  3. ROUND(COUNT(DISTINCT player_id)/(SELECT COUNT(distinct player_id) FROM Activity),
  4. 2) AS fraction #分母是个固定的常数,而分子要根据下面的条件过滤
  5. FROM
  6. Activity
  7. WHERE
  8. (player_id,event_date)
  9. IN
  10. (SELECT player_id,Date(min(event_date)+1)
  11. #得到每个玩家的次日登陆时间,看原数据中是否存在(存在说明第二天登录过,则被where筛选出来)
  12. FROM Activity
  13. Group by player_id) #给每个player_id都计算第二天登陆日期,为date()聚合
  1. #方法二:JOIN+Avg()
  2. SELECT ROUND (Avg(a.player_id is not null),2) as fraction #筛选的是a表,不符合on条件的为空
  3. FROM Activity a right join #a:所有登陆记录
  4. (
  5. SELECT player_id,min(event_date) as firstDate
  6. FROM Activity
  7. Group by player_id
  8. ) b #b:把每个player_id第一天登录的时间拿出来
  9. on a.player_id=b.player_id and datediff(a.event_date,b.firstDate)=1
  10. #如果相差1天则有值,不等于1则为空,会被最开始筛选掉
  1. #方法三:窗口函数
  2. select round(count(distinct t.player_id)/(select count(distinct player_id) from Activity),2) fraction #在临时表中筛选后聚合
  3. from
  4. (
  5. select
  6. a.player_id,
  7. first_value(a.event_date) over(partition by a.player_id order by a.event_date)first_date,
  8. #按player_id进行分组,按event_date进行组内排序,取组内排序后第一个值
  9. lead(a.event_date,1)over(partition by a.player_id order by a.event_date)next_date
  10. #使用LEAD()函数,可以从当前行访问下一行的数据或下一行之后的行
  11. from Activity a) t #建立临时表
  12. where datediff(t.next_date,t.first_date) = 1 #在临时表中进行筛选

4. 员工薪水中位数

  • 解析:

  1. #方法一:窗口函数用排序和总数判断
  2. Select id,company,salary from(
  3. Select
  4. id,company,salary,
  5. row_number() over (partition by company order by salary) rk,
  6. count(*) over(partition by company) cnt
  7. from Employee e
  8. ) t
  9. where t.rk in (cnt/2,cnt/2+1,cnt/2+0.5)
  10. # 如果是8个数字,中位数分别为第(454.5)个【偶数个:最中间的两个数字都是中位数】
  11. # 如果是9个数字,中位数分别为第(4.55.55)个【奇数个:最中间的一个数字是中位数】
  1. #方法二:把rank和count拼在一起后用where筛选判断
  2. Select id,company,salary from
  3. (
  4. Select
  5. id,E1.company,salary,
  6. row_number() over(partition by company order by salary) rn,
  7. total
  8. from Employee E1
  9. inner join
  10. (
  11. Select company,
  12. count(*) total
  13. from Employee
  14. group by company
  15. )E2 on E1.company=E2.company
  16. ) T
  17. where rn BETWEEN total/2 AND total/2 + 1
  18. # 如果是8个数字,中位数位于第(45)个之间【偶数个:最中间的两个数字都是中位数】
  19. # 如果是9个数字,中位数位于第(4.55.5)之间【奇数个:最中间的一个数字是中位数】

5. 至少有5名直接下属的经理

  • 解析:

  1. #方法一:子查询
  2. select name from Employee
  3. where id in
  4. (
  5. select ManagerId
  6. from Employee
  7. group by ManagerID #将每个员工按对应经理分组
  8. having count(ManagerID)>=5 #找出管理员工数大于5的经理
  9. )
  1. # 方法二:自连接
  2. select e1.name from
  3. Employee e1 left join Employee e2
  4. on e1.id=e2.managerId
  5. #员工ID=经理ID,找出来的是所有经理员工(普通员工的ID不会等于自己经理的ID,经理员工的ID等于普通员工的ID)
  6. #eg.给id=101的员工匹配managerID=101的人,匹配出来了5个,给id=102的员工匹配managerID=102的人,匹配出来了0
  7. group by e1.id #按照员工分组
  8. having count(*)>=5 #

inner join的结果:

6. 给定数字的频率查询中位数

  • 解析

  1. select avg(num) as median from
  2. (
  3. select num,
  4. sum(frequency) over(order by num asc) as asc_amount,
  5. #num正向排序,逐行累加frequency,正向累加到哪个数字的时候超过总数的一半
  6. sum(frequency) over(order by num desc) as desc_amount,
  7. #num反向排序,逐行累加frequency,反向累加到哪个数字的时候超过总数的一半
  8. sum(frequency) over() as total_num
  9. from numbers
  10. )a
  11. where asc_amount >= total_num/2 and desc_amount >= total_num / 2

子查询排序的结果为:

7. 当选者

  • 解析

  1. #方法一:having count(#)
  2. select C.name,V.id from #C是得票者name,V是投票者id
  3. Candidate C left join Vote V
  4. on C.id=V.candidateId
  5. #得到的结果是{[["A", null], ["B", 4], ["B", 1], ["C", 3], ["D", 2], ["E", 5]]}
  6. group by C.name
  7. having count(*)>=all(select count(*) from vote group by candidateId)
  8. #按group by 的C.name来count(*)
  9. # A>all(......): A大于all()中的所有数,即大于all中的最大值
  1. #方法二:
  2. select Name from Candidate
  3. where id =
  4. (select CandidateId from Vote #找出投票表中得票数最大的id
  5. group by CandidateId
  6. order by count(CandidateId) desc #按groupby后的CandidateId来count()聚合
  7. limit 01
  8. )

8. 查询回答率最高的问题

  • 解析

  1. select question_id survey_log from SurveyLog
  2. group by question_id #按题目分组
  3. order by avg(action='answer')desc,question_id #回答率,正确率等用avg()
  4. #先按回答率排序,若回答率有相同则按question_id排序
  5. limit 1

回答率,正确率等用avg()高效

9. 查询员工的累计薪水

  1. SELECT Id, Month, Salary
  2. FROM (
  3. SELECT Id, Month,
  4. SUM(Salary) OVER (PARTITION BY Id ORDER BY Month range 2 PRECEDING) AS Salary,
  5. #按id分组,组内按月份排序从当前行开始往前滚动,累加每个月的薪水
  6. rank() OVER (PARTITION BY Id ORDER BY Month DESC) AS r
  7. #按id分组,组内按月份降序排列,将序号标记为r(对大的月份的序号r=1)
  8. FROM Employee) t
  9. WHERE r > 1 #筛选出除最近一个月(即最大月)之外的数据
  10. ORDER BY Id, Month DESC;

子查询结果为(不筛选r>1):

10. 2016年成功的投保投资

  • 解析

  1. select round(sum(TIV_2016),2) as TIV_2016 from(
  2. select *,
  3. count(PID) over (partition by TIV_2015) num_tiv,
  4.             #按TIV_2015分组一共有几组
  5. count(PID) over (partition by LAT,LON) num_city
  6.             #按LAT,LON分组一共有几组
  7. from insurance) h
  8. where h.num_tiv>1 #TIV_2015组内个数大于1
  9.     and h.num_city=1 #组内个数等于1

11. 订单最多的客户

  1. `#方法一:用limit取最大值 or 用count>all()取最大值
  2. select customer_number from Orders
  3. group by customer_number
  4. order by count(Order_number) desc
  5. #having count(Order_number)>=all(
  6. # select count(order_number) from Orders group by customer_number)
  7. #每个customer_number的订单数量
  8. limit 1
  1. #方法二:count()=max()
  2. select customer_number from Orders
  3. group by customer_number
  4. having count(order_number) = (
  5. select max(sum) from(
  6. select count(order_number) as sum from Orders group by customer_number)t
  7. )
  1. #方法三:窗口函数
  2. select customer_number from(
  3. select customer_number,
  4. rank() over ( order by count(order_number) desc) as rk
  5. from Orders
  6. group by customer_number
  7. )t
  8. where rk=1
  9. #若为rank() over (partition by customer_number order by count(order_number) desc) as rk,则为在customer_number组内排序,而我们要的customer_number的组的排序,因此用group by 分组后按rank()聚合得到每个组按count(order_number)的排名

12. 好友申请1:通过率

  1. select
  2. round(
  3. ifnull(
  4. count(distinct requester_id, accepter_id)/
  5. (select count(distinct sender_id, send_to_id) accept_rate from FriendRequest)
  6. ,0)
  7. ,2) accept_rate
  8. from RequestAccepted

13. 好友申请2:谁有最多的好友

  1. select id,sum(num1) as num
  2. from (
  3. (select requester_id as id,count(accepter_id) as num1 from RequestAccepted R1
  4. group by requester_id)
  5. union all
  6. (select accepter_id as id,count(requester_id) as num1 from RequestAccepted R2
  7. group by accepter_id)
  8. )t
  9. group by id
  10. order by num desc
  11. limit 1

14. 连续空余座位

  1. #找出下第一个和下第二个,再判断下第一个和下第二个是否空闲
  2. select distinct(c1.seat_id)
  3. from cinema c1 join cinema c2
  4. on abs(c2.seat_id-c1.seat_id)=1 #编号连续可采用abs(A.id-B.id)=1
  5. where c1.free=1 and c2.free=1
  6. order by c1.seat_id

15. 销售员

  1. #方法一:not in 子查询+左连接
  2. select name from SalesPerson S
  3. where sales_id not in (
  4. select O.sales_id from Orders O
  5. left join Company C on O.com_id=C.com_id
  6. where C.name ='RED'
  7. )
  1. #方法二:全left join
  2. select s.name, count(C.name) from SalesPerson S
  3. left join Orders O on S.sales_id=O.sales_id
  4. left join Company C on O.com_id=C.com_id and c.name='RED'
  5. #符合筛选条件的才会被匹配出来
  6. group by S.name
  7. having count(C.name)=0#找到没有被匹配出来的
  8. #结果是:[["John", 1], ["Amy", 0], ["Mark", 0], ["Pam", 1], ["Alex", 0]]}

16.树节点

  1. select id ,
  2. Case
  3.     When p_id is null Then "Root"
  4.     When id in (select distinct p_id from tree) Then "Inner"
  5.     #是别人的父节点的点,
  6.     Else "Leaf"
  7. End as Type
  8. from tree

17. 二级关注者

  1. #每一个关注者follower,作为被关注者followee拥有的关注者follower数量
  2. select followee as follower,count(distinct follower) num
  3. #被关注者followee拥有的关注者follower数量:[["Alice", 1], ["Bob", 2], ["Donald", 1]]
  4. from follow
  5. where followee in (select distinct follower from follow)
  6. #对每一个关注者follower,作为被关注者followee的身份进行筛选
  7. group by followee

18. 平均工资:部门与公司比较

  1. #1.拼表,2.按日期和部门聚合计算平均金额和总金额,3.对比
  2. select
  3. pay_month,
  4. department_id,
  5. case #3.对比
  6. when department_avg/company_avg>1 then 'higher'
  7. when department_avg/company_avg=1 then 'same'
  8. else 'lower'
  9. end as comparison
  10. from(
  11. select #2.按日期和部门聚合计算平均金额和总金额
  12. date_format(pay_date,'%Y-%m') as pay_month,
  13. department_id,
  14. avg(amount) over (partition by pay_date) as company_avg,
  15. avg(amount) over (partition by department_id,pay_date) as department_avg
  16. from salary s
  17. left join employee e on s.employee_id=e.employee_id #1.拼表
  18. )t
  19. group by department_id,pay_month
  20. order by pay_month desc, department_id

19. 学生地理信息报告

  1. select
  2. max(case when continent="America" then name else null end) as America
  3. #3. 按rk=1的三个(Jack,Xi,Pascal)聚合max(),case when得到(Jack,null,null=jack
  4. max(case when continent="Asia" then name else null end) as Asia
  5. max(case when continent="Europe" then name else null end) as Europe
  6. from
  7. (
  8. select
  9. name,
  10. continent,
  11. row_number()over (partition by continent order by name) as rk
  12. #按照continent分组,按照name排序为rk
  13. from Student
  14. )t
  15. #1. t表结果是:[["Jack", "America", 1], ["Jane", "America", 2], ["Xi", "Asia", 1], ["Pascal", "Europe", 1]]
  16. #2. rk是每个name在洲内排第几
  17. group by rk #按照rk聚合上面的max()
  18. #窗口函数中对continent进行分区排名 group by cur_rank 能把同一个洲的人分成不同的组 这样上面的max就能取出所有人 如果不加group by 的话 每个洲就只能取出一个人
  19. #max()函数的功效:(‘jack’, null, null)中得出‘jack',(null,null)中得出null。 min()函数也可以。

20. 买下所有产品的客户

  1. #问题:哪些客户买了所有产品
  2. #思路:1. 找到每个客户购买的产品的个数,2. 找到购买个数=产品个数的客户
  3. #方法一:利用join 筛选相同产品个数的
  4. select c.customer_id
  5. from
  6. ( #找到每个客户购买的产品的个数
  7. select customer_id,count(distinct product_key) as cnt
  8. #如果不加distinct的话,如果1客户买了两个5产品也会被算进去
  9. from customer
  10. group by customer_id
  11. )c
  12. join
  13. ( #总产品个数
  14. select count(*) as cnt from product
  15. )p
  16. on c.cnt=p.cnt #找出客户购买个数=总产品个数
  17. #方法二:having筛选
  18. select customer_id
  19. from customer
  20. group by customer_id
  21. having count(distinct product_key)=(select count(*) from product)

声明:本文内容由网友自发贡献,不代表【wpsshop博客】立场,版权归原作者所有,本站不承担相应法律责任。如您发现有侵权的内容,请联系我们。转载请注明出处:https://www.wpsshop.cn/w/寸_铁/article/detail/1001934
推荐阅读
相关标签
  

闽ICP备14008679号