赞
踩
此为本人复盘使用笔记,有不足之处请大家不吝指出。
聚类:1075题、1633题、1211题、1193题、1174题、550题。
排序和分组:
SQL查找的一般步骤
select
project_id,round(sum((experience_years))/sum(if(experience_years is null, 0,1)),2) as average_years
from
Project
left join
Employee
on
Project.employee_id = Employee.employee_id
group by
project_id
(1)在round(sum((experience_years))/sum(if(experience_years is null, 0,1)),2)
需要使用sum统计工作年限不为null。
(2) 也可以将这句改成round(avg(experience_years),2) as average_years
使用聚合函数AVG求非NULL
(3)在所有聚合函数中除了count(*)都跳过空值,而非处理空值。
select
contest_id,round(count(user_id)*100/(select count(*)from users),2) as percentage
from
Register
group by
contest_id
order by
percentage desc,contest_id
(1)首先分别即group by
统计不同赛事的注册人数,再用其作分子除以User表的总人数即可。
(2)DESC为降序,ORDER BY 默认为升序。
SELECT
query_name,
ROUND(AVG(rating/position), 2) quality,
ROUND(SUM(IF(rating < 3, 1, 0)) * 100 / COUNT(*), 2) poor_query_percentage
FROM Queries
GROUP BY query_name
HAVING query_name is not null;
AVG()和SUM()以行为元素进行计算,再经由group by进行分组即可得出。
SQL的几种连接方式:
一、内连接:
INNER JOIN
只返回两个表中都存在的行。
二、外连接
LEFT JOIN 或者RIGHT JOIN
返回左(右)表中所有的行以及与右(左)表中相关联的行。如果右(左)表中没有与左(右)表中某一行对应的行,则返回空值。
三、交叉连接
可视为笛卡尔积的一种实现方式
CROSS JOIN
返回两表中所有可能存在的行组合。
1193题:每月交易一
datediff( , )前后两日期相减。
date_format( ,‘%Y-%m’)将日期改为:年-月的格式。
select
date_format(trans_date,'%Y-%m') as month,
country,
count(state) as trans_count,
count(if(state = 'approved',1,null)) as approved_count,//if条件成立,则取1;否则取null。
sum(amount) as trans_total_amount,
sum(if(state = 'approved',amount,0)) as approved_total_amount// if的条件成立,则取amount;若不成立,则取0。
from
Transactions
group by
month,country;
1174题:即时食物配送二
step1:找出用户首单数据。
step2:根据首单数据求即时订单比例。
官方解:
select round (
sum(order_date = customer_pref_delivery_date) * 100 /count(customer_id),2) as immediate_percentage
//当order_date = customer_pref_delivery_date时即为即时订单
from Delivery
where (customer_id, order_date) in (
select customer_id, min(order_date)
from delivery
group by customer_id
)//找出了在子查询中的cutomer_id和order_date,最小的日期即首单数据。
嵌套查询:SQL中一个SELECT-FROM-WHERE为一个查询块,将一个查询块嵌套在另一个查询块的WHERE子句或HAVING短语的条件中的查询称为嵌套查询。
min():最小值。
step1:题中寻找第一天和第二天都登录的人,且拥有某人第一天和第二天都登录的日期,所以建立一个“期望表”。假如某个人第一天登录了,那么他登录的第二天是第一天加一天。期望表只需player_id和包含最初登录日期的下一天。
step2:用期望表与原始表左连接,以player_id对齐。
step3:使用sum()函数统计event_date和second_date相同的个数除以player_id的不重复个数。
sum()//求和
datediff(date1,date2)//日期相减date1-date2的值
count()//计数函数,会将null一并统计。(在聚类函数中只有count()会统计null)
//在count函数中,不能直接使用子查询。可以使用子查询作为一个表,并在外部查询中引用它
min()//取最小值
DATE_ADD(date,INTERVAL expr type)//向日期中添加指定的时间间隔。
select round(sum(case when datediff(Expected.second_date,Origin.event_date)=0 then 1 else 0 end) / (select count(distinct player_id) from Activity),2) as fraction //round(,2)四舍五入到两位; //case when a then x(如果a成立则是x) else y(如果a不成立则是y) end:解中即为如果Expected.second_date-Origin.event_date==0日期相同则取1,否则为0,sum()求和统计1的数量。 //distinct去重 from ( select * from Activity ) as Origin left join ( select player_id,date_add(min(event_date),interval 1 day) as second_date from Activity group by player_id ) as Expected on Origin.player_id = Expected.player_id
step1:查询product_id和product_name。
step2:两表连接以product_id对齐,product_id 是 Product 表的外键(reference 列)。
step3:分组后筛选。
第一种:统计在范围内的数量是否和总数量相同。
第二种:直接使用min()和max()函数判断是否在范围内。
WHERE 子句:
WHERE 子句用于在查询中过滤行数据,基于给定的条件从表中选择满足条件的行。
WHERE 子句通常出现在 SELECT、UPDATE 或 DELETE 语句中。
WHERE 子句可以使用比较运算符(如等于、不等于、大于、小于等)和逻辑运算符(如 AND、OR、NOT)来指定条件。
WHERE 子句过滤的是行级数据,即它应用于表中的每一行,并决定哪些行将包含在查询结果中。
HAVING 子句:
HAVING 子句用于在查询中过滤分组数据,基于给定的条件从分组后的结果中选择满足条件的分组。
HAVING 子句通常出现在包含 GROUP BY 子句的 SELECT 语句中。
HAVING 子句可以使用聚合函数(如 SUM、AVG、COUNT 等)和比较运算符来指定条件。
HAVING 子句过滤的是分组级数据,即它应用于根据 GROUP BY 子句形成的每个分组,并决定哪些分组将包含在查询结果中
where
min(sale_date) >= '2019-01-01' and max(sale_date) <= '2019-03-31'
//这是错误的,因为where子句中是不能用聚类函数作为条件表达式的。
select
Product.product_id,Product.product_name
from
Product
left join
Sales
on
Product.product_id = Sales.product_id
group by
product_id
having
min(sale_date) >= '2019-01-01' and max(sale_date) <= '2019-03-31'
//或者使用count(sale_date between '2019-01-01' and '2019-03-31' or null) = count(*)
step1:查找满足条件的customer_id
step2:条件满足:顾客购买的产品种类和Product表中的种类相同。
select
customer_id
from
Customer
group by
customer_id
having count(distinct product_key) = (select count(*) from Product)
//统计某数量进行比较可用(select count(*) from Product)外面加括号
在力扣的题解中有人认为需要将Customer表中的product_key筛选为只在Product中存在product_key。
我认为可以根据使用场景的不同决定是否使用筛选语句,如果在商品类别多,数量多,需要单独查询某人是否购买了某一类别的全部产品则需要进行筛选;如果在商品类别单一,则不需要。
where
product_key in (select product_key from Product)
首先将表连接,以被汇报人(reports_to)和员工ID(employee_id)匹配
select
*
from
Employees as a1
left join
Employees as a2 //将表自连接
on
a1.employee_id = a2.reports_to //匹配,当未匹配时为NULL
删除a1表中null
select
*
from
Employees as a1
left join
Employees as a2
on
a1.employee_id = a2.reports_to
where
a2.employee_id is not null //删除全为null的行
根据reports_to进行分组统计汇报人(employee_id)的数量和年龄平均值,根据要求查询经理的id和名字
(employee_id 和name)
select
a1.employee_id,a1.name,count(a2.reports_to) as reports_count,round(avg(a2.age),0) as average_age
from
Employees as a1
left join
Employees as a2
on
a1.employee_id = a2.reports_to
where
a2.employee_id is not null
group by
a2.reports_to
order by
a1.employee_id
select //select在该代码所有关键字最后执行
a1.employee_id,a1.department_id
from //第一个执行
Employee as a1
left join //第三个执行
(select //第三个的第三个
employee_id,count(1) as num
from //第三个的第一个
Employee
group by //第三个的第二个
employee_id) as a2
on //第二个执行
a1.employee_id = a2.employee_id
where //第四个执行
a1.primary_flag = 'Y' or a2.num = 1 //筛选符合条件的行
满足任意两边之和大于第三边
SELECT
x,y,z,
CASE
WHEN x + y > z AND x + z > y AND y + z > x THEN 'Yes'
ELSE 'No'
END AS 'triangle'
FROM
triangle
;
若一个数与其前一个数相同并与其后一个数也相同,则满足题目要求。
//lag;向前找,即该元素在要找元素的后面
LAG(column_name, offset, default_value) OVER (PARTITION BY partition_column ORDER BY sort_column)
其中,column_name 是要访问的列名,offset 是要返回的偏移量(从 1 开始计数),default_value 是如果找不到匹配行时要返回的默认值。partition_column 是用于分组的列名,sort_column 是用于排序的列名。
//lead;向后找,即该元素在要找元素的前面
LEAD(column_name, offset) OVER (ORDER BY sort_column)
其中,column_name 是要访问的列名,offset 是要返回的偏移量(从 1 开始计数)。sort_column 是用于排序的列名。
select distinct num ConsecutiveNums
from (
select id,
num,
lag(num) over (order by id) as num_prior,
lead(num) over (order by id) as num_next
from Logs)a
where a.num = a.num_prior and num = a.num_next
首先查找每个商品修改价格的最大日期
select
product_id,max(change_date)
from
Products
where
datediff('2019-8-16',change_date)>=0
group by
product_id
嵌套查询每个商品在最大修改日期的new_price
select
product_id,new_price
from
Products
where (product_id,change_date) in (
select
product_id,max(change_date)
from
Products
where
datediff('2019-8-16',change_date)>=0
group by
product_id)
最后进行左连接查询
# # Write your MySQL query statement below select id.product_id, # case # when new_price is null then 10 # else new_price # end as price ifnull(new_price,10) as price //这里使用两种方法均可以 from (select distinct product_id from Products ) as id left join (select product_id,new_price from Products where (product_id,change_date) in ( //嵌套查询每个id的最大修改日期 select product_id,max(change_date) from Products where //将大于2019-8-16筛选掉 datediff('2019-8-16',change_date)>=0 group by product_id) )as span on id.product_id = span.product_id
使用sum(weight) over(order by turn)
根据turn排序,依次求和,直到大于1000
over (partition by xxxx order by xxxx asc/desc)
over()
时,求全体数据的和。select *,sum(weight) over() as sumWeight
from Queue
over(order by xxx)
时,对全表进行排序,依次求和select *,sum(weight) over(order by turn) as sumWeight
from Queue
over(partition by xxx)
时,分组求和select *,sum(weight) over(partition by turn) as sumWeight //根据turn分组求和,但在该例子中不同turn只有一个weight
from Queue
over(partition by xxx order by xxxx
时在每个分组内按照某顺序排序over(order by xxx)
当与from的子句order by 相同时,则相当只有over(order by xxx)
select *,sum(weight) over(order by turn) as sumWeight
from Queue
order by turn
select *,sum(weight) over(order by turn) as sumWeight
from Queue
order by person_id
select person_name
from (
select *,sum(weight) over(order by turn) as sumWeight
from Queue
) as t
where sumWeight <= 1000
order by sumWeight desc
limit 1
分别统计"Low Salary",“Average Salary”,"High Salary"最后使用UNION组合。
SELECT 'Low Salary' AS category, SUM(CASE WHEN income < 20000 THEN 1 ELSE 0 END) AS accounts_count FROM Accounts UNION SELECT 'Average Salary' category, SUM(CASE WHEN income >= 20000 AND income <= 50000 THEN 1 ELSE 0 END) AS accounts_count FROM Accounts UNION SELECT 'High Salary' category, SUM(CASE WHEN income > 50000 THEN 1 ELSE 0 END) AS accounts_count FROM Accounts
select
employee_id
from
Employees
where
salary<30000 and manager_id not in(select employee_id from Employees) //要连接子查询
order by
employee_id
1、查找评论电影数量最多的用户名,并满足:若平局则返回字典序较小的用户名。
2、查找在 February 2020 平均评分最高 的电影名称,并满足:如果出现平局,返回字典序较小的电影名称。
3、使用UNION ALL将两表合并显示
1、union 会自动压缩多个结果集合中的重复结果。
2、union all 则将所有的结果全部显示出来,不管是不是重复。
# Write your MySQL query statement below select a1.name as results from (select a2.name from (select distinct name,count(1) over(partition by MovieRating.user_id) as cnt from MovieRating left join Users on MovieRating.user_id = Users.user_id order by cnt desc,name //先按照cnt排序,再按照name排序 )a2 limit 1 )as a1 union all select p2.title as results from (select distinct b2.title,avg(b1.rating) over(partition by b1.movie_id ) as avg from MovieRating as b1 left join Movies as b2 on b1.movie_id = b2.movie_id where b1.created_at between "2020-02-01" and "2020-02-29" order by avg desc,b2.title limit 1) as p2 //from的子查询要起别名
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。