赞
踩
xxx是表名。name是有重复值的字段(即列名)。 id是自增,id越大,说明创建时间越近。
- delete from t1 using xxx as t1, xxx as t2
- where t1.name = t2.name and t1.id < t2.id;
案例如下
- CREATE TABLE `emp` (
- `id` int(11) NOT NULL AUTO_INCREMENT,
- `name` varchar(100) DEFAULT NULL,
- `update_time` datetime DEFAULT NULL,
- PRIMARY KEY (`id`)
- ) AUTO_INCREMENT=0
- INSERT INTO emp
- (name, update_time)
- VALUES
- ('张三', '2023-09-06 00:00:00'),
- ('张三', '2023-09-06 00:00:00'),
- ('张三', '2023-09-06 00:00:00'),
- ('李四', '2023-09-06 00:00:00'),
- ('李四', '2023-09-06 00:00:00'),
- ('王五', '2023-09-06 00:00:00');
结果:
- id|name|update_time |
- --+----+-------------------+
- 1|张三 |2023-09-06 00:00:00|
- 2|张三 |2023-09-06 00:00:00|
- 3|张三 |2023-09-06 00:00:00|
- 4|李四 |2023-09-06 00:00:00|
- 5|李四 |2023-09-06 00:00:00|
- 6|王五 |2023-09-06 00:00:00|
- -- 查出重复的数据 重复的名字是哪个
- select name from emp group by name having count(1)>1;
结果:
- name|
- ----+
- 张三 |
- 李四 |
- -- 查出重复的数据 哪些人名是重复的
- select * from emp where name in
- (select name from emp group by name having count(1)>1);
结果:
- id|name|update_time |
- --+----+-------------------+
- 1|张三 |2023-09-06 00:00:00|
- 2|张三 |2023-09-06 00:00:00|
- 3|张三 |2023-09-06 00:00:00|
- 4|李四 |2023-09-06 00:00:00|
- 5|李四 |2023-09-06 00:00:00|
如果我想留下最新的一条,通常根据update_time。update_time离得越近,就留下那条。但有时候批量生成的数据,时间都一样,那就只能根据id(id如果是自增,或者雪花都好办)来判断。
- -- 删除重复的人名,留下最近更新的那条(id最大即最近更新)
- delete from t1 using emp as t1, emp as t2
- where t1.name = t2.name and t1.id < t2.id;
-
- -- 如果时间不一样,想通过时间来判断
- delete from t1 using emp as t1, emp as t2
- where t1.name = t2.name and t1.update_time < t2.update_time;
先来看看他查出来的是什么样子的数据
- select * from emp as t1, emp as t2
- where t1.name = t2.name and t1.id < t2.id;
结果:
- id|name |update_time |id|name |update_time |
- --+-----+--------------------+--+-----+-------------------+
- 1|张三 |2023-09-06 00:00:00| 2|张三 |2023-09-06 00:00:00|
- 1|张三 |2023-09-06 00:00:00| 3|张三 |2023-09-06 00:00:00|
- 2|张三 |2023-09-06 00:00:00| 3|张三 |2023-09-06 00:00:00|
- 4|李四 |2023-09-06 00:00:00| 5|李四 |2023-09-06 00:00:00|
查出来的是id为1和2,正好需要删除的是1和2,留下3。
所以可以大胆的使用这种写法:
- delete from 别名1 using 表名 as 别名1, 表名 as 别名2;
-
- delete from t1 using emp as t1, emp as t2
- where t1.name = t2.name and t1.id < t2.id;
- delete from emp where id in(
- select id from (
- select t1.id from emp t1,
- (select max(id) as id,name from emp group by name having count(1)>1) t2
- where t1.name = t2.name and t1.id != t2.id
- )t
- )
这个写法也很简单,他其实是这样的查询:将重复数据中id最大的那条作为一张表和主表关联,关联之后,查出名字相同但id不同的数据,这些数据就是要删除的。
- select * from emp t1,
- (select max(id) as id,name from emp group by name having count(1)>1) t2
- where t1.name = t2.name and t1.id != t2.id
查出结果:
- id|name |update_time |id|name|
- --+------+-------------------+--+----+
- 1|张三 |2023-09-06 00:00:00| 3|张三 |
- 2|张三 |2023-09-06 00:00:00| 3|张三 |
- 4|李四 |2023-09-06 00:00:00| 5|李四 |
然后再删除这几条数据
- delete from emp where id in(
- select t1.id from emp t1,
- (select max(id) as id,name from emp group by name having count(1)>1) t2
- where t1.name = t2.name and t1.id != t2.id
- )
但这样写会报错:
SQL 错误 [1093] [HY000]: You can't specify target table 'emp' for update in FROM clause
这意思是:需要更新的那张表和条件用到的表不能是同一张表。即子查询和修改的表不能是同一个。解决的方法也很简单:就是在子查询外面再套一层,让儿子变成孙子。
mysql delete 删除重复数据_delete mysql 重复数据_globalcoding的博客-CSDN博客
====================分割线=======================
文章到此已经结束,以下是紫薯布丁
delete from t1 using xxx as t1, xxx as t2
where t1.name = t2.name and t1.id < t2.id;
CREATE TABLE `emp` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(100) DEFAULT NULL,
`update_time` datetime DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=0 DEFAULT CHARSET=utf8mb4
INSERT INTO emp
(name, update_time)
VALUES('张三', '2023-09-06 00:00:00'),
('张三', '2023-09-06 00:00:00'),
('张三', '2023-09-06 00:00:00'),
('李四', '2023-09-06 00:00:00')
-- 查出重复的数据 哪些人名是重复的
select name from emp group by name having count(1)>1
-- 删除重复的人名,留下最近更新的那条(id最大即最近更新)
delete from t1 using emp as t1, emp as t2
where t1.name = t2.name and t1.id < t2.id;
;id|name|update_time |
--+----+-------------------+
1|张三 |2023-09-06 00:00:00|
2|张三 |2023-09-06 00:00:00|
3|张三 |2023-09-06 00:00:00|
4|李四 |2023-09-06 00:00:00|
name|
----+
张三 |
id|name|update_time |
--+----+-------------------+
1|张三 |2023-09-06 00:00:00|
2|张三 |2023-09-06 00:00:00|
3|张三 |2023-09-06 00:00:00|
select * from emp where name in
(select name from emp group by name having count(1)>1)
-- 如果时间不一样,想通过时间来判断
delete from t1 using emp as t1, emp as t2
where t1.name = t2.name and t1.update_time < t2.update_time;
select * from emp as t1, emp as t2
where t1.name = t2.name and t1.id < t2.id;
id|name |update_time |id|name |update_time |
--+----+-------------------+--+----+-------------------+
1|张三 |2023-09-06 00:00:00| 2|张三 |2023-09-06 00:00:00|
1|张三 |2023-09-06 00:00:00| 3|张三 |2023-09-06 00:00:00|
2|张三 |2023-09-06 00:00:00| 3|张三 |2023-09-06 00:00:00|
delete from emp where id in(
select id from (
select t1.id from emp t1,
(select max(id) as id,name from emp group by name having count(1)>1) t2
where t1.name = t2.name and t1.id != t2.id
)t
)
SQL 错误 [1093] [HY000]: You can't specify target table 'emp' for update in FROM clause
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。