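Problem: after painstakingly loading several hundred thousand rows into the database, I discovered that a bug in my program had produced nearly 100,000 redundant rows. I was devastated. For how to check whether redundant data exists, see my previous post (https://my.oschina.net/u/3636678/blog/2967373).
Attempt 1: the first thing that came to mind was of course a delete statement, as shown below: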
sql = "delete from all_askrep a where (a.askname, a.atime) in (select askname,atime from all_askrep group by askname,atime having count(*)>1) and askrepid not in (select min(askrepid) from all_askrep group by askname,atime having count(*)>1)"
cursor.execute(sql)
conn.commit()
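An hour later it still had not finished, so I aborted it. I then added a rownum < 1000 condition to see how long deleting just the first thousand rows would take, and even that needed about twenty minutes, which is far too slow!!
Attempt 2: run the delete as PL/SQL. I am not very familiar with PL/SQL syntax and urgently needed to delete the data, so I gave up on this approach. For a related example, see this post (https://www.cnblogs.com/nayitian/p/3238251.html).
Attempt 3: delete the data with the help of a temporary table. This turned out to be exactly what I needed: all the deletes finished within two seconds. While trying the second approach I had installed the plsqldev tool, so I ran the deletes through it, but of course it works the same without the tool. The code is as follows (create the temporary table -> use it to delete the matching rows -> drop the temporary table):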
create table tempaskrep as select a.askname, a.atime, a.askrepid as dataid from all_askrep a where a.askrepid not in (select max(b.askrepid) from all_askrep b group by b.askname,b.atime);
commit;
delete from all_askrep where askrepid in (select dataid from tempaskrep);
commit;
drop table tempaskrep;
commit;
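The three steps above can also be driven from Python, in the same cursor.execute style as attempt 1. The sketch below is only an illustration under some assumptions: it uses the standard-library sqlite3 module with an in-memory database as a stand-in for the Oracle connection, the sample rows are made up, and the function names remove_duplicates and demo are hypothetical. The SQL matches the statements above, except that the scratch table keeps only the dataid column, since that is the only column the delete reads.

```python
import sqlite3


def remove_duplicates(conn):
    """Keep only the row with the largest askrepid per (askname, atime)."""
    cur = conn.cursor()
    # Step 1: collect the ids of the redundant rows in a scratch table.
    cur.execute(
        "create table tempaskrep as "
        "select a.askrepid as dataid from all_askrep a "
        "where a.askrepid not in ("
        "  select max(b.askrepid) from all_askrep b "
        "  group by b.askname, b.atime)"
    )
    # Step 2: delete exactly those rows from the main table.
    cur.execute(
        "delete from all_askrep "
        "where askrepid in (select dataid from tempaskrep)"
    )
    deleted = cur.rowcount
    # Step 3: drop the scratch table and commit the whole operation.
    cur.execute("drop table tempaskrep")
    conn.commit()
    return deleted


def demo():
    # Assumption: an in-memory SQLite database replaces the real Oracle one.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table all_askrep ("
        "askrepid integer primary key, askname text, atime text)"
    )
    # Made-up sample data: ids 1, 2 and 4 share the same (askname, atime).
    rows = [
        (1, "alice", "2018-12-01"),
        (2, "alice", "2018-12-01"),
        (3, "bob", "2018-12-02"),
        (4, "alice", "2018-12-01"),
        (5, "bob", "2018-12-03"),
    ]
    conn.executemany("insert into all_askrep values (?, ?, ?)", rows)
    conn.commit()
    deleted = remove_duplicates(conn)
    remaining = [
        r[0]
        for r in conn.execute("select askrepid from all_askrep order by askrepid")
    ]
    conn.close()
    return deleted, remaining
```

Calling demo() on the sample data removes the two older copies of the duplicated row and leaves one row per (askname, atime) pair. Against a real Oracle database the same function should work with the connection object used in attempt 1, but I have not measured that here.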