至顶网›软件频道 ›在SQL中删除重复记录(多种方法)

在SQL中删除重复记录(多种方法)

扫一扫
分享文章到微信
扫一扫
关注官方公众号
至顶头条

　　学习sql有一段时间了，发现在我建了一个用来测试的表（没有建索引）中出现了许多的重复记录。后来总结了一些删除重复记录的方法，在Oracle中，可以通过唯一rowid实现删除重复记录；还可以建临时表来实现...这个只提到其中的几种简单实用的方法。

作者：中国IT实验室来源：中国IT实验室 2007年10月1日

关键字： SQL 数据库 SQL Server

　　学习sql有一段时间了，发现在我建了一个用来测试的表（没有建索引）中出现了许多的重复记录。后来总结了一些删除重复记录的方法，在Oracle中，可以通过唯一rowid实现删除重复记录；还可以建临时表来实现...这个只提到其中的几种简单实用的方法，希望可以和大家分享（以表employee为例）。
　　
　　SQL> desc employee
　　
　　 Name　　　　　　　　　　　　　　　　　　　Null?　　Type
　　 ----------------------------------------- -------- ------------------
　　
　　emp_id　　　　　　　　　　　　　　　　　　　　　　　　NUMBER(10)
　　emp_name　　　　　　　　　　　　　　　　　　　　　 VARCHAR2(20)
　　
　　salary　　　　　　　　　　　　　　　　　　　　　　　　　NUMBER(10,2)
　　
　　可以通过下面的语句查询重复的记录：
　　
　　SQL> select * from employee;
　　
　　　　EMP_ID EMP_NAME　　　　　　　　　　　　　　　　　SALARY
　　
　　---------- ---------------------------------------- ----------
　　
　　　　　　 1 sunshine　　　　　　　　　　　　　　　　　　　10000
　　
　　　　　　 1 sunshine　　　　　　　　　　　　　　　　　　　10000
　　
　　　　　　 2 semon　　　　　　　　　　　　　　　　　　　　 20000
　　
　　　　　　 2 semon　　　　　　　　　　　　　　　　　　　　 20000
　　
　　　　　　 3 xyz　　　　　　　　　　　　　　　　　　　　　 30000
　　
　　　　　　 2 semon　　　　　　　　　　　　　　　　　　　　 20000
　　
　　SQL> select distinct * from employee;
　　
　　　　EMP_ID EMP_NAME　　　　　　　　　　　　　　　　　　 SALARY
　　
　　---------- ---------------------------------------- ----------
　　
　　　　　　 1 sunshine　　　　　　　　　　　　　　　　　　　10000
　　
　　　　　　 2 semon　　　　　　　　　　　　　　　　　　　　 20000
　　
　　　　　　 3 xyz　　　　　　　　　　　　　　　　　　　　　　 30000
　　
　　SQL>　select * from employee group by emp_id,emp_name,salary having count (*)>1
　　
　　　　EMP_ID EMP_NAME　　　　　　　　　　　　　　　　　　 SALARY
　　
　　---------- ---------------------------------------- ----------
　　
　　　　　　 1 sunshine　　　　　　　　　　　　　　　　　　　10000
　　
　　　　　　 2 semon　　　　　　　　　　　　　　　　　　　　　20000
　　
　　SQL> select * from employee e1
　　
　　where rowid in (select max(rowid) from employe e2
　　
　　where e1.emp_id=e2.emp_id and
　　
　　e1.emp_name=e2.emp_name and e1.salary=e2.salary);
　　
　　　　EMP_ID EMP_NAME　　　　　　　　　　　　　　　　　　 SALARY
　　
　　---------- ---------------------------------------- ----------
　　
　　　　　　 1 sunshine　　　　　　　　　　　　　　　　　　　10000
　　
　　　　　　 3 xyz　　　　　　　　　　　　　　　　　　　　　　 30000
　　
　　　　　　 2 semon　　　　　　　　　　　　　　　　　　　　 20000
　　
　　2. 删除的几种方法：
　　
　　（1）通过建立临时表来实现
　　
　　SQL>create table temp_emp as (select distinct * from employee)
　　
　　SQL> truncate table employee; (清空employee表的数据）
　　
　　SQL> insert into employee select * from temp_emp;　(再将临时表里的内容插回来）
　　
　　( 2）通过唯一rowid实现删除重复记录.在Oracle中，每一条记录都有一个rowid，rowid在整个数据库中是唯一的，rowid确定了每条记录是在Oracle中的哪一个数据文件、块、行上。在重复的记录中，可能所有列的内容都相同，但rowid不会相同，所以只要确定出重复记录中那些具有最大或最小rowid的就可以了，其余全部删除。
　　
　　SQL>delete from employee e2 where rowid not in (
　　select max(e1.rowid) from employee e1 where
　　
　　e1.emp_id=e2.emp_id and e1.emp_name=e2.emp_name and e1.salary=e2.salary);--这里用min(rowid)也可以。
　　
　　SQL>delete from employee e2 where rowid <(
　　select max(e1.rowid) from employee e1 where
　　e1.emp_id=e2.emp_id and e1.emp_name=e2.emp_name and
　　e1.salary=e2.salary);
　　
　　（3）也是通过rowid，但效率更高。
　　
　　SQL>delete from employee where rowid not in (
　　select max(t1.rowid) from employee t1 group by
　　
　　t1.emp_id,t1.emp_name,t1.salary);--这里用min(rowid)也可以。
　　
　　　　EMP_ID EMP_NAME　　　　　　　　　　　　　　　　　　 SALARY
　　
　　---------- ---------------------------------------- ----------
　　
　　　　　　 1 sunshine　　　　　　　　　　　　　　　　　　　10000
　　
　　　　　　 3 xyz　　　　　　　　　　　　　　　　　　　　　　 30000
　　
　　　　　　 2 semon　　　　　　　　　　　　　　　　　　　　 20000
　　
　　SQL> desc employee
　　
　　 Name　　　　　　　　　　　　　　　　　　　Null?　　Type
　　 ----------------------------------------- -------- ------------------
　　
　　emp_id　　　　　　　　　　　　　　　　　　　　　　　　NUMBER(10)
　　emp_name　　　　　　　　　　　　　　　　　　　　　 VARCHAR2(20)
　　
　　salary　　　　　　　　　　　　　　　　　　　　　　　　　NUMBER(10,2)
　　
　　可以通过下面的语句查询重复的记录：
　　
　　SQL> select * from employee;
　　
　　　　EMP_ID EMP_NAME　　　　　　　　　　　　　　　　　SALARY
　　
　　---------- ---------------------------------------- ----------
　　
　　　　　　 1 sunshine　　　　　　　　　　　　　　　　　　　10000
　　
　　　　　　 1 sunshine　　　　　　　　　　　　　　　　　　　10000
　　
　　　　　　 2 semon　　　　　　　　　　　　　　　　　　　　 20000
　　
　　　　　　 2 semon　　　　　　　　　　　　　　　　　　　　 20000
　　
　　　　　　 3 xyz　　　　　　　　　　　　　　　　　　　　　 30000
　　
　　　　　　 2 semon　　　　　　　　　　　　　　　　　　　　 20000
　　
　　SQL> select distinct * from employee;
　　
　　　　EMP_ID EMP_NAME　　　　　　　　　　　　　　　　　　 SALARY
　　
　　---------- ---------------------------------------- ----------
　　
　　　　　　 1 sunshine　　　　　　　　　　　　　　　　　　　10000
　　
　　　　　　 2 semon　　　　　　　　　　　　　　　　　　　　 20000
　　
　　　　　　 3 xyz　　　　　　　　　　　　　　　　　　　　　　 30000
　　
　　SQL>　select * from employee group by emp_id,emp_name,salary having count (*)>1
　　
　　　　EMP_ID EMP_NAME　　　　　　　　　　　　　　　　　　 SALARY
　　
　　---------- ---------------------------------------- ----------
　　
　　　　　　 1 sunshine　　　　　　　　　　　　　　　　　　　10000
　　
　　　　　　 2 semon　　　　　　　　　　　　　　　　　　　　　20000
　　
　　
　　SQL> select * from employee e1
　　
　　where rowid in (select max(rowid) from employe e2
　　where e1.emp_id=e2.emp_id and
　　
　　e1.emp_name=e2.emp_name and e1.salary=e2.salary);
　　
　　　　EMP_ID EMP_NAME　　　　　　　　　　　　　　　　　　 SALARY
　　
　　---------- ---------------------------------------- ----------
　　
　　　　　　 1 sunshine　　　　　　　　　　　　　　　　　　　10000
　　
　　　　　　 3 xyz　　　　　　　　　　　　　　　　　　　　　　 30000
　　
　　　　　　 2 semon　　　　　　　　　　　　　　　　　　　　 20000
　　
　　2. 删除的几种方法：
　　
　　（1）通过建立临时表来实现
　　
　　SQL>create table temp_emp as (select distinct * from employee)
　　
　　SQL> truncate table employee; (清空employee表的数据）
　　
　　SQL> insert into employee select * from temp_emp;　(再将临时表里的内容插回来）
　　
　　( 2）通过唯一rowid实现删除重复记录.在Oracle中，每一条记录都有一个rowid，rowid在整个数据库中是唯一的，rowid确定了每条记录是在Oracle中的哪一个数据文件、块、行上。在重复的记录中，可能所有列的内容都相同，但rowid不会相同，所以只要确定出重复记录中那些具有最大或最小rowid的就可以了，其余全部删除。
　　
　　SQL>delete from employee e2 where rowid not in (
　　select max(e1.rowid) from employee e1 where
　　
　　e1.emp_id=e2.emp_id and e1.emp_name=e2.emp_name and e1.salary=e2.salary);--这里用min(row

查看本文来源

VIP专区

VIP用户

普通用户

邮件订阅

如果您非常迫切的想了解IT领域最新产品与技术信息，那么订阅至顶网技术邮件将是您的最佳途径之一。

重磅专题

往期文章

业界热点:

数字化转型东数西算自动驾驶智能制造云计算元宇宙虚拟人物联网算力网络数字孪生人工智能区块链开源大数据