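Today I ran into a table with tens of thousands of rows in which some records are identical except for the primary key and one other column. How can the duplicates be deleted with an SQL statement? I was glad to get the question answered, so here is a summary.
1. If all fields of the records have the same values, that is, the rows are complete duplicates, I also think the better approach is to build a temporary table with SELECT DISTINCT * FROM and drop the old table.
In my case, however, the records are not complete duplicates: the primary key id is unique, and the requirement is to remove rows that share the same value in another field, name, keeping only one of them.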
delete a from tableName as a where a.id not in (select top 1 b.id from tableName as b where b.name = a.name order by b.id)
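As a rough illustration of the "keep one row per name" idea, the sketch below runs an equivalent statement against an in-memory SQLite database through Python's sqlite3 module. The table name person, its columns, and the sample rows are hypothetical; SQLite has no TOP, so MIN(id) with GROUP BY is swapped in to pick the row that survives.

```python
import sqlite3


def dedupe_by_name(conn):
    """Delete every row whose name already appears on a row with a smaller id."""
    conn.execute(
        "DELETE FROM person WHERE id NOT IN "
        "(SELECT MIN(id) FROM person GROUP BY name)"
    )
    conn.commit()


# Hypothetical sample data: three rows share the name 'Ann'.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, note TEXT)")
conn.executemany(
    "INSERT INTO person (id, name, note) VALUES (?, ?, ?)",
    [(1, "Ann", "x"), (2, "Bob", "y"), (3, "Ann", "z"), (4, "Ann", "w")],
)

dedupe_by_name(conn)
remaining = conn.execute("SELECT id, name FROM person ORDER BY id").fetchall()
print(remaining)  # [(1, 'Ann'), (2, 'Bob')]
```

Choosing MIN(id) makes the surviving row deterministic, which a TOP 1 without ORDER BY does not guarantee.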
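2. There are two kinds of duplicate records. The first is the complete duplicate, where every field repeats. The second is the partial duplicate, where only certain key fields repeat, for example the Name field, while the other fields may or may not repeat and can be ignored.
a. The first kind of duplication is relatively easy to handle, using the queries that follow.
1) a) Find the redundant duplicate records in the table, where duplicates are identified by a single field (peopleId). The statements in this list rely on rowid, a pseudo-column provided by Oracle.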
select * from people
where peopleId in (select peopleId from people group by peopleId having count(peopleId) > 1)
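b) Delete the redundant duplicate records from the table, where duplicates are identified by a single field (peopleId), keeping only the record with the smallest rowid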
delete from people
where peopleId in (select peopleId from people group by peopleId having count(peopleId) > 1)
and rowid not in (select min(rowid) from people group by peopleId having count(peopleId )>1)
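Steps a) and b) can be tried out as follows. This is only a sketch: it assumes SQLite (which also exposes an implicit rowid) driven from Python's sqlite3 module, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No INTEGER PRIMARY KEY is declared, so SQLite assigns an implicit rowid 1, 2, 3, ...
conn.execute("CREATE TABLE people (peopleId TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO people (peopleId, name) VALUES (?, ?)",
    [("p1", "a"), ("p2", "b"), ("p1", "c"), ("p3", "d"), ("p2", "e")],
)

# Subquery shared by both steps: the peopleId values that occur more than once.
DUPLICATE_IDS = (
    "SELECT peopleId FROM people GROUP BY peopleId HAVING COUNT(peopleId) > 1"
)

# Step a): list every row whose peopleId is duplicated.
duplicated_rows = conn.execute(
    f"SELECT rowid, peopleId FROM people WHERE peopleId IN ({DUPLICATE_IDS}) "
    "ORDER BY rowid"
).fetchall()

# Step b): delete those rows, except the smallest rowid of each peopleId.
conn.execute(
    f"DELETE FROM people WHERE peopleId IN ({DUPLICATE_IDS}) "
    "AND rowid NOT IN (SELECT MIN(rowid) FROM people "
    "GROUP BY peopleId HAVING COUNT(peopleId) > 1)"
)
conn.commit()

kept_rows = conn.execute("SELECT rowid, peopleId FROM people ORDER BY rowid").fetchall()
print(duplicated_rows)  # [(1, 'p1'), (2, 'p2'), (3, 'p1'), (5, 'p2')]
print(kept_rows)        # [(1, 'p1'), (2, 'p2'), (4, 'p3')]
```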
c) Find the redundant duplicate records in the table (duplicates identified by multiple fields)
select * from vitae a
where (a.peopleId,a.seq) in (select peopleId,seq from vitae group by peopleId,seq having count(*) > 1)
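d) Delete the redundant duplicate records from the table (duplicates identified by multiple fields), keeping only the record with the smallest rowid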
delete from vitae a
where (a.peopleId,a.seq) in (select peopleId,seq from vitae group by peopleId,seq having count(*) > 1)
and rowid not in (select min(rowid) from vitae group by peopleId,seq having count(*)>1)
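e) Find the redundant duplicate records in the table (duplicates identified by multiple fields), excluding the record with the smallest rowid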
select * from vitae a
where (a.peopleId,a.seq) in (select peopleId,seq from vitae group by peopleId,seq having count(*) > 1)
and rowid not in (select min(rowid) from vitae group by peopleId,seq having count(*)>1)
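The multi-column case can be sketched the same way. The tuple form (a, b) IN (subquery) is not accepted by every engine (SQL Server, for instance, rejects it), so this version keeps only the rowid condition. That is enough on its own, because a row that is alone in its group is its own minimum and is therefore never selected. SQLite via Python's sqlite3 and the sample rows are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vitae (peopleId INTEGER, seq INTEGER, detail TEXT)")
conn.executemany(
    "INSERT INTO vitae (peopleId, seq, detail) VALUES (?, ?, ?)",
    [(1, 1, "a"), (1, 1, "b"), (1, 2, "c"), (2, 1, "d"), (2, 1, "e"), (2, 1, "f")],
)

# One survivor per (peopleId, seq) group: the row with the smallest rowid.
SURVIVORS = "SELECT MIN(rowid) FROM vitae GROUP BY peopleId, seq"

# Step e): preview the surplus rows before touching anything.
surplus = conn.execute(
    f"SELECT rowid, peopleId, seq FROM vitae "
    f"WHERE rowid NOT IN ({SURVIVORS}) ORDER BY rowid"
).fetchall()

# Step d): delete exactly the rows that were previewed.
conn.execute(f"DELETE FROM vitae WHERE rowid NOT IN ({SURVIVORS})")
conn.commit()

survivors = conn.execute(
    "SELECT rowid, peopleId, seq FROM vitae ORDER BY rowid"
).fetchall()
print(surplus)    # [(2, 1, 1), (5, 2, 1), (6, 2, 1)]
print(survivors)  # [(1, 1, 1), (3, 1, 2), (4, 2, 1)]
```

Running the preview query first is a cheap safeguard: the DELETE reuses the same condition, so it removes precisely the rows that were listed.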
Suppose table A has a field "name", and different records may carry the same "name" value.
The task is to list the "name" values that occur in more than one record of the table:
Select Name,Count(*) From A Group By Name Having Count(*) > 1
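A minimal sketch of this GROUP BY / HAVING query, run on an in-memory SQLite database from Python. The sample names are made up, and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (Name TEXT)")
conn.executemany(
    "INSERT INTO A (Name) VALUES (?)",
    [("Ann",), ("Bob",), ("Ann",), ("Cid",), ("Bob",), ("Ann",)],
)

# Each Name that occurs more than once, together with its number of occurrences.
repeated_names = conn.execute(
    "SELECT Name, COUNT(*) FROM A GROUP BY Name HAVING COUNT(*) > 1 ORDER BY Name"
).fetchall()
print(repeated_names)  # [('Ann', 3), ('Bob', 2)]
```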
select distinct * from tableName
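returns a result set with no duplicate records.
If the duplicates have to be deleted from the table itself (keeping one copy of each record), this can be done as follows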
select distinct * into #Tmp from tableName
drop table tableName
select * into tableName from #Tmp
drop table #Tmp
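The same rebuild can be sketched on SQLite, which has no SELECT ... INTO; CREATE TABLE ... AS SELECT and ALTER TABLE ... RENAME are used instead. The table visit and its rows are hypothetical. Note that a table rebuilt this way does not carry over indexes or constraints from the original, so those would have to be recreated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visit (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO visit (name, city) VALUES (?, ?)",
    [("Ann", "Oslo"), ("Ann", "Oslo"), ("Bob", "Rome"), ("Ann", "Rome"), ("Bob", "Rome")],
)
conn.commit()

# Copy the distinct rows, drop the original, and rename the copy into place.
# Wrapping the steps in one transaction means readers never see a half-done swap.
conn.executescript(
    """
    BEGIN;
    CREATE TABLE visit_dedup AS SELECT DISTINCT * FROM visit;
    DROP TABLE visit;
    ALTER TABLE visit_dedup RENAME TO visit;
    COMMIT;
    """
)

rebuilt = conn.execute("SELECT name, city FROM visit ORDER BY name, city").fetchall()
print(rebuilt)  # [('Ann', 'Oslo'), ('Ann', 'Rome'), ('Bob', 'Rome')]
```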
Duplication of this kind comes from careless table design; adding a unique index on the relevant columns prevents it.
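To illustrate how a unique index blocks duplicates at insert time, here is a small sketch using SQLite from Python. The table contact, the index name, and the choice of (name, address) as the unique key are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contact (id INTEGER PRIMARY KEY, name TEXT, address TEXT)")
# The unique index makes the database reject a second row with the same pair.
conn.execute("CREATE UNIQUE INDEX ux_contact_name_address ON contact (name, address)")
conn.execute("INSERT INTO contact (name, address) VALUES (?, ?)", ("Ann", "Oslo"))

try:
    conn.execute("INSERT INTO contact (name, address) VALUES (?, ?)", ("Ann", "Oslo"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# INSERT OR IGNORE (SQLite-specific) silently skips a conflicting row instead of raising.
conn.execute("INSERT OR IGNORE INTO contact (name, address) VALUES (?, ?)", ("Ann", "Oslo"))
conn.execute("INSERT OR IGNORE INTO contact (name, address) VALUES (?, ?)", ("Ann", "Rome"))
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM contact").fetchone()[0]
print(duplicate_rejected, row_count)  # True 2
```

The index can only be created once the existing duplicates are gone, so it is normally added right after a cleanup.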
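b. The second kind of duplication usually requires keeping the first record of each group of duplicates. The procedure is as follows.
Assume the duplicated fields are Name and Address, and the goal is a result set in which this pair of fields is unique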
select identity(int,1,1) as autoID, * into #Tmp from tableName
select min(autoID) as autoID into #Tmp2 from #Tmp group by Name,Address
select * from #Tmp where autoID in(select autoID from #Tmp2)
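The three steps above (number the rows, take the smallest number per Name/Address group, select those rows) can be approximated on SQLite as below. This is a sketch under assumptions: identity(int,1,1) is replaced by an AUTOINCREMENT column, the #Tmp tables by TEMP tables, and the customer table with its Phone column is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (Name TEXT, Address TEXT, Phone TEXT)")
conn.executemany(
    "INSERT INTO customer (Name, Address, Phone) VALUES (?, ?, ?)",
    [
        ("Ann", "Oslo", "111"),
        ("Ann", "Oslo", "222"),
        ("Bob", "Rome", "333"),
        ("Ann", "Rome", "444"),
        ("Bob", "Rome", "555"),
    ],
)
conn.commit()

conn.executescript(
    """
    -- Step 1: copy the rows into a temp table that numbers them 1, 2, 3, ...
    CREATE TEMP TABLE tmp (
        autoID INTEGER PRIMARY KEY AUTOINCREMENT,
        Name TEXT, Address TEXT, Phone TEXT
    );
    INSERT INTO tmp (Name, Address, Phone)
        SELECT Name, Address, Phone FROM customer ORDER BY rowid;
    -- Step 2: remember the smallest autoID of every (Name, Address) group.
    CREATE TEMP TABLE tmp2 AS
        SELECT MIN(autoID) AS autoID FROM tmp GROUP BY Name, Address;
    """
)

# Step 3: keep only the first row of each group.
first_rows = conn.execute(
    "SELECT Name, Address, Phone FROM tmp "
    "WHERE autoID IN (SELECT autoID FROM tmp2) ORDER BY autoID"
).fetchall()
print(first_rows)
# [('Ann', 'Oslo', '111'), ('Bob', 'Rome', '333'), ('Ann', 'Rome', '444')]
```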
3. Example:
Contents of table A:
ID RQ SJ C
--------------------------------------------
1 2010-07-14 14:20:50 A1
2 2010-02-15 05:12:23 A1
3 2010-07-14 14:20:50 A1
4 2010-06-16 16:16:16 A2
5 2010-06-16 16:16:16 A2
6 2010-05-18 05:10:35 A3
7 2010-02-15 05:12:23 A1
--------------------------------------------
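Wanted: a single SQL statement that deletes the records in table A whose three fields RQ, SJ and C duplicate those of another record.
Expected result: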
ID RQ SJ C
--------------------------------------------
1 2010-07-14 14:20:50 A1
2 2010-02-15 05:12:23 A1
4 2010-06-16 16:16:16 A2
6 2010-05-18 05:10:35 A3
--------------------------------------------
Delete from A Where ID Not In (Select Min(ID) from A Group By RQ,SJ,C )
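-- Alternative: self-join delete (SQL Server syntax); removes every row that has a matching row with a smaller ID
Delete a from A as a inner join A as b on a.ID > b.ID and a.C = b.C and a.RQ = b.RQ and a.SJ = b.SJ
-- Alternative: correlated EXISTS (Oracle-style table alias)
delete from A t
where exists (select 1 from A b where b.ID < t.ID and b.SJ = t.SJ and b.RQ = t.RQ and b.C = t.C)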
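To check the statements against the sample data, the sketch below loads the seven rows of table A into an in-memory SQLite database and runs two of the variants on separate copies. SQLite does not support the DELETE ... JOIN form, so only the MIN(ID) and EXISTS variants are exercised; the helper names are hypothetical.

```python
import sqlite3

ROWS = [
    (1, "2010-07-14", "14:20:50", "A1"),
    (2, "2010-02-15", "05:12:23", "A1"),
    (3, "2010-07-14", "14:20:50", "A1"),
    (4, "2010-06-16", "16:16:16", "A2"),
    (5, "2010-06-16", "16:16:16", "A2"),
    (6, "2010-05-18", "05:10:35", "A3"),
    (7, "2010-02-15", "05:12:23", "A1"),
]

# Variant 1: keep the smallest ID of every (RQ, SJ, C) group.
KEEP_MIN_ID = "DELETE FROM A WHERE ID NOT IN (SELECT MIN(ID) FROM A GROUP BY RQ, SJ, C)"

# Variant 2: delete a row when an identical row with a smaller ID exists.
# The inner table is aliased as b, so A.ID refers to the row being deleted.
EXISTS_SMALLER = (
    "DELETE FROM A WHERE EXISTS (SELECT 1 FROM A AS b "
    "WHERE b.ID < A.ID AND b.RQ = A.RQ AND b.SJ = A.SJ AND b.C = A.C)"
)


def remaining_ids(delete_sql):
    """Load a fresh copy of table A, run one DELETE, and return the surviving IDs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE A (ID INTEGER PRIMARY KEY, RQ TEXT, SJ TEXT, C TEXT)")
    conn.executemany("INSERT INTO A (ID, RQ, SJ, C) VALUES (?, ?, ?, ?)", ROWS)
    conn.execute(delete_sql)
    ids = [row[0] for row in conn.execute("SELECT ID FROM A ORDER BY ID")]
    conn.close()
    return ids


ids_min = remaining_ids(KEEP_MIN_ID)
ids_exists = remaining_ids(EXISTS_SMALLER)
print(ids_min, ids_exists)  # [1, 2, 4, 6] [1, 2, 4, 6]
```

Both variants leave IDs 1, 2, 4 and 6, which matches the expected result table.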