SQL SERVER – Delete Duplicate Records – Rows

March 1, 2007

Following code is useful to delete duplicate records. The table must have identity column, which will be used to identify the duplicate records. Table in example is has ID as Identity Column and Columns which have duplicate data are DuplicateColumn1, DuplicateColumn2 and DuplicateColumn3.

DELETE FROM MyTable WHERE ID NOT IN ( SELECT MAX(ID) FROM MyTable GROUP BY DuplicateColumn1, DuplicateColumn2, DuplicateColumn3)

Watch the view to see the above concept in action:

[youtube=http://www.youtube.com/watch?v=ioDJ0xVOHDY]

Reference : Pinal Dave (https://blog.sqlauthority.com)

Duplicate Records, SQL Scripts

SQL SERVER – T-SQL Script to find the CD key from Registry

SQL SERVER – QUOTED_IDENTIFIER ON/OFF and ANSI_NULL ON/OFF Explanation

MySQL Performance – Slow Query and innodb_buffer_pool_size

August 9, 2021

SQL SERVER – Error 33222, Level 16 – Audit ‘MyAudit’ Failed to Start

March 31, 2017

SQL SERVER – Query to Find Seed Values, Increment Values and Current Identity Column value of the table

April 23, 2007

450 Comments. Leave new

Divya Mahendra Sikarwar
February 16, 2009 2:17 am
Execute these queries for learning how to remove duplicate record.
CREATE TABLE dbo.Test1 (
[ID] [int] ,
[FirstName] [varchar](25),
[LastName] [varchar](25)
) ON [PRIMARY]
INSERT INTO Test1 VALUES(1, ‘Bob’,’Smith’)
INSERT INTO Test1 VALUES(2, ‘Dave’,’Jones’)
INSERT INTO Test1 VALUES(3, ‘Karen’,’White’)
INSERT INTO Test1 VALUES(1, ‘Bob’,’Smith’)
INSERT INTO Test1 VALUES(4, ‘Bobby’,’Smita’)
select identity(int,1,1) as SlNo,* into #temp from Test1
DELETE
FROM #temp
WHERE SlNo NOT IN
(
SELECT MAX(SlNo)
FROM #temp
GROUP BY ID,FirstName,lastname
)
drop table test1
select * into test1 from #temp
alter table test1 drop column SlNo
select * from test1 order by id
Reply
Praney
February 24, 2009 8:25 pm
–Try this to delete multiple records
;with t1 as
(
select id,Row_number() over (partition by a, b, c order by a, b, c) as rnum
from mytable with(nolock)
)
delete from t1 where rnum>1
Reply
gaurav moolani
March 1, 2009 9:57 am
hi guys
i have a problem ??? what if the entire row is duplicate ??
with i dentity column also same .
i am transfering data from a table having duplicate rows to a table having a primary key
pls help me out
Reply
doug in fl
March 2, 2009 6:36 am
gaurav,
Praney’s sql in the comments will help you out. I have the same situation and that solved the problem.
;with t1 as
(
select id,Row_number() over (partition by a, b, c order by a, b, c) as rnum
from mytable with(nolock)
)
delete from t1 where rnum>1
Reply
Luke
March 4, 2009 12:25 pm
Hi Gaurav,
I have duplicate records in a table,till recently i added a
datetime field in my table called timestamp.Now when
duplicates go into this table i will be able to see the current
date in the timestamp
Firstname Lastname memberno timestamp
Jack Hill 11111 2009-12-11
Jack Hill 11111 null
Simon Philip 222222 2009-12-11
Simon Philip 222222 2009-12-12
Sam Wheat 33333 nulll
Sam Wheat 33333 null
In this table i want to fetch a duplicate set which has
one record with a valid timestamp and other timestamp
as null – something like
Firstname Lastname memberno timestamp
Jack Hill 11111 2009-12-11
Jack Hill 11111 null
select max(timestamp) from member group by firstname,lastname having count(*)>1
How do i modify my display duplicate statement to have the resultset which includes duplicates with and without
a valid timestamp
Reply
Luke
March 4, 2009 12:28 pm
Hi Pinal
With reference to the above question
Can anyone help me out with a query which can display only duplicates with or witout a timestamp, they should
not include duplicates members all having timestamps
Reply
tejasnshah
March 4, 2009 5:06 pm
Hi Luke,
You can use ROW_Number() to identify Duplicate Rows
Here You need to do as:
;with CTE AS(
select Row_NUmber() (OVER PARTITION BY FirstName, LastName, MemberNo Order BY TimeStamp ASC) AS RowID
FROM table
)
select * from cte where rowID=1
this will give you Records with NULL. If you want data with Dates then you need to change “ORDER BY” to “Order BY TimeStamp DESC”
For further details, you can read it on my blog:
Thanks,
Tejas
Reply
Krish
March 13, 2009 6:16 pm
Hi Pinal,
Your articles have always helped me. Just curious if I could rationalize data in the example below using a variant of this example (I have an option to use SSIS Fuzzy lookup, but its way off my application scope. So it is ruled out.)
Cheers,
Krish
Reply
VIKAS
April 23, 2009 12:47 pm
HI PINAL,
I HAVE READ YOUR ARTICLES AND RESPONSES FOR THE QUERIES AND I FOUND IT GOOD.
I TOO HAVE ONE QUERY.
THERE IS TABLE1 AND TABLE2
TABLE 1 CONTAINS 100 RECORDS.
TABLE 2 CONTAINS 150 RECORDS.
TABLE 2 CONTAINS SOME RECORDS WHICH ARE ALREADY PRESENT IN TABLE 1.
NOW WHEN I MERGE BOTH IT ADDS ALL THE RECORDS GIVING TOTAL NO. TO 250.
FROM THESE I WANT TO REMOVE THE DUPLICATE DATA WHICH I GOT FROM TABLE 2 AND ARE ALREADY PRESET IN TABLE 1.
I have tried above methods but did not yield required results.
Be Helpful…
Reply
VIKAS
April 23, 2009 4:49 pm
I got the solution of above problem
FRIEND in a NEED
Another Question.
i am joining TABLE1 (older) And TABLE2 (newer).
i found some records which are common in both table.
i want to remove found common data from TABLE2(newer)
How can i perform that operation?
GENERAL QUESTION
How can we perform the Update,Delete operation on query of join
–common records in two tables using inner join
select *
from TABLE1 as e
inner join TABLE2 as c
on e.respondentid = c.respondentid
Reply
Brian Tkatch
April 23, 2009 5:33 pm
Vikas, will UNION work for you?
SELECT … FROM Table_1
UNION
SELECT … FROM Table_2
Reply
Brian Tkatch
April 23, 2009 6:36 pm
Vikas, for the second question:
DELETE FROM Table2 WHERE EXISTS(SELECT * FROM Table1 WHERE Table1.respondentid = Table2.respondentid)
Reply
VIKAS
April 24, 2009 2:47 pm
thnx buddy tkatch.
Reply
VIKAS
April 28, 2009 2:42 pm
i have table TABLE1 with data inside.
i want to add one column name as “Serial Number”.
inside this i want to give Automatic Serial No.
Say there are 5000 records then the Serial Number Column should contain1,2,3,4…..5000 at last record.
Be Beedful
Reply
RAJ
April 28, 2009 3:02 pm
i got the solution of above query.
“alter table table1 add ID INT IDENTITY(1,1)”
But this inserts the column at the end.
i want this column to be inserted at the start.
i.e. First Column should be of identity Column
Reply
Nayan
April 29, 2009 2:02 pm
i have a table TABLE1 of Five Column.
I want to insert one more column in this table after First Column.
How can i insert the New Column from SQL Editor in between the Columns Present not at the end?
Reply
Sandy
April 29, 2009 2:24 pm
Hi Pinal,
I HAVE READ YOUR ARTICLES AND RESPONSES FOR THE QUERIES AND I FOUND IT GOOD.
Actully i m new in SQL SERVER2005.
I stucked in a query, if find the solution:-
Table1 Table2
col1|col2 col1|col2
10 100 10 100
11 150 15 200
15 200 13 210
21 210 23 100
Output should be:
col1|col2
10 100
11 150
21 210
13 210
23 100
Reply
Sandy
April 29, 2009 2:26 pm
Hi Pinal,
i have one more query:
Delete the duplicates rows from a table if there is not any primary key.
Please find the solution and Thanks in advance.
Reply
Gates
May 1, 2009 12:30 pm
To aLL,
I have one table TABLE! , which contains one column named “EMPLOYEE NAME”.
There are various values inside it.
i want to check every field of EMPLOYEE NAME and find out if there is any Numeric Value inside.
EG.
EMPLOYEE NAME
Billgates
George
A1dam –Here 1 is present in between
Gem9ini –Here 9 is present in between
i want to identify this types of name which contains the numeric value in the field.
Reply
Domenic
May 6, 2009 9:30 pm
using the instructions that you posted about adding an id column:
“Good Question. I have been asking this question in interview many times to new candidates. Answer is:
1) Add Identity Col and perform the operation displayed in this blog and drop Identity Col.
2) User Cursor and save your values in temp table. If you receive the same row which you put in temp table delete it.
3) You can use while loop as well do the same as cursor”
i get the error message from sql telling me it cannot insert the value NULL into column “id” column does not allow nulls (i have got this after trying to set the primary key on the id field)
Reply