SQL SERVER – Delete Duplicate Rows

June 23, 2009

I had previously penned down two popular snippets regarding deleting duplicate rows and counting duplicate rows. Today, we will examine another very quick code snippet where we will delete duplicate rows using CTE and ROW_NUMBER() feature of SQL Server 2005 and SQL Server 2008.

This method is improved over the earlier method as it not only uses CTE and ROW_NUMBER, but also demonstrates the power of CTE with DELETE statement. We will have a comprehensive discussion about it later in this article. For now, let us first create a sample table from which we will delete records.

/*Create Table with 7 entries - 3 are duplicate entries */
CREATE TABLE DuplicateRcordTable (Col1 INT, Col2 INT)
INSERT INTO DuplicateRcordTable
SELECT 1, 1
UNION ALL
SELECT 1, 1 --duplicate
UNION ALL
SELECT 1, 1 --duplicate
UNION ALL
SELECT 1, 2
UNION ALL
SELECT 1, 2 --duplicate
UNION ALL
SELECT 1, 3
UNION ALL
SELECT 1, 4
GO

The above table has total 7 records, out of which 3 are duplicate records. Once the duplicates are removed we will have only 4 records left.

/* It should give you 7 rows */
SELECT *
FROM DuplicateRcordTable
GO

SQL SERVER - Delete Duplicate Rows dup1

The most interesting part of this is yet to come. We will use CTE that will re-generate the same table with additional column, which is row number. In our case, we have Col1 and Col2 and both the columns qualify as duplicate rows. It may be a different set of rows for each different query like this. Another point to note here is that once CTE is created DELETE statement can be run on it. We will put a condition here – when we receive more than one rows of record, we will remove the row which is not the first one. When DELETE command is executed over CTE it in fact deletes from the base table used in CTE.

/* Delete Duplicate records */
WITH CTE (COl1,Col2, DuplicateCount)
AS
(
SELECT COl1,Col2,
ROW_NUMBER() OVER(PARTITION BY COl1,Col2 ORDER BY Col1) AS DuplicateCount
FROM DuplicateRcordTable
)
DELETE
FROM CTE
WHERE DuplicateCount > 1
GO

It is apparent that after delete command has been run, we will have only 4 records, which is almost the same result which we would have got with DISTINCT, with this resultset. If we had more than 2 columns and we had to run unique on only two columns, our distinct might have not worked here . In this case, we would have to use above the mentioned method.

/* It should give you Distinct 4 records */
SELECT *
FROM DuplicateRcordTable
GO

SQL SERVER - Delete Duplicate Rows dup2

This method is a breeze and we can use this for SQL Server version 2005 and the later versions.

Watch the view to see the above concept in action:

Reference : Pinal Dave (https://blog.sqlauthority.com)

Duplicate Records, SQL Function, SQL Scripts, SQL Server

SQL SERVER – Interesting Observation – Using sqlcmd From SSMS Query Editor

SQL SERVER – Connecting to Server Using Windows Authentication by SQLCMD

SQL SERVER – Stored Procedure Parameters and Single Quotes

December 10, 2020

SQL SERVER – Color Coding SQL Server Management Studio Status Bar – SQL in Sixty Seconds #023 – Video

August 22, 2012

SQL SERVER – FIX: Msg 15170, Level 16, This login is the Owner of 1 Job(s). You Must Delete or Reassign these Jobs Before the Login can be Dropped

June 21, 2017

146 Comments. Leave new

Michele
February 27, 2010 3:04 am
THANK YOU VERY MUCH!!! Your sharing saves me so much time and helps me resolve the issue quickly.
Reply
Ramavtar Rajput
March 3, 2010 3:57 pm
delete a from tr_myactivity a where (select count(*) from tr_myactivity b where a.activitydate = b.activitydate)>1
Reply
- Madhivanan
  March 10, 2010 3:53 pm
  Did you test your query with row_number() approach?
  I think your method may take more time on the large set of data
  Reply
Lalit
March 23, 2010 5:01 pm
Really a Good article for beginner.
Reply
veeru
April 15, 2010 5:12 pm
try this:
create table dupdata(id int,name varchar(30),sal money)
insert into dupdata values(1,’veeru’,2100)
insert into dupdata values(1,’veeru’,2100)
insert into dupdata values(2,’sud’,2100)
insert into dupdata values(2,’sud’,2100)
alter table dupdata add rowid int identity(1,1)
delete from dupdata where rowid in(select max(rowid) from dupdata group by id,name,sal
having count(*)>1)
alter table dupdata drop column rowid
select * from dupdata
Reply
John
May 8, 2010 2:20 am
This CTE trick is the best thing since sliced bread!
Unfortunately, I had invested (Wasted) several hours trying to delete duplicate rows by using inner join and by first inserting all the duplicates into a Temp table. Whew!
This may have saved my sanity :)
THANK YOU Dave!
Reply
Conrad
May 20, 2010 11:03 am
Exactly what I needed. THANK YOU VERY MUCH!
Reply
Awais
July 6, 2010 12:34 am
Can any body guide how to use CTE if table has several columns. I am using following query but getting error.
WITH CTE ([MSC-S KPI],*,DuplicateCount)
AS
(
SELECT [MSC-S KPI],*
ROW_NUMBER() OVER(PARTITION BY [MSC-S KPI],* ORDER BY [MSC-S KPI]) AS DuplicateCount
FROM dbo.KPI_pl$
)
DELETE
FROM CTE
WHERE DuplicateCount > 1
Go
Msg 102, Level 15, State 1, Line 1
Incorrect syntax near ‘*’.
Msg 102, Level 15, State 1, Line 5
Incorrect syntax near ‘ROW_NUMBER’.
Is it necessary to enlist all the columns in CTE expression.?
Reply
sam
July 8, 2010 6:18 am
How can I add mins on current time and display
time incrementing by added mins
For eg:
13:25 + 15 mins = 13:14
or
13:45 + 15 mins = 14:00
or
13:50 + 15 mins = 14:05
Thank you in advance
Sam
Reply
- sakaravarthi
  September 30, 2011 2:16 pm
  select GETDATE(),DATEADD(Minute,15,getdate())
  Reply
sam
July 15, 2010 5:01 am
I actually figured out the answer:
Select Convert(char(5),DateAdd(mi,15,GetDate()),114)
Reply
happy
September 20, 2010 9:54 am
hello sir
your duplicate rows query is very best
sir i have 6 column in table (employee)
column name
ip_code
11–12345656
11-23657896
11-34784523
11-09864584
11-54830765
11-67094587
means total column 200 only
i want a that only 5 interger value come in output
means
11-123
11-236
11-347
11-098
11-548
plz sir answer this question becoz this query is very important for me
thankyou
pinal dave (sir)
Reply
- sakaravarthi
  September 30, 2011 2:19 pm
  select top 10 LEFT(,20) from
  Reply
  - sakaravarthi
    September 30, 2011 2:19 pm
    select LEFT(,20) from
Krishna
October 19, 2010 7:10 pm
thnx ..nice post..
Reply
Subi
November 11, 2010 11:36 pm
Very short and impressive solution. Thanks a lot.
Reply
pravin
December 4, 2010 3:39 pm
nice post …
this is an interview question ..
thanks..
Reply
honeyshyam
February 7, 2011 5:23 pm
Very Useful article.
Thanks
Shyam
Reply
pankaj
February 18, 2011 7:06 pm
this is very good article.
Reply
RijalMohammed
April 12, 2011 10:24 am
I have a result set from aTable Colors like follows which having coloms Name and Color
Name color
A Green
A RED
A Yellow
Davis Bloon
Davis Red
I want to replace that the result set as follows.
Name color
A Green
RED
Yellow
Davis Bloon
Red
Reply
- madhivanan
  April 12, 2011 1:40 pm
  This is the presentation issue. Where do you want to show such formatted data?
  Reply
- sakaravarthi
  September 30, 2011 2:44 pm
  WITH cte (clr,ROwNum)as
  (SELECT clr,row_Number() OVER(PARTITION BY LEFT(clr,CHARINDEX(SPACE(1),clr)-1) ORDER BY LEFT(clr,CHARINDEX(SPACE(1),clr)-1)) FROM color)
  –select * from cte where ROwNum>1
  –select clr,CHARINDEX(SPACE(1),clr),
  –LEFT(clr,CHARINDEX(SPACE(1),clr)-1),
  UPDATE cte SET clr=Right(clr,len(clr)-CHARINDEX(SPACE(1),clr))–AS ToUpdate from cte
  WHERE ROwNum>1
  Reply
Manish
May 10, 2011 9:35 am
This post is very very helpful.
Reply
aryan
May 20, 2011 11:41 am
Hi Panel but i still confuse about ur query
i try at home and get the result
but i don’ t understand why
we write
DELETE FROM CTS WHERE DC>1
it means the delete the records where dc is greater than 1
but in ur row number column no records is greater than 1
then how ur query work
Please help me out
Short Knowledge is Bigger hurdle in Success.
Reply
eddie jasinski
June 7, 2011 7:45 pm
Great post!!
Thanks
Reply
l
June 19, 2011 10:16 pm
what is CTE? Can you please elaborate
Reply
- sakaravarthi
  September 30, 2011 2:46 pm
  comman Table Expression
  https://docs.microsoft.com/en-us/previous-versions/sql/sql-server-2008-r2/ms190766(v=sql.105)
  Reply