SQL SERVER – Delete Duplicate Records – Rows

March 1, 2007

Following code is useful to delete duplicate records. The table must have identity column, which will be used to identify the duplicate records. Table in example is has ID as Identity Column and Columns which have duplicate data are DuplicateColumn1, DuplicateColumn2 and DuplicateColumn3.

DELETE FROM MyTable WHERE ID NOT IN ( SELECT MAX(ID) FROM MyTable GROUP BY DuplicateColumn1, DuplicateColumn2, DuplicateColumn3)

Watch the view to see the above concept in action:

[youtube=http://www.youtube.com/watch?v=ioDJ0xVOHDY]

Reference : Pinal Dave (https://blog.sqlauthority.com)

Duplicate Records, SQL Scripts

SQL SERVER – T-SQL Script to find the CD key from Registry

SQL SERVER – QUOTED_IDENTIFIER ON/OFF and ANSI_NULL ON/OFF Explanation

SQL SERVER – Denali – Introduction to SEQUENCE – Simple Example of SEQUENCE

January 29, 2011

SQL SERVER – Event ID 7000 – The System Cannot Find the File Specified

January 16, 2020

SQL SERVER – Comma Separated Values (CSV) from Table Column – Part 2

December 21, 2009

450 Comments. Leave new

Ashish Gilhotra
February 2, 2010 10:17 am
@Rathin
You can do something like that
Select * from
(SELECT *,row_number() Over(partition by products_model order by products_last_modified desc) as RN from youttable)t
where t.RN=1
Reply
Gulshan Bareja
February 2, 2010 6:46 pm
Its really good…I was looking for delete duplicate record query finally I got this here.
Thanks
Gulshan Bareja
Reply
rathin
February 2, 2010 8:50 pm
hi,
Sorry i am mentioned that having the problem with Mysql DB
(The problem is i am having the columns of products_model, products_id(primary key), products_last_modified_date……etc
Here i have duplicate records of products_model, so i needs to remove old duplicate records and retain the new records based on products_last_modified_date column
kindly replay me)
Is it possible , can we solve this by using Group by & having function?
Reply
Souvik
February 5, 2010 5:46 pm
Hi Pinal,
Thanks for your post. I have another idea.
It can also be done in this way and it will work even if the Table does not have any identity column.
Query:
WITH xTab AS
(
SELECT
DuplicateColumn1,
DuplicateColumn2,
ROW_NUMBER() OVER (PARTITION BY DuplicateColumn3 ORDER BY Column3) AS [Count]
FROM MyTable
)
DELETE FROM xTab WHERE [Count] > 1
Reply
- Pinal Dave
  February 5, 2010 10:25 pm
  Hello Souvik,
  Thank you for sharing the method of getting duplicate using CTE. I appreciate this method as it is easy and better.
  But the PARTITION BY clause should include all columns of table while you have used only one column.
  Regards,
  Pinal Dave
  Reply
Souvik
February 5, 2010 5:48 pm
Hi,
Please replace the sixth line of the query with this:
ROW_NUMBER() OVER (PARTITION BY DuplicateColumn3 ORDER BY DuplicateColumn3 )
Reply
pramod
March 17, 2010 11:41 am
Deleting all the duplicate rows from a table keeping only the distinct values(try out this).
create table test_a
(a int
)
insert into test_a values(1)
insert into test_a values(1)
insert into test_a values(3)
insert into test_a values(4)
insert into test_a values(5)
insert into test_a values(1)
insert into test_a values(2)
insert into test_a values(3)
insert into test_a values(4)
insert into test_a values(5)
WITH [test_a rowid]
AS (SELECT a.a,
Row_number()
OVER(ORDER BY a.a) num
FROM test_a a)
DELETE FROM [test_a rowid]
WHERE num NOT IN (SELECT Rank()
OVER(ORDER BY b.a)
FROM test_a b)
Reply
gaurav
April 13, 2010 3:48 pm
very very good query
Reply
lakshmi
April 14, 2010 11:59 am
very nice query..
thanks..
Reply
Rajesh
April 21, 2010 11:38 am
This is a nice blog.Provide some assignment to get sql skills.This is a nice blog
Reply
William Walsh
May 12, 2010 2:10 am
Here’s my dynamic dupes removal script. All it requires is the table name, make sure you have your db selected in the query editor dropdown. Just make @tablename=” and this’ll take care of the rest.
Set ansi_nulls ON if you want the script to ignore null values. If some of your dupes include nulls then leave it set to ansi_nulls off.
script:
—————————————————————–
DECLARE @tablename VARCHAR(255)
DECLARE @columnnames VARCHAR(MAX)
DECLARE @maxlength VARCHAR(max)
DECLARE @datatype VARCHAR(max)
DECLARE @columnlist VARCHAR(MAX)
DECLARE @select VARCHAR(MAX)
DECLARE @rowcount int
DECLARE @variablelist VARCHAR(MAX)
DECLARE @currentvariablename VARCHAR(MAX)
DECLARE @declarestatement VARCHAR(MAX)
DECLARE @singledeclare VARCHAR(MAX)
DECLARE @FINALscript VARCHAR(MAX)
DECLARE @wherestatement VARCHAR(MAX)
DECLARE @ansinulls VARCHAR(MAX)
SET @ansinulls = ‘SET ANSI_NULLS OFF’
SET @tablename=’tSTG_CNT_MVDB_SurveyExtract_Presurvey’
SET @columnlist= ”
SET @wherestatement = ”
SET @rowcount = ”
SET @maxlength = ‘asdfadf’
SET @columnnames = ”
SET @datatype = ”
SET @select = ”
SET @variablelist = ”
SET @currentvariablename = ”
SET @declarestatement = ”
SET @singledeclare = ”
SET @FINALscript = ”
SET @rowcount = (SELECT COUNT(column_name)
FROM information_schema.columns
WHERE table_name =@tablename)
DECLARE dublicate_cursor1 CURSOR READ_ONLY FOR
SELECT column_name,
DATA_TYPE,
CHARACTER_MAXIMUM_LENGTH
FROM information_schema.columns
WHERE table_name = @tablename
OPEN dublicate_cursor1
FETCH NEXT FROM dublicate_cursor1 INTO @columnnames,@datatype,@maxlength
WHILE @@FETCH_STATUS = 0
BEGIN
SET @currentvariablename = ‘@’ + @columnnames
IF @rowcount = 1
BEGIN
SET @columnlist = @columnlist + @columnnames
SET @variablelist = @variablelist + @currentvariablename + ‘,@Count’
IF @maxlength IS NULL
Begin
SET @singledeclare = ‘DECLARE ‘ + @currentvariablename + ‘ ‘ + @datatype
END
ELSE
BEGIN
SET @singledeclare = ‘DECLARE ‘ + @currentvariablename + ‘ ‘ + @datatype + ‘ (‘ + @maxlength + ‘)’
END
SET @declarestatement = @declarestatement + @singledeclare + ‘ DECLARE @Count int’
SET @wherestatement = @wherestatement + @columnnames + ‘=’ + @currentvariablename
END
ELSE
BEGIN
–PRINT ‘ELSE’
SET @columnlist = @columnlist + @columnnames + ‘,’
SET @variablelist = @variablelist + @currentvariablename + ‘,’
IF @maxlength IS NULL
Begin
SET @singledeclare = ‘DECLARE ‘ + @currentvariablename + ‘ ‘ + @datatype
END
ELSE
BEGIN
SET @singledeclare = ‘DECLARE ‘ + @currentvariablename + ‘ ‘ + @datatype + ‘ (‘ + @maxlength + ‘)’
END
SET @declarestatement = @declarestatement + ‘ ‘ + @singledeclare + ‘ ‘
SET @wherestatement = @wherestatement + @columnnames + ‘=’ + @currentvariablename + ‘ AND ‘
END
SET @rowcount = @rowcount – 1
FETCH NEXT FROM dublicate_cursor1 INTO @columnnames,@datatype,@maxlength
END
CLOSE dublicate_cursor1
DEALLOCATE dublicate_cursor1
SET @select = ‘SELECT ‘ + @columnlist + ‘,COUNT(*) -1 FROM ‘ + @tablename + ‘ GROUP BY ‘ + @columnlist + ‘ HAVING Count(*) > 1’
SET @FINALscript = @ansinulls + @declarestatement
SET @FINALscript = @FINALscript + ‘ DECLARE dublicate_cursor CURSOR FAST_FORWARD FOR ‘ + @select
SET @FINALscript = @FINALscript + ‘ OPEN dublicate_cursor ‘ + ‘FETCH NEXT FROM dublicate_cursor INTO ‘ + @variablelist
SET @FINALscript = @FINALscript + ‘ WHILE @@FETCH_STATUS = 0 BEGIN ‘ + ‘DELETE TOP(@Count) FROM ‘ + @tablename
SET @FINALscript = @FINALscript + ‘ WHERE ‘ + @wherestatement
SET @FINALscript = @FINALscript + ‘ FETCH NEXT FROM dublicate_cursor INTO ‘ + @variablelist + ‘ END ‘
SET @FINALscript = @FINALscript + ‘CLOSE dublicate_cursor DEALLOCATE dublicate_cursor’
EXEC(@FINALscript)
Reply
Sumit
May 18, 2010 11:33 am
How to delete the last row from a table in SQL Server 2005 without knowing any column name,u know only table name and it should b in single query.
i have written this:
with t as
(
select row_number() OVER(ORDER BY (select count(*) from Emp)) as Ranking from Emp
)
delete from t where Ranking = (Select max(Ranking) from t)
this one working fine,but i need it in a single query
..
can any one help me in this.
Thanks & regards,
Sumit
Reply
- Madhivanan
  May 18, 2010 12:23 pm
  It is not possible
  Why do you want to do this in that way?
  Reply
  - Sumit
    May 18, 2010 3:47 pm
    it is possible i hav already done it but it is not in single query.It is just for my research purpose..
    can u please help me out..
    thanks,
    Sumit
  - Madhivanan
    May 18, 2010 4:30 pm
    What type of research are you doing?
    As you showed in the last example it possible with row_number() only
    Only possible method I can think of is
    delete t from
    (
    select row_number() OVER(ORDER BY (select 0)) as Ranking from emp
    ) as t
    where ranking=1
  - Sumit
    May 18, 2010 5:57 pm
    thanks 4 ur reply,but it is deleting the first row not the last row.
    I am trying to take out the most from sql server 2005 which in future i can do some work with that..
    any more suggestion plz..
    tahnks,
    Sumit
Sumit
May 18, 2010 3:49 pm
Hi Pinal,
Can u please help me regarding my query…
Thanks,
Sumit
Reply
- Madhivanan
  May 18, 2010 4:31 pm
  See my previous reply to you
  Reply
  - Sumit
    May 18, 2010 6:10 pm
    Final query will be like below:
    delete t from
    (
    select top 1 row_number() OVER(ORDER BY (select 0)) as Ranking from emp order by Ranking Desc
    ) as t
    thanks for your help
    Regards,
    Sumit
Sumit
May 18, 2010 7:21 pm
Done:
The final query will be like below:
delete t from
(
select top 1 row_number() OVER(ORDER BY (select 0)) as Ranking from emp
order by Ranking Desc
) as t
Reply
Sumit
May 21, 2010 5:55 pm
when we declare a constraint without giving name to it,then the system automatically n dynamically generates a name.
Now when we want to drop that constraint we have to get the name from system,is their any way to get that name directly from the system………
plzzzz help
thanks,
Sumit
Reply
- Madhivanan
  May 21, 2010 6:42 pm
  sp_help 'table_name'
  will list out the available constraints too
  Reply
Satyendra
May 26, 2010 3:35 pm
Hi Rahul ,
Your query is…
select * from testprice
group by id,item,price
Reply
- Madhivanan
  May 26, 2010 8:09 pm
  This may not work
  You need to explicitely name the columns
  Reply
Magesh Murugesan
June 10, 2010 11:39 am
if there is no key in the table then what ?
I have on table named [Duplicate] {ID int,FNAME varchar(10),MNAME varchar(10)}
Here there is no key and here are duplicate rows. so hoca can i delete this duplicate rows.
Check Data
ID FNAME LNAME
1 AAA CCC
2 BBB DDD
1 AAA CCC
2 BBB DDD
1 AAA CCC
2 BBB DDD
3 BCB DGD
Remove duplicate rows and keep the data in to the table like this using single query.
ID FNAME LNAME
1 AAA CCC
1 AAA CCC
2 BBB DDD
2 BBB DDD
Query :
with RemoveDuplicate as
(
select *,row_number() over(partition by ID,FNAME,LNAME order by ID) as row from DUPLICATE
)
delete from RemoveDuplicate where row>1
Reply
- Azzu
  June 15, 2010 2:13 pm
  thanks guyz it worked now……
  with RemoveDuplicate as
  (
  select *,row_number() over(partition by num order by num) as row from Duplicates
  )
  delete from RemoveDuplicate where row>1
  this is gr8 ……thanks all
  Reply
Azzu
June 15, 2010 2:06 pm
Hi Dave, u r blog is really helpful………….in my interview they asked me this question
table name is ‘Duplicates’
and column name is ‘num’
Select * from Duplicates
num
———–
1
1
1
1
2
2
2
2
2
3
3
3
3
3
(14 row(s) affected)
Q)how to remove duplicates from the table ….I tried u r given query but didnt work …..can u plz guide me the right procedure, thanks
DELETE
FROM Duplicates
WHERE num NOT IN
(
SELECT MAX(num)
FROM Duplicates
GROUP BY num)
when I tried above query its say 0 rows affected
Reply
- Azzu
  June 15, 2010 2:14 pm
  thanks guyz it worked now……
  with RemoveDuplicate as
  (
  select *,row_number() over(partition by num order by num) as row from Duplicates
  )
  delete from RemoveDuplicate where row>1
  this is gr8 ……thanks all
  Reply
Azzu
June 15, 2010 2:40 pm
Hi guyz i am just wondering is this the right method to try …….based on above example…
Select * from Duplicates
select Distinct * into Dup_Rem from Duplicates
Drop table Duplicates
if u want to change Dup_Rem to its original name then I did this
select Distinct * into Duplicates from Dup_Rem
Select * from Duplicates
the above few queries worked perfectly and gave me the same output that want it………any suggestions welcome
Reply
- Madhivanan
  June 15, 2010 6:11 pm
  Use row_number() version
  It will avoid unneccessary creation of temporary tables
  Reply
Azzu
June 15, 2010 7:25 pm
@Madhivanan
thanx for the reply bro…..i didnt know abt this function…..probably i’ll do some research…….do you have any tips for me ….coz in couple of days i am attending technical round….thanks once again…..
Reply
- Madhivanan
  June 16, 2010 11:48 am
  What kinds of tips?
  Search for interview questions here
  Reply
krishna
June 25, 2010 4:04 pm
How to delete duplicate records from the table and there is no
primary key for the table like..
id name gender
1 aaa m
2 bbb f
1 aaa m
3 ccc f
3 ccc f
1 aaa m
I need to delete all the duplicate records such as id 1 is 2 times and id 3 is one time.
thank you
Reply
- Madhivanan
  June 25, 2010 4:24 pm
  If you use version 2005 onwards use
  delete t from
  (
  select row_number() over (partition by col1,col2,col3) as sno,* from table
  ) as t
  where sno>1
  Reply