Following code is useful to delete duplicate records. The table must have identity column, which will be used to identify the duplicate records. Table in example is has ID as Identity Column and Columns which have duplicate data are DuplicateColumn1, DuplicateColumn2 and DuplicateColumn3.
DELETE
FROM MyTable
WHERE ID NOT IN
(
SELECT MAX(ID)
FROM MyTable
GROUP BY DuplicateColumn1, DuplicateColumn2, DuplicateColumn3)
Watch the view to see the above concept in action:
[youtube=http://www.youtube.com/watch?v=ioDJ0xVOHDY]
Reference : Pinal Dave (https://blog.sqlauthority.com)





450 Comments. Leave new
@Rathin
You can do something like that
Select * from
(SELECT *,row_number() Over(partition by products_model order by products_last_modified desc) as RN from youttable)t
where t.RN=1
Its really good…I was looking for delete duplicate record query finally I got this here.
Thanks
Gulshan Bareja
hi,
Sorry i am mentioned that having the problem with Mysql DB
(The problem is i am having the columns of products_model, products_id(primary key), products_last_modified_date……etc
Here i have duplicate records of products_model, so i needs to remove old duplicate records and retain the new records based on products_last_modified_date column
kindly replay me)
Is it possible , can we solve this by using Group by & having function?
Hi Pinal,
Thanks for your post. I have another idea.
It can also be done in this way and it will work even if the Table does not have any identity column.
Query:
WITH xTab AS
(
SELECT
DuplicateColumn1,
DuplicateColumn2,
ROW_NUMBER() OVER (PARTITION BY DuplicateColumn3 ORDER BY Column3) AS [Count]
FROM MyTable
)
DELETE FROM xTab WHERE [Count] > 1
Hello Souvik,
Thank you for sharing the method of getting duplicate using CTE. I appreciate this method as it is easy and better.
But the PARTITION BY clause should include all columns of table while you have used only one column.
Regards,
Pinal Dave
Hi,
Please replace the sixth line of the query with this:
ROW_NUMBER() OVER (PARTITION BY DuplicateColumn3 ORDER BY DuplicateColumn3 )
Deleting all the duplicate rows from a table keeping only the distinct values(try out this).
create table test_a
(a int
)
insert into test_a values(1)
insert into test_a values(1)
insert into test_a values(3)
insert into test_a values(4)
insert into test_a values(5)
insert into test_a values(1)
insert into test_a values(2)
insert into test_a values(3)
insert into test_a values(4)
insert into test_a values(5)
WITH [test_a rowid]
AS (SELECT a.a,
Row_number()
OVER(ORDER BY a.a) num
FROM test_a a)
DELETE FROM [test_a rowid]
WHERE num NOT IN (SELECT Rank()
OVER(ORDER BY b.a)
FROM test_a b)
very very good query
very nice query..
thanks..
This is a nice blog.Provide some assignment to get sql skills.This is a nice blog
Here’s my dynamic dupes removal script. All it requires is the table name, make sure you have your db selected in the query editor dropdown. Just make @tablename=” and this’ll take care of the rest.
Set ansi_nulls ON if you want the script to ignore null values. If some of your dupes include nulls then leave it set to ansi_nulls off.
script:
—————————————————————–
DECLARE @tablename VARCHAR(255)
DECLARE @columnnames VARCHAR(MAX)
DECLARE @maxlength VARCHAR(max)
DECLARE @datatype VARCHAR(max)
DECLARE @columnlist VARCHAR(MAX)
DECLARE @select VARCHAR(MAX)
DECLARE @rowcount int
DECLARE @variablelist VARCHAR(MAX)
DECLARE @currentvariablename VARCHAR(MAX)
DECLARE @declarestatement VARCHAR(MAX)
DECLARE @singledeclare VARCHAR(MAX)
DECLARE @FINALscript VARCHAR(MAX)
DECLARE @wherestatement VARCHAR(MAX)
DECLARE @ansinulls VARCHAR(MAX)
SET @ansinulls = ‘SET ANSI_NULLS OFF’
SET @tablename=’tSTG_CNT_MVDB_SurveyExtract_Presurvey’
SET @columnlist= ”
SET @wherestatement = ”
SET @rowcount = ”
SET @maxlength = ‘asdfadf’
SET @columnnames = ”
SET @datatype = ”
SET @select = ”
SET @variablelist = ”
SET @currentvariablename = ”
SET @declarestatement = ”
SET @singledeclare = ”
SET @FINALscript = ”
SET @rowcount = (SELECT COUNT(column_name)
FROM information_schema.columns
WHERE table_name =@tablename)
DECLARE dublicate_cursor1 CURSOR READ_ONLY FOR
SELECT column_name,
DATA_TYPE,
CHARACTER_MAXIMUM_LENGTH
FROM information_schema.columns
WHERE table_name = @tablename
OPEN dublicate_cursor1
FETCH NEXT FROM dublicate_cursor1 INTO @columnnames,@datatype,@maxlength
WHILE @@FETCH_STATUS = 0
BEGIN
SET @currentvariablename = ‘@’ + @columnnames
IF @rowcount = 1
BEGIN
SET @columnlist = @columnlist + @columnnames
SET @variablelist = @variablelist + @currentvariablename + ‘,@Count’
IF @maxlength IS NULL
Begin
SET @singledeclare = ‘DECLARE ‘ + @currentvariablename + ‘ ‘ + @datatype
END
ELSE
BEGIN
SET @singledeclare = ‘DECLARE ‘ + @currentvariablename + ‘ ‘ + @datatype + ‘ (‘ + @maxlength + ‘)’
END
SET @declarestatement = @declarestatement + @singledeclare + ‘ DECLARE @Count int’
SET @wherestatement = @wherestatement + @columnnames + ‘=’ + @currentvariablename
END
ELSE
BEGIN
–PRINT ‘ELSE’
SET @columnlist = @columnlist + @columnnames + ‘,’
SET @variablelist = @variablelist + @currentvariablename + ‘,’
IF @maxlength IS NULL
Begin
SET @singledeclare = ‘DECLARE ‘ + @currentvariablename + ‘ ‘ + @datatype
END
ELSE
BEGIN
SET @singledeclare = ‘DECLARE ‘ + @currentvariablename + ‘ ‘ + @datatype + ‘ (‘ + @maxlength + ‘)’
END
SET @declarestatement = @declarestatement + ‘ ‘ + @singledeclare + ‘ ‘
SET @wherestatement = @wherestatement + @columnnames + ‘=’ + @currentvariablename + ‘ AND ‘
END
SET @rowcount = @rowcount – 1
FETCH NEXT FROM dublicate_cursor1 INTO @columnnames,@datatype,@maxlength
END
CLOSE dublicate_cursor1
DEALLOCATE dublicate_cursor1
SET @select = ‘SELECT ‘ + @columnlist + ‘,COUNT(*) -1 FROM ‘ + @tablename + ‘ GROUP BY ‘ + @columnlist + ‘ HAVING Count(*) > 1’
SET @FINALscript = @ansinulls + @declarestatement
SET @FINALscript = @FINALscript + ‘ DECLARE dublicate_cursor CURSOR FAST_FORWARD FOR ‘ + @select
SET @FINALscript = @FINALscript + ‘ OPEN dublicate_cursor ‘ + ‘FETCH NEXT FROM dublicate_cursor INTO ‘ + @variablelist
SET @FINALscript = @FINALscript + ‘ WHILE @@FETCH_STATUS = 0 BEGIN ‘ + ‘DELETE TOP(@Count) FROM ‘ + @tablename
SET @FINALscript = @FINALscript + ‘ WHERE ‘ + @wherestatement
SET @FINALscript = @FINALscript + ‘ FETCH NEXT FROM dublicate_cursor INTO ‘ + @variablelist + ‘ END ‘
SET @FINALscript = @FINALscript + ‘CLOSE dublicate_cursor DEALLOCATE dublicate_cursor’
EXEC(@FINALscript)
How to delete the last row from a table in SQL Server 2005 without knowing any column name,u know only table name and it should b in single query.
i have written this:
with t as
(
select row_number() OVER(ORDER BY (select count(*) from Emp)) as Ranking from Emp
)
delete from t where Ranking = (Select max(Ranking) from t)
this one working fine,but i need it in a single query
..
can any one help me in this.
Thanks & regards,
Sumit
It is not possible
Why do you want to do this in that way?
it is possible i hav already done it but it is not in single query.It is just for my research purpose..
can u please help me out..
thanks,
Sumit
What type of research are you doing?
As you showed in the last example it possible with row_number() only
Only possible method I can think of is
delete t from
(
select row_number() OVER(ORDER BY (select 0)) as Ranking from emp
) as t
where ranking=1
thanks 4 ur reply,but it is deleting the first row not the last row.
I am trying to take out the most from sql server 2005 which in future i can do some work with that..
any more suggestion plz..
tahnks,
Sumit
Hi Pinal,
Can u please help me regarding my query…
Thanks,
Sumit
See my previous reply to you
Final query will be like below:
delete t from
(
select top 1 row_number() OVER(ORDER BY (select 0)) as Ranking from emp order by Ranking Desc
) as t
thanks for your help
Regards,
Sumit
Done:
The final query will be like below:
delete t from
(
select top 1 row_number() OVER(ORDER BY (select 0)) as Ranking from emp
order by Ranking Desc
) as t
when we declare a constraint without giving name to it,then the system automatically n dynamically generates a name.
Now when we want to drop that constraint we have to get the name from system,is their any way to get that name directly from the system………
plzzzz help
thanks,
Sumit
sp_help 'table_name'
will list out the available constraints too
Hi Rahul ,
Your query is…
select * from testprice
group by id,item,price
This may not work
You need to explicitely name the columns
if there is no key in the table then what ?
I have on table named [Duplicate] {ID int,FNAME varchar(10),MNAME varchar(10)}
Here there is no key and here are duplicate rows. so hoca can i delete this duplicate rows.
Check Data
ID FNAME LNAME
1 AAA CCC
2 BBB DDD
1 AAA CCC
2 BBB DDD
1 AAA CCC
2 BBB DDD
3 BCB DGD
Remove duplicate rows and keep the data in to the table like this using single query.
ID FNAME LNAME
1 AAA CCC
1 AAA CCC
2 BBB DDD
2 BBB DDD
Query :
with RemoveDuplicate as
(
select *,row_number() over(partition by ID,FNAME,LNAME order by ID) as row from DUPLICATE
)
delete from RemoveDuplicate where row>1
thanks guyz it worked now……
with RemoveDuplicate as
(
select *,row_number() over(partition by num order by num) as row from Duplicates
)
delete from RemoveDuplicate where row>1
this is gr8 ……thanks all
Hi Dave, u r blog is really helpful………….in my interview they asked me this question
table name is ‘Duplicates’
and column name is ‘num’
Select * from Duplicates
num
———–
1
1
1
1
2
2
2
2
2
3
3
3
3
3
(14 row(s) affected)
Q)how to remove duplicates from the table ….I tried u r given query but didnt work …..can u plz guide me the right procedure, thanks
DELETE
FROM Duplicates
WHERE num NOT IN
(
SELECT MAX(num)
FROM Duplicates
GROUP BY num)
when I tried above query its say 0 rows affected
thanks guyz it worked now……
with RemoveDuplicate as
(
select *,row_number() over(partition by num order by num) as row from Duplicates
)
delete from RemoveDuplicate where row>1
this is gr8 ……thanks all
Hi guyz i am just wondering is this the right method to try …….based on above example…
Select * from Duplicates
select Distinct * into Dup_Rem from Duplicates
Drop table Duplicates
if u want to change Dup_Rem to its original name then I did this
select Distinct * into Duplicates from Dup_Rem
Select * from Duplicates
the above few queries worked perfectly and gave me the same output that want it………any suggestions welcome
Use row_number() version
It will avoid unneccessary creation of temporary tables
@Madhivanan
thanx for the reply bro…..i didnt know abt this function…..probably i’ll do some research…….do you have any tips for me ….coz in couple of days i am attending technical round….thanks once again…..
What kinds of tips?
Search for interview questions here
How to delete duplicate records from the table and there is no
primary key for the table like..
id name gender
1 aaa m
2 bbb f
1 aaa m
3 ccc f
3 ccc f
1 aaa m
I need to delete all the duplicate records such as id 1 is 2 times and id 3 is one time.
thank you
If you use version 2005 onwards use
delete t from
(
select row_number() over (partition by col1,col2,col3) as sno,* from table
) as t
where sno>1