Interview Question of the Week #014 – How to DELETE Duplicate Rows

April 5, 2015

SQL, SQL Interview Questions and Answers, SQL Server, SQL Tips and Tricks

Here is another very common question I often received, which candidates face in the interview.

Question: How to delete duplicate rows in SQL Server (considering there is an identity column in table)?

Answer: Well, there is no better answer than following a script. If you ask me this question in the interview, I will even have to look up online and find following script:

DELETE FROM MyTable WHERE ID NOT IN ( SELECT MAX(ID) FROM MyTable GROUP BY DuplicateColumn1, DuplicateColumn2, DuplicateColumn3)

Here is the quick video I have recorded where I demonstrate the same functionality in simple words.

Reference: Pinal Dave (https://blog.sqlauthority.com)

SQL SERVER – An Observation SET vs SELECT in T-SQL

SQL SERVER – Transaction Log Backup Failing After Rebuilding Database Log File

No results found.

8 Comments. Leave new

prithvi
April 5, 2015 1:19 pm
Hi sir,
I believe that this script is good where there is primary key in table. When there is no primary key in table the approach should be changed.
Please provide clarification and best approach.
Reply
- madhivanan
  April 6, 2015 3:37 pm
  Create a serial number using row_number() function and delete the duplicates
  delete t from (select *,row_number() over (partition by keycol order by keycol) as sno from table) as t where sno>1
  Reply
Lokee
April 5, 2015 9:28 pm
Hi,
I have a table where there is no any identity column or any unique column and i want delete duplicate rows. Please help. This table contains about 30 millions data.
Reply
- madhivanan
  April 6, 2015 3:37 pm
  Create a serial number using row_number() function and delete the duplicates
  delete t from (select *,row_number() over (partition by keycol order by keycol) as sno from table) as t where sno>1
  Reply
mekalanaresh0404
April 6, 2015 10:31 am
Hi pinal,
I tried above script to delete my duplicate data,but I could not delete my data.
we can delete duplicate data using cte expression thats easy and fast.
Reply
mekalanaresh0404
April 6, 2015 10:31 am
Hi pinal,
I tried above script to delete my duplicate data,but I could not delete my data.
we can delete duplicate data using cte expression thats easy and fast.
Reply
mekalanaresh0404
April 6, 2015 10:41 am
WITH CTE AS(
SELECT id,name,ROW_NUMBER()OVER(PARTITION BY id ORDER BY id)as rn
FROM dupdata
)
DELETE FROM CTE WHERE rn > 1
Reply
- Pinal Dave
  April 7, 2015 3:29 pm
  Sure mekalanaresh0404
  Reply