Interview Question of the Week #014 – How to DELETE Duplicate Rows

Here is another very common question I often received, which candidates face in the interview.

Question: How to delete duplicate rows in SQL Server (considering there is an identity column in table)?

Answer: Well, there is no better answer than following a script. If you ask me this question in the interview, I will even have to look up online and find following script:

DELETE
FROM
MyTable
WHERE ID NOT IN
(
SELECT MAX(ID)
FROM MyTable
GROUP BY DuplicateColumn1, DuplicateColumn2, DuplicateColumn3)

Here is the quick video I have recorded where I demonstrate the same functionality in simple words.

Reference: Pinal Dave (https://blog.sqlauthority.com)

Previous Post
SQL SERVER – An Observation SET vs SELECT in T-SQL
Next Post
SQL SERVER – Transaction Log Backup Failing After Rebuilding Database Log File

Related Posts

No results found.

8 Comments. Leave new

  • Hi sir,
    I believe that this script is good where there is primary key in table. When there is no primary key in table the approach should be changed.

    Please provide clarification and best approach.

    Reply
    • Create a serial number using row_number() function and delete the duplicates

      delete t from (select *,row_number() over (partition by keycol order by keycol) as sno from table) as t where sno>1

      Reply
  • Hi,

    I have a table where there is no any identity column or any unique column and i want delete duplicate rows. Please help. This table contains about 30 millions data.

    Reply
    • Create a serial number using row_number() function and delete the duplicates

      delete t from (select *,row_number() over (partition by keycol order by keycol) as sno from table) as t where sno>1

      Reply
  • mekalanaresh0404
    April 6, 2015 10:31 am

    Hi pinal,

    I tried above script to delete my duplicate data,but I could not delete my data.
    we can delete duplicate data using cte expression thats easy and fast.

    Reply
  • mekalanaresh0404
    April 6, 2015 10:31 am

    Hi pinal,

    I tried above script to delete my duplicate data,but I could not delete my data.
    we can delete duplicate data using cte expression thats easy and fast.

    Reply
  • mekalanaresh0404
    April 6, 2015 10:41 am

    WITH CTE AS(
    SELECT id,name,ROW_NUMBER()OVER(PARTITION BY id ORDER BY id)as rn
    FROM dupdata
    )
    DELETE FROM CTE WHERE rn > 1

    Reply

Leave a Reply

Menu