SQL SERVER – Delete Duplicate Records – Rows

Following code is useful to delete duplicate records. The table must have identity column, which will be used to identify the duplicate records. Table in example is has ID as Identity Column and Columns which have duplicate data are DuplicateColumn1, DuplicateColumn2 and DuplicateColumn3.

DELETE
FROM MyTable
WHERE ID NOT IN
(
SELECT MAX(ID)
FROM MyTable
GROUP BY DuplicateColumn1, DuplicateColumn2, DuplicateColumn3)

Watch the view to see the above concept in action:

[youtube=http://www.youtube.com/watch?v=ioDJ0xVOHDY]

Reference : Pinal Dave (https://blog.sqlauthority.com)

Duplicate Records, SQL Scripts
Previous Post
SQL SERVER – T-SQL Script to find the CD key from Registry
Next Post
SQL SERVER – QUOTED_IDENTIFIER ON/OFF and ANSI_NULL ON/OFF Explanation

Related Posts

450 Comments. Leave new

  • Here is an another query where we could use to delete the duplicate rows…

    delete from table where table.%%physloc%% not in (select MIN(%%physloc%%) from table
    group by duplicate columns)

    Reply
  • We can use INTERSECTION on same table to fetch unique records by following query,

    SELECT * into DupRecordsTbl_bkp FROM
    (SELECT * FROM DupRecordsTbl INTERSECT SELECT * FROM DupRecordsTbl)B
    DELETE FROM DupRecordsTbl
    INSERT INTO DupRecordsTbl SELECT * FROM DupRecordsTbl_bkp

    Keeps the unique record into a new table. Deletes the original table records and reinsert the unique records into original table.

    Here we need no info about columns as it compares the whole row.

    Reply
  • hoe can improve query performance, hear no need use indexes or vies ect, forexample select * from my table query execution time is 5 min but i want execute it with in a 2min how can it posible, plz give me rply

    Reply
  • select distinct * into dbo.table_new from dbo.table_duplicates

    This will copy all distinct rows from one table, create a new table and insert the distinct rows only.

    Reply
  • nice post..

    Reply
  • Hi Pinal

    how there can be duplicate rows in a table with identity column.Is this importing same file more than once or some other reason….if it is b’coz of bad database structure….then what it means…plz clearify…

    Thanks
    Divya

    Reply
  • Hi all
    I am new in SQL Server and I have a bit different problem. I can’t figure out how to solve. I’ve read some comments but I couldn’t find my problem.

    Suppose that we have table “First_tbl” with column “A_clm” (with 1000 records) and table “Second_tbl” with column “B_clm” (with 200 records).
    I want to delete records which exist in “B_clm” from “A_clm”. I mean finally “A_clm” should have 800 records.
    Can anybody help me about this?

    Thanks

    Reply
  • Assuming that a table named “Actual” has redundant data which we wanted to get rid off then

    create a test table

    select * into test from Actual where 1=0

    /* in case if you have an idenity column which is the only thing that differentiates two records then create a table excluding that column. */

    Now populate the distinct records from Actual table into test using

    insert into test
    select distinct * from Actual

    Reply
  • Thanks for this original post. Very easy and saved me a ton of time.

    Reply
  • Great article, Thank you

    Reply
  • watchout4keith
    May 30, 2012 7:15 am

    Dear Dave, Once again I’m humbled by your awesomeness on this simple solution. If you need me I’ll be sitting in mud somewhere questioning my reason to exist.

    Reply
  • Hi,

    I am in urgent need of a solution. My problem is, i need to retrieve the data which has duplicate records and need to have the row number start from 1 to the number of records and start again from 1 for the next start of the duplicate record. Like from the below example, i have the first 2 columns, i need a query to generate the 3rd column

    Ex:

    ProductID Amount RowNumber
    1 200.00 1
    1 330.00 2
    2 400.00 1
    2 600.00 2
    2 350.00 3
    3 200.00 1
    4 330.00 1
    5 200.00 1
    5 330.00 2

    Please help me in solving this.

    Thanks,
    Niranjan

    Reply
  • Thank you! Very useful notes!

    Reply
  • Thank you! Very useful post!

    Reply
  • Avanti Papoo Choudhary
    July 2, 2012 3:35 pm

    Thanx Mr. Pinal it’s very useful…it has solved my major problems

    Reply
  • HI Pinal sir,
    Nice,Where i will download these type of SQL queries for interview(Pdf).please let me know the link.

    Reply
  • Ketan Barasara
    August 17, 2012 4:15 pm

    TY

    Reply
  • Ketan Barasara
    August 17, 2012 4:16 pm

    This blog is very useful to me

    Reply
  • this is used to decrease the quantity from order by expiry date

    “delete from Product where Experddate= (select Experddate from Product where proID=@prID order by Experddate ASC)”

    but it is not working “plz help me” friends

    Reply
    • Have you tried this?

      “delete from Product where Experddate in (select Experddate from Product where proID=@prID order by Experddate ASC)”

      Reply
  • HI
    you can simply use common table expression (CTE).
    in that you need to use Rank funcation

    With t AS
    (
    Select , //in which u want to delete duplicate record.
    RANK(ORDER BY PARTITION BY ) AS ‘Rank’
    from
    )
    DELETE FROM t
    WHERE t.Rank>1

    Thanks

    Reply

Leave a Reply