SQL SERVER – Delete Duplicate Records – Rows

Following code is useful to delete duplicate records. The table must have identity column, which will be used to identify the duplicate records. Table in example is has ID as Identity Column and Columns which have duplicate data are DuplicateColumn1, DuplicateColumn2 and DuplicateColumn3.

DELETE
FROM MyTable
WHERE ID NOT IN
(
SELECT MAX(ID)
FROM MyTable
GROUP BY DuplicateColumn1, DuplicateColumn2, DuplicateColumn3)

Watch the view to see the above concept in action:

[youtube=http://www.youtube.com/watch?v=ioDJ0xVOHDY]

Reference : Pinal Dave (https://blog.sqlauthority.com)

Duplicate Records, SQL Scripts
Previous Post
SQL SERVER – T-SQL Script to find the CD key from Registry
Next Post
SQL SERVER – QUOTED_IDENTIFIER ON/OFF and ANSI_NULL ON/OFF Explanation

Related Posts

450 Comments. Leave new

  • if there is no key in the table then what ?

    I have on table named [Duplicate] {ID int,FNAME varchar(10),MNAME varchar(10)}

    Here there is no key and here are duplicate rows. so hoca can i delete this duplicate rows.

    Check Data

    ID FNAME LNAME
    1 AAA CCC
    2 BBB DDD
    1 AAA CCC
    2 BBB DDD
    1 AAA CCC
    2 BBB DDD
    3 BCB DGD

    Remove duplicate rows and keep the data in to the table like this using single query.

    ID FNAME LNAME
    1 AAA CCC
    2 BBB DDD
    3 BCB DGD

    PLease if possible help because i faced this question in many interviews.
    Reply me

    Reply
    • select distinct * into #temp from urtbl truncate table urtbl insert urtbl select * from #temp drop table #temp

      Reply
      • If we do as per u suggestion,can performance decreases

      • Yes. It will be a performance hit query

      • rno sname
        —————-
        1 Frieda
        1 Frieda
        1 Frieda
        4 Nathan
        6 senthil
        6 senthil
        6 senthil
        2 Senthil
        6 senthil
        1 Shanoj
        2 Shanoj
        4 Varun

        Step 1: Create duplicate table and move the duplicate value records of the original table to a duplicate table.

        SELECT DISTINCT * INTO DuplicateTb FROM dublicatetest GROUP BY sname,rno HAVING COUNT(rno) > 1

        After executed the above Query if you look the table the below result come.
        select * from DuplicateTb
        1 Frieda
        6 senthil

        select * from dublicatetest

        2 Shanoj
        1 Frieda
        2 Senthil
        4 Varun
        6 senthil
        1 Shanoj
        6 senthil
        4 Nathan
        6 senthil
        6 senthil
        1 Frieda
        1 Frieda

        Step 2: Delete all rows from the original table that also reside in the duplicate table.

        DELETE dublicatetest WHERE sname IN (SELECT sname FROM DuplicateTb)

        After executed the above Query if you look the table the below result come.

        select * from DuplicateTb
        1 Frieda
        6 senthil

        select * from dublicatetest
        2 Shanoj
        4 Varun
        1 Shanoj
        4 Nathan

        Step 3 : Move back the rows from duplicate table to original table.

        INSERT dublicatetest SELECT * FROM DuplicateTb

        After executed the above Query if you look the table the below result come.

        select * from DuplicateTb
        1 Frieda
        6 senthil

        select * from dublicatetest
        2 Shanoj
        1 Frieda
        6 senthil
        4 Varun
        1 Shanoj
        4 Nathan

        Step4: Drop the duplicate table.

        DROP TABLE DuplicateTb

        Dear friends now i hope u all are understood and got cleared.

        Good Luck
        Senthilkumar.T
        Be happy and try to make others happy

      • ivan mohapatra
        March 28, 2011 4:41 pm

        dear friends,

        here with i am sending a sample of code which might help u out in a easy manner

        select distinct * into newtable from oldtable

        truncate table oldtable

        insert into oldtable select * from newtable

        here we can have the distinct records or rows.

      • Thank for Ur code. Because it help to my interview.

      • but if Is Identity is set, in that condition, this code fails……

      • Nice code.I learn much more from your code

      • hi,

        how can we improve query performence. with same query dont use indexes view . for example select * from mytable query execute 2 min but the same query i want execute with in 1 min. hw can we do that one , please give me reply

      • dont worry man;
        make a call to me [phone number removed]

        select * from t1 where rowid=(select min(rowid) from
        t1 group by t1.c1)

      • This will throw error. You need to relate the common key from the outer table

        select * from t1 as t where rowid=(select min(rowid) from
        t1 where c1=t.c1)

      • delete from duplicate
        where fname in
        (
        select d1.fname
        from duplicate d1
        inner join duplicate d2
        on d1.fname=d2.fname and d1.Lname=d2.Lname
        group by d1.fname, d1.Lname
        having count (d1.fname)>1 and count(d1.Lname)>1
        )

      • It can be easily done with row_number() function. Refer point 6 at

      • If we assume two duplicates doesn’t this delete both records? How does your delete statement only remove one duplicate or is your goal to delete all duplicates? You can do a delete top but you must know how many duplicates

    • [ Just Copy & Paste ]

      declare @TBL table(EmpID int)

      insert into @TBL(EmpID)
      values(90020),(90020),(90021),(90022),(90023),(90022),(90025),(90022)

      select * from @TBL order by EmpID

      ;with TBLCTE(EmpID,Ranking)
      AS
      (
      select
      EmpID,
      Ranking = DENSE_RANK() over (PARTITION BY EmpID order by newID())
      from @TBL
      )
      delete from TBLCTE where Ranking > 1

      select * from @TBL order by EmpID

      ————-

      Reply
    • Ivan Mohapatra
      August 18, 2011 12:36 pm

      Dear pinal,

      If i am wrong then correct me.
      As per identity column in a table u will never have a duplicate rows in a table
      because it grows as 1,2,3,4,5
      so,we can’nt have duplicate rows in a identity column.
      but,we can have duplicate data in other column of a identity tables.

      Reply
      • That is the point of this post. That is why it is used to keep only one row for each combination of columns

    • Value1 | Value2
      1 | ‘Mahesh’
      1 | ‘Mahesh’
      2 | ‘Rahul’
      2 | ‘Rahul’

      With NewData
      as
      (
      select s.*,DENSE_RANK() OVER
      (PARTITION BY s.Value1 order by newID() ) r from SampleData s
      )

      DELETE FROM NewData where r=2

      Reply
  • Wouldn’t it be a little more simple to use Group By or Distinct to get the unique records in that situation?
    SELECT ID, FNAME, LNAME
    FROM Duplicate
    GROUP BY ID, FNAME, LNAME
    would give you the unique rows of Duplicate

    Rough Steps of one possibility to fix the table:
    1) Save the results of the group by query to a temp table
    2) Delete rows from the original table
    3) Fill original table with rows of temp table

    That you way you don’t have to use a cursor or change the design of table Duplicate. However, in the real world, you’d want to add a unique constraint to the ID column of table Duplicate to make sure it doesn’t get any more duplicates by ID.

    Reply
    • If the version is 2005 or avoe, you can simply do

      delete t from
      (
      select row_number() over (partition by ID, FNAME, LNAME
      order by ID) , * from table
      ) as t
      where sno>1

      Reply
      • Hi All,

        I have used table expression to remove duplicate records.

        Here is the solution

        Create Table emp
        (
        id int identity,
        empname varchar(20),
        empdept varchar(10)
        )

        insert into emp values(‘A’,’X’)
        insert into emp values(‘A’,’X’)
        insert into emp values(‘A’,’X’)
        insert into emp values(‘B’,’X’)
        insert into emp values(‘B’,’X’)
        insert into emp values(‘C’,’X’)

        select * from emp

        WITH Dublicates_Emp(empname, empdept,id)
        AS
        (
        SELECT empname,empdept,max(id)
        FROM EMP
        GROUP BY empname,empdept
        HAVING COUNT(empname)>1
        )

        delete from emp where id in
        (
        select emp.id
        from emp inner join Dublicates_Emp as d
        on emp.empname = d.empname
        and emp.empdept = d.empdept
        and emp.id d.id
        )

  • Whoops, I almost forgot…

    Thanks for the post about deleting duplicates! Your code was easy find with a Google search. It’s exactly what I was needing. It’s simple and elegant! I sometimes find it difficult to think of how to do things like that, so it was nice to find your post about it!

    Reply
  • While searching for this solution I found many others which use cursors and complicated SQL statements but in the end they do the same thing as your simple & powerful solution. Thank you for posting it online!

    Reply
  • Seriously thanks,

    Keep plopping my head why i didnt come up with simplicity.

    — note
    Plopping is a merely profound and uncatastrophic way of activating the tiny brain members that are usually on vacation.

    — note 2
    nevermind the note

    Reply
  • Pinal,

    Once again you are the man! I looked at other sites to answer this question and I found this SQL statement to be the best and easiest. Now I know to just come straight here.

    Reply
  • If you are looking for deleting multiple records (duplicate with more than 2 records like 3,4,5 duplicate records, etc) and without key column then the following method can be adopted.

    Lets say your table is tab1.

    SELECT DISTINCT * INTO tab2 FROM tab1 — distinct of all columns

    DROP TABLE tab1

    EXEC sp_rename ‘tab2′,’tab1’

    Note:
    Take a back up of your original table before using dropping for your reference. You may need to grant permission of tab2 as given for tab1

    Reply
    • I would say” dont drop the table”

      You can truncate it

      Reply
    • hey guys u shld try it—————————–

      ————delete duplicate record————-if we have table Male wd duplicate record just
      step 1– transfer distinct recoed of male table to female

      insert into Female(Name)
      select distinct Name from Male
      step 2–drop table Male
      step 3– Rename table female as Male

      Reply
  • If you are looking for deleting duplicates of more than 2 records, then the following code can be used…

    set nocount on
    drop table tbl1
    create table tbl1
    (
    col1 int
    )
    insert into tbl1 values(1)
    insert into tbl1 values(1)
    insert into tbl1 values(1)
    insert into tbl1 values(2)
    insert into tbl1 values(2)
    insert into tbl1 values(2)
    insert into tbl1 values(2)
    insert into tbl1 values(3)
    insert into tbl1 values(3)

    set rowcount 1
    select ‘start’
    while @@rowcount > 0 delete a from tbl1 a where (select count(*) from tbl1 b where a.col1 = b.col1)>1
    set rowcount 0
    select * from tbl1
    set nocount off

    Reply
    • This will be a time-consuming method
      Alternates are move distinct data to temp table or use row_number() function

      Reply
  • Simple is good! You Rock!

    (I found so many answers elsewhere and it’s incredible how 6 lines of code and one push of a button can do with your code)

    Reply
  • Ravi,
    your solution seems to be excelent but what about triggers and indexes defined on that table tab1.
    will they remain intact??
    Please repsond ….

    Amit

    Reply
  • Well to answer Rahul’s question and modify Ravi’s and Pinal Dave query…
    Here is a single query which will answer all your questions but this still needs Primary key or identity column to be added to the table though.
    DELETE FROM MyTable
    WHERE EXISTS (
    SELECT * FROM MyTable AS b
    WHERE
    b.[col1] = MyTable.[col1]
    AND b.[col2] = MyTable.[col2]
    AND b.[col3] = MyTable.[col3]
    GROUP BY
    b.[col1], b.[col2], b.[col3]
    HAVING
    MyTable.[ID] > MIN(b.[ID])
    )
    If you want most recent records to be present in your DB and delete multiple Old duplicate records, You can make use of
    DELETE FROM MyTable
    WHERE EXISTS (
    SELECT * FROM MyTable AS b
    WHERE
    b.[col1] = MyTable.[col1]
    AND b.[col2] = MyTable.[col2]
    AND b.[col3] = MyTable.[col3]
    GROUP BY
    b.[col1], b.[col2], b.[col3]
    HAVING
    MyTable.[ID] MIN(b.[ID])
    )
    Hope this answers all your questions.
    If you dont have Primary key or Identity Column then you can always create it and run this query and delete the Primary Key or identity column.
    “there are always better things to do to make lives easier.”
    enjoy your day
    Vamshi

    Reply
  • Senthilnathan
    July 3, 2007 1:49 am

    IN SQL SERVER 2005, This can be easily achieved without crating unique identifier by using CTE and ROW_NUMBER (), the modified query for sql server 2005 goes here
    ***********************************************
    WITH T1 AS (SELECT ROW_NUMBER ( ) OVER ( PARTITION BY ID, FNAME, LNAME ORDER BY ID ) AS RNUM FROM DUPLICATE )
    DELETE FROM T1 WHERE RNUM > 1
    ***********************************************
    To get a detail grasp on ROW_NUMBER () OVER () … Refer MSDN https://docs.microsoft.com/en-us/sql/t-sql/functions/row-number-transact-sql?view=sql-server-2017 for.

    Reply
  • I have by mistake duplicated all my records.

    Through export/import wizard I selected my database (e.g pension) and copied tables to the desination, later I came to know that source and destination where the same.

    Therefore, in all the table all rows are duplicated.

    Can someone solve the problem.

    Reply
  • satish ramahnujan
    July 13, 2007 4:28 pm

    Thanks Ravi, your information helped. I modified the query in the following manner and it worked fine:

    Select distinct * into temp from tab1;
    Delete tab1;

    Insert tab1 Select * from temp;

    drop table temp;

    Reply
  • how can we delete duplicate rows if table doesn’t have any identity column and requirement is you don’t have to insert one column as identity and you don’t have to use cursor and you don’t have to use temp table. so query should be a single

    Please reply

    Reply
    • Which version of SQL Server are you using?
      If it is later than 2000, use row_number() function

      Reply
  • this question i have asked in one of my interview, and requirement is like delete record should be in single query its ok if you can write subquery, but without use of temp table, without use cursor, and without use of insert identity column. still i am waiting for the answer,

    Reply
  • I’m trying to keep only the recent SaleDate in the following table. How can I do that?

    ID SaleDate SaleAmount
    40 6/23/2003 242
    40 12/28/2001 212
    40 6/13/1994 111
    41 11/30/2001 233
    41 10/25/1996 15
    41 4/21/1994 132
    42 6/17/2005 2765
    42 5/14/1994 147
    43 5/9/1994 145

    Thanks much

    Reply
    • select t1.* from table as t1 inner join
      (
      select id,max(saledate) as saledate from table
      group by id
      ) as t2
      on t1.id=t2.id and t1.saledate=t2.saledate

      Reply
  • Tayeb below query should work for you

    DELETE FROM #tempTab WHERE
    EXISTS
    (SELECT ID FROM #tempTab AS b WHERE #tempTab.ID = b.ID
    GROUP BY b.ID
    HAVING #tempTab.SaleDate < MAX(b.SaleDate))

    Reply
  • Depak, to answer your question here is a solution in SQL Server 2005

    CREATE TABLE #Table1 (col1 int, col2 int)
    INSERT INTO #Table1 VALUES (1, 1000)
    INSERT INTO #Table1 VALUES (2, 2000)
    INSERT INTO #Table1 VALUES (2, 2000)
    INSERT INTO #Table1 VALUES (3, 3000)
    INSERT INTO #Table1 VALUES (3, 3000)
    INSERT INTO #Table1 VALUES (4, 4000)

    SELECT * FROM #Table1;

    WITH T1 AS (SELECT (ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col1)) AS RNum FROM #Table1)
    DELETE FROM T1 WHERE RNum IN (SELECT a.RNum FROM T1 AS a, T1 AS b WHERE a.RNum > b.RNum GROUP BY a.RNum)

    SELECT * FROM #Table1

    DROP TABLE #Table1

    ROW_NUMBER() will solve your purpose.

    Reply
  • Hi
    I had problem with deleting dupilicate rows. i have used your code. work fine. thanks a lot.
    Geetha

    Reply

Leave a Reply