SQL SERVER – Delete Duplicate Records – Rows

Following code is useful to delete duplicate records. The table must have identity column, which will be used to identify the duplicate records. Table in example is has ID as Identity Column and Columns which have duplicate data are DuplicateColumn1, DuplicateColumn2 and DuplicateColumn3.

DELETE
FROM MyTable
WHERE ID NOT IN
(
SELECT MAX(ID)
FROM MyTable
GROUP BY DuplicateColumn1, DuplicateColumn2, DuplicateColumn3)

Watch the view to see the above concept in action:

[youtube=http://www.youtube.com/watch?v=ioDJ0xVOHDY]

Reference : Pinal Dave (https://blog.sqlauthority.com)

Duplicate Records, SQL Scripts
Previous Post
SQL SERVER – T-SQL Script to find the CD key from Registry
Next Post
SQL SERVER – QUOTED_IDENTIFIER ON/OFF and ANSI_NULL ON/OFF Explanation

Related Posts

450 Comments. Leave new

  • hi sir.
    is there any script for script out all the trigger in a database?

    Reply
  • delete from table1 t1
    where rowid ! = (select max(rowid)
    from table2 t2
    on t1.tabid=t2.tabid);

    Reply
  • I tried this but it removes the duplicates in the new table but the existing table still has the duplicates.. what is the solution to remove the duplicates from the existing table

    Reply
  • remove duplicate data in sql server

    Reply
  • Remove duplicate another simple query
    DELETE
    FROM MyTable
    WHERE ID IN
    (
    SELECT MIN(ID)
    FROM MyTable
    GROUP BY DuplicateColumn1, DuplicateColumn2, DuplicateColumn3
    having count(*)>1)

    Reply
    • I have not verified but looks OK to me. Thanks for sharing it.

      Reply
    • The above statement subquery will be returning only one minimum value .. which means …3 duplicate rows then .. subquery will return only one row..the another duplicate till persists…so you need to run the query 2 times..each time it will remove only one duplicate..

      Reply
    • IF EXISTS(SELECT TOP 1 1 FROM SYS.objects WHERE NAME LIKE ‘DUPLICATE’)
      DROP TABLE DUPLICATE

      CREATE TABLE DUPLICATE (ID INT IDENTITY(1,1) NOT NULL,NAME VARCHAR(100))
      GO
      INSERT INTO DUPLICATE (NAME) VALUES (‘SELVEESWARAN’)
      GO 5
      INSERT INTO DUPLICATE (NAME) VALUES (‘SELLIAH’)
      INSERT INTO DUPLICATE (NAME) VALUES (‘PINAL’)
      INSERT INTO DUPLICATE (NAME) VALUES (‘DAVE’)

      SELECT * FROM DUPLICATE

      /*** RESULT
      ID NAME
      1 SELVEESWARAN
      2 SELVEESWARAN
      3 SELVEESWARAN
      4 SELVEESWARAN
      5 SELVEESWARAN
      6 SELLIAH
      7 PINAL
      8 DAVE
      */

      –1ST RUN
      DELETE FROM DUPLICATE
      WHERE ID IN (
      SELECT MIN(ID) FROM DUPLICATE GROUP BY NAME HAVING COUNT(*) > 1
      )

      SELECT * FROM DUPLICATE
      /*** RESULT
      ID NAME
      2 SELVEESWARAN
      3 SELVEESWARAN
      4 SELVEESWARAN
      5 SELVEESWARAN
      6 SELLIAH
      7 PINAL
      8 DAVE
      */

      –2ND RUN
      DELETE FROM DUPLICATE
      WHERE ID IN (
      SELECT MIN(ID) FROM DUPLICATE GROUP BY NAME HAVING COUNT(*) > 1
      )
      SELECT * FROM DUPLICATE

      /*** RESULT
      ID NAME
      3 SELVEESWARAN
      4 SELVEESWARAN
      5 SELVEESWARAN
      6 SELLIAH
      7 PINAL
      8 DAVE

      */

      UPTO NO OF TIMES FOUND THE DUPLICATES

      Reply
  • 5. Which of the given stmt. is related to following created view?
    Create VIEW emp_dept_outerjoin1 AS SELECT empno, ename, e.deptno,dname,loc FROM emp_tab e, dept_tab d WHEN e.deptno = d.deptno(+);
    View successfully created
    Views that involve outer joins are not modifiable
    Column in base empt_tab table of emp_dept_outerjoin1 are modifiable through the view
    Columns in dept_tab table of emp_dept_outerjoin1 are modifiable through the view
    Column in base emp_tab of emp_dept_outerjoin1 are not modifiable through the view
    please tell me answer of this query

    Reply
  • 5. Which of the given stmt. is related to following created view?
    Create VIEW emp_dept_outerjoin1 AS SELECT empno, ename, e.deptno,dname,loc FROM emp_tab e, dept_tab d WHEN e.deptno = d.deptno(+);
    1.View successfully created
    2.Views that involve outer joins are not modifiable
    3.Column in base empt_tab table of emp_dept_outerjoin1 are modifiable through the view
    4.Columns in dept_tab table of emp_dept_outerjoin1 are modifiable through the view
    5.Column in base emp_tab of emp_dept_outerjoin1 are not modifiable through the view

    plz tell me answer of this question

    Reply
  • Delete query farji h phle khud check kr liya kar fir suggestion diya kr khud ko aata ni kuch.

    Reply
  • Delete * from table where uniqueValue in (

    Select unique value
    from (

    Select rownumber() over(partition by UNIQUEValue order by uniquevalue) as duplicate,* from table

    ) as s where s.dup>1

    )

    In the above unique value should be replaced by the value you use to find duplicates, like if there are 2 records with the same customerID , this is how I do it

    Reply
  • In case no id column then???I mean no primary column then ..how we can delete duplicates..the above logic will be work or not

    Reply
  • IF EXISTS(SELECT TOP 1 1 FROM SYS.objects WHERE NAME LIKE ‘DUPLICATE’)
    DROP TABLE DUPLICATE

    CREATE TABLE DUPLICATE (ID INT IDENTITY(1,1) NOT NULL,NAME VARCHAR(100))
    GO
    INSERT INTO DUPLICATE (NAME) VALUES (‘SELVEESWARAN’)
    GO 5
    INSERT INTO DUPLICATE (NAME) VALUES (‘SELLIAH’)
    INSERT INTO DUPLICATE (NAME) VALUES (‘PINAL’)
    INSERT INTO DUPLICATE (NAME) VALUES (‘DAVE’)

    –SOLUTION 1
    WITH CTE AS
    (
    SELECT ROW_NUMBER() OVER (PARTITION BY NAME ORDER BY NAME) AS RNO,* FROM DUPLICATE
    )
    SELECT * FROM CTE WHERE RNO > 1

    –SOLUTION 2
    DELETE FROM DUPLICATE WHERE ID NOT IN (SELECT DISTINCT MIN(ID) FROM DUPLICATE GROUP BY NAME)

    –SOLUTION 3
    DELETE FROM DUPLICATE WHERE ID NOT IN (SELECT DISTINCT MAX(ID) FROM DUPLICATE GROUP BY NAME)

    –SOLUTION 4
    DELETE FROM DUPLICATE WHERE ID (SELECT MIN(ID) FROM DUPLICATE D WHERE DUPLICATE.NAME=D.NAME)

    –SOLUTION 5
    DELETE FROM DUPLICATE WHERE ID (SELECT MAX(ID) FROM DUPLICATE D WHERE DUPLICATE.NAME=D.NAME)

    –SOLUTION 6
    DELETE FROM DUPLICATE WHERE ID NOT IN (SELECT TOP 1 ID FROM DUPLICATE D WHERE DUPLICATE.NAME=D.NAME)

    Reply
  • I had a scenario where I did not have any identity column, and I need to determine duplicates based on all column values. How we can do that? I thought of using Row_Number() function, but that was also not generic enough to fit into any table definition. Is there a way to do that?

    Reply
  • Thankyou . .. . always your solution is straight and simple

    Reply

Leave a Reply