SQL SERVER – Delete Duplicate Records – Rows

Following code is useful to delete duplicate records. The table must have identity column, which will be used to identify the duplicate records. Table in example is has ID as Identity Column and Columns which have duplicate data are DuplicateColumn1, DuplicateColumn2 and DuplicateColumn3.

DELETE
FROM MyTable
WHERE ID NOT IN
(
SELECT MAX(ID)
FROM MyTable
GROUP BY DuplicateColumn1, DuplicateColumn2, DuplicateColumn3)

Watch the view to see the above concept in action:

[youtube=http://www.youtube.com/watch?v=ioDJ0xVOHDY]

Reference : Pinal Dave (https://blog.sqlauthority.com)

Duplicate Records, SQL Scripts
Previous Post
SQL SERVER – T-SQL Script to find the CD key from Registry
Next Post
SQL SERVER – QUOTED_IDENTIFIER ON/OFF and ANSI_NULL ON/OFF Explanation

Related Posts

450 Comments. Leave new

  • Ashish Gilhotra
    February 2, 2010 10:17 am

    @Rathin

    You can do something like that

    Select * from
    (SELECT *,row_number() Over(partition by products_model order by products_last_modified desc) as RN from youttable)t
    where t.RN=1

    Reply
  • Gulshan Bareja
    February 2, 2010 6:46 pm

    Its really good…I was looking for delete duplicate record query finally I got this here.

    Thanks
    Gulshan Bareja

    Reply
  • hi,

    Sorry i am mentioned that having the problem with Mysql DB

    (The problem is i am having the columns of products_model, products_id(primary key), products_last_modified_date……etc

    Here i have duplicate records of products_model, so i needs to remove old duplicate records and retain the new records based on products_last_modified_date column
    kindly replay me)

    Is it possible , can we solve this by using Group by & having function?

    Reply
  • Hi Pinal,

    Thanks for your post. I have another idea.
    It can also be done in this way and it will work even if the Table does not have any identity column.

    Query:

    WITH xTab AS
    (
    SELECT
    DuplicateColumn1,
    DuplicateColumn2,
    ROW_NUMBER() OVER (PARTITION BY DuplicateColumn3 ORDER BY Column3) AS [Count]
    FROM MyTable
    )
    DELETE FROM xTab WHERE [Count] > 1

    Reply
    • Hello Souvik,

      Thank you for sharing the method of getting duplicate using CTE. I appreciate this method as it is easy and better.
      But the PARTITION BY clause should include all columns of table while you have used only one column.

      Regards,
      Pinal Dave

      Reply
  • Hi,

    Please replace the sixth line of the query with this:

    ROW_NUMBER() OVER (PARTITION BY DuplicateColumn3 ORDER BY DuplicateColumn3 )

    Reply
  • Deleting all the duplicate rows from a table keeping only the distinct values(try out this).

    create table test_a
    (a int
    )
    insert into test_a values(1)
    insert into test_a values(1)
    insert into test_a values(3)
    insert into test_a values(4)
    insert into test_a values(5)
    insert into test_a values(1)
    insert into test_a values(2)
    insert into test_a values(3)
    insert into test_a values(4)
    insert into test_a values(5)

    WITH [test_a rowid]
    AS (SELECT a.a,
    Row_number()
    OVER(ORDER BY a.a) num
    FROM test_a a)
    DELETE FROM [test_a rowid]
    WHERE num NOT IN (SELECT Rank()
    OVER(ORDER BY b.a)
    FROM test_a b)

    Reply
  • very very good query

    Reply
  • very nice query..
    thanks..

    Reply
  • This is a nice blog.Provide some assignment to get sql skills.This is a nice blog

    Reply
  • William Walsh
    May 12, 2010 2:10 am

    Here’s my dynamic dupes removal script. All it requires is the table name, make sure you have your db selected in the query editor dropdown. Just make @tablename=” and this’ll take care of the rest.

    Set ansi_nulls ON if you want the script to ignore null values. If some of your dupes include nulls then leave it set to ansi_nulls off.

    script:
    —————————————————————–
    DECLARE @tablename VARCHAR(255)
    DECLARE @columnnames VARCHAR(MAX)
    DECLARE @maxlength VARCHAR(max)
    DECLARE @datatype VARCHAR(max)
    DECLARE @columnlist VARCHAR(MAX)
    DECLARE @select VARCHAR(MAX)
    DECLARE @rowcount int
    DECLARE @variablelist VARCHAR(MAX)
    DECLARE @currentvariablename VARCHAR(MAX)
    DECLARE @declarestatement VARCHAR(MAX)
    DECLARE @singledeclare VARCHAR(MAX)
    DECLARE @FINALscript VARCHAR(MAX)
    DECLARE @wherestatement VARCHAR(MAX)
    DECLARE @ansinulls VARCHAR(MAX)

    SET @ansinulls = ‘SET ANSI_NULLS OFF’
    SET @tablename=’tSTG_CNT_MVDB_SurveyExtract_Presurvey’

    SET @columnlist= ”
    SET @wherestatement = ”
    SET @rowcount = ”
    SET @maxlength = ‘asdfadf’
    SET @columnnames = ”
    SET @datatype = ”
    SET @select = ”
    SET @variablelist = ”
    SET @currentvariablename = ”
    SET @declarestatement = ”
    SET @singledeclare = ”
    SET @FINALscript = ”

    SET @rowcount = (SELECT COUNT(column_name)
    FROM information_schema.columns
    WHERE table_name =@tablename)

    DECLARE dublicate_cursor1 CURSOR READ_ONLY FOR
    SELECT column_name,
    DATA_TYPE,
    CHARACTER_MAXIMUM_LENGTH
    FROM information_schema.columns
    WHERE table_name = @tablename

    OPEN dublicate_cursor1

    FETCH NEXT FROM dublicate_cursor1 INTO @columnnames,@datatype,@maxlength
    WHILE @@FETCH_STATUS = 0
    BEGIN
    SET @currentvariablename = ‘@’ + @columnnames

    IF @rowcount = 1
    BEGIN

    SET @columnlist = @columnlist + @columnnames
    SET @variablelist = @variablelist + @currentvariablename + ‘,@Count’
    IF @maxlength IS NULL
    Begin
    SET @singledeclare = ‘DECLARE ‘ + @currentvariablename + ‘ ‘ + @datatype
    END
    ELSE
    BEGIN
    SET @singledeclare = ‘DECLARE ‘ + @currentvariablename + ‘ ‘ + @datatype + ‘ (‘ + @maxlength + ‘)’
    END
    SET @declarestatement = @declarestatement + @singledeclare + ‘ DECLARE @Count int’
    SET @wherestatement = @wherestatement + @columnnames + ‘=’ + @currentvariablename
    END
    ELSE
    BEGIN
    –PRINT ‘ELSE’
    SET @columnlist = @columnlist + @columnnames + ‘,’
    SET @variablelist = @variablelist + @currentvariablename + ‘,’
    IF @maxlength IS NULL
    Begin
    SET @singledeclare = ‘DECLARE ‘ + @currentvariablename + ‘ ‘ + @datatype
    END
    ELSE
    BEGIN
    SET @singledeclare = ‘DECLARE ‘ + @currentvariablename + ‘ ‘ + @datatype + ‘ (‘ + @maxlength + ‘)’
    END
    SET @declarestatement = @declarestatement + ‘ ‘ + @singledeclare + ‘ ‘
    SET @wherestatement = @wherestatement + @columnnames + ‘=’ + @currentvariablename + ‘ AND ‘
    END
    SET @rowcount = @rowcount – 1
    FETCH NEXT FROM dublicate_cursor1 INTO @columnnames,@datatype,@maxlength
    END

    CLOSE dublicate_cursor1
    DEALLOCATE dublicate_cursor1

    SET @select = ‘SELECT ‘ + @columnlist + ‘,COUNT(*) -1 FROM ‘ + @tablename + ‘ GROUP BY ‘ + @columnlist + ‘ HAVING Count(*) > 1’

    SET @FINALscript = @ansinulls + @declarestatement
    SET @FINALscript = @FINALscript + ‘ DECLARE dublicate_cursor CURSOR FAST_FORWARD FOR ‘ + @select
    SET @FINALscript = @FINALscript + ‘ OPEN dublicate_cursor ‘ + ‘FETCH NEXT FROM dublicate_cursor INTO ‘ + @variablelist
    SET @FINALscript = @FINALscript + ‘ WHILE @@FETCH_STATUS = 0 BEGIN ‘ + ‘DELETE TOP(@Count) FROM ‘ + @tablename
    SET @FINALscript = @FINALscript + ‘ WHERE ‘ + @wherestatement
    SET @FINALscript = @FINALscript + ‘ FETCH NEXT FROM dublicate_cursor INTO ‘ + @variablelist + ‘ END ‘
    SET @FINALscript = @FINALscript + ‘CLOSE dublicate_cursor DEALLOCATE dublicate_cursor’

    EXEC(@FINALscript)

    Reply
  • How to delete the last row from a table in SQL Server 2005 without knowing any column name,u know only table name and it should b in single query.

    i have written this:

    with t as
    (
    select row_number() OVER(ORDER BY (select count(*) from Emp)) as Ranking from Emp
    )

    delete from t where Ranking = (Select max(Ranking) from t)

    this one working fine,but i need it in a single query
    ..
    can any one help me in this.

    Thanks & regards,
    Sumit

    Reply
    • It is not possible
      Why do you want to do this in that way?

      Reply
      • it is possible i hav already done it but it is not in single query.It is just for my research purpose..

        can u please help me out..

        thanks,
        Sumit

      • What type of research are you doing?
        As you showed in the last example it possible with row_number() only

        Only possible method I can think of is

        delete t from
        (
        select row_number() OVER(ORDER BY (select 0)) as Ranking from emp
        ) as t
        where ranking=1

      • thanks 4 ur reply,but it is deleting the first row not the last row.
        I am trying to take out the most from sql server 2005 which in future i can do some work with that..

        any more suggestion plz..

        tahnks,
        Sumit

  • Hi Pinal,

    Can u please help me regarding my query…

    Thanks,
    Sumit

    Reply
    • See my previous reply to you

      Reply
      • Final query will be like below:

        delete t from
        (
        select top 1 row_number() OVER(ORDER BY (select 0)) as Ranking from emp order by Ranking Desc
        ) as t

        thanks for your help

        Regards,
        Sumit

  • Done:
    The final query will be like below:

    delete t from
    (
    select top 1 row_number() OVER(ORDER BY (select 0)) as Ranking from emp
    order by Ranking Desc
    ) as t

    Reply
  • when we declare a constraint without giving name to it,then the system automatically n dynamically generates a name.
    Now when we want to drop that constraint we have to get the name from system,is their any way to get that name directly from the system………

    plzzzz help

    thanks,
    Sumit

    Reply
  • Hi Rahul ,

    Your query is…

    select * from testprice
    group by id,item,price

    Reply
  • Magesh Murugesan
    June 10, 2010 11:39 am

    if there is no key in the table then what ?

    I have on table named [Duplicate] {ID int,FNAME varchar(10),MNAME varchar(10)}

    Here there is no key and here are duplicate rows. so hoca can i delete this duplicate rows.

    Check Data

    ID FNAME LNAME
    1 AAA CCC
    2 BBB DDD
    1 AAA CCC
    2 BBB DDD
    1 AAA CCC
    2 BBB DDD
    3 BCB DGD

    Remove duplicate rows and keep the data in to the table like this using single query.

    ID FNAME LNAME
    1 AAA CCC
    1 AAA CCC
    2 BBB DDD
    2 BBB DDD

    Query :

    with RemoveDuplicate as
    (
    select *,row_number() over(partition by ID,FNAME,LNAME order by ID) as row from DUPLICATE
    )
    delete from RemoveDuplicate where row>1

    Reply
    • thanks guyz it worked now……

      with RemoveDuplicate as
      (
      select *,row_number() over(partition by num order by num) as row from Duplicates
      )
      delete from RemoveDuplicate where row>1

      this is gr8 ……thanks all

      Reply
  • Hi Dave, u r blog is really helpful………….in my interview they asked me this question

    table name is ‘Duplicates’
    and column name is ‘num’

    Select * from Duplicates

    num
    ———–
    1
    1
    1
    1
    2
    2
    2
    2
    2
    3
    3
    3
    3
    3

    (14 row(s) affected)

    Q)how to remove duplicates from the table ….I tried u r given query but didnt work …..can u plz guide me the right procedure, thanks

    DELETE
    FROM Duplicates
    WHERE num NOT IN
    (
    SELECT MAX(num)
    FROM Duplicates
    GROUP BY num)

    when I tried above query its say 0 rows affected

    Reply
    • thanks guyz it worked now……

      with RemoveDuplicate as
      (
      select *,row_number() over(partition by num order by num) as row from Duplicates
      )
      delete from RemoveDuplicate where row>1

      this is gr8 ……thanks all

      Reply
  • Hi guyz i am just wondering is this the right method to try …….based on above example…

    Select * from Duplicates

    select Distinct * into Dup_Rem from Duplicates

    Drop table Duplicates

    if u want to change Dup_Rem to its original name then I did this

    select Distinct * into Duplicates from Dup_Rem

    Select * from Duplicates

    the above few queries worked perfectly and gave me the same output that want it………any suggestions welcome

    Reply
  • @Madhivanan

    thanx for the reply bro…..i didnt know abt this function…..probably i’ll do some research…….do you have any tips for me ….coz in couple of days i am attending technical round….thanks once again…..

    Reply
  • How to delete duplicate records from the table and there is no
    primary key for the table like..

    id name gender
    1 aaa m
    2 bbb f
    1 aaa m
    3 ccc f
    3 ccc f
    1 aaa m

    I need to delete all the duplicate records such as id 1 is 2 times and id 3 is one time.

    thank you

    Reply
    • If you use version 2005 onwards use

      delete t from
      (
      select row_number() over (partition by col1,col2,col3) as sno,* from table
      ) as t
      where sno>1

      Reply

Leave a Reply