SQL SERVER – Delete Duplicate Records – Rows

Following code is useful to delete duplicate records. The table must have identity column, which will be used to identify the duplicate records. Table in example is has ID as Identity Column and Columns which have duplicate data are DuplicateColumn1, DuplicateColumn2 and DuplicateColumn3.

DELETE
FROM MyTable
WHERE ID NOT IN
(
SELECT MAX(ID)
FROM MyTable
GROUP BY DuplicateColumn1, DuplicateColumn2, DuplicateColumn3)

Watch the view to see the above concept in action:

[youtube=http://www.youtube.com/watch?v=ioDJ0xVOHDY]

Reference : Pinal Dave (https://blog.sqlauthority.com)

Duplicate Records, SQL Scripts
Previous Post
SQL SERVER – T-SQL Script to find the CD key from Registry
Next Post
SQL SERVER – QUOTED_IDENTIFIER ON/OFF and ANSI_NULL ON/OFF Explanation

Related Posts

450 Comments. Leave new

  • Divya Mahendra Sikarwar
    February 16, 2009 2:17 am

    Execute these queries for learning how to remove duplicate record.
    CREATE TABLE dbo.Test1 (
    [ID] [int] ,
    [FirstName] [varchar](25),
    [LastName] [varchar](25)
    ) ON [PRIMARY]

    INSERT INTO Test1 VALUES(1, ‘Bob’,’Smith’)
    INSERT INTO Test1 VALUES(2, ‘Dave’,’Jones’)
    INSERT INTO Test1 VALUES(3, ‘Karen’,’White’)
    INSERT INTO Test1 VALUES(1, ‘Bob’,’Smith’)
    INSERT INTO Test1 VALUES(4, ‘Bobby’,’Smita’)

    select identity(int,1,1) as SlNo,* into #temp from Test1

    DELETE
    FROM #temp
    WHERE SlNo NOT IN
    (
    SELECT MAX(SlNo)
    FROM #temp
    GROUP BY ID,FirstName,lastname
    )

    drop table test1

    select * into test1 from #temp

    alter table test1 drop column SlNo

    select * from test1 order by id

    Reply
  • –Try this to delete multiple records

    ;with t1 as
    (
    select id,Row_number() over (partition by a, b, c order by a, b, c) as rnum
    from mytable with(nolock)
    )
    delete from t1 where rnum>1

    Reply
  • gaurav moolani
    March 1, 2009 9:57 am

    hi guys
    i have a problem ??? what if the entire row is duplicate ??
    with i dentity column also same .

    i am transfering data from a table having duplicate rows to a table having a primary key

    pls help me out

    Reply
  • gaurav,
    Praney’s sql in the comments will help you out. I have the same situation and that solved the problem.

    ;with t1 as
    (
    select id,Row_number() over (partition by a, b, c order by a, b, c) as rnum
    from mytable with(nolock)
    )
    delete from t1 where rnum>1

    Reply
  • Hi Gaurav,
    I have duplicate records in a table,till recently i added a
    datetime field in my table called timestamp.Now when
    duplicates go into this table i will be able to see the current
    date in the timestamp

    Firstname Lastname memberno timestamp
    Jack Hill 11111 2009-12-11
    Jack Hill 11111 null
    Simon Philip 222222 2009-12-11
    Simon Philip 222222 2009-12-12
    Sam Wheat 33333 nulll
    Sam Wheat 33333 null

    In this table i want to fetch a duplicate set which has
    one record with a valid timestamp and other timestamp
    as null – something like

    Firstname Lastname memberno timestamp
    Jack Hill 11111 2009-12-11
    Jack Hill 11111 null

    select max(timestamp) from member group by firstname,lastname having count(*)>1

    How do i modify my display duplicate statement to have the resultset which includes duplicates with and without
    a valid timestamp

    Reply
  • Hi Pinal

    With reference to the above question
    Can anyone help me out with a query which can display only duplicates with or witout a timestamp, they should
    not include duplicates members all having timestamps

    Reply
  • Hi Luke,

    You can use ROW_Number() to identify Duplicate Rows

    Here You need to do as:

    ;with CTE AS(
    select Row_NUmber() (OVER PARTITION BY FirstName, LastName, MemberNo Order BY TimeStamp ASC) AS RowID
    FROM table
    )
    select * from cte where rowID=1

    this will give you Records with NULL. If you want data with Dates then you need to change “ORDER BY” to “Order BY TimeStamp DESC”

    For further details, you can read it on my blog:

    Thanks,

    Tejas

    Reply
  • Hi Pinal,

    Your articles have always helped me. Just curious if I could rationalize data in the example below using a variant of this example (I have an option to use SSIS Fuzzy lookup, but its way off my application scope. So it is ruled out.)

    Cheers,
    Krish

    Reply
  • HI PINAL,

    I HAVE READ YOUR ARTICLES AND RESPONSES FOR THE QUERIES AND I FOUND IT GOOD.
    I TOO HAVE ONE QUERY.

    THERE IS TABLE1 AND TABLE2
    TABLE 1 CONTAINS 100 RECORDS.
    TABLE 2 CONTAINS 150 RECORDS.

    TABLE 2 CONTAINS SOME RECORDS WHICH ARE ALREADY PRESENT IN TABLE 1.

    NOW WHEN I MERGE BOTH IT ADDS ALL THE RECORDS GIVING TOTAL NO. TO 250.

    FROM THESE I WANT TO REMOVE THE DUPLICATE DATA WHICH I GOT FROM TABLE 2 AND ARE ALREADY PRESET IN TABLE 1.

    I have tried above methods but did not yield required results.

    Be Helpful…

    Reply
  • I got the solution of above problem

    FRIEND in a NEED

    Another Question.

    i am joining TABLE1 (older) And TABLE2 (newer).

    i found some records which are common in both table.

    i want to remove found common data from TABLE2(newer)

    How can i perform that operation?

    GENERAL QUESTION
    How can we perform the Update,Delete operation on query of join

    –common records in two tables using inner join
    select *
    from TABLE1 as e
    inner join TABLE2 as c
    on e.respondentid = c.respondentid

    Reply
  • Vikas, will UNION work for you?

    SELECT … FROM Table_1
    UNION
    SELECT … FROM Table_2

    Reply
  • Vikas, for the second question:

    DELETE FROM Table2 WHERE EXISTS(SELECT * FROM Table1 WHERE Table1.respondentid = Table2.respondentid)

    Reply
  • thnx buddy tkatch.

    Reply
  • i have table TABLE1 with data inside.
    i want to add one column name as “Serial Number”.
    inside this i want to give Automatic Serial No.
    Say there are 5000 records then the Serial Number Column should contain1,2,3,4…..5000 at last record.

    Be Beedful

    Reply
  • i got the solution of above query.

    “alter table table1 add ID INT IDENTITY(1,1)”

    But this inserts the column at the end.

    i want this column to be inserted at the start.

    i.e. First Column should be of identity Column

    Reply
  • i have a table TABLE1 of Five Column.
    I want to insert one more column in this table after First Column.
    How can i insert the New Column from SQL Editor in between the Columns Present not at the end?

    Reply
  • Hi Pinal,

    I HAVE READ YOUR ARTICLES AND RESPONSES FOR THE QUERIES AND I FOUND IT GOOD.
    Actully i m new in SQL SERVER2005.

    I stucked in a query, if find the solution:-

    Table1 Table2

    col1|col2 col1|col2
    10 100 10 100
    11 150 15 200
    15 200 13 210
    21 210 23 100

    Output should be:

    col1|col2
    10 100
    11 150
    21 210
    13 210
    23 100

    Reply
  • Hi Pinal,

    i have one more query:

    Delete the duplicates rows from a table if there is not any primary key.

    Please find the solution and Thanks in advance.

    Reply
  • To aLL,

    I have one table TABLE! , which contains one column named “EMPLOYEE NAME”.
    There are various values inside it.
    i want to check every field of EMPLOYEE NAME and find out if there is any Numeric Value inside.

    EG.
    EMPLOYEE NAME
    Billgates
    George
    A1dam –Here 1 is present in between
    Gem9ini –Here 9 is present in between

    i want to identify this types of name which contains the numeric value in the field.

    Reply
  • using the instructions that you posted about adding an id column:

    “Good Question. I have been asking this question in interview many times to new candidates. Answer is:
    1) Add Identity Col and perform the operation displayed in this blog and drop Identity Col.
    2) User Cursor and save your values in temp table. If you receive the same row which you put in temp table delete it.
    3) You can use while loop as well do the same as cursor”

    i get the error message from sql telling me it cannot insert the value NULL into column “id” column does not allow nulls (i have got this after trying to set the primary key on the id field)

    Reply

Leave a Reply