SQL SERVER – Delete Duplicate Records – Rows

Following code is useful to delete duplicate records. The table must have identity column, which will be used to identify the duplicate records. Table in example is has ID as Identity Column and Columns which have duplicate data are DuplicateColumn1, DuplicateColumn2 and DuplicateColumn3.

DELETE
FROM MyTable
WHERE ID NOT IN
(
SELECT MAX(ID)
FROM MyTable
GROUP BY DuplicateColumn1, DuplicateColumn2, DuplicateColumn3)

Watch the view to see the above concept in action:

[youtube=http://www.youtube.com/watch?v=ioDJ0xVOHDY]

Reference : Pinal Dave (https://blog.sqlauthority.com)

Duplicate Records, SQL Scripts
Previous Post
SQL SERVER – T-SQL Script to find the CD key from Registry
Next Post
SQL SERVER – QUOTED_IDENTIFIER ON/OFF and ANSI_NULL ON/OFF Explanation

Related Posts

450 Comments. Leave new

  • Imran Mohammed
    May 7, 2009 8:36 am

    @Domenic

    I don’t think, I understood your question completely,

    I think you asked, why do you get this error:
    error message from sql telling me it cannot insert the value NULL into column “id” column does not allow nulls (i have got this after trying to set the primary key on the id field)

    Try this,
    — Step 1: Create a table
    CREATE TABLE example1 ( id INT NOT NULL , ename VARCHAR(10))

    — Step 2 : Insert some data in this table, few duplicate records
    INSERT INTO example1 VALUES( 1, ‘imran’)
    INSERT INTO example1 VALUES( 1, ‘imran’)
    INSERT INTO example1 VALUES( 2, ‘imran’)
    INSERT INTO example1 VALUES( 2, ‘imran’)
    INSERT INTO example1 VALUES( 3, ‘imran’)

    — Step 3: lets add Identity Column
    ALTER TABLE example1
    ADD Iden INT IDENTITY

    — Step 4: Lets remove duplicate records.
    DELETE
    FROM example1
    WHERE Iden NOT IN
    (
    SELECT MAX(Iden)
    FROM example1
    GROUP BY id, ename)

    — Check if duplicate records still exists ?
    SELECT * FROM example1

    — Step 6: drop Identity column, which we added earlier.
    ALTER TABLE example1
    DROP COLUMN Iden

    — Step 7: Lets make id column Primary key.
    ALTER TABLE example1
    ADD CONSTRAINT PK_Example1 PRIMARY KEY (id)

    This works well with no issues.

    if you replace step 1: with below script
    — step1 (a) : Create a table

    CREATE TABLE example1 ( id INT , ename VARCHAR(10))

    if you repeat from step 2 – Step 7, step7 will fail, because id column in example1 table allows Null as you can see, I did not create that column with NOT NULL constraint, that is why you cannot make that column as primary key.

    Does this answer your question ? If not please be clear what you actually want to ask ?

    ~ IM.

    Reply
  • hello

    thanks for this script

    Reply
  • Table1 Table2

    ID Sim_No Sim_No Plate
    1 2203080 2203080 AD2345
    2 2203081
    3 2203082

    How I delete The Sim No From Table1(Same Record In Table2 ) ( If I Take any No From Table1 & put In To Table 2)

    Reply
  • Brian Tkatch
    May 12, 2009 9:46 pm

    @Jerry

    Whatever routine is used to INSERT the data INTO Table2, can also DELETE from Table1.

    Or, for a general solution (which may not be very quick) DELETE any records in Table1 that EXIST in Table2:

    DELETE FROM Table1 WHERE EXISTS (SELECT * FROM Table2 WHERE Table2.Sim_No = Table1.Sim_No)

    Reply
  • ItemTable

    ItemNo BatchNo Qty
    11 101 100
    11 102 500
    12 101 100
    12 101 500
    13 105 1000
    11 101 600

    Now I want like this
    ItemNo BatchNo Qty
    11 101 700
    11 102 500
    12 101 600
    13 105 1000

    how to write query for this
    the above formate has Sample record in our project it have 35000 record

    Reply
  • Brian Tkatch
    May 19, 2009 7:06 pm

    @tamilselvan

    Do you want a regular GROUP BY?

    SELECT ItemNo, BatchNo, SUM(Qty)
    GROUP BY ItemNo, BatchNo;

    Reply
  • Imran Mohammed
    May 19, 2009 7:51 pm

    create table example1 (ItemNo int ,BatchNo int ,Qty int)

    insert into example1 values (11 ,101, 100)
    insert into example1 values (11, 102, 500)
    insert into example1 values (12, 101, 100)
    insert into example1 values (12, 101, 500)
    insert into example1 values (13, 105, 1000)
    insert into example1 values (11, 101, 600)

    select ItemNo
    ,BatchNo
    ,Sum(Qty) Qty
    from example1
    group by ItemNo
    ,BatchNo

    Reply
  • I have a few records that are duplicates, all I need to do is mark them as duplicates.

    select field01 from table01
    field01
    AA
    AB
    AC
    AB
    AA
    AD

    All that I require is
    select field01, mark01 from table01 to return
    field01 ; mark01
    AA; 1
    AB; 1
    AC; 1
    AB; 2
    AA; 2
    AD; 1

    Reply
  • Brian Tkatch
    June 8, 2009 9:45 pm

    @gpshiburaj

    ROW_NUMBER() does that:

    WITH
    Data(field01)
    AS
    (
    SELECT ‘AA’ UNION ALL
    SELECT ‘AB’ UNION ALL
    SELECT ‘AC’ UNION ALL
    SELECT ‘AB’ UNION ALL
    SELECT ‘AA’ UNION ALL
    SELECT ‘AD’
    )
    SELECT
    field01,
    ROW_NUMBER() OVER(PARTITION BY field01 ORDER BY field01)
    FROM
    Data;

    Reply
  • Thanks Brian,
    I was thinking that we would have to use a cursor for this, using row_number is easier.

    Reply
  • Hello Pinal,

    Is there any procedure to restore deleted records from particular table in database?

    Thank you
    Amit Jain

    Reply
  • Hello sir,

    You r genious, I love your articles.

    thanks

    Reply
  • Hey Guys thats all fine . U can store the values in the Temp table remove the duplicate records and empty the original table . Then move all the unique records to the original table . But its a lengthy process suppose i am having 2 million records in a table and from that table if i want to ren=move the duplicate records then it would really affect the performance . Sorry Since i am not having so much of exp and if i had hurted someone . Pinal Sir do let me know if i am wrong

    Reply
  • aasim abdullah
    July 13, 2009 11:56 am

    excellent post, Senthilnathan comments really helped me to delete duplicate rows with most appropriate way.

    Reply
  • Thanks for your help man!!!

    It helps me a lot!!!

    Greats from Lima, Perú.

    Reply
  • TABLE- MASTER,W_CODE IS COLUMN NAME.

    DELETE FROM MASTER WHERE W_ID IN(SELECT MAX(W_ID) FROM MASTER WHERE W_CODE IN(SELECT W_CODE FROM MASTER GROUP BY W_CODE HAVING COUNT(W_CODE)>1))

    Reply
  • siddique ahmad
    August 4, 2009 5:22 pm

    Thanks a lot this really help me in removing the duplicate invoice added due to my code

    Reply
  • Hello Pinal,

    Could you please help me? ..When I ran my script it removed both of the records. Could you please advice me why?
    I do not have index in that table.

    Thank you!

    drop table dupcalref;
    create temporary table dupcalref
    SELECT MAX(calownerid) as id FROM calref
    group by calownerid, calenderid having count(calownerid)>1;

    alter table dupcalref add index(id);

    delete FROM calref where calownerid in
    (select id from dupcalref);

    Reply
  • @Zoltan

    Remove “calownerid” from the GROUP BY clause.

    Reply
  • Brian,

    Thank you for your help! Worked like a charm!

    Thanks!

    Reply

Leave a Reply