SQL SERVER – Unique Nonclustered Index Creation with IGNORE_DUP_KEY = ON – A Transactional Behavior

Earlier, I had written on SQL SERVER – Unique Nonclustered Index Creation with IGNORE_DUP_KEY = ON, and I received a comment regarding when this option can be useful. On the same day, I met Jacob Sebastian—my close friend and SQL Server MVP, I discussed this question with him. During our discussion, we came up with following example.

When we have situation where we are dealing with INSERT and TRANSACTION, we can see this feature in action. Let us consider an example where we have two tables. One table has all the data and the second table has partial data. If you want to insert all the data from the first table to the second table and insert all nonduplicate values, you can use this feature.

Let us first create a table that has a duplicate record based on the FirstName.

SQL SERVER - Unique Nonclustered Index Creation with IGNORE_DUP_KEY = ON - A Transactional Behavior duptr1

Option 1: With IGNORE_DUP_KEY set OFF

Solarwinds

Once a table is created, create another table and a nonclustered index on it with the option IGNORE_DUP_KEY set OFF. Try to insert data from another table into this table by using the INSERT… SELECT statement.

SQL SERVER - Unique Nonclustered Index Creation with IGNORE_DUP_KEY = ON - A Transactional Behavior duptr2

As there are duplicate values, the insert will fail with error and not insert any single row in the table.

Option 2: With IGNORE_DUP_KEY set ON

Now, once again create new table and create nonclustered index on it with the option GNORE_DUP_KEY set ON. Try to insert data from another table into this table using INSERT… SELECT statement.

SQL SERVER - Unique Nonclustered Index Creation with IGNORE_DUP_KEY = ON - A Transactional Behavior duptr3

Even though there are duplicate values in the table, the rows in the table will be inserted; however, the duplicate values will not be inserted and ignored. A warning message will be generated with the script, but the process will not be stopped.

I hope that this example clarifies the use of this feature. Again, as I said earlier, this is a business logic related issue, and it should be used based on your need.

The complete script from the example is given here.

USE tempdb
GO
CREATE TABLE DupData (ID INT,
FirstName VARCHAR(100),
LastName VARCHAR(100),
City VARCHAR(100))
GO
INSERT INTO DupData (ID,FirstName,LastName,City)
VALUES(1,'John','Doe','LA');
INSERT INTO DupData (ID,FirstName,LastName,City)
VALUES(2,'Joe','Doe','LA');
INSERT INTO DupData (ID,FirstName,LastName,City)
VALUES(3,'John','Doe','LA');
INSERT INTO DupData (ID,FirstName,LastName,City)
VALUES(4,'Joseph','Doe','LA');
GO
-- Check data from Test Table.
SELECT *
FROM DupData
GO
-- Option 1 - With IGNORE_DUP_KEY set OFF
CREATE TABLE DupIndex (ID INT,
FirstName VARCHAR(100),
LastName VARCHAR(100),
City VARCHAR(100))
GO
--Create Unique Index on Table
CREATE UNIQUE NONCLUSTERED INDEX [IX_DupIndex_FirstName] ON [dbo].[DupIndex] (
[FirstName] ASC
) ON [PRIMARY] GO
-- Insert will throw an error
-- Error will rollback transaction
INSERT INTO DupIndex (ID,FirstName,LastName,City)
SELECT ID,FirstName,LastName,City
FROM DupData
ORDER BY ID
GO
-- Table with contain NO row
SELECT *
FROM DupIndex
GO
DROP TABLE DupIndex
GO
-- Option 2 - With IGNORE_DUP_KEY set ON
USE tempdb
GO
CREATE TABLE DupIndex (ID INT,
FirstName VARCHAR(100),
LastName VARCHAR(100),
City VARCHAR(100))
GO
--Create Unique Index on Table with IGNORE_DUP_KEY set ON
CREATE UNIQUE NONCLUSTERED INDEX [IX_DupIndex_FirstName] ON [dbo].[DupIndex] (
[FirstName] ASC
) WITH (IGNORE_DUP_KEY = ON) ON [PRIMARY] GO
-- Insert will throw warning
-- Warning will not insert duplicate row
INSERT INTO DupIndex (ID,FirstName,LastName,City)
SELECT ID,FirstName,LastName,City
FROM DupData
ORDER BY ID
GO
-- Table with contain three rows
SELECT *
FROM DupIndex
GO
-- Clean up
DROP TABLE DupIndex
DROP TABLE DupData
GO

Let me know what you think of this option and the type of situation in which you will use this.

Reference: Pinal Dave (https://blog.sqlauthority.com)

Solarwinds
, ,
Previous Post
SQL SERVER – SQL Server RDL Specification
Next Post
SQLAuthority News – Webcasts – Resources for IT Managers and their Teams

Related Posts

20 Comments. Leave new

  • Paresh Prajapati
    January 6, 2010 11:55 am

    Hi Pinal,

    Nice article.

    I have one question,

    — Insert will throw warning
    — Warning will not insert duplicate row
    INSERT INTO DupIndex (ID,FirstName,LastName,City)
    SELECT ID,FirstName,LastName,City
    FROM DupData
    ORDER BY ID
    GO

    From above query, WITH (IGNORE_DUP_KEY = ON) ON [PRIMARY] key, why ID =1 is ignored from table DupData while insertion, why not ignore ID = 3?

    Reply
    • Hi Pinal,
      I tried tweaking the same query using a clustered index and it gives a different result altogether. I am not sure but why is it that the SQL server engine works differently for different indexes. I’m not too much in sql server so I’m not sure whether I am asking the right question as clustered index on First Name sounds wierd.

      Reply
  • Hi Paresh,

    The data specific query can be answered only if you provide table design and existing data in source and destination tables.

    Regards,
    Pinal Dave

    Reply
  • Hi Pinal,
    Interesting use for this option. I always wondered how it might come in handy. The same result could be accomplished by using SELECT DISTINCT on the first table, but that would require sorting whereas the IGNORE_DUP_KEY option does not.

    Aaron

    Reply
    • Marko Parkkola
      January 7, 2010 2:03 am

      You mean like this?

      SELECT MIN(ID) AS ID, FirstName, LastName, City
      FROM DupData
      GROUP BY FirstName, LastName, City
      ORDER BY ID

      It works but it’s not very generic (or efficient) way to do it, I would say.

      Reply
  • Marko Parkkola
    January 7, 2010 1:55 am

    Oh, now I see where this might come handy :)

    But I think this is really limited situation where you’d want to copy data from one table to another. Good to know that this kind of possibility exists though.

    Reply
  • Hi,

    Shall i consider this as a new way for removing duplicate records from a table.

    Reply
    • Marko Parkkola
      January 8, 2010 4:26 pm

      Not directly but with temporary table (ordinary will do also if you clean it up afterwards) it is possible. Create temporary table with unique index with IGNORE_DUP_KEY = ON. Select all the rows from original table to temporary table. Truncate original table and select all the rows from temporary table to (now clean) original table.

      — There’s duplicate rows now
      SELECT * FROM DupData ORDER BY ID
      GO
      — Create temp table
      CREATE TABLE #DupIndex (ID INT,
      FirstName VARCHAR(100),
      LastName VARCHAR(100),
      City VARCHAR(100))
      GO
      — And index
      CREATE UNIQUE NONCLUSTERED INDEX [IX_DupIndex_FirstName]
      ON [dbo].[#DupIndex]
      (
      [FirstName] ASC
      ) WITH (IGNORE_DUP_KEY = ON) ON [PRIMARY]
      GO
      — Select rows into temp table (duplicates are ignored)
      INSERT INTO #DupIndex (ID,FirstName,LastName,City)
      SELECT ID,FirstName,LastName,City
      FROM DupData
      ORDER BY ID
      GO
      — Truncate table
      TRUNCATE TABLE DupData;
      GO
      — Select distinct rows back to table
      INSERT INTO DupData (ID,FirstName,LastName,City)
      SELECT ID,FirstName,LastName,City
      FROM #DupIndex
      ORDER BY ID
      GO
      — Now there’s now duplicate data
      SELECT * FROM DupData ORDER BY ID
      GO
      — Drop temp table
      drop table #DupIndex;
      GO

      Reply
  • I actually found use for this!

    I have a situation where I need to select data joining many tables, filter them using various rules and calculate some values out of them. I select data into table valued variable. After that I must update other tables depending what’s in that temp table.

    Now I don’t want to have duplicate entries in that temp table. It wouldn’t break anything but there could be A LOT of rows so I would have to do a lot of unnecessary updates and cause major havoc in the db with write locks.

    I figured out that I could extract distinct rows out of the temp table but that would cause some extra work elsewhere in my procedure. So instead of that I just added “with (IGNORE_DUP_KEY = ON)” to the tables primary key and it seems to work like a charm :)

    Reply
  • deepak mariswamy
    September 16, 2010 10:18 am

    awesome.. helpful.

    Reply
  • Hi,
    I have a table as Test (VouDate Date, Narration Varchar(50)) where both are PK

    Data
    ——–
    insert into test values (‘2012-10-01′,’First-1’);
    insert into test values (‘2012-10-01′,’First-2’);

    insert into test values (‘2012-11-01′,’First-1’);
    insert into test values (‘2012-11-01′,’First-2’);

    When I select (select * from test order by 1) it returns
    2012-10-01 First-1
    2012-10-01 First-2
    2012-11-01 First-1
    2012-11-01 First-2

    And when I again insert
    insert into test values (‘2012-10-01′,’First-3’);
    insert into test values (‘2012-10-01′,’First-0’);

    and select (select * from test order by 1), it returns
    2012-10-01 First-0
    2012-10-01 First-1
    2012-10-01 First-2
    2012-10-01 First-3
    2012-11-01 First-1
    2012-11-01 First-2

    I want to get the data as
    2012-10-01 First-1
    2012-10-01 First-2
    2012-10-01 First-3
    2012-10-01 First-0

    2012-11-01 First-1
    2012-11-01 First-2

    Please let me know if this is possible to sort the data by first column and second column get sorted by the insert sequence.

    Thanks,
    Yunus

    Reply
  • I have tried this without creating the Primary Key in the table, It works fine.

    Reply
  • Brian P. Hernandez
    September 14, 2012 11:55 pm

    Hi.. this is my point.. i have 200 tables with million of rows each one, we want one unique table with unique email addresses, on that unique mail table i create 2columns pk_id (incremental), and email address , then i created an unique index with ignore_dup_key = ON , everything is fine, but when that table grow.. the insert of the remaining table is extremely slow… any suggestions? of how to accelerate the insert on that huge unique column table???

    Reply
  • Hello Pinal,

    I am using threading in my .net application and which is causing duplicates in my database, so when two threads are creating duplicates at the same time, in this situation will this IGNORE_DUP_KEY = ON work successfully?

    Reply
  • I believe I’ve got the answer to the question posed in the first 1/6/10 reply in the thread… “WITH (IGNORE_DUP_KEY = ON) ON [PRIMARY] key, why ID =1 is ignored from table DupData while insertion, why not ignore ID = 3?”. I believe/guess/intuit that it is because SQL Server does NOT “ignore” subsequent duplicates, rather it “writes over” the prior record with the most recent duplicate record’s data. By changing the ORDER in which the records are inserted using ORDER BY ID “DESC” (rather than the default BY ID “ASC”), the ID=1 is kept rather than ID=3 — the former being the more recent insertion of the two. If, indeed, this can be generalized, then it might be a GREAT way to eliminate duplicate keyed entries while keeping the most recent one — just sort the to-be-inserted data in such a way that the most recent duplicate is inserted last.

    Reply
    • Ashok Maganti
      April 4, 2017 9:46 pm

      Thank you. You answered my question. Do you think, this is an efficient method( in terms of time) than getting the most recent record using the Rank functions into a temp table and inserting into the final table?

      Reply
  • Sir,could you please expain what is the difference between unique non clustered index and clusterd index.?

    Reply
  • A quick search on the site can give you a number of articles on clustered index understanding – https://blog.sqlauthority.com/2011/08/19/sql-server-tips-from-the-sql-joes-2-pros-development-series-the-clustered-index-simple-understanding-day-19-of-35/ is one of them. In clustered index the leaf note contains the data when compared to non-clustered index. A non-clustered index contains only the Key values.

    Reply

Leave a Reply

Menu