SQL SERVER – Primary Key and NonClustered Index in Simple Words

February 10, 2013

I have been writing a weekly round up from my blog where I go over last six years of blog posts and pick the best posts from the pasts. While I do this, there are two major place where I focus 1) If there are change in features – I re-blog about it with additional details or 2) If I have not provided complete information six years ago, I try to fill up the gap now. Well, just like everything my knowledge and writing skills have evolved. Before continuing please read my latest memory lane blog post where in 2007 I wrote scripts for Primary Key and Unique Key.

November 2006 was when I started to learn more about SQL and have been only 4 months in the product, I was still exploring various subjects. I wrote a blog post describing about how Primary Key and Unique Key are different. Everything which I wrote there is correct, however, there are a few more details one should learn when it is about Primary Key and Clustered Index.

Here is the common misconception prevailing in the industry.

Primary Key has to be Clustered Index.

In reality the statement should be corrected as follows:

Primary Key can be Clustered or Non-clustered but it is a common best practice to create a Primary Key as Clustered Index.

Well, now we have corrected the statement let us understand a bit more in detail. Primary Key should be uniquely identifying column of the table and it should be NOT NULL. A good (most of the time) candidate of the clustered index key also uniquely identifies column and NOT NULL (most of the time). Well, that means it is a good idea to create a Primary Key Clustered so it solve both the problems together. Keeping these facts in mind SQL Server automatically creates Clustered Index on the Primary Key when the table is created. Developers often do not specify which column should have clustered index so by default Primary Key becomes Clustered Index. This practice is now extremely common and lots of people have forgotten that Primary Key and Clustered Index is two different things. They can be same column but they do not have to be.

Well here are four examples we will see where we will learn the behavior of SQL Server when it is about Primary Key and Clustered Index.

Scenario 1 : Primary Key will default to Clustered Index
Scenario 2: Primary Key is defined as a Non-clustered Index
Scenario 3: Primary Key defaults to Non-Clustered Index with another column defined as a Clustered Index
Scenario 4: Primary Key defaults to Clustered Index with other index defaults to Non-clustered index

Now let us see each of the scenarios in detail.

Scenario 1 : Primary Key will default to Clustered Index

In this case we will create only Primary Key and when we check the kind of index created on the table we will notice that it has created clustered index automatically over it.

-- Case 1 Primary Key Defaults to Clustered Index USE TempDB GO -- Create table CREATE TABLE TestTable (ID INT NOT NULL PRIMARY KEY, Col1 INT NOT NULL) GO -- Check Indexes SELECT OBJECT_NAME(OBJECT_ID) TableObject, [name] IndexName, [Type_Desc] FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TestTable' GO -- Clean up DROP TABLE TestTable GO

Scenario 2: Primary Key is defined as a Non-clustered Index

In this case we will explicitly defined Primary Key as a non-clustered index and it will create it as a non-clustered index. It proves that Primary Key can be non-clustered index.

-- Case 2 Primary Key Non-clustered Index USE TempDB GO -- Create table CREATE TABLE TestTable (ID INT NOT NULL PRIMARY KEY NONCLUSTERED, Col1 INT NOT NULL) GO -- Check Indexes SELECT OBJECT_NAME(OBJECT_ID) TableObject, [name] IndexName, [Type_Desc] FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TestTable' GO -- Clean up DROP TABLE TestTable GO

Scenario 3: Primary Key defaults to Non-Clustered Index with another column defined as a Clustered Index

In this case we will create clustered index on another column, SQL Server will automatically create a Primary Key as a non-clustered index as clustered index is specified on another column.

-- Case 3 Primary Key Defaults to Non-clustered Index USE TempDB GO -- Create table CREATE TABLE TestTable (ID INT NOT NULL PRIMARY KEY, Col1 INT NOT NULL UNIQUE CLUSTERED) GO -- Check Indexes SELECT OBJECT_NAME(OBJECT_ID) TableObject, [name] IndexName, [Type_Desc] FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TestTable' GO -- Clean up DROP TABLE TestTable GO

Scenario 4: Primary Key defaults to Clustered Index with other index defaults to Non-clustered index

In this case we will create two indexes on the both the tables but we will not specify the type of the index on the columns. When we check the results we will notice that Primary Key is automatically defaulted to Clustered Index and another column as a Non-clustered index.

-- Case 4 Primary Key and Defaults USE TempDB GO -- Create table CREATE TABLE TestTable (ID INT NOT NULL PRIMARY KEY, Col1 INT NOT NULL UNIQUE) GO -- Check Indexes SELECT OBJECT_NAME(OBJECT_ID) TableObject, [name] IndexName, [Type_Desc] FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TestTable' GO -- Clean up DROP TABLE TestTable GO

SQL SERVER - Primary Key and NonClustered Index in Simple Words clpk4

I think above examples clarifies if there are any confused related to Primary and Clustered Index.

Now here is the question I often get asked what can be the reason for creating Primary Key and Clustered Index Key on different columns. Well, there are many scenarios when this can be true. It is possible that you have column SSN which you want to create as a Primary Key but do not want make it as a clustered index key because you have uniquely increasing identity column which best suites your need for that table (again this is just an example – you can argue exactly the opposite way as well). You are welcome to continue the discussion on this subject in the comments field or a dedicated blog post I wrote about it years ago over here. There are few really good comments there – I think that blog post is now a gold mine to understand this concept.

Reference : Pinal Dave (https://blog.sqlauthority.com)

SQL Index, SQL Scripts

SQL SERVER – Weekly Series – Memory Lane – #015

SQL SERVER – What is Semantics Model – A Simple Explanation

43 Comments. Leave new

Arunraja (Rocket)
February 10, 2013 12:35 pm
Hi Pinal,
Let me add some points.
The clustered index will store the value in the leaf node and its corresponding data columns in the same data page.
Ex:
My Employee id is 0087 and it is Clustered index column. Whenever I search the table with employee id, it will easily get all my information like name,Date of joining etc…. because they all stored in the same row of the employee id in the table as so the data page since it is clustered.
NON-Clustered Index.
If you have assigned a column as Non-clustered index. The column values were randomly split in the data pages and thus searching will be little slower compare to the clustered index which has every column in assigned to it in a same data page.
Ex:
If we assign the column ‘Code’ as non-clustered index in a dummy table. Consider we have more than 5000 values with code ‘500’. They will split them into different data pages. So if you search using ‘where code=’500”. The search will finds them in all of the data pages and its remaining column results, because the page reads will be for every data pages.
For the above example say if the column ‘Code’ is clustered index in a dummy table. The values will stored in the data pages in same order and it will store all the ‘code’ column values within few data pages, so the page reads will be minimal and thus the data retrieval will be quicker as comparing to the non-clustered index.
So it is not a must to add Primary key as clustered.
Thanks,
AR
Reply
- Kavitha
  February 11, 2013 8:00 pm
  I have a doubt. please clarify. what you meant data pages here?
  I think ‘tables’.
  thanks
  Kavitha
  Reply
  - Siva
    September 1, 2015 4:11 am
    The data in the tables will be in the Data pages.
Imran Mohammed
February 10, 2013 9:16 pm
Another common misconception most people have is “Creating a primary key on a table makes reads fast”
I want to clarify, Primary key is a constraint, it has nothing to do with fast reads. Like Pinal explained, “Primary Key” behind the scenes creates a clustered Index which makes the reads fast. Optionally we can also create non clustered for primary key which contributes to faster reads.
Finally, to compare indexes and Primary Key. You can create an index without a primary key but you cannot create a primary key without an index.
~ IM.
Reply
Rajmendra
February 11, 2013 8:39 am
Hi Pinal and frnds,
that was a nice Explanation thank you, i have a question i always face this the project i am running now major FMCG Project. what problem i face every month, i have created index on my sales table, where sale has to be entered in bulk at end of the month, so we can generate Reports and all, but when i try to insert bulk data in table, it get slows because i have indexs on table, what is the best practice to solve this problem.
I do it by disabling index for the current time is it right?
Thank you,
Rajmendra
Reply
rahul agrawal
February 11, 2013 10:06 pm
when have to make clustered with primary key and non cluster & vice versa…
Reply
Derik Hammer
February 12, 2013 8:17 pm
One additional example of why you would want to have a PK and clustered index on different columns is if you have an insert heavy table with a GUID as your identifying column. GUIDs are non-sequential so having a clustered index on a GUID column can cause some significant performance hindrances due to the clustered index insert. This can also cause index fragmentation. In one particular system at my company we made the GUID the PK and created a clustered index on InsertDate DESC. This allowed us to quickly query the latest records due to the clustered sorting and removed the performance issue of the GUID. Even if the InsertDate is not unique, the clustered index will add a “uniquifier” in the index.
Reply
Maulik S
June 17, 2013 1:56 pm
Hie frnz,
Indeed a good post…I jst have a query..what if my table doesnt have any primary key,
then will my non clustered index work ??? plz let me know as m a new learner..
Thanks in advance,,
Reply
kumar
July 27, 2013 12:22 am
hi frndz
I have 3.2 lak data with repeats how I identify duplicate no’s. but one condition using primary key through. I already try ed.
1. I was created empty table & i set the Primary key ( column name “Mobile_Number)
2. I insert the data using Insert into statement ( Insert into file2 select * from file1) it was successfully uploaded after that I was raised again Insert statement ( insert into file2 select * from file3 but file not appending the data. I checked the second data they have 2 mobile numbers are duplicated.
What I’m saying If duplicated the mobile numbers in skip the mobile number and remaining data how to append the data
can u give me help………
pls send me [email removed]
thanks
raj
Reply
Raj
August 21, 2013 4:02 pm
what is the procedure to insert rowws into a billion rows table
Dropt the Pk of target table
drop the clustered index
drop all non clustered index
insert the rows from source to target
Recreate Pk
Recreate clustered and non clustered index
Please let me know whether I am correct
Reply
Raj
August 21, 2013 4:03 pm
Please correct me if I am wrong
Reply
Anoop
October 17, 2013 4:07 pm
HI,
I have defined column1 as primary key and it defaulted it to clustered index.column2 as non clustered index.
when i do a select * from table by default it is using the primary key index,i want the query to use the index of column2. How do i do it?And why does it uses primary key index by default.Please help me.
Reply
Tom
October 29, 2013 10:43 pm
Hi
Query optimizer should use the appropriate index, however, if you want to force it, you can use the following syntax..
SELECT Column lst..
FROM Table1
WITH (INDEX(idx_Table1_2))
Thx.
Reply
ankush
November 8, 2013 12:39 pm
hi i like your blog
Reply
Vasu
January 14, 2014 11:35 pm
Hi Pinal,
I have one question :- I know that a table can have only one Primary Key. I used to think the reason behind that is with a Primary Key a clustered index is always there and since clustered index can have only one physical sorting, there fore only one Primary Key.
But not if PK can have nonclustered index then why only one PK?
Thanks
Vasu
Reply
- ashish nasre
  February 12, 2014 12:04 pm
  Hi Vasu,
  I would like to discuss on this,as we know the primary key is belong to physical data storage of table ,that mean in which manner the data should be stored.
  and sql server allow us to create only one primary key for a table,because whenever we create primary key on table ,it create clustered index internally that mean in simple language we can say that primary key is nothing but clustered index on table that we are creating while creating table or altering table(simple way to create clustered index) and as we know the clustered index arrange data physically in memory.
  we can arrange data only once physically.
  and as per theory we can arrange data physically only once that mean we can create only one clustered index on single table and because of this we have only one primary key on a table.
  As per me that’s a logic behind one primary key on table.
  if you have other logic than this please let me know.
  Many Thanks….
  Reply
Vasu
January 14, 2014 11:43 pm
Hi Pinal,
One more question :- In Case 2 Primary Key Non-clustered Index, what is role of Heap?
Thanks
Vasu
Reply
ashish nasre
February 6, 2014 5:38 pm
Hi Friends,
Apologies for this type of questions but just I want to make clear about exact physical data storing structure about index.
As we have multiple type of index like clustered index, nonclustered index, unique clustered index, unique nonclustered index,index with include option(2008), filter index(2005), index view and column stored index(2012).
as we know the whenever we create column stored index on table by considering columns, sql server engine re-manage the data as per column per page.
As we know the clustered index is physical index that mean it arrange the data physically as per index key by keeping index data with key(data pages) and we can create only one clustered index and multiple non clustered index in database on single table.
I am agree with this but what happened when we will create columns stored index on table having clustered index. then how it arrange the data as per column in single page and how it keep the data values(data pages with index key).
Please suggest me on this.
Many Thanks…………….
Reply
Manu
February 21, 2014 7:00 am
How is the data stored and sorted in BTree for Scenario 2
Reply
Kishore Bussa
May 2, 2014 2:23 pm
Yeah i can understand that how it works.But i got a question how to write this query in sql
Create Table: CREATE TABLE `purchases_index` (
`transactionid` bigint(20) NOT NULL AUTO_INCREMENT,
`dateandtime` datetime DEFAULT NULL,
`cashregisterid` int(11) NOT NULL,
`customerid` int(11) NOT NULL,
`productid` int(11) NOT NULL,
`price` float NOT NULL
) ENGINE=TOKUDB AUTO_INCREMENT=40000001 DEFAULT CHARSET=latin1
which is in mysql
Reply
Aarthi
October 16, 2014 7:21 pm
I never seen a explanation like this… I really thank you to give a blog like this which make us to understand in an easier way… I don’t know anything about index before reading this but in the first read itself i got the complete idea and knowledge about it.. Hats off to you knowledge to give a notes in this way..
Reply
ashish nasre
December 23, 2014 7:19 pm
Hi Pinal,
As we know whenever we create any clustered index on any of the column , it arrange the complete data of the table physically in sorted order by index key.
And clustered index create as a separate physical object of database.
As data stored in database as 8KB of pages whenever we create any table.my confusion is that, how the clustered index create and how it map with this pages. can clustered index create new pages or use old pages that was created while creating table if clustered index use pages to sort the data as per index key.
If clustered index create new pages then its size should be equal to table size but that is not a thing.
If clustered index do not crate new pages and use old pages then its size should be negligible but that is also not the thing. And if it use old pages then what happened if multiple rows use single page while creating table(how index arrange the data in sorted order as per index kay).
How clustered index get created and how It uses the pages while sorting the data.
Please suggest me on the same.
Reply
Seela Chowdaiah
February 20, 2015 12:10 pm
Hi,
This is chowdaiah nice explanation recently i’am attend one interview they will ask this question but i don’t know now i have complete idea about that , so thanks for your explanation
Thanks& Regards
Seela Chowdaiah.
Reply
- Pinal Dave
  February 23, 2015 10:08 am
  I am glad it helped you @Chowdaiah
  Reply