Note: This article is re-write of my previous article SQL SERVER – Observation – Effect of Clustered Index over Nonclustered Index. I have received so many request that re-write it as it is little confusing. I am going to re-write this with simpler words.
Query optimization is one art which is difficult to master. Just like any other art this requires creativity and imagination as well understanding of subject matter. Let us look at interesting observation which I came across.
First of all download the script from here and run it in SSMS.
Now enable Execution Plan (Using CTRL + M) in SSMS before running the script.
The simple objective of this whole exercise is to understand how clustered index and nonclustered indexes are associated with each other.
In our example we have one query which is not using any index. On the same table there is already nonclustered index created, which is also not being used. Now when we created clustered index on the same table, our query suddenly started to use nonclustered index which was so far it is not using. The interesting part of this is that query is using nonclustered index when clustered index is created on the same.
Now let us test the same thing with example.
USE [AdventureWorks]
GO
/*Â */
CREATEÂ TABLE [dbo].[MyTable](
[ID]Â [int] NOTÂ NULL,
[First]Â [nchar](10)Â NULL,
[Second]Â [nchar](10)Â NULL
) ON [PRIMARY]
GO
/* Create Sample Table */
INSERTÂ INTO [AdventureWorks].[dbo].[MyTable]
([ID],[First],[Second])
SELECT 1,'First1','Second1'
UNION ALL
SELECT 2,'First2','Second2'
UNION ALL
SELECT 3,'First3','Second3'
UNION ALL
SELECT 4,'First4','Second4'
UNION ALL
SELECT 5,'First5','Second5'
GO
Now let us create nonclustered index over this table.
/* Create Nonclustered Index over Table */
CREATEÂ NONCLUSTEREDÂ INDEX [IX_MyTable_NonClustered]
ON [dbo].[MyTable]
(
[First] ASC,
[Second] ASC
) ON [PRIMARY]
GO
Run following two queries together.
/* Run following two queries together and observe the
result in by Enabling Actual Execution Plan (CTRL + M)
1st Query will use Table Scan
2nd Query will use Index Seek
*/
SELECT ID
FROM [MyTable]
WHERE First = 'First1' AND Second = 'Second1'
SELECT Second
FROM [MyTable]
WHERE First = 'First1' AND Second = 'Second1'
GO
It is clear from query that index applies to columns on which it is created. In our case as in WHERE condition we have same columns which are used in Index.
Query 1 – Does not use any index
Query 2 – Does nonclustered index seek
Now create Clustered Index over the same table.
/* Create Clustered Index over Table */
CREATEÂ CLUSTEREDÂ INDEX [IX_MyTable_Clustered]
ON [dbo].[MyTable]
(
[ID] ASC
) ON [PRIMARY]
GO
Once again run above two same query and see the execution plan.
/* Run following two queries together and observe the
result in 1st Query will use Index Seek
2nd Query will use Index Seek
*/
SELECT ID
FROM [MyTable]
WHERE First = 'First1' AND Second = 'Second1'
SELECT Second
FROM [MyTable]
WHERE First = 'First1' AND Second = 'Second1'
GO
Query 1 – Does nonclustered index seek
Query 2 – Does nonclustered index seek
Clean up the database by running following script.
/* Clean up */
DROPÂ TABLE [dbo].[MyTable]
GO
Let us go over our steps of whole exercise.
Step 1: We have one table and one nonclustered index.
Step 2: We ran Query 1 which does not use nonclustered index.
Step 3: We created clustered index over table.
Step 4: We ran Query 1 which now use nonclustered index.
What is puzzling and interesting is how come query suddenly started to use nonclustered query when clustered index is created on table?
Query 1 should not have used index which is for second query as there is no change in that index or query1. Additionally, Query 1 is not even retrieving the column which is in nonclustered index. We created clustered index on column used in Query 1, so it should make Query 1 to use that clustered index but instead it is using nonclustered index which was already created and available to use earlier.
The question is : The question is why this has happened? If Query can use nonclustered index why did it has to wait for clustered index to be created?
Answer:
The reason for this is that every nonclustered index refers to clustered index internally. When clustered index is created on table it reorganizes the table in the physical order of the clustered index. When there is no clustered index created on table at that time all nonclustered index points to data in the table to retrieve the data, however once clustered index is created all the nonclustered indexes are reorganized and they point to clustered index. This effect is creating index seek operation on nonclustered index. In our example column on which clustered index is created is in SELECT clause and WHERE clause contains columns which are used in nonclustered index, which is creating the effect which we have observed.
Let me know what do you think about this re-written article, it should be clear now what I am trying to suggest. Again, I am looking forward to your feedback about this subject. I will be discussing this in next UG meeting which is today. Please leave a not to me and we can go over this article on goto meeting or live meeting.
Reference: Pinal Dave (https://blog.sqlauthority.com)
24 Comments. Leave new
I didn’t read your previous post, but this one is very clear, thanks
Very nice.
This could be expanded upon to make a much larger article, but for a short, to-the-point, answer provided, this is very nice.
This is not a very good example. With only 5 rows, the entire table will fit on a single database page. It takes 2-3 page reads to do the index seek and bookmark look-up (to get the id column), versus only 1 page read to do a table scan.
If the table was larger, say 100 pages, it would choose the index seek since < 5 page reads for the seek and bookmark look-up is clearly better than 100 for the table scan.
For me the “a ha” moment was when I realized that after the creation of the clustered index the 1st query used a Index Seek because the index was in fact a “covering index”, that has all the information it needs (First, Second and ID – clustered index key) to resolve the query. Also I think is will be useful to mention that we can include more key or non-key columns in our non-clustered indexes to get larger covering indexes. Of course this has to be done wisely.
I modified the table load as follows, to create 1000 rows:
/* Create Sample Table */
SET Rowcount OFF
DECLARE @cntr int
SET @cntr=0
WHILE @cntr < 1000
BEGIN
INSERT INTO [dbo].[MyTable]
([ID],[First],[Second])
VALUES (@cntr, N’First’ + cast(@cntr as nchar), N’Second’ + cast(@cntr as nchar))
SET @cntr = @cntr+1
END
The results are very different from yours.
1) After the non-clustered index is created,
* For the SELECT Second… query an Index Seek is indeed performed on the WHERE clause. and it took 2% of the total query time.
* For the SELECT ID… query, SQL used an Index Seek and joined to a RID Lookup to find ID within the result set. This took 4% of the total query time.
–So the table scan was a factor of a small table size.
2) After adding a clustered index on ID,
* For the SELECT Second… query an Index Seek is again performed on the WHERE clause. and it took 2% of the total query time.
* For the SELECT ID… query, SQL also used only an Index Seek on First,Second. This took 2% of the total query time.
— Ivan Kotev was right that this creates a covering index.
Hi Pinalkumar Dave
I have question .
I have table that contains 10,000 rows.
I am a little confusing.
For example
The table(z_menu) schema is as like the below
ID int (IDENTITY_INSERT is ON)
User_ID nvarchar
MENU_ID nvarchar
The sample data is as like the below.
1, ‘user_a’, ‘q1211’
2, ‘user_a’, ‘a1211’
3, ‘user_a’, ‘c1232’
4, ‘user_b’, ‘c1232’
5, ‘user_b’, ‘e2312’
6, ‘user_c’, ‘s3122’
7, ‘user_c’, ‘q1211’
8, ‘user_c’, ‘c1232′
I use only the query
for example
select * from z_menu where User_ID=’user_a’
You emphasize “clustered index key”.
In this case
If so.
do i create clustered index key (id,User_ID) ?
do i create index key(User_ID) ?
what is optimal index ?
please help me.
@Jinsoo,
I was going through this video. I found this video very helpful for performance tuning like where to create which type of index to get best performance. Its a free video. I strongly recommend you to watch this video.
~ IM.
Actually, I was surprised when I read your results.
So, I tried myself.
when I run the EXACT script that you ran, I get Index-Seek for both the queries. Mine is SQL server 2000 Standard edition.
Suppose there are 2 tables.
First table is customer table having a primary Cust_id
2nd table is order table having Cust_id as reference key.
There is no primary key in 2nd table (order)
Please suggest me that which type of index is suitable on 2nd table (order) ?
Hi Pinal ,
When I inserted 1000 rows into table and I run the above query in that case
first query using 50% Index Seek and 50 % RID loop (i dont know what is this)..
And after creating clustered index it have same behavior as u said (100% from non clusters index seek)
Plz explore this for first case when we have many rows in table and only we have non clusterd index i .e contradicting what u said? ..
Thanks & Regards
Rahul Bhargava
Dear Dave,
Can you let me know as to where can i find the detailed explanation of Index and its type,also the need to create index.As your explanation is pretty much good,i would like read those topics from your website.
Regards
Denison.s
USE [AdventureWorks]
GO
/* */
CREATE TABLE [dbo].[MyTable](
[ID] [int] NOT NULL,
[First] [nchar](10) NULL,
[Second] [nchar](10) NULL
) ON [PRIMARY]
GO
what is the use of the statement ” ON [PRIMARY] ” there ??
It means that the table should be stored in the PRIMARY file group
What is puzzling and interesting is how come query suddenly started to use nonclustered query when clustered index is created on table?
Query 1 should not have used index which is for second query as there is no change in that index or query1. Additionally, Query 1 is not even retrieving the column which is in nonclustered index. We created clustered index on column used in Query 1, so it should make Query 1 to use that clustered index but instead it is using nonclustered index which was already created and available to use earlier.
Explanation is quite confusing..can u re-write with some figures ?
superb explaination ….conceptually..thanks pinal…keep it up god bless u
Thanks Pinal.
I have a query here. I have a big table having more than 300,000,000 rows.
This table is Partitioned on a date column.
It has a clustered index and a non clustered index. Both indexes have same columns.
Is this a good approach. Kindly suggest.
Hi Pinal,
A quick question.
Creating a Clustered index after the non-clustered index, is it not necessary to rebuild the non-clustered index(es)?
— Mahesh
SELECT ID
FROM [MyTable]
ON FIRING ABOVE QUERY IT STILL USING NON CLUSTERED INDEX WHERE WE HAVE NOT PLACED ANY COLUMN USED IN NON CLUSTERED INDEX..MY QUESTION WHY IT IS NOT USING CLUSTERED INDEX….PLS REPLY ME FOR BETTER UNDERSTANDING….
Hi, i am using clustered index as transaction date, if i am using non clustered index it is bit slow. where as clustered index is giving best result. please help me ,
Your all articles are really very useful for us Pinal!! please carry on this awesome help to all of us .Thanks !!
Very good