SQL SERVER – Observation – Effect of Clustered Index over Nonclustered Index

February 4, 2009

SQL, SQL Performance, SQL Puzzle, SQL Server, SQL Tips and Tricks

Today I came across very interesting observation while I was working on query optimization. Let us run the example first. Make sure to to enable Execution Plan (Using CTRL + M) before running comparison queries.

USE [AdventureWorks]
GO
/* */
CREATE TABLE [dbo].[MyTable](
[ID] [int] NOT NULL,
[First] [nchar](10) NULL,
[Second] [nchar](10) NULL
) ON [PRIMARY]
GO
/* Create Sample Table */
INSERT INTO [AdventureWorks].[dbo].[MyTable]
([ID],[First],[Second])
SELECT 1,'First1','Second1'
UNION ALL
SELECT 2,'First2','Second2'
UNION ALL
SELECT 3,'First3','Second3'
UNION ALL
SELECT 4,'First4','Second4'
UNION ALL
SELECT 5,'First5','Second5'
GO

Now let us create nonclustered index over this table.

/* Create Nonclustered Index over Table */
CREATE NONCLUSTERED INDEX [IX_MyTable_NonClustered]
ON [dbo].[MyTable]
(
[First] ASC,
[Second] ASC
) ON [PRIMARY]
GO

Run following two queries together.

/* Run following two queries together and observe the
result in by Enabling Actual Execution Plan (CTRL + M)
1st Query will use Table Scan
2nd Query will use Index Seek
*/
SELECT ID
FROM [MyTable]
WHERE First = 'First1' AND Second = 'Second1'
SELECT Second
FROM [MyTable]
WHERE First = 'First1' AND Second = 'Second1'
GO

It is clear from query that index applies to columns on which it is created. In our case as in WHERE condition we have same columns which are used in Index.

Now create Clustered Index over the same table.

/* Create Clustered Index over Table */
CREATE CLUSTERED INDEX [IX_MyTable_Clustered]
ON [dbo].[MyTable]
(
[ID] ASC
) ON [PRIMARY]
GO

Once again run above two same query and see the execution plan.

/* Run following two queries together and observe the
result in 1st Query will use Index Seek
2nd Query will use Index Seek
*/
SELECT ID
FROM [MyTable]
WHERE First = 'First1' AND Second = 'Second1'
SELECT Second
FROM [MyTable]
WHERE First = 'First1' AND Second = 'Second1'
GO

Clean up the database by running following script.

/*Clean up */
DROP TABLE [dbo].[MyTable]
GO

Interesting part of above execution plan is now both queries are using nonclustered index scan. Logically first query should have not used index which is for second query as it was retrieving the column which was not in the nonclustered index. However, it did used the nonclustered index and only difference between our first execution and second execution is that we have created clustered index over the column which is retrieved in the first query.

The question is : The question is why this has happened?

In summary : A query which is not using nonclustered index to retrieve results used nonclustered index when clustered index created on the column which is retrieved.

The reason for this happening is that every nonclustered index refers to clustered index internally. When clustered index is created on table it reorganizes the table in the physical order of the clustered index. When there is no clustered index created on table at that time all nonclustered index points to data in the table to retrieve the data, however once clustered index is created all the nonclustered indexes are reorganized and they point to clustered index. This effect is creating index seek operation on nonclustered index in our case as column on which clustered index is created is in SELECT clause and WHERE clause contains columns which are used in nonclustered index.

Let me know what do you think about this article. It may be possible I have not explained this problem properly so I suggest if my readers can rewrite this part of problem and send it to me and I can include their documentation here.

Reference : Pinal Dave (https://blog.sqlauthority.com)

SQL Constraint and Keys, SQL Index, SQL Scripts

SQLAuthority News – Download SQL Server 2008 System Views Poster – PDF – A Wall Poster

SQLAuthority News – Two Promotion to Help Community

9 Comments. Leave new

HG Humphrey
February 5, 2009 3:01 am

With the NC index and without the Clustered index, the 1st query is a table scan and the 2nd query has an index seek.

When I add the clustered index, both queries are an index seek.

According to your documentation, both queries do an index seek with the NC index and without the CL index.

Reply
Ashish Jain
February 5, 2009 12:41 pm

I think there is some modification in the documnetation after Clean up code….

Interesting part of above execution plan is now both queries are using nonclustered index SEEK (in place of scan)

Reply
amit
February 6, 2009 1:41 am

hmm ….Interesting ……

as per your article … IMHO

observation: non-clustered index seek when a clustered index present, else a table scan

conclusion: when clustered index is introduced, it forces non-clustered index to map itself to Clustered index (internally) hence it’s seek, else it is independent.

assumption: some relation has to be present between Clustered and Non-Clustered index.

I am not sure if the conclusion is correct based on assumption. I would love to test, but dont have the time right now …… We need to take into account the cost of a table scan, non-clustered index seek etc … by the Query optimizer. Selectivity of an index generally plays a role also.

Has anybody tried this test on a table with say a 1000 rows instead of 5 ???? with almost unique data in last two columns ??

Reply
Nikhil
February 7, 2009 2:58 am

The reason for this behavior is :-

1. When you create a non-clust on a heap, it creates a B-tree with each node containing the [ non-clust keys + rowid ( i.e. the location of the data record )].

2. When you create a clust and a non-clust index on a table, it again creates a B-tree but with each node having [non-clust Key + clust key ] . Hence it did a seek, also can be termed as covering index in this case.

When you say non-clust index points to the clustered index that means its a bookmark lookup ( in 2000 terminology ).

Reply
vego
September 13, 2009 11:14 pm

when we create a clustered index on a table it stores the data in the index table.
my question is, is it stores entire row data of all columns or only data of the column on which we created clustered index?

by using index, how it is going to fastren the retrieving data from a table?

in noncluster index, it does not stores the column data on which we created nc index.

my question is, how it is going to retrieve the data fast when we use nc index column in select and where condition?

i have little bit confusion on this topic. Please anybody clear my confusion. i am very greateful to u in this regard.

thank you,
vego

Reply
Brian Tkatch
September 14, 2009 6:10 pm

@vego

A CLUSTERED INDEX stores the data of the TABLE itself according to the INDEX.

A non-CLUSTERED INDEX points to the CLUSTEREd INDEX.

IIUC, the speed increase for a CLUSTERED INDEX is when multiple records are returned that are consecutive in the INDEX. If the INDEX is CLUSTERED, the retrieval will be faster because it only has to check one area of the file, possibly one disk seek. A non-CLUSTERED INDEX is likely to have the data all over the data file, requiring more time for data retrieval.

Reply
Sajitha
April 11, 2010 6:40 pm

Hi All,
this is my question

Query 1:
SELECT d.*,e.* FROM Department d ,Employee e ;

This has clustered index scan on Employee and Department

Query 2:

SELECT d.*,e.* FROM Department d ,Employee e where d.dno=e.dno ;

this has clustered index scan on Employee and clustered index seek on department. But the cost is greater than query 1 .

Pleease explain.

regards
sajitha

Reply
Ashok Kumar
July 7, 2010 5:36 pm

Hi Pinal,

Can you explain how clustered index work ? & how to fetch the data from the table when table have clustered index ?

Regards,
Ashok

Reply
Anil Singh
August 25, 2010 1:22 pm

Hi Pinal,

Can you explain which time we use clustered index & which time we use non-clustered index & why?

Regards
Anil Singh

Reply