Today I came across very interesting observation while I was working on query optimization. Let us run the example first. Make sure to to enable Execution Plan (Using CTRL + M) before running comparison queries.
USE [AdventureWorks] GO /* */ CREATE TABLE [dbo].[MyTable]( [ID] [int] NOT NULL, [First] [nchar](10) NULL, [Second] [nchar](10) NULL ) ON [PRIMARY] GO /* Create Sample Table */ INSERT INTO [AdventureWorks].[dbo].[MyTable] ([ID],[First],[Second]) SELECT 1,'First1','Second1' UNION ALL SELECT 2,'First2','Second2' UNION ALL SELECT 3,'First3','Second3' UNION ALL SELECT 4,'First4','Second4' UNION ALL SELECT 5,'First5','Second5' GO
Now let us create nonclustered index over this table.
/* Create Nonclustered Index over Table */ CREATE NONCLUSTERED INDEX [IX_MyTable_NonClustered] ON [dbo].[MyTable] ( [First] ASC, [Second] ASC ) ON [PRIMARY] GO
Run following two queries together.
/* Run following two queries together and observe the result in by Enabling Actual Execution Plan (CTRL + M) 1st Query will use Table Scan 2nd Query will use Index Seek */ SELECT ID FROM [MyTable] WHERE First = 'First1' AND Second = 'Second1' SELECT Second FROM [MyTable] WHERE First = 'First1' AND Second = 'Second1' GO
It is clear from query that index applies to columns on which it is created. In our case as in WHERE condition we have same columns which are used in Index.
Now create Clustered Index over the same table.
/* Create Clustered Index over Table */ CREATE CLUSTERED INDEX [IX_MyTable_Clustered] ON [dbo].[MyTable] ( [ID] ASC ) ON [PRIMARY] GO
Once again run above two same query and see the execution plan.
/* Run following two queries together and observe the result in 1st Query will use Index Seek 2nd Query will use Index Seek */ SELECT ID FROM [MyTable] WHERE First = 'First1' AND Second = 'Second1' SELECT Second FROM [MyTable] WHERE First = 'First1' AND Second = 'Second1' GO
Clean up the database by running following script.
/*Clean up */ DROP TABLE [dbo].[MyTable] GO
Interesting part of above execution plan is now both queries are using nonclustered index scan. Logically first query should have not used index which is for second query as it was retrieving the column which was not in the nonclustered index. However, it did used the nonclustered index and only difference between our first execution and second execution is that we have created clustered index over the column which is retrieved in the first query.
The question is : The question is why this has happened?
In summary : A query which is not using nonclustered index to retrieve results used nonclustered index when clustered index created on the column which is retrieved.
The reason for this happening is that every nonclustered index refers to clustered index internally. When clustered index is created on table it reorganizes the table in the physical order of the clustered index. When there is no clustered index created on table at that time all nonclustered index points to data in the table to retrieve the data, however once clustered index is created all the nonclustered indexes are reorganized and they point to clustered index. This effect is creating index seek operation on nonclustered index in our case as column on which clustered index is created is in SELECT clause and WHERE clause contains columns which are used in nonclustered index.
Let me know what do you think about this article. It may be possible I have not explained this problem properly so I suggest if my readers can rewrite this part of problem and send it to me and I can include their documentation here.
Reference : Pinal Dave (https://blog.sqlauthority.com)
9 Comments. Leave new
With the NC index and without the Clustered index, the 1st query is a table scan and the 2nd query has an index seek.
When I add the clustered index, both queries are an index seek.
According to your documentation, both queries do an index seek with the NC index and without the CL index.
I think there is some modification in the documnetation after Clean up code….
Interesting part of above execution plan is now both queries are using nonclustered index SEEK (in place of scan)
hmm ….Interesting ……
as per your article … IMHO
observation: non-clustered index seek when a clustered index present, else a table scan
conclusion: when clustered index is introduced, it forces non-clustered index to map itself to Clustered index (internally) hence it’s seek, else it is independent.
assumption: some relation has to be present between Clustered and Non-Clustered index.
I am not sure if the conclusion is correct based on assumption. I would love to test, but dont have the time right now …… We need to take into account the cost of a table scan, non-clustered index seek etc … by the Query optimizer. Selectivity of an index generally plays a role also.
Has anybody tried this test on a table with say a 1000 rows instead of 5 ???? with almost unique data in last two columns ??
The reason for this behavior is :-
1. When you create a non-clust on a heap, it creates a B-tree with each node containing the [ non-clust keys + rowid ( i.e. the location of the data record )].
2. When you create a clust and a non-clust index on a table, it again creates a B-tree but with each node having [non-clust Key + clust key ] . Hence it did a seek, also can be termed as covering index in this case.
When you say non-clust index points to the clustered index that means its a bookmark lookup ( in 2000 terminology ).
when we create a clustered index on a table it stores the data in the index table.
my question is, is it stores entire row data of all columns or only data of the column on which we created clustered index?
by using index, how it is going to fastren the retrieving data from a table?
in noncluster index, it does not stores the column data on which we created nc index.
my question is, how it is going to retrieve the data fast when we use nc index column in select and where condition?
i have little bit confusion on this topic. Please anybody clear my confusion. i am very greateful to u in this regard.
thank you,
vego
@vego
A CLUSTERED INDEX stores the data of the TABLE itself according to the INDEX.
A non-CLUSTERED INDEX points to the CLUSTEREd INDEX.
IIUC, the speed increase for a CLUSTERED INDEX is when multiple records are returned that are consecutive in the INDEX. If the INDEX is CLUSTERED, the retrieval will be faster because it only has to check one area of the file, possibly one disk seek. A non-CLUSTERED INDEX is likely to have the data all over the data file, requiring more time for data retrieval.
Hi All,
this is my question
Query 1:
SELECT d.*,e.* FROM Department d ,Employee e ;
This has clustered index scan on Employee and Department
Query 2:
SELECT d.*,e.* FROM Department d ,Employee e where d.dno=e.dno ;
this has clustered index scan on Employee and clustered index seek on department. But the cost is greater than query 1 .
Pleease explain.
regards
sajitha
Hi Pinal,
Can you explain how clustered index work ? & how to fetch the data from the table when table have clustered index ?
Regards,
Ashok
Hi Pinal,
Can you explain which time we use clustered index & which time we use non-clustered index & why?
Regards
Anil Singh