SQL SERVER – An Interesting Case of Redundant Indexes – Index on Col1, Col2 and Index on Col1, Col2, Col3 – Part 6

This is the sixth part of the series regarding Redundant Indexes. If have not read earlier part – there is quite a good chance that you will miss the context of this part. I quickly suggest you to read earlier four parts. On my production server I personally use embarcadero DB Optimizer for all performance tuning and routine health check up. I will be interested to know what is your feedback about the product.

Part 1: What is Redundant Indexes?
Conversation between Mike and Jon – where they discuss about the fundamentals of Redundant Indexes.

Part 2: Demo – What kind of Redundant Indexes are Useful with a smaller result set
Here Jon demonstrates a script where even though Redundant Indexes exist they are extremely useful and should not be dropped. The result set contains the very small result set.

Part 3: Demo – What kind of Redundant Indexes are Useful with a larger result set
Here Jon demonstrates a script where even though Redundant Indexes exist they are extremely useful and should not be dropped. The result set contains the very large result set.

Part 4: An Interesting Case of Redundant Indexes – Index on Col1, Col2 and Index on Col1, Col2, Col3 – Part 4
In this part Jon demonstrates a script where the direction or order clause of the index is different making the index behaves very differently. To quality any index redundant the order of the index should also be considered.

Part 5:  An Interesting Case of Redundant Indexes – Index on Col1 and Included Columns Col2 and Col3 – Part 5
In this part Jon demonstrates how included columns also take part in the decision making of if the column is redundant and duplicate or not.

Once again if you have not read the earlier blog post, there are good chances you will miss the core point of this blog post. I suggest you at least read the conversation between Mike and Jon to get familiar with their personalities.

Jon and Mike continue their conversation from earlier blog post.

Mike – Jon, I think I am now more clear with the redundant indexes scenario. I have understood that following indexes are not redudant indexes even though they have key column exactly same becasue of the included columns are different from each other. Am I correct to understand the same?

Table1: SampleTable1 
Index on SampleTable1: Col1 ASC INCLUDE Col2 ASC
Index on SampleTable1: Col1 ASC INCLUDE Col3 ASC

Jon – Excellent Mike, I think you got this one correct. It makes me happy that we are not the right path now. Let me know if you have any further question.

Mike - The matter of the fact, I do have a question. I do understand that both the above indexes are different kind of index and they don’t duplicate but is there any possibility to create  a single index which can be as effective as both the indexes. I think it should be possible as both of the above index have the same keys.

Jon – Indeed a great question, actually you can do that as well.

Mike – Really! If that is the case, would you please explain me how to do that.

Jon – Sure, it is actually very easy to do so. Instead of above two indexes, you can just create a single index which has both the columns included in the INCLUDED clause. Here is the example:

Table1: SampleTable1 
Index on SampleTable1: Col1 ASC INCLUDE Col2, Col3

Here you can see that you have created a single index which is effective for Col1 key column and Col2 and Col3 in an included column. Remember, order of the column always mattered however, it does not matter when we are discussing INCLUDED Columns. In this case, it would have not matter if we have written INCLUDE Col3, Col2 instead of INCLUDE Col2, Col3 in the create index.

Mike – Oh, so this is interesting to know that column order does not matter in the Included clause. Would it be possible to see a demo where you demonstrate that this single Index can be replacement of the above two indexes.

Jon – Absolutely – let us check.

First create a table called SampleTable1.

USE tempdb
GO
-- Create table
CREATE TABLE SampleTable1 (ID INT, Col1 INT, Col2 INT, Col3 INT)
GO

Now let us create indexes on Sample Table.

Clustered Indexes are just created for reference – the results of this test will not be affected by its presence or absence of it. I have just created to rule out few doubts I anticipate by their absence. The important part is non clustered indexes. One of the non-clustered index is created on Col1 ASC INCLUDE Col2 ASC and another one is created on Col1 ASC INCLUDE Col2 ASC

Table1: SampleTable1 
Index on SampleTable1: Col1 ASC INCLUDE Col2 ASC
Index on SampleTable1: Col1 ASC INCLUDE Col3 ASC

-- Create Indexes on Sample Table1
CREATE CLUSTERED INDEX [CX_ST]
ON SampleTable1 (ID)
GO
CREATE NONCLUSTERED INDEX [IX_ST_Col1_Col2]
ON SampleTable1 (Col1)
INCLUDE (Col2)
GO
CREATE NONCLUSTERED INDEX [IX_ST_Col1_Col3]
ON SampleTable1 (Col1)
INCLUDE (Col3)
GO

Now let us populate the table. The tables have absolutely same data.

-- Populate tables
INSERT INTO SampleTable1 (ID, Col1, Col2, Col3)
SELECT RAND()*10000, RAND()*1000, RAND()*100, RAND()*10
GO 100000

Now is the most interesting part. For this first enable the execution plan in SSMS (shortcut key – CTRL + M).

In this test we will run two different queries on both the tables.

Let us first run two scripts for SampleTable1

-- Select from SampleTable1
SELECT Col1, Col2
FROM SampleTable1 st
WHERE st.Col1 > 10 AND st.Col1 < 20
ORDER BY st.Col1 DESC, st.Col2 DESC
GO
-- Select from SampleTable1
SELECT Col1, Col3
FROM SampleTable1 st
WHERE st.Col1 > 10 AND st.Col1 < 20
ORDER BY st.Col1 ASC, st.Col3 DESC
GO

Let us check the execution plan:

Now when you see the execution plan, you will notice that both the queries are taking an equal amount of the resources to execute both the queries. They each take 50% of the time compared to the each other. However, we need to pay extra attention to the queries and see which indexes each of the query is using.

Let us examine very first query individually.

We can see from the execution plan – it is using the index – IX_ST_Col1_Col2.

Now let us examine second query individually.

We can see from the execution plan – it is using the index – IX_ST_Col1_Col3.

Now you can see how the included columns impact the usage of the index. Even though the key column is the same – included columns are integral part of the index and they need to be evaluated.

Creating Included Index

So far everything is very similar to our earlier blog posts. Now let us create a new included column index where we have Col2 and Col3 are included indexes.

-- Create Included Index
CREATE NONCLUSTERED INDEX [IX_ST_Col1_Col2_Col3]
ON SampleTable1 (Col1)
INCLUDE (Col2,Col3)
GO

As discussed you can change the order of the columns in the Included clause and it will not matter. Now let us run the SELECT statement which we have run earlier and see which index each of the SELECT statement is using.

-- Select from SampleTable1
SELECT Col1, Col2
FROM SampleTable1 st
WHERE st.Col1 > 10 AND st.Col1 < 20
ORDER BY st.Col1 DESC, st.Col2 DESC
GO
-- Select from SampleTable1
SELECT Col1, Col3
FROM SampleTable1 st
WHERE st.Col1 > 10 AND st.Col1 < 20
ORDER BY st.Col1 ASC, st.Col3 DESC
GO

You can notice that both the SELECT statement is using newly created index IX_ST_Col1_Col2_Col3 as this single index now satisfies the need of the both the indexes created earlier. Now when you create this new index, your earlier two indexes IX_ST_Col1_Col2 and IX_ST_Col1_Col3 are not useful and they should be dropped eventually.

     

Now you can clean up using the following script:

-- Clean Up
DROP TABLE SampleTable1
GO

Mike – Wow! Jon, more I learn about indexes, I feel more I do not know about them. I think I am going to rely on embarcadero DB Optimizer for most of my performance tuning needs.

Jon – Absolutely,  embarcadero DB Optimizer is your best friend.  It helps finding users the precise details related to index usage. It also indicates if Indexes are useful or useless. This tool is very easy to use and help in various aspects of the performance tuning. This tool can help you with writing proper queries, do indexes maintenance and optimize overall health of the server.

Mike – Thanks Jon!

Click to Download Scripts

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – TechEd India 2013 Sessions and Relevent Pluralsight Courses

I am presenting at TechEd India 2013 two SQL Server session. You can read about my session in this blog post.

Yesterday I presented on topic SQL Server Performance Troubleshooting: Ancient Problems and Modern Solutions. Today I will be presenting on the subject Indexes – The Unsung Hero. If you are at TechEd India you must show up in my session – we will have fun talking about Indexes and performance tuning together. You can read about various details about the session over here.

However, if you are not at TechEd India 2013 and still want to know what I am going to cover in the session here is something you can do it. You can watch the following two of my Pluralsight courses which absolutely covers what I am going to talk about in TechEd India 2013.

SQL Server Performance: Introduction to Query Tuning 

SQL Server performance tuning is an art to master – for developers and DBAs alike. This course takes a systematic approach to planning, analyzing, debugging and troubleshooting common query-related performance problems. This includes an introduction to understanding execution plans inside SQL Server.

In this almost four hour course we cover following important concepts.

  • Introduction 10:22
  • Execution Plan Basics 45:59
  • Essential Indexing Techniques 20:19
  • Query Design for Performance 50:16
  • Performance Tuning Tools 01:15:14
  • Tips and Tricks 25:53
  • Checklist: Performance Tuning 07:13

The duration of each module is mentioned besides the name of the module.

SQL Server Performance: Indexing Basics

This course teaches you how to master the art of performance tuning SQL Server by better understanding indexes.

In this almost two hour course we cover following important concepts.

  • Introduction 02:03
  • Fundamentals of Indexing 22:21
  • Practical Indexing Implementation Techniques 37:25
  • Index Maintenance 16:33
  • Introduction to ColumnstoreIndex 08:06
  • Indexing Practical Performance Tips and Tricks 24:56
  • Checklist : Index and Performance 07:29

The duration of each module is mentioned besides the name of the module. You will have to login to watch above courses. 

So if you are going to miss my session at TechEd India, you still have a chance to catch up on what I am going to present by watching the Pluralsight courses listed above.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQLAuthority News – Presenting Two Technology Sessions at TechEd India 2013 – Today and Tomorrow

Every year I am looking forward to TechEd India as it presents a wonderful opportunity to meet community in first hand. Community is my passion and I love to get involved with it at every single opportunity it presents. This year once again I am presenting a technology session at TechEd India. I will be presenting on the following subject.

Here is something I promise – if you attend my session – when you walk out of my session you will immediate action items which you can use it for your production server and improve the performance of the database. Additionally, I will have some goodies with me for everyone. I will have few of my books, free subscription access to Pluralsight’s library as well something totally interesting as a giveaway.

Here are the details of the sessions:

SQL Server Performance Troubleshooting: Ancient Problems and Modern Solutions

Date: March 18, 2013 – Bangalore
Date: March 25, 2013 – Pune
Time: 2:45 – 3:45 PM

Many believe Performance Tuning and Troubleshooting is an art which has been lost in time. The truth is that art has evolved with the time and there are more tools and techniques to overcome ancient troublesome scenarios. There three major resource when bottlenecked creates performance problem: CPU, IO, and Memory. In this session we will focus on High CPU scenarios detection and their resolutions. If time permits we will cover other performance related tips and tricks. At the end of the session attendee will have clear ideas and action items regarding what to do in when facing any of the above resource intensive scenarios.

Indexes – The Unsung Hero

Date: March 19, 2013 – Bangalore
Date: March 26, 2013 – Pune
Time: 2:45 – 3:45 PM

Slow Running Queries  are the most common problem that developers face while working with SQL Server. While it is easy to blame the SQL Server for unsatisfactory performance, however the issue often persists with the way queries have been written, and how Indexes has been set up. The session will focus on the ways of identifying problems that slow down SQL Server, and Indexing tricks to fix them. Developers will walk out with scripts and knowledge that can be applied to their servers, immediately post the session.

Looking forward to see you in session!

Pluralsight will be sponsoring the TechEd India event and don’t forget to stop by the Pluralsight booth for some goodies and meet the Pluralsight executives.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Avoid Using Function in WHERE Clause – Scan to Seek

“Don’t use functions in the WHERE clause, they reduce performance.”

I hear this quite often. This is true but this subject is hard to understand in a single statement. Let us see what it means and how to use the function in the WHERE clause.

We will be using sample database AdventureWorks in this example. Additionally, turn on STATISTICS IO ON settings so we can see various statistics as well.

USE AdventureWorks2012
GO
SET STATISTICS IO ON
GO

Let us first execute following query and check the execution plan and statistics.

-- SCAN - Select values from SalesOrderDetail
SELECT [SalesOrderID], [SalesOrderDetailID], [ModifiedDate]
FROM [Sales].[SalesOrderDetail]
WHERE DATEDIFF(YEAR,ModifiedDate,GETDATE()) < 0
GO

Table 'SalesOrderDetail'. Scan count 1, logical reads 1246

You can see that the above query is scanning the whole table as well even though it is not returning any result it is reading 1246 pages from database.

In this case we are retrieving the data based on the ModifiedDate so we will create an index on the ModifiedDate Column.

-- Create Index on ModifiedDate
CREATE NONCLUSTERED INDEX [IX_SalesOrderDetail_ModifiedDate]
ON [Sales].[SalesOrderDetail] ([ModifiedDate])
GO

Now we have created an index on the ModifiedDate column we will once again execute the same query which we had run earlier.

-- SCAN - Select values from SalesOrderDetail
SELECT [SalesOrderID], [SalesOrderDetailID], [ModifiedDate]
FROM [Sales].[SalesOrderDetail]
WHERE DATEDIFF(YEAR,ModifiedDate,GETDATE()) < 0
GO

Table 'SalesOrderDetail'. Scan count 1, logical reads 338

You can see that the above query is still scanning the whole table as well even though it is not returning any result it is read over 338 pages from database.

The reason for the same is because in the query we are using the function DATEDIFF over the column ModifiedDate. As the outcome of this function is evaluated at the run time, the SQL Server Query engine has to scan the whole table to get necessary data.

To avoid this situation, let us try to avoid using the function on the column of the table and re-write the query. To re-write the query let us first understand what the query is retrieved. It is retrieving all the rows where the year difference between ModifiedDate and Current Date is less than 0. In other words what it means is that we need to retrieve the records which have a future date. We can simply re-write above query in the following way.

-- SEEK - Select values from SalesOrderDetail
SELECT [SalesOrderID], [SalesOrderDetailID], [ModifiedDate]
FROM [Sales].[SalesOrderDetail]
WHERE ModifiedDate > GETDATE()
GO

Table 'SalesOrderDetail'. Scan count 1, logical reads 3

Now let us execute the query and see the execution plan. We can see that there are only 3 logical read now and execution plan is also displaying Seek. This is because now we are not using any function over the column used in the WHERE clause.

To clean up you can run following script to drop the newly created index.

-- Cleanup
DROP INDEX [IX_SalesOrderDetail_ModifiedDate] ON [Sales].[SalesOrderDetail]
GO

In our case both the queries are returning the same result, a simple re-write can make a lot of difference.

Click to Download Scripts

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – An Interesting Case of Redundant Indexes – Index on Col1 and Included Columns Col2 and Col3 – Part 5

This is the fifth part of the series regarding Redundant Indexes. If have not read earlier part – there is quite a good chance that you will miss the context of this part. I quickly suggest you to read earlier four parts. On my production server I personally use embarcadero DB Optimizer for all performance tuning and routine health check up. I will be interested to know what is your feedback about the product.

Part 1: What is Redundant Indexes?
Conversation between Mike and Jon – where they discuss about the fundamentals of Redundant Indexes.

Part 2: Demo – What kind of Redundant Indexes are Useful with a smaller result set
Here Jon demonstrates a script where even though Redundant Indexes exist they are extremely useful and should not be dropped. The result set contains the very small result set.

Part 3: Demo – What kind of Redundant Indexes are Useful with a larger result set
Here Jon demonstrates a script where even though Redundant Indexes exist they are extremely useful and should not be dropped. The result set contains the very large result set.

Part 4: An Interesting Case of Redundant Indexes – Index on Col1, Col2 and Index on Col1, Col2, Col3 – Part 4
In this part Jon demonstrates a script where the direction or order clause of the index is different making the index behaves very differently. To quality any index redundant the order of the index should also be considered.

Once again if you have not read the earlier blog post, there are good chances you will miss the core point of this blog post. I suggest you at least read the conversation between Mike and Jon to get familiar with their personalities.

Jon and Mike continue their conversation from last week.

Mike – Jon, you have been asking me many difficult questions. Let me simply ask you following question and instead of long answer – I would like you to give me a direct question answer.

Jon - Well, we can for sure try that way as well. What is your question?

Mike - When I see following columns as part of the index key, they are duplicate index correct?

Index A1: Col1 ASC
Index B1: Col1 ASC

In this case, you can see I have both the indexes are same and their order of the column is same as well. I think this means that they are redundant and duplicate index correct?

Jon - Mike, indeed a great question. If the included columns are also the same, they are indeed a same index. Now you have a simple answer.

Mike - No it is not a simple answer. You inserted the curveball of included column index – what is that now?

Jon - Ha ha – well SQL Server is not simple – here is the article you should read about Included Column Index it explains the concept quite well: Understanding Included Column Index along with Clustered Index and Non-clustered Index.

Once you read it lets us run a quick demo script. The script is very similar as earlier script but this time the index is set up as follows:

Index A1: Col1 ASC INCLUDE Col2 ASC
Index A1: Col1 ASC INCLUDE Col3 ASC

You can see that the key column (Col1) is same in both the indexes. However, the Included Columns are different. Now let us run following script.

First create a table called SampleTable1.

USE tempdb
GO
-- Create table
CREATE TABLE SampleTable1 (ID INT, Col1 INT, Col2 INT, Col3 INT)
GO

Now let us create indexes on Sample Table.

Clustered Indexes are just created for reference – the results of this test will not be affected by its presence or absence of it. I have just created to rule out few doubts I anticipate by their absence. The important part is non clustered indexes. One of the non-clustered index is created on Col1 ASC INCLUDE Col2 ASC and another one is created on Col1 ASC INCLUDE Col2 ASC

Table1: SampleTable1 
Index on SampleTable1: Col1 ASC INCLUDE Col2 ASC
Index on SampleTable1: Col1 ASC INCLUDE Col3 ASC

-- Create Indexes on Sample Table1
CREATE CLUSTERED INDEX [CX_ST]
ON SampleTable1 (ID)
GO
CREATE NONCLUSTERED INDEX [IX_ST_Col1_Col2]
ON SampleTable1 (Col1)
INCLUDE (Col2)
GO
CREATE NONCLUSTERED INDEX [IX_ST_Col1_Col3]
ON SampleTable1 (Col1)
INCLUDE (Col3)
GO

Now let us populate the table. The tables have absolutely same data.

-- Populate tables
INSERT INTO SampleTable1 (ID, Col1, Col2, Col3)
SELECT RAND()*10000, RAND()*1000, RAND()*100, RAND()*10
GO 100000

Now is the most interesting part. For this first enable the execution plan in SSMS (shortcut key – CTRL + M).

In this test we will run two different queries on both the tables.

Let us first run two scripts for SampleTable1

-- Select from SampleTable1
SELECT Col1, Col2
FROM SampleTable1 st
WHERE st.Col1 > 10 AND st.Col1 < 20
ORDER BY st.Col1 DESC, st.Col2 DESC
GO
-- Select from SampleTable1
SELECT Col1, Col3
FROM SampleTable1 st
WHERE st.Col1 > 10 AND st.Col1 < 20
ORDER BY st.Col1 ASC, st.Col3 DESC
GO

Let us check the execution plan:

Now when you see the execution plan, you will notice that both the queries are taking an equal amount of the resources to execute both the queries. They each take 50% of the time compared to the each other. However, we need to pay extra attention to the queries and see which indexes each of the query is using.

Let us examine very first query individually.

We can see from the execution plan – it is using the index – IX_ST_Col1_Col2.

Now let us examine second query individually.

We can see from the execution plan – it is using the index – IX_ST_Col1_Col3.

Now you can see how the included columns impact the usage of the index. Even though the key column is the same – included columns are integral part of the index and they need to be evaluated.

Now you can clean up using the following script:

-- Clean Up
DROP TABLE SampleTable1
GO

Mike – Jon this is getting indeed complicated for me. I think this subject looks very easy but indeed it keeps on getting confusing. I think I will go back to the professional efficiency tool which you mentioned last time. I have yet to download it, can you provide me more details about it.

Jon – Absolutely, it is embarcadero DB Optimizer. It helps finding users the precise details related to index usage. It also indicates if Indexes are useful or useless. This tool is very easy to use and help in various aspects of the performance tuning. This tool can help you with writing proper queries, do indexes maintenance and optimize overall health of the server. 

Mike – Thanks Jon! I still have one more question but I will ask you about it next time.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Primary Key and NonClustered Index in Simple Words

I have been writing a weekly round up from my blog where I go over last six years of blog posts and pick the best posts from the pasts. While I do this, there are two major place where I focus 1) If there are change in features – I re-blog about it with additional details or 2) If I have not provided complete information six years ago, I try to fill up the gap now. Well, just like everything my knowledge and writing skills have evolved. Before continuing please read my latest memory lane blog post where in 2007 I wrote scripts for Primary Key and Unique Key.

November 2006 was when I started to learn more about SQL and have been only 4 months in the product, I was still exploring various subjects. I wrote a blog post describing about how Primary Key and Unique Key are different. Everything which I wrote there is correct, however, there are a few more details one should learn when it is about Primary Key and Clustered Index.

Here is the common misconception prevailing in the industry.

Primary Key has to be Clustered Index. 

In reality the statement should be corrected as follows:

Primary Key can be Clustered or Non-clustered but it is a common best practice to create a Primary Key as Clustered Index. 

Well, now we have corrected the statement let us understand a bit more in detail. Primary Key should be uniquely identifying column of the table and it should be NOT NULL. A good (most of the time) candidate of the clustered index key also uniquely identifies column and NOT NULL (most of the time). Well, that means it is a good idea to create a Primary Key Clustered so it solve both the problems together. Keeping these facts in mind SQL Server automatically creates Clustered Index on the Primary Key when the table is created. Developers often do not specify which column should have clustered index so by default Primary Key becomes Clustered Index. This practice is now extremely common and lots of people have forgotten that Primary Key and Clustered Index is two different things. They can be same column but they do not have to be.

Well here are four examples we will see where we will learn the behavior of SQL Server when it is about Primary Key and Clustered Index.

  • Scenario 1 : Primary Key will default to Clustered Index
  • Scenario 2: Primary Key is defined as a Non-clustered Index
  • Scenario 3: Primary Key defaults to Non-Clustered Index with another column defined as a Clustered Index
  • Scenario 4: Primary Key defaults to Clustered Index with other index defaults to Non-clustered index

Now let us see each of the scenarios in detail.

Scenario 1 : Primary Key will default to Clustered Index

In this case we will create only Primary Key and when we check the kind of index created on the table we will notice that it has created clustered index automatically over it.

-- Case 1 Primary Key Defaults to Clustered Index
USE TempDB
GO
-- Create table
CREATE TABLE TestTable
(ID INT NOT NULL PRIMARY KEY,
Col1 INT NOT NULL)
GO
-- Check Indexes
SELECT OBJECT_NAME(OBJECT_ID) TableObject,
[name] IndexName,
[Type_Desc]
FROM sys.indexes
WHERE OBJECT_NAME(OBJECT_ID) = 'TestTable'
GO
-- Clean up
DROP TABLE TestTable
GO

Scenario 2: Primary Key is defined as a Non-clustered Index

In this case we will explicitly defined Primary Key as a non-clustered index and it will create it as a non-clustered index. It proves that Primary Key can be non-clustered index.

-- Case 2 Primary Key Non-clustered Index
USE TempDB
GO
-- Create table
CREATE TABLE TestTable
(ID INT NOT NULL PRIMARY KEY NONCLUSTERED,
Col1 INT NOT NULL)
GO
-- Check Indexes
SELECT OBJECT_NAME(OBJECT_ID) TableObject,
[name] IndexName,
[Type_Desc]
FROM sys.indexes
WHERE OBJECT_NAME(OBJECT_ID) = 'TestTable'
GO
-- Clean up
DROP TABLE TestTable
GO

Scenario 3: Primary Key defaults to Non-Clustered Index with another column defined as a Clustered Index

In this case we will create clustered index on another column, SQL Server will automatically create a Primary Key as a non-clustered index as clustered index is specified on another column.

-- Case 3 Primary Key Defaults to Non-clustered Index
USE TempDB
GO
-- Create table
CREATE TABLE TestTable
(ID INT NOT NULL PRIMARY KEY,
Col1 INT NOT NULL UNIQUE CLUSTERED)
GO
-- Check Indexes
SELECT OBJECT_NAME(OBJECT_ID) TableObject,
[name] IndexName,
[Type_Desc]
FROM sys.indexes
WHERE OBJECT_NAME(OBJECT_ID) = 'TestTable'
GO
-- Clean up
DROP TABLE TestTable
GO

Scenario 4: Primary Key defaults to Clustered Index with other index defaults to Non-clustered index

In this case we will create two indexes on the both the tables but we will not specify the type of the index on the columns. When we check the results we will notice that Primary Key is automatically defaulted to Clustered Index and another column as a Non-clustered index.

-- Case 4 Primary Key and Defaults
USE TempDB
GO
-- Create table
CREATE TABLE TestTable
(ID INT NOT NULL PRIMARY KEY,
Col1 INT NOT NULL UNIQUE)
GO
-- Check Indexes
SELECT OBJECT_NAME(OBJECT_ID) TableObject,
[name] IndexName,
[Type_Desc]
FROM sys.indexes
WHERE OBJECT_NAME(OBJECT_ID) = 'TestTable'
GO
-- Clean up
DROP TABLE TestTable
GO

I think above examples clarifies if there are any confused related to Primary and Clustered Index.

Now here is the question I often get asked what can be the reason for creating Primary Key and Clustered Index Key on different columns. Well, there are many scenarios when this can be true. It is possible that you have column SSN which you want to create as a Primary Key but do not want make it as a clustered index key because you have uniquely increasing identity column which best suites your need for that table (again this is just an example – you can argue exactly the opposite way as well). You are welcome to continue the discussion on this subject in the comments field or a dedicated blog post I wrote about it years ago over here. There are few really good comments there – I think that blog post is now a gold mine to understand this concept.

Reference : Pinal Dave (http://blog.SQLAuthority.com)

SQL SERVER – An Interesting Case of Redundant Indexes – Index on Col1, Col2 and Index on Col1, Col2, Col3 – Part 4

This is the fourth part of the series regarding Redundant Indexes. If have not read earlier part – there is quite a good chance that you will miss the context of this part. I quickly suggest you to read earlier three parts. On my production server I personally use embarcadero DB Optimizer for all performance tuning and routine health check up. I will be interested to know what is your feedback about the product.

Part 1: What is Redundant Indexes?
Conversation between Mike and Jon – where they discuss about the fundamentals of Redundant Indexes.

Part 2: Demo – What kind of Redundant Indexes are Useful with a smaller result set
Here Jon demonstrates a script where even though Redundant Indexes exist they are extremely useful and should not be dropped. The result set contains the very small result set.

Part 3: Demo – What kind of Redundant Indexes are Useful with a larger result set
Here Jon demonstrates a script where even though Redundant Indexes exist they are extremely useful and should not be dropped. The result set contains the very large result set.

Once again if you have not read the earlier blog post, there are good chances you will miss the core point of this blog post. I suggest you at least read the conversation between Mike and Jon to get familiar with their personalities.

Jon and Mike continue their conversation from last week.

Mike – Jon, In very first day we discussed that we should also pay attention to the Order By clause of the columns used in the Indexes. Do you have any working demonstration example where we can understand that concept in depth.

Jon – Sure, Mike. I think you are referring to following example which we have discussed in the conversation.

Index 5: Col1 ASC, Col2 DESC
Index 6: Col1 DESC, Col2 DESC, Col3 ASC

Am I correct?

Mike – Yes, would you please explain how ASC and DESC works with the case of Redundant Indexes.

In normal case, we will not see the direction of the Columns used and consider Indexes as a redudant. For example when I see Index 5 and Index 6 I would see them as following.

Index 5: Col1 ASC, Col2 DESC
Index 6: Col1 DESC, Col2 DESC, Col3 ASC

I will just see that Index 5 has two columns (Col1 and Col2), whereas Index 6 have three columns (Col1, Col2, Col3) and they are redundant (read the very first blog post for why they are considered as a redundant).

However, from our conversation I do understand that the columns used to build indexes have a direction and we should consider them as an important factor before we make these indexes redundant.

Jon – Great Point Mike! You are absolutely correct. Let us understand the same using example.

We will create a single table with two different Indexes. Later we will run two different but similar looking queries on the table. We will observe how their index usage pattern is.

First create a table called SampleTable1.

USE tempdb
GO
-- Create table
CREATE TABLE SampleTable1 (ID INT, Col1 INT, Col2 INT, Col3 INT)
GO

Now let us create indexes on Sample Table.

Clustered Indexes are just created for reference – the results of this test will not be affected by its presence or absence of it. I have just created to rule out few doubts I anticipate by their absence. The important part is non clustered indexes. One of the non-clustered index is created on Col1 & Col2 DESC and another one is created on Col1 DESC, Col2 DESC, and Col3.

Table1: SampleTable1 
Index on SampleTable1: IX_ST_Col1_Col2
Index on SampleTable1: IX_ST_Col1_Col2_Col3

-- Create Indexes on Sample Table1
CREATE CLUSTERED INDEX [CX_ST]
ON SampleTable1 (ID)
GO
CREATE NONCLUSTERED INDEX [IX_ST_Col1_Col2]
ON SampleTable1 (Col1 ASC, Col2 DESC)
GO
CREATE NONCLUSTERED INDEX [IX_ST_Col1_Col2_Col3]
ON SampleTable1 (Col1 DESC, Col2 DESC, Col3 ASC)
GO

Now let us populate both the tables. Both the tables have absolutely same data.

-- Populate tables
INSERT INTO SampleTable1 (ID, Col1, Col2, Col3)
SELECT RAND()*10000, RAND()*1000, RAND()*100, RAND()*10
GO 100000

Now is the most interesting part. For this first enable the execution plan in SSMS (shortcut key – CTRL + M).

In this test we will run two different queries on both the tables.

Let us first run two scripts for SampleTable1

-- Select from SampleTable1
SELECT Col1, Col2
FROM SampleTable1 st
WHERE st.Col1 > 10 AND st.Col1 < 20
ORDER BY st.Col1 DESC, st.Col2 DESC
GO
-- Select from SampleTable1
SELECT Col1, Col2
FROM SampleTable1 st
WHERE st.Col1 > 10 AND st.Col1 < 20
ORDER BY st.Col1 ASC, st.Col2 DESC
GO

Let us check the execution plan:

Now when you see the execution plan, you will notice that both the queries are taking an equal amount of the resources to execute both the queries. They each take 50% of the time compared to the each other. However, we need to pay extra attention to the queries and see which indexes each of the query is using.

Let us examine very first query individually.

We can see from the execution plan – it is using the index – IX_ST_Col1_Col2_Col3.

Now let us examine second query individually.

We can see from the execution plan – it is using the index – IX_ST_Col1_Col2.

Now let us put a everything together quickly.

When WHERE Conditions are st.Col1 DESC, st.Col2 DESC they use the Index with Columns Col1 DESC, Col2 DESC, Col3 ASC.
When WHERE Conditions are st.Col1 ASC, st.Col2 DESC they use the Index with ColumnsCol1 ASC, Col2 DESC.

I think by for you must have guessed that there is a relationship between the order of the columns with the ORDER by Clause used in the WHERE condition.

Even though, Index IX_ST_Col1_Col2 looks like a Redundant Index, it is not a Redundant Index of all. They are matter of the fact very different Indexes. They should be both kept if both are used in the queries.

There are a lot more things we can discuss over here – but this is just a starter. Any question so far.

Mike – Thanks Jon, I understood so far. Here is the question in my mind – If I am following two Indexes they are considered as a different Index am I correct?

Index AA: Col1 DESC, Col2 ASC
Index BB: Col1 ASC, Col2 ASC

Am I correct?

Jon – Yes, they are different Indexes for sure.

Mike – Oh okey I perfectly got it. Let me quickly summarize it.

Even though Indexes look redundant there may be some queries which may find them useful. ORDER of the columns used in the Indexes is very important when considering the redundancy of the indexes. If the column order is different the Indexes are also considered as a different indexes. Before dropping the indexes one should properly validate the usage patterns of the indexes and query workloads. One should properly test everything before taking any actions. 

Jon – Good summary. Let us do a quick clean up.

-- Clean Up
DROP TABLE SampleTable1
GO

Mike – Perfect, I think I know a lot now.

Jon – Do you really think you know everything?

Mike – Well, sort of. Right?

Jon – If you think so answer me following question -

Are following two indexes should be considered as a redundant indexes or are they different indexes?

Index AA: Col1 DESC, Col2 DESC
Index BB: Col1 ASC, Col2 ASC

Mike – From what I understood today – they are Not Redundant Index.

Jon – Glad I asked this question. You are WRONG!

Mike – What? How come?

Jon – Let us talk about this future blog posts – I think I need to teach you a lot of things!

Mike – Thanks, I am indeed looking forward to learn more. So what I understand is that it is quite possible that the indexes may look redundant but they may not be redundant indexes. What I understand is that before dropping any indexes I need to identify if the indexes are used or not. Is there any professional tool which can help me?

Jon – Absolutely, I suggest that you try embarcadero DB Optimizer. It helps finding users the precise details related to index usage. It also indicates if Indexes are useful or useless. This tool is very easy to use and help in various aspects of the performance tuning. Here is detailed articles on this subject Part 1 and Part 2. This tool can help you with writing proper queries, do indexes maintenance and optimize overall health of the server. 

Mike – Thanks Jon!

Reference: Pinal Dave (http://blog.sqlauthority.com)