SQL SERVER – An Interesting Case of Redundant Indexes – Index on Col1, Col2 and Index on Col1, Col2, Col3 – Part 3

Before you start reading this blog post, I strongly suggest you to read the part 1 of this series.It talks about What is Redundant Index. The story is a conversation between two individuals – Jon and Mike. They are different but have single goal learn and explore SQL Server. Their initial conversation sets the ground for this blog post. They earlier discussed what is a Redundant Index as well, discussed what are the special cases for the same. It is a general assumption (or common best practices) is to drop Redundant Indexes. On my production server I personally use embarcadero DB Optimizer for all performance tuning and routine health check up. I will be interested to know what is your feedback about the product.

Later Mike asks for special case where even though the index is clearly a Redundant Index, why it should not be removed. Jon promises to explain with a demo where a Redundant Index is useful and should not be dropped. In part 2 of this series – we talked about in detail about Kind of Redundant Indexes are Useful. We demonstrated that a column width makes a big difference in the performance of the index with a query. The query used in the demonstration was returning a very small result set.

Mike had a question for Jon that if the query is returning a larger result set will it still have the same conclusion which they had arrived in an earlier blog post. Let us continue the conversation here.

Mike – Today is Friday. You promised me a demo today where a Redundant Index is useful Index, even though the query were returning a larger result set.

Jon - Absolutely. Our demo is absolutely same as the earlier demo – the only difference today we will see is that our where condition will be different from the earlier query. This time we will have a much larger resultset returning by our query. We will observe the output and come to the conclusion if our finding is still the same as we discovered earlier.

We will create two tables and will create absolutely same table. We will notice that on the first table Redundant Index will be useless and needs to be dropped whereas on the second table Redundant Index will useful and should not be dropped for performance. One with all the column INT and second with a wider column CHAR (800).

USE tempdb
GO
-- Create table
CREATE TABLE SampleTable1 (ID INT, Col1 INT, Col2 INT, Col3 INT)
GO
CREATE TABLE SampleTable2 (ID INT, Col1 INT, Col2 INT, Col3 CHAR(800))
GO

Now let us create indexes on both the tables. We will make sure that created indexes are same on both the table.

Clustered Indexes are just created for reference – the results of this test will not be affected by its presence or absence of it. I have just created to rule out few doubts I anticipate by their absence. The important part is non clustered indexes. One of the non-clustered index is created on Col1 & Col2 and another one is created on Col1, Col2, and Col3.

Table1: SampleTable1 
Index on SampleTable1: IX_ST_Col1_Col2
Index on SampleTable1: IX_ST_Col1_Col2_Col3

Table2: SampleTable2
Index on SampleTable2: IX_ST_Col1_Col2
Index on SampleTable2: IX_ST_Col1_Col2_Col3

-- Create Indexes on Sample Table1
CREATE CLUSTERED INDEX [CX_ST]
ON SampleTable1 (ID)
GO
CREATE NONCLUSTERED INDEX [IX_ST_Col1_Col2]
ON SampleTable1 (Col1, Col2)
GO
CREATE NONCLUSTERED INDEX [IX_ST_Col1_Col2_Col3]
ON SampleTable1 (Col1, Col2, Col3)
GO
-- Create Indexes on Sample Table2
CREATE CLUSTERED INDEX [CX_ST]
ON SampleTable2 (ID)
GO
CREATE NONCLUSTERED INDEX [IX_ST_Col1_Col2]
ON SampleTable2 (Col1, Col2)
GO
CREATE NONCLUSTERED INDEX [IX_ST_Col1_Col2_Col3]
ON SampleTable2 (Col1, Col2, Col3)
GO

Now let us populate both the tables. Both the tables have absolutely same data.

-- Populate tables
INSERT INTO SampleTable1 (ID, Col1, Col2, Col3)
SELECT RAND()*10000, RAND()*1000, RAND()*100, RAND()*10
GO 100000
INSERT INTO SampleTable2 (ID, Col1, Col2, Col3)
SELECT *
FROM SampleTable1
GO

Now is the most interesting part. For this first enable the execution plan in SSMS (shortcut key – CTRL + M).

We will be doing two tests. Let describe our first test.

Test 2: Select a larger set of the data

In this test we will run a same query on both the tables. Both the times we will apply our path of the index on the each table. As there are 2 tables and each have 2 indexes we will have a total of 4 indexes.

Table1: SampleTable1 
Index on SampleTable1: IX_ST_Col1_Col2
Index on SampleTable1: IX_ST_Col1_Col2_Col3

Table2: SampleTable2
Index on SampleTable2: IX_ST_Col1_Col2
Index on SampleTable2: IX_ST_Col1_Col2_Col3

Now let us run following script with keeping the Actual Execution Plan on (shortcut key CTRM: + M)

Let us first run two scripts for SampleTable1

-- Select from table
SELECT Col1, Col2
FROM SampleTable1 st WITH(INDEX([IX_ST_Col1_Col2]))
WHERE st.Col1 > 10 AND st.Col1 < 20
GO
-- Select from table
SELECT Col1, Col2
FROM SampleTable1 st WITH(INDEX(IX_ST_Col1_Col2_Col3))
WHERE st.Col1 > 10 AND st.Col1 < 20
GO

Let us check the execution plan:

You can notice from the execution plan that in the case of the SampleTable1 it does not matter if we use either of the index the performance of the both the query is same. Both the queries are using the same amount of the resources. In this case, Col3 is an integer and for SQL Server the width of the column does not make much difference. I can clearly say in this case Indexes are redundant as they are giving the same performance.

Now looking at both the indexes indexes IX_ST_Col1_Col2 is subset of IX_ST_Col1_Col2_Col3 and as mentioned in an earlier note if there is an additional column (col3) is in the SELECT statement that index will be more suitable. We can easily remove IX_ST_Col1_Col2 index in this particular special case (note this does not apply all the time).

Now let us run similar script for SampleTable2

-- Select from table
SELECT Col1, Col2
FROM SampleTable2 st WITH(INDEX([IX_ST_Col1_Col2]))
WHERE st.Col1 > 10 AND st.Col1 < 20
GO
-- Select from table
SELECT Col1, Col2
FROM SampleTable2 st WITH(INDEX([IX_ST_Col1_Col2_Col3]))
WHERE st.Col1 > 10 AND st.Col1 < 20
GO

Let us check the execution plan:

You can notice from the execution plan that in the case of the SampleTable2 it matters a lot about which index is used for the query as the performance differences between those queries is huge.  One of the query is using very little resources and another one is taking a huge amount of the resources. In this case, Col3 is a CHAR (800) datatype which is fixed length string datatype. In this case, for SQL Server the width of the column does make a big difference. We can clearly say that here in our SELECT statement IX_ST_Col1_Col2 is the most optimal indexes.

If that is the case, what is the use of IX_ST_Col1_Col2_Col3. Well, the answer of this question is also interesting. If you change the SELECT statement to also include Col3, it will become totally different scenario as it will require to do a key lookup for IX_ST_Col1_Col2. If your SELECT statement has Col1, Col2, Col3 – the optimal index here is IX_ST_Col1_Col2_Col3. The need of the both the indexes is different and they achieve a specific task. If you think IX_ST_Col1_Col2 is redundant as a more inclusive index IX_ST_Col1_Col2_Col3 exists, it will be not the optimal thinking. Even though, IX_ST_Col1_Col2_Col3 includes all the columns, when SQL Server only needs Col1 and Col2 it finds IX_ST_Col1_Col2 more suitable for performance.

I hope this is now clear to you that how to identify if the redundant index is useful or useless now.

Even this second test, proved our earlier findings correct that it does not matter if resultset had more rows or less, the Redundent Index can be helpful and before dropping them we should evaluate them properly.

One more thing to notice is that when we ran the query for SampleTable2, the resource utilization is even larger than what we have seen in the earlier tests done when we had smaller resultset. With the larger resultset in this example the redudent index which has less number of column have greater (optimal) performance.

Mike - Ahha – I guess in this case the summary should stay the same. Let me quickly summarize it.

Even though Indexes look redundant there may be some queries which may find them useful. This usually happens when the data type of the of any column is much wider than other columns. Before dropping the indexes one should properly validate the usage patterns of the indexes and query workloads. One should properly test everything before taking any actions. 

Jon – Good summary. It is always a good idea to test various combinations before making a decision. One should not drop an index because it was marked as a Redundant Index.

Mike – This is interesting! Jon. Do you have any more goodies related to this subject.

Jon – I do have many more goodies. However, let us do a quick clean up.

-- Clean Up
DROP TABLE SampleTable1
GO
DROP TABLE SampleTable2
GO

Mike – What about some more insight on this subject please?

Jon – Wait and watch – we will have something new next week!

Reference: Pinal Dave (http://blog.sqlauthority.com)

8 thoughts on “SQL SERVER – An Interesting Case of Redundant Indexes – Index on Col1, Col2 and Index on Col1, Col2, Col3 – Part 3

  1. All you’ve done is give a really good example of why you shouldn’t use optimizer hints to force index select. The Query Optimizer isn’t going to be stupid enough to even use an index on Col1, Col2, Col3 when Col3 never appears in the query. Of course, the index on the second table requires more resources, the index key is much larger than the other indexes which means a lot fewer rows per page. You’ve chosen a very artificial scenario and then forced the Query Optimizer to use an index it would never.

    Like

    • Hi Pinal,

      i have question why queries from 1st
      SELECT Col1, Col2
      FROM SampleTable1 st WITH(INDEX([IX_ST_Col1_Col2]))
      WHERE st.Col1 > 10 AND st.Col1 < 2

      not able to use index and

      quries from another samples are able to use it .is their any reason?

      Like

  2. The index hints are a workaround to show what would happen if you dropped (what some might consider to be) a duplicate index. It’s to demonstrate the relative costs so that both statements can be run in a single batch. Obviously nobody would ever run with index hints, (except *cough* sharepoint *cough*).

    Like

  3. Pingback: SQL SERVER – An Interesting Case of Redundant Indexes – Index on Col1, Col2 and Index on Col1, Col2, Col3 – Part 4 « SQL Server Journey with SQL Authority

  4. Pingback: SQL SERVER – An Interesting Case of Redundant Indexes – Index on Col1, Col2 and Index on Col1, Col2, Col3 – Part 5 « SQL Server Journey with SQL Authority

  5. Pingback: SQL SERVER – An Interesting Case of Redundant Indexes – Index on Col1 and Included Columns Col2 and Col3 – Part 5 | SQL Server Journey with SQL Authority

  6. Pingback: SQL SERVER – An Interesting Case of Redundant Indexes – Index on Col1, Col2 and Index on Col1, Col2, Col3 – Part 6 | SQL Server Journey with SQL Authority

  7. Pingback: SQL SERVER – Redundant Indexes Good or Bad – Final Wrap – Part 7 | Journey to SQL Authority with Pinal Dave

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s