Before you start reading this blog post, I strongly suggest you to read the part 1 of this series. It talks about What is Redundant Index. The story is a conversation between two individuals – Jon and Mike. They are different but have single goal learn and explore SQL Server. Their initial conversation sets the ground for this blog post. They earlier discussed what is a Redundant Index as well, discussed what are the special cases for the same. It is a general assumption (or common best practices) is to drop Redundant Indexes. Later Mike asks for special case where even though the index is clearly a Redundant Index, why it should not be removed. Jon promises to explain with a demo where a Redundant Index is useful and should not be dropped. Here is their conversation continued from earlier. On my production server I personally use embarcadero DB Optimizer for all performance tuning and routine health check up. I will be interested to know what is your feedback about the product.
Mike – Today is Monday. You promised me a demo today where a Redundant Index is useful Index.
Jon – Absolutely. We will create two tables and will create absolutely same table. We will notice that on the first table Redundant Index will be useless and needs to be dropped whereas on the second table Redundant Index will useful and should not be dropped for performance.
Let us start with a demo.
Let us create two tables. One with all the column INT and second with a wider column CHAR (800).
USE tempdb
GO
-- Create table
CREATE TABLE SampleTable1 (ID INT, Col1 INT, Col2 INT, Col3 INT)
GO
CREATE TABLE SampleTable2 (ID INT, Col1 INT, Col2 INT, Col3 CHAR(800))
GO
Now let us create indexes on both the tables. We will make sure that created indexes are same on both the table.
Clustered Indexes are just created for reference – the results of this test will not be affected by its presence or absence of it. I have just created to rule out few doubts I anticipate by their absence. The important part is non clustered indexes. One of the non-clustered index is created on Col1 & Col2 and another one is created on Col1, Col2, and Col3.
Table1: SampleTable1Â
Index on SampleTable1:Â IX_ST_Col1_Col2
Index on SampleTable1:Â IX_ST_Col1_Col2_Col3
Table2: SampleTable2
Index on SampleTable2:Â IX_ST_Col1_Col2
Index on SampleTable2:Â IX_ST_Col1_Col2_Col3
-- Create Indexes on Sample Table1
CREATE CLUSTERED INDEX [CX_ST]
ON SampleTable1 (ID)
GO
CREATE NONCLUSTERED INDEX [IX_ST_Col1_Col2]
ON SampleTable1 (Col1, Col2)
GO
CREATE NONCLUSTERED INDEX [IX_ST_Col1_Col2_Col3]
ON SampleTable1 (Col1, Col2, Col3)
GO
-- Create Indexes on Sample Table2
CREATE CLUSTERED INDEX [CX_ST]
ON SampleTable2 (ID)
GO
CREATE NONCLUSTERED INDEX [IX_ST_Col1_Col2]
ON SampleTable2 (Col1, Col2)
GO
CREATE NONCLUSTERED INDEX [IX_ST_Col1_Col2_Col3]
ON SampleTable2 (Col1, Col2, Col3)
GO
Now let us populate both the tables. Both the tables have absolutely same data.
-- Populate tables
INSERT INTO SampleTable1 (ID, Col1, Col2, Col3)
SELECT RAND()10000, RAND()1000, RAND()100, RAND()10
GO 100000
INSERT INTO SampleTable2 (ID, Col1, Col2, Col3)
SELECT *
FROM SampleTable1
GO
Now is the most interesting part. For this first enable the execution plan in SSMSÂ (shortcut key – CTRL + M).
We will be doing two tests. Let describe our first test.
Test 1: Select a smaller set of the data
In this test we will run a same query on both the tables. Both the times we will apply our path of the index on the each table. As there are 2 tables and each have 2 indexes we will have a total of 4 indexes.
Table1: SampleTable1Â
Index on SampleTable1:Â IX_ST_Col1_Col2
Index on SampleTable1:Â IX_ST_Col1_Col2_Col3
Table2: SampleTable2
Index on SampleTable2:Â IX_ST_Col1_Col2
Index on SampleTable2:Â IX_ST_Col1_Col2_Col3
Now let us run following script with keeping the Actual Execution Plan on (shortcut key CTRM: + M)
Let us first run two scripts for SampleTable1
-- Select from SampleTable1
SELECT Col1, Col2
FROM SampleTable1 st WITH(INDEX([IX_ST_Col1_Col2]))
WHERE st.Col1 = 10
GO
-- Select from SampleTable1
SELECT Col1, Col2
FROM SampleTable1 st WITH(INDEX(IX_ST_Col1_Col2_Col3))
WHERE st.Col1 = 10
GO
Let us check the execution plan:
You can notice from the execution plan that in the case of the SampleTable1 it does not matter if we use either of the index the performance of the both the query is same. Both the queries are using the same amount of the resources. In this case, Col3 is an integer and for SQL Server the width of the column does not make much difference. I can clearly say in this case Indexes are redundant as they are giving the same performance.
(Note: If you are going to ask me to change the SELECT statement to also include Col3, it will become totally different scenario as it will require to do a key lookup for IX_ST_Col1_Col2. If your SELECT statement has Col1, Col2, Col3 – the optimal index here is IX_ST_Col1_Col2_Col3, there is no further discussion in that case).
(A Quick Tip: When we compare execution plan – the higher cost compared to the batch explains higher usage of the resources and expensive query.)
Now looking at both the indexes indexes IX_ST_Col1_Col2 is subset of IX_ST_Col1_Col2_Col3 and as mentioned in an earlier note if there is an additional column (col3) is in the SELECT statement that index will be more suitable. We can easily remove IX_ST_Col1_Col2 index in this particular special case (note this does not apply all the time).
Now let us run similar script for SampleTable2
-- Select from SampleTable2
SELECT Col1, Col2
FROM SampleTable2 st WITH(INDEX([IX_ST_Col1_Col2]))
WHERE st.Col1 = 10
GO
-- Select from SampleTable2
SELECT Col1, Col2
FROM SampleTable2 st WITH(INDEX([IX_ST_Col1_Col2_Col3]))
WHERE st.Col1 = 10
GO
Let us check the execution plan:
You can notice from the execution plan that in the case of the SampleTable2 it matters a lot about which index is used for the query as the performance differences between those queries is huge.  One of the query is using very little resources and another one is taking a huge amount of the resources. In this case, Col3 is a CHAR (800) datatype which is fixed length string datatype. In this case, for SQL Server the width of the column does make a big difference. We can clearly say that here in our SELECT statement IX_ST_Col1_Col2 is the most optimal indexes.
If that is the case, what is the use of IX_ST_Col1_Col2_Col3. Well, the answer of this question is also interesting. If you change the SELECT statement to also include Col3, it will become totally different scenario as it will require to do a key lookup for IX_ST_Col1_Col2. If your SELECT statement has Col1, Col2, Col3 – the optimal index here is IX_ST_Col1_Col2_Col3. The need of the both the indexes is different and they achieve a specific task. If you think IX_ST_Col1_Col2 is redundant as a more inclusive index IX_ST_Col1_Col2_Col3 exists, it will be not the optimal thinking. Even though, IX_ST_Col1_Col2_Col3 includes all the columns, when SQL Server only needs Col1 and Col2 it finds IX_ST_Col1_Col2 more suitable for performance.
I hope this is now clear to you that how to identify if the redundant index is useful or useless now.
Mike – Thanks Jon, this is a great explanation. Let me quickly summarize it.
Even though Indexes look redundant there may be some queries which may find them useful. This usually happens when the data type of the of any column is much wider than other columns. Before dropping the indexes one should properly validate the usage patterns of the indexes and query workloads. One should properly test everything before taking any actions.Â
Jon – Good summary. Test before you Act! However, you should notice that this is not the only case when redundant indexes are useful. There may be other cases too!
Mike – I understand. However, before you continue further, I see that you called this as a Test 1. Is there any test 2 with the larger dataset? Does it also validate my earlier summary.
Jon – Another good question – Let us see the Test 2 in Friday’s blog post. You can clean up your database by dropping your test tables.
-- Clean Up
DROP TABLE SampleTable1
GO
DROP TABLE SampleTable2
GO
Stay tuned for part 3 of this series.
Reference:Â Pinal Dave (https://blog.sqlauthority.com)
2 Comments. Leave new
Hi Pinal,
I was thinking about few cases where we have index on
C1, C2, C3
and SELECT query having filters on (C2 and/or C3) then would NC index apply?
Also example you took was for CHAR column. How about varchar column where values range from 1 byte to 8000 byte?
Thanks for Useful Demo Pinal!
1. Does it have any impact when we are joining the indexed tables with other non-Index tables?
2. Is the Index related to columns used in Select clause or Columns used in Join conditions and Where Conditions.