Earlier this week, I wrote a blog post about SQL SERVER – Top Reasons for Slow Performance. The blog post got extremely popular as it contained many real-world problems and suggestions. After the blog post was released, I received quite many emails and questions. In response to the blog post, I have received a lot of comments on my Twitter and LinkedIn. The most questions were around section Poor Indexing Strategies and that lead me to write about 10 Don’ts for Indexes.
One of my FinTech clients who often hire me for Comprehensive Database Performance Health Check just sent me their 10 Don’ts for the Indexes which I had built with them when I worked along with them. I think the list is very simple but effective. I suggest everyone create their such list and keep in front of developers.
10 Don’ts for Indexes – Poor Indexing Strategies
- Don’t index every column of the table.
- Don’t create more than 7 indexes per table (clustered and non-clustered)
- Don’t leave a table as Heap (create a clustered index).
- Don’t create an index on every column involved in every foreign key
- Don’t rebuild an index too frequently (monthly once is good enough)
- Don’t create an index with more than 5 to 7 key columns
- Don’t create an index with more than 5 to 7 included columns
- Don’t add your clustered index key in your non-clustered index
- Don’t change server fill factor, change it at an index level (as needed)
- Don’t ignore index maintenance
In your organization, if you are following the above 10 simple guidelines, I am very confident that your server would be running absolutely fine and you may not need not worry about indexes anymore. There are many different guidelines one has to follow if you want 100% performance for your SQL Server. However, that is out of the scope of this blog post and we often discuss that in detail Comprehensive Database Performance Health Check.
If you are following all the 10 guidelines of what to not do with index and if you are still facing performance problems, I am very sure that the issue is related to either your deployment or how your queries are following anti-patterns. I will write about that in the future.
Give yourself 1 point for each Don’t which you follow and post your score in the comments. If you get 10/10, your organizations should be proud of you.
Update: Always create indexes after doing proper workload analysis of your system. Just creating indexes because of best practices is not always a great idea.
Reference: Pinal Dave (https://blog.sqlauthority.com)
i am DBA on database for dynamics ax 2012 r2 and i face problem on performance for some tables there is built in indexes with the custom that comes from query so the count of them more than 7 indexes per some tables so is it possible to be special case
also i need to create script for specific table (not all tables) inside dynamics database to check which indexes need rebuild and which need reorganize
Not sure I agree with #4. Yes, there are cases where adding an index on FKs is not needed, but very frequently FKs benefit frim from an index.
Great Point Peter,
The title of the blog post is what not to do. Very often people create indexes all the FK and eventually build lots of indexes. The goal is not to create indexes on every single FK.
One has to do a proper analysis of their workload and come up with 5-7 best possible indexes. Creating indexes on all the FK may just lead to lots of indexes, which do not necessarily help the performance.
Now after workload analysis, if you find you need to create an index on FK, absolutely is welcome.
However, I see, why this statement can pass an incorrect messages. Let me modify it a bit more to pass the correct intent.
Thanks for bringing to my attention.
Don’t add your clustered index key in your non-clustered index
What is wrong with this ? I understand clustering key will be already part of the NC. I don’t think SQL server will duplicate it even clustering key columns are added explicitly in NC.
Sometime it is useful to have clustering key column for the included non cluster index. For example high selective seekable queries.
So my opinion is not to generalise it.
The question is if it is already part of it, why add them again? The reason of high selective seekable queries will just work without even adding the keys.
I disagree with you!
My example shows a situation where all data is stamped with a DepartmentID. A given user may only see data from his own department. This is implemented by using RowLevelSecurity or by using a framework that adds DepartmentID to all queries. (Dyn Ax adds dataAreaId – the dataAreaId field stores the company that the record is belonging to).
Let’s say that the company has 1000 departments! And we have this table.
CREATE TABLE dbo.Customer
DepartmentID INT NOT NULL
CONSTRAINT FK_Customer_Department FOREIGN KEY
REFERENCES dbo.Department (DepartmentID),
CustomerID INT NOT NULL,
Name VARCHAR(40) NOT NULL,
Adress VARCHAR(40) NOT NULL,
Zipcode SMALLINT NOT NULL,
DeliveryAdress VARCHAR(40) NULL,
DeliveryZipcode SMALLINT NULL,
BillingAdress VARCHAR(40) NULL,
BillingZipcode SMALLINT NULL,
CONSTRAINT PK_Customer PRIMARY KEY CLUSTERED (DepartmentID, CustomerID)
If we execute the following SELECT-statement
SELECT CustomerID, Name, DeliveryAdress, DeliveryZipcode
WHERE Name = <>;
the performance will be different depending on which index is defined
CREATE INDEX nc_Customer_3_6_7
ON dbo.Customer (Name, DeliveryAdress, DeliveryZipcode);
CREATE INDEX nc_Customer_1_3_8_9
ON dbo.Customer (DepartmentID, Name, DeliveryAdress, DeliveryZipcode);
CREATE INDEX nc_Customer_3_1_8_9
ON dbo.Customer (Name, DepartmentID, DeliveryAdress, DeliveryZipcode);
because the executed statement will be
SELECT CustomerID, Name, DeliveryAdress, DeliveryZipcode
WHERE Name = @Name AND DepartmentID = <>;
Index number 1 will be the bad index!!!!
If we create this index
CREATE INDEX nc_Customer_1_3_8_9_2
ON dbo.Customer (DepartmentID, Name, DeliveryAdress, DeliveryZipcode)
UPDATE STATISTICS will be faster, because we don’t sort on column CustormerID at index 4. See the output from following statement.
DBCC SHOW_STATISTICS (Kunde, ‘nc_Kunde_1_3_8_9’) WITH DENSITY_VECTOR;
DBCC SHOW_STATISTICS (Kunde, ‘nc_Kunde_1_3_8_9_2’) WITH DENSITY_VECTOR;
And why did MS change from 250 to 1000 index on each table – not because 7 is the limit!!!! It is always dangerous to state a rule – someone believes it to be the truth.
Absolutely no issue, as long as your system is working fine with too many indexes, you should not have any problem.
I am not talking about too many indexes, but the right number of indexes. Maybe 4 is too many – there should only be 2. Why 7?
There are many tables with a lot of columns. SALESLINE in Dyn Ax has 80 columns. For making covered indexes for own statement (also performance) and indexes for Dyn Ax statement, 7 could be to few.
So instead of a limit on 7, you should talk about the right indexes with columns in the right order – and it is not necessary with clustered columns last. And if you use 11 columns of the SALESLINE table, maybe you should create an (covered) index with 11 columns.
I shared what I find at my client’s. Let us say in the last 10 years, my 99% of the clients were happy with 7 or fewer indexes.
Maybe it’s small databases? But MS’s experience says that in version 2008 R2 it was necessary to expand from 250 to 1000 index per table. With their experience, I think it is dangerous to make a general recommendation of max 7. Such a recommendation is considered by some – many – as a rule.
I often hear that many indexes are problematic due to INSERT, DELETE and UPDATE. But it is also a truth with modifications. Yes, all indexes need to be updated by INSERT and DELETE. However, only those indexes where altered values in the columns involved are affected by an UPDATE – not the columns referred to in the UPDATE statement, but only columns with altered values. And since most systems perform significantly more UPDATE than INSERT + DELETE statements, this argument falls short of having few indexes.
So the important thing for me is to understand exactly how index is structured, how they are used, how an index can be applied to several different statements, …., the best order of columns and this include the right position of the cluster-key. Use INCLUDE for better possibility for UPDATE STATISTICS, … And make sure they are maintained optimally with reorganize / rebuild and UPDATE STATISTICS.
Maybe the index is rowstore index, maybe it’s columnstore index, maybe an indexed view needs to be defined on the table, or a mix of all these. So maybe it’s 2 indexes, maybe it’s 20 indexes, …. But don’t stop when you reach 7, if more index will be better and stop before 7 when this is enough!
My most consulting engagements are for TB of the databases.
I appreciate your opinion is different from me.
However, as I said, I have found maximum number of 7 indexes efficient in 99% of my business cases. When I see that even 10% need a different number, I will for sure change the recommendations.
I have come to believe that 7 is the max you want as indexes and in most cases 4-5 are great enough! So in reality, usually I stop at 4-5.
This is such a broad statement to claim there will be no performance issues.
Based on the above, if I had a database where each table only had a single clustered index on an auto incrementing Id and contained millions of rows, there would be no performance issues. No chance!
What about covering indexes to improve query performance? Most of the time they have more then 5-7 columns..
As long as you are under 5-7 columns and 5-7 indexes, you are good.
> But MS’s experience says that in version 2008 R2 it was necessary to expand from 250 to 1000 index per table.
Just because you *can* do something, doesn’t mean that it is a good idea.
>With their experience, I think it is dangerous to make a general recommendation of max 7.
If you follow Brent Ozar, you will find that his general guideline – not a hard and fast rule – is that a table should not have more than 5 indexes. So Pinal says 7, and Brent says 5, based on their experience.
> There are many tables with a lot of columns. SALESLINE in Dyn Ax has 80 columns.
I’m not familiar with this particular database, but another general rule – not hard and fast – is that a properly normalized database will have about 25-35 columns maximum in any table. Perhaps SALESLINE in Dyn Ax is not a shining example of a properly normalized database. So yeah, maybe you do need more indexes in this case.
My problem is that you specify a limit at all, instead of talking about what indexes are needed for a table and the manipulations that are made against this table. For 5 indexes may as well be too many for a table. Maybe there should only be 2 indexes.
It is important to emphasize that normalization has nothing to do with the number of columns. A table with 80 columns can be on 5NF. Normalization says something about the relationships between information in different columns – key and non-key columns. But creating a table with 80 columns, can be a bad physical design. When going from the logical model to the physical model, it must be decided, whether you should split the 80 columns into several tables based on user scenarios and/or performance and/or null-notnull and/or ….
But I’m sure, that MS makes this changes in 2008, because there was a need. The changes could create many complications, so they don’t do it just because … And there is far way from 5/7 over 250 to 1000. By defining that kind of restriction, you can also get the impression, that if you just staying below 5/7, then everything is fine, and this is not truth. It is important to know the consequences of how the index is defined and how SQL Server then uses these indexes.