Index optimization is always interesting subject to me. Every time I receive requests to help optimize query or query on any specific table. I always ask Jr.DBA to go over following list first before I take a look at it. Most of the time the Query Speed is optimized just following basic rules mentioned below. Once following checklist applied interesting optimization part begins which only experiment and experience can resolve.
- Create Index on frequently used columns in T-SQL Code. Columns used in WHERE, ORDER BY and GROUP BY are good candidates for Indexes. Create Index on column, which are used in JOIN Condition.
- Remove any un-necessary Indexes. As Index occupies hard drive space as well as it decreases performance of all the inserts, updates, deletes to the table.
- Smaller Index Key gives better performance than Index key which covers large data or many columns
- Index on Integer Columns performs better than varchar columns.
- Clustered Index must exist before creating Non-Clustered Index.
- Clustered Index must be created on Single Column which is not changing and narrow in size. The best candidate is a primary key.
- Non-clustered Indexes increase performance of the query that returns fewer rows and rows has a wide selectivity spectrum.
- Each table must have one Clustered Index.
- If column have low selectivity avoid creating an Index on that column as it slow down the rows modification and system will not get benefit from the Index.
- Multiple Columns Index or Covered Index should be ordered as Most Selective column on left and gradually decreasing selectivity as they go right.
- Use SORT_IN_TEMPDB option when the table is created if tempdb is on different disks. This will increase the performance to create the Index.
- Rebuild Index frequently using ALTER INDEX and De-fragment Index to keep performance optimal for Indexes.
Download SQL SERVER Index Optimization CheckList
Reference: Pinal Dave (https://blog.sqlauthority.com)
21 Comments. Leave new
Am a bit new to db work..but some1 pliz assist me…
how do i convert the following to a case statemnt
if (condition) print ‘test…’
if (condition) print ‘test…’
if (condition) print ‘test…’
is it smthin like
case
when (condition) then print
–does this work
Hi,
I have a question about clustered indexes. Is it possible to configure SQL server to create a non-clustered index for a primary key by default?
I am trying to keep my SQL vendor-independent and in standard SQL I can not specify non-clustered as many DBs will not accept it.
thank you in advance
thanks..
Hi Pinal,
I understand that
Index on Integer Columns performs better than varchar columns
But do Index on Integer Columns performs better than datetime columns too? or they are same? I think later.
Also I must compliment the work you are doing. All I can say is after spending time on this site (which went right up there to my favourite’s) I am more motivated to know all about sql server.
Thanks.
Best Regards,
– Gopal
Hi Pinal,
I always read articles from your websites its very informative and really a nice site for developers and dbas.
will raise questions when facing problems
with best regards
Nasir
Hi Pinal,
I read your articles.I have a question about sql mail.
Some time When we use xp_sendmail it times out.Then
we reset the sql server agent.We use sql server 2000.
Is there any other way other than resetting it and why does it occur
With regards
Preethi
By definition the Primary Key is a clustered unique index but if you really would just rather put the clustered index on a different set of fields that you can just change the index on the Primary key and make it just a unique index and then use the clustered index for whatever fields you like.
Hi Pinal
Love your blog, always very useful for reference. Just out of interest, if we have large db tables with 100m+ rows will an index on a datetime field (low selectivity) perform much worse than an index on a derived (or computed) integer column?
This the above assuming that we are only concerned with the date part and not time for these queries….
Would be interested in finding out your expert opinion…. ;-)
Thanks and keep up the good work.
Andrew
Hi Pinale,
Your last sessions in the tech-ed india 2011 was fantastics.
Thanks for that. I have attended both of them.
Here one doubt on indexing.
Is it a good idea to index datetime columns.
Lets say we have a table in which we keep daily transaction details. The table size is very large and at any time we only required the today’s transaction of the user.
So I was thinking of indexing by date column desending and after that by the user id ascending.
Please share your views
Thanks
Aditya
Yes create an index on the datetime column and make sure you use the query in such a way that it will make use of the index
Hi Pinal,
I want to know about the Index fill factor, Index fragmentation and defragmentation, Paging and linking of indexes.
I know all about the above, but not have confidence on the same, so if you can provide some book or link reference for the same, it would be highly appreciated.
Best Regards,
Mahesh
Hi Pinal,
I have several questions that i’d like to ask
1. How could we check if the indexes that we have created were good or bad i.e. degrading the performance, or overlap with other indexes? I am afraid that i’ve added too many indexes that causing degradation in my application.
2. Is there a way to delete several indexes and statistics all at once without knowing the full name and the table name i.e. only part of the indexes or statistics name.
Your blog is very informative, thanks for sharing :), keep up the good job.
Thanks,
Adi
These are not true:
– “Each table must have one Clustered Index.”
– “Clustered Index must exist before creating Non-Clustered Index.”
That can be a recommendation, but not a “must”.
Primary key and unique constraint are logical constructs (constraints) that are phisycally backed-up by an unique index. All three of them: PK, unique constraint and unique index can be created as NONCLUSTERED if you specify that keyword, and you surely can have a table without a clustered index.
Such tables are called HEAPS. It is just a type of physical structure, as CLUSTERED structure also is.
Developer and DBA should know when it is better for a table to use heap, and when it is better to have clustered index structure of a table. If you are not sure – measure and be sure!
If you have a table that is rarely accessed by a primary key (no child tables that reference it, e.g. fact table), and it is wide (has many columns), and frequently modified (heavy on insert/update/delete), you should try with a heap as a phisycal structure.
Each phisycal structure should be considered and used when appropriate.
I would add a very neat new feature of SQL 2008: FILTERED INDEXES.
Ordinary clustered or non-clustered indexes always have exactly the same number of rows as the table they are created on. For huge tables, you have huge indexes (although smaller than table itself).
If you are interesting in fetching just a small portion of rows, e.g. non-processed rows in a large table that has 99.9% of rows which are “processed”, you will benefit from filtered index greatly.
It will be very very small compared to the table, and lightning-fast, containing only rows you are interested in (un-processed).
You can have partitioned indexes on non-partitioned tables or non-partitioned (global) indexes on partitioned table. Or pertitined index and partitined table by a different partitioning schema. But if you have the same partitioning schema for table and all it’s indexes, you get some benefits (easy and super-fast EXCHANGE PARTITION feature is one of them).
Almost every detail about the indexes and their structure, how they are partitioned, what space so they take, are they used at all and much much more you can see with this add-in for SQL Management Studio:
http://www.sqlxdetails.com
Great source!
2 questions:
1. Is a qood idea to have GUID as PrimaryKey on a table? (Clustered index)
2. In case of a 2 column index (non clustered index). One column is GUID (the first -left) and the other is int (possible values: 5) (the right column-2nd). Is there a meaning in changing their order with respect to performance?
1) Short answer: guid is not a very good clustering key. Try alternative solutions. Long answer: GUID takes 16 bytes, and every NC (non-clustered) index on that table will also become wider for 16 bytes per row. That can add-up considerably and slow your system. Another thing is fragmentation (table will become very fragmented very fast, and thus will be slower than optimal), but you can partially avoid it by using NEWSEQUENTIALID() instead of NEWID() function to generate new values. Much better is to use int identity, not null. If you have replication, you can have CL key of two columns: (INT identity, smallint – replication id) which is 6 bytes, much less than 16. Primary key and clustered index are two different and independant things. Choose clustering key wisely, does not have to consist of PK columns at all.
2) For queries that filter both fields there is no measurable difference. But for queries that filter just guid, you should have guid as first column. For queries that filter just on int column, you should not have index because it is not selective and sql would go to full scan anyway. So, (guid,int) is only logical option here.
Don’t be afraid of wider indexes (2-5 key columns, and bunch of included). Consolidate similar indexes into one wider and you will get better overall performance. One has to learn a lot to get the true knowledge about how to do it properly.
1) Short answer: guid is not a very good clustering key. Try alternative solutions. Long answer: GUID takes 16 bytes, and every NC (non-clustered) index on that table will also become wider for 16 bytes per row. That can add-up considerably and slow your system. Another thing is fragmentation (table will become very fragmented very fast, and thus will be slower than optimal), but you can partially avoid it by using NEWSEQUENTIALID() instead of NEWID() function to generate new values. Much better is to use int identity, not null. If you have replication, you can have CL key of two columns: (INT identity, smallint – replication id) which is 6 bytes, much less than 16. Primary key and clustered index are two different and independant things. Choose clustering key wisely, does not have to consist of PK columns at all.
2) For queries that filter both fields there is no measurable difference. But for queries that filter just guid, you should have guid as first column. For queries that filter just on int column, you should not have index because it is not selective and sql would go to full scan anyway. So, (guid,int) is only logical option here.
Don’t be afraid of wider indexes (2-5 key columns, and bunch of included). Consolidate similar indexes into one wider and you will get better overall performance. One has to learn a lot to get the true knowledge about how to do it properly.
HI Pinal,
Can you please explain in detail the difference between cluster index seek (clustered) and Index seek (Nonclustered)..Thanks in advance
If a table has 10 indexes and 5 indexes are useful for a query then how many indexes does SQL server use?
Hi Pinal
What is the difference between Rebuild and Reorganize?
Thanks