SQL SERVER – Index Optimization CheckList

Index optimization is always interesting subject to me. Every time I receive request to help optimize query or query on any specific table. I always ask Jr.DBA to go over following list first before I take a look at it. Most of the time the Query Speed is optimized just following basic rules mentioned below. Once following checklist applied interesting optimization part begins which only experiment and experience can resolve.

  • Create Index on frequently used columns in T-SQL Code. Columns used in WHERE, ORDER BY and GROUP BY are good candidate for Indexes. Create Index on column which are used in JOIN Condition.
  • Remove any un-necessary Indexes. As Index occupies hard drive space as well as it decreases performance of all the insert, updates, deletes to the table.
  • Smaller Index Key gives better performance than Index key which covers large data or many columns
  • Index on Integer Columns performs better than varchar columns.
  • Clustered Index must exist before creating Non-Clustered Index.
  • Clustered Index must be created on Single Column which is not changing and narrow in size. Best candidate is primary key.
  • Non-clustered Indexes increases performance of the query that returns fewer rows and rows has wide selectivity spectrum.
  • Each table must have one Clustered Index.
  • If column have low selectivity avoid creating Index on that column as it slow down the rows modification and system will not get benefit from Index.
  • Multiple Columns Index or Covered Index should be ordered as Most Selective column on left and gradually decreasing selectivity as they go right.
  • Use SORT_IN_TEMPDB option when table is created if tempdb is on different disk. This will increase the performance to create Index.
  • Rebuild Index frequently using ALTER INDEX and De-fragment Index to keep performance optimal for Indexes.

Reference : Pinal Dave (http://blog.SQLAuthority.com) , Various References

20 thoughts on “SQL SERVER – Index Optimization CheckList

  1. Am a bit new to db work..but some1 pliz assist me…
    how do i convert the following to a case statemnt

    if (condition) print ‘test…’
    if (condition) print ‘test…’
    if (condition) print ‘test…’

    is it smthin like

    case

    when (condition) then print
    –does this work

    Like

  2. Hi,

    I have a question about clustered indexes. Is it possible to configure SQL server to create a non-clustered index for a primary key by default?

    I am trying to keep my SQL vendor-independent and in standard SQL I can not specify non-clustered as many DBs will not accept it.

    thank you in advance

    Like

  3. Hi Pinal,

    I understand that
    Index on Integer Columns performs better than varchar columns

    But do Index on Integer Columns performs better than datetime columns too? or they are same? I think later.

    Also I must compliment the work you are doing. All I can say is after spending time on this site (which went right up there to my favourite’s) I am more motivated to know all about sql server.

    Thanks.

    Best Regards,
    – Gopal

    Like

  4. Hi Pinal,
    I always read articles from your websites its very informative and really a nice site for developers and dbas.

    will raise questions when facing problems

    with best regards

    Nasir

    Like

  5. Hi Pinal,
    I read your articles.I have a question about sql mail.
    Some time When we use xp_sendmail it times out.Then
    we reset the sql server agent.We use sql server 2000.
    Is there any other way other than resetting it and why does it occur

    With regards
    Preethi

    Like

  6. By definition the Primary Key is a clustered unique index but if you really would just rather put the clustered index on a different set of fields that you can just change the index on the Primary key and make it just a unique index and then use the clustered index for whatever fields you like.

    Like

  7. Hi Pinal

    Love your blog, always very useful for reference. Just out of interest, if we have large db tables with 100m+ rows will an index on a datetime field (low selectivity) perform much worse than an index on a derived (or computed) integer column?

    This the above assuming that we are only concerned with the date part and not time for these queries….

    Would be interested in finding out your expert opinion…. ;-)

    Thanks and keep up the good work.

    Andrew

    Like

  8. Hi Pinale,

    Your last sessions in the tech-ed india 2011 was fantastics.
    Thanks for that. I have attended both of them.

    Here one doubt on indexing.

    Is it a good idea to index datetime columns.
    Lets say we have a table in which we keep daily transaction details. The table size is very large and at any time we only required the today’s transaction of the user.

    So I was thinking of indexing by date column desending and after that by the user id ascending.

    Please share your views

    Thanks
    Aditya

    Like

  9. Hi Pinal,

    I want to know about the Index fill factor, Index fragmentation and defragmentation, Paging and linking of indexes.

    I know all about the above, but not have confidence on the same, so if you can provide some book or link reference for the same, it would be highly appreciated.

    Best Regards,
    Mahesh

    Like

  10. Hi Pinal,

    I have several questions that i’d like to ask
    1. How could we check if the indexes that we have created were good or bad i.e. degrading the performance, or overlap with other indexes? I am afraid that i’ve added too many indexes that causing degradation in my application.
    2. Is there a way to delete several indexes and statistics all at once without knowing the full name and the table name i.e. only part of the indexes or statistics name.

    Your blog is very informative, thanks for sharing :), keep up the good job.

    Thanks,
    Adi

    Like

  11. These are not true:
    – “Each table must have one Clustered Index.”
    – “Clustered Index must exist before creating Non-Clustered Index.”

    That can be a recommendation, but not a “must”.

    Primary key and unique constraint are logical constructs (constraints) that are phisycally backed-up by an unique index. All three of them: PK, unique constraint and unique index can be created as NONCLUSTERED if you specify that keyword, and you surely can have a table without a clustered index.
    Such tables are called HEAPS. It is just a type of physical structure, as CLUSTERED structure also is.
    Developer and DBA should know when it is better for a table to use heap, and when it is better to have clustered index structure of a table. If you are not sure – measure and be sure!
    If you have a table that is rarely accessed by a primary key (no child tables that reference it, e.g. fact table), and it is wide (has many columns), and frequently modified (heavy on insert/update/delete), you should try with a heap as a phisycal structure.
    Each phisycal structure should be considered and used when appropriate.

    I would add a very neat new feature of SQL 2008: FILTERED INDEXES.
    Ordinary clustered or non-clustered indexes always have exactly the same number of rows as the table they are created on. For huge tables, you have huge indexes (although smaller than table itself).
    If you are interesting in fetching just a small portion of rows, e.g. non-processed rows in a large table that has 99.9% of rows which are “processed”, you will benefit from filtered index greatly.
    It will be very very small compared to the table, and lightning-fast, containing only rows you are interested in (un-processed).

    You can have partitioned indexes on non-partitioned tables or non-partitioned (global) indexes on partitioned table. Or pertitined index and partitined table by a different partitioning schema. But if you have the same partitioning schema for table and all it’s indexes, you get some benefits (easy and super-fast EXCHANGE PARTITION feature is one of them).

    Almost every detail about the indexes and their structure, how they are partitioned, what space so they take, are they used at all and much much more you can see with this add-in for SQL Management Studio:
    http://www.sqlxdetails.com

    Like

  12. Great source!
    2 questions:
    1. Is a qood idea to have GUID as PrimaryKey on a table? (Clustered index)
    2. In case of a 2 column index (non clustered index). One column is GUID (the first -left) and the other is int (possible values: 5) (the right column-2nd). Is there a meaning in changing their order with respect to performance?

    Like

    • 1) Short answer: guid is not a very good clustering key. Try alternative solutions. Long answer: GUID takes 16 bytes, and every NC (non-clustered) index on that table will also become wider for 16 bytes per row. That can add-up considerably and slow your system. Another thing is fragmentation (table will become very fragmented very fast, and thus will be slower than optimal), but you can partially avoid it by using NEWSEQUENTIALID() instead of NEWID() function to generate new values. Much better is to use int identity, not null. If you have replication, you can have CL key of two columns: (INT identity, smallint – replication id) which is 6 bytes, much less than 16. Primary key and clustered index are two different and independant things. Choose clustering key wisely, does not have to consist of PK columns at all.

      2) For queries that filter both fields there is no measurable difference. But for queries that filter just guid, you should have guid as first column. For queries that filter just on int column, you should not have index because it is not selective and sql would go to full scan anyway. So, (guid,int) is only logical option here.

      Don’t be afraid of wider indexes (2-5 key columns, and bunch of included). Consolidate similar indexes into one wider and you will get better overall performance. One has to learn a lot to get the true knowledge about how to do it properly.

      Like

  13. 1) Short answer: guid is not a very good clustering key. Try alternative solutions. Long answer: GUID takes 16 bytes, and every NC (non-clustered) index on that table will also become wider for 16 bytes per row. That can add-up considerably and slow your system. Another thing is fragmentation (table will become very fragmented very fast, and thus will be slower than optimal), but you can partially avoid it by using NEWSEQUENTIALID() instead of NEWID() function to generate new values. Much better is to use int identity, not null. If you have replication, you can have CL key of two columns: (INT identity, smallint – replication id) which is 6 bytes, much less than 16. Primary key and clustered index are two different and independant things. Choose clustering key wisely, does not have to consist of PK columns at all.

    2) For queries that filter both fields there is no measurable difference. But for queries that filter just guid, you should have guid as first column. For queries that filter just on int column, you should not have index because it is not selective and sql would go to full scan anyway. So, (guid,int) is only logical option here.

    Don’t be afraid of wider indexes (2-5 key columns, and bunch of included). Consolidate similar indexes into one wider and you will get better overall performance. One has to learn a lot to get the true knowledge about how to do it properly.

    Like

  14. Pingback: SQL SERVER – Weekly Series – Memory Lane – #028 | SQL Server Journey with SQL Authority

  15. HI Pinal,

    Can you please explain in detail the difference between cluster index seek (clustered) and Index seek (Nonclustered)..Thanks in advance

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s