Earlier I asked puzzle why statistics are not updated. Read the complete details over here: Statistics are not Updated but are Created Once
In the question I have demonstrated even though statistics should have been updated after lots of insert in the table are not updated.(Read the details SQL SERVER – When are Statistics Updated – What triggers Statistics to Update)
In this example I have created following situation:
- Create Table
- Insert 1000 Records
- Check the Statistics
- Now insert 10 times more 10,000 indexes
- Check the Statistics – it will be NOT updated
- Auto Update Statistics and Auto Create Statistics for database is TRUE
Now I have requested two things in the example 1) Why this is happening? 2) How to fix this issue?
I have many answers – here is the how I fixed it which has resolved the issue for me. NOTE: There are multiple answers to this problem and I will do my best to list all.
Solution:
- Create nonclustered Index on column City
Here is the working example for the same. Let us understand this script and there is added explanation at the end.
-- Execution Plans Difference -- Estimated Execution Plan Vs Actual Execution Plan -- Create Sample Database CREATE DATABASE SampleDB GO USE SampleDB GO -- Create Table CREATE TABLE ExecTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO CREATE NONCLUSTERED INDEX IX_ExecTable1 ON ExecTable (City); GO -- Insert One Thousand Records -- INSERT 1 INSERT INTO ExecTable (ID,FirstName,LastName,City) SELECT TOP 1000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 3 THEN 'Los Angeles' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 7 THEN 'La Cinega' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 13 THEN 'San Diego' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 17 THEN 'Las Vegas' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Display statistics of the table sp_helpstats N'ExecTable', 'ALL' GO -- Select Statement SELECT FirstName, LastName, City FROM ExecTable WHERE City = 'New York' GO -- Display statistics of the table sp_helpstats N'ExecTable', 'ALL' GO -- Replace your Statistics over here DBCC SHOW_STATISTICS('ExecTable', IX_ExecTable1); GO -------------------------------------------------------------- -- Round 2 -- Insert One Thousand Records -- INSERT 2 INSERT INTO ExecTable (ID,FirstName,LastName,City) SELECT TOP 1000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 3 THEN 'Los Angeles' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 7 THEN 'La Cinega' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 13 THEN 'San Diego' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 17 THEN 'Las Vegas' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Select Statement SELECT FirstName, LastName, City FROM ExecTable WHERE City = 'New York' GO -- Display statistics of the table sp_helpstats N'ExecTable', 'ALL' GO -- Replace your Statistics over here DBCC SHOW_STATISTICS('ExecTable', IX_ExecTable1); GO -- Clean up Database DROP TABLE ExecTable GO
When I created non clustered index on the column city, it also created statistics on the same column with same name as index. When we populate the data in the column the index is update – resulting execution plan to be invalided – this leads to the statistics to be updated in next execution of SELECT. This behavior does not happen on Heap or column where index is auto created. If you explicitly update the index, often you can see the statistics are updated as well. You can see this is for sure happening if you follow the tell of John Sansom.
John Sansom‘s suggestion:
That was fun!
Although the column statistics are invalidated by the time the second select statement is executed, the query is not compiled/recompiled but instead the existing query plan is reused.
It is the “next” compiled query against the column statistics that will see that they are out of date and will then in turn instantiate the action of updating statistics.
You can see this in action by forcing the second statement to recompile.
SELECT FirstName, LastName, City FROM ExecTable WHERE City = ‘New York’ option(RECOMPILE) GO
Kevin Cross also have another suggestion:
I agree with John. It is reusing the Execution Plan. Aside from OPTION(RECOMPILE), clearing the Execution Plan Cache before the subsequent tests will also work.
i.e., run this before round 2:
————————————————————–
– Clear execution plan cache before next test
DBCC FREEPROCCACHE WITH NO_INFOMSGS;
————————————————————–
Nice puzzle!
Kevin
As this was puzzle John and Kevin both got the correct answer, there was no condition for answer to be part of best practices. I know John and he is finest DBA around – his tremendous knowledge has always impressed me. John and Kevin both will agree that clearing cache either using DBCC FREEPROCCACHE and recompiling each query every time is for sure not good advice on production server. It is correct answer but not best practice.
By the way, if you have better solution or have better suggestion please advise. I am open to change my answer and publish further improvement to this solution. On very separate note, I like to have clustered index on my Primary Key, which I have not mentioned here as it is out of the scope of this puzzle.
Reference: Pinal Dave (https://blog.sqlauthority.com)
2 Comments. Leave new
Pinal-
I ran into similar situation at work the other day. However, I do have a non-clustered index on the column I am running the query on. The curious thing is that I noticed the statistics on that index was not updated after inserts. Ie, the query plan shows estimated row = 1. If I force a index rebuild, the stats updated itself as expected and the new query plan looked much better. I thought the statistics should update automatically. Do you have any idea why I am seeing this?
Sorry guys, but according to a whitepaper “Statistics Used by the Query Optimizer in Microsoft SQL Server 2008” posted on MIcrosoft web site (which this article refers to)
“The statistics auto update is triggered by query optimization or by execution of a compiled plan”
So, should we conclude that this statement in whitepaper is wrong and statistics update is only triggered by compile but not by execution of a compiled plan?