SQL SERVER – Puzzle – Statistics are not Updated but are Created Once

After having excellent response to my quiz – Why SELECT * throws an error but SELECT COUNT(*) does not?I have decided to ask another puzzling question to all of you.

I am running this test on SQL Server 2008 R2. Here is the quick scenario about my setup.

  • Create Table
  • Insert 1000 Records
  • Check the Statistics
  • Now insert 10 times more 10,000 indexes
  • Check the Statistics – it will be NOT updated

Note: Auto Update Statistics and Auto Create Statistics for database is TRUE

Expected Result – Statistics should be updated – SQL SERVER – When are Statistics Updated – What triggers Statistics to Update

Now the question is why the statistics are not updated?

The common answer is – we can update the statistics ourselves using

UPDATE STATISTICS TableName WITH FULLSCAN, ALL

However, the solution I am looking is where statistics should be updated automatically based on algorithm mentioned here.

Now the solution is to ____________________.

Vinod Kumar is not allowed to take participate over here as he is the one who has helped me to build this puzzle.

I will publish the solution on next week. Please leave a comment and if your comment consist valid answer, I will publish with due credit.

Here is the script to reproduce the scenario which I mentioned.

-- Execution Plans Difference
-- Create Sample Database
CREATE DATABASE SampleDB
GO
USE SampleDB
GO
-- Create Table
CREATE TABLE ExecTable (ID INT,
FirstName VARCHAR(100),
LastName VARCHAR(100),
City VARCHAR(100))
GO
-- Insert One Thousand Records
-- INSERT 1
INSERT INTO ExecTable (ID,FirstName,LastName,City)
SELECT TOP 1000 ROW_NUMBER() OVER (ORDER BY a.name) RowID,
'Bob',
CASE WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith'
ELSE 'Brown' END,
CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 1 THEN 'New York'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 5 THEN 'San Marino'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 3 THEN 'Los Angeles'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 7 THEN 'La Cinega'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 13 THEN 'San Diego'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 17 THEN 'Las Vegas'
ELSE 'Houston' END
FROM
sys.all_objects a
CROSS JOIN sys.all_objects b
GO
-- Display statistics of the table - none listed
sp_helpstats N'ExecTable', 'ALL'
GO
-- Select Statement
SELECT FirstName, LastName, City
FROM ExecTable
WHERE City  = 'New York'
GO
-- Display statistics of the table
sp_helpstats N'ExecTable', 'ALL'
GO

-- Replace your Statistics over here
-- NOTE: Replace your _WA_Sys with stats from above query
DBCC SHOW_STATISTICS('ExecTable', _WA_Sys_00000004_7D78A4E7);
GO
--------------------------------------------------------------
-- Round 2
-- Insert Ten Thousand Records
-- INSERT 2
INSERT INTO ExecTable (ID,FirstName,LastName,City)
SELECT TOP 10000 ROW_NUMBER() OVER (ORDER BY a.name) RowID,
'Bob',
CASE WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith'
ELSE 'Brown' END,
CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 1 THEN 'New York'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 5 THEN 'San Marino'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 3 THEN 'Los Angeles'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 7 THEN 'La Cinega'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 13 THEN 'San Diego'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 17 THEN 'Las Vegas'
ELSE 'Houston' END
FROM
sys.all_objects a
CROSS JOIN sys.all_objects b
GO
-- Select Statement
SELECT FirstName, LastName, City
FROM ExecTable
WHERE City  = 'New York'
GO
-- Display statistics of the table
sp_helpstats N'ExecTable', 'ALL'
GO
-- Replace your Statistics over here
-- NOTE: Replace your _WA_Sys with stats from above query
DBCC SHOW_STATISTICS('ExecTable', _WA_Sys_00000004_7D78A4E7);
GO
-- You will notice that Statistics are still updated with 1000 rows
-- Clean up Database
DROP TABLE ExecTable
GO
USE MASTER
GO
ALTER DATABASE SampleDB
SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
GO
DROP DATABASE SampleDB
GO

Reference: Pinal Dave (http://blog.SQLAuthority.com)

About these ads

15 thoughts on “SQL SERVER – Puzzle – Statistics are not Updated but are Created Once

  1. Hi Pinal,

    Statistics is initially created while retrieving the data from table using the RowSample as 1000 records which are initially inserted, when we insert the Next 10000 records and retrieving the data with same Filter criteria, Query optimizer will reuse the Same Statistics, so its not triggering the engine to update Statistics

    If you check the Statistics Updated time after inserting 10000 records, it will be same as before.

    If we change the Filter criteria like this

    SELECT FirstName, LastName, City
    FROM ExecTable
    WHERE FirstName=’Bob’ and City=’New York’

    It will trigger the Statistics Update process and create a new statistics based on FirstName and also update the Statistics based on City

    Warm Regards
    Arunraj Chandrasekaran

  2. I guess it may be becuse the table is not indexed

    If we crate an Index

    CREATE CLUSTERED INDEX IDX ON ExecTable(id)

    Insert 1000 rows frist then

    SELECT FirstName, LastName, City
    FROM ExecTable
    WHERE City = ‘New York’

    — this statement compiles it first time to keep track of subsequent executions

    and

    insert next 700 rows (as per the algorithm its 500 + 20% rows changed)

    Then check your statistics using
    DBCC SHOW_STATISTICS

    we can see statistics updated.

    INSERT INTO ExecTable (ID,FirstName,LastName,City)
    SELECT TOP 1000 ROW_NUMBER() OVER (ORDER BY a.name) RowID,
    ‘Bob’,’smith’,’New York’
    FROM sys.all_objects a
    CROSS JOIN sys.all_objects b
    GO

    CREATE CLUSTERED INDEX IDX ON ExecTable(id)

    GO

    SELECT FirstName, LastName, City
    FROM ExecTable
    WHERE city = ‘New York’

    sp_helpstats N’ExecTable’, ‘ALL’
    DBCC SHOW_STATISTICS(ExecTable, ‘statistics’)

    GO

    INSERT INTO ExecTable (ID,FirstName,LastName,City)
    SELECT TOP 700 ROW_NUMBER() OVER (ORDER BY a.name) RowID,
    ‘Bob’,’smith’,’New York’
    FROM sys.all_objects a
    CROSS JOIN sys.all_objects b

    DBCC SHOW_STATISTICS(ExecTable, ‘statistics’)

    Go

    But, when there is no Index:

    Ex: when there are no Indexes

    1) First insert 1000 rows

    2)Compile statemnt
    3) insert 700 rows
    4) insert 850 rows (as per the algorithm)

    Then Check your Statistics those will be updated.

    INSERT INTO ExecTable (ID,FirstName,LastName,City)
    SELECT TOP 1000 ROW_NUMBER() OVER (ORDER BY a.name) RowID,
    ‘Bob’,’smith’,’New York’
    FROM sys.all_objects a
    CROSS JOIN sys.all_objects b

    GO

    SELECT FirstName, LastName, City
    FROM ExecTable
    WHERE city = ‘New York’

    sp_helpstats N’ExecTable’, ‘ALL’
    DBCC SHOW_STATISTICS(ExecTable, ‘statistics’)

    GO

    INSERT INTO ExecTable (ID,FirstName,LastName,City)
    SELECT TOP 700 ROW_NUMBER() OVER (ORDER BY a.name) RowID,
    ‘Bob’,’smith’,’New York’
    FROM sys.all_objects a
    CROSS JOIN sys.all_objects b

    Go

    INSERT INTO ExecTable (ID,FirstName,LastName,City)
    SELECT TOP 850 ROW_NUMBER() OVER (ORDER BY a.name) RowID,
    ‘Bob’,’smith’,’New York’
    FROM sys.all_objects a
    CROSS JOIN sys.all_objects b

    sp_helpstats N’ExecTable’, ‘ALL’

    DBCC SHOW_STATISTICS(ExecTable, ‘statistics’)

  3. Update Statistics is Triggered when ever the optimizer is trying to build an optimal execution plan for the query , in the present scenario we are trying to query the same query twice

    SELECT FirstName, LastName, City
    FROM ExecTable
    WHERE City = ‘New York’

    and allowing the optimizer to reuse the existing plan ,which is considered as most optimal plan .

  4. That was fun!

    Although the column statistics are invalidated by the time the second select statement is executed, the query is not compiled/recompiled but instead the existing query plan is reused.

    It is the “next” compiled query against the column statistics that will see that they are out of date and will then in turn instantiate the action of updating statistics.

    You can see this in action by forcing the second statement to recompile.

    SELECT FirstName, LastName, City
    FROM ExecTable
    WHERE City = ‘New York’ option(RECOMPILE)
    GO

  5. I agree with John. It is reusing the Execution Plan. Aside from OPTION(RECOMPILE), clearing the Execution Plan Cache before the subsequent tests will also work.

    i.e., run this before round 2:
    ————————————————————–
    — Clear execution plan cache before next test
    DBCC FREEPROCCACHE WITH NO_INFOMSGS;
    ————————————————————–

    Nice puzzle!

    Kevin

  6. Pingback: SQL SERVER – Question to You – When to use Function and When to use Stored Procedure Journey to SQLAuthority

  7. Pingback: SQL SERVER – Three Puzzling Questions – Need Your Answer Journey to SQLAuthority

  8. Pingback: SQL SERVER – Solution – Puzzle – Statistics are not Updated but are Created Once Journey to SQLAuthority

  9. This is an excellent post Pinal.

    I think the issue lies with the fact that the plan generated for the query is a “Trivial Plan”. This can be checked by clicking on the Select icon on the Actual Execution plan and looking at the “Optimisation Level” value in the properties window.

    In this case the statistics do get marked for update by the background auto-update stats process but aren’t actually updated since the Trivial Plan present in the cache is never recompiled irrespective of the number of times it is used. This is true even if we encapsulate the select into a stored procedure.

    The statistics update on the city column is triggered by operations like (though not limited to)
    – Explicit Procedure/Plan recompilation
    – DBCC FREEPROCCACHE
    – OPTION RECOMPILE as per John’s suggestion above
    – New query compiled which makes use of the statistics created on this column as per Arunraj’s suggestion above

    More information can be found under the heading “Skipping the Recompilation Step” on this link:

    http://technet.microsoft.com/en-us/library/cc293623.aspx

    Hope this makes sense!

    Regards
    Chirag

  10. Pingback: SQL SERVER – Weekly Series – Memory Lane – #032 | Journey to SQL Authority with Pinal Dave

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s