Feeds:
Posts
Comments

Posts Tagged ‘Statistics’

I was recently asked is there a single script which can provide all the necessary details about statistics for any database. This question made me write following script. I was initially planning to use sp_helpstats command but I remembered that this is marked to be deprecated in future. Again, using DMV is the right thing to do moving forward. I quickly wrote following script which gives a lot more information than sp_helpstats.

USE AdventureWorks
GO
SELECT DISTINCT
OBJECT_NAME(s.[object_id]) AS TableName,
c.name AS ColumnName,
s.name AS StatName,
s.auto_created,
s.user_created,
s.no_recompute,
s.[object_id],
s.stats_id,
sc.stats_column_id,
sc.column_id,
STATS_DATE(s.[object_id], s.stats_id) AS LastUpdated
FROM sys.stats s JOIN sys.stats_columns sc ON sc.[object_id] = s.[object_id] AND sc.stats_id = s.stats_id
JOIN sys.columns c ON c.[object_id] = sc.[object_id] AND c.column_id = sc.column_id
JOIN sys.partitions par ON par.[object_id] = s.[object_id]
JOIN sys.objects obj ON par.[object_id] = obj.[object_id]
WHERE OBJECTPROPERTY(s.OBJECT_ID,'IsUserTable') = 1
AND (s.auto_created = 1 OR s.user_created = 1);

If you have better script to retrieve information about statistics, please share here and I will publish it with due credit.

Update: Read follow up excellent blog post by Jason Brimhall.

Reference: Pinal Dave (http://blog.SQLAuthority.com)

About these ads

Read Full Post »

Earlier I asked puzzle why statistics are not updated. Read the complete details over here: Statistics are not Updated but are Created Once

In the question I have demonstrated even though statistics should have been updated after lots of insert in the table are not updated.(Read the details SQL SERVER – When are Statistics Updated – What triggers Statistics to Update)

In this example I have created following situation:

  • Create Table
  • Insert 1000 Records
  • Check the Statistics
  • Now insert 10 times more 10,000 indexes
  • Check the Statistics – it will be NOT updated
  • Auto Update Statistics and Auto Create Statistics for database is TRUE

Now I have requested two things in the example 1) Why this is happening? 2) How to fix this issue?

I have many answers – here is the how I fixed it which has resolved the issue for me. NOTE: There are multiple answers to this problem and I will do my best to list all.

Solution:

  • Create nonclustered Index on column City

Here is the working example for the same. Let us understand this script and there is added explanation at the end.

-- Execution Plans Difference
-- Estimated Execution Plan Vs Actual Execution Plan
-- Create Sample Database
CREATE DATABASE SampleDB
GO
USE SampleDB
GO
-- Create Table
CREATE TABLE ExecTable (ID INT,
FirstName VARCHAR(100),
LastName VARCHAR(100),
City VARCHAR(100))
GO
CREATE NONCLUSTERED INDEX IX_ExecTable1
ON ExecTable (City);
GO
-- Insert One Thousand Records
-- INSERT 1
INSERT INTO ExecTable (ID,FirstName,LastName,City)
SELECT TOP 1000 ROW_NUMBER() OVER (ORDER BY a.name) RowID,
'Bob',
CASE WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith'
ELSE 'Brown' END,
CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 1 THEN 'New York'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 5 THEN 'San Marino'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 3 THEN 'Los Angeles'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 7 THEN 'La Cinega'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 13 THEN 'San Diego'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 17 THEN 'Las Vegas'
ELSE 'Houston' END
FROM
sys.all_objects a
CROSS JOIN sys.all_objects b
GO
-- Display statistics of the table
sp_helpstats N'ExecTable', 'ALL'
GO
-- Select Statement
SELECT FirstName, LastName, City
FROM ExecTable
WHERE City  = 'New York'
GO
-- Display statistics of the table
sp_helpstats N'ExecTable', 'ALL'
GO
-- Replace your Statistics over here
DBCC SHOW_STATISTICS('ExecTable', IX_ExecTable1);
GO
--------------------------------------------------------------
-- Round 2
-- Insert One Thousand Records
-- INSERT 2
INSERT INTO ExecTable (ID,FirstName,LastName,City)
SELECT TOP 1000 ROW_NUMBER() OVER (ORDER BY a.name) RowID,
'Bob',
CASE WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith'
ELSE 'Brown' END,
CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 1 THEN 'New York'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 5 THEN 'San Marino'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 3 THEN 'Los Angeles'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 7 THEN 'La Cinega'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 13 THEN 'San Diego'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 17 THEN 'Las Vegas'
ELSE 'Houston' END
FROM
sys.all_objects a
CROSS JOIN sys.all_objects b
GO
-- Select Statement
SELECT FirstName, LastName, City
FROM ExecTable
WHERE City  = 'New York'
GO
-- Display statistics of the table
sp_helpstats N'ExecTable', 'ALL'
GO
-- Replace your Statistics over here
DBCC SHOW_STATISTICS('ExecTable', IX_ExecTable1);
GO
-- Clean up Database
DROP TABLE ExecTable
GO

When I created non clustered index on the column city, it also created statistics on the same column with same name as index. When we populate the data in the column the index is update – resulting execution plan to be invalided – this leads to the statistics to be updated in next execution of SELECT. This behavior does not happen on Heap or column where index is auto created. If you explicitly update the index, often you can see the statistics are updated as well. You can see this is for sure happening if you follow the tell of John Sansom.

John Sansom‘s suggestion:

That was fun!

Although the column statistics are invalidated by the time the second select statement is executed, the query is not compiled/recompiled but instead the existing query plan is reused.

It is the “next” compiled query against the column statistics that will see that they are out of date and will then in turn instantiate the action of updating statistics.

You can see this in action by forcing the second statement to recompile.

SELECT FirstName, LastName, City
FROM ExecTable
WHERE City = ‘New York’ option(RECOMPILE)
GO

Kevin Cross also have another suggestion:

I agree with John. It is reusing the Execution Plan. Aside from OPTION(RECOMPILE), clearing the Execution Plan Cache before the subsequent tests will also work.

i.e., run this before round 2:
————————————————————–
– Clear execution plan cache before next test
DBCC FREEPROCCACHE WITH NO_INFOMSGS;
————————————————————–

Nice puzzle!

Kevin

As this was puzzle John and Kevin both got the correct answer, there was no condition for answer to be part of best practices. I know John and he is finest DBA around – his tremendous knowledge has always impressed me. John and Kevin both will agree that clearing cache either using DBCC FREEPROCCACHE and recompiling each query every time is for sure not good advice on production server. It is correct answer but not best practice.

By the way, if you have better solution or have better suggestion please advise. I am open to change my answer and publish further improvement to this solution. On very separate note, I like to have clustered index on my Primary Key, which I have not mentioned here as it is out of the scope of this puzzle.

Reference: Pinal Dave (http://blog.SQLAuthority.com)

Read Full Post »

After having excellent response to my quiz – Why SELECT * throws an error but SELECT COUNT(*) does not?I have decided to ask another puzzling question to all of you.

I am running this test on SQL Server 2008 R2. Here is the quick scenario about my setup.

  • Create Table
  • Insert 1000 Records
  • Check the Statistics
  • Now insert 10 times more 10,000 indexes
  • Check the Statistics – it will be NOT updated

Note: Auto Update Statistics and Auto Create Statistics for database is TRUE

Expected Result – Statistics should be updated – SQL SERVER – When are Statistics Updated – What triggers Statistics to Update

Now the question is why the statistics are not updated?

The common answer is – we can update the statistics ourselves using

UPDATE STATISTICS TableName WITH FULLSCAN, ALL

However, the solution I am looking is where statistics should be updated automatically based on algorithm mentioned here.

Now the solution is to ____________________.

Vinod Kumar is not allowed to take participate over here as he is the one who has helped me to build this puzzle.

I will publish the solution on next week. Please leave a comment and if your comment consist valid answer, I will publish with due credit.

Here is the script to reproduce the scenario which I mentioned.

-- Execution Plans Difference
-- Create Sample Database
CREATE DATABASE SampleDB
GO
USE SampleDB
GO
-- Create Table
CREATE TABLE ExecTable (ID INT,
FirstName VARCHAR(100),
LastName VARCHAR(100),
City VARCHAR(100))
GO
-- Insert One Thousand Records
-- INSERT 1
INSERT INTO ExecTable (ID,FirstName,LastName,City)
SELECT TOP 1000 ROW_NUMBER() OVER (ORDER BY a.name) RowID,
'Bob',
CASE WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith'
ELSE 'Brown' END,
CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 1 THEN 'New York'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 5 THEN 'San Marino'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 3 THEN 'Los Angeles'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 7 THEN 'La Cinega'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 13 THEN 'San Diego'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 17 THEN 'Las Vegas'
ELSE 'Houston' END
FROM
sys.all_objects a
CROSS JOIN sys.all_objects b
GO
-- Display statistics of the table - none listed
sp_helpstats N'ExecTable', 'ALL'
GO
-- Select Statement
SELECT FirstName, LastName, City
FROM ExecTable
WHERE City  = 'New York'
GO
-- Display statistics of the table
sp_helpstats N'ExecTable', 'ALL'
GO

-- Replace your Statistics over here
-- NOTE: Replace your _WA_Sys with stats from above query
DBCC SHOW_STATISTICS('ExecTable', _WA_Sys_00000004_7D78A4E7);
GO
--------------------------------------------------------------
-- Round 2
-- Insert Ten Thousand Records
-- INSERT 2
INSERT INTO ExecTable (ID,FirstName,LastName,City)
SELECT TOP 10000 ROW_NUMBER() OVER (ORDER BY a.name) RowID,
'Bob',
CASE WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith'
ELSE 'Brown' END,
CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 1 THEN 'New York'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 5 THEN 'San Marino'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 3 THEN 'Los Angeles'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 7 THEN 'La Cinega'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 13 THEN 'San Diego'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 17 THEN 'Las Vegas'
ELSE 'Houston' END
FROM
sys.all_objects a
CROSS JOIN sys.all_objects b
GO
-- Select Statement
SELECT FirstName, LastName, City
FROM ExecTable
WHERE City  = 'New York'
GO
-- Display statistics of the table
sp_helpstats N'ExecTable', 'ALL'
GO
-- Replace your Statistics over here
-- NOTE: Replace your _WA_Sys with stats from above query
DBCC SHOW_STATISTICS('ExecTable', _WA_Sys_00000004_7D78A4E7);
GO
-- You will notice that Statistics are still updated with 1000 rows
-- Clean up Database
DROP TABLE ExecTable
GO
USE MASTER
GO
ALTER DATABASE SampleDB
SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
GO
DROP DATABASE SampleDB
GO

Reference: Pinal Dave (http://blog.SQLAuthority.com)

Read Full Post »

My article on “Important of Statistics” has been published in Solid Quality Journal.

Statistics are a key part of getting solid performance. In this article we will go over the basics of the statistics and various best practices related to Statistics. We will go over various frequently asked questions like when to update statistics and difference between sync and async update of statistics. We will also discuss the pros and cons of the statistics update.

I have answered one very important questions in this article:

Should keep Auto Create Statistics and Auto Update Statistics settings true/on?

Download Importance of Statistics

Reference: Pinal Dave (http://blog.SQLAuthority.com)

Read Full Post »

Quite often when I am staring at my SSMS I wonder what is going on under the hood in my SQL Server. I often want to know which database is very busy and which database is bit slow because of IO issue. Sometime, I think at the file level as well. I want to know which MDF or NDF is busiest and doing most of the work. Following query gets the same results very quickly.

SELECT DB_NAME(vfs.DbId) DatabaseName, mf.name,
mf.physical_name, vfs.BytesRead, vfs.BytesWritten,
vfs.IoStallMS, vfs.IoStallReadMS, vfs.IoStallWriteMS,
vfs.NumberReads, vfs.NumberWrites,
(
Size*8)/1024 Size_MB
FROM ::fn_virtualfilestats(NULL,NULL) vfs
INNER JOIN sys.master_files mf ON mf.database_id = vfs.DbId
AND mf.FILE_ID = vfs.FileId
GO

When you run above query you will get many valuable information like what is the size of the file as well how many times the reads and writes are done for each file. It also displays the read/write data in bytes. Due to IO if there has been any stall (delay) in read or write, you can know that as well.

I keep this handy but have not shared on blog earlier.

Reference: Pinal Dave (http://blog.SQLAuthority.com)

Read Full Post »

Older Posts »