SQL SERVER – Find Details for Statistics of Whole Database – DMV – T-SQL Script

I was recently asked is there a single script which can provide all the necessary details about statistics for any database. This question made me write following script. I was initially planning to use sp_helpstats command but I remembered that this is marked to be deprecated in future. Again, using DMV is the right thing to do moving forward. I quickly wrote following script which gives a lot more information than sp_helpstats.

USE AdventureWorks
GO
SELECT DISTINCT
OBJECT_NAME(s.[object_id]) AS TableName,
c.name AS ColumnName,
s.name AS StatName,
s.auto_created,
s.user_created,
s.no_recompute,
s.[object_id],
s.stats_id,
sc.stats_column_id,
sc.column_id,
STATS_DATE(s.[object_id], s.stats_id) AS LastUpdated
FROM sys.stats s JOIN sys.stats_columns sc ON sc.[object_id] = s.[object_id] AND sc.stats_id = s.stats_id
JOIN sys.columns c ON c.[object_id] = sc.[object_id] AND c.column_id = sc.column_id
JOIN sys.partitions par ON par.[object_id] = s.[object_id]
JOIN sys.objects obj ON par.[object_id] = obj.[object_id]
WHERE OBJECTPROPERTY(s.OBJECT_ID,'IsUserTable') = 1
AND (s.auto_created = 1 OR s.user_created = 1);

If you have better script to retrieve information about statistics, please share here and I will publish it with due credit.

Update: Read follow up excellent blog post by Jason Brimhall.

Reference: Pinal Dave (http://blog.SQLAuthority.com)

About these ads

SQL SERVER – Puzzle – Statistics are not Updated but are Created Once

After having excellent response to my quiz – Why SELECT * throws an error but SELECT COUNT(*) does not?I have decided to ask another puzzling question to all of you.

I am running this test on SQL Server 2008 R2. Here is the quick scenario about my setup.

  • Create Table
  • Insert 1000 Records
  • Check the Statistics
  • Now insert 10 times more 10,000 indexes
  • Check the Statistics – it will be NOT updated

Note: Auto Update Statistics and Auto Create Statistics for database is TRUE

Expected Result – Statistics should be updated – SQL SERVER – When are Statistics Updated – What triggers Statistics to Update

Now the question is why the statistics are not updated?

The common answer is – we can update the statistics ourselves using

UPDATE STATISTICS TableName WITH FULLSCAN, ALL

However, the solution I am looking is where statistics should be updated automatically based on algorithm mentioned here.

Now the solution is to ____________________.

Vinod Kumar is not allowed to take participate over here as he is the one who has helped me to build this puzzle.

I will publish the solution on next week. Please leave a comment and if your comment consist valid answer, I will publish with due credit.

Here is the script to reproduce the scenario which I mentioned.

-- Execution Plans Difference
-- Create Sample Database
CREATE DATABASE SampleDB
GO
USE SampleDB
GO
-- Create Table
CREATE TABLE ExecTable (ID INT,
FirstName VARCHAR(100),
LastName VARCHAR(100),
City VARCHAR(100))
GO
-- Insert One Thousand Records
-- INSERT 1
INSERT INTO ExecTable (ID,FirstName,LastName,City)
SELECT TOP 1000 ROW_NUMBER() OVER (ORDER BY a.name) RowID,
'Bob',
CASE WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith'
ELSE 'Brown' END,
CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 1 THEN 'New York'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 5 THEN 'San Marino'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 3 THEN 'Los Angeles'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 7 THEN 'La Cinega'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 13 THEN 'San Diego'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 17 THEN 'Las Vegas'
ELSE 'Houston' END
FROM
sys.all_objects a
CROSS JOIN sys.all_objects b
GO
-- Display statistics of the table - none listed
sp_helpstats N'ExecTable', 'ALL'
GO
-- Select Statement
SELECT FirstName, LastName, City
FROM ExecTable
WHERE City  = 'New York'
GO
-- Display statistics of the table
sp_helpstats N'ExecTable', 'ALL'
GO

-- Replace your Statistics over here
-- NOTE: Replace your _WA_Sys with stats from above query
DBCC SHOW_STATISTICS('ExecTable', _WA_Sys_00000004_7D78A4E7);
GO
--------------------------------------------------------------
-- Round 2
-- Insert Ten Thousand Records
-- INSERT 2
INSERT INTO ExecTable (ID,FirstName,LastName,City)
SELECT TOP 10000 ROW_NUMBER() OVER (ORDER BY a.name) RowID,
'Bob',
CASE WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith'
ELSE 'Brown' END,
CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 1 THEN 'New York'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 5 THEN 'San Marino'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 3 THEN 'Los Angeles'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 7 THEN 'La Cinega'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 13 THEN 'San Diego'
WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 17 THEN 'Las Vegas'
ELSE 'Houston' END
FROM
sys.all_objects a
CROSS JOIN sys.all_objects b
GO
-- Select Statement
SELECT FirstName, LastName, City
FROM ExecTable
WHERE City  = 'New York'
GO
-- Display statistics of the table
sp_helpstats N'ExecTable', 'ALL'
GO
-- Replace your Statistics over here
-- NOTE: Replace your _WA_Sys with stats from above query
DBCC SHOW_STATISTICS('ExecTable', _WA_Sys_00000004_7D78A4E7);
GO
-- You will notice that Statistics are still updated with 1000 rows
-- Clean up Database
DROP TABLE ExecTable
GO
USE MASTER
GO
ALTER DATABASE SampleDB
SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
GO
DROP DATABASE SampleDB
GO

Reference: Pinal Dave (http://blog.SQLAuthority.com)

SQLAuthority News – Solid Quality Journal – Importance of Statistics

My article on “Important of Statistics” has been published in Solid Quality Journal.

Statistics are a key part of getting solid performance. In this article we will go over the basics of the statistics and various best practices related to Statistics. We will go over various frequently asked questions like when to update statistics and difference between sync and async update of statistics. We will also discuss the pros and cons of the statistics update.

I have answered one very important questions in this article:

Should keep Auto Create Statistics and Auto Update Statistics settings true/on?

Download Importance of Statistics

Reference: Pinal Dave (http://blog.SQLAuthority.com)

SQL SERVER – Copy Statistics from One Server to Another Server

I was recently working on a performance tuning project in Dubai (yeah I was able to see the tallest tower from the window of my work place). I had a very interesting learning experience there. There was a situation where we wanted to receive the schema of original database from a certain client. However, the client was not able to provide us any data due to privacy issues. The schema was very important because without having an access to underlying data, it was a bit difficult to judge the queries etc. For example, without any primary data, all the queries are running in 0 (zero) milliseconds and all were using nested loop as there were no data to be returned. Even though we had CPU offending queries, they were not doing anything without the data in the tables. This was really a challenge as I did not have access to production server data and I could not recreate the scenarios as production without data.

Well, I was confused but Ruben from Solid Quality Mentors, Spain taught me new tricks. He suggested that when table schema is generated, we can create the statistics consequently. Here is how we had done that:

Once statistics is created along with the schema, without data in the table, all the queries will work as how they will work on production server. This way, without access to the data, we were able to recreate the same scenario as production server on development server.

When observed at the script, you will find that the statistics were also generated along with the query.

You will find statistics included in WITH STATS_STREAM clause.

What a very simple and effective script.

Reference: Pinal Dave (http://blog.SQLAuthority.com)

SQLAuthority News – Statistics and Best Practices – Virtual Tech Days – Nov 22, 2010

I am honored that I have been invited to speak at Virtual TechDays on Nov 22, 2010 by Microsoft. I will be speaking on my favorite subject of Statistics and Best Practices.

This exclusive online event will have 80 deep technical sessions across 3 days – and, attendance is completely FREE. There are dedicated tracks for Architects, Software Developers/Project Managers, Infrastructure Managers/Professionals and Enterprise Developers. So, REGISTER for this exclusive online event TODAY.

Statistics and Best Practices
Timing: 11:45am-12:45pm
Statistics are a key part of getting solid performance. In this session we will go over the basics of the statistics and various best practices related to Statistics. We will go over various frequently asked questions like a) when to update statistics, b) different between sync and async update of statistics c) best method to update statistics d) optimal interval of updating statistics. We will also discuss the pros and cons of the statistics update. This session is for all of you – whether you’re a DBA or developer!

You can register for this event over here.

If you have never attended my session on this subject I strongly suggest that you attend the event as this is going to be very interesting conversation between us. If you have attended this session earlier, this will contain few new information which will for sure interesting to share with all.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Find Automatically Created Statistics – T-SQL

Earlier, I wrote about my experience at an organization here: SQL SERVER – Plan Cache – Retrieve and Remove – A Simple Script. This blog post briefly narrates another experience I had at the same organization.

When I was there, I also looked at the statistics and found something that I would like to bring into the limelight. As the developers ran many non-production queries on the production server, many statistics were automatically created on the table. These stats were not useful as they were created by several queries which ran one-time or ad-hoc. Because of this, we really had to identify all the useless statistics and drop them off in order to clean up the statement.

Note: If you are going to say that it is not advisable to do as explained, I suggest you read the circumstances of the client I told in the earlier blog post.

Running the following script will return all the automatically created statistics:

SELECT name, OBJECT_NAME(OBJECT_ID)
FROM sys.stats
WHERE auto_created = 1;

You can drop the useless statistics by running the following script on each table:

DROP STATISTICS TableName.StatisticsName

Please note that not all automatically created statistics are useless, although the statistics created by the ad-hoc workload may not be that useful.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Outer Join Not Allowed in Indexed Views

I recently received an email that contains a question from one of my readers. I have already replied the answer to his email, but I would still like to bring it to your attention and ask if you think I could have done any better with the example I gave.

The question was raised when the email sender read the white paper, Improving Performance with SQL Server 2008 Indexed Views. If you scroll all the way down through the said white paper, there are several questions and answers.

Q: Why can’t I use OUTER JOIN in an Indexed view?

A: Rows can logically disappear from an Indexed view based on OUTER JOIN when you insert data into a base table. This makes the OUTER JOIN view to be increasingly updated, which is relatively difficult to implement. In addition, the performance of the implementation would be slower than for views based on standard (INNER) JOIN.

The reader was confused with my answer and wanted me to explain it further. Here is the example which I have quickly put together to demonstrate the behavior described in the above statement.
USE tempdb
GO
-- Create Two Tables
CREATE TABLE BaseTable (ID1 INT, Col1 VARCHAR(100))
CREATE TABLE JoinedTable (ID2 INT, Col2 VARCHAR(100))
GO
-- Insert Values in Tables
INSERT INTO BaseTable (ID1,Col1)
SELECT 1,'First'
UNION ALL
SELECT 2,'Second'
GO
INSERT INTO JoinedTable (ID2,Col2)
SELECT 1,'First'
UNION ALL
SELECT 2,'Second'
UNION ALL
SELECT 3,'Third'
UNION ALL
SELECT 4,'Fourth'
GO
-- Use Outer Join
SELECT jt.*
FROM BaseTable bt
RIGHT OUTER JOIN JoinedTable jt ON bt.ID1 = jt.ID2
WHERE bt.ID1 IS NULL
GO

The script above will give us the following output:

-- Now Insert Rows in Base Table
INSERT INTO BaseTable (ID1,Col1)
SELECT 3,'Third'
GO
-- You will notice that one row less retrieved from Join
SELECT jt.*
FROM BaseTable bt
RIGHT OUTER JOIN JoinedTable jt ON bt.ID1 = jt.ID2
WHERE bt.ID1 IS NULL
GO
-- Clean up
DROP TABLE BaseTable
DROP TABLE JoinedTable
GO

After running this script, you will notice that as the base table gains one row, the result loses one row. Going back to the white paper mentioned earlier, I believe this is expensive to manage for the same reason why it is not allowed in Indexed View.

Let me know if you have a better example to demonstrate this behavior in the Outer Join.

Reference: Pinal Dave (http://blog.SQLAuthority.com)