[Note from Pinal]: This is a 83rd episode of Notes from the Field series. Maintenance of the database is very critical activity and I have always seen DBA taking it very seriously. There is a only one problem – there is no single solution or guidance for how to maintain the database. Everybody has their own opinion and way to do different tasks. Statistics is one of the most important aspect of the database. The performance of entire application can depend on statistics, as it can help SQL Engine with intelligence to execute optimal plan.
In this episode of the Notes from the Field series database expert Stuart Ainsworth explains about basic statistics maintenance.
Statistic maintenance is an important (but often overlooked) aspect of performance tuning for SQL Server. The query optimizer relies on distribution statistics to determine how the query will be executed, with a particular emphasis on SEEKS vs SCANS as well as estimates of effort needed (the cost threshold for parallelism). Out-of date stats can impact performance significantly.
Luckily, the default setting for most databases covers most database performance scenarios; SQL Server offers three basic settings for statistics maintenance:
- Auto Create Statistics – SQL Server will create statistics during an index creation script, or when deemed necessary to satisfy a query; enabled by default.
- Auto Update Statistics – SQL Server will update statistics when it deems them to be outdated; enabled by default.
- Auto Update Statistics Asynchronously – When enabled, SQL Server will updated statistics after the execution of a query if it determines that an update is necessary; if disabled, the statistics will be updated first, and then the query executed. This setting is disabled by default, and there’s some controversy over whether or not it’s useful.
For the most part, SQL Server does a good job of maintaining statistics if these options are left with the defaults; however, statistics can still become stale over time if the data is updated at a slow rate of change. If your statistics are more than a year old in your database, it’s probably time to do a refresh.
But how do you tell when your stats are out of date? There’s a catalog view in SQL Server called sys.stats that will give you a lot of information about statistics, but it’s very detailed; data is collected down to the column level, which may be overwhelming if you have lots of databases. I suggest starting at a higher level, by taking a look at how out of date statistics are across all of your databases. I use the following query to help me quickly identify troublesome databases:
/*checks last updated stats date for all databases; calculates a percentage of stats updated within the last 24 hours.
useful to determine how out-of-date statistics are. Also identifies if auto updatestates are on and if asynchronous updates
are enabled.
*/
CREATE TABLE #dbs
(
database_name VARCHAR(100)
, oldest_stats DATETIME
, newest_stats DATETIME
, percent_update_within_last_day DECIMAL(5, 2)
, is_auto_update_stats_on BIT
, is_auto_update_stats_async_on BIT
)
DECLARE @dynsql NVARCHAR(4000) = 'use ?
;
if db_id() >4
BEGIN
WITH x AS ( SELECT STATS_DATE(s.object_id, stats_id) date_updated
FROM sys.stats s
JOIN sys.tables t ON t.object_id = s.object_id
WHERE t.is_ms_shipped = 0
) ,
x1
AS ( SELECT MIN(date_updated) AS oldest_stats
, MAX(date_updated) AS newest_stats
FROM x
)
SELECT DB_NAME() database_name
, oldest_stats
, newest_stats
, SUM(CASE WHEN DATEDIFF(d, date_updated, newest_stats) <=1 THEN 100.00 ELSE 0.00 END)/COUNT(*) "percent_update_within_last_day"
, d.is_auto_update_stats_on
, d.is_auto_update_stats_async_on
FROM x
CROSS JOIN x1
CROSS JOIN sys.databases d
WHERE d.database_id = db_id()
GROUP BY oldest_stats
, newest_stats
, d.is_auto_update_stats_on
, d.is_auto_update_stats_async_on
END
'
INSERT INTO #dbs
( database_name
, oldest_stats
, newest_stats
, percent_update_within_last_day
, is_auto_update_stats_on
, is_auto_update_stats_async_on
)
EXEC sp_MSforeachdb @dynsql
SELECT *
FROM #dbs d
DROP TABLE #dbs
Results will look similar to the following:
Looking at the results, if I see that the oldest database stats are more than a year old, it’s s a pretty good indicator that statistics are not being maintained by some form of ongoing maintenance operation. If the defaults are not being used, that’s also something that needs to be investigated.
The percent_update_within_last_day is also a good trigger for me to investigate a potential issue; if less than 20% of the statistics in a database were updated in the last 24 hours, the odds of a less-than-optimal execution plan increase significantly. If the tables in the database are large, it may take several million rows of changed data to trigger a refresh.
To refresh outdated statistics, I recommend Ola Hallengren’s maintenance scripts; they offer a lot of flexibility for developing a customized index and statistics maintenance plan, including the ability to update statistics on a scheduled basis, and focus on updating statistics that need to be refreshed. Basic maintenance (such as this) can help prevent performance problems from occurring.
If you want to get started with SQL Server with the help of experts, read more over at Fix Your SQL Server.
Reference: Pinal Dave (https://blog.sqlauthority.com)
9 Comments. Leave new
Thanks, Stuart and Pinal (and greetings to you both).
Stuart, I find that the query you offer fails because of the reference in the first query to is_temporary. that’s not a column in the select, which returns just date_updated. Is that really working for you?
In case someone else may search this page for the error (if they get it to), it’s:
Invalid column name ‘is_temporary’.
Msg 207, Level 16, State 1, Line 7
When I remove that WHERE clause, it then at least works. That said, I do get a warning:
Warning: Null value is eliminated by an aggregate or other SET operation.
That occurs once each for the 4 db’s I have. I guess some col in one of those that query doing a GROUP By has a null value causing problem for one of the aggregating functions.
But it does work otherwise, so again thanks for the nifty code! :-)
As per https://docs.microsoft.com/en-us/sql/relational-databases/system-catalog-views/sys-stats-transact-sql?view=sql-server-2017 that column was introduced in SQL Server 2012.
I guess you are running that on lower version of SQL.
Pinal is correct; I need to update the code for older versions. In fact, I have a couple different versions of this script, and this is one of the simpler forms. The reference to is_temporary is usually only necessary in an AlwaysOn environment, so removing it is fine.
The NULL error occurs because a stat may have been created but never updated; since the script is looking for an aggregate value as a rule-of-thumb, this too is OK.
Thanks for replying. Pinal, And I was having a brain fart that day, in thinking that there was a connection between cols in the WHERE clause and cols in the SELECT. (Some old DB’s used to have that limit. Just slipped into old thinking.
And I should have checked to see if it was a version-specific thing. I suspect the server I was testing it on is in 2008 or its compat mode.
At least the comments will stand, to help others coming along who may be in the same boat. If Stuart may tweak it to somehow make it conditional based on the version, you could freely delete my comments and your reply if you’d like. Thanks, as always, for all you (both) do.
I also got same error. To make it run , it simply commented where clause.
corporateworldasiseeit – Thanks. I have fixed it.
Thanks to you both, Pinal and Stuart, for your responses to the observations shared here, to make the script more widely usable by future readers. :-)
:) … Thanks.