[Note from Pinal]: This is a 83rd episode of Notes from the Field series. Maintenance of the database is very critical activity and I have always seen DBA taking it very seriously. There is a only one problem – there is no single solution or guidance for how to maintain the database. Everybody has their own opinion and way to do different tasks. Statistics is one of the most important aspect of the database. The performance of entire application can depend on statistics, as it can help SQL Engine with intelligence to execute optimal plan.
In this episode of the Notes from the Field series database expert Stuart Ainsworth explains about basic statistics maintenance.
Statistic maintenance is an important (but often overlooked) aspect of performance tuning for SQL Server. The query optimizer relies on distribution statistics to determine how the query will be executed, with a particular emphasis on SEEKS vs SCANS as well as estimates of effort needed (the cost threshold for parallelism). Out-of date stats can impact performance significantly.
Luckily, the default setting for most databases covers most database performance scenarios; SQL Server offers three basic settings for statistics maintenance:
- Auto Create Statistics – SQL Server will create statistics during an index creation script, or when deemed necessary to satisfy a query; enabled by default.
- Auto Update Statistics – SQL Server will update statistics when it deems them to be outdated; enabled by default.
- Auto Update Statistics Asynchronously – When enabled, SQL Server will updated statistics after the execution of a query if it determines that an update is necessary; if disabled, the statistics will be updated first, and then the query executed. This setting is disabled by default, and there’s some controversy over whether or not it’s useful.
For the most part, SQL Server does a good job of maintaining statistics if these options are left with the defaults; however, statistics can still become stale over time if the data is updated at a slow rate of change. If your statistics are more than a year old in your database, it’s probably time to do a refresh.
But how do you tell when your stats are out of date? There’s a catalog view in SQL Server called sys.stats that will give you a lot of information about statistics, but it’s very detailed; data is collected down to the column level, which may be overwhelming if you have lots of databases. I suggest starting at a higher level, by taking a look at how out of date statistics are across all of your databases. I use the following query to help me quickly identify troublesome databases:
/*checks last updated stats date for all databases; calculates a percentage of stats updated within the last 24 hours.
useful to determine how out-of-date statistics are. Also identifies if auto updatestates are on and if asynchronous updates
CREATE TABLE #dbs
, oldest_stats DATETIME
, newest_stats DATETIME
, percent_update_within_last_day DECIMAL(5, 2)
, is_auto_update_stats_on BIT
, is_auto_update_stats_async_on BIT
DECLARE @dynsql NVARCHAR(4000) = 'use ?
if db_id() >4
WITH x AS ( SELECT STATS_DATE(s.object_id, stats_id) date_updated
FROM sys.stats s
JOIN sys.tables t ON t.object_id = s.object_id
WHERE t.is_ms_shipped = 0
AS ( SELECT MIN(date_updated) AS oldest_stats
, MAX(date_updated) AS newest_stats
SELECT DB_NAME() database_name
, SUM(CASE WHEN DATEDIFF(d, date_updated, newest_stats) <=1 THEN 100.00 ELSE 0.00 END)/COUNT(*) "percent_update_within_last_day"
CROSS JOIN x1
CROSS JOIN sys.databases d
WHERE d.database_id = db_id()
GROUP BY oldest_stats
INSERT INTO #dbs
EXEC sp_MSforeachdb @dynsql
FROM #dbs d
DROP TABLE #dbs
Results will look similar to the following:
Looking at the results, if I see that the oldest database stats are more than a year old, it’s s a pretty good indicator that statistics are not being maintained by some form of ongoing maintenance operation. If the defaults are not being used, that’s also something that needs to be investigated.
The percent_update_within_last_day is also a good trigger for me to investigate a potential issue; if less than 20% of the statistics in a database were updated in the last 24 hours, the odds of a less-than-optimal execution plan increase significantly. If the tables in the database are large, it may take several million rows of changed data to trigger a refresh.
To refresh outdated statistics, I recommend Ola Hallengren’s maintenance scripts; they offer a lot of flexibility for developing a customized index and statistics maintenance plan, including the ability to update statistics on a scheduled basis, and focus on updating statistics that need to be refreshed. Basic maintenance (such as this) can help prevent performance problems from occurring.
If you want to get started with SQL Server with the help of experts, read more over at Fix Your SQL Server.
Reference: Pinal Dave (http://blog.sqlauthority.com)