What I like the most about Microsoft is that they frequently release a new version of SQL Server. The most common problem I have after upgrading to the latest SQL Server is performance regression after an upgrade and that is what brings lots of customers to sign up for my Comprehensive Database Performance Health Check. Recently I got the opportunity to work on a very new feature of Persisting Sample Percentage for Statistics.
Why Turn Off Auto Update Statistics?
While working with my client, I noticed that they have turned off Auto Update Statistics for their database. I asked them why have they kept this one-off and I received a very interesting answer. Here is the reply from the Senior DBA:
We have kept the auto-update statistics turned off as we are manually updating them with the FULLSCAN every night. If we keep the Auto Update Statistics On during the day, sometimes the auto-update process is triggered and updates our statistics with a very low sample rate which actually provides very poor performance.
Well, if it was the year 2000, I would have totally agreed with the argument but I was surprised to know that he still believes in this method almost after 20 years when we are around the corner of the year 2020.
Details About Statistics
Here is the query which you can run to get additional details about your statistics.
SELECT stats.name AS StatisticsName, OBJECT_SCHEMA_NAME(stats.object_id) AS SchemaName, OBJECT_NAME(stats.object_id) AS TableName, last_updated AS LastUpdated, [rows] AS [Rows], rows_sampled, steps, modification_counter AS ModCounter, persisted_sample_percent PersistedSamplePercent, (rows_sampled * 100)/rows AS SamplePercent FROM sys.stats INNER JOIN sys.stats_columns sc ON stats.stats_id = sc.stats_id AND stats.object_id = sc.object_id INNER JOIN sys.all_columns ac ON ac.column_id = sc.column_id AND ac.object_id = sc.object_id CROSS APPLY sys.dm_db_stats_properties(stats.object_id, stats.stats_id) shr WHERE OBJECT_SCHEMA_NAME(stats.object_id) <> 'sys'
I have run this query against sample database WideWorldImporters and got the following results.

In the image above pay special attention to the very last column SamplePercent. The Senior DBA in our story above was actually talking about this one. The statistics were auto-updated with the default sample rate and they were updated with a default calculation of the sampling rate.
Statistics Wisdom in Brief: SQL Server whenever do the statistics update, it does not take the entire table’s data to build a sample, it actually calculates a good sampling ration and takes the selected rows to build a sample statistics which it uses to build an execution plan. In most cases, it is good enough information to build an efficient execution plan but not always.
As I said 20 years ago, I would have agreed to this one but since the release of SQL Server 2017 SP1 (to be precise from SQL Server 2016 SP1 CU4), there is a new feature which can help us to preserve our original sample rate when auto update statistics is executed. Let us learn about the trick to use PERSIST_SAMPLE_PERCENT.
Persisting Sample Percentage for Statistics
Now let us see how we can Persisting Sample Percentage for Statistics with the help of keyword PERSIST_SAMPLE_PERCENT.
First, let us update statistics where the sample percent is not 100%. In our image for the table Sales.OrderLines the statistics of key FK_Sales_OrderLines_OrderID is updated only 29 percent.
Now let us update the statistics with the full scan which will change the SamplePercent to 100 percent.
-- Let us update Statistics with Full Scan UPDATE STATISTICS Sales.OrderLines FK_Sales_OrderLines_OrderID WITH FULLSCAN GO
Next, we will once again run the statistics update with the default value which will reset the SamplePercent to 29 percent.
-- Let us update Statistics with default settings UPDATE STATISTICS Sales.OrderLines FK_Sales_OrderLines_OrderID GO
This was can be a big problem if your query is performing well at the SamplePercent 100% and performing poorly at the SamplePercent 29%.
However, this a very quick workaround to this situation. Now we can force the any statistics with the persisted sample percentage. If you want any particular statistics always at full scan you can always run the following script and lock your sample percentage.
-- Let us update Statistics with Full Scan and PERSIST_SAMPLE_PERCENT UPDATE STATISTICS Sales.OrderLines FK_Sales_OrderLines_OrderID WITH FULLSCAN, PERSIST_SAMPLE_PERCENT = ON GO
After running the above keyword, you will notice that the very last column of the our very first query will now show you sample percent which we have instructed to preserve (persist) for our statistics.
Now let us test if our statistics preserve the percentages or not. Run the following command one more time.
-- Let us update Statistics with default settings UPDATE STATISTICS Sales.OrderLines FK_Sales_OrderLines_OrderID GO
Now whenever you run the default script to update the statistics as above, it will not resample your statistics to default value but will follow whatever we have instructed to our statistics while we persisted sample percent.
However, upon running our very first query we can see that our statistics are preserved at 100%. This is also applicable to auto-update statistics. Now you can also keep your auto-update statistics enable for your server and selected mark your statistics with your preferred sample percentages and keep them persisted.
You can turn off the setting for any particular statistics by running the following command.
-- Let us update Statistics with Full Scan and PERSIST_SAMPLE_PERCENT OFF UPDATE STATISTICS Sales.OrderLines FK_Sales_OrderLines_OrderID WITH FULLSCAN, PERSIST_SAMPLE_PERCENT = OFF GO
While the above query still does full scan the subsequent run of the update statistics with any further values will just update the statistics with the original Sample Percentage.
-- Let us update Statistics with default settings UPDATE STATISTICS Sales.OrderLines FK_Sales_OrderLines_OrderID GO
Did you know about this feature? I have been using this in my consulting engagement Comprehensive Database Performance Health Check for a while and we have been seeing some amazing results. Let me know your experience with this feature. You can follow me on twitter for the latest updates.
Reference:Â Pinal Dave (https://blog.sqlauthority.com)
1 Comment. Leave new
which option is better full scan or sample?