SQL SERVER – What is Filtered Statistics?

April 7, 2015

Continuous learning is the only way to keep myself up-to-date with SQL Server product features. While I was working for a demo to show usage of SQL Server Statistics, I came across feature called Filter Statistics which was introduced in SQL Server 2008.

There are many times when we have all indexes and statistics in-place, but still optimizer is not able to make a proper estimation because it doesn’t understand the co-relation of the data. To show the power of this, I was able to make a small demo and show its real time usage.

In my database, I have created two tables. Employee and Sales. Here is the script.

USE MASTER GO IF DB_ID('DropMeAfterDemo') IS NOT NULL BEGIN ALTER DATABASE DropMeAfterDemo SET single_user WITH ROLLBACK IMMEDIATE DROP DATABASE DropMeAfterDemo END GO CREATE DATABASE DropMeAfterDemo GO SET NOCOUNT ON GO USE DropMeAfterDemo GO IF OBJECT_ID('Employee') IS NOT NULL DROP TABLE Employee GO IF OBJECT_ID('Sales') IS NOT NULL DROP TABLE Sales GO CREATE TABLE Employee(EmpID INT, name NVARCHAR(100)) GO CREATE TABLE Sales(EmpID INT, SalesAmount INT, DATETIME DATETIME) GO CREATE CLUSTERED INDEX cidx_EmpID ON Employee(EmpID) GO CREATE INDEX ix_Employee_name ON Employee(name) GO CREATE STATISTICS ix_Employee_EmpID_name ON Employee(EmpID, name) GO CREATE CLUSTERED INDEX ix_Sales_id_amount ON Sales(EmpID, SalesAmount) GO

This company has only two employees and they are not doing the same amount of sales. This is what is called as skewness in data. Here is the insert script to populate them in uneven fashion.

-- SQLAuthority is a small blog where me and my wife work INSERT Employee VALUES(1, 'Pinal') INSERT Employee VALUES(2, 'Nupur') GO SET NOCOUNT ON -- Pinal did very bad with just one Sale INSERT Sales VALUES(1, 100,GETDATE()) -- but Nupur did very well with 5000 Sales DECLARE @loop INT SET @loop = 1 WHILE @loop <= 5000 BEGIN INSERT Sales VALUES (2, @loop, GETDATE()-RAND()*1000) SET @loop = @loop + 1 END GO

To make sure the statistics are proper, I would update it with full scan.

-- updating all statistics for both tables UPDATE STATISTICS Employee WITH fullscan GO UPDATE STATISTICS Sales WITH fullscan GO

Now, let’s run the query and have a look at the actual execution plan. You need to enable that in SQL Server Management Studio.

-- Let's look at estimated and actual number of rows DBCC freeproccache GO SELECT salesamount FROM employee ,sales WHERE employee.empid = sales.empid AND name = 'pinal' GO DBCC freeproccache GO SELECT salesamount FROM employee ,sales WHERE employee.empid = sales.empid AND name = 'nupur' GO

Here is the query plan. As we can see that the estimated are not correct because SQL can’t determine the matching rows until it find Employee ID from Employee Table.

SQL SERVER - What is Filtered Statistics? FilterStats-01

SQL SERVER - What is Filtered Statistics? FilterStats-02

Now, let’s create filtered statistics based on where clause.

-- create filter statistics based on where clause in query. CREATE STATISTICS Employee_stats_EmpID ON Employee (EmpID) WHERE name = 'Pinal' GO GO CREATE STATISTICS Employee_stats_EmpID2 ON Employee (EmpID) WHERE name = 'Nupur' GO

If we run exact same query as earlier, here are the plans.

Here is the plan for “name = pinal”

SQL SERVER - What is Filtered Statistics? FilterStats-03

Here is the plan for “name = nupur”

SQL SERVER - What is Filtered Statistics? FilterStats-04

Notice two things here:
1. Statistics picked by optimizer are accurate now. Estimated rows is equal to actual rows.
2. Query Plan is different for two different values.

Here is how we can see filtered statistics

SELECT * FROM sys.stats WHERE filter_definition IS NOT NULL

SQL SERVER - What is Filtered Statistics? FilterStats-05

The main usage of this feature would be in a situation where data is skewed. Have you ever got a chance to use this feature in your environment? Let me know.

Reference: Pinal Dave (https://blog.sqlauthority.com)

SQL Statistics

SQL SERVER – Transaction Log Backup Failing After Rebuilding Database Log File

SQL SERVER – Introduction to Policy Management

15 Comments. Leave new

Shuaib
April 7, 2015 12:25 pm
Great article. Something learnt new today. Thanks for sharing.
Shuaib
Reply
- Pinal Dave
  April 8, 2015 6:32 pm
  Thanks Shuaib.
  Reply
Wilfred
April 7, 2015 1:26 pm
Seems this could also be used to prevent parameter sniffing?
Reply
- Pinal Dave
  April 8, 2015 6:33 pm
  Wilfred – I don’t think so.
  Reply
Kumar Harsh
April 7, 2015 5:11 pm
Thanks for such good article.
To clear my doubts,
Filtered index and Filtered Statistics are same ?
Reply
- Pinal Dave
  April 8, 2015 6:33 pm
  Nope Kumar. Index and stats are two different things.
  Reply
  - Sanjeev
    September 13, 2019 6:03 pm
    Hi Pinal, is there any different strategy to update Stats for Filtered Indexes. As we have observed that Stats changes so fast for filtered Indexes.
Bryan Maynard
April 7, 2015 5:37 pm
Pinal, Can you add the steps to create the filtered statistics?
Reply
- Pinal Dave
  April 8, 2015 6:33 pm
  Bryan – Done.
  Reply
ssn
April 8, 2015 2:47 pm
Thanks sir…..very useful article……..
Reply
- Pinal Dave
  April 8, 2015 6:34 pm
  thanks ssn.
  Reply
Shivendra Kumar Yadav (MCP)
April 8, 2015 10:26 pm
Awesome, Its fascinating.
Reply
- Pinal Dave
  April 9, 2015 6:32 am
  thanks Shivendra.
  Reply
vikas pathak
April 14, 2015 12:21 pm
Interesting
Reply
- Pinal Dave
  April 15, 2015 6:54 am
  vikas – Thanks!
  Reply