SQL SERVER – What is Filtered Statistics?

Continuous learning is the only way to keep myself up-to-date with SQL Server product features. While I was working for a demo to show usage of SQL Server Statistics, I came across feature called Filter Statistics which was introduced in SQL Server 2008.

There are many times when we have all indexes and statistics in-place, but still optimizer is not able to make a proper estimation because it doesn’t understand the co-relation of the data. To show the power of this, I was able to make a small demo and show its real time usage.

In my database, I have created two tables. Employee and Sales. Here is the script.

USE MASTER
GO
IF DB_ID('DropMeAfterDemo') IS NOT NULL
BEGIN
ALTER DATABASE
DropMeAfterDemo SET single_user WITH ROLLBACK IMMEDIATE
DROP DATABASE
DropMeAfterDemo
END
GO
CREATE DATABASE DropMeAfterDemo
GO
SET NOCOUNT ON
GO
USE DropMeAfterDemo
GO
IF OBJECT_ID('Employee') IS NOT NULL
DROP TABLE Employee
GO
IF OBJECT_ID('Sales') IS NOT NULL
DROP TABLE Sales
GO
CREATE TABLE Employee(EmpID INT, name NVARCHAR(100))
GO
CREATE TABLE Sales(EmpID INT, SalesAmount INT, DATETIME DATETIME)
GO
CREATE CLUSTERED INDEX cidx_EmpID ON Employee(EmpID)
GO
CREATE INDEX ix_Employee_name ON Employee(name)
GO
CREATE STATISTICS ix_Employee_EmpID_name ON Employee(EmpID, name)
GO
CREATE CLUSTERED INDEX ix_Sales_id_amount ON Sales(EmpID, SalesAmount)
GO

This company has only two employees and they are not doing the same amount of sales. This is what is called as skewness in data. Here is the insert script to populate them in uneven fashion.

-- SQLAuthority is a small blog where me and my wife work
INSERT Employee VALUES(1, 'Pinal')
INSERT Employee VALUES(2, 'Nupur')
GO
SET NOCOUNT ON
-- Pinal did very bad with just one Sale
INSERT Sales VALUES(1, 100,GETDATE())
--  but Nupur did very well with 5000 Sales
DECLARE @loop INT
SET
@loop = 1
WHILE @loop <= 5000 BEGIN
INSERT
Sales  VALUES (2, @loop, GETDATE()-RAND()*1000)
SET @loop = @loop + 1
END
GO

To make sure the statistics are proper, I would update it with full scan.

Solarwinds

-- updating all statistics for both tables
UPDATE STATISTICS Employee WITH fullscan
GO
UPDATE STATISTICS Sales WITH fullscan
GO

Now, let’s run the query and have a look at the actual execution plan. You need to enable that in SQL Server Management Studio.

-- Let's look at estimated and actual number of rows
DBCC freeproccache
GO
SELECT salesamount
FROM employee
,sales
WHERE employee.empid = sales.empid
AND name = 'pinal'
GO
DBCC freeproccache
GO
SELECT salesamount
FROM employee
,sales
WHERE employee.empid = sales.empid
AND name = 'nupur'
GO

Here is the query plan. As we can see that the estimated are not correct because SQL can’t determine the matching rows until it find Employee ID from Employee Table.

SQL SERVER - What is Filtered Statistics? FilterStats-01

SQL SERVER - What is Filtered Statistics? FilterStats-02

Now, let’s create filtered statistics based on where clause.

-- create filter statistics based on where clause in query.
CREATE STATISTICS Employee_stats_EmpID ON Employee (EmpID)
WHERE name = 'Pinal'
GO
GO
CREATE STATISTICS  Employee_stats_EmpID2 ON Employee (EmpID)
WHERE name = 'Nupur'
GO

If we run exact same query as earlier, here are the plans.

Here is the plan for “name = pinal”

SQL SERVER - What is Filtered Statistics? FilterStats-03

Here is the plan for “name = nupur”

SQL SERVER - What is Filtered Statistics? FilterStats-04

Notice two things here:
1. Statistics picked by optimizer are accurate now. Estimated rows is equal to actual rows.
2. Query Plan is different for two different values.

Here is how we can see filtered statistics

SELECT *
FROM   sys.stats
WHERE  filter_definition IS NOT NULL

SQL SERVER - What is Filtered Statistics? FilterStats-05

The main usage of this feature would be in a situation where data is skewed. Have you ever got a chance to use this feature in your environment? Let me know.

Reference: Pinal Dave (https://blog.sqlauthority.com)

Solarwinds
Previous Post
SQL SERVER – Transaction Log Backup Failing After Rebuilding Database Log File
Next Post
SQL SERVER – Introduction to Policy Management

Related Posts

15 Comments. Leave new

Leave a Reply

Menu