Continuous learning is the only way to keep myself up-to-date with SQL Server product features. While I was working for a demo to show usage of SQL Server Statistics, I came across feature called Filter Statistics which was introduced in SQL Server 2008.
There are many times when we have all indexes and statistics in-place, but still optimizer is not able to make a proper estimation because it doesn’t understand the co-relation of the data. To show the power of this, I was able to make a small demo and show its real time usage.
In my database, I have created two tables. Employee and Sales. Here is the script.
USE MASTER
GO
IF DB_ID('DropMeAfterDemo') IS NOT NULL
BEGIN
ALTER DATABASE DropMeAfterDemo SET single_user WITH ROLLBACK IMMEDIATE
DROP DATABASE DropMeAfterDemo
END
GO
CREATE DATABASE DropMeAfterDemo
GO
SET NOCOUNT ON
GO
USE DropMeAfterDemo
GO
IF OBJECT_ID('Employee') IS NOT NULL
DROP TABLE Employee
GO
IF OBJECT_ID('Sales') IS NOT NULL
DROP TABLE Sales
GO
CREATE TABLE Employee(EmpID INT, name NVARCHAR(100))
GO
CREATE TABLE Sales(EmpID INT, SalesAmount INT, DATETIME DATETIME)
GO
CREATE CLUSTERED INDEX cidx_EmpID ON Employee(EmpID)
GO
CREATE INDEX ix_Employee_name ON Employee(name)
GO
CREATE STATISTICS ix_Employee_EmpID_name ON Employee(EmpID, name)
GO
CREATE CLUSTERED INDEX ix_Sales_id_amount ON Sales(EmpID, SalesAmount)
GO
This company has only two employees and they are not doing the same amount of sales. This is what is called as skewness in data. Here is the insert script to populate them in uneven fashion.
-- SQLAuthority is a small blog where me and my wife work
INSERT Employee VALUES(1, 'Pinal')
INSERT Employee VALUES(2, 'Nupur')
GO
SET NOCOUNT ON
-- Pinal did very bad with just one Sale
INSERT Sales VALUES(1, 100,GETDATE())
-- but Nupur did very well with 5000 Sales
DECLARE @loop INT
SET @loop = 1
WHILE @loop <= 5000 BEGIN
INSERT Sales VALUES (2, @loop, GETDATE()-RAND()*1000)
SET @loop = @loop + 1
END
GO
To make sure the statistics are proper, I would update it with full scan.
-- updating all statistics for both tables
UPDATE STATISTICS Employee WITH fullscan
GO
UPDATE STATISTICS Sales WITH fullscan
GO
Now, let’s run the query and have a look at the actual execution plan. You need to enable that in SQL Server Management Studio.
-- Let's look at estimated and actual number of rows
DBCC freeproccache
GO
SELECT salesamount
FROM employee
,sales
WHERE employee.empid = sales.empid
AND name = 'pinal'
GO
DBCC freeproccache
GO
SELECT salesamount
FROM employee
,sales
WHERE employee.empid = sales.empid
AND name = 'nupur'
GO
Here is the query plan. As we can see that the estimated are not correct because SQL can’t determine the matching rows until it find Employee ID from Employee Table.
Now, let’s create filtered statistics based on where clause.
-- create filter statistics based on where clause in query.
CREATE STATISTICS Employee_stats_EmpID ON Employee (EmpID)
WHERE name = 'Pinal'
GO
GO
CREATE STATISTICS Employee_stats_EmpID2 ON Employee (EmpID)
WHERE name = 'Nupur'
GO
If we run exact same query as earlier, here are the plans.
Here is the plan for “name = pinal”
Here is the plan for “name = nupur”
Notice two things here:
1. Statistics picked by optimizer are accurate now. Estimated rows is equal to actual rows.
2. Query Plan is different for two different values.
Here is how we can see filtered statistics
SELECT *
FROM sys.stats
WHERE filter_definition IS NOT NULL
The main usage of this feature would be in a situation where data is skewed. Have you ever got a chance to use this feature in your environment? Let me know.
Reference: Pinal Dave (https://blog.sqlauthority.com)
15 Comments. Leave new
Great article. Something learnt new today. Thanks for sharing.
Shuaib
Thanks Shuaib.
Seems this could also be used to prevent parameter sniffing?
Wilfred – I don’t think so.
Thanks for such good article.
To clear my doubts,
Filtered index and Filtered Statistics are same ?
Nope Kumar. Index and stats are two different things.
Hi Pinal, is there any different strategy to update Stats for Filtered Indexes. As we have observed that Stats changes so fast for filtered Indexes.
Pinal, Can you add the steps to create the filtered statistics?
Bryan – Done.
Thanks sir…..very useful article……..
thanks ssn.
Awesome, Its fascinating.
thanks Shivendra.
Interesting
vikas – Thanks!