Question: What is Faster, SUM or COUNT?
Answer: Both are the same. Let me prove it to you.
I recently visited a big financial technology organization and while working on the day-long engagement on Comprehensive Database Performance Health Check in tuning 5 of their most expensive queries, landed upon the very interesting question by their Senior Developer. During the day-long engagement first part of the day, we spent on tuning their server and the final part of the day in tuning their expensive queries. The cumulative time for all the original five queries was over 3 hours which we were able to bring down to just a total of 6 minutes.
During the conversation, the senior developer pointed out that they are using SUM at one place and wanted to change it to COUNT as they believe that SUM is slower and COUNT is faster. Honestly, in most of the cases, they both are the same if your situation is something like in the following query.
As the senior developer and entire organization did not believe me that SUM or COUNT both perform equally in most of the case when you are counting certain conditions. Here I have created the same situation with the help of the AdventureWorks database.
Let us compare the performance of the SUM and COUNT.
USE AdventureWorks2014 GO SET STATISTICS IO ON GO -- Use of SUM -- Original Query SELECT SUM(CASE WHEN SalesOrderID > 50 THEN 1 ELSE 0 END) FROM [Sales].[SalesOrderDetail] WHERE ProductID > 750 GO -- Use of COUNT -- New Proposed Query SELECT COUNT(CASE WHEN SalesOrderID > 50 THEN 1 ELSE 0 END) FROM [Sales].[SalesOrderDetail] WHERE ProductID > 750 GO
Here is the output of the execution plan:
You can see the execution plan is identical and absolutely the same.
Let us see the output of the statistics.
You can observe that IO read is absolutely the same as well.
I have been doing SQL Server Performance Tuning Consulting for over 10 years and so far when you are counting condition with the CASE expression, SUM or COUNT, both perform absolutely the same in SQL Server.
Please leave your comment, I would love to know your experience.
Reference:Â Pinal Dave (https://blog.sqlauthority.com)
10 Comments. Leave new
The core part here is to compute the scalar value, since both expressions are essentially the same, the execution plan should be same.
Doesn’t including 0 in the COUNT statement mean everything is counted (so you basically get the count of rows) whereas if it is NULL only the 1 (ie non-null) values get counted. It has to be 0 for the SUM because the 1s and 0s get added up? I appreciate that the pint of this is about the execution difference between SUM and COUNT but the use seems wrong to me.
I agree that SUM wouldn’t be any slower than count (certainly not measurably), but the queries you’ve used in the demo aren’t interchangeable – they will return different results (unless SalesOrderID is always > 50 when ProductID > 750).
Try out the queries sir – they return the same results.
It might for those conditions, but it won’t for every query. I think Sean is correct, if you returned null instead of 0 for the count then it would return the same result as SUM. Count is just counting how many records have a value. 0 is a value just as much as 1 is a value, so count will count both instances. SUM however will add 0’s and 1’s and get you a number based on how many records there are.
This doesn’t take away from the point of the post – SUM and COUNT I think would have the same performance, but if the result is different then the performance doesn’t matter.
Try out the below and you’ll see the different results (I’m using the AdventureWorks2017 database, not 2014). The only difference from your query is the SalesOrderID:
— Use of SUM — Original Query
SELECT SUM(CASE WHEN SalesOrderID > 56245 THEN 1 ELSE 0 END)
FROM [Sales].[SalesOrderDetail]
WHERE ProductID > 750
GO –returns 52364
— Use of COUNT — New Proposed Query
SELECT COUNT(CASE WHEN SalesOrderID > 56245 THEN 1 ELSE 0 END)
FROM [Sales].[SalesOrderDetail]
WHERE ProductID > 750
GO –returns 98256
If the result is different, the performance would not matter as you said.
My story is where the result is the same SUM and COUNT behaves identically.
Even though if you try out your query where the result is different…. in terms of performance, If you check the Statistics, you will notice that both does identical Logical Read and Execution Plan is same.
So my story holds true even in the case of different results.
SUM and COUNT both performances equal in terms of speed and resource consumption in terms of page reads.
Knew it before, but could not prove it. Thanks for the blog Pinal.
Hello, thank you for this analysis. Does this work on all SQL versions or is it meant for a specific version?
All the versions!