I keep a check on the questions received from my readers; when any question crosses my threshold, I surely try to blog about it online. Stream Aggregate is a quite commonly encountered showplan operator. I have often found it in very simple COUNT(*) operation’s execution plan. If you like to read an official note on the subject, you can read the same on Book Online over here. The Stream Aggregate operator groups rows by one or more columns and then calculates one or more aggregate expressions returned by the query.
Running the following query will give you Stream Aggregate Operator in Execution Plan. To turn on Execution Plan, press CTRL + M.
USE AdventureWorks
GO
SELECT COUNT(*) cRows
FROM HumanResources.Shift;
GO
Now if you try to see the Aggregate operation in text instead of graphic, you can find that it clearly suggests that aggregation happens because of the COUNT(*) operation.
SET SHOWPLAN_ALL ON;
GO
SELECT COUNT(*) cRows
FROM HumanResources.Shift;
GO
SET SHOWPLAN_ALL OFF;
GO
Notably, in the same operation, the Compute Scalar operation is also present, which in fact does the implicit conversion of the COUNT(*) operation. This operation is there because during the stream aggregation operation, the Expression is converted to BIGINT, and it is converted back to INT right after COUNT(*) operation.
Now let us try to prove that Stream Aggregate returns results in the format for BIGINT and not in INT datatype, which is leading to use of additional operation of scalar convert. We will run the above code with COUNT_BIG(*) instead of COUNT(*).
SET SHOWPLAN_ALL ON;
GO
SELECT COUNT_BIG(*) cRows
FROM HumanResources.Shift;
GO
SET SHOWPLAN_ALL OFF;
GO
You can clearly see from this example that there is no Compute Scalar operation when function COUNT_BIG is used in place of COUNT.
Now, the abovementioned information leads to the following question: does this mean that the performance of COUNT_BIG is better than COUNT operation. Let us compare the performance for the same by running following code.
SET SHOWPLAN_ALL ON;
GO
SELECT COUNT(*) cRows
FROM HumanResources.Shift;
GO
SELECT COUNT_BIG(*) cRows
FROM HumanResources.Shift;
GO
SET SHOWPLAN_ALL OFF;
GO
From the execution plan, the cost of both the operations is exactly same. Now I would like to get your opinion on what you think of this article and your initial reaction to the behavior of SQL Server. Please do leave a comment here.
Reference: Pinal Dave (https://blog.sqlauthority.com)
9 Comments. Leave new
Very nice Pinal, thanx!
Well, conversion from larger datatype (BIGINT – 64 bit) to smaller type (INT – 32 bit) does not take time at all. Other way around it could take couple of CPU cycles, depending how it is done.
I think what happens is that COUNT() and COUNT_BIG() are calculated into 64 bit variable and then type casted into 32 bit variable in the case of COUNT(). There is no memory allocation, copying or anything happening, just type casting. You just read the last 32 bit out of 64 bit wide memory space and leave the first 32 bit unnoticed. This takes zero CPU cycles.
(Well, not “last” 32 bit since memory isn’t ordered from left to right in the way one would think)
hi pinal,
If both the operations cost is same then which types operation we used for query writing? if both are same why there are 2 different operations given?
Great Article. Complex topics like this are understand easily by your explanation.Thanks a lot.
Wonderful blog! I found it while surfing around on Yahoo News.
Do you have any suggestions on how to get listed in Yahoo News?
I’ve been trying for a while but I never seem to get there! Thank you
Greetings! Very useful advice within this post! It’s the little changes that make the most significant changes. Many thanks for sharing!
Excellent way of describing, and pleasant article to obtain information concerning my presentation subject matter,
which i am going to convey in institution of higher education.
thanks….very good explanation.
Hi Pinal,
In My code guid is inherited every where and my most of the queries and sp’s are referencing that in where clause. i know that guid is not better option for creating indexes but already inherited everywhere is there any way to optimize the database?