Working with SQL Server has never seems to be monotonous – no matter how long one has worked with it. Quite often, I come across some excellent comments that I feel like acknowledging them as blog posts. Recently, I wrote an article on SQL SERVER – Execution Plan and Results of Aggregate Concatenation Queries Depend Upon Expression Location, which is well received in community.
Before you read this article further, I request you to read original article. I received very interesting comments from Bob on the blog, where he explained why this is happening. Further, he talked about a similar kind of example. Let us first read his comment, and then we will execute his example and see the resultset.
Comment from Bob
pinaldave,
If you want to see what’s going on here, I think you need to shift your point of view from an implementation-centric view to an ANSI point of view. ANSI does not guarantee processing order. Figure 2 is interesting, but it will be potentially misleading if you don’t understand the ANSI rule-set SQL Server operates under in most cases. Implementation thinking can certainly be useful at times when you really need that multi-million row query to finish before the backups fire off, but in this case, it’s counterproductive to understanding what is going on.
First off, your statements all return a single row, which were properly sorted according to your ORDER BY expression; which is to say that there was really no sorting to be done because only a single row is returned. So let’s forget about the ORDER BY. Forget that the execution plan shows 2 rows being sorted — that’s implementation specific. A different vendor’s product could have an optimizer smart enough to see that only a single row can ever be returned from this query and therefore, avoid the unnecessary sort altogether.
Secondly, you are not guaranteed that the string concatenation will occur in any particular order. The ORDER BY clause only pertains to the final results from an ANSI perspective (at least, for these queries). MS SQL Server is right in line with ANSI and will not guarantee you anything different. Don’t expect to get AB in every case.
Finally, SQL works using relational sets, so the construct varchar = varchar + varchar forms a relational set that is evaluated accordingly. I have seen this type of construct used many times and it works sometimes – until it doesn’t. This is why you get only B in one example where I suspect you wanted AB. After thinking about it a bit, I built this example which I hope is just different enough to show you the same mechanism is at work here:
DECLARE @sum INT
SELECT @sum = 0
SELECT @sum = @sum + num FROM (
SELECT 1 [num] UNION SELECT 2 [num]
) [a] ORDER BY (num * num)
SELECT @sum [Is it 3]
I tried to show two things here. The first is that the ORDER BY isn’t doing what you think it’s doing with respect to the order of concatenation in your query. The order here is nonsensical. The second thing I think the example shows is that 1+2 doesn’t equal 3 with this @ = @ + col construct in all cases. It’s a bad construct to use. Remove the ORDER BY and you get 3, leave it in and you get 2. Heck, someone else could get 1 in theory. There’s a case I see all the time where this is used, but I can’t quite recall it. I’ll post it if I think of it again.
I must admit that he has made excellent point and explained the concept very well as well. Just for clarity, we will see the implementation of the example mentioned by him to understand how it works.
DECLARE @sum INT
SELECT @sum = 0
SELECT @sum = @sum + num FROM (
SELECT 1 [num] UNION SELECT 2 [num]
) [a] ORDER BY (num * num)
SELECT @sum WithOrderBy
GO
DECLARE @sum INT
SELECT @sum = 0
SELECT @sum = @sum + num FROM (
SELECT 1 [num] UNION SELECT 2 [num]
) [a]
SELECT @sum WithoutOrderBy
GO
Let us see the result when we run the above two queries with and without ORDER BY clause.
Again, it is interesting to see how results are different from each other when there is only change of inclusion of ORDER BY clause. Bob has explained why this is happening in his comment earlier. He also explains further in another comment in the same article that.
Comment from Bob:
One more item to note is that if you built:
SELECT @Str0 = @Str0 + C1 FROM T1
into a table expression (i.e. VIEW, FUNCTION, FROM (SELECT)), you may get the results you want over and over until the one day where you call the view or function with the combination of syntax that triggers the unexpected behavoir.
pinaldave,
On my final point, I think it’s unclear/weak. I’d like to add that the most clear reason as to why you can’t expect to get AB or BA, but might in fact get just A or just B, is that the construct @=@ +col is non-deterministic and therefore, for the same range of input values, is not guaranteed to produce the same result in every instance.
Bob, Thank you very much for excellent explanation and your participation in the community.
Reference : Pinal Dave (https://blog.sqlauthority.com)
5 Comments. Leave new
Hi Pinal,
Excellent article and explantion from bob, i guess there are lot of finer nuances to be learnt with SQL.
I want to thank you for your guidance. This is really great, you have taken the original explanation to next level.