I was reading the blog of Ward Pond, and I came across another note of Microsoft. I really found it very interesting. The given explanation was very simple; however, I would like to rewrite it again.
Let us execute the following script. This script inserts two values ‘A’ and ‘B’ in the table and outputs a simple code to concatenate each other to produce the result ‘AB’.
IF EXISTS( SELECT * FROM sysobjects WHERE name = 'T1' )
DROPÂ TABLE T1
GO
CREATEÂ TABLE T1( C1 NCHAR(1)Â Â )
INSERT T1 VALUES( 'A' )
INSERT T1 VALUES( 'B' )
DECLARE @Str0 VARCHAR(4)
SET @Str0 = ''
SELECT @Str0 = @Str0 + C1 FROM T1 ORDERÂ BY C1
SELECT @Str0 AS Result
DROPÂ TABLE T1
The code to concatenating any two string is very simple which is used three times in following example.
IF EXISTS( SELECT * FROM sysobjects WHERE name = 'T1' )
DROPÂ TABLE T1
GO
CREATEÂ TABLE T1( C1 NCHAR(1)Â Â )
INSERT T1 VALUES( 'A' )
INSERT T1 VALUES( 'B' )
DECLARE @Str0 VARCHAR(4)
DECLARE @Str1 VARCHAR(4)
DECLARE @Str2 VARCHAR(4)
SET @Str0 = ''
SET @Str1 = ''
SET @Str2 = ''
SELECT @Str0 = @Str0 + C1 FROM T1 ORDERÂ BY C1
SELECT @Str1 = @Str1 + C1 FROM T1 ORDERÂ BY LTRIM( RTRIM( C1 )Â )
SELECT @Str2 = @Str2 + LTRIM( RTRIM( C1 )Â ) FROM T1 ORDERÂ BY C1
SELECT @Str0 'No functions applied to column.'
SELECT @Str1 'LTRIM() and RTRIM() applied to ORDER BY clause.'
SELECT @Str2 'SELECT list with LTRIM(RTRIM()) (Workaround)'
DROPÂ TABLE T1
Resultset of the above query is as follows.
From this result, it is evident that the resultset where function is applied in ORDER BY clause gives the wrong result. When the same function is moved to SELECT clause, it gives the correct result. That is in one way very strange; however, this is how it is defined in SQL Server standard. The behavior of the function in ORDER BY is not defined anywhere in SQL documentation.
The best practice is to avoid the usage of function in ORDER BY clause when string concatenation operations are executed.
The reason for this behavior is that the use of function in ORDER BY clause will change the order of query execution and create an unexpected output.
Let me know if you have any other example for the same or a better explanation.
Reference : Pinal Dave (https://blog.sqlauthority.com)
12 Comments. Leave new
That is interesting.
That’s Strange thing about String in Sqlserver.
I use datetime many time in the order clause with
cast and convert function it give me proper output
every time.
Very strange.
And thanx for the link to Ward’s blog!
pinaldave,
If you want to see what’s going on here, I think you need to shift your point of view from an implementation-centric view to an ANSI point of view. ANSI does not guarantee processing order. Figure 2 is interesting, but it will be potentially misleading if you don’t understand the ANSI rule-set SQL Server operates under in most cases. Implementation thinking can certainly be useful at times when you really need that multi-million row query to finish before the backups fire off, but in this case, it’s counterproductive to understanding what is going on.
First off, your statements all return a single row, which were properly sorted according to your ORDER BY expression; which is to say that there was really no sorting to be done because only a single row is returned. So let’s forget about the ORDER BY. Forget that the execution plan shows 2 rows being sorted — that’s implementation specific. A different vendor’s product could have an optimizer smart enough to see that only a single row can ever be returned from this query and therefore, avoid the unnecessary sort altogether.
Secondly, you are not guaranteed that the string concatenation will occur in any particular order. The ORDER BY clause only pertains to the final results from an ANSI perspective (at least, for these queries). MS SQL Server is right in line with ANSI and will not guarantee you anything different. Don’t expect to get AB in every case.
Finally, SQL works using relational sets, so the construct varchar = varchar + varchar forms a relational set that is evaluated accordingly. I have seen this type of construct used many times and it works sometimes – until it doesn’t. This is why you get only B in one example where I suspect you wanted AB. After thinking about it a bit, I built this example which I hope is just different enough to show you the same mechanism is at work here:
DECLARE @sum INT
SELECT @sum = 0
SELECT @sum = @sum + num FROM (
SELECT 1 [num] UNION SELECT 2 [num]
) [a] ORDER BY (num * num)
SELECT @sum [Is it 3]
I tried to show two things here. The first is that the ORDER BY isn’t doing what you think it’s doing with respect to the order of concatenation in your query. The order here is nonsensical. The second thing I think the example shows is that 1+2 doesn’t equal 3 with this @ = @ + col construct in all cases. It’s a bad construct to use. Remove the ORDER BY and you get 3, leave it in and you get 2. Heck, someone else could get 1 in theory. There’s a case I see all the time where this is used, but I can’t quite recall it. I’ll post it if I think of it again.
Kuldip,
When CAST is used with a datetime, it is non-deterministic, which would change the optimization path. In this case, the implicit CAST from nchar to varchar is deterministic. My only point being that your example would be treated differently by the optimizer, so it’s probably not a similar enough case. If you’re using @date = @date + datecol over a set of rows, I think you should consider some different construct.
One more item to note is that if you built:
SELECT @Str0 = @Str0 + C1 FROM T1
into a table expression (i.e. VIEW, FUNCTION, FROM (SELECT)), you may get the results you want over and over until the one day where you call the view or function with the combination of syntax that triggers the unexpected behavoir.
It’s a strange behaviour of LTRIM, RTRIM functions
@Bob
Very nice explanation. I do have some comments.
“The ORDER BY clause only pertains to the final results from an ANSI perspective (at least, for these queries). ”
TOP is after the ORDER BY. The help file (TOP) explains that.
SELECT…INTO ignores the ORDER BY (see ORDER BY).
SQL Server’s implementation of ORDER BY is arbitrary. I just don’t feel the ANSI standard is truly an answer.
pinaldave,
On my final point, I think it’s unclear/weak. I’d like to add that the most clear reason as to why you can’t expect to get AB or BA, but might in fact get just A or just B, is that the construct @=@ +col is non-deterministic and therefore, for the same range of input values, is not guaranteed to produce the same result in every instance.
hi Pinal,
could you please send interview questions on DTS,SSIS,SSRS,SSAS,Crystal reports please.