Question: Does the order of conditions matter in a WHERE clause? If we reorder the conditions in the WHERE clause, does it improve performance? Do SQL Server WHERE conditions support short-circuit evaluation?
A senior developer in my organization asked me this question about the WHERE clause.
In SQL Server, the order of conditions in the WHERE clause does not matter. SQL Server does not guarantee short-circuit evaluation in the written order, and reordering the conditions does not help performance.
Today is quick puzzle time.
I recently heard from someone that the order does matter and that a senior SQL person was able to reproduce it, but I have no proof of it and have not seen it myself.
Here are the rules for you:
- You can use any number of tables in your query
- You can only change the order of columns in WHERE conditions
- You need to use either the AND or the OR operator between conditions of the WHERE clause
- The performance will be measured using the Actual Execution Plan and SET STATISTICS IO ON
- The result set returned from the query should be the same before changing the order of columns in WHERE condition and after changing the order of columns in WHERE condition.
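A comparison along the lines of these rules could be set up as in the sketch below. The table dbo.Orders and its columns are hypothetical and exist only for illustration; substitute your own objects. With "Include Actual Execution Plan" enabled in SSMS, run both statements and compare the plans and the logical reads reported in the Messages tab.

```sql
-- Hypothetical table and columns, used only for illustration.
SET STATISTICS IO ON;

-- Version 1: conditions in the original order
SELECT OrderID
FROM dbo.Orders
WHERE CustomerID = 42
  AND OrderDate >= '20200101';

-- Version 2: same conditions, order swapped
SELECT OrderID
FROM dbo.Orders
WHERE OrderDate >= '20200101'
  AND CustomerID = 42;

SET STATISTICS IO OFF;
```

Both statements must return the same rows for the comparison to be valid under the rules above.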
Winning solutions will be posted on this blog with due credit. Here is the related blog post: SQL SERVER – Does Order of Column in WHERE Clause Matter?
You can click on the above link and read the blog post with an example. Trust me, it is a very interesting example, and I look forward to reading your observations here.
I often get such questions in my SQL Server Performance Tuning Consultancy Comprehensive Database Performance Health Check.
Reference: Pinal Dave (https://blog.sqlauthority.com)
Case in point:
If you have a column that contains both text and integers and you try to query only the integers using the ISNUMERIC function, it matters where you place this check in your WHERE clause.
The following will fail:
where convert(int, txtCol) = 1
and isnumeric(txtCol) = 1
The following will work:
where isnumeric(txtCol) = 1
and convert(int, txtCol) = 1
At least this is my observation.
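If the goal is simply to make this filter safe wherever it appears in the WHERE clause, something like the following sketch may help. It assumes SQL Server 2012 or later for TRY_CONVERT, which returns NULL instead of raising an error when a value cannot be converted. The table name dbo.MixedValues is hypothetical; txtCol is the column from the example above. For older versions, a CASE expression generally keeps the conversion behind the check, with the caveat noted in the comments.

```sql
-- dbo.MixedValues is a hypothetical table name; txtCol comes from the example above.

-- Option 1 (SQL Server 2012+): TRY_CONVERT yields NULL for non-convertible values,
-- so the predicate does not rely on the order in which conditions are evaluated.
SELECT txtCol
FROM dbo.MixedValues
WHERE TRY_CONVERT(int, txtCol) = 1;

-- Option 2 (older versions): convert only when the check passes.
-- Caveat: ISNUMERIC returns 1 for some values (for example '$' or '1e5')
-- that CONVERT(int, ...) still rejects, so this is not fully safe.
SELECT txtCol
FROM dbo.MixedValues
WHERE CASE WHEN ISNUMERIC(txtCol) = 1 THEN CONVERT(int, txtCol) END = 1;
```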
I have been trying to research this issue, as well as the effect of the order of conditions in a join (separated by AND).
For example, if TableB is partitioned on Col3, could we see a performance improvement by listing a.Col3 = b.Col3 first in the following join?
SELECT a.Col1, b.Col1
FROM TableA AS a
INNER JOIN TableB AS b
ON a.Col1 = b.Col1
AND a.Col2 = b.Col2
AND a.Col3 = b.Col3
Some people say that the optimizer will do this internally. I put this question to another SQL Server expert and was told that the conditions are processed in order, so it does make a difference. He provided the following queries as evidence:
This query will not error. The first condition is met, so the 2nd condition is never processed.
and cast('bob' as datetime)='hello'
This query will error since the first condition is impossible.
and cast('bob' as datetime)='hello'
This was shown as proof that conditions are processed in order. He said that join conditions are processed in the same way.
My initial thinking is that if the column on which the table is partitioned is listed first, SQL Server would only have to consider that single partition, saving reads on the other disks and therefore improving the performance of that stored procedure and of the database overall.
When I tested this, however, I was unable to see any difference in the execution plan.
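A test of this kind could look like the sketch below, which requests the estimated plans for both orderings of the join conditions without executing the queries. It reuses TableA and TableB from the join above and assumes a tool such as SSMS or sqlcmd, where GO acts as a batch separator.

```sql
-- TableA and TableB are the tables from the join above.
-- SET SHOWPLAN_XML must be the only statement in its batch, hence the GO separators
-- (GO is a batch separator in SSMS and sqlcmd, not a T-SQL statement).
SET SHOWPLAN_XML ON;
GO

-- Ordering 1: partitioning column listed last
SELECT a.Col1, b.Col1
FROM TableA AS a
INNER JOIN TableB AS b
    ON a.Col1 = b.Col1
   AND a.Col2 = b.Col2
   AND a.Col3 = b.Col3;

-- Ordering 2: partitioning column listed first
SELECT a.Col1, b.Col1
FROM TableA AS a
INNER JOIN TableB AS b
    ON a.Col3 = b.Col3
   AND a.Col1 = b.Col1
   AND a.Col2 = b.Col2;
GO

SET SHOWPLAN_XML OFF;
GO
```

The two returned plans can then be compared side by side, including which partitions each one accesses.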
If there is no difference, can you explain to me how SQL Server does this, and why the above queries operate as they do?
Also, SQL Server 2005 does support short-circuit conditions. I think this was not supported in SQL Server 2000.
This is an interesting debate. I am a PeopleSoft developer and have worked extensively with SQL. I think the order of conditions has very little impact.
To justify what I say, I have a simple example:
There is a big query which joins many tables and so has various mappings. Now I want this query to run conditionally, so I declare a variable outside the SQL and use it in the SQL. Wherever I place this fairly simple condition (the substituted condition looks like 'C' = 'C' or 'C' = 'F'), the query execution time doesn't change much. It takes approximately 4 seconds where a false Boolean is encountered.
When the amount of data to be fetched is small, re-executing the same query many times gives different timings. This depends on various factors, such as whether the query was compiled or executed recently.