Today is quick puzzle time.
Does the order column used in WHERE clause matter for performance?
I recently heard from someone it does matter and a senior SQL person was able to reproduce it, but again I have no proof for it and I have not seen it before.
Here are the rules for you -
- You can use any numbers of the tables in your query
- You can only change the order of columns in WHERE clause
- You need to use either AND or OR clause between conditions of the WHERE clause
- Performance will be measured using Actual Execution Plan and SET IO Statistics ON
- The resultset returned from the query should be the same before changing order of columns in WHERE condition and after changing order of columns in WHERE condition.
Winning solutions will be posted on this blog with due credit.
I am giving here few sample queries and as per my observations they are always giving me same performance no matter how I change the order of column in WHERE clause.
Example 1:
USE AdventureWorks
GO
SELECT *
FROM HumanResources.Employee e
INNER JOIN HumanResources.EmployeeAddress ea ON e.EmployeeID = ea.EmployeeID
WHERE ea.AddressID = 188 AND e.ManagerID != 44
GO
USE AdventureWorks
GO
SELECT *
FROM HumanResources.Employee e
INNER JOIN HumanResources.EmployeeAddress ea ON e.EmployeeID = ea.EmployeeID
WHERE e.ManagerID != 44 AND ea.AddressID = 188
GO
Example 2:
USE AdventureWorks
GO
SELECT *
FROM HumanResources.Employee e
INNER JOIN HumanResources.EmployeeAddress ea ON e.EmployeeID = ea.EmployeeID
WHERE e.ManagerID = 44 AND ea.AddressID != 188
GO
SELECT *
FROM HumanResources.Employee e
INNER JOIN HumanResources.EmployeeAddress ea ON e.EmployeeID = ea.EmployeeID
WHERE ea.AddressID != 188 AND e.ManagerID = 44
GO
Example 3:
USE AdventureWorks
GO
-- Many results
SELECT *
FROM Sales.SalesOrderDetail
WHERE SalesOrderID = 43659 AND SpecialOfferID = 1
GO
-- One condition wrong
SELECT *
FROM Sales.SalesOrderDetail
WHERE SalesOrderID = 43649 AND SpecialOfferID = 1
GO
-- One condition wrong
SELECT *
FROM Sales.SalesOrderDetail
WHERE SpecialOfferID = 0 AND SalesOrderID = 43659
GO
Let us see what is your solution to this question.
Reference: Pinal Dave (http://blog.sqlauthority.com)
Hi,
I think Order of columns does matter and it is to do with the Index utilization. What I understand and have seen is if there is an index defined and the order of index definition is different from the order in which the columns are used in the WHERE condition, then there is a possibility that the index may not be used. Hence from performance perspective, query response time increases. This is what I understand.
Regards,
Phani.
Hi Phani,
Do you have any example to prove that, it will be interesting as I also could not create one.
Kind Regards,
Pinal
Hi Dave,
I am not more experience compare than you. I would to like to share what i understand
SET STATISTICS TIME ON
SET STATISTICS IO ON
SELECT *
FROM HumanResources.Employee e
INNER JOIN HumanResources.EmployeeAddress ea ON e.EmployeeID = ea.EmployeeID
WHERE ea.AddressID = 188 AND e.ManagerID != 44
use before each query ,Elopsed time increases.
I am not sure,this is correct or not.
Theoretically it makes a difference with the following index:
(A char(890), B int). If A is constant and B is highly selective then a query for A = … and B = … will be faster if the highly selective column is put first (B). example query: select * from X where not exists (select * from TestTable where A = … and B = …). Then sql server can abort the match as soon as it did not find a row with B = … which is likely because B is selective. The comparison to A can be saved.
In practice I doubt this is implemented.
nice topic, will try to work on a example. I think when you use covering indexes, the order of columns come into effect. I will double check the order of columns thing and post it.
Yes, the order DOES matter… in certain cases… specifically if you use EXISTS in the WHERE clause predicates.
See the second half of my article on this below:
http://bradsruminations.blogspot.com/2010/04/looking-under-hood.html
–Brad
Thank you Sir,
Your post is always the best of the best!
Kind Regards,
As far as I know Order matters but only in some special cases.
Above article from Brad is very helpful.
I have couple of questions :
1. Does Query Optimizer modifies the query in order to achieve better performance.
2. If the above point is true then does optimizer take decision depending on the indexed available on the tables.
Regards
-Paw
Pingback: SQLAuthority News – A Monthly Round Up of SQLAuthority Blog Posts – August 2010 Journey to SQL Authority with Pinal Dave
hi sir
order of the columns in where effects the performance ,
i cant provide you any example
regards
I have a question regarding the where clause.
My SQL Database environment looks like this:
1. No indexes
2. No normalization
3. Tables are too long
Database comes from a vendor so I can’t improve its structure.
Now, I have been asked to improve performance on all reports, interfaces, procedures, views, etc. In some cases reports took as long as 4 hours to run at 3 in the morning when no one was using the server.
I was able to produce these reports within 8 mintues or less during normal business hours.
Now my question is, how is performance affected when the “where clause” is present vs not present? Does it make a difference in performance how I use the “where clause”?
Example 1: where clause not present:
Select *
from table1 inner join table2 on table1.id = table2.id
vs
Select *
from table1 inner join table2 on table1.id = table2.id
where table1.name = ‘Joe’
Example 2: how I use the where clause:
Select *
from table1 inner join table2 on table1.id = table2.id
where table1.name = ‘Joe’
vs
Select *
from table1 inner join table2 on table1.id = table2.id and table1.name = ‘Joe’
Obviously depending on what results I expect I know whether to use the “where clause” or not. Thank you.
Havek:
Without WHERE Clause: Every single row in Table1 and every single row in Table2 are scanned and the results are JOINed together.
With WHERE Clause: Since you say you have no indexes, the same thing will happen… Table1 and Table2 will be fully scanned; however, only Table1 rows that satisfy the WHERE clause will be JOINed to the Table2 rows. (Of course, if you had an index on Table1.Name and you had indexes on Table1.ID and Table2.ID, the query would go like lightning).
In your Example 2, since you are doing an INNER JOIN, it doesn’t matter where you put the predicate… putting in the ON clause or the WHERE clause doesn’t make any difference. However it would make a difference if you were doing an OUTER JOIN.
–Brad
BTW, I love this site. Its normally the first place I come to whenever I have an SQL doubt. Keep up the great work.
Hi Dave,
First of all thank you for your help giving to us all.
Since I’m new to this world, I have a question concerning the performance:
Concerning the performance, is there any difference between these two statements?
SELECT Id, Field_Name
FROM dbo.Table1;
–OR
SELECT Id, Field_Name
FROM dbo.Table1
WHERE Field_Name LIKE ‘%%’;
The result I get is the same.
My point is that I want to use the second statement like this:
–This is a parameter that I get from a user input
DECLARE @Filter NVARCHAR(50);
SET @Filter = ‘Name’;
DECLARE @FilterFieldName NVARCHAR(50);
IF @Filter = ‘Name’
SET @FilterFieldName = ”
ELSE
SET @FilterFieldName = @Filter
SELECT Id, Field_Name
FROM dbo.Table1
WHERE Field_Name LIKE ‘%’ + @FilterFieldName +’%';
Another option I was thinking was, to use an IF, but that force me to write the same thing twice like that.
–This is a parameter that I get from a user input
DECLARE @Filter NVARCHAR(50);
SET @Filter = ‘Name’;
IF @Filter = ‘Name’
–once
SELECT Id, Field_Name
FROM dbo.Table1
ELSE
–twice
SELECT Id, Field_Name
FROM dbo.Table1
WHERE Field_Name LIKE ‘%’ + @Filter +’%';
Thank you again and best regards!
select * from tbl_Task where Convert(datetime,Time,101)=’2011/12/28′
2011/12/28 data is avalable in tbl_Task Table but i retrive data from only date in datetime datatype
Plz sove this query
What is the datatype of “Time”? If it is a datetime, then if you want all data having to do with ’2011/12/28′, then it’s best to do this:
SELECT *
FROM tbl_Task
WHERE [Time]>=’20111228′
AND [Time]<'20111229'
–Brad
Pingback: SQL SERVER – An Important Part of Most SELECT statement – WHERE clause – Quiz – Puzzle – 4 of 31 « SQL Server Journey with SQL Authority
i have 3 tables
table 1 – date, a , b , c ,d ,e ,f , etc
table 2 – date,g , h ,i ,j , etc
table 3 – date,k , l , m, n ,etc
i want output
a,b,c,i,j,m,n
where etc etc = (SELECT MAX(etc) AS Expr1 FROM table1,table2,table3 WHERE (date_m = ’27/01/2012′))
i ant query to show data …..
please help me sir
Hi Pinal,
There is a problem, if i have a number of tables in inner join & on clause every column have a clustered or non clustered index.
Is there any condition, in on clause column have same type of index ?
Thanks
Ashok
Hey..
I wanted to ask u guys, if we have 2 tables A and B, A contains 1 Million Records , and B contains 1000 records, which one will be better??
A.id=B.id
or
B.id=A.id
or there are no difference?
thank u..
sorry i meant which one will be better in where clause!!
thanks