Let us try to understand this subject with example.
We will use Adventurworks database for this purpose. Table which we will be using are HumanResources.Employee (290 rows), HumanResources.EmployeeDepartmentHistory (296 rows) and HumanResources.Department (16 rows).
We will be running following two queries and observe the output. In the resultset the order of first column (EmployeeID) is different in both the cases when whole resultset is same. When compared both the results they are same but the order of rows is different in both the resultset.
Query 1 :
SELECT he.EmployeeID, he.Title, hd.Name, hd.GroupName, hdh.StartDate FROM HumanResources.Employee he LEFT JOIN HumanResources.EmployeeDepartmentHistory hdh ON he.EmployeeID = hdh.EmployeeID RIGHT JOIN HumanResources.Department hd ON hd.DepartmentID = hdh.DepartmentID
Query 2:
SELECT he.EmployeeID, he.Title, hd.Name, hd.GroupName, hdh.StartDate FROM HumanResources.Employee he INNER JOIN HumanResources.EmployeeDepartmentHistory hdh ON he.EmployeeID = hdh.EmployeeID INNER JOIN HumanResources.Department hd ON hd.DepartmentID = hdh.DepartmentID
Now the interesting part of understand why the order is different.
I would like my readers and participate here and give me their opinion. There are many reasons for this happening but what I think is “Effect of Index”.
When we use Left or Right Join, We have a base table Employee and the records are order by the primary key i.e The EmployeeID of the base table by default. But when we use the Inner Join, then the table having smallest number of records are used to order by. Here in our above example, the HumanResources.Department has 16 Records. So the records are sorted by the departmentId of the HumanResources.Department table.
Let me know your opinion and also let me know if you like this kind of small articles.
Reference : Pinal Dave (https://blog.sqlauthority.com)
10 Comments. Leave new
I think it should be stressed that this should be taken as a thought exercise and not used in lieu of an ORDER BY statement.
If an ORDER BY statement is not used, then the order is undefined. You might be able to predict it, but to base code/logic on it is a bad, bad thing to do.
Hi Pinal. I like this article/question!
Here is what I think:
I believe it is the way the join operators organize the rows when joining the two tables together. In the “outer join” query the execution plan shows hash match operators, if you change the hash match operator to nested loop on EmployeeDepartmentHistory and Department you get the same ordered result set as the “inner join” query.
I don’t know enough about how hash match join to comment on why it yeilds a different order than nested set so maybe someone can help me with that.
Here is the modified “outer join” query that yeilds the same ordered result set as the “inner join” query.
SELECT he.EmployeeID, he.Title, hd.Name, hd.GroupName, hdh.StartDate
FROM HumanResources.Employee he
LEFT JOIN HumanResources.EmployeeDepartmentHistory hdh
ON he.EmployeeID = hdh.EmployeeID
RIGHT JOIN HumanResources.Department hd
ON hd.DepartmentID = hdh.DepartmentID OPTION (LOOP JOIN)
Thanks Jacob.
Really OPTION (LOOP JOIN) is very useful to resolve such type of issues.
Although I do find this issue interesting, even as a thought exercise I don’t find it useful… yet.
I was expecting this article to explore the effect of the order of join statements in a query and how the order might be best used to optimise it.
Hi!
Nice article..
talking bout joins is there a way to join these two result sets :
TABLE 1
DATE
01/01/2008
02/01/2008
03/01/2008
TABLE 2
DATE EMP
01/01/2008 123
02/01/2008 123
03/01/2008 456
with condition of Emp=’123′
into these
RESULTS:
DATE EMP
01/01/2008 123
02/01/2008 123
03/01/2008 NULL
thanks…
declare @dt table(dt datetime)
declare @dtz table(dtz datetime,integ int)
insert into @dt
select ’01/01/2008′ union all
select ’02/01/2008′ union all
select ’05/01/2008′
insert into @dtz
select ’01/01/2008′,’0′ union all
select ’02/01/2008′,’0′ union all
select ’05/01/2008′,’1′
select g.dt,z.integ from (select * from @dtz where integ=0)z right join @dt g on g.dt=z.dtz
Could any one guide me with order of tables in query like table with less rows should be first or more rows should be first??,,,please lemme know fast…
Could any one guide me with order of tables in query like table with less rows should be first or more rows should be first??,,,please lemme know fast…
gud explanation..
I have tried this example, however I am getting the same order of rows in all the scenarios, Can you please let me know the reason..