SQL SERVER – Interesting Observation – Query Hint – FORCE ORDER

SQL Server never stops to amaze me. As regular readers of this blog already know that besides conducting corporate training, I work on large-scale projects on query optimizations and server tuning projects. In one of the recent projects, I have noticed that a Junior Database Developer used the query hint Force Order; when I asked for details, I found out that the basic concept was not properly understood by him.

Today, let us try to understand its working and the effect of this hint. Further, we will see the extent of difference in performance created by this one query hint. I also have one interesting question for all of you as well; I will give the answer in one of my later posts.

USE AdventureWorks
GO
SELECT COUNT(*) CountEmployee -- 290 Rows
FROM HumanResources.Employee
GO
SELECT COUNT(*) CountEmployeeAddress -- 290 Rows
FROM HumanResources.EmployeeAddress
GO
SELECT COUNT(*) CountEmployeeDepartmentHistory -- 296 Rows
FROM HumanResources.EmployeeDepartmentHistory
GO
SELECT *
FROM HumanResources.Employee e
INNER JOIN HumanResources.EmployeeAddress ea ON ea.EmployeeID = e.EmployeeID
INNER JOIN HumanResources.EmployeeDepartmentHistory edh ON edh.EmployeeID = e.EmployeeID
GO
SELECT *
FROM HumanResources.Employee e
INNER JOIN HumanResources.EmployeeAddress ea ON ea.EmployeeID = e.EmployeeID
INNER JOIN HumanResources.EmployeeDepartmentHistory edh ON edh.EmployeeID = e.EmployeeID
OPTION (FORCE ORDER)
GO

In above example, I have used three tables and their row count is listed as well. Employee and EmployeeAddress – both have same number rows, and EmployeeDepartmentHistory table has around 6 rows more than the other two tables. Now let us run the query without using OPTION (FORCE ORDER) and run it along with the query hint and check the execution plan. You will find a difference in the query cost.

We all accept that the table with least number of rows should be listed as the base table, and the same is done here. We have two such tables with the least number of rows, which are listed as base tables. Now before we further explain this, let us see the execution plan for the same.

It is very clear from the above execution plan that when I order is forced the query cost goes high. This proves that the SQL Server has already made a good decision with regard to the optimized query plan. When plan is forced in the case of the joins more than 2 table the performance matters. Let us see the execution order of the table in both the cases.

Now, from the example, it is very clear when you force order the query, it evaluates the table Employee first and when it is not forced, it evaluates EmployeeAddress first. Even though both the tables have same number of rows, the query optimizer processes them differently and uses different types of join logic. When the order is not forced, it uses hash join; further, in case of forced order, it uses nested loop – this creates a significant difference in the query cost.

The conclusion of this whole exercise is very simple.

  • SQL Server Query Execution Engine is pretty smart to decide the best execution plan with least query cost for any query.
  • Order of the tables in any query can make a significant impact on the query.

Now the question for you: We have seen that using query hint of OPTION (FORCE ORDER) reduces the performance; give an example wherein we can use this hint to improve the performance?

Please leave your comment here. I will publish the answer to this question with due credit and with my own example in a later post.

Reference: Pinal Dave (http://blog.sqlauthority.com)

24 thoughts on “SQL SERVER – Interesting Observation – Query Hint – FORCE ORDER

  1. HI !! Pinal

    I want to replicate Database from My One Server location to another.
    For Example If One Database is in Ahemeabad and Other In Surat with large set of data the min. size of database is 2-3 GB.
    Here the Problem is that as per requirement I have to put One Synchronization button on my Web Page (in Website).
    Because user has not continuous Internet connectivity he will be ale to synchronize data from that button.
    So I think I need to do Web synchronization.
    If i have to use Sql Server 2008 or Window Application or any Other code Or Microsoft Sync. Framework than plz suggest me.
    I have to obtain this without any tool.
    In short I need to perform replication from Server to Server.
    Please suggest me the rigth way to perform this functionality with some code guideline.

    Thanks,
    Nilesh Rathod

    Like

  2. Hi Pinal,

    What is the difference between the hash join and the nested loop join in these two queries?

    I was also wondering how the estimated query costs are calculated.

    Thanks.

    Like

  3. Hi Pinal

    I can’t think of some specific example, but I guess if I have a query with a combination of different joins (inner-outer). I can use Option (Force Order), in order to have a optimized query.

    Comments ?

    Like

  4. Hey Nilesh

    Your Database size is 2-3 GB, so backing up this DB should take barely any time. You have following options

    1. Full Backup on Source –> Copy the Backup on Destination file share –> Restore the backup

    2. Look into Log-Shipping Option, its not bad

    Replication is your 3rd option but I would rule that out for now b’cos you mentioned internet-connectivity can be an issue

    Like

  5. OPTION (FORCE ORDER) – forced to use ‘NESTED LOOP’ option when joing table EMPLOYEE and Table EMPLOYEE HISTORY and again on EMPLOYEE DEPT HISTORY tables. Though it forced to use indexes and NESTED LOOP ( similar like FOR LOOP) but no predicate(s) on table A or B or C , it would take more time than ‘ JUST HASH MATCH’ and for loop with the data.

    This option would perform well if the query has predicates , say filter on EPLOYEE ID ( or list of IDS ).

    NESTED LOOP works like FOR LOOP record by record with table a ( base table or driving table) and table B.

    Say table a has 1000 records and table b has 1000 records , it works similar to the one mentioned below.

    for a in 1..1000
    for b in 1..1000

    end loop
    end loop

    Like

  6. Hi Pinal,

    You ask whether we can give an example where this hint would improve performance. I cannot give a specific example, but I would point out the danger of assuming that a lower optimizer cost is proof that one execution plan is better than another. You really have to look at the actual IO, CPU, and Duration to make this assessment.

    I see many cases when the optimizer’s choice of hash join is obviously badly suboptimal, even when proper indexing and statistics are present. If we see hash joins in our test environment, we will routinely “play with” hints and measure the actual performance of the query using different join types.

    Sometimes when the optimizer prefers hash joins to nested loops, when I run the query both ways I see IO, CPU and Duration that is much higher for the hash join – sometimes more than one order of magnitude higher, in fact. In these cases, I am quick to remedy the situation by using the INNER LOOP JOIN operator, which effectively does the same thing as FORCE ORDER but which I prefer because it is more explicitly named, and can be applied to only a single join.

    Cheers,
    Chris

    Like

  7. Hi Pinal,

    Came across a strange situation, where we had to use ‘FORCE ORDER’ in a badly written SQL statement for one of our client sites. The query would run without problems on SQL Server 2008 Enterprise with 35 to 40 seconds, but the same query on SQL Server 2008 Standard version would take hours. Once we used ‘FORCE ORDER’ query hint on the SQL Server 2008 Standard version the query would run within a minute.

    The question which comes to mind is why ‘SQL Server 2008 Standard’ version behaving different when compared to ‘SQL Server 2008 Enterprise’ version? Any insight to the difference between Standard and Enterprise version behavior will be greatly appreciated.

    (We had to get this going to setup the application at our client site. We have to revisit the query and rewrite it to work more efficiently.)

    Thanks,
    Vinod

    Like

  8. Hi Pinal,

    Thanks for the information on (Force Order) Hint…but then which is the scerario where we should use this Hint to optimize the performance of the query?

    Thanks,
    Munish Bansal

    Like

    • Hello Munish,

      There is no defined scenario to use FORCE ORDER hint. Because in most cases the SQL Server by default uses the best possible order for optimal plan. But This hint is only for cases where you try it and found considerable performance difference.

      Regards,
      Pinal Dave

      Like

  9. Funny thing. I tried this with SQL Server 2008. Both queries produced exactly same execution plans. And if I force HASH JOIN to the query without FORCE ORDER hint I get quite opposite readings:

    The former one without FORCE ORDER but with forced HASH JOIN query cost goes to 74%.

    The latter one with FORCE ORDER query cost is only 26%.

    I think the problem, when HASH JOIN is forced, is that optimizer must do index scan for both tables (Employee and EmployeeAddress) and build hash table out of one of them. That takes time. Then it have to go through the hash table and calculate hash from the other table and so on.

    In the latter case optimizer scans only the Employee table and does index seek to the EmployyAddress table in a single run. Allthough index seek is done “SELECT COUNT(*) FROM Employee” times it is still faster than filling and comparing hash table.

    But these are just my thoughts and I’m glad to be corrected :)

    Like

  10. It’s been my experience with OPTION(FORCE ORDER) that it only increases preformance if the order your tables appear in SQL is from smallest rowset to largest, such as:

    Company > Division > Employee > Entry

    and it makes things worse if they appear in any other order, such as
    Entry > Employee > Division > Company

    Like

  11. Hi Pinal,

    I need synchronize the tables from client to server and server to client using SQL server Sync framework and also sync from server to client sync based on some conditions.

    regards,
    Hema

    Like

  12. Pinal,

    It’s one thing to say the Force Order hint doesn’t improve performance over the Optimizer, but have you come across a situation where you absolutely needed the Force Order hint?

    Amazingly, I have!
    But reproducing it is not easy..
    I’ll do my best to describe.

    2005 Standard 64
    CREATE TABLE A (
    [A_id] int pk identity(1,1),
    [B_id] int,
    [Ref] varchar(200)
    )
    CREATE TABLE B (
    [B_id] int pk identity(1,1),
    [RefType] int
    )
    CREATE TABLE C (
    [C_id] int pk identity(1,1),
    [D_id] int,

    )
    CREATE TABLE D (
    [D_id] int pk identity(1,1),

    )

    Table A
    The idea here is that [Ref] can hold different keys for different tables. Table B’s [RefType] defines which table/key to join on.
    Table C is that table. Lastly, Table D simply joins with C but really has nothing more to do with the situation other than returning a column.

    — Data —
    Table A (big)
    1,1,’123′
    2,2,’ABC’

    Table B (small)
    1,1
    2,2

    Table C (very big)
    123, 85, …

    Table D (small)
    85, …

    Finally, the fun:
    SELECT *
    FROM A
    JOIN B ON B.B_ID = A.B_ID
    JOIN C ON C.C_ID = CAST(A.Ref as int)
    JOIN D ON D.D_ID = C.D_ID
    WHERE B.RefType = 1

    A is scanned first; B should be next, but sometimes the optimizer chooses C instead.

    Since there is an alpha key ‘abc,’ the query fails because it cannot cast ‘abc’ as int.

    The cure..
    SELECT *
    FROM A
    JOIN B ON B.B_ID = A.B_ID
    JOIN C ON C.C_ID = CAST(A.Ref as int)
    JOIN D ON D.D_ID = C.D_ID
    WHERE B.RefType = 1
    OPTION (FORCE ORDER)

    Have fun with this one!
    If you can find some way to correct the optimizer’s behavior without the Hint, that would be great.

    Oh and one last note..
    Obviously this situation represents a much larger situation and C governs about 20 different joins. Ref is full of all sorts of stuff. And out of all of them, only 1 presented the above error and we didn’t see it until we ran it in 2005 (ran fine in 2000).
    So don’t give up if you can’t reproduce it right away.

    Thanks,
    Benjamin

    Like

  13. “Please leave your comment here. I will publish the answer to this question with due credit and with my own example in a later post.”
    How much later? Or is it already posted, but you forgot to link to it from here?

    I can only see that OPTION (FORCE ORDER) is clearly better when otherwise the Query Optimizer would “timeout” (it’s not time, it’s number of tries it does) due to too many possible alternatives to try. When query plan generation stops due to timeout, the plan isn’t considered “GoodEnough” (which it is when the Optimizer usually quits). FORCE ORDER greatly reduces the number of possibilities, and the timeout may be avoided.

    Like

  14. Thought I’d just throw my situation into the mix. Using a recursive query with 3 subqueries, one of which was across databases in the same instance. Yeah, please don’t ask why and what was the deal with the data model but if I didnt use FORCE ORDER I could never run it properly.

    Like

  15. select * from
    (select distinct top(@n) * from table1)as t1
    left join table2 on…
    left join table3 on …

    where @n is read from some table but is known to be small (<10) and table2 and table3 are big tables

    Like

  16. Pingback: SQL SERVER – Weekly Series – Memory Lane – #050 | Journey to SQL Authority with Pinal Dave

  17. Hi
    We have a similar situation to Lena. A top 1 query that performs very badly without the hint. With FORCE ORDER the query runs 10 times faster. Given that the query runs a lot to poll a work table it improves the overall server performance considerably

    Like

  18. In scenarios where you know that a JOIN to an inline select statement will reduce the size of further huge table joins, FORCE ORDER is very useful. Without the hint, SQL server will join big tables first thus producing a significantly bigger amount of logical reads before the inner select join is able to reduce the whole result set down.

    Like

  19. I got this error:

    Msg 8624, Level 16, State 116, Procedure usp_xxxxxxxx, Line 265
    Internal Query Processor Error: The query processor could not produce a query plan. For more information, contact Customer Support Services.

    and using the FORCE ORDER option on two queries within the procedure fixed it for me. The odd thing was that it was working a couple of days before without the option.

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s