During SQLPass summit, I was asked following question by one of the attendees. The question was about Performance Comparison of Except vs NOT IN. Let us quickly discuss this question.
Question: Which of the operator is better for query performance – EXCEPT or NOT IN?
Answer: The EXCEPT operator returns all of the distinct rows from the query to the left of the EXCEPT operator when there are no matching rows in the right query. EXCEPT operator works the same way NOT IN.
In simple words, there is absolutely no difference between how these operators work and there is absolutely no difference in the performance. They are just the same operators with the different name. Let us prove this with a simple example. You can run following two different queries with different operators and see how they perform exactly the same.
-- EXCEPT USE AdventureWorks2014; GO SELECT ProductID FROM Production.Product EXCEPT SELECT ProductID FROM Production.WorkOrder; GO -- NOT IN USE AdventureWorks2014; GO SELECT ProductID FROM Production.Product WHERE ProductID NOT IN ( SELECT ProductID FROM Production.WorkOrder); GO
When you run above script, it gives following execution plan.
Upon carefully observing that you can see that execution plan is absolutely same for both the queries. This essentially proves that both the operators give the same results and the same performance.
Reference: Pinal Dave (https://blog.sqlauthority.com)
Performance may be similar, but EXCEPT handles NULLs as equivalent (i.e. ANSI NULLS OFF) where NOT IN does not, so the results will be different if NULLs exist in the comparison column(s).
I wish SQLServer experts would test edge cases instead of blindly reposting the same bad “interview questions”. This one was posted five separate times to Twitter tonight.
You are not guaranteed that the execution plan will stay the same forever, when using either EXCEPT and NOT IT.
It is possible that if the queries become more complex, then the Optimizer will decide to treat them differently and generate two different execution plans.
Also, looking at cost is not relevant, since you should also look at number of executions per each item of the execution plan tree (especially in the case of Nested Loops).
There are more details which need checking (and in my opinion it is good to do so) when deciding if two queries are the same.
Dear Pinal Sir,
As you have also mentioned “The EXCEPT operator returns all of the distinct rows from the query to the left of the EXCEPT operator”, so there will be difference in the output as it will not happen in the case of “NOT IN”.
In the below example, the both operator works differently.
declare @tab1 table(id int)
insert into @tab1 select 1 union all select 1 union all select 2
declare @tab2 table(id int)
insert into @tab2 select 2
select id from @tab1
select id from @tab2
select id from @tab1
where id not in (select id from @tab2)
Well…..EXCEPT can work on a result set. NOT IN cannot. NOT IN works only on a given column. Therefore, depending upon the business context, EXCEPT and NOT IN cannot be used interchangeably.
The return query There is a distinct difference between EXCEPT and NOT IN key words especially when the sub query used in the NOT IN query returns NULL values. See below sample codes.
declare @table1 table (id smallint, name varchar(100))
declare @table2 table (id smallint, name varchar(100))
insert into @table1 values (1,’Rob’),(2,’John’)
insert into @table2 values (1,’Rob’),(3,’Max’),(Null,’Sam’)
select * from @table1 where id NOT IN (select id from @table2)
select * from @table1
select * from @table2
1 2 John
Therefore, it is best practise to avoid NOT IN key word when you are uncertain that the sub query doesn’t return any NULL values.
Not completely true sir. I have faced a scenario where delete with not in clause got stuck for hours while when I changed the query with except clause it executed in 2 minutes. I have seem the different execution plan for it. I believe it depends on certain scenario as well.
I believe that except performs a distinct operation. Other sources I have read say that due to having to run the distinct the query can be slower, also their query plan comparison shows the distinct which yours does not. It may be best to update this as your answer was the first on Google but the next 10 all said not in performed better due to this.