It has been a while since I posted a puzzle on this blog and lots of people have been sending me an email requesting to have a brain teaser. Today, let us see a very simple puzzle where we are getting strange results when using NOT IN and IS NULL in a T-SQL query.
Here look at two queries if you think about it both look pretty much the same. Both the queries are run for sample database WideWorldImporters. You can get this sample database downloaded from here.
Query 1: IN and IS NOT NULL
Run the following query for the sample database WideWorldImporters.
SELECT * FROM [WideWorldImporters].[Sales].[Orders] WHERE PickedByPersonID IN ( SELECT PickedByPersonID FROM [WideWorldImporters].[Sales].[Orders] WHERE PickedByPersonID IS NOT NULL) GO
The query above will return 62,972 rows.
Query 2: NOT IN and IS NULL
Run the following query for the sample database WideWorldImporters.
SELECT * FROM [WideWorldImporters].[Sales].[Orders] WHERE PickedByPersonID NOT IN ( SELECT PickedByPersonID FROM [WideWorldImporters].[Sales].[Orders] WHERE PickedByPersonID IS NULL) GO
The query above will return 0 rows.
Why Strange Results?
Now the question to you is why just moving NOT from a subquery to outer query is making such a huge difference in the result. If you think about it, logically both the queries are the same. If you put any number for outer query or inner subquery, the result will come the same. However, there is one reason, why the result is different.
Here is the question one time – Why such a huge difference in the Resultset by moving NOT keyword?
Leave your answer in the comments section. All the answers will be kept unpublished for one week and after that, they will be all published. I will select three of the individuals randomly who has left a comment on this blog post and will send them free one month’s access to SQL Server Performance Tuning Practical Workshop – Recorded Classes.
If you have any such puzzle, please do send it to my way and I will be happy to publish with due credit to you.
Update (July 9, 2020): The winner of the contest is Mike Mohanty.Â
Reference:Â Pinal Dave (https://blog.sqlauthority.com)
117 Comments. Leave new
The most straightforward explanation is the inability of any PickedByPersonID value (or even the non-value of NULL) to match any one of the several thousand NULL PickedByPersonID records returned by the sub-query.
In contrast, the first query is a redundant selection of all the Orders records with a non-NULL PickedByPersonID value.
The difference is because your second SELECT is not returning the same results. Initially the second SELECT is returning all records that are NOT NULL and the first SELECT is returning those results (IN). By moving the NOT, the second SELECT is not returning anything and because the first SELECT is only selecting from the results IN the second SELECT the entire query returns 0 rows.
SELECT *
FROM [WideWorldImporters].[Sales].[Orders]
WHERE PickedByPersonID NOT IN (
SELECT isnull( PickedByPersonID, 0) as PickedByPersonID
FROM [WideWorldImporters].[Sales].[Orders]
WHERE PickedByPersonID IS NULL)
GO
— Any null the result will cause zero rows (Replace the null with ISNULL or COALEASE)
Because the second query always evaluates to FALSE when PIckedByPersonID is compared to NULL, nothing is retuned in the result.
SELECT PickedByPersonID
FROM [WideWorldImporters].[Sales].[Orders]
WHERE PickedByPersonID IS NULL
This subquery will always output NULL
The result is 0 due to the ANSI_NULL flag (this is by default ON)
Scenario 1 : ANSI NUILL flag is set to ON
PickedByPersonID is compared with NULL
anything compared with NULL will give the output as “UNKNOWN”
1 = NULL —> UNKNOWN
1 NULL —> UNKNOWN
NULL = NULL —> UNKNOWN
NULL NULL —> UNKNOWN
Scenarios 2 : ANSI NUILL flag is set to OFF
PickedByPersonID is compared with NULL
1 = NULL —> FALSE
1 NULL —> TRUE
NULL = NULL —> TRUE
NULL NULL —> FALSE
Due to this property, the outer query returns 0 rows
If you set the ANSI NUILL flag to OFF,
the SQL query will give the same row count which is 62972
actually: where PickedByPersonID not IN (sub-query always returns null) returns null..
it’s because the NOT IN clause in the second query cannot resolve true or false any results from the subquery because all the PickedByPersonID values are NULL, which by definition is neither true or false.
This is simple. Namely the IN and NOT IN statements act as and = and applied to an array and values in T-SQL cannot be = or to NULL, hence presenting the null values in the array will result in nothing matching or not matching them, and in this case since all values in the array are NULL there will be zero records returned.
My comment did now post the the Not Equal signs in the answer. “” or “!=”
the second query does not return any row because the null value is unknown in the where clause.
query 1::
———
SELECT *
FROM [WideWorldImporters].[Sales].[Orders]
WHERE PickedByPersonID IN (
SELECT PickedByPersonID
FROM [WideWorldImporters].[Sales].[Orders]
WHERE PickedByPersonID IS NOT NULL)
GO
First query treat like that:-
SELECT *
FROM [WideWorldImporters].[Sales].[Orders]
WHERE PickedByPersonID=1 or PickedByPersonID=2 or PickedByPersonID=3 or…….PickedByPersonID=n number
So that this query always give result if any match.
========================================================================================================
query 2::
———
SELECT *
FROM [WideWorldImporters].[Sales].[Orders]
WHERE PickedByPersonID NOT IN (
SELECT PickedByPersonID
FROM [WideWorldImporters].[Sales].[Orders]
WHERE PickedByPersonID IS NULL)
GO
Second query treat like that:-
SELECT *
FROM [WideWorldImporters].[Sales].[Orders]
WHERE PickedByPersonIDnull
So that this query never give any result. If want correct result than use correct syntax for null values as below.
SELECT *
FROM [WideWorldImporters].[Sales].[Orders]
WHERE PickedByPersonID is null
query 1::
———
SELECT *
FROM [WideWorldImporters].[Sales].[Orders]
WHERE PickedByPersonID IN (
SELECT PickedByPersonID
FROM [WideWorldImporters].[Sales].[Orders]
WHERE PickedByPersonID IS NOT NULL)
GO
First query treat like that:-
SELECT *
FROM [WideWorldImporters].[Sales].[Orders]
WHERE PickedByPersonID=1 or PickedByPersonID=2 or PickedByPersonID=3 or…….PickedByPersonID=n
So that this query always give result if any match.
========================================================================================================
query 2::
———
SELECT *
FROM [WideWorldImporters].[Sales].[Orders]
WHERE PickedByPersonID NOT IN (
SELECT PickedByPersonID
FROM [WideWorldImporters].[Sales].[Orders]
WHERE PickedByPersonID IS NULL)
GO
Second query treat like that:-
SELECT *
FROM [WideWorldImporters].[Sales].[Orders]
WHERE PickedByPersonIDnull
So that this query never give any result. If want correct result than use correct syntax for null values as below.
SELECT *
FROM [WideWorldImporters].[Sales].[Orders]
WHERE PickedByPersonID is null
the condition ‘where not in (null, …..)’ allways returns zero rows.
Would the answer by along the lines that the subquery results in a “nothing” value. The 2nd query’s subquery will return all NULL values which would indicate that nothing can be determined to be in the subquery.
In 3 value logic (which is used with SQL), comparing with NULL always results in UNKNOWN.
Second query INNER Statement returns NULLs, causing return values to be UNKNOWN ( = NULL)
(PickedByPersionID = NULL)
The second query effectively executes as…
“`
SELECT *
FROM [WideWorldImporters].[Sales].[Orders]
WHERE PickedByPersonID NULL ;
“`
This returns null (with ANSI_NULLS on, which is the default state).
By setting ANSI_NULLS off, both queries return the same results.
The Result set for Query 2 is constrained by the NOT IN (SELECT PickedByPersonID
FROM [WideWorldImporters].[Sales].[Orders]
WHERE PickedByPersonID IS NULL)
Selecting PickedByPersonID WHERE PickedByPersonID IS NULL will return an Empty Set (-As NO-ONE PICKED IT!)
This structure could work IF the Nested Query Selected [Sales].[Orders].[OrderID] instead of PickedByPersonID as in:
SELECT *
FROM [WideWorldImporters].[Sales].[Orders]
WHERE [Sales].[Orders].[OrderID] NOT IN (
SELECT [Sales].[Orders].[OrderID]
FROM [WideWorldImporters].[Sales].[Orders]
WHERE PickedByPersonID IS NULL)
GO
Brian
(ProcessIT.co.nz)
This query just returns a NULL value: And NOT IN clause cannot evaluate NULLs. So that’s the reason it’s not returning any data.
SELECT PickedByPersonID
FROM [WideWorldImporters].[Sales].[Orders]
WHERE PickedByPersonID IS NULL
NULL will be ignored in aggregate functions as well as in NOT