In today’s blog post let us discuss Performance Comparison – BETWEEN, IN, and Operators. I recently discussed this during the Comprehensive Database Performance Health Check.
Let me ask you the question first, look at the queries here and tell me which one of the queries will run the faster.
SET STATISTICS IO ON GO -- Query with Between SELECT * FROM [WideWorldImporters].[Sales].[InvoiceLines] WHERE InvoiceID >= 20 AND InvoiceID <= 40; GO -- Query with Operators SELECT * FROM [WideWorldImporters].[Sales].[InvoiceLines] WHERE InvoiceID BETWEEN 20 AND 40; GO -- Query with In SELECT * FROM [WideWorldImporters].[Sales].[InvoiceLines] WHERE InvoiceID IN (20,21,22,23,24,25,26,27,28,29,30, 31,32,33,34,35,36,37,38,39,40) GO
Now here is the question for you when you look at the queries here, which one of the queries here will give optimal performance.
Let me post a screenshot of the execution plan.
It is very clear from the execution plan that in our case the query which is using IN keywords is the fastest. Let us see the output of the statistics IO.
BETWEEN: Table ‘InvoiceLines’. Scan count 1, logical reads 162
Operators <= and >= : Table ‘InvoiceLines’. Scan count 1, logical reads 152
IN: Table ‘InvoiceLines’. Scan count 1, logical reads 180
It is very clear from the output of the statistics IO even though the second query looks expensive with the operator it is not that IO intensive.
The final summary is that in any case using IN which gives a clear idea to the SQL Server Query engine what actually we are looking for in our resultset results in the BETTER ESTIMATION from the STATISTICS, leading optimal performance.
Now I have tried with many different queries and every single time I have found that BETWEEN is either slower than IN or gives equally bad/good performance as IN. I have yet to come across the scenario where BETWEEN is better than IN. This leads me to believe that if there is an opportunity one should use the IN operator over BETWEEN.
It is very much possible that this query gives different results for you on your system and also the result is totally dependent on the statistics and data distribution you have in your table. I also believe there are queries out where BETWEEN may be faster than any other option.
Based on everything that we have discussed, I suggest that you test out your queries with your data and do the final conclusion. I will be interested in reading your comments. Should I convert this blog post to SQL in Sixty Seconds video?
Update: The original script used in the video had an error in the WHERE condition which has been fixed. However, this still does not change the opinion that one should check the STATISTICS IO before making the decision which query is optimal for them.
Reference: Pinal Dave (https://blog.sqlauthority.com)