Using SQL Server functions in the WHERE clause of your queries can significantly impact performance. SQL Server must calculate the function for every row in the table, which can be time-consuming, especially for large tables. Instead, using the raw data in your WHERE clause whenever possible is more efficient.
Why Are Functions in the WHERE Clause a Problem?
SQL functions in a WHERE clause can prevent SQL Server from effectively using indexes on the columns used in the function. This happens because the function needs to be executed for each row in the table, effectively turning an indexed search into a full table scan.
Moreover, using functions in the WHERE clause can lead to computation overhead because the function needs to be calculated for each row in the table. This can be especially impactful for large tables with millions of rows.
Why Do People Convert String to Upper Case and Compare?
Case sensitivity can be a tricky issue in SQL. By default, string comparisons in SQL Server are case sensitive because the server uses binary collation. This can lead to unexpected results for those not familiar with this behavior. For example, ‘Smith’ and ‘smith’ would not be seen as equal in a binary collation.
One common workaround is to use the UPPER (or LOWER) function to convert all strings to the same case before comparison. This ensures that ‘Smith’ and ‘smith’ are seen as equal. However, as we’ve discussed, using these functions in the WHERE clause can impact performance.
A more efficient solution is to use the COLLATE clause to specify a case-insensitive collation for the comparison. This allows SQL Server to do a case-insensitive comparison without converting all the strings, which can be much more efficient.
Real-World Testing and the Importance of Indexes
Person table in the
Person schema of the
AdventureWorks2019 database. The
Person the table has a column
LastName that contains text strings, and we assume an index exists in this column.
We executed the following queries:
SET STATISTICS IO ON -- 211 records SELECT * FROM AdventureWorks2019.Person.Person WHERE UPPER(LastName) LIKE 'DIAZ%'; -- 211 records SELECT * FROM AdventureWorks2019.Person.Person WHERE LastName COLLATE SQL_Latin1_General_CP1_CI_AS LIKE 'DIAZ%'; -- 1 records SELECT * FROM AdventureWorks2019.Person.Person WHERE UPPER(LastName) LIKE 'JARVIS%'; -- 1 records SELECT * FROM AdventureWorks2019.Person.Person WHERE LastName COLLATE SQL_Latin1_General_CP1_CI_AS LIKE 'JARVIS%';
The first and third queries use the UPPER function in the WHERE clause. The second and fourth queries use the COLLATE clause to specify a case-insensitive collation, allowing for a case-insensitive comparison of the column without performing any functions.
Here is the execution plan comparison when we searched for DIAZ, which returned 211 rows.
Here is the execution plan comparison when we searched for JARVIS, which returned 1 row.
The statistics for these queries were as follows:
- For ‘DIAZ‘:
- With UPPER: 762 logical reads
- With COLLATE: 659 logical reads
- For ‘JARVIS‘:
- With UPPER: 110 logical reads
- With COLLATE: 5 logical reads
The COLLATE clause performs significantly fewer reads than the UPPER function, demonstrating that it’s more efficient to avoid using functions in the WHERE clause whenever possible. However, it’s important to note that this performance benefit is primarily due to the existence of an index on the
LastName column. If no index existed on the column, the performance difference between the two methods may not be significant.
This deep dive illustrates why using functions in the WHERE clause can significantly impact the performance of your SQL Server queries and why converting strings to upper case for comparison purposes can be problematic. By understanding the underlying reasons and knowing how to avoid using functions in the WHERE clause, you can optimize your queries and improve the overall efficiency of your database operations. Always consider the data type, collation, and indexing strategy when writing your SQL queries to ensure optimal performance.
You can always reach out to me on Twitter.
Reference: Pinal Dave (https://blog.sqlauthority.com)