[Note from Pinal]: This is a 117th episode of Notes from the Field series. Every time I go to do performance tuning consultancy, one thing annoys me and makes me angry. The usage of NOLOCK hint. I think without any argument, this is one of the most abused query hint in the SQL Server universe. Lots of people treat NOLOCK as a silver bullet of performance tuning and that is unfortunate. There is a special purpose of how NOLOCK should be used and what is the end goal of it.
In this episode of the Notes from the Field series database expert Stuart Ainsworth explains about NOLOCK and its best practices. Trust me, you do not want to overuse NOLOCK hint as it can just impact your database integrity negatively.
I’ve been working on a project to help performance tune a mature application; this database has been developed by several different people over the last decade, and there’s been interesting choices made along the way. At some point, the decision was made to use NOLOCK liberally in an attempt to increase throughput on the application. While NOLOCK is not the only issue with this application, development management realized that it is a bad practice, and wanted to reduce the use of this hint.
When is it OK to use NOLOCK?
Before I go too much further, I wanted to make sure that I addressed why the use of NOLOCK is a bad practice under normal circumstances, and when it is appropriate to consider NOLOCK (or its ANSI-compatible sibling, TRANSACTION ISOLATION LEVEL READ UNCOMMITTED). Like most SQL Server hints, the practice of NOLOCK is usually discovered early in the career of a database professional. At some point, the pro encounters a situation where blocking is slowing down application performance, usually in a high-transaction OLTP system with lots of INSERTS or UPDATE’s happening at the same time as SELECT queries are running. Using NOLOCK reduces blocking so that the SELECT statements can return data without waiting for the INSERT or UPDATE to complete. To a young developer, it looks like a magic “go-faster” button, and the temptation is there to use it everywhere.
As database professionals mature, we begin to realize that NOLOCK can return “dirty data”; most people define dirty data as data that not been committed, but it’s more than that. Transactions that UPDATE or INSERT new rows may update an index or run out of space on a data page; this can require new pages to be allocated and move existing rows (the dreaded page split). If data is moving around during a SELECT statement, you can miss previously committed rows or see the same row returned multiple times.
So when is it appropriate to use NOLOCK? In short, when the performance of your SELECT query matters more than accuracy; in practice, that’s a very tiny segment of database applications. There’s cases where having immediate access to information is more important than accuracy; for example, the pedometer on my wrist is telling me how many steps I’ve taken today. It’s not completely accurate, but having immediate access to that information reminds me to walk more. However, if I want to know the balance in my checking account, reliability is more important than performance; it’s nice to see that I just bought a latte 30 seconds ago, but if I had to wait a few minutes before it hit my banking account, I’d be OK with that.
So how do you solve the NOLOCK problem?
To reduce blocking in your database application, you have several options; here are my three favorite:
- Improve the performance of your server. This can mean buying bigger hardware (especially RAM), or tuning your server to reduce hotspots that result in I/O contention. If your application is modifying and reading data faster, the impact of blocking is minimized; it doesn’t actually go away, but it becomes more tolerable.
- Make architectural changes to your application. The goal here is to reduce the possibility that data modification is occurring at the same time as data retrieval; examples could be implementing a queue for modifications or setting up a reporting server to handle the duties of retrieval. This usually introduces delay in between modification and retrieval, so it’s not a quick change to make.
- Look into using snapshot isolation. Snapshot isolation has been around since SQL 2005, but I haven’t seen a lot of developers using it. Overall, snapshot isolation is a relatively low risk method for dealing with blocking (especially compared to NOLOCK), but there are some challenges to implementing it:
- It has to be explicitly enabled on both the database and the transaction. There’s two steps in using snapshot isolation; you have to turn it on, and then you have to use it. Because it’s a database level setting, it’s doesn’t play well with distributed transactions.
- TempDB becomes a lot more important. Snapshot isolation uses tempdb to version rows in order to return unaltered data. If your tempdb was a bottleneck before, then it will become a SERIOUS bottleneck now. Make sure that you follow best practices for tempdb health.
For this particular application, we’re using snapshot isolation in combination with performance tuning the queries where possible. It’s a long process to undo some bad habits, but using better architectural princip
If you want to get started with SQL Server with the help of experts, read more over at Fix Your SQL Server.
Reference: Pinal Dave (http://blog.sqlauthority.com)