Dirty Records Recap
Most SQL people know what a “Dirty Record” is. You might also call that an “Intermediate record”. In case this is new to you here is a very quick explanation. The simplest way to describe the steps of a transaction is to use an example of updating an existing record into a table. When the insert runs, SQL Server gets the data from storage, such as a hard drive, and loads it into memory and your CPU. The data in memory is changed and then saved to the storage device. Finally, a message is sent confirming the rows that were affected.
For a very short period of time the update takes the data and puts it into memory (an intermediate state), not a permanent state. For every data change to a table there is a brief moment where the change is made in the intermediate state, but is not committed. During this time, any other DML statement needing that data waits until the lock is released. This is a safety feature so that SQL Server evaluates only official data.
Basic Table Hints
Some transactions take time and then rollback. In other words, the changes never become official data and in essence never took place. The example below shows a transaction that takes 15 seconds and then rolls back:
SET HireDate = '1/1/1992'
WHERE EmployeeID = 1
WAITFOR DELAY '00:00:15'
If employee #1 was really hired in 1989 and you run the above code, you have incorrect data for fifteen seconds in the intermediate state. During those fifteen seconds, if a 401K vesting program ran an evaluation on all employees hired before 1990, the process would wrongfully overlook employee #1. The safety catch is the 401K process would wait until this transaction is done in order to get the official data.
A great deal of data can be changed in the intermediate state, but never get committed. Locking prevents other processes from making decisions on dirty data. The result is that only committed data is used to isolate transactions. The drawback is that some processes that could run instantly now have to wait. Locking lowers the level of concurrency, which is the ability to allow software to run many processes at the same time.
A Select query can now pull out any record being changed in the intermediate state. You could say that a query will not show any dirty records because all transactions are isolated from each other. By changing one record in your CurrentProducts table, a query looking for records from that table will have to wait. In the figure below the update statement was executed first and puts data into an intermediate state. The query on the right waits for the transaction to be committed.
If you look more closely you can see that the update statement has been running for one minute, 38 seconds and the query on the right was started six seconds later. The committed RetailPrice is $75 and the RetailPrice in the intermediate state of that value is $999. This will be turn two and a half minutes. Upon completion, the price never changed and the $999 value was thrown away (never persisted). The query on the right waits until the data becomes official. This is a safety feature that has a downside in that the select query takes much longer to run.
If you don’t want the query on the right to wait, you have some choices. If you are fine with the query on the right accessing dirty data, you can issue a locking hint for the query. The following code tells the query to run without waiting:
FROM CurrentProducts (READUNCOMMITTED)
In the code example would run without waiting. The result is the query would show the RetailPrice value of $999.00.
In the figure below we see ProductID 1 has a value in the RetailPrice field of $999.00 in the result set. The query did not wait and ran with the newest value it found in memory. In the end the $999.00 was never a committed value in the database. Any evaluation of data running this query could give you a false report. In other words you run the same query three minutes later and get a different result. This is despite the fact that the table may not have really changed at all in that time.
The READUNCOMMITTED table hint allows the query to show uncommitted (dirty) data. The advantage is the query runs much faster. This is a common solution for evaluation queries that don’t need to be exact. A common example is just looking to see how many records are in a table as in the query below:
FROM CurrentProducts (READUNCOMMITTED)
You know this result changes over time and is used for general trends. In this case, SQL Server allows the query to run without waiting. The following statements are equivalent and show a shorter way of accomplishing the same result:
SELECT COUNT(*) FROM CurrentProducts (READUNCOMMITTED)
SELECT COUNT(*) FROM CurrentProducts (NOLOCK)
Readuncommited and nolock are just two of many Table hints used by SQL server to affect how a queries isolcation level can be affected.
Note: If you want to setup the sample JProCo database on your system you can watch this video.
Q 7) What happens when you try to query records in the intermediate state?
- Your query waits until the data is no longer in the intermediate state.
- Your query runs normally.
- Your query splits the results into permanent and intermediate results.
Please post your answer in comment section to win Joes 2 Pros books.
Please leave your answer in comment section below with correct option, explanation and your country of resident.
Every day one winner will be announced from United States.
Every day one winner will be announced from India.
A valid answer must contain country of residence of answerer.
Please check my facebook page for winners name and correct answer.
Winner from United States will get Joes 2 Pros Volume 1.
Winner from India will get Joes 2 Pros Volume 1.
The contest is open till next blog post shows up at http://blog.sqlauthority.com which is next day GTM+2.5.
Reference: Pinal Dave (http://blog.SQLAuthority.com)