SQL SERVER – Database Snapshots and Replica Causing Write Delays

For folks who have not worked with SQL Server Database snapshot feature, please refer to the blog I wrote sometime back (SQL SERVER – 2008 – Introduction to Snapshot Database – Restore From Snapshot). I am giving this reference because some might not be aware of this powerful capability with SQL Server.

In the above article, one of the readers did ask if there is any performance overhead of having multiple Database snapshots with SQL Server.  Well, database snapshots are built on the principle of COW (Copy On Write) where a database page is copied to the snapshots when they are written in the primary database. That is the whole idea of Database snapshots. Hence, if you have multiple database snapshots configured, for each write on the primary DB, we will be indirectly getting “n” more write based on the number of snapshots made on the DB. It is as simple as that.

In one of my customer locations, they were getting heavy performance issue on their writers and when I got an option to inspect, I found there were close to 6 database snapshots that were created in that database. When I asked, they said it is some sort of backup copy so that they can come back onto that version. Being a heavy transactional system, this showed up easily when the load was high.

The immediate question was, how can I identify we have far too many Snapshots created on the databases? I suggested a simple script so that they can start looking into it from time to time:

IF EXISTS( SELECT source_database_id
FROM sys.databases
WHERE source_database_id IS NOT NULL
GROUP BY source_database_id
HAVING COUNT(*) > 1)
AND EXISTS(
SELECT waiting_tasks_count FROM sys.dm_os_wait_stats
WHERE wait_type = 'replica_writes' AND waiting_tasks_count > 0)
BEGIN
SELECT
DB_NAME(source_database_id) AS 'SourceDB',
COUNT(*) AS 'NumSnapshots'
FROM sys.databases
WHERE source_database_id IS NOT NULL
GROUP BY DB_NAME(source_database_id)
HAVING COUNT(*) > 1
SELECT * FROM sys.dm_os_wait_stats
WHERE wait_type = 'replica_writes' AND waiting_tasks_count > 0
END

In their environment this showed up, having 6 snapshots for their main transactional database and they were able to delete some of them based on age.

Our database systems are sensitive to such mistakes and can cause us huge performance overhead. The development team thought that the system was deteriorating over a period of time and were taking this performance overhead without analyzing what could be the problem. I am a firm believer that one should always know the root cause for a problem rather than just solving the problem superficially.

Do let me know if you use Snapshots actively in your environments and have you faced such issues in your SQL Server box? One of the key places where I have seen people use Database Snapshots is when Mirroring is configured and we want to query from our Mirror Server. Are there other places where you have relied on Database Snapshots? Let me know via your comments.

Reference : Pinal Dave (https://blog.sqlauthority.com)

Quest

Snapshot, SQL Backup and Restore
Previous Post
SQL SERVER – Converting Standard SSMS File Results to Delimited – Notes from the Field #085
Next Post
SQL SERVER – Using PowerShell and Native Client to run queries in SQL Server

Related Posts

3 Comments. Leave new

  • William Colwill
    June 19, 2015 6:51 pm

    The performance hit from using snapshots is far worse than the marketing would have you believe. The information on snapshots leads you to believe that at worst you will double your writes because you only need to copy a page once to keep the pre-image of it. This is true, but the problem is with how Microsoft implements snapshots using NTFS sparse files. What we, and others, have seen is a 6-8x increase in IO when performing writes.
    We only use snapshots for short periods of time, either for application upgrades, or for our main reporting system where we snapshot the DB after the nightly ETLs and then drop the snapshot after reporting is done for the day but before the nightly ETLs.
    Sticking the snapshot DB files on higher end flash disk mitigates the IO problem.
    Great idea, bad implementation on Microsoft’s part. And yes, they know about it.

    Reply
    • Very well said and I agree to you. Thanks for writing your views here. Will help other blog readers too.

      Reply
  • thank you for all pinal, its very interesting subjet, i want ask you if you have time, i have probleme with snapshot with customer, please you can tell me your @ for the send image erreur

    Reply

Leave a Reply