SQL SERVER – Database Stuck in “In Recovery” Mode After Restart

Read this blog and I am sure it will bring some instances in your environments. This is one of the common issues I have observed while working with SQL Server from long time now. In this blog we will discuss little details about the issue and possible action you might take.

Whenever there is a restart of SQL Server, all databases would undergo “Recovery” process. This is the stage where the database has to come back online in a consistent state. There are three sub-phases with-in the process. Discovery, Roll forward and Rollback. The names are pretty self-explanatory. Let me explain for those who are interested to learn in detail:

  • Analysis: This is the phase where SQL Server would go through the LDF file and build the in-memory structures to know how much work is needed in the next two phases.
  • Roll forward (redo): During the shutdown of the database, there might be transactions which are committed but not yet written to the MDF file via checkpoint.
  • Rollback (undo): If there were any transactions which were uncommitted then they have to be rolled back to bring the database to a consistent state.

When we would see database in “InRecovery” state?

  • Restart of SQL Server.
  • Database offline and online.
  • Restore of database from backup.

All of the above would is called “recovery” process of the database and all databases must go through three phases as explained earlier.

What should we do?

The very first thing I always check is ERRORLOG. In Errorlog, we should see the very first message in the database (TestMe is the name of the database):

Starting up database ‘TestMe’.

This means the files are opened and recovery is started. After sometime, you should see phase 1.

Recovery of database ‘TestMe’ (28) is 0% complete (approximately 37 seconds remain). Phase 1 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 3% complete (approximately 36 seconds remain). Phase 1 of 3. This is an informational message only. No user action is required.

Once phase 1 is complete, it would go with Phase 2 and 3 as shown below.

Recovery of database ‘TestMe’ (28) is 3% complete (approximately 36 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 0% complete (approximately 142 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 7% complete (approximately 19 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 15% complete (approximately 26 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 21% complete (approximately 25 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 27% complete (approximately 20 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 34% complete (approximately 19 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 41% complete (approximately 16 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 48% complete (approximately 14 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 55% complete (approximately 12 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 62% complete (approximately 10 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 69% complete (approximately 9 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 75% complete (approximately 7 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 82% complete (approximately 5 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 88% complete (approximately 3 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 95% complete (approximately 1 seconds remain). Phase 3 of 3. This is an informational message only. No user action is required.

And once it completes, you should use something similar.

3807 transactions rolled forward in database ‘TestMe’ (28). This is an informational message only. No user action is required.
0 transactions rolled back in database ‘TestMe’ (28). This is an informational message only. No user action is required.
Recovery is writing a checkpoint in database ‘TestMe’ (28). This is an informational message only. No user action is required.
Recovery completed for database TestMe (database ID 28) in 30 second(s) (analysis 1289 ms, redo 29343 ms, undo 72 ms.) This is an informational message only. No user action is required

The text in green color explains the three phases which I explained earlier.

What are the possible causes?

  • Huge size of transaction log file.
  • SQL restarted during a long running transaction.
  • Huge number of VLFs.
  • You might be hitting a bug which is fixed in SQL Server. I have referenced KB below.

List of known issues

If you are running SQL Server 2005, 2008, 2008 R2 or SQL 2012, please make sure you have applied fixes given in below.

http://support.microsoft.com/kb/2455009 (FIX: Slow performance when you recover a database if there are many VLFs inside the transaction log in SQL Server 2005, in SQL Server 2008 or in SQL Server 2008 R2)

http://support.microsoft.com/kb/2524743 (FIX: Recovery takes longer than expected for a database in a SQL Server 2008 or in a SQL Server 2008 R2 environment)

The fixes would help in speeding up the phases of recovery. Hope this blog helps you in a direction to look at some of the SQL Server behavior.

Reference: Pinal Dave (https://blog.sqlauthority.com)

Previous Post
SQL SERVER – Security Conversations and Notes with a DBA
Next Post
SQL SERVER – Using Raw Files in SSIS – Notes from the Field #070

Related Posts

24 Comments. Leave new

  • Pinal, can we query database when database is in recovery mode? Please let me know.

    Reply
  • Pinal, can we query database when it is in recovery mode? please let us know.

    Reply
  • My SQL server says 275 hours remaining!!

    What should I do now?

    Reply
    • What is the SQL version? What is the size of LDF? Have you made sure that SQL is not the patches which i mentioned in blog.

      Reply
  • can we drop a database which is in- recovery mode ?

    Reply
  • Yes Farooq you can drop database which is in recovery mode.

    Reply
  • can I query the database how much percentage recovery is completed

    Reply
  • You cannot query the database whilst it is in recovery.
    You need to check the SQL Server Logs, look at the current log, make sure it is ordered with latest date at the top. You should see messages like the following:
    Recovery of database ‘AdventureWorks’ (7) is 4% complete (approximately 23188 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.

    The “4%” and “approximately 23188 seconds remain” will tell you how far recovery had got.

    Reply
    • I do not see this progress in the logs. My database is 200GB and log file is 300 GB (two logs combined).
      Recovery is running for more than 24 hrs and the database is till in Recovery Mode.

      Reply
  • Hi Pinal, I have a question regarding the hotfixes mentioned: For examlpe 2524743 addresses to fix the issue after/for Sql 2008 R2 SP2. I have SP3 installed. But I can’t find this fix in the list of hotfixes installed. Is it installed by applying SP3 and just not shown because it is included in the SP3 branch? Or is it not installed? How can I knwo? Thanks.

    Reply
    • SP3 for SQL 2008 R2 should have the fix for it. If you are still having issue, it might be a new one or some other KB.

      Reply
  • how to check the day previous day error log

    Reply
  • I just wanted to alter the “maximum server memory” value for an instance on a test environment and I didn’t notice that my query window was connected to the wrong instance. Yes you probably guessed it, it was a productive Environment…ouch!!!
    So I dropped down from 57344 MB to 2048 MB.
    Next I got the message “There is insufficient system memory in resource pool ‘internal’ to run this query” and I couldn’t reconnect to the instance via SSMS. So I restarted the whole instance, the database went into “recovery” and when this finished with 100% complete I got the following in the error log:

    – 52764 transactions rolled forward in database *** (7:0). This is an informational message only. No user action is required.

    – 6 transactions rolled back in database *** (7:0). This is an informational message only. No user action is required.

    The instance is currently being used for a big Software Migration process…so can anyone please tell me if I have to be concerned because of data loss now or do you think that everythin is going to be fine?
    Did he actually do like a restore or not?

    The database is currently in Recovery mode “SIMPLE” and a Full-Backup is being taken in the morning at 6 a. m. I don’t know if that is important Information for you…the “recovery” and “Transactions rolled back” is just confusing me.

    Thanks in advance :)

    Reply
    • Those messages don’t indicate any data loss. These are the transactions were not committed and hence rollback happened for them..

      Reply
  • I am getting these errors on my dedicated server because of which database goes into recovery mode.

    I have seen the following error in SQL server error log file: Both mdf & ldf are located in SAS drives, Sql server 2008 R2 web edition is there and also updated sql server 2008 R2 SP3 after reading some reference blog.

    2018-01-20 05:32:19.83 spid7s LogWriter: Operating system error 1117(The request could not be performed because of an I/O device error.) encountered.
    2018-01-20 05:32:19.83 spid7s Write error during log flush.
    2018-01-20 05:32:19.84 spid284 The client was unable to reuse a session with SPID 284, which had been reset for connection pooling. The failure ID is 46. This error may have been caused by an earlier operation failing. Check the error logs for failed operations immediately before this error message.
    2018-01-20 05:32:19.84 spid335 Database aa was shutdown due to error 9001 in routine ‘XdesRMFull::Commit’. Restart for non-snapshot databases will be attempted after all connections to the database are aborted.
    2018-01-20 05:35:03.35 spid22s Recovery completed for database aa(database ID 7) in 151 second(s) (analysis 18856 ms, redo 115386 ms, undo 2023 ms.) This is an informational message only. No user action is required.

    At the same time following errors are generated in event viewer:

    Application Event
    EventID:17053, MSSQLSERVER
    LogWriter: Operating system error 1117(The request could not be performed because of an I/O device error.) encountered.
    Error 1/20/2018 5:32:19 AM MSSQLSERVER 18056 Server
    Error 1/20/2018 5:32:19 AM MSSQLSERVER 9001 Server
    Error 1/20/2018 5:32:19 AM MSSQLSERVER 9001 Server
    Error 1/20/2018 5:32:19 AM MSSQLSERVER 17053 Server

    System Event:
    The IO operation at logical block address 4069 for Disk 1 (PDO name: \Device\00000045) was retried.

    Kindly help

    Reply
  • can i detach and attach the database in recovery ???
    I need to move the .ldf file because there’s no free space on disk!

    Reply
    • I don’t think detach can be done.. ALTER DATABASE .. MODIFY FILE should work to update information in system catalogs.

      Reply
  • Do all three subphases still occur the same way when the database is a secondary in an availability group? I noticed in the AlwaysOn dashboard the ‘redo queue size’ does not budge until phase 2 is complete. As soon as phase 2 completed, the database synced up perfectly again. It was strange to me that the AlwaysOn dashboard did not reflect the same info that the errorlog was telling me.

    Reply
  • My database information in sp_readerrorlog says: “Recovery completed for database in 135 second(s)…”

    But my database still is in RECOVERING state from the sys.databases DMV.

    Any ideas?

    Reply

Leave a Reply

Menu