SQL SERVER – Database Stuck in “In Recovery” Mode After Restart

Read this blog and I am sure it will bring some instances in your environments. This is one of the common issues I have observed while working with SQL Server from long time now. In this blog we will discuss little details about the issue and possible action you might take.

Whenever there is a restart of SQL Server, all databases would undergo “Recovery” process. This is the stage where the database has to come back online in a consistent state. There are three sub-phases with-in the process. Discovery, Roll forward and Rollback. The names are pretty self-explanatory. Let me explain for those who are interested to learn in detail:

  • Analysis: This is the phase where SQL Server would go through the LDF file and build the in-memory structures to know how much work is needed in the next two phases.
  • Roll forward (redo): During the shutdown of the database, there might be transactions which are committed but not yet written to the MDF file via checkpoint.
  • Rollback (undo): If there were any transactions which were uncommitted then they have to be rolled back to bring the database to a consistent state.

When we would see database in “InRecovery” state?

  • Restart of SQL Server.
  • Database offline and online.
  • Restore of database from backup.

All of the above would is called “recovery” process of the database and all databases must go through three phases as explained earlier.

What should we do?

The very first thing I always check is ERRORLOG. In Errorlog, we should see the very first message in the database (TestMe is the name of the database):

Starting up database ‘TestMe’.

This means the files are opened and recovery is started. After sometime, you should see phase 1.

Recovery of database ‘TestMe’ (28) is 0% complete (approximately 37 seconds remain). Phase 1 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 3% complete (approximately 36 seconds remain). Phase 1 of 3. This is an informational message only. No user action is required.

Once phase 1 is complete, it would go with Phase 2 and 3 as shown below.

Recovery of database ‘TestMe’ (28) is 3% complete (approximately 36 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 0% complete (approximately 142 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 7% complete (approximately 19 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 15% complete (approximately 26 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 21% complete (approximately 25 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 27% complete (approximately 20 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 34% complete (approximately 19 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 41% complete (approximately 16 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 48% complete (approximately 14 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 55% complete (approximately 12 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 62% complete (approximately 10 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 69% complete (approximately 9 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 75% complete (approximately 7 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 82% complete (approximately 5 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 88% complete (approximately 3 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
Recovery of database ‘TestMe’ (28) is 95% complete (approximately 1 seconds remain). Phase 3 of 3. This is an informational message only. No user action is required.

And once it completes, you should use something similar.

3807 transactions rolled forward in database ‘TestMe’ (28). This is an informational message only. No user action is required.
0 transactions rolled back in database ‘TestMe’ (28). This is an informational message only. No user action is required.
Recovery is writing a checkpoint in database ‘TestMe’ (28). This is an informational message only. No user action is required.
Recovery completed for database TestMe (database ID 28) in 30 second(s) (analysis 1289 ms, redo 29343 ms, undo 72 ms.) This is an informational message only. No user action is required

The text in green color explains the three phases which I explained earlier.

What are the possible causes?

  • Huge size of transaction log file.
  • SQL restarted during a long running transaction.
  • Huge number of VLFs.
  • You might be hitting a bug which is fixed in SQL Server. I have referenced KB below.

List of known issues

If you are running SQL Server 2005, 2008, 2008 R2 or SQL 2012, please make sure you have applied fixes given in below.

http://support.microsoft.com/kb/2455009 (FIX: Slow performance when you recover a database if there are many VLFs inside the transaction log in SQL Server 2005, in SQL Server 2008 or in SQL Server 2008 R2)

http://support.microsoft.com/kb/2524743 (FIX: Recovery takes longer than expected for a database in a SQL Server 2008 or in a SQL Server 2008 R2 environment)

The fixes would help in speeding up the phases of recovery. Hope this blog helps you in a direction to look at some of the SQL Server behavior.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Security Conversations and Notes with a DBA

One of the great things about working in Pluralsight is that I get to meet a lot of people (trust me on this). I am lost for words when it comes to how my day job helps me in achieving some of the coolest things presentations at a number of conferences. The Pluralsight booth is always a place where we see some of the great minds from the industry swing by. So in case you get to a conference and get to see a Pluralsight booth, feel free to swing by, take a moment to talk someone out there, I guarantee your time will be well spent.

In one conferences, I vividly remember a conversation from an enthusiast who was coming from the Banking vertical. He attended all my sessions and made sure he caught me at the booth as I spend most of my time there when not delivering sessions. So it was in-between sessions that this gentleman caught me and started asking me a number of questions around security. The whole discussion took close to an hour but I was glad someone was serious enough in trying to understand these basics.

I was super excited and was talking some of the latest and greatest security capabilities of SQL Server 2014 which I explained in my Plurasight Course. But the individual said they were running on a SQL Server 2008 R2. That got me into a fix because I had to rewind my mind to give him recommendations. I kept thinking hard as I starting giving each of the recommendations.

In this blog, let me recollect some of the conversations I had and pen them down for everyone’s reference and most importantly for my reference.

About User Accounts

SQL Server executes as a set of Windows services. SQL Server is required to run under a domain account to interact with network services, to access domain resources such as file shares, or if it uses linked server connections to other SQL Server instances running on other computers in the domain.
When choosing the service account, consider an account with the least amount of privileges needed to do the job and no more.

Using a domain user that is not a member of the Local Administrator group or of the Domain Administrators group will be the best choice for the SQL Server service account.

Some especially sensitive accounts are detailed below:

  • SQL Server Browser is a name resolution service that provides SQL Server instance connection information to clients. If the SQL Server Browser service account is compromised, an attacker can use the permissions associated with the account to expand their control over the Windows environment. It is recommended that the SQL Server Browser service be run under the NETWORK SERVICE account that has the least privileges.
  • Some other security considerations: do not use LOCAL SYSTEM, LOCAL SERVICE, or NETWORK SERVICE as the service account to run MSFTESQL. This could provide an elevation of privileges for the Full-Text Search service.

Prefer Kerberos over NTLM authentication. Kerberos as the default authentication mode for windows connections to SQL Server is possible if client and server are joined in the same domain, or client and server are in different domains, but these two domains are configured as two-way trusts. Furthermore, Kerberos is available by setting the appropriate SPNs. Verify if SQL Server Name (WSFC solution) or Hostname matches the SQL Server service accounts. If so, this could prevent you from defining an SPN on the correct object, because you will get a User and Computer objects with the same name. Refer to Understanding Kerberos and NTLM authentication in SQL Server Connections for further information on setting Kerberos authentication.

As for securing SQL Server roles and permissions, consider the following:

  • The SA account is a well-known and frequent target of malicious users. Disable the account by using the ALTER LOGIN statement. If this is not an option, consider renaming the account by using the ALTER LOGIN statement, as this action can help protect the account. Refer to disabling SA blog.
  • Restrict the membership of the sysadmin fixed server role to logins that use Windows Authentication, assuring a greater protection of SQL Server.
  • If users without sysadmin rights need to perform certain tasks, consider creating proxy accounts. There are two types of proxy accounts, and each relates to a specific set of tasks: the xp_cmdshell proxy and the SQL Agent job proxy.
  • The configuration required by each type of proxy is different, and the way the authorities they give are used is different. If there is a requirement for users without sysadmin rights to run xp_cmdshell, then the xp_cmdshell proxy must be created with the sp_xp_cmdshell_proxy_account Similarly, if there is a requirement for users without sysadmin rights to own SQL Agent jobs, then one or more SQL Agent job proxy accounts must be created.

Though the one hour of conversations spanned a lot of topics and demonstrations on my laptop using MSA account and more. I think this brain dump of the conversation will become a great reference for me if someone asks me these again. Do let me know if you have read about these before in your interactions? Have you implemented these in your environments?

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Script to Convert Date to Julian Dates

There are many methods to store date values. One of them is Julian date. It is a seven digit number where first four is the year part and the next three digits are the number of days from Jan 01.

If you want to convert this number into a valid datetime value, you can use the following simple method

DECLARE @JULIANDATE INT
SET
@JULIANDATE= 2012146
SELECT DATEADD(DAY,@JULIANDATE%1000-1,DATEADD(YEAR,0,LEFT(@JULIANDATE,4))) AS JULIANDATE

The logic is to create Jan 01 of year part and add last three digits as day part using the DATEADD function

The result is

JULIANDATE
 -----------------------
 2012-05-25 00:00:00.000

Do let me know if you know any other method to convert dates in Julian Dates. I will be happy to publish that on this blog.

Additional note: Not to be confused with the Julian calendar, a Julian date or day number is the number of elapsed days since the beginning of a cycle of 7,980 years invented by Joseph Scaliger in 1583. The purpose of the system is to make it easy to compute an integer difference between one calendar date and another calendar date.

Reference: Pinal Dave (http://blog.sqlauthority.com)

Interview Question of the Week #008 – Is GO T-SQL Statement?

I often get this question in the email – more frequently than I would like to get.

Question: “Is GO a T-SQL statement?”

Answer: “No, GO is not a T-SQL statement, it is a command which is recognized by SSMS or sqlcmd as a signal to send the current batch to SQL Server Engine.”

Well, here are two articles I recommend to read for more information – MSDN and Explanation SQL Command GO.

Here are few things I would like to add related to the GO statement.

  • If recommend a statement terminator semicolon after each statement, however, do not use semicolon after the GO statement or it will give an error.
  • ODBC or OLE DB API does not recognize the GO statement and it will give a syntax error.
  • You can use GO in a next line of T-SQL Statement. If you provide an integer after GO statement, it will execute that statement multiple times.
  • You can only have comments in the same line as a GO statement, any other statement will error.
  • You can change the command GO to any other value in SSMS by going to Tools > Option

Let me know your thoughts about the GO statement.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – FIX: SQL Server Not Starting – Msg 864, Level 16, State 1- Buffer Pool Extension Feature

Recently I got an email from one of the student who attended my SQL Server 2014 Pluralsight Course from last year, he was trying to use Buffer Pool Extension feature and got into some unique trouble. Here is the email from him:

Hi Pinal,
I watched your course about SQL Server 2014 administration new features on PluralSight. It is really a great course for someone like me who is trying to do fast ramp-up on SQL Server 2014.

I was trying to experiment with buffer pool extension feature. I have enabled the buffer pool extension feature on SQL Server Standard Edition and now I am not able to start my SQL Server service. Could you please help me out? This is not a business down situation, but want to understand what went wrong.

Regards,

(Name Hidden)

The very first thing which I asked him was to provide the error log information. Whenever there is a problem with the SQL startup, I always start looking at the Errorlog. Here is what I saw in the SQL Server Error Log.

2015-02-28 00:39:11.63 spid8s      Starting up database 'master'.

2015-02-28 00:39:11.84 spid8s      Attempting to allocate 6560086 BUF for buffer pool extension for a maximum of 6560085 page descriptors.
2015-02-28 00:39:11.84 spid8s      Error: 864, Severity: 16, State: 1.
2015-02-28 00:39:11.84 spid8s      Attempting to allocate 6560086 BUF for buffer pool extension for a maximum of 6560085 page descriptors.

Above ERRORLOG explains that, SQL server is not able to allocate the required buffers into the buffer pool extension file. If you look closely at both numbers, the difference of one buffer. This would mean that the size of buffer pool extension was more than the maximum allowed size. I looked into the documentation about the size limitation and found below link on MSDN.

https://msdn.microsoft.com/en-us/library/dn133176.aspx

(The buffer pool extension size can be up to 32 times the value of max_server_memory for Enterprise editions, and up to 4 times for Standard edition)

I went back and looked into the error log and on the top found below.

Microsoft SQL Server 2014 - 12.0.2430.0 (X64)
Oct 15 2014 16:05:37
Copyright (c) Microsoft Corporation
Standard Edition (64-bit) on Windows NT 6.3 <X64> (Build 9600: ) (Hypervisor)

So it means that the size given for BPE file was going beyond the expected limit of 4 times of max server memory because of standard edition.

So now, we know why the error is occurring and we are unable to start SQL Server. To fix the error we need to start SQL Server with minimum configuration so the BPE is not initialized. We can use startup parameter f. Here are the detailed steps.

  1. Open command prompt. Use “Run As Administrator”, if applicable.
  2. Type net start MSSQLServer /f /mSQLCMD

Since my machine has default instance I am using MSSQLServer. For named instance it would be MSSQL$<InstanceName>

We have given additional parameter called “m” and passed SQLCMD to let SQL Server know that only SQLCMD can connect.

  1. Once SQL is started, connect to SQL via SQLCMD using below on command prompt.
SQLCMD -S(local) -E

Note that you need to provide your own server name. In above example I am connecting to default instance of SQL.

  1. Once connection is made, we need to disable BPE by using below command
ALTER SERVER CONFIGURATION
SET BUFFER POOL EXTENSION OFF

  1. Once it’s turned off, we can stop SQL by net stop command as below
Net Stop MSSQLServer

  1. Now we can start SQL normally and it should work (unless we have some other problem)

To dig further, I have taken a Virtual Machine of SQL Server Standard Edition in Microsoft Azure Cloud. My machine had below configuration:

RAM = 14 GB
Max Server Memory = 12 GB
D drive is SSD for BPE file. Over there I created folder SQLAuthority_BPE

Here are the steps to reproduce the error in SQL Server Standard Edition.

SP_CONFIGURE 'MAX SERVER MEMORY', 12000
GO
RECONFIGURE WITH OVERRIDE
GO
ALTER SERVER CONFIGURATION
SET BUFFER POOL EXTENSION ON (FILENAME = 'D:\SQLAuthority_BPE\BUFFERPOOLEXT.BPE' ,SIZE = 50 GB)
GO
SELECT * FROM SYS.DM_OS_BUFFER_POOL_EXTENSION_CONFIGURATION
GO
SHUTDOWN WITH NOWAIT

Important: Above script would also shutdown SQL Server because I have added shutdown T-SQL command.

Once it was done, I was not able to start SQL without following the steps which I mentioned earlier. After fixing the problem, I configured BPE with 48 GB is size and it worked fine. Here is the message which we can see in ERRORLOG

2015-02-28 09:57:42.110 spid8s       Buffer pool extension "D:\BPE\BUFFERPOOLEXT.BPE" has been initialized successfully with size is 49152 MB.

Hope this would help in understanding the limitation and fixing the error which I have not seen so far.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – ReadOnly Databases and Notes Using Snapshots

This seems to be the third week that I am writing about ReadOnly databases, this blog is in continuation of last weeks notes. When I wrote the notes from last week, one of my blog reader said if it is possible to get the ReadOnly error without making the database into a ReadOnly database. This got me thinking and I couldn’t sleep that day. And a quick call to my friend to get some ideas – the answer was simple – using Snapshots.

I couldn’t control my excitement and this triggered me to write this special case scenario. As we execute the script, we will learn some of the behavior.

Let us start by creating our database which will be used for testing.

CREATE DATABASE [ReadOnlyDB]
CONTAINMENT
= NONE
ON  PRIMARY
( NAME = N'ReadOnlyDB', FILENAME = N'C:\Temp\ReadOnlyDB.mdf' , SIZE = 4024KB , FILEGROWTH = 1024KB )
LOG ON
( NAME = N'ReadOnlyDB_log', FILENAME = N'C:\Temp\ReadOnlyDB_log.ldf' , SIZE = 20480KB , FILEGROWTH = 10%)
GO
USE ReadOnlyDB
GO
-- Creating our table
CREATE TABLE tbl_SQLAuth (id INT, Longname CHAR(8000))
GO

Next let us go ahead and create a snapshot database out of our ReadOnlyDB.

-- Create a snapshot on our DB
CREATE DATABASE ReadOnlyDB_SS ON
( NAME = ReadOnlyDB, FILENAME =
'C:\Temp\ReadOnlyDB_data_1800.ss' )
AS SNAPSHOT OF ReadOnlyDB;
GO

Now that our snapshot database (ReadOnlyDB_SS) is created, it is important to know that snapshot databases are created as ReadOnly databases. The behavior is similar to creating ReadOnly Databases. To test this, let us try to insert some data into our Snapshot DB.

USE ReadOnlyDB_SS
GO
INSERT INTO tbl_SQLAuth VALUES (1, 'SQLAuth');
GO

Now we will be presented with the same familiar error (3906):

Msg 3906, Level 16, State 1, Line 25

Failed to update database “ReadOnlyDB_SS” because the database is read-only.

Even though we have not explicitly marked the snapshot database as Readonly we are getting this error. I was curious to know, what will happen if we try to mark the snapshot database as ReadOnly again. The error is self-explanatory.

-- Let us set the Snapshot DB as READ_ONLY
USE MASTER
GO
ALTER DATABASE [ReadOnlyDB_SS] SET READ_ONLY
GO

Msg 5093, Level 16, State 1, Line 31

The operation cannot be performed on a database snapshot.

Msg 5069, Level 16, State 1, Line 31

ALTER DATABASE statement failed.

If that was easy, what will happen if we try to make the snapshot database to read_write? The error is self-explanatory again here:

USE [master]
GO
ALTER DATABASE [ReadOnlyDB_SS] SET  READ_WRITE WITH NO_WAIT
GO

Msg 5093, Level 16, State 1, Line 37

The operation cannot be performed on a database snapshot.

Msg 5069, Level 16, State 1, Line 37

ALTER DATABASE statement failed.

Now that gives us two learnings, we cannot mark a Snapshot database as ReadOnly and nor can we mark a snapshot database as Read_Write enabled.

With these learnings in place, let us do the cleanup. Make sure to drop the Snapshot database before deleting the main database. So our cleanup script for this blog looks:

-- Clean up time
USE MASTER
GO
DROP DATABASE ReadOnlyDB_SS
GO
DROP DATABASE ReadOnlyDB
GO

Reference: Pinal Dave (http://blog.sqlauthority.com)

Hey DBA – Protecting a Proactive Attitude – Notes from the Field #069

[Note from Pinal]: This is a 69th episode of Notes from the Field series. Mike Walsh is an amazing person to meet and know. I am fortunate to be his friend. Every single time I meet him, I learn something as a human. He is always looking for an opportunity to improve the life of DBAs and Developers.

Being proactive is a good thing. There are few things which can not go wrong and being proactive is one it. Being proactive is not all about SQL scripts and deployments. It is also about attitude and mindset. If you know Mike, you must be aware of that he is the person who is well aware of human psychology. He knows what DBA and Developer things and how their attitude towards technology. I asked him one question – “How to protect a proactive attitude?” and database expert Mike Walsh decided to guide me with the answer of this question.

Read the entire story in his own words.


My posts lately here have been talking about being proactive. I am going to continue on that theme today, because it is that important. We perform a lot of WellDBA Exams™ – where we go into a client shop, review their existing setup, look at performance and write up our findings.

Each time we do one of these, we find the same types of findings almost everywhere. In fact we have a free resource available where we guide you through a checklist where you can evaluate your own environment against the 7 most common findings we encounter and see detailed information on what to do about those findings.

You can grab that free resource here.

In this post I want to zoom in on a proactive attitude again, and some ways to protect your proactive time. Then I will suggest a plan using that checklist to go through your environment and ensure the big things are taken care of.

Protecting a Proactive Attitude

I know I talk about this in just about every post I share here, but I am trying to cement this thought – a good DBA is a proactive DBA. You cannot just wait for a problem to arise, fight the problem, and then sit back and wait for the next problem.

I know you don’t do that. I know most DBAs don’t do that, at least not on purpose. But in this interrupt driven world, users and other IT teams are constantly clawing at their DBA for requests. These requests have to be met, because customers are expecting them to be done. But I challenge you that many requests don’t need to be done as fast as most DBAs do them.

You should develop a plan to protect your proactive time.

A few tips on how to preserve that time:

  • Schedule it! Make a block of time on your calendar each day, each week, whatever you can do. Guard that time and shut down your e-mail and IM clients (if you are allowed) and focus on doing proactive tasks to make your life easier.
  • Talk to management. When you are reactive you are spending more time, you are less effective and mistakes happen. If you get management to guard your time and people’s access to you for a bit for some proactive time, it will make you more efficient.
  • Work from home if you can. A lot of times when I was a full-time DBA – I could be more proactive when working from home, as the “drop by” visits can’t happen as easily that way.
  • Consider a WellDBA exam from Linchpin People, or a health check from another firm. Sometimes having an outside firm come in, audit your environment and tell you about your misses is the best way to get proactive time prioritized. Plus that approach helps you know where to focus your proactive time.
  • Just do it – Sounds harsh – but sometimes you just have to do it. If you don’t start trying to be more proactive, you’ll always be reactive and that is when mistakes happen. This is when we can come in and do an Exam and uncover the findings that the DBAs will say “yeah I know.. But I didn’t have time”

A Proactive Plan

Download our checklist, or find other SQL Server diagnostic scripts and approaches out there. There are several. Our checklist is less focused on scripts, and more focused on how, what and why you should care about these 7 areas, then suggests an approach to resolve the findings.

Here is the link again, it really is free and not some gimmick to get you on a “to be harassed” list.

Then start going through the checklist. We’ve ordered it in terms of our typical priorities.  For example – if you aren’t able to recover your environment, the other 6 points don’t matter as much.. Make sure you are really secure in each area, make sure you really understand the how and why in each area.

Move on to the next question and repeat the process. It is that simple. Some of the tasks will take more time than others to investigate. Some will take more time than others to fix. But this is a place to get started. And sometimes a place to start is all that is needed.

So, get some time set aside for being proactive, review your environment and get to the point where at least those 7 critical items are being taken care of! All the best on your journey!

If you want to get started with performance analytics and triage of virtualized SQL Servers with the help of experts, read more over at Fix Your SQL Server.

Reference: Pinal Dave (http://blog.sqlauthority.com)