SQL SERVER – Unable to Bring SQL Cluster Resource Online – Online Pending and then Failed

Here is the situation which my client explained and I was asked for help about SQL Cluster Resource.

Hi Pinal,
We are having 2 node windows cluster having 3 SQL Server instances clustered running on Windows 2012 R2 on VMware. We have one instance that will start from the services.msc but not from the Failover Cluster Manager when attempting to bring the service online.  In reality the services start because during the ‘Online pending’ I am able to connect and query the databases on that instance, although it is in the ‘Online pending’ state. 

Do you know what could be the problem?

SQL SERVER - Unable to Bring SQL Cluster Resource Online - Online Pending and then Failed SQL-Cluster

Solarwinds

My first question to him was – is there any error in event log? From the email it sounds like SQL server service is able to start, but the cluster is not able to connect to SQL Server. Here are the errors in event viewer:

  1. [sqsrvres] ODBC sqldriverconnect failed
    [sqsrvres] checkODBCConnectError: sqlstate = 08001; native error = ffffffff; message = [Microsoft][SQL Server Native Client 10.0]SQL Server Network Interfaces: Error Locating Server/Instance Specified [xFFFFFFFF].
  2. [sqsrvres] ODBC sqldriverconnect failed
    [sqsrvres] checkODBCConnectError: sqlstate = HYT00; native error = 0; message = [Microsoft][SQL Server Native Client 10.0]Login timeout expired
  3. [sqsrvres] ODBC sqldriverconnect failed
    [sqsrvres] checkODBCConnectError: sqlstate = 08001; native error = ffffffff; message = [Microsoft][SQL Server Native Client 10.0]A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online.

As I suspected, there is a connectivity issue from local machine to SQL instance. Cluster Service will attempt to connect to the SQL service every few minutes (setting in SQL cluster resource) for the IsAlive check, if this fails, then the SQL resource is restarted even if the instance was online. Here are the steps I generally follow.

  1. Start SQL Service via Net Start NOT via Failover cluster Manager.
net start MSSQL$SQL2014

You need to change instance name/server name. For me it is named instance of SQL Server called SQL2014. If you have default instance, then it would be called as MSSQLServer.

  1. Once it’s started successfully, we need to make a connection to SQL Server via using SQLCMD
SQLCMD -S<ServerName\InstanceName>

To get the exact name you need to open Errorlog and look for “Server Name is” keyword. If you don’t know Errorlog location, then follow this below

SQL SERVER – Where is ERRORLOG? Various Ways to Find its Location

  1. In most of the cases, you would get similar error which we I have provided above. Here are the various things to try.
  2. Make sure the SQL Brower Service is running.
  3. Create a TCP alias to SQL Instance by giving IP and Port
  4. Make sure firewall is allowing the connections.
  5. Create exceptions of sqlservr.exe and sqlbrowser.exe in Firewall

Most of the troubleshooting steps are available in my earlier blog

SQL SERVER – FIX : ERROR : (provider: Named Pipes Provider, error: 40 – Could not open a connection to SQL Server) (Microsoft SQL Server, Error: )

  1. Once the connectivity issue is resolved, stop SQL Service via
net stop MSSQL$SQL2014
  1. Now try to bring SQL Server resource online in a cluster.

My friend informed me that they had a TCP alias already created by port number of SQL Server got changed and hence connections were failing. After creating correct alias, SQL Server was able to come online in failover cluster manager.

Have you ever encountered such situation? Is there anything which you would like to share? Please comment and let me know.

Reference: Pinal Dave (https://blog.sqlauthority.com)

Solarwinds
, ,
Previous Post
SQL SERVER – Performance Monitoring for Analysis Services – Notes from the Field #093
Next Post
SQL SERVER – How to Identify Locked Table in SQL Server?

Related Posts

3 Comments. Leave new

  • Thanks, for sharing this issue it will helpful for many of us.

    Reply
  • carlos alfonso (colombia)
    November 13, 2015 12:13 pm

    Hi
    I have this situation, i can not bring online sql cluster resource after failover. i verified if the virtual server was created in regedit, i can connect via sqlcmd and i can run some sql commands over user database, i created alias and i verified if i can problems with the firewall (i created all exceptions). Nothing permits that the sql resource bring online. BUT, i try to create a new odbc and !Surprise¡ ODBC DRIVER is not marked, like the driver had an error.
    Can you helpme ?, may be provide me some ideas?
    Thank You a lot

    Reply
  • Great article also keep up the good work with the blog always very useful and interesting things

    Reply

Leave a Reply

Menu