While I was playing with a SQL Server cluster in my lab, I restarted the VMs and found that I could not bring SQL Server online. As always, I looked for an error message, but there was nothing interesting. Let us see in this blog post how to fix the clustered instance online error.
Here were the observations:
- SQL ERRORLOG is getting created.
- If I start SQL Server from the Services console, it runs fine.
- If I try to bring the SQL Server resource online in the cluster, it stays in “Online Pending” for a while and then goes to the “Failed” state.
To learn more about the failure in the cluster, I generated the cluster log using the steps in my own article.
INFO [API] s_ApiGetQuorumResource final status 0.
INFO [RES] Network Name: Agent: Sending request Netname/RecheckConfig to NN:5447358a-a102-4fc9-95f4-c040e8716859:Netbios
ERR [RES] SQL Server : [sqsrvres] ODBC Error:  [Microsoft][SQL Server Native Client 11.0]SQL Server Network Interfaces: Error Locating Server/Instance Specified [xFFFFFFFF]. (268435455)
ERR [RES] SQL Server : [sqsrvres] ODBC Error: [HYT00] [Microsoft][SQL Server Native Client 11.0]Login timeout expired (0)
ERR [RES] SQL Server : [sqsrvres] ODBC Error:  [Microsoft][SQL Server Native Client 11.0]A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online. (268435455)
INFO [RES] SQL Server : [sqsrvres] Could not connect to SQL Server (rc -1)
INFO [RES] SQL Server : [sqsrvres] SQLDisconnect returns following information
ERR [RES] SQL Server : [sqsrvres] ODBC Error:  [Microsoft][ODBC Driver Manager] Connection not open (0)
INFO [RES] Network Name: Agent: Sending request Netname/RecheckConfig to NN:52cf277d-234b-4a81-a9a7-0f078fca2a17:Netbios
As per the cluster log, the cluster resource (sqsrvres) is not able to connect to the SQL Server service, even though the service itself starts.
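A cluster log can be large, so it helps to pull out only the error lines written by the SQL Server resource. Here is a minimal Python sketch of that idea. The function name `sqsrvres_errors` is my own, and the pattern is an assumption based on the line layout shown in the excerpt above; a real cluster log prefixes each line with ids and a timestamp, which is why the pattern searches anywhere in the line.

```python
import re

# Assumed layout, taken from the excerpt above:
#   ERR [RES] SQL Server : [sqsrvres] <message>
# The pattern is not anchored at the start of the line, so a leading
# process/thread id and timestamp do not prevent a match.
SQSRVRES_ERR = re.compile(
    r"\bERR\s+\[RES\]\s+SQL Server[^:]*:\s*\[sqsrvres\]\s*(?P<msg>.*)$"
)


def sqsrvres_errors(lines):
    """Return the message part of every ERR line logged by sqsrvres."""
    errors = []
    for line in lines:
        match = SQSRVRES_ERR.search(line)
        if match:
            errors.append(match.group("msg").strip())
    return errors
```

To use it, read the generated cluster log into a list of lines and print what `sqsrvres_errors` returns. INFO lines and lines from other resources, such as Network Name, are skipped.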
WORKAROUND / SOLUTION
Here are the normal causes of the above error:
- Incorrect client alias created in the configuration manager
- The SQL Server Browser service isn’t running while SQL Server is a named instance or is listening on a non-default port.
- TCP port connection issue.
I already have a detailed checklist for the common causes.
In my lab, I found that I had a TCP alias pointing to a specific port, and the SQL Server port changed after the reboot. The alias was still sending the cluster's connection to the old port, which caused the SQL cluster issue.
To fix it permanently, I changed SQL Server to listen on a static port instead of a dynamic port.
Have you ever encountered the same situation, where the cluster log helped you?
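After changing the port, a quick way to confirm that something is actually listening on the port the alias points to is a plain TCP connection test. The Python sketch below does only that; it does not log in to SQL Server. The helper name `can_connect` is my own.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, `can_connect("sqlvirtualname", 1433)` returns True only when the port is reachable; the host name and port here are placeholders, so use the virtual server name and the static port configured for your instance.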
Reference: Pinal Dave (http://blog.SQLAuthority.com)