One of my clients contacted me for quick assistance. They were trying to install a service pack on a SQL Server 2016 instance, but it was failing with an error message about the cluster service.
From the error message, it's clear that the machine is a node of a cluster. So, I checked the setup log, Detail.txt, and looked for the rule failure.
(04) 2018-01-17 11:11:24 Slp: Initializing rule : Not clustered or the cluster service is up and online.
(04) 2018-01-17 11:11:24 Slp: Rule is will be executed : True
(04) 2018-01-17 11:11:24 Slp: Init rule target object: Microsoft.SqlServer.Configuration.Cluster.Rules.ClusterServiceFacet
(04) 2018-01-17 11:11:24 Slp: The given key was not present in the dictionary.
(04) 2018-01-17 11:11:24 Slp: at Microsoft.SqlServer.Chainer.Infrastructure.ServiceContainer.GetService(Type serviceType)
at Microsoft.SqlServer.Chainer.Infrastructure.ServiceContainer.GetService[T]()
at Microsoft.SqlServer.Chainer.Infrastructure.ServiceContainer.get_Cluster()
at Microsoft.SqlServer.Configuration.Cluster.Rules.ClusterServiceFacet.Microsoft.SqlServer.Configuration.RulesEngineExtension.IRuleInitialize.Init(String ruleId)
at Microsoft.SqlServer.Configuration.RulesEngineExtension.RulesEngine.Execute(Boolean stopOnFailure)
(04) 2018-01-17 11:11:24 Slp: Rule initialization failed – hence the rule result is assigned as Failed
From the above, it looks like they do have a cluster, but something is not right: setup hits an exception when it tries to get details about the cluster.
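If you want to find the same rule failure on your own server, a quick search of Detail.txt can save some scrolling. This is only a minimal sketch: the log root below is an assumption (the default location for SQL Server 2016), so adjust it if your setup logs live elsewhere.

```powershell
# Assumption: default setup log root for SQL Server 2016 (version folder 130).
$logRoot = 'C:\Program Files\Microsoft SQL Server\130\Setup Bootstrap\Log'

# Each setup run writes to its own timestamped folder; pick the most recent one.
$latestRun = Get-ChildItem -Path $logRoot -Directory |
    Sort-Object LastWriteTime -Descending |
    Select-Object -First 1

# Show every failed rule initialization with the 10 lines that precede it,
# which is where the rule name and the exception stack appear.
Select-String -Path (Join-Path $latestRun.FullName 'Detail.txt') `
    -Pattern 'Rule initialization failed' -Context 10, 0
```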
STEPS WHICH DID NOT WORK
After a lot of checking with others, we understood that this had earlier been a two-node cluster. The second node no longer exists and will never be available again. We tried restarting the cluster service on this node, but it didn’t help.
The only command that ran successfully was
Get-Cluster
All other cluster-related commands failed with the error below. We tried Remove-Cluster, Get-ClusterNode, and a few more commands.
The remote server has been paused or is in the process of being started.
The above error indicates that the commands were not able to reach the other node.
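For reference, the checks we ran can be sketched roughly as follows. This is an illustration, not an exact transcript of the session. It assumes the FailoverClusters module is installed and that the cluster service has its standard name, ClusSvc.

```powershell
# Assumption: run from an elevated PowerShell prompt on the surviving node.
Import-Module FailoverClusters

# Check whether the cluster service is running on this node.
Get-Service -Name ClusSvc

# This was the only cluster cmdlet that returned a result in our case.
Get-Cluster

# This one failed with "The remote server has been paused or is in the
# process of being started."
Get-ClusterNode
```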
WORKAROUND/SOLUTION
Since the other node was not accessible, we needed to remove it from the cluster configuration. The only way I found was to start the cluster service in force quorum mode and then clean up the cluster configuration.
Before trying the step below, evaluate your situation carefully. In this customer’s case, the second node was never going to come back, and they were not using the cluster at all. The SQL Server instance they were trying to patch was standalone. So, we were OK to destroy the cluster.
To do that, we first needed to start the cluster service in force quorum mode.
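One way to do this from PowerShell is sketched below. Treat it as an assumption about how you might run it rather than the exact commands from this engagement. It uses the Start-ClusterNode cmdlet with its -FixQuorum switch, and it stops the service first because the service may already be running in normal mode.

```powershell
# Assumption: elevated PowerShell prompt on the surviving node.
Import-Module FailoverClusters

# Stop the cluster service if it is currently running in normal mode.
Stop-Service -Name ClusSvc -Force

# Start the local node and force quorum even though the other node is gone.
# Equivalent from a command prompt: net start clussvc /forcequorum
Start-ClusterNode -FixQuorum

# Verify that cluster cmdlets now respond on this node.
Get-ClusterNode
```

Forcing quorum makes this node's copy of the cluster configuration authoritative, so only do it when you are sure the other node is not coming back.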
After this, we were able to connect to the cluster via Failover Cluster Manager. So, we used the “Destroy Cluster” option in the UI.
Once that completed, we were able to apply the SQL Server patch without any problem.
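We used the UI, but if you prefer PowerShell, the Remove-Cluster cmdlet destroys a failover cluster in the same way. The sketch below is an assumption about the scripted equivalent and was not part of the original fix, so run it with -WhatIf first to preview the action.

```powershell
# Assumption: the cluster service was started in force quorum mode
# and cluster cmdlets respond on this node.
Import-Module FailoverClusters

# Preview what would be destroyed without changing anything.
Remove-Cluster -WhatIf

# Destroy the cluster without prompting for confirmation.
# Add -CleanupAD if the cluster's Active Directory objects should be removed too.
Remove-Cluster -Force
```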
Reference: Pinal Dave (https://blog.sqlauthority.com)
COMMENTS
Thanks for sharing this incident and the workaround.
The same issue occurred in our environment as well. I just rebooted the Windows server and initiated the SP upgrade again, and it worked for me.
Thanks Pinal. You saved the day again! On my server, the cluster wasn’t completely destroyed, but it was enough to allow the SQL Server service pack to complete successfully.
If you don’t want to keep information from the old cluster, you can just run the Clear-ClusterNode command (instead of starting and destroying the cluster) and re-run the validation rules.