While working with my lab cluster, I ran into a situation where failover from one node to another was not working. Everything worked fine on Node1. In this blog, let us look at how to fix this failover cluster error.
Errors from the System event log:
01/04/2018 04:02:16 PM
Cluster resource ‘SQL Server (InstA)’ of type ‘SQL Server’ in clustered role ‘Instance_A’ failed.
01/04/2018 04:02:19 PM
Service Control Manager
The SQL Server (InstA) service entered the running state.
01/04/2018 04:04:04 PM
Service Control Manager
The SQL Server (InstA) service entered the stopped state.
The System event log showed only generic errors, so I moved on to the cluster log and narrowed it down to the time when I tested the failover. Here is my earlier blog to learn how to generate cluster logs.
Errors from the cluster log. Remember that by default the timestamps are in UTC. In the cluster log, ERR and WARN are generally the interesting messages. The resource failure shown above was very interesting, and when I spent more time in the cluster log, I unearthed the errors below:
0000173c.000029f4::2018/04/01-11:32:16.564 ERR [RHS] s_RhsRpcCreateResType: (126)’ because of ‘Error loading resource DLL sqagtres.dll.’
00000ffc.000027d8::2018/04/01-11:32:16.564 INFO [RCM] result of first load attempt for type SQL Server Agent: 126
0000173c.000029f4::2018/04/01-11:32:16.604 ERR [RHS] s_RhsRpcCreateResType: (126)’ because of ‘Error loading resource DLL sqsrvres.dll.’
00000ffc.000027d8::2018/04/01-11:32:16.604 INFO [RCM] result of first load attempt for type SQL Server: 126
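Picking ERR and WARN lines out of a large cluster log by eye gets tedious, so here is a minimal sketch of how that filtering could be scripted. The line layout in the regular expression is an assumption inferred only from the excerpts above, and the function names `interesting_lines` and `failed_dlls` are hypothetical helpers, not part of any cluster tooling.

```python
import re

# Assumed layout, inferred from the excerpts above:
# <pid>.<tid>::<YYYY/MM/DD-HH:MM:SS.mmm> <LEVEL> <message>
LINE_RE = re.compile(
    r"^(?P<pid>[0-9a-f]+)\.(?P<tid>[0-9a-f]+)::"
    r"(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<msg>.*)$"
)

# Matches the "Error loading resource DLL <name>.dll" text seen in the log.
DLL_RE = re.compile(r"Error loading resource DLL (\S+?\.dll)", re.IGNORECASE)


def interesting_lines(lines, levels=("ERR", "WARN")):
    """Return (timestamp, level, message) tuples for lines at the given levels."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") in levels:
            hits.append((match.group("ts"), match.group("level"), match.group("msg")))
    return hits


def failed_dlls(lines):
    """Return resource DLL names reported as failing to load, without duplicates."""
    seen = []
    for line in lines:
        match = DLL_RE.search(line)
        if match and match.group(1).lower() not in seen:
            seen.append(match.group(1).lower())
    return seen
```

Feeding the four log lines above into `interesting_lines` would keep only the two ERR entries, and `failed_dlls` would list the DLL names they mention. Keep in mind that the timestamps returned are still in UTC.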
Both errors point to a problem loading the resource DLLs for the SQL Server resources. The number 126 is the error code, and error number 126 = The specified module could not be found.
Since the errors above said the DLLs could not be loaded, I tried to register the DLL in question and got the error below:
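For translating a Win32 error code such as 126 into its message text, a small lookup can be sketched as follows. This is only an illustration: `describe_win32_error` is a hypothetical helper name, it relies on `ctypes.FormatError`, which exists only on Windows, and it falls back to a tiny hand-written table elsewhere.

```python
import ctypes

# Fallback for non-Windows machines; only a few well-known codes are listed.
FALLBACK_MESSAGES = {
    2: "The system cannot find the file specified.",
    5: "Access is denied.",
    126: "The specified module could not be found.",
}


def describe_win32_error(code):
    """Return the message text for a Win32 error code."""
    # ctypes.FormatError is only defined on Windows builds of Python.
    format_error = getattr(ctypes, "FormatError", None)
    if format_error is not None:
        return format_error(code).strip()
    return FALLBACK_MESSAGES.get(code, f"Unknown error code {code}")


if __name__ == "__main__":
    print(126, "=", describe_win32_error(126))
```

On a Windows machine, running `net helpmsg 126` from a command prompt gives the same kind of answer without any script.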
The module “SQSRVRES.dll” failed to load.
Make sure the binary is stored at the specified path or debug it to check for problems with the binary or dependent .DLL files.
The specified module could not be found.
This is the same error we saw in the cluster log earlier.
After looking at the errors so far, I wondered whether the SQL Server cluster DLLs were actually present. So, I checked the C:\windows\system32 folder. Surprisingly, I was not able to find sqsrvres.dll and sqagtres.dll. Instead, both files had been renamed.
I renamed the DLLs back to the required names, sqsrvres.dll and sqagtres.dll. After that, we were able to start the SQL Server resource from Failover Cluster Manager. I also rebooted the affected node so that all the resources would come online on their own and all the DLLs would load automatically.
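To check quickly whether the two resource DLLs are present on a node before testing failover, something like the sketch below could be used. It assumes the DLLs are expected directly in the folder you pass in, as was the case on my node, and `missing_dlls` is a hypothetical helper name. It only reports missing files; it does not rename anything.

```python
from pathlib import Path

# The two resource DLL names mentioned in the cluster log errors.
REQUIRED_DLLS = ("sqsrvres.dll", "sqagtres.dll")


def missing_dlls(folder, required=REQUIRED_DLLS):
    """Return the required DLL names not found in folder (case-insensitive)."""
    present = {p.name.lower() for p in Path(folder).iterdir() if p.is_file()}
    return [name for name in required if name.lower() not in present]


if __name__ == "__main__":
    system32 = Path(r"C:\Windows\System32")
    # Only meaningful on the Windows cluster node itself.
    if system32.is_dir():
        print("Missing resource DLLs:", missing_dlls(system32))
```

An empty list means both files are in place. Running it on every node of the cluster makes it easy to spot the one that differs.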
Reference: Pinal Dave (https://blog.sqlauthority.com)