One of my clients took my assistance to deploy Always On availability group. After few months, they added a Disaster Recovery node to windows cluster and availability group also. Later, they contacted me that they are not able to move availability group to DR node. They were getting an error like below in event viewer.
Cluster resource ‘SQLAG_126.96.36.199’ of type ‘IP Address’ in clustered role ‘SQLAG’ failed.
Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.
When I looked at their network configuration, I found that they have two networks.
But when I checked listener role, it was still having only 1 IP address.
They wanted to do DR test, and they did not have any free IP on the DR network. When I asked how an application would connect? They educated me that in DR, they would be using node name to connect rather than listener name.
If above was the way they wanted, then the solution for them was to remove dependency between AG and listener via cluster manager. So, I broke the dependency and failed over AG to the second node without any error. The IP was in the failed state, but AG came online because we removed the dependency.
But in general, when we have multi-subnet AG then we should have multiple IPs for the listener as well. The number of IPs would be equal to the number of subnets.
Have you worked with multi-subnet AG? What are the issues you have faced?
Reference: Pinal Dave (https://blog.sqlauthority.com)