Over the past few days, three clients have hit the same issue, and each of them contacted me for a short engagement to fix it. As you might know, I keep two On Demand (50 minutes) slots available every day, and here is one of the emails I received. Let us learn about adding a node to an AlwaysOn availability group.
“Pinal - Need your urgent help On Demand! We have added a node in AlwaysOn AG and things have broken. Basically, this cluster was set up many months ago. Now, we were attempting to add a new node at our DR site to the cluster. After adding the node, all of the disk drives on the original two servers are showing offline and will not come online.”
When I joined a session with them, I confirmed that adding the new node to the Windows cluster was what broke the disks. They were using an Always On availability group, so there was no need for shared storage. They had used Add Node in Windows Failover Cluster Manager and kept clicking Next, Next, Finish. As soon as they did that, their local drives became clustered. In this case, the SQL Server instances were standalone, so the disks should have stayed local and should not have shown up in Failover Cluster Manager.
WORKAROUND / SOLUTION
- We need to remove the disk resources from the Failover Cluster Manager interface. Since there is no shared storage in our case, we do not want the disks to be part of the cluster.
- Then we need to bring these drives online in Disk Management (after making sure the first step is completed).
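For those who prefer scripting over the GUI, the two steps above can be sketched with PowerShell cmdlets. The following is a minimal Python sketch that only builds the command text; it does not run anything. It assumes the `Remove-ClusterResource` cmdlet (FailoverClusters module) and the `Set-Disk` cmdlet (Storage module) are available on the nodes. The cluster name, disk resource name, and disk number are hypothetical placeholders, and `build_disk_cleanup_commands` is an illustrative helper name, not part of any library.

```python
def ps_quote(value: str) -> str:
    """Wrap a value in PowerShell single quotes, doubling any embedded quote."""
    return "'" + value.replace("'", "''") + "'"


def build_disk_cleanup_commands(cluster, disk_resources, disk_numbers):
    """Return PowerShell command lines for the two workaround steps.

    Step 1: remove each wrongly clustered disk resource from the cluster.
    Step 2: bring each local disk back online on the node that owns it.
    """
    commands = []
    for name in disk_resources:
        # Step 1: drop the disk resource from the cluster (names are assumptions).
        commands.append(
            f"Remove-ClusterResource -Name {ps_quote(name)} "
            f"-Cluster {ps_quote(cluster)} -Force"
        )
    for number in disk_numbers:
        # Step 2: run this locally on the server that physically has the disk.
        commands.append(f"Set-Disk -Number {int(number)} -IsOffline $false")
    return commands


if __name__ == "__main__":
    # Hypothetical values; check the real names in Failover Cluster Manager first.
    for line in build_disk_cleanup_commands("SQLCLUSTER", ["Cluster Disk 1"], [1]):
        print(line)
```

Review the generated lines before running them in an elevated PowerShell session, and run the `Set-Disk` lines only after the removal step has succeeded, just like in the GUI steps.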
ROOT CAUSE / CAUTION
While adding a new node to a Windows cluster, make sure to uncheck “Add all eligible storage to the cluster”. I don’t know why Microsoft has this checked by default even when there is no shared storage.
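If the node is added from PowerShell instead of the wizard, the `-NoStorage` switch of `Add-ClusterNode` serves the same purpose as unchecking that checkbox. Here is a minimal Python sketch that builds such a command line. The node and cluster names are hypothetical, and `build_add_node_command` is an illustrative helper name I made up for this example.

```python
def ps_quote(value: str) -> str:
    """Wrap a value in PowerShell single quotes, doubling any embedded quote."""
    return "'" + value.replace("'", "''") + "'"


def build_add_node_command(node: str, cluster: str) -> str:
    """Build an Add-ClusterNode command that skips adding storage to the cluster."""
    # -NoStorage tells the cmdlet to ignore shared storage for the new node.
    return (
        f"Add-ClusterNode -Name {ps_quote(node)} "
        f"-Cluster {ps_quote(cluster)} -NoStorage"
    )


if __name__ == "__main__":
    # Hypothetical names for a DR node and an existing cluster.
    print(build_add_node_command("SQLDR01", "SQLCLUSTER"))
```

Building the command in one place makes it harder to forget the switch when nodes are evicted and re-added later.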
I hope this blog helps you avoid downtime caused by one checkbox. Unchecking a single checkbox can prevent this issue when adding a node to an availability group.
Reference: Pinal Dave (https://blog.sqlauthority.com)
Yes, we faced a similar issue when our Windows admin evicted an AG cluster node while fixing a quorum failure. While adding the node back, he missed the checkbox. Finally, we found the culprit.
This article saved a lot of downtime on production servers. Thanks much, Pinal.
Great post! Thanks Pinal!