The best way I have learned about building SQL Server AlwaysOn Availability Groups is by contacting my close friends who have written the only book on this topic. It is not common that I ping them with such technical queries, but that is what friendship is made of. Recently, while preparing a demo for an AlwaysOn Availability Group, I ran into an interesting issue and was able to fix it as well.
First, let us look at the configuration.
SERVER1 – Primary Replica
SERVER2 – Secondary Replica
Both replicas are synchronous-commit and configured for automatic failover. I shut down SERVER1 and, as expected, SERVER2 became the primary. As soon as SERVER1 came back, it became the secondary, but the databases were NOT synchronizing. That was a surprise to me, as I was expecting to see data movement from SERVER2 to SERVER1.
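A quick way to confirm this kind of symptom from a query window is to read the availability group DMVs. The query below is a minimal sketch, not the exact check I ran at the time. It assumes you run it on the current primary (SERVER2 here), because only the primary reports rows for every replica.

```sql
-- Run on the current primary; a secondary only returns rows for its own local databases.
SELECT ar.replica_server_name,
       DB_NAME(drs.database_id)        AS database_name,
       drs.is_local,
       drs.synchronization_state_desc,   -- e.g. SYNCHRONIZED, SYNCHRONIZING, NOT SYNCHRONIZING
       drs.synchronization_health_desc,  -- HEALTHY, PARTIALLY_HEALTHY, NOT_HEALTHY
       drs.is_suspended,
       drs.suspend_reason_desc
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_replicas AS ar
    ON ar.replica_id = drs.replica_id
ORDER BY ar.replica_server_name, database_name;
```

If data movement is healthy, the secondary's rows should not stay in a NOT SYNCHRONIZING state for long.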
As always, the ERRORLOG is the first point of investigation on my troubleshooting checklist. If you have not seen it before, a separate blog post explains how to find the ERRORLOG.
On SERVER1, I found the message below multiple times.
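If you prefer to stay in a query window, the sketch below shows one way to locate and search the current ERRORLOG. Note that sp_readerrorlog is an undocumented procedure, so treat its parameters as an assumption that may change between versions. The search string is taken from the error message quoted in this post.

```sql
-- Path of the current error log file (available in SQL Server 2012 and later).
SELECT SERVERPROPERTY('ErrorLogFileName') AS errorlog_path;

-- Undocumented procedure; arguments as commonly used:
--   0 = current log file, 1 = SQL Server error log (2 = SQL Server Agent log),
--   third argument = text to search for.
EXEC sp_readerrorlog 0, 1, 'Database Mirroring login attempt';
```

Run the search on the replica that is not synchronizing, since that is where the failed incoming connection attempts are logged.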
2015-08-24 01:24:18.480 Logon Database Mirroring login attempt failed with error: 'Connection handshake failed. An OS call failed: (8009030c) 0x8009030c(The logon attempt failed). State 67.'. [CLIENT: 10.0.0.2]
There are some interesting findings in the above message.
- Database Mirroring and AlwaysOn use the same endpoint, which is why the error message says Database Mirroring even though I am using an AlwaysOn Availability Group.
- The IP address in the message belongs to SERVER2 (which is the primary as of now).
This means that SERVER2 is not able to talk to SERVER1 via the mirroring port. Searching on the hex code 8009030c did not help much because it means "The logon attempt failed", which is already in the same message.
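The endpoint and the replica-to-replica connection can also be inspected from the catalog views and DMVs. This is a hedged sketch of the kind of check that fits here; run it on both replicas and compare the results.

```sql
-- State, role, authentication mode and TCP port of the mirroring/AlwaysOn endpoint.
SELECT dme.name,
       dme.state_desc,            -- STARTED, STOPPED or DISABLED
       dme.role_desc,
       dme.connection_auth_desc,  -- e.g. NEGOTIATE, which relies on Windows authentication
       te.port
FROM sys.database_mirroring_endpoints AS dme
JOIN sys.tcp_endpoints AS te
    ON te.endpoint_id = dme.endpoint_id;

-- Connection state of each replica and the last connection error recorded for it.
SELECT ar.replica_server_name,
       ars.role_desc,
       ars.connected_state_desc,  -- CONNECTED or DISCONNECTED
       ars.last_connect_error_number,
       ars.last_connect_error_description,
       ars.last_connect_error_timestamp
FROM sys.dm_hadr_availability_replica_states AS ars
JOIN sys.availability_replicas AS ar
    ON ar.replica_id = ars.replica_id;
```

An endpoint that shows STARTED on both sides while a replica stays DISCONNECTED points away from the network and toward authentication between the service accounts.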
Steps I tried after searching on the internet (they didn’t help):
- I logged in to SERVER2 via Remote Desktop and connected to SERVER1 from Management Studio; that worked fine.
- Restarted the endpoints on both replicas. No joy!
ALTER ENDPOINT hadr_endpoint STATE = STOPPED;
ALTER ENDPOINT hadr_endpoint STATE = STARTED;
- PING from each server to the other. This also worked, which means DNS was not an issue.
- Verified that SQL Server was running under a domain account and that the account was not locked.
None of the above helped. Taking the last check as a hint, I started backtracking to recall what I had done with my SQL Server service account. It did not take long to realize that I had changed the service account's domain password recently. I rarely use AlwaysOn Availability Groups, so I had updated the service account password on SERVER1 but forgot to do so on SERVER2.
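To see which account each SQL Server related service runs under, and whether the service is actually running, you can query sys.dm_server_services on each replica. This is only a sketch of a sanity check: the DMV shows the account name and service status, not whether the stored password is still valid.

```sql
-- One row per SQL Server related service on this instance
-- (Database Engine, SQL Server Agent and, depending on version, the full-text launcher).
SELECT servicename,
       service_account,
       status_desc,        -- e.g. Running, Stopped
       startup_type_desc,
       last_startup_time
FROM sys.dm_server_services;
```

A service that is stopped while its startup type is automatic is worth a closer look in SQL Server Configuration Manager.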
While looking at the ERRORLOG, I also found the message below.
SQL Server failed to communicate with filter daemon launch service (Windows error: The service did not start due to a logon failure.). Full-Text filter daemon process failed to start. Full-text search functionality will not be available.
The above message appeared when SQL Server was trying to start the full-text service automatically. Since I had not updated the password for the full-text service, I was seeing this message.
Solution (which worked for me): Correct the service account password for the SQL Server related services via SQL Server Configuration Manager.
Once I corrected the service account password, the databases synchronized and I was happy again.
Reference: Pinal Dave (http://blog.sqlauthority.com)