[Note from Pinal]: In this episode of the Notes from the Field series database expert Mike Lawell explains about Availability Group and the Six Critical Steps for DBAs to Follow. A few weeks ago, I asked questions in conferance, how many people know about availability group. Only a handful of people raised their hands. I was expecting this as many have heard about AlwaysOn and Availability Groups but not everyone has implemented the same. Most of the time, there are no clear guidelines for the same. In this blog post Mike tells an amazing story about Six Critical Steps every DBA must know about Availability group. He makes this simple concept very easy. Read the experience of Mike in her own words.
Your boss has asked you to implement SQL Server AlwaysOn Availability Groups in your environment. Exciting as it may, you’re an accidental DBA with little to no experience implementing any high availability solution, let alone, a fairly new availability group.
The first thing you say is, fantastic! Then, hmmm, fantastic. You realize this is above your head, but it is a great opportunity to learn.
This happened to one of my clients, where the DBA barely had SQL Server Database Mirroring experience. The product had a required up time of 99.999% (yeah, right), that they were having problems maintaining. They wanted to improve their availability and business continuity.
The DBA had a lot of anxiety with implementing AGs in this environment without experiencing considerable downtime.
Fortunately they were able to engage our services to lead them through the process. We migrated from mirroring to AGs with approximately a minute of downtime into a 4 server multi-subnet environment to a new set of servers.
There are many environments where AGs will simply not work (or have a hidden disaster waiting). For instance, you should not use availability groups in SQL Server 2012 or 2014, if you’re doing cross database transactions or distributed transactions. If you have a failover during a distributed transaction, it is possible that your database will enter a suspect mode (btw, this changes in SQL Server 2016).
How many servers and replicas will you need? Which databases do you want in which AG? What might the licensing costs be? What should/can your quorum mode be? Do your applications even support AGs?
Develop the high level plan to the best of your knowledge. Don’t worry about being wrong, you will miss something. Take the time to think of every application or service, etc. that will connect to the AG. Document all of the jobs, alerts, mail configuration, operators, everything you can think of (find a script to do it).
3. Test the AG Build
Now that you have some idea of what is possible and you think you know what needs to be done, build an AG as a proof of concept in an isolated environment to test your theory of how it should work. Make sure you use the same drive letters as production. You will learn that you will miss important configurations, steps, etc. Tear the AG down then rebuild it, until you are familiar with the steps, write the steps you take down (every step). Do this until you’re confident you have all of the steps assembled and you haven’t missed anything. You will miss something, because most database environments are complex, and many DBAs don’t know all of the services connecting to their databases.
4. Script the Implementation
This part really helps in minimizing downtime. You don’t want to be clicking through SSMS trying to get all of the configurations correct in production and risk additional downtime at failover or worse.
I built the AGs ahead of time in the production environment and added a test database so I could test the failover between machines before go live.
The scripts were PowerShell scripts pulling the configuration from a csv file. I learned how to script PowerShell from the internet. Make sure your configuration file is tested. I create a spreadsheet with the important data (created from a script run against production) about database name, data file location, and more. Don’t type the configurations if you can avoid it. This is where you get into trouble especially in complex environments. I have some examples of the PowerShell AG and SQLCMD Log Shipping scripts I used on my blog at SQLServerAssociates.com
5. Test the Scripts
This is the most important step as even the best plan must be tested. It must work repeatedly in your development environment without errors. Don’t compromise with this step. If you want to avoid extended downtime because you received an error that you’ll have to research to fix, test your scripts.
6. Know your Environment
You must be able to talk confidently about AGs and defend your configuration to management. You will need to document your environment for management, operations, and other DBAs (including yourself). Use Visio diagrams for visuals.
Migrating your environment to AGs can be a great learning experience, but don’t be afraid to engage a consultant to help you do it right. The justification is usually very easy when talking to management when you equate downtime to revenue lost.
If you want to get started with SQL Server with the help of experts, read more over at Fix Your SQL Server.
Reference: Pinal Dave (http://blog.sqlauthority.com)