[Note from Pinal]: This is a 97th episode of Notes from the Field series. If you know Mike Walsh, he is an amazing person to meet and know. I am fortunate to be his friend. Every single time I meet him, I learn something as a human.
In this episode of the Notes from the Field series database expert Mike Walsh gives a challenge to all of us. He explains to us why we should study our mistakes. He is indeed very right that every mistake gives us opportunity to correct ourselves.
So another post from me on my friend Pinal’s blog that is on the periphery of technology and professional development.
This post is admittedly inspired by a scare I had with a chainsaw this past weekend. I blogged about that over on my personal site but I wanted to talk about a mindset of studying the failures (both your own and those of others) and trying to always learn from mistakes. You can learn any technology you want, you can be the best DBA you could possibly be – but if you don’t study and learn from failures, you’ll never be as good as you could be.
I can talk about this a bit for two reasons. One is – I’m a human. And as one, I make mistakes from time to time. So I’m qualified to talk about failure – because I’ve done it so much. The other is – I’m a consultant. At Linchpin People – we do a lot of things with SQL Server for our clients. One of them is rescuing projects gone bad. I’ve picked up other people’s mistakes and tried to piece them together for them, and do it regularly.
So on to a few thoughts about learning from mistakes.
What Do You Mean, “Study Mistakes”?
In the post on my own blog, I talked about how chainsaw protective pants/chaps saved me a really bad day. I didn’t always wear safety gear when using chainsaws until recently. I’m not sure why, but I didn’t. What made me change my mind this year? I studied mistakes.
You see – I have to do a lot of work at my house in the woods for clearing fields for farm animals (In my spare time, when I’m not helping clients tune million dollar systems, I’m cleaning up after sheep and chickens – and more animals once I’m done clearing more space) this year. And I’ve used a chainsaw before here and there – but never with the size trees that I’ve had to cut this year. Never at the frequency that I’ve been cutting. So I started talking to people who’ve been there and done that. I asked questions and listened to advice. But I also read accident reports (I speak on aviation disasters and learning from them, In the Fire Service we study Line of Duty Deaths a lot to see what mistakes others made. It sounds macabre but it is a vital learning tool). I read a lot of accident reports – and I saw mistakes and common patterns in chainsaw users who died or had serious injuries. In a lot of the cases there were nearly always mistakes made, but then a lot of times they were compounded by not having the right safety gear on when those mistakes happened. I learned about being intentional and avoiding mistakes and changed some habits – but I also saw that over half of the accidents wouldn’t have been reported if the right protective gear was on. So I bought it. And wear it. And it works.
We can do that as technologists too.
DBAs – What can you do?
Study Mistakes – Read blog posts. Look at the forums. Look and see what gets people “hurt” with SQL Server. Look at the horror stories about how SANs really can fail. Learn what goes wrong. Go to lessons learned and “you’re doing it wrong” style presentations. When you go to these – don’t get the “this couldn’t happen to me” attitude. Understand this CAN happen to you and ask yourself what you can do differently.
Be Creative – When I was doing my various levels of emergency medical technician training we would often engage in scenario based discussions. We’d think up “what if” situations. We’d think of the time of dispatch (how a call would come in) through the time responding to that type of call, to arriving, and we’d think of various what-if’s along the way. We’d change the presentation of the patient, we’d think about a turn for the worse. All of this training developed muscle memory, it got us thinking about the unknown. So when we were thrown into the unknown, it wasn’t a totally foreign feeling.
We can do that as technologists also! We can think of what could go wrong, we can think of the things a few levels deep and imagine where things can break. And then we can do something about the scenarios that we should deal with. A good way to start this is writing checklists for upgrades or migrations or deployments. As you write the checklist, you can begin to analyze what may happen at various steps and build time and solutions in to deal with those situations.
Be Honest – It’s 2015, so I am assuming that you are a human if you are reading this. So you’ll make mistakes. It will happen. It may be big, it may be small, and it won’t be your last one either. In the moments after you recover you have options. One is ignore what happened, brush it under the rug and hope it doesn’t happen again. Another is blame others. Still another, though, is to investigate what happened. Understand the mechanics. Do a root cause analysis and see what you can do differently next time. Learn from your own mistakes, encourage your team to have open and honest lessons learned meetings where everyone talks, everyone listens and the uncomfortable conversations can be had. If that can’t happen? Then you’re doomed to repeat history.
If you want to get started with performance analytics and triage of virtualized SQL Servers with the help of experts, read more over at Fix Your SQL Server.
Reference: Pinal Dave (http://blog.sqlauthority.com)