[Note from Pinal]: This is a 58th episode of Notes from the Field series. Mike Walsh is an amazing person to meet and know. I am fortunate to be his friend. Every single time I meet him, I learn something as a human. He is always looking for an opportunity to improve the life of DBAs and Developers.
In one of the recent communication with Mike, I had asked him a question that what is actually Baselines and Performance Monitoring? How do we define what is normal and what is not normal? Where do we draw lines when we talk about performance? Database expert Mike Walsh decided to help us with the answer of this question.
Read the entire story in his own words.
Last time it was my time to post here on SQL Authority, I suggested that DBAs make a proactive investment in their environments.
I can’t stress this enough – procrastination is our enemy as technologists. Especially as DBAs. We have things we should do on a regular basis to keep our environments moving along. I gave a few suggestions of places to start in that post three months ago. If you’ve not started down the road on those things, stop reading this and do those things first.
If you’ve started being proactive in those areas already? Keep reading and let’s talk about being proactive in another area:
Baselines and Performance Monitoring – Why?
It’s really important to know what “normal” looks like for a lot of reasons when it comes to performance in your environment.
The answers to these questions all require you to know what normal looks like:
- When will this environment exhaust the resources available in our current setup?
- Things are slow – what’s the cause?
- Can we consolidate onto fewer servers?
Without having a sense of what normal looks like for server performance – you can’t accurately answer those questions. For some of them you need to watch what normal looks like over time. And normal can look different on different servers.
Regularly looking at your performance metrics and comparing them on a regular basis is an important step in being proactive. The process will teach you about your environment and let you see trends develop over time. It will also teach you about SQL Server performance more as you dig into the various counters and understand what they mean.
Baselines and Performance Monitoring – An Approach
There are a lot of approaches to take here. You can run your favorite perfmon counters on a regular basis and look at the results in Excel. You can use a performance and event monitoring tool like SQL Sentry – and look at the data it collects over time. Or any number of approaches in between.
While I often advise my consulting clients to use several of these approaches – a good place for you to start learning and doing is with a free tool called PAL – Performance Analysis of Logs. This tool is available on Codeplex and I describe how to use this tool in a couple blog posts and videos on the Linchpin People blog (http://www.linchpinpeople.com/sql-server-performance-baseline-presentation-seacoast-sql/ is a presentation I give on baselines and baseline metric collection; http://www.linchpinpeople.com/how-create-sql-server-baseline-using-pal/ shows a couple videos on how to use the PAL tool and create a perfmon template).
Rather than regurgitate what those resources, and the posts I link to in them, describe – I’ll suggest a few action items that you can follow right now. Those links will help you on your journey here. Even if your environment has no issues, the baseline can prove valuable to you in the future – and the process is guaranteed to teach you something about SQL Server performance you didn’t know before you started. I’ve been working with SQL server for 16 years and I still learn something every time I engage in serious baseline study.
Baselines and Performance Monitoring – Action Plan
Some suggested next steps you can take to get on the road to being proactive about performance:
- Learn about the PAL tool – use the links here to start.
- Collect “Fake” Performance Data – Start with a test environment if you aren’t sure, watch the impact and get comfortable with the process.
- Collect Real Performance Data – Once you are comfortable with the approach, do the steps in production. Track some data for a couple days, a day or a week depending on the sample interval you go with.
- Analyze The Data – Run the performance metrics through the PAL tool and look for alerts. Keep in mind that not all alerts require immediate action, but look at the alerts and learn about them. Read the tips in the PAL tool, look at blogs like this one to understand what is going on and see if you have issues.
- Rinse and Repeat – Try this again in a month or a few months. What’s changed? Are you doing more activity (maybe batches/second or transactions/sec or user count) and seeing your resources still performing well? That’s great. Do you see performance slowing down but activity not really that much higher? Dig in and see why you are falling off your baseline.
The point is – get started looking at your baselines. You may not have an urgent and immediate need for a good set of baselines today – but when that day comes, you’ll wish you took the time to collect that data. I can’t tell you how often we get called in to help a customer experiencing performance concerns and a baseline would help at least get a sense for what changed from when things were good. We can still help get them out of their mess all the same – but if we had a few baselines over time we can sometimes more quickly figure out where the most beneficial changes can come.
If you want to get started with performance analytics and triage of virtualized SQL Servers with the help of experts, read more over at Fix Your SQL Server.
Reference: Pinal Dave (http://blog.sqlauthority.com)