[Note from Pinal]: This is a 18th episode of Notes from the Field series. Performance is very crucial subject and to come up with the right strategy along with a performance one needs to have more knowledge related to the overall performance of the server. However, most of the organization does not come up with the best strategies related to performance. It is very crucial that DBA collects performance statistics of the entire collection methodologically and scientifically.
In this episode of the Notes from the Field series database expert David Klee explains a very crucial issue DBAs and Developer faces in their career – how to collect performance statistics. Linchpin People are database coaches and wellness experts for a data driven world. Read the experience of John in his own words.
Performance statistics are critically vital for the long-term health and capacity management of an enterprise environment. Ongoing performance metric collection ensures that you, the DBA, have the metrics of a system’s operating state whenever a performance problem is reporter – either real-time or historically. Too many times when I am troubleshooting a well-known performance problem, the client has no metrics about the system state during the time window where end users reported performance problems.
Let’s fix that. We need some performance statistic collection.
Why Collect Around the Clock?
Are you able to definitively state that, at 10:48pm tonight, you will have unusually high disk latency on your SQL Server archival data volume? If so, become a consultant and the world is yours! For the rest of us, our reality would be an email tomorrow morning stating that the application was ‘slow’ last night and the customer service reps received complaints. We would normally not have any method to get a clear picture of the state of our system last night, and without the numbers, are now left to start looking at the current system state to see if the issue reported is still lingering in the system.
Constant, around-the-clock performance statistic sampling is the simplest method of achieving this goal. Wouldn’t you like to be able to generate charts like the following at any time to get a simple picture of what your servers were doing at any given time?
A picture says a thousand words, and this simple graph quickly tells the reader the times where the system was busy. The underlying data was collected with Windows Perfmon on a five-minute interval throughout the week, and loaded into a database where an SSRS report is emailed out to the administrators every morning. The reports are quickly reviewed each morning as the DBA performs the usual morning review, and if an anomaly is present, would be quickly discovered.
Methods for Collection
Quite a number of third-party utilities from your favorite tooling vendors exists to assist you with ongoing performance statistic collection. They work great and I highly endorse any tool that makes you more efficient. But, if you do not have one of these utilities, or the price is prohibitive, you already have a tool that can handle this task for you. Windows Performance Monitor, or Perfmon, is built into the Windows Server operating system, and can be configured to constantly record performance statistics for future use.
Use the directions at Technet (link to http://technet.microsoft.com/en-us/library/cc749249.aspx) to set up an ongoing performance data collector set to collect information around the clock. I generally use a five-minute polling interval to start with. If you know you are experiencing undiagnosed performance issues, increase the polling interval.
What to Collect
The following list of Windows and SQL Server WMI counters should get you the basics needed to help with troubleshooting current performance problems. Feel free to modify this list according to your environment and your specific needs. The following web site contains a list of the performance counters and their descriptions.
Note: if a counter has multiple instances of a selected object, make sure you select <All instances> so the counters are separated appropriately. This option helps you get individual statistics for objects like CPU cores or disk drive letters.
Windows Server Counters
|Counter Group||Counter Set|
|Page Faults / sec|
|Pages / sec|
|Network Interface||Bytes Received / sec|
|Bytes Sent / sec|
|Paging File||% Usage|
|PhysicalDisk||% Idle Time|
|Average Disk Bytes / Read|
|Average Disk Bytes / Write|
|Average Disk sec / Read|
|Average Disk sec / Write|
|Disk Read Bytes / sec|
|Disk Reads / sec|
|Disk Write Bytes / sec|
|Disk Writes / sec|
|Processor||% Privileged Time|
|% Processor Time|
|% User Time|
|System||Processor Queue Length|
SQL Server Counters
|Counter Group||Counter Set|
|Process (sqlserver.exe)||% Processor Time|
|% Privileged Time|
|SQL Server:Buffer Manager||Lazy writes/sec|
|Page life expectancy|
|SQL Server:Memory Manager||Total Server Memory (KB)|
|Target Server Memory (KB)|
|Memory Grants Pending|
|SQL Server:Access Methods||Forwarded Records/sec|
|SQL Server:Locks||Number of Deadlocks/sec|
|SQL Server:SQL Statistics||Batch Requests / sec|
|Compilations / sec|
|Recompilations / sec|
You can also build custom collectors for SQL Server-specific items that WMI will not serve. Use your favorite job scheduler to routinely execute queries against system DMOs and store the results into custom tables in a utility database. To get a good idea of the sort of metrics that you can collect, reference some of the many objects in Glenn Berry’s fantastic diagnostic queries, available at http://sqlserverperformance.wordpress.com. Items such as disk stall rates, instance and database level CPU, memory, and disk consumption metrics, and background CPU activity can all be sampled with these queries.
Now, what do you do with this data that you are collecting? Let’s make some sense of it! Stay tuned until next time when we discuss how to start sifting through the great quantities of data that we are gathering.
If you want to get started with performance statistics SQL Servers with the help of experts, read more over at Fix Your SQL Server.
Reference: Pinal Dave (http://blog.sqlauthority.com)