This is one of the toughest topics to address, and I am always looking at new tools that can help me learn and understand SQL Server better. In many cases, I fall back to MSDN documentation and official statements from the Microsoft site because a lot of R&D generally has been done by them already. Though I learn about these recommendations, I make it a point to try them out in my limited capacity of testing. I am always of the opinion – “Accept but yet validate.” This reinforces our understanding and we will never forget the concepts thereafter.
Coming back on the topic, what is the tool that you use to check for configuration settings on your SQL Server box? In the past I have talked about SQL Server Management Studio Standard reports and a number of reports that can help. Today, let me tell you about a tool that has helped me learn some new things every now and then – it is called Microsoft SQL Server 2012 Best Practices Analyzer. As I said before, since it comes from Microsoft directly, I use the recommendations and try to learn from the tool. The documentation is crisp and neat too.
As the name suggests, it is a guideline to best practices recommended by Microsoft. This tool can help analyze the configuration setting of your SQL Server box for:
- Analysis Services
- Database Engine
- Integration Services
- Replication Settings
- Reporting Services
Let me talk about some of the nice rules available for Database Engine. These are just a subset of 100’s of rules available in the tool. I am just calling out few of them here for reference:
- It is a security best practice to know if there is a DDOS attack on the server by auditing Failed Logins. Failed Logins are enabled by default, but the rule makes sure there is a check in case it was disabled by the administrator.
- I have not been a fan of auto-growth happening on the server. Auto-growth is a safety net but administrators need to make sure the files are grown ahead of time during the maintenance window. The rule checks if auto-growth failed for some reason or if it took longer than expected. The answer to this is enabling Instant File Initialization.
- Placing data and log files on the same volume is never a great idea. It is recommended to place these on different physical drives.
- If a database is not enabled with CHECKSUM, it will be difficult to identify IO related inconsistencies. Read more from the KB. The rule makes sure these are enabled at the DB level.
- Having too many VLFs can be a problem. SQL Server 2012 does put some data into the Error Log, but this rules does a check and lets us know.
- It is a security risk to grant more than read permissions on the BINN folder inside SQL Server. We don’t want the binaries to vanish or get corrupt by malicious users. So it is the case for the DATA folder where we are placing our system files.
- If password policies are not enforced in the system, the rule raises the same as alert. It is a good practice to have password policies of expiration enabled on the server.
The number of rules can keep going on. As I discussed earlier, these are great learning resources for us. We can read on each of the rule and then do our R&D to learn more.
Monitoring performance using the Database Performance Analyzer
Moving now from standard configuration settings to the performance topic. There is no single starting point when it comes to troubleshooting performance inside any database. When it comes to performance tuning, you need to think differently. With complex systems built on top of databases, you need something that is built radically different. SQL Server did introduce a concept called Wait Stats and it has been a popular mechanism for many to have a macro high level understanding of the various waits inside SQL Server. In fact I have written a book on Wait Stats based on the 28 days series on Wait Stats over my blog.
From what I have seen in Database Performance Analyzer from SolarWinds (you might also know it byt its former name Confio Ignite), it has enough and more reference to these Waits. Using Waits as an indicator to server health is something unique I saw from this tool. If you are a DBA and want to tune your systems from the inside out, then it is worth taking a note of this tool. Wait-based reports are embedded almost all over the place. Let me start by looking at the home dashboard. It is crisp and to-the-point for high level information like CPU, Memory, Disk, Waits. If there is something to be worried about, then a warning symbol indicates areas of concern. We will look at some of them as we start exploring the tool.
From the main dashboard, if we select the server that we are monitoring, we will be presented with the above screen. It is a great way to look at the trends of various statements and the waits happening inside the system. What surprised me is the way the waits are also tracked for Oracle, Sybase, Oracle (RAC), DB2 servers from the same console. The enterprises of today are a heterogeneous environment and tools like these that give a single pane of view of waits are great tools.
The Trends graph shows the waits and maximum time taken by batch of queries. The Advisor pane at the bottom was the icing on the cake, it shows the query that consumed maximum CPU or Memory on the server. Clicking on the “more…” button we can view the query and take corrective measures on reducing the same. Let us look at one such recommendation below:
Here at the details pane, we can see the query that got executed and possible recommendations for a better plan are presented. In this case, we get the information about possible “Missing Indexes” for the given query. As a DBA, it is almost impossible to go to each query and find what the missing indexes are. Or worse, in SQL Server we do get a list of missing indexes but it fails to tell which queries will benefit. This screen is a classic example of both coming together. A nifty addition for sure. The bottom pane shows the various waits that affect the query or we can get a day view to see what the top waits are.
In this above view, I have taken an historical trend of a typical day to see what is causing possible waits inside our systems. The very fact that we have some sort of CPU contention is quite evident with the “Red” bar consistently.
Who doesn’t like a little help from the system? Above is a typical example where we see high WRITELOG waits and if a junior DBA is looking at the system, they will not know where to start. On clicking on each of the waits, we are presented with a simple steps of documentation to what one can do to mitigate these waits. Though these recommendations are given, I highly encourage you to consult your senior DBA before taking any steps.
The storage trends were a nice visualization for me. The snapshot shows the top IO users and orders them based on that. Here you can see the tempDB ldf file seems to be under contention. The data shows a sparkline trend of latency for read/writes.
DPA also has the information of standard performance counters grouped by resources. This is great basic information and for a seasoned DBA, this can be an awesome tool to view from a single interface.
These are standard performance counters from the comfort of a web browser so a DBA can watch what is happening inside the server. One can set thresholds for various grouping and also keep alerts. These in my opinion significantly make a DBA proactive rather than reactive when a problem occurs.
If you are looking for detailed performance analysis, but lack the time and/or knowledge to decipher Wait Stats and server resource metrics, this tool will not disappoint you. It gives you pointers to work with the essential performance characteristics. Do let me know your experience working with Database Performance Analyzer, I would surely like to learn your perspective using this tool.
Reference: Pinal Dave (http://blog.sqlauthority.com)