Hey DBA – Baselines and Performance Monitoring – Why? – Notes from the Field #058

[Note from Pinal]: This is the 58th episode of the Notes from the Field series. Mike Walsh is an amazing person to meet and know, and I am fortunate to be his friend. Every single time I meet him, I learn something as a human being. He is always looking for an opportunity to improve the lives of DBAs and developers.

In a recent conversation with Mike, I asked him: what exactly are baselines and performance monitoring? How do we define what is normal and what is not? Where do we draw the line when we talk about performance? Database expert Mike Walsh decided to help us answer these questions.

Read the entire story in his own words.


The last time it was my turn to post here on SQL Authority, I suggested that DBAs make a proactive investment in their environments.

I can’t stress this enough: procrastination is our enemy as technologists, especially as DBAs. We have things we should do on a regular basis to keep our environments moving along. I gave a few suggestions of places to start in that post three months ago. If you haven’t started down the road on those things, stop reading this and do those things first.

If you’ve already started being proactive in those areas, keep reading and let’s talk about being proactive in another area:

Baselines and Performance Monitoring – Why?

It’s really important to know what “normal” looks like for a lot of reasons when it comes to performance in your environment.

The answers to these questions all require you to know what normal looks like:

  • When will this environment exhaust the resources available in our current setup?
  • Things are slow – what’s the cause?
  • Can we consolidate onto fewer servers?

Without having a sense of what normal looks like for server performance – you can’t accurately answer those questions. For some of them you need to watch what normal looks like over time. And normal can look different on different servers.

Reviewing your performance metrics and comparing them on a regular basis is an important step in being proactive. The process will teach you about your environment and let you see trends develop over time. It will also teach you more about SQL Server performance as you dig into the various counters and understand what they mean.

Baselines and Performance Monitoring – An Approach

There are a lot of approaches to take here. You can run your favorite perfmon counters on a regular basis and look at the results in Excel. You can use a performance and event monitoring tool like SQL Sentry – and look at the data it collects over time. Or any number of approaches in between.
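One of the "in between" approaches can be sketched in T-SQL: snapshot a handful of counters from the `sys.dm_os_performance_counters` DMV into a table and keep the history. This is only a minimal sketch. The table name `dbo.PerfBaseline`, its shape, and the choice of counters are assumptions made for illustration, not something the tools mentioned above prescribe.

```sql
-- Hypothetical table for storing counter snapshots (name and shape are assumptions).
CREATE TABLE dbo.PerfBaseline
(
    capture_time  DATETIME      NOT NULL DEFAULT GETDATE(),
    [object_name] NVARCHAR(128) NOT NULL,
    counter_name  NVARCHAR(128) NOT NULL,
    instance_name NVARCHAR(128) NULL,
    cntr_value    BIGINT        NOT NULL
);
GO

-- Take one snapshot. Run this on a schedule to build up history.
-- Note: the "/sec" counters in this DMV are cumulative since startup,
-- so a rate is the difference between two snapshots divided by the elapsed seconds.
INSERT INTO dbo.PerfBaseline ([object_name], counter_name, instance_name, cntr_value)
SELECT RTRIM([object_name]), RTRIM(counter_name), RTRIM(instance_name), cntr_value
FROM   sys.dm_os_performance_counters
WHERE  counter_name IN (N'Batch Requests/sec', N'Transactions/sec',
                        N'User Connections', N'Page life expectancy');
```

Querying the DMV requires the VIEW SERVER STATE permission. A perfmon-based collection covers operating system counters as well, which this DMV does not.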

While I often advise my consulting clients to use several of these approaches – a good place for you to start learning and doing is with a free tool called PAL – Performance Analysis of Logs. This tool is available on Codeplex and I describe how to use this tool in a couple blog posts and videos on the Linchpin People blog (http://www.linchpinpeople.com/sql-server-performance-baseline-presentation-seacoast-sql/ is a presentation I give on baselines and baseline metric collection; http://www.linchpinpeople.com/how-create-sql-server-baseline-using-pal/ shows a couple videos on how to use the PAL tool and create a perfmon template).

Rather than regurgitate what those resources and the posts they link to describe, I’ll suggest a few action items that you can follow right now. Those links will help you on your journey here. Even if your environment has no issues, the baseline can prove valuable to you in the future, and the process is guaranteed to teach you something about SQL Server performance you didn’t know before you started. I’ve been working with SQL Server for 16 years and I still learn something every time I engage in serious baseline study.

Baselines and Performance Monitoring – Action Plan

Some suggested next steps you can take to get on the road to being proactive about performance:

  • Learn about the PAL tool – use the links here to start.
  • Collect “Fake” Performance Data – Start with a test environment if you aren’t sure, watch the impact and get comfortable with the process.
  • Collect Real Performance Data – Once you are comfortable with the approach, do the steps in production. Track some data for a day, a couple of days or a week, depending on the sample interval you go with.
  • Analyze The Data – Run the performance metrics through the PAL tool and look for alerts. Keep in mind that not all alerts require immediate action, but look at the alerts and learn about them. Read the tips in the PAL tool, look at blogs like this one to understand what is going on and see if you have issues.
  • Rinse and Repeat – Try this again in a month or a few months. What’s changed? Are you doing more activity (maybe batches/second or transactions/sec or user count) and seeing your resources still performing well? That’s great. Do you see performance slowing down but activity not really that much higher? Dig in and see why you are falling off your baseline.
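The "Rinse and Repeat" comparison can be sketched as a simple query over two baseline summaries. Everything below is hypothetical: the table variable, the labels and especially the numbers are made up purely to show the shape of the comparison.

```sql
-- Two hypothetical baseline summaries; every number here is made up for illustration.
DECLARE @Baseline TABLE
(
    capture_label VARCHAR(10),    -- 'previous' or 'current'
    counter_name  VARCHAR(50),
    avg_value     DECIMAL(18, 2)
);

INSERT INTO @Baseline (capture_label, counter_name, avg_value) VALUES
('previous', 'Batch Requests/sec',  800),
('current',  'Batch Requests/sec', 1000),
('previous', '% Processor Time',     40),
('current',  '% Processor Time',     70);

-- Percentage change per counter between the two captures.
SELECT cur.counter_name,
       prev.avg_value AS previous_avg,
       cur.avg_value  AS current_avg,
       CAST(100.0 * (cur.avg_value - prev.avg_value)
            / NULLIF(prev.avg_value, 0) AS DECIMAL(9, 1)) AS pct_change
FROM   @Baseline AS cur
JOIN   @Baseline AS prev
       ON prev.counter_name = cur.counter_name
WHERE  cur.capture_label  = 'current'
  AND  prev.capture_label = 'previous'
ORDER BY cur.counter_name;
```

With these made-up figures, activity is up 25% while CPU is up 75%. Resource use growing much faster than activity is the kind of gap worth digging into.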

The point is: get started looking at your baselines. You may not have an urgent and immediate need for a good set of baselines today, but when that day comes, you’ll wish you had taken the time to collect that data. I can’t tell you how often we get called in to help a customer experiencing performance concerns where a baseline would have helped us at least get a sense of what changed from when things were good. We can still help get them out of their mess all the same, but with a few baselines over time we can often figure out more quickly where the most beneficial changes can come from.

If you want to get started with performance analytics and triage of virtualized SQL Servers with the help of experts, read more over at Fix Your SQL Server.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Fixing Backup Error – Operating system error 5(Access is denied.) – SQL in Sixty Seconds #077

This has to be one of the most famous errors. I have installed SQL Server many times, but I still get this error once in a while. It is also one of the most searched errors online. I have previously blogged about it, but today I planned to create a small video on the same subject. In this video I demonstrate how one can resolve the permissions issue behind this error.

I have written a few other blog posts related to this error earlier.


SQL SERVER – List the Name of the Months Between Date Ranges

Here is another interesting question I received the other day.

“How to list the name of the months between two date ranges?”

Very interesting question. I had no script ready for it, so I asked a friend who used to be my co-worker, and he sent me the script listed below.

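DECLARE @StartDate DATETIME,
        @EndDate   DATETIME;
SELECT  @StartDate = '20140301', -- March
        @EndDate   = '20140901'; -- September
-- The offsets start at 1, so the start month itself is not listed:
-- this range returns April through September.
-- The numbers list stops at 12, so ranges longer than 12 months are cut off.
SELECT  DATENAME(MONTH, DATEADD(MONTH, nos.monthnos, @StartDate)) AS MonthName
FROM    (SELECT 1 monthnos UNION SELECT 2 UNION SELECT 3
         UNION SELECT 4 UNION SELECT 5 UNION SELECT 6
         UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
         UNION SELECT 10 UNION SELECT 11 UNION SELECT 12) nos
WHERE   nos.monthnos <= DATEDIFF(MONTH, @StartDate, @EndDate)
ORDER BY nos.monthnos; -- make the output order explicit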

The above script returns the names of the months between the two dates. Let me know if there is any other way to achieve the same.
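One other way, offered only as a sketch, is a recursive CTE. Unlike the script above, this variant includes the start month and is not limited to 12 months. The CTE name `Months` and the column names are arbitrary choices, and the inline `DECLARE ... =` initialisation assumes SQL Server 2008 or later.

```sql
DECLARE @StartDate DATETIME = '20140301', -- March
        @EndDate   DATETIME = '20140901'; -- September

WITH Months AS
(
    -- Anchor: first day of the start month
    SELECT DATEADD(MONTH, DATEDIFF(MONTH, 0, @StartDate), 0) AS MonthStart
    UNION ALL
    -- Step: add one month at a time until the end date is passed
    SELECT DATEADD(MONTH, 1, MonthStart)
    FROM   Months
    WHERE  DATEADD(MONTH, 1, MonthStart) <= @EndDate
)
SELECT DATENAME(MONTH, MonthStart) AS MonthName,
       YEAR(MonthStart)            AS YearNumber
FROM   Months
ORDER BY MonthStart
OPTION (MAXRECURSION 0); -- lift the default limit of 100 recursion levels
-- For the dates above this returns March through September 2014 (7 rows).
```

The month names come back in the session language, so the output depends on the `SET LANGUAGE` setting.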


SQL SERVER – Fix Error – Cannot open backup device. Operating system error 5(Access is denied.)

This has to be the most common error when taking database backups. There are many reasons for this error and there are different ways to resolve it. I recently faced this error while I was taking a backup and found a very interesting solution to the problem. The error was as mentioned below:

Msg 3201, Level 16, State 1, Line 1
Cannot open backup device ‘d:\Log\aw.bak’. Operating system error 5(Access is denied.).
Msg 3013, Level 16, State 1, Line 1
BACKUP DATABASE is terminating abnormally.

In my experience, this error happens when

  1. The Log directory does not exist
  2. There is not enough space on drive d:\
  3. There are not enough permissions on the d:\Log folder

In my case, none of the above was the reason for the error. The error occurred because a file was read-only. Another backup file named aw.bak already existed in the folder and it was marked as read-only, so my backup was failing. Once I cleared the Read-only checkbox in the file's properties, the backup was successful.
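A quick way to see the folder and file from SQL Server's side can be sketched as below. `xp_fileexist` is an undocumented extended procedure, so treat its use as an assumption. It also does not report the read-only attribute, which still has to be checked in the file's properties. The database name `[aw]` and the second file name are hypothetical.

```sql
-- Returns three flags: File Exists, File is a Directory, Parent Directory Exists.
-- The check runs in the security context of the SQL Server service.
EXEC master.dbo.xp_fileexist N'd:\Log\aw.bak';

-- If an existing aw.bak cannot be overwritten (for example, it is read-only),
-- one workaround is to write the backup to a new file name instead.
-- [aw] and aw_new.bak are hypothetical names used for illustration.
BACKUP DATABASE [aw] TO DISK = N'd:\Log\aw_new.bak';
```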

I have written a few other blog posts related to this error earlier.


SQL SERVER – Finding if Current Week is Odd or Even – Script

Here is an interesting question I received from a friend who works at a bank as a DBA.

“Pinal,

We have a requirement in the bank that in every 2nd and 4th week we keep more cash in our bank, whereas we can keep less cash in other weeks. I want to write an automated script which indicates whether the current week is ODD or EVEN. Based on this information, I can write more actions in my procedures. Do you have a script which can help me?”

Very interesting question. As a matter of fact, I have a script which I have been using for quite a while for similar logic. I did not write the script myself, but I have kept it as a resource for a long time. Here is the script.

DECLARE @CurDate DATETIME
SET @CurDate = GETDATE()
-- Week of month = week-of-year of the date minus week-of-year of the 1st of its month, plus 1
SELECT WeekOfMonth = DATEPART(wk, @CurDate)
                   - DATEPART(wk, DATEADD(m, DATEDIFF(m, 0, @CurDate), 0)) + 1,
       CASE WHEN (DATEPART(wk, @CurDate)
                - DATEPART(wk, DATEADD(m, DATEDIFF(m, 0, @CurDate), 0)) + 1) % 2 = 1
            THEN 'Odd' ELSE 'Even' END EvenOrOdd

If I run the above script for today’s date, 12/7/2014, it reports the second week of the month, which is Even (with the default us_english setting, where weeks start on Sunday).
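The result depends on which day the week starts on, so a reproducible version of the same calculation can pin that down. The sketch below assumes weeks start on Sunday (`SET DATEFIRST 7`, the us_english default) and uses a fixed date instead of `GETDATE()`. The variable `@WeekOfMonth` is introduced here only to avoid repeating the expression.

```sql
SET DATEFIRST 7;  -- assumption: weeks start on Sunday (us_english default)

DECLARE @CurDate DATETIME = '20141207';  -- fixed date so the result is reproducible

-- Week of month = week-of-year of the date minus week-of-year of the 1st of its month, plus 1
DECLARE @WeekOfMonth INT =
      DATEPART(wk, @CurDate)
    - DATEPART(wk, DATEADD(m, DATEDIFF(m, 0, @CurDate), 0))
    + 1;

SELECT @WeekOfMonth AS WeekOfMonth,
       CASE WHEN @WeekOfMonth % 2 = 1 THEN 'Odd' ELSE 'Even' END AS EvenOrOdd;
-- WeekOfMonth = 2, EvenOrOdd = Even
```

December 7, 2014 is a Sunday, so it opens the second calendar week of that month. With a Monday week start (`SET DATEFIRST 1`) the same date would still fall in the first week.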


SQL SERVER – Reverse String Word By Word

Earlier I wrote a blog post, User Defined Functions (UDF) to Reverse String, with a function that reverses an entire string character by character. However, recently I read a question in SQLBangalore where the user wanted to reverse the order of the words in a string while keeping each word itself intact. For example,

If the string is – “I am Pinal Dave”, it should be reversed as “Dave Pinal am I.”

Well, here is the function which does the same task.

CREATE FUNCTION [dbo].[fn_ReverseWordsInSentence]
(
    @ip VARCHAR(MAX)
)
RETURNS VARCHAR(MAX)
AS
BEGIN
    DECLARE @op VARCHAR(MAX)
    SET @op = ''
    WHILE LEN(@ip) > 0
    BEGIN
        IF CHARINDEX(' ', @ip) > 0
        BEGIN
            -- Move the first word of the input to the front of the output
            SET @op = SUBSTRING(@ip, 0, CHARINDEX(' ', @ip)) + ' ' + @op
            -- Drop that word from the input
            SET @ip = LTRIM(RTRIM(SUBSTRING(@ip, CHARINDEX(' ', @ip) + 1, LEN(@ip))))
        END
        ELSE
        BEGIN
            -- Last remaining word
            SET @op = @ip + ' ' + @op
            SET @ip = ''
        END
    END
    RETURN RTRIM(@op) -- remove the trailing space added by the loop
END
GO
-- Usage
SELECT [dbo].[fn_ReverseWordsInSentence] ('My Name is Pinal Dave')
/*
Dave Pinal is Name My
*/

Let me know if you have a better way to solve this problem.
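On newer versions there is a set-based alternative, sketched below. It assumes SQL Server 2022 or Azure SQL Database, because the third `enable_ordinal` argument of `STRING_SPLIT` is not available in earlier versions. `STRING_AGG` itself needs SQL Server 2017 or later.

```sql
DECLARE @ip VARCHAR(MAX) = 'My Name is Pinal Dave';

-- Split on spaces, keep each word's position (ordinal),
-- then glue the words back together in descending position order.
SELECT STRING_AGG(value, ' ') WITHIN GROUP (ORDER BY ordinal DESC) AS ReversedWords
FROM   STRING_SPLIT(@ip, ' ', 1)
WHERE  value <> '';  -- skip empty pieces produced by repeated spaces
-- Dave Pinal is Name My
```

Without the ordinal column, `STRING_SPLIT` does not guarantee the order of its rows, which is why this approach needs the third argument.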


SQL SERVER – FIX: Error 5161 – An Unexpected file id was Encountered

Recently I was planning to give a demo about adding tempdb files and the performance benefit of doing so. While preparing the demo at home, I was multitasking: talking to my daughter, eating and building the demo. I ran the script and restarted the SQL Server service, but the service didn’t start.

I looked into the ERRORLOG (and I would suggest you look at that log for any SQL Server startup issue) and found the errors below, just before the shutdown messages.

2014-12-02 17:03:24.42 spid18s     Error: 5161, Severity: 16, State: 1.

2014-12-02 17:03:24.42 spid18s     An unexpected file id was encountered. File id 3 was expected but 4 was read from “D:\TempDB\tempdev3.ndf”. Verify that files are mapped correctly in sys.master_files. ALTER DATABASE can be used to correct the mappings.

2014-12-02 17:03:24.42 spid18s     Could not create tempdb. You may not have enough disk space available. Free additional disk space by deleting other files on the tempdb drive and then restart SQL Server. Check for additional errors in the event log that may indicate why the tempdb files could not be initialized.

2014-12-02 17:03:24.93 spid18s     SQL Trace was stopped due to server shutdown. Trace ID = ‘1’. This is an informational message only; no user action is required.

Since the error messages before the shutdown were talking about the tempdb database, I knew that I might have done something wrong while multitasking. I went back to the query window and read the script completely. Initially I had added the files on the C drive (first two commands), but later realized that I should not keep them on C as I already have space issues there, so I moved them using MODIFY FILE (last two commands).

USE [master]
GO
ALTER DATABASE [tempdb] ADD FILE ( NAME = N'tempdev2',
FILENAME = N'C:\TempDB\tempdev2.ndf', SIZE = 1024MB, FILEGROWTH = 0)
GO
ALTER DATABASE [tempdb] ADD FILE ( NAME = N'tempdev3',
FILENAME = N'C:\TempDB\tempdev3.ndf', SIZE = 1024MB, FILEGROWTH = 0)
GO

-- Space issues on C, moving them to D
ALTER DATABASE [tempdb] MODIFY FILE
( NAME = tempdev2, FILENAME = N'D:\TempDB\tempdev3.ndf', SIZE = 10MB, FILEGROWTH = 0)
GO
ALTER DATABASE [tempdb] MODIFY FILE
( NAME = tempdev3, FILENAME = N'D:\TempDB\tempdev3.ndf', SIZE = 10MB, FILEGROWTH = 0)
GO

I called my daughter to check, and she told me that she could see tempdev3 too many times. Bummer! The second-to-last command should have said tempdev2.ndf.

I understood the problem, but SQL Server is not starting now, so how can I modify it? Well, the easiest option is to remove the file and add it again with the proper name.

1. Start SQL Server with trace flag 3608

From the command prompt, we can use

net start mssqlserver /T3608

My machine has a default instance, which is why I have used MSSQLServer. For a named instance, we need to use MSSQL$<InstanceName>.

If we open the ERRORLOG, we should see the message below.

Recovering only master database because traceflag 3608 was specified. This is an informational message only. No user action is required.

2. Run ALTER DATABASE

Run the ALTER DATABASE command to remove the unwanted file. I connected via SQLCMD and removed tempdev2, because that logical file was mapped to the tempdev3.ndf physical file. This can be done from Management Studio as well.
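The exact command is not shown in the post, so the following is only a sketch of what the removal described above could look like. The logical file name `tempdev2` comes from the script earlier in the post. The commented-out alternative is an assumption added here, not the method the post describes.

```sql
-- Run while SQL Server is started with trace flag 3608.
-- Remove the logical file whose mapping points at the wrong physical file.
ALTER DATABASE [tempdb] REMOVE FILE tempdev2;
GO

-- Alternative (assumption, not from the original post): keep the logical file
-- and point it at its own physical file instead of removing it.
-- ALTER DATABASE [tempdb] MODIFY FILE
-- ( NAME = tempdev2, FILENAME = N'D:\TempDB\tempdev2.ndf' );
```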

We can also run the query below to find the current mapping. Database ID 2 is always the tempdb database.

SELECT name, physical_name FROM sys.master_files WHERE database_id = 2

In my case, I got the output below before the removal.

NAME                PHYSICAL_NAME
-------------       -------------------------------
tempdev             E:\...\tempdb.mdf
templog             E:\...\templog.ldf
tempdev2            E:\TempDB\tempdev3.ndf
tempdev3            E:\TempDB\tempdev3.ndf

Based on the output you receive, you have to make appropriate changes.

3. Stop the SQL Server service and start it normally.

Once the above steps are performed, we should be able to connect to SQL Server and make the changes to tempdb correctly this time.
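A sketch of what "correctly this time" could look like is shown below. It re-adds the removed file directly on D: under its own name, then checks that no physical file is mapped twice before the next restart. The path and sizes mirror the original script and are assumptions about the target server.

```sql
USE [master]
GO
-- Re-add the second data file, this time directly on D: with its own file name.
-- Path and size mirror the original script; adjust them for your server.
ALTER DATABASE [tempdb] ADD FILE ( NAME = N'tempdev2',
FILENAME = N'D:\TempDB\tempdev2.ndf', SIZE = 1024MB, FILEGROWTH = 0)
GO

-- Sanity check before the next restart: no physical file should be mapped twice.
SELECT physical_name, COUNT(*) AS mapped_files
FROM   sys.master_files
WHERE  database_id = 2
GROUP BY physical_name
HAVING COUNT(*) > 1;
-- An empty result means every tempdb file has a unique physical path.
```

Running the duplicate check after any MODIFY FILE on tempdb would have caught the original mistake before the service restart.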

Have you ever encountered similar tempdb errors? How did you fix them?
