SQL SERVER – Keeping MSDB System Database Lean and Fit – Notes from the Field #033

[Notes from Pinal]: I have been working with SQL Server for almost 10 years, and I have seen that the most ignored databases on any database server are the system databases. We hardly pay attention to them. Just for the sake of it, run a query against your master database and see how many user tables already exist there. I am sure that unless you have strict policies in place, there will be a handful of user tables sitting there. Similarly, ask a DBA about the exact importance of the MSDB database and you will find that many DBAs do not know the answer. I often talk about this with my friend Tim, so I asked him if he could help us out with the tricks to keep the MSDB system database at its optimal settings.

Linchpin People are database coaches and wellness experts for a data-driven world. In this 33rd episode of the Notes from the Field series, database expert Tim Radney (partner at Linchpin People) explains in very simple words how to keep the MSDB system database at its optimal settings.


When doing a review of a client’s database server, one of the checks I perform is to see how much backup history is being kept within MSDB. SQL Server does a fantastic job of logging all the backups and restores we perform, and depending on how your routine is configured, you are probably only keeping enough backup files to meet your SLAs. What most DBAs or admins don’t think about is all the backup and restore history being retained within MSDB; a quick way to check is shown after the table list below.

The tables within msdb that hold this history include:

  • restorefile
  • restorefilegroup
  • restorehistory
  • backupfile
  • backupfilegroup
  • backupmediafamily
  • backupmediaset
  • backupset
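
A quick way to see how much history those tables are actually holding is to summarize backupset directly. This is a minimal illustrative check against the documented msdb schema:

SELECT  COUNT(*) AS backup_rows,             -- total rows of backup history
        MIN(backup_start_date) AS oldest_entry,
        MAX(backup_start_date) AS newest_entry
FROM    msdb.dbo.backupset;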

Microsoft SQL Server has a system stored procedure that helps us maintain this history; however, it is not scheduled to run by default, so we have to create a process to run it on a routine basis. You will need to know how much history is enough to keep for your organization. I have seen this range from 14 days to 90 days.

The stored procedure is sp_delete_backuphistory within the MSDB database.

This stored procedure takes a single date parameter, which is the cutoff date for the history to keep: any history older than the date provided will be deleted.

Over the years I have found MSDB ranging from several megabytes to nearly 20 GB. I recall one client who had a database server with hundreds of databases being log shipped to a secondary with 15-minute log backups, and the system had been in place for many years. MSDB had grown to almost 20 GB, and the client complained that whenever they tried to restore a database through the GUI it would lock up SSMS.

A simple script to purge history older than 30 days is below.

USE msdb
GO
-- Keep only the last 30 days of backup and restore history
DECLARE @CutOffDate DATETIME
-- Strip the time portion so the cutoff is midnight, 30 days ago
SET @CutOffDate = CONVERT(VARCHAR(10), DATEADD(dd, -30, GETDATE()), 101)
EXEC sp_delete_backuphistory @CutOffDate
GO
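
If you want this purge to run on a schedule, one option is to wrap it in a SQL Server Agent job. The sketch below is illustrative only (the job name, schedule, and 30-day retention are assumptions you should adapt to your own policy); it uses the standard msdb job procedures:

USE msdb
GO
-- Illustrative job name and schedule; adjust the retention to your own policy
EXEC dbo.sp_add_job @job_name = N'Purge backup history'
EXEC dbo.sp_add_jobstep
     @job_name = N'Purge backup history',
     @step_name = N'Delete history older than 30 days',
     @subsystem = N'TSQL',
     @database_name = N'msdb',
     @command = N'DECLARE @CutOffDate DATETIME
                  SET @CutOffDate = CONVERT(VARCHAR(10), DATEADD(dd, -30, GETDATE()), 101)
                  EXEC sp_delete_backuphistory @CutOffDate'
EXEC dbo.sp_add_jobschedule
     @job_name = N'Purge backup history',
     @name = N'Weekly - Sunday 2 AM',
     @freq_type = 8,                 -- weekly
     @freq_interval = 1,             -- Sunday
     @freq_recurrence_factor = 1,
     @active_start_time = 020000     -- 2:00 AM
EXEC dbo.sp_add_jobserver @job_name = N'Purge backup history'
GO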

If you want me to take a look at your server and its settings, or if your server is facing any issue, we can Fix Your SQL Server.

Note: Tim has also written an excellent book on SQL Backup and Recovery, a must have for everyone.

Reference: Pinal Dave (http://blog.sqlauthority.com)


SQL SERVER – SQL Server High Availability Options – Notes from the Field #032

[Notes from Pinal]: When it comes to High Availability or Disaster Recovery, I often see people getting confused. There are so many options available that when users have to select the most optimal solution for their organization, they are often confused. Most people even know the salient features of the various options, but when they have to pick one single option to use, they are often not sure which one to choose. I like to ask my dear friend Tim all these kinds of complicated questions; he has a skill for making a complex subject very simple and easy to understand.

Linchpin People are database coaches and wellness experts for a data-driven world. In this 32nd episode of the Notes from the Field series, database expert Tim Radney (partner at Linchpin People) explains in very simple words the best High Availability options for your SQL Server.


When working with SQL Server, a common challenge we face is providing the maximum uptime possible. To meet this demand we have to design a solution that provides High Availability (HA). Depending on your edition, Microsoft SQL Server provides you with several options: database mirroring, log shipping, failover clustering, availability groups, or replication.
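
As a starting point, it helps to confirm what the instance you are working with reports about itself. A quick, generic check using SERVERPROPERTY (shown here only for illustration) is:

SELECT  SERVERPROPERTY('Edition')       AS edition,                    -- governs which HA features are available
        SERVERPROPERTY('IsClustered')   AS is_failover_cluster_node,   -- 1 if this instance is part of a failover cluster
        SERVERPROPERTY('IsHadrEnabled') AS is_alwayson_enabled;        -- 1 if Availability Groups are enabled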

Each possible solution comes with pros and cons. No single solution fits all scenarios, so understanding which solution meets which need is important. As with anything IT related, you need to fully understand your requirements before trying to solve the problem. When it comes to building an HA solution, you need to understand which risk your organization needs to mitigate the most.

I have found that most are concerned about hardware and OS failures. Other common concerns are data corruption or storage issues. For data corruption or storage issues, you can mitigate those concerns by keeping a second copy of the databases; that can be accomplished with database mirroring, log shipping, replication, or availability groups with a secondary replica. Failover clustering and virtualization with shared storage do not provide redundancy of the data.

I recently created a chart outlining some pros and cons of each of these technologies and posted it on my blog.

I like to use this chart to help illustrate how each technology provides a certain number of benefits. Each of these solutions carries with it some level of cost and complexity. As database professionals we should all be familiar with these technologies so we can make the best possible choice for our organization.

If you want me to take a look at your server and its settings, or if your server is facing any issue, we can Fix Your SQL Server.

Note: Tim has also written an excellent book on SQL Backup and Recovery, a must have for everyone.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Good Value for Page Life Expectancy – Notes from the Field #026

[Notes from Pinal]: In the past, I have been wrong many times, but I was man enough to accept my mistakes and correct myself. Page Life Expectancy is a very similar subject for me. In the past, when I wrote a blog post based on Microsoft’s white paper, I was immediately corrected by SQL experts for my error in judgement and incorrect information. I accepted my mistakes and corrected them. I shared this story with my good friend Tim Radney, and he was very kind to give me guidance on this subject. I asked him if he could write further on this topic and help people understand this complex subject in simple words.

Linchpin People are database coaches and wellness experts for a data-driven world. In this 26th episode of the Notes from the Field series, database expert Tim Radney (partner at Linchpin People) explains in very simple words what a good value for Page Life Expectancy (PLE) should be.


When troubleshooting SQL Server performance issues, one of the top places to look is memory. One of the more common methods of checking for memory pressure is to look at the performance counter ‘Page Life Expectancy’ (PLE). The value returned is represented in seconds. A very old and dated recommendation is that this value be 300 or higher, which is only 5 minutes. With the standard servers in place today, this value is too low.

What is the proper value to monitor for your server, you may ask? Well, it really depends on the amount of memory allocated to your instance of SQL Server. When working with my clients, the value I start with is (max memory in GB) / 4 * 300. This is the minimum value I would expect to see. On a server with 32 GB allocated to SQL Server this value would be (32 / 4) * 300 = 2,400 seconds, or 40 minutes.

For a server with 32 GB of RAM, maintaining a PLE value of 2,400 or higher is a good target, but what is equally important is knowing the baseline value for the instance. What is the average value the server has at certain times during the day? If you capture this value and alarm/alert when there are big dips, you will be better equipped to get early detection of potential issues such as improperly indexed queries, large data pulls, and so on.

I blogged a while back and included the query that I run to retrieve Page Life Expectancy: http://timradney.com/2013/03/08/what-is-page-life-expectancy-ple-in-sql-server/
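
For convenience, a generic version of such a query (not necessarily the exact one from that post) reads the counter from sys.dm_os_performance_counters:

SELECT  [object_name],
        counter_name,
        cntr_value AS page_life_expectancy_seconds
FROM    sys.dm_os_performance_counters
WHERE   counter_name = 'Page life expectancy'
        AND [object_name] LIKE '%Buffer Manager%';   -- overall value; Buffer Node shows per-NUMA-node values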

If you want me to take a look at your server and its settings, or if your server is facing any issue, we can Fix Your SQL Server.

Note: Tim has also written an excellent book on SQL Backup and Recovery, a must have for everyone.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Database File Names and Extensions – Notes from the Field #025

[Notes from Pinal]: People believe Shakespeare’s birth date is April 26. “What’s in a name?” – I often hear this statement, which is a famous quote from Shakespeare. I think we developers believe in this statement, as most of our variables are named either BAR or FOO. However, this is extremely inefficient and inconvenient. There are many reasons we should give meaningful names to our objects, variables, and other elements. There are times when an incorrect naming convention can get a user in trouble or lead an organization to disaster. My friend Tim shares this humorous and interesting story. Read on.

Linchpin People are database coaches and wellness experts for a data-driven world. In this 25th episode of the Notes from the Field series, database expert Tim Radney (partner at Linchpin People) explains in very simple words the importance of database file names and extensions.


Database file naming standards are important. With Microsoft SQL Server the default naming standard is not a requirement, so administrators can change from the default .mdf, .ldf, and .ndf extensions. Recently I encountered a SQL Server instance that was having performance issues. The customer was reporting very slow response times and couldn’t figure out what the issue was.

Reluctantly, the administrator reached out to the database team to get a professional opinion. For the database team this was the first we had learned of this new server, so we quickly performed our standard checks. Two things stood out: high CPU utilization, with the executable consuming the most CPU being the antivirus software, and high disk IO. We stopped the “on access” scan, and performance of the SQL Server improved while disk IO dropped very low.

We reached out to our antivirus team and were assured that our standard policies were applied to this server, which include exclusions for our standard database file names. We continued to research deviations from our standard server build against this instance, since we did not build it, and that is when we found that the admin/vendor/customer (non-DBA) had chosen a different naming standard for the files. The extensions they chose were .001 for data files, .002 for log files, and .003 for secondary (ndf) data files.

Because the antivirus software did not have proper exclusions for those files, the customer had effectively created a denial of service attack against themselves. The on-access scan created such a load that it prevented anything else from running.

This is not the first time I have encountered this type of issue; I blogged about it in 2012: http://timradney.com/2012/06/18/file-extensions-for-sql-server-database-files/

If you decide that, for whatever reason, you would like to deviate from the default naming standard or your company’s naming standard for database files, please work with the people who manage your antivirus software to make sure they create exclusions for your new naming standard.
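
If you are not sure which extensions are actually in use on an instance, a quick inventory query (illustrative) against sys.master_files lists every file name and path so they can be checked against the antivirus exclusions:

SELECT  DB_NAME(database_id) AS database_name,
        name                 AS logical_name,
        type_desc,                          -- ROWS or LOG
        physical_name                       -- full path, including the extension in use
FROM    sys.master_files
ORDER BY database_name, type_desc;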

If you want me to take a look at your server and its settings, or if your server is facing any issue, we can Fix Your SQL Server.

Note: Tim has also written an excellent book on SQL Backup and Recovery, a must have for everyone.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Finding Jobs Shrinking Database Files – Notes from the Field #023

[Notes from Pinal]: Search the internet for ‘Shrinking Database’ and you will find plenty of advice on why it is bad and should not be done. If you do not believe me, just try it yourself. Even I have blogged before that it is an absolutely bad thing to do. However, there are very few blogs that talk about how to solve this ancient problem. The reason I call it an ancient problem is that there are quite a few organizations which have been doing this for many years, and they have many different places where they have included database-shrinking code. In quite a few places, I see that people have created database jobs to shrink the database. I was talking to Tim about this and he came up with a beautifully simple script that demonstrates how to find the jobs which are shrinking database files.

Linchpin People are database coaches and wellness experts for a data-driven world. In this 23rd episode of the Notes from the Field series, database expert Tim Radney (partner at Linchpin People) explains a very simple script to find jobs that shrink database files.


Often when analyzing a SQL Server instance I come across jobs that automate shrinking a database file. Checking for shrink operations is important when analyzing a server for a number of reasons. One is that any time a data file or log file has to grow, transactions have to wait until the growth operation is complete, causing a performance impact. Shrinking a log file can contribute to high virtual log file (VLF) counts, and shrinking a data file will fragment the database.
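
As a side note, one quick (if undocumented) way to gauge the VLF count for the current database is DBCC LOGINFO; every row it returns represents one virtual log file:

DBCC LOGINFO;   -- the number of rows returned equals the number of VLFs in the current database's log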

I check to see if any database maintenance plans exist, and if so, whether auto shrink has been chosen; occasionally I find this is the case. I also have a T-SQL script that searches for any reference to the word ‘shrink’ in a job step. I find custom jobs like this more often than auto shrink being used in a maintenance plan. Often the justification for configuring a job like this is drive space issues.

In cases where the shrink is done to reclaim drive space, it is most often due to not having a proper backup routine in place for the transaction logs, or to a process that keeps a transaction open for an extended period of time. The transaction log needs to be large enough to handle your index maintenance, ETL processes, and the transactions that occur between log backups. If you find that you are having erratic, large growth, then you will need to examine which transactions or processes are holding an active transaction open for such a long duration, or reconsider the frequency of your log backups.
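
To see how frequently log backups are actually occurring for a given database, a simple check against the backup history (illustrative; substitute your own database name) is:

SELECT TOP (20)
        database_name,
        backup_start_date,
        backup_finish_date
FROM    msdb.dbo.backupset
WHERE   type = 'L'                          -- 'L' = transaction log backup
        AND database_name = 'YourDatabase'  -- hypothetical name; replace with your own
ORDER BY backup_start_date DESC;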

Below is the script I use to search for the word ‘shrink’ in any T-SQL job step.

-- Search all SQL Agent job steps for a keyword (here, 'shrink')
DECLARE @Search VARCHAR(100)
SET @Search = 'shrink'

SELECT  A.[job_id],
        B.[name],
        A.[step_id],
        A.[step_name],
        A.[command],
        A.[database_name]
FROM    [msdb].[dbo].[sysjobsteps] A
        JOIN [msdb].[dbo].[sysjobs] B
            ON A.[job_id] = B.[job_id]
WHERE   A.[command] LIKE '%' + @Search + '%'
ORDER BY A.[database_name],
         B.[name],
         A.[step_id]

If you want me to take a look at your server and its settings, or if your server is facing any issue, we can Fix Your SQL Server.

Note: Tim has also written an excellent book on SQL Backup and Recovery, a must have for everyone.



Reference: Pinal Dave (http://blog.sqlauthority.com)