If you are an SQL Server Consultant, there is never a single dull moment in your life. Quite often you are called in for fixing something, but then you always end up fixing something else!
I was recently working on an offshore project where I was called in to tune high transaction OLTP server. During work, I demanded that I should have a server which is very similar to live database so I could inspect all the settings and data. I may end up running a few queries which may or may not change the server settings. The Sr. DBA agreed and instructed the Jr. DBA accordingly to restore a database from their backup.
In a few minutes, their Jr. DBA arrived running to us, asking for our help. The problem was that when he attempted to restore the database, he was not able to restore the database as their backup. This eventually threw an error. Now this was very alarming for them because their backup file had some issues. I suggested trying some backups but again, the same issue occurred on all of them.
The CTO, who was also present at the location, got very upset with this situation. He then asked when the last successful restore test was done. As expected, the answer was NEVER. There were no successful restore tests done before.
During that time, I was present and I could clearly see the stress, confusion, carelessness and anger around me. I did not appreciate the feeling and I was pretty sure that no one in there wanted the atmosphere like me.
The next step was to take a backup of their database right away. The CTO wanted to observe the complete process so he could figure out where the error originated and where was its exact location.
Here is the error they were receiving when they attempted to restore the database:
Msg 3243, Level 16, State 1, Line 1
The media family on device ‘D:\TestDB.bak’ was created using Microsoft Tape Format version 1.22. SQL Server supports version 1.0.
Msg 3013, Level 16, State 1, Line 1
RESTORE DATABASE is terminating abnormally.
The first thing I attempted to do is to take a backup of the database on another spare drive by myself. After having that backup, I tried to restore it to the database. As what I expected, it was successfully restored. Now that event provided me the information that there was nothing wrong with how the backup was done. The problem was located somewhere else. I asked them about their current process of taking a backup and storing it. They provided me the following diagram:
Looking at the diagram, I decided to test their FTP software. I uploaded our organization’s FTP space. After it was done, I restored the database again and it worked without an error. This led me to think that there is something wrong in their UNIX server. After talking to their UNIX administrator, he informed me that they were running some cron jobs to compress the files and they were also moving those files to a larger drive. After listening to this, I was very sure there was something wrong during that process which had something to do with the .bak file because it was not able to work properly.
After a few tests, the UNIX administrator accepted that that cron job was changing the structure of the file. In the meantime, he disabled the process to avoid this corruption. The CTO then asked me if there was any way they could tell if the backup which was already taken was good as gold or if there was a corruption in it.
Here is the command that you can run on your database file so you can check if the database file is intact or if it has any corruption in it:
The question is, “Do you practice this on your production server’s backup?”
Reference: Pinal Dave (http://blog.SQLAuthority.com)