SQL SERVER – Keeping MSDB System Database Lean and Fit – Notes from the Field #033

[Notes from Pinal]: I have been working with SQL Server for almost 10 years. The most ignored databases on any database server are the system databases. We hardly pay attention to them. Just for the sake of it, run a query against your master database and see how many user tables already exist there; I am sure that unless you have strict policies in place, there will be a handful of them. Similarly, ask a DBA about the exact importance of the MSDB database and you will find that many DBAs do not know. I often talk about this with my friend Tim, so I asked him if he could help us out with tricks to keep the MSDB system database at its optimal settings.

Linchpin People are database coaches and wellness experts for a data driven world. In this 33rd episode of the Notes from the Field series, database expert Tim Radney (partner at Linchpin People) explains in very simple words how to keep the MSDB system database at its optimal settings.


When doing a review of a client’s database server, one of the checks I perform is to see how much backup history is being kept within MSDB. SQL Server does a fantastic job logging all the backups and restores we perform, and depending on how your routine is configured, you are probably only keeping enough backup files to meet your SLAs. What most DBAs or admins don’t think about is all the backup and restore history being retained within MSDB.

The tables within msdb that hold this history include:

  • restorefile
  • restorefilegroup
  • restorehistory
  • backupfile
  • backupfilegroup
  • backupmediafamily
  • backupmediaset
  • backupset

Microsoft SQL Server has a system stored procedure that helps us maintain this history; however, it is not scheduled to run by default. We have to create a process to run the stored procedure on a routine basis. You will need to know how much history is enough to keep for your organization. I have seen this range from 14 days to 90 days.
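Before purging anything, it helps to see how far back your history actually goes. A quick check along these lines works (a minimal sketch; dbo.backupset and its backup_start_date column are the documented msdb objects behind the list above):

USE msdb
GO
-- How old is the oldest recorded backup, and how many rows of history are we carrying?
SELECT MIN(backup_start_date) AS oldest_backup_recorded,
       COUNT(*) AS backupset_rows
FROM dbo.backupset
GO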

The stored procedure is sp_delete_backuphistory within the MSDB database.

This stored procedure takes a single date parameter. This date is the cutoff date for the history to keep; any history older than the date provided will be deleted.

Over the years I have found MSDB ranging from several megabytes to nearly 20 GB. I recall one client who had a database server with hundreds of databases being log shipped to a secondary with 15-minute log backups, and the system had been in place for many years. MSDB had grown to almost 20 GB. The client complained that if they ever tried to restore a database within the GUI, it would lock up SSMS.

A simple script to purge this history older than 30 days is below.

USE msdb
GO
DECLARE @CutOffDate DATETIME
SET @CutOffDate = CONVERT(VARCHAR(10), DATEADD(dd, -30, GETDATE()), 101)
EXEC sp_delete_backuphistory @CutOffDate
GO
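
To run this on a routine basis, one option is a SQL Server Agent job. The following is just a sketch under stated assumptions: the job name and the daily 2 AM schedule are illustrative, and the step command mirrors the script above.

USE msdb
GO
-- Create a job with a single T-SQL step that purges history older than 30 days
EXEC dbo.sp_add_job @job_name = N'Purge backup history';
EXEC dbo.sp_add_jobstep
     @job_name = N'Purge backup history',
     @step_name = N'Run sp_delete_backuphistory',
     @subsystem = N'TSQL',
     @database_name = N'msdb',
     @command = N'DECLARE @CutOffDate DATETIME = CONVERT(VARCHAR(10), DATEADD(dd, -30, GETDATE()), 101);
EXEC sp_delete_backuphistory @CutOffDate;';
-- Schedule it daily at 2:00 AM (freq_type 4 = daily)
EXEC dbo.sp_add_jobschedule
     @job_name = N'Purge backup history',
     @name = N'Daily 2 AM',
     @freq_type = 4,
     @freq_interval = 1,
     @active_start_time = 020000;
-- Target the local server
EXEC dbo.sp_add_jobserver @job_name = N'Purge backup history';
GO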

If you want me to take a look at your server and its settings, or if your server is facing any issue we can Fix Your SQL Server.

Note: Tim has also written an excellent book on SQL Backup and Recovery, a must have for everyone.

Reference: Pinal Dave (http://blog.sqlauthority.com)


SQL SERVER – SQL Server High Availability Options – Notes from the Field #032

[Notes from Pinal]: When it comes to High Availability or Disaster Recovery, I often see people getting confused. There are so many options available that when users have to select the most optimal solution for their organization, they are often confused. Most people even know the salient features of the various options, but when they have to pick one single option to use, they are often not sure which one. I like to ask my dear friend Tim all these kinds of complicated questions. He has a skill for making a complex subject very simple and easy to understand.

Linchpin People are database coaches and wellness experts for a data driven world. In this 32nd episode of the Notes from the Field series, database expert Tim Radney (partner at Linchpin People) explains in very simple words the best High Availability options for your SQL Server.


A common challenge we face when working with SQL Server is providing the maximum uptime possible.  To meet these demands we have to design a solution that provides High Availability (HA). Depending on your edition, Microsoft SQL Server provides you with several options: database mirroring, log shipping, failover clustering, availability groups, or replication.

Each possible solution comes with pros and cons.  No one solution fits all scenarios, so understanding which solution meets which need is important.  As with anything IT related, you need to fully understand your requirements before trying to solve the problem.  When it comes to building an HA solution, you need to understand which risk your organization needs to mitigate the most.

I have found that most are concerned about hardware failures and OS failures. Other common concerns are data corruption or storage issues.  For data corruption or storage issues you can mitigate those concerns by having a second copy of the databases. That can be accomplished with database mirroring, log shipping, replication, or availability groups with a secondary replica.  Failover clustering and virtualization with shared storage do not provide redundancy of the data.
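
If you are not sure which of your databases already have a second copy, a quick inventory query against the standard catalog views can help. This is only a sketch: it requires SQL Server 2012 or later for the availability group DMV, and it covers mirroring and availability groups, not log shipping or replication.

SELECT d.name,
       dm.mirroring_role_desc,
       dm.mirroring_state_desc,
       drs.synchronization_state_desc AS ag_synchronization_state
FROM sys.databases AS d
LEFT JOIN sys.database_mirroring AS dm
       ON dm.database_id = d.database_id
LEFT JOIN sys.dm_hadr_database_replica_states AS drs
       ON drs.database_id = d.database_id
      AND drs.is_local = 1
WHERE d.database_id > 4;  -- skip the system databases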

I recently created a chart outlining some pros and cons of each of the technologies that I posted on my blog.

I like to use this chart to help illustrate how each technology provides a certain number of benefits.  Each of these solutions carries with it some level of cost and complexity.  As database professionals we should all be familiar with these technologies so we can make the best possible choice for our organization.

If you want me to take a look at your server and its settings, or if your server is facing any issue we can Fix Your SQL Server.

Note: Tim has also written an excellent book on SQL Backup and Recovery, a must have for everyone.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Planned and Unplanned Availability Group Failovers – Notes from the Field #031

[Note from Pinal]: This is a new episode of the Notes from the Field series. AlwaysOn is a very complex subject and not everyone knows much about it. The fact of the matter is that there is very little information available on this subject online, so when a very common question related to AlwaysOn comes up, people get confused.

In this episode of the Notes from the Field series database expert John Sterrett (Group Principal at Linchpin People) explains a very common issue DBAs and developers face in their careers: planned and unplanned Availability Group failovers. Linchpin People are database coaches and wellness experts for a data driven world. Read the experience of John in his own words.


Whenever a disaster occurs it will be a stressful scenario, regardless of how small or big the disaster is. This gets multiplied when it is your first time working with newer technology or the first time you are going through a disaster without a proper run book. Today, we’re going to help you establish a run book for availability group failovers.

To keep today’s session simple, we’re going to have two instances of SQL Server 2012 included in an availability group and walk through the steps of doing an unplanned failover.  We will focus on using the user interface and T-SQL to complete the failovers. We are going to use a two-replica availability group where each replica is in a different location; therefore, we will be covering asynchronous commit mode (non-automatic failover). The following is a breakdown of the availability group utilized today.
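
The breakdown itself does not survive here in text form. If you want to pull the equivalent information from your own primary replica, a query along these lines against the documented availability group catalog views works (a sketch; it simply lists each replica with its commit and failover mode):

SELECT ag.name                   AS availability_group,
       ar.replica_server_name,
       ar.availability_mode_desc, -- SYNCHRONOUS_COMMIT or ASYNCHRONOUS_COMMIT
       ar.failover_mode_desc      -- AUTOMATIC or MANUAL
FROM sys.availability_groups AS ag
JOIN sys.availability_replicas AS ar
  ON ar.group_id = ag.group_id;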

Seeing the following screen might be scary the first time you come across an unplanned failover.  It looks like our test database used in this availability group is not functional, and right now it isn’t. The database status is “Not Synchronizing”, which makes sense because the primary replica went down, so it couldn’t synchronize. With that said, we can still fail over and make it functional while we troubleshoot why we lost our primary replica.

To start, we are going to right-click on the availability group that needs to be failed over and select Failover.

This will bring up the following wizard, which will walk you through several steps needed to complete the failover using the graphical user interface provided with SQL Server Management Studio (SSMS).

You are going to see warning messages simply because we are in asynchronous commit mode and cannot guarantee ‘no data loss’ when we fail over.

Just in case you missed it, you get another screen warning you about potential data loss because we are in asynchronous mode.

Next we get to connect to the specific replica we want to become the primary replica after the failover occurs. In our case, we only have two replicas so this is trivial.

In order to failover, it’s required to connect to the replica that will become primary.  The following screen shows that the connection has been made successfully.

Next, you will see the final summary screen. Once again, this reminds you that the failover action can cause data loss as we’re using asynchronous commit mode due to the distance between the instances used for disaster recovery.

Finally, once the failover is completed you will see the following screen.

If you have followed along this far, you might be wondering what T-SQL scripts are generated by clicking through all the sections of the wizard. If you have used database mirroring in the past you might be surprised.  It’s not too different, which makes sense because the data is being replicated via SQL Server endpoints just like good old database mirroring.

Now we’re going to take a look at how to do a failover with just T-SQL. First, we’re going to need to open a new query window and run our query in SQLCMD mode. Just in case you haven’t used SQLCMD mode before, we will show you how to enable it below.

Now you can run the following statement. Notice that we connect to the replica we want to become primary after the failover and specify FORCE_FAILOVER_ALLOW_DATA_LOSS. We can use the same approach to fail back when our original primary instance comes back online.

-- YOU MUST EXECUTE THE FOLLOWING SCRIPT IN SQLCMD MODE.
:Connect SQL2012PROD1
ALTER AVAILABILITY GROUP [AGSQL2] FORCE_FAILOVER_ALLOW_DATA_LOSS;
GO
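
Two follow-up notes, offered as hedged sketches rather than as part of John’s original walkthrough (the database name below is illustrative; the server and group names simply reuse the ones from the script above). After a forced failover, the databases on the replica that is now secondary stay suspended until you resume data movement, and once the replicas are synchronized in synchronous commit mode you can instead issue a planned failover that does not allow data loss.

-- On the replica that is now secondary, resume data movement for each suspended database.
ALTER DATABASE [TestDB] SET HADR RESUME;
GO
-- Planned failover (no data loss): connect to the replica you want to become primary.
-- Valid only when that replica is in synchronous commit mode and the databases are synchronized.
:Connect SQL2012PROD1
ALTER AVAILABILITY GROUP [AGSQL2] FAILOVER;
GO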

Are your servers running at optimal speed or are you facing any SQL Server Performance Problems? If you want to get started with the help of experts read more over here: Fix Your SQL Server.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Parsing SSIS Catalog Messages – Notes from the Field #030

[Note from Pinal]: This is a new episode of the Notes from the Field series. SQL Server Integration Services (SSIS) is one of the most essential parts of the entire Business Intelligence (BI) story. It is a platform for data integration and workflow applications. The tool may also be used to automate maintenance of SQL Server databases and updates to multidimensional cube data.

In this episode of the Notes from the Field series I requested SSIS expert Andy Leonard to discuss one of the most interesting concepts of SSIS: catalog messages. There is plenty of interesting and useful information captured in the SSIS catalog, and we will learn together how to explore it.


The SSIS Catalog captures a lot of cool information by default. Here’s a query I use to parse messages from the catalog.operation_messages table in the SSISDB database, where the logged messages are stored.

This query is set up to parse a default message transmitted by the Lookup Transformation. It’s one of my favorite messages in the SSIS log because it gives me excellent information when I’m tuning SSIS data flows. The message reads similar to:

Data Flow Task:Information: The Lookup processed 4485 rows in the cache. The processing time was 0.015 seconds. The cache used 1376895 bytes of memory.

The query:

USE SSISDB
GO
DECLARE @MessageSourceType INT = 60
DECLARE @StartOfIDString VARCHAR(100) = 'The Lookup processed '
DECLARE @ProcessingTimeString VARCHAR(100) = 'The processing time was '
DECLARE @CacheUsedString VARCHAR(100) = 'The cache used '
DECLARE @StartOfIDSearchString VARCHAR(100) = '%' + @StartOfIDString + '%'
DECLARE @ProcessingTimeSearchString VARCHAR(100) = '%' + @ProcessingTimeString + '%'
DECLARE @CacheUsedSearchString VARCHAR(100) = '%' + @CacheUsedString + '%'
SELECT operation_id
, SUBSTRING(MESSAGE,
(
PATINDEX(@StartOfIDSearchString,MESSAGE) + LEN(@StartOfIDString) + 1),
((
CHARINDEX(' ',
MESSAGE,
PATINDEX(@StartOfIDSearchString,MESSAGE) + LEN(@StartOfIDString) + 1))
-
(
PATINDEX(@StartOfIDSearchString, MESSAGE) + LEN(@StartOfIDString) + 1))) AS LookupRowsCount
, SUBSTRING(MESSAGE,
(
PATINDEX(@ProcessingTimeSearchString,MESSAGE) + LEN(@ProcessingTimeString) + 1),
((
CHARINDEX(' ',
MESSAGE,
PATINDEX(@ProcessingTimeSearchString,MESSAGE) + LEN(@ProcessingTimeString) + 1))
-
(
PATINDEX(@ProcessingTimeSearchString, MESSAGE) + LEN(@ProcessingTimeString) + 1))) AS LookupProcessingTime
, CASE WHEN (CONVERT(numeric(3,3),SUBSTRING(MESSAGE,
(
PATINDEX(@ProcessingTimeSearchString,MESSAGE) + LEN(@ProcessingTimeString) + 1),
((
CHARINDEX(' ',
MESSAGE,
PATINDEX(@ProcessingTimeSearchString,MESSAGE) + LEN(@ProcessingTimeString) + 1))
-
(
PATINDEX(@ProcessingTimeSearchString, MESSAGE) + LEN(@ProcessingTimeString) + 1))))) = 0
THEN 0
ELSE CONVERT(bigint,SUBSTRING(MESSAGE,
(
PATINDEX(@StartOfIDSearchString,MESSAGE) + LEN(@StartOfIDString) + 1),
((
CHARINDEX(' ',
MESSAGE,
PATINDEX(@StartOfIDSearchString,MESSAGE) + LEN(@StartOfIDString) + 1))
-
(
PATINDEX(@StartOfIDSearchString, MESSAGE) + LEN(@StartOfIDString) + 1))))
/
CONVERT(numeric(3,3),SUBSTRING(MESSAGE,
(
PATINDEX(@ProcessingTimeSearchString,MESSAGE) + LEN(@ProcessingTimeString) + 1),
((
CHARINDEX(' ',
MESSAGE,
PATINDEX(@ProcessingTimeSearchString,MESSAGE) + LEN(@ProcessingTimeString) + 1))
-
(
PATINDEX(@ProcessingTimeSearchString, MESSAGE) + LEN(@ProcessingTimeString) + 1))))
END AS LookupRowsPerSecond
, SUBSTRING(MESSAGE,
(
PATINDEX(@CacheUsedSearchString,MESSAGE) + LEN(@CacheUsedString) + 1),
((
CHARINDEX(' ',
MESSAGE,
PATINDEX(@CacheUsedSearchString,MESSAGE) + LEN(@CacheUsedString) + 1))
-
(
PATINDEX(@CacheUsedSearchString, MESSAGE) + LEN(@CacheUsedString) + 1))) AS LookupBytesUsed
,CASE WHEN (CONVERT(bigint,SUBSTRING(MESSAGE,
(
PATINDEX(@StartOfIDSearchString,MESSAGE) + LEN(@StartOfIDString) + 1),
((
CHARINDEX(' ',
MESSAGE,
PATINDEX(@StartOfIDSearchString,MESSAGE) + LEN(@StartOfIDString) + 1))
-
(
PATINDEX(@StartOfIDSearchString, MESSAGE) + LEN(@StartOfIDString) + 1)))))= 0
THEN 0
ELSE CONVERT(bigint,SUBSTRING(MESSAGE,
(
PATINDEX(@CacheUsedSearchString,MESSAGE) + LEN(@CacheUsedString) + 1),
((
CHARINDEX(' ',
MESSAGE,
PATINDEX(@CacheUsedSearchString,MESSAGE) + LEN(@CacheUsedString) + 1))
-
(
PATINDEX(@CacheUsedSearchString, MESSAGE) + LEN(@CacheUsedString) + 1))))
/
CONVERT(bigint,SUBSTRING(MESSAGE,
(
PATINDEX(@StartOfIDSearchString,MESSAGE) + LEN(@StartOfIDString) + 1),
((
CHARINDEX(' ',
MESSAGE,
PATINDEX(@StartOfIDSearchString,MESSAGE) + LEN(@StartOfIDString) + 1))
-
(
PATINDEX(@StartOfIDSearchString, MESSAGE) + LEN(@StartOfIDString) + 1))))
END AS LookupBytesPerRow
FROM [catalog].[operation_messages]
WHERE message_source_type = @MessageSourceType
AND MESSAGE LIKE @StartOfIDSearchString
GO

Note that you have to set some parameter values:

  • @MessageSourceType [int] – represents the message source type value from the following results:
    Value     Description
    10           Entry APIs, such as T-SQL and CLR Stored procedures
    20           External process used to run package (ISServerExec.exe)
    30           Package-level objects
    40           Control Flow tasks
    50           Control Flow containers
    60           Data Flow task
    70           Custom execution message
    Note: Taken from Reza Rad’s (excellent!) helper.MessageSourceType table found here.
  • @StartOfIDString [VarChar(100)] – use this to uniquely identify the message field value you wish to parse.
    In this case, the string ‘The Lookup processed ‘ identifies all the Lookup Transformation messages I desire to parse.
  • @ProcessingTimeString [VarChar(100)] – this parameter is message-specific. I use this parameter to specifically search the message field value for the beginning of the Lookup Processing Time value.
    For this execution, I use the string ‘The processing time was ‘.
  • @CacheUsedString [VarChar(100)] – this parameter is also message-specific. I use this parameter to specifically search the message field value for the beginning of the Lookup Cache  Used value. It returns the memory used, in bytes.
    For this execution, I use the string ‘The cache used ‘.
  • The other parameters are built from variations of the parameters listed above.
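
If you want to confirm which of those message source types your own catalog actually contains before parsing, a quick count works (a sketch against the same catalog.operation_messages view used above):

USE SSISDB
GO
SELECT message_source_type, COUNT(*) AS message_count
FROM [catalog].[operation_messages]
GROUP BY message_source_type
ORDER BY message_source_type;
GO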

The query parses the values into text. The string values are then converted to numeric values for the ratio calculations, LookupRowsPerSecond and LookupBytesPerRow. Since ratios involve division, CASE expressions check for denominators that equal 0.

Here are the results in an SSMS grid:

This is not the only way to retrieve this information. And much of the code lends itself to conversion to functions. If there is interest, I will share the functions in an upcoming post.
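
As a flavor of where that could go (a hypothetical sketch, not the functions Andy mentions), the repeated PATINDEX/CHARINDEX/SUBSTRING pattern in the query above could be folded into a small scalar helper that pulls out the token following a known prefix:

-- Hypothetical helper: returns the word that follows @Prefix in @Message,
-- mirroring the PATINDEX + LEN + 1 arithmetic used in the query above.
CREATE FUNCTION dbo.ExtractTokenAfter
(
    @Message NVARCHAR(MAX),
    @Prefix  VARCHAR(100)
)
RETURNS NVARCHAR(100)
AS
BEGIN
    DECLARE @Start INT = PATINDEX('%' + @Prefix + '%', @Message);
    IF @Start = 0
        RETURN NULL;                            -- prefix not found
    SET @Start = @Start + LEN(@Prefix) + 1;     -- first character after the prefix
    DECLARE @End INT = CHARINDEX(' ', @Message, @Start);
    IF @End = 0
        SET @End = LEN(@Message) + 1;           -- token runs to the end of the message
    RETURN SUBSTRING(@Message, @Start, @End - @Start);
END
GO
-- Example: SELECT dbo.ExtractTokenAfter(N'The Lookup processed 4485 rows in the cache.',
--                                       'The Lookup processed ');  -- returns '4485'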

If you want to get started with SSIS with the help of experts, read more over at Fix Your SQL Server.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Windows File/Folder and Share Permissions – Notes from the Field #029

[Note from Pinal]: This is the 29th episode of the Notes from the Field series. Security is a task we should hand over to the experts. If there is a small oversight or misstep, there is a good chance that the security of the organization is compromised. This is very true, but there are always devil’s advocates who believe everyone should know about security. As a DBA and administrator, I often see people not taking an interest in Windows security, hiding behind the excuse of not being a Windows Server expert. We often miss the important mission statement for the success of any organization – teamwork. In this blog post Brian tells the story in very interesting, lucid language. Read on!

In this episode of the Notes from the Field series database expert Brian Kelley explains a very crucial issue DBAs and developers face on their production servers. Linchpin People are database coaches and wellness experts for a data driven world. Read the experience of Brian in his own words.


When I talk security among database professionals, I find that most have at least a working knowledge of how to apply security within a database. When I talk with DBAs in particular, I find that most have at least a working knowledge of security at the server level if we’re speaking of SQL Server. One area I continually see as weak is Windows file/folder (NTFS) and share permissions.

The typical response is, “I’m a database developer and the Windows system administrator is responsible for that.” That may very well be true – the system administrator may have the primary responsibility and accountability for file/folder and share security for the server. However, if you’re involved in the typical activities surrounding databases and moving data around, you should know these permissions, too. Otherwise, you could be setting yourself up where someone is able to get to data he or she shouldn’t, or you could be opening the door where human error puts bad data in your production system.

File/Folder Permission Basics:

I wrote about file/folder permissions a few years ago to give the basic permissions that are most often seen. Here’s what you must know as a minimum at the file/folder level:

  • Read – Allows you to read the contents of the file or folder. Having read permissions allows you to copy the file or folder.
  • Write – Again, as the name implies, it allows you to write to the file or folder. This doesn’t include the ability to delete; however, nothing stops a person with this access from overwriting a file with an empty one.
  • Delete – Allows the file/folder to be deleted. If you overwrite files, you may need this permission.
  • Modify – Allows read, write, and delete.
  • Full Control – Same as modify + the ability to assign permissions.

File/folder permissions aggregate, unless there is a DENY (which trumps, just like within SQL Server), meaning if a person is in one group that gives Read and another group that gives Write, that person has both Read and Write permissions.

As you might expect me to say, always apply the Principle of Least Privilege. This likely means that any additional permission you might add does not need Full Control.

Share Permission Basics:

At the share level, here are the permissions.

  • Read – Allows you to read the contents on the share.
  • Change – Allows you to read, write, and delete contents on the share.
  • Full control – Change + the ability to modify permissions.

Like with file/folder permissions, these permissions aggregate, and DENY trumps.

So What Access Does a Person / Process Have?

Figuring out what someone or some process has depends on how the location is being accessed:

  • Access comes through the share (\\ServerName\Share) – a combination of permissions is considered.
  • Access is through a drive letter (C:\, E:\, S:\, etc.) – only the file/folder permissions are considered.

The only complicated one here is access through the share. Here’s what Windows does:

  1. Figures out what the aggregated permissions are at the file/folder level.
  2. Figures out what the aggregated permissions are at the share level.
  3. Takes the most restrictive of the two sets of permissions.

You can test this by granting Full Control over a folder (this is likely already in place for the Users local group) and then setting up a share. Give only Read access through the share, and that includes to Administrators (if you’re creating a share, you likely have membership in the Administrators group). Try to read a file through the share. Now try to modify it. The most restrictive set here is the share-level permissions, which only allow Read; therefore, if you come through the share, Read is all you get.

Does This Knowledge Really Help Me?

In my experience, it does. I’ve seen cases where sensitive files were accessible by every authenticated user through a share. Auditors, as you might expect, have a real problem with that. I’ve also seen cases where files to be imported as part of the nightly processing were overwritten by files intended from development. And I’ve seen cases where a process can’t get to the files it needs for a process because someone changed the permissions.

If you know file/folder and share permissions, you can spot and correct these types of security flaws. Given that there are a lot of database professionals who don’t understand these permissions, knowing them sets you apart. And if you’re able to help on critical processes, you begin to set yourself up as a linchpin (link to .pdf) for your organization.

If you want to get started with performance tuning and database security with the help of experts, read more over at Fix Your SQL Server.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – SSIS Look Up Component – Cache Mode – Notes from the Field #028

[Notes from Pinal]: Lots of people think that SSIS is all about arranging various operations together in one logical flow. Well, that understanding is absolutely correct, but the implementation is not as easy as it seems. Similarly, most people think the lookup component just looks up additional information and they do not pay much attention to it – and for that reason they eventually get very bad performance.

Linchpin People are database coaches and wellness experts for a data driven world. In this 28th episode of the Notes from the Field series, database expert Tim Mitchell (partner at Linchpin People) shares a very interesting conversation about how to configure the lookup component’s cache mode.


In SQL Server Integration Services, the lookup component is one of the most frequently used tools for data validation and completion.  The lookup component is provided as a means to virtually join one set of data to another to validate and/or retrieve missing values.  Properly configured, it is reliable and reasonably fast.

Among the many settings available on the lookup component, one of the most critical is the cache mode.  This selection will determine whether and how the distinct lookup values are cached during package execution.  It is critical to know how cache modes affect the result of the lookup and the performance of the package, as choosing the wrong setting can lead to poorly performing packages, and in some cases, incorrect results.

Full Cache

The full cache mode setting is the default cache mode selection in the SSIS lookup transformation.  Like the name implies, full cache mode will cause the lookup transformation to retrieve and store in SSIS cache the entire set of data from the specified lookup location.  As a result, the data flow in which the lookup transformation resides will not start processing any data buffers until all of the rows from the lookup query have been cached in SSIS.

The most commonly used cache mode is the full cache setting, and for good reason.  The full cache setting has the most practical applications, and should be considered the go-to cache setting when dealing with an untested set of data. With a moderately sized set of reference data, a lookup transformation using full cache mode usually performs well.  Full cache mode does not require multiple round trips to the database, since the entire reference result set is cached prior to data flow execution.

There are a few potential gotchas to be aware of when using full cache mode.  First, you can see some performance issues – memory pressure in particular – when using full cache mode against large sets of reference data.  If the table you use for the lookup is very large (either deep or wide, or perhaps both), there’s going to be a performance cost associated with retrieving and caching all of that data.  Also, keep in mind that when doing a lookup on character data, full cache mode will always do a case-sensitive (and in some cases, space-sensitive) string comparison even if your database is set to a case-insensitive collation.  This is because the in-memory lookup uses a .NET string comparison (which is case- and space-sensitive) as opposed to a database string comparison (which may or may not be case sensitive, depending on collation).  There’s a relatively easy workaround in which you can use the UPPER() or LOWER() function on the pipeline data and the reference data to ensure that case differences do not impact the success of your lookup operation.  Again, neither of these presents a reason to avoid full cache mode, but both should be considered when deciding whether full cache mode is right for a given situation.
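
As an illustration of that workaround (a sketch; the table and column names are made up), you would apply the same function to both sides – here to the reference query, with a matching Derived Column doing UPPER() on the pipeline column before the lookup:

-- Reference query for the lookup, normalized to upper case so the .NET
-- string comparison in full cache mode is not tripped up by case differences.
SELECT UPPER(CustomerAltKey) AS CustomerAltKey,
       CustomerKey
FROM dbo.DimCustomer;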

Full cache mode is ideally useful when one or all of the following conditions exist:

  • The size of the reference data set is small to moderately sized
  • The size of the pipeline data set (the data you are comparing to the lookup table) is large, is unknown at design time, or is unpredictable
  • Each distinct key value(s) in the pipeline data set is expected to be found multiple times in that set of data

Partial Cache

When using the partial cache setting, lookup values will still be cached, but only as each distinct value is encountered in the data flow.  Initially, each distinct value will be retrieved individually from the specified source, and then cached.  To be clear, this is a row-by-row lookup for each distinct key value(s).

This is a less frequently used cache setting because it addresses a narrower set of scenarios.  Because each distinct key value(s) combination requires a relational round trip to the lookup source, performance can be an issue, especially with a large pipeline data set to be compared to the lookup data set.  If you have, for example, a million records from your pipeline data source, you have the potential for doing a million lookup queries against your lookup data source (depending on the number of distinct values in the key column(s)).  Therefore, one has to be keenly aware of the expected row count and value distribution of the pipeline data to safely use partial cache mode.

Using partial cache mode is ideally suited for the conditions below:

  • The size of the data in the pipeline (more specifically, the number of distinct key column values) is relatively small
  • The size of the lookup data is too large to effectively store in cache
  • The lookup source is well indexed to allow for fast retrieval of row-by-row values (see the index sketch below)
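
On that last point, a narrow covering index on the lookup key keeps each of those row-by-row probes cheap. A minimal sketch with illustrative table and column names (the comment shows roughly what the parameterized probe issued per distinct key looks like in partial or no cache mode):

-- Each probe looks roughly like:
--   SELECT CustomerKey FROM dbo.DimCustomer WHERE CustomerAltKey = ?
-- so index the lookup key and include the returned column.
CREATE NONCLUSTERED INDEX IX_DimCustomer_CustomerAltKey
    ON dbo.DimCustomer (CustomerAltKey)
    INCLUDE (CustomerKey);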

No Cache

As you might guess, selecting no cache mode will not add any values to the lookup cache in SSIS.  As a result, every single row in the pipeline data set will require a query against the lookup source.  Since no data is cached, it is possible to save a small amount of overhead in SSIS memory in cases where key values are not reused.  In the real world, I don’t see a lot of use of the no cache setting, but I can imagine some edge cases where it might be useful.

As such, it’s critical to know your data before choosing this option.  Obviously, performance will be an issue with anything other than small sets of data, as the no cache setting requires row-by-row processing of all of the data in the pipeline.

I would recommend considering the no cache mode only when all of the below conditions are true:

  • The reference data set is too large to reasonably be loaded into SSIS memory
  • The pipeline data set is small and is not expected to grow
  • There are expected to be very few or no duplicates of the key value(s) in the pipeline data set (i.e., there would be no benefit from caching these values)

Conclusion

The cache mode, an often-overlooked setting on the SSIS lookup component, represents an important design decision in your SSIS data flow.  Choosing the right lookup cache mode directly impacts the fidelity of your results and the performance of package execution.  Know how this selection impacts your ETL loads, and you’ll end up with more reliable, faster packages.

If you want me to take a look at your server and its settings, or if your server is facing any issue we can Fix Your SQL Server.

Reference: Pinal Dave (http://blog.sqlauthority.com)

Developer’s Life – Attitude and Communication – They Can Cause Problems – Notes from the Field #027

[Note from Pinal]: This is the 27th episode of the Notes from the Field series. The biggest challenge for anyone is to understand human nature. We humans have so many things on our minds at any moment. There are cases when what we say is not what we mean, and there are cases where what we mean we do not say. We say and do things as per our mood and the agenda on our mind. Sometimes our attitude creates confusion in the communication and we end up creating a situation which is absolutely not warranted.

In this episode of the Notes from the Field series database expert Mike Walsh explains a very crucial issue we face in our careers, one which is not technical but relates more to human nature. Read on; this may be the best blog post you read in recent times.


In this week’s note from the field, I’m taking a slight departure from explaining technical knowledge and concepts. We’ll be back to it next week, I’m sure.

Pinal wanted us to explain some of the issues we bump into, how we see some of our customers arrive at problem situations, and how we have helped get them back on the right track. Often it is a technical problem we are officially solving – but in a lot of cases, as consultants, we are really helping fix some communication difficulties. This is a technical blog and not an “advice column” in a newspaper – but the longer I am a consultant and the more years I add to my experience in technology, the more I learn that the vast majority of the problems we encounter have “soft skills” included in the chain of causes for the issue we are helping overcome.

This is not going to be exhaustive, but I hope that sharing a few pieces of advice inspired by real issues starts a process of searching for places where we can be the cause of these challenges and looking at fixing them in ourselves. Or perhaps we can begin looking at resolving them in the teams that we manage.

I’ll share three statements that I’ve either heard, read, or said, and talk about some of the communication or attitude challenges each statement highlights.

1 – “But that’s the SAN Administrator’s responsibility…”

I heard that early in my consulting career, when talking with a customer who had serious corruption and no good recent backups – potentially no good backups at all. The statement doesn’t have to be this one exactly, but the attitude here is one of “my job stops here, and I don’t care about the intent or principle of why I’m here.” It’s also the attitude that as long as there is someone else to blame, I’m fine…  You see, in this case the DBA had a suspicion that the backups were not being handled right.  They were the DBA and they knew that they had responsibility to ensure SQL backups were good to go – it’s a basic requirement of a production DBA. In my “As A DBA Where Do I Start?!” presentation, I argue that it is job #1 of a DBA. But in this case, the thought was that there was someone else to blame. Rather than create extra work and take on responsibility, it was decided to just let it be another team’s responsibility. This failed the company and the company’s customers, and no one won. As technologists, we should strive to go the extra mile. If there is a lack of clarity around roles and responsibilities and we know it, we should push to get it resolved – especially as DBAs, who should act as the advocates of the data contained in the databases we are responsible for.

2 – “We’ve always done it this way, it’s never caused a problem before!”

Complacency. I have to say that many failures I’ve been paid good money to help recover from would not have happened had it not been for an attitude of complacency. If any thoughts like this have entered your mind about your situation, you may be suffering from it. If, while reading this, you get that sinking feeling in your stomach about the one thing you know should be fixed but haven’t fixed yet – why don’t you stop and go fix it, then come back?

  • “We should have better backups, but we’re on a SAN so we should be fine really.”
  • “Technically speaking that could happen, but what are the chances?”
  • “We’ll just clean that up as a fast follow”

..and so on.

In an age of tightening IT budgets and increased expectations of uptime, availability and performance, there is no room for complacency. Our customers and business units expect – no, demand – the best. Complacency says, “We will give you second best, or hopefully good enough, and we accept the risk and know this may hurt us later.”

Sometimes an organization will opt for “good enough,” and I agree with the concept that at times the perfect can be the enemy of the good. But when we make those decisions in a vacuum and are not reporting them up and discussing them as an organization, that is different. That is us unilaterally choosing to do something less than the best and purposefully playing a game of chance.

3 – “This device must accept interference from other devices but not create any”

I’ve paraphrased this one – but it’s something the Federal Communications Commission – a federal agency in the United States that regulates electronic communication – requires of all manufacturers of any device that could cause or receive interference electronically. I blogged in depth about this here (http://www.straightpathsql.com/archives/2011/07/relationship-advice-from-the-fcc/), so I won’t go into much detail other than to say this: if we all operated more on the premise that we should do our best not to be the cause of conflict, and to be less easily offended and less upset when we perceive offense, life would be easier in many areas!

This doesn’t always cause the issues we are called in to help out with – not directly. But where we see it is in unhealthy relationships between the various technology teams at a client. We’ll see teams hoarding knowledge, not sharing well with others, and almost working against other teams instead of working with them. If you trace these problems back far enough, it often stems from someone or some group of people violating this principle from the FCC.

To Sum It Up

Technology problems are easy to solve. At Linchpin People we help many customers get past the toughest technological challenges – and at the end of the day it is really just a repeatable process of pattern-based troubleshooting, logical thinking, and starting at the beginning and carefully stepping through to the end. The tough part of what we do as consultants is the people skills. Being able to help get teams working together, being able to help teams take responsibility, being able to improve team-to-team communication? That is the difficult part, and we get to use those soft skills on every engagement. Work on professional development (http://professionaldevelopment.sqlpass.org/) and keep improving here, not just with technology. I can teach just about anyone how to be an excellent DBA and performance tuner, but some of these soft skills are much more difficult to teach.

If you want to get started with performance analytics and triage of virtualized SQL Servers with the help of experts, read more over at Fix Your SQL Server.

Reference: Pinal Dave (http://blog.sqlauthority.com)