SQL SERVER – Quick Note about JOIN – Common Questions and Simple Answers

This blog post is written in response to the T-SQL Tuesday post on JOIN, which is a very interesting subject. Years ago, I wrote the article SQL SERVER – Introduction to JOINs – Basic of JOINs, and to date it is my favorite article on the blog.

Today we are going to talk about JOIN and many things related to it. I recently started office hours to answer questions and issues from the community, and many of the questions I receive are related to JOIN. I will share a few of them here. Most of them are basic, but note that the basics are of great importance.

Without further ado, let me continue with the questions and answers.

Q: Which one of the following is a better method?
Method 1:
SELECT t1.*, t2.*
FROM t1,t2
WHERE t1.col1 = t2.col1

Method 2:
SELECT t1.*, t2.*
FROM t1 INNER JOIN t2 ON t1.col1 = t2.col1

A: The answer to this question will bring some interesting conversation. I strongly prefer method 2 because it is much cleaner to read, and if I have to use table-level hints and so on, it is much more convenient to do so. I would suggest going ahead with method 2. With regard to performance and execution plan, both methods currently behave the same (most of the time). However, with respect to standards and future innovation, method 2 is the way to go.

When I have a performance tuning task and I see method 1, I usually ask the developer to convert it to method 2, as I feel much more comfortable with it. Additionally, when you have to work with OUTER JOIN, you will have to use the syntax of method 2 anyway.
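To illustrate the OUTER JOIN point: the old-style outer join operators (*= and =*) are rejected once the database compatibility level is 90 or higher, so the ANSI syntax of method 2 is the one to rely on. Here is a minimal sketch. The table variables @t1 and @t2 and their sample rows are assumptions made up for illustration; they only mirror the t1 and t2 of the question.

```sql
-- Hypothetical sample data standing in for t1 and t2 from the question
DECLARE @t1 TABLE (col1 INT, col2 VARCHAR(10));
DECLARE @t2 TABLE (col1 INT, col2 VARCHAR(10));
INSERT INTO @t1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO @t2 VALUES (1, 'x'), (2, 'y');

-- ANSI outer join: every row of @t1 is returned,
-- with NULL in t2_col2 where @t2 has no matching row (col1 = 3)
SELECT t1.col1, t1.col2, t2.col2 AS t2_col2
FROM @t1 t1
LEFT OUTER JOIN @t2 t2 ON t1.col1 = t2.col1;
```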

Q: Which is better: subquery or JOIN?
Subquery:
SELECT t1.*
FROM t1
WHERE t1.col1 IN (SELECT t2.col1 FROM t2)

Join:
SELECT t1.*
FROM t1 INNER JOIN t2 ON t1.col1 = t2.col1

A: In this case, there is no right answer. You should use the one that gives you optimal performance. I have seen cases where the subquery performs better and cases where the join performs better. Either of them can perform so well that I think one has to test both methods before selecting one. If you are facing a situation where you cannot test and are not sure which method to select, I suggest that you go with your intuition. I still prefer JOIN over any other method, but in this case, I suggest that you test your options.
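One thing worth checking before comparing performance is that the two forms really return the same rows. When the joined column contains duplicates, they do not. The sketch below uses hypothetical table variables and sample values (not part of the original question) to show the difference, along with EXISTS as a third form to include in your tests.

```sql
-- Hypothetical sample data; @t2 contains the value 1 twice
DECLARE @t1 TABLE (col1 INT);
DECLARE @t2 TABLE (col1 INT);
INSERT INTO @t1 VALUES (1), (2), (3);
INSERT INTO @t2 VALUES (1), (1), (2);

-- Subquery: each @t1 row is returned at most once (2 rows: 1, 2)
SELECT t1.col1
FROM @t1 t1
WHERE t1.col1 IN (SELECT t2.col1 FROM @t2 t2);

-- Join: the @t1 row with col1 = 1 is returned once per match (3 rows: 1, 1, 2)
SELECT t1.col1
FROM @t1 t1
INNER JOIN @t2 t2 ON t1.col1 = t2.col1;

-- EXISTS: same rows as the IN form (2 rows: 1, 2)
SELECT t1.col1
FROM @t1 t1
WHERE EXISTS (SELECT 1 FROM @t2 t2 WHERE t2.col1 = t1.col1);
```

If the joined column is unique in t2, the three forms return the same rows and the comparison comes down to the execution plans.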

Q: How to simulate Join?
A: I get this question a lot of times, and I have no answer. Here, I want your help as I do not even understand this question.

Q: How can I change my LEFT JOIN to RIGHT JOIN and get the same answer?
A: Sure. Here is a quick example:

Left Join:
SELECT t1.col1, t2.col2
FROM t1 LEFT JOIN t2 ON t1.col1 = t2.col1

Right Join:
SELECT t1.col1, t2.col2
FROM t2 RIGHT JOIN t1 ON t1.col1 = t2.col1

Both of the above options will give you the same result. However, the real question is why you want to do that. What is the reason that you want to change the left join to a right join?

Q: Does it matter how I write tables in my join if I am using INNER JOIN only?
A: No, it does not matter in the case of INNER JOIN, as the result will be the same and the SQL Server engine will figure out the optimal execution plan for your query. As your question rightly suggests, for other kinds of join, most notably OUTER JOIN, the order will matter. Additionally, there are cases where forcing the join order on an INNER JOIN has shown a small performance improvement.

If you attended my Virtual Tech Days session a few days ago, you would have seen an example of how the FORCE ORDER hint works.
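For readers who did not see that session, here is a minimal sketch of the hint's syntax. The temp tables #Small and #Large and their rows are hypothetical and far too small to show any performance difference; the point is only where the hint goes and what it does. Compare the two execution plans on your own data before using it.

```sql
-- Hypothetical tables, named for illustration only
CREATE TABLE #Small (ID INT PRIMARY KEY);
CREATE TABLE #Large (ID INT PRIMARY KEY, SmallID INT);
INSERT INTO #Small VALUES (1), (2);
INSERT INTO #Large VALUES (10, 1), (11, 1), (12, 2);

-- Default behavior: the optimizer is free to choose the join order
SELECT s.ID, l.ID AS LargeID
FROM #Large l
INNER JOIN #Small s ON s.ID = l.SmallID;

-- With FORCE ORDER, the join order written in the query is preserved
SELECT s.ID, l.ID AS LargeID
FROM #Large l
INNER JOIN #Small s ON s.ID = l.SmallID
OPTION (FORCE ORDER);

-- Clean up
DROP TABLE #Small;
DROP TABLE #Large;
```

Both queries return the same rows; only the plan may differ.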

Q: Is there a quick tutorial to Joins?
A: I have written an article on this subject earlier, and as I said at the start of this post, I personally like it a lot. You can read it here: Introduction to JOINs – Basic of JOINs.

Q: Is there any book available to learn T-SQL, which explains various concepts like this easily?
A: I am a bit biased, but you can read about my books over here.

Q: Is SELF JOIN a type of INNER JOIN or OUTER JOIN?
A: In fact, it can be either an inner or an outer join. Self Join is a very interesting subject. Here is an article that I have written earlier on this subject: SQL SERVER – The Self Join – Inner Join and Outer Join.

Q: In case of the OUTER JOIN, where should I put the condition?
A: This question requires a detailed answer. I have written a detailed blog post on this subject over here: How ON Clause Effects Resultset in LEFT JOIN.
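In short, a condition on the outer-joined table behaves differently depending on where it is placed. The sketch below uses hypothetical table variables and a made-up flag column to show the two placements side by side.

```sql
-- Hypothetical sample data
DECLARE @t1 TABLE (col1 INT);
DECLARE @t2 TABLE (col1 INT, flag CHAR(1));
INSERT INTO @t1 VALUES (1), (2), (3);
INSERT INTO @t2 VALUES (1, 'Y'), (2, 'N');

-- Condition in ON: it only restricts which @t2 rows can match.
-- All three @t1 rows are returned; only col1 = 1 gets a non-NULL flag.
SELECT t1.col1, t2.flag
FROM @t1 t1
LEFT JOIN @t2 t2 ON t1.col1 = t2.col1 AND t2.flag = 'Y';

-- Condition in WHERE: it is applied after the join, so rows where
-- t2.flag is NULL or 'N' are removed. Only col1 = 1 is returned,
-- which is the same result an INNER JOIN would give.
SELECT t1.col1, t2.flag
FROM @t1 t1
LEFT JOIN @t2 t2 ON t1.col1 = t2.col1
WHERE t2.flag = 'Y';
```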

Q: Which is optimal: LEFT JOIN or NOT IN?
A: I personally prefer LEFT JOIN, as I have seen LEFT JOIN doing better in many cases. Once again, I suggest that you test it with your own query. Here is a related article: Differences Between Left Join and Left Outer Join.
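Besides performance, the two forms can return different rows when the column in the subquery contains NULL. The sketch below uses hypothetical table variables to show this, and includes NOT EXISTS as a third option to test. It assumes the default ANSI_NULLS ON setting.

```sql
-- Hypothetical sample data; @t2 contains a NULL
DECLARE @t1 TABLE (col1 INT);
DECLARE @t2 TABLE (col1 INT NULL);
INSERT INTO @t1 VALUES (1), (2), (3);
INSERT INTO @t2 VALUES (1), (NULL);

-- LEFT JOIN with IS NULL filter: returns the unmatched rows (2 and 3)
SELECT t1.col1
FROM @t1 t1
LEFT JOIN @t2 t2 ON t1.col1 = t2.col1
WHERE t2.col1 IS NULL;

-- NOT IN: returns no rows, because comparing against the NULL
-- in @t2 evaluates to UNKNOWN for every candidate row
SELECT t1.col1
FROM @t1 t1
WHERE t1.col1 NOT IN (SELECT t2.col1 FROM @t2 t2);

-- NOT EXISTS: returns 2 and 3, the same as the LEFT JOIN form
SELECT t1.col1
FROM @t1 t1
WHERE NOT EXISTS (SELECT 1 FROM @t2 t2 WHERE t2.col1 = t1.col1);
```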

Reference: Pinal Dave (http://blog.SQLAuthority.com)

SQL SERVER- Differences Between Left Join and Left Outer Join

There are a few questions that I had decided not to discuss on this blog because I think they are very simple and many of us already know the answers. I even receive not-so-positive notes from several readers when I write about something simple. However, assuming that everybody, beginners included, already knows everything is not the right attitude.

Since day 1, I have been keeping a small journal of the questions that I receive on this blog. I receive more than 200 questions every day through emails, comments and occasional phone calls. Yesterday, I received a comment with the following question:

What are the differences between Left Join and Left Outer Join?

This question has now crossed my threshold for receiving the same question repeatedly. Here is the answer:

There is absolutely no difference between LEFT JOIN and LEFT OUTER JOIN. The same is true for RIGHT JOIN and RIGHT OUTER JOIN. When you use LEFT JOIN keyword in SQL Server, it means LEFT OUTER JOIN only.
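A quick way to convince yourself is to run both spellings against the same data. The table variables and rows below are hypothetical; the two queries return identical rows and produce the same execution plan.

```sql
-- Hypothetical sample data
DECLARE @t1 TABLE (col1 INT);
DECLARE @t2 TABLE (col1 INT);
INSERT INTO @t1 VALUES (1), (2), (3);
INSERT INTO @t2 VALUES (1), (2);

-- Short form
SELECT t1.col1, t2.col1 AS t2_col1
FROM @t1 t1
LEFT JOIN @t2 t2 ON t1.col1 = t2.col1;

-- Long form: the OUTER keyword is optional and changes nothing
SELECT t1.col1, t2.col1 AS t2_col1
FROM @t1 t1
LEFT OUTER JOIN @t2 t2 ON t1.col1 = t2.col1;
```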

I have already written an in-depth article with visual diagrams discussing the JOINs. I encourage all of you to read the article for further understanding of the JOINs:

Read Introduction to JOINs – Basic of JOINs


SQLAuthority News – Statistics and Best Practices – Virtual Tech Days – Nov 22, 2010

I am honored that I have been invited to speak at Virtual TechDays on Nov 22, 2010 by Microsoft. I will be speaking on my favorite subject of Statistics and Best Practices.

This exclusive online event will have 80 deep technical sessions across 3 days – and, attendance is completely FREE. There are dedicated tracks for Architects, Software Developers/Project Managers, Infrastructure Managers/Professionals and Enterprise Developers. So, REGISTER for this exclusive online event TODAY.

Statistics and Best Practices
Timing: 11:45am-12:45pm
Statistics are a key part of getting solid performance. In this session we will go over the basics of statistics and various best practices related to them. We will cover frequently asked questions such as a) when to update statistics, b) the difference between sync and async update of statistics, c) the best method to update statistics, and d) the optimal interval for updating statistics. We will also discuss the pros and cons of the statistics update. This session is for all of you – whether you’re a DBA or a developer!

You can register for this event over here.

If you have never attended my session on this subject, I strongly suggest that you attend the event, as this is going to be a very interesting conversation between us. If you have attended this session earlier, it will contain some new information that will surely be interesting to share with all.


SQL SERVER – SELF JOIN Not Allowed in Indexed View – Limitation of the View 9

Update: Please read the summary post of all the 11 Limitations of the view SQL SERVER – The Limitations of the Views – Eleven and more…
Previously, I wrote an article about SQL SERVER – The Self Join – Inner Join and Outer Join, and that blog post seems very popular because of its interesting points. It is quite common to think that a Self Join is always an Inner Join, but the reality is that it can be any type of join. The concept of Self Join is so useful that we use it quite often in our coding. However, it is not allowed in an Indexed View. I will be using the same example that I created earlier for the said article.

Let us first create the same table for an employee. One of the columns in this table contains the ID of the manager, who is also an employee of the same company. This way, all the employees and their managers are present in the same table. If we want to find the manager of a particular employee, we need to use a Self Join.

USE TempDb
GO
-- Create a Table
CREATE TABLE Employee(
EmployeeID INT PRIMARY KEY,
Name NVARCHAR(50),
ManagerID INT
)
GO
-- Insert Sample Data
INSERT INTO Employee
SELECT 1, 'Mike', 3
UNION ALL
SELECT 2, 'David', 3
UNION ALL
SELECT 3, 'Roger', NULL
UNION ALL
SELECT 4, 'Marry',2
UNION ALL
SELECT 5, 'Joseph',2
UNION ALL
SELECT 7, 'Ben',2
GO
-- Check the data
SELECT *
FROM Employee
GO

We will now utilize Inner Join to find the employees and their managers’ details.

-- Inner Join
SELECT e1.Name EmployeeName, e2.name AS ManagerName
FROM Employee e1
INNER JOIN Employee e2
ON e1.ManagerID = e2.EmployeeID
GO

Now let us try to create a View on the table. The View will be created without any issues.

-- Create a View
CREATE VIEW myJoinView
WITH SCHEMABINDING
AS
SELECT
e1.Name EmployeeName, e2.name AS ManagerName
FROM dbo.Employee e1
INNER JOIN dbo.Employee e2
ON e1.ManagerID = e2.EmployeeID
GO

Now let us try to create a Clustered Index on the View.

-- Attempt to Create Index on View will throw an error
CREATE UNIQUE CLUSTERED INDEX [IX_MyJoinView] ON [dbo].[myJoinView]
(
[EmployeeName] ASC
)
GO

Unfortunately, the above attempt will not create the Clustered Index. It will throw the following error, stating that SELF JOIN is not allowed in the indexed view.

Msg 1947, Level 16, State 1, Line 2
Cannot create index on view “tempdb.dbo.myJoinView”. The view contains a self join on “tempdb.dbo.Employee”.

The generic reason provided is that it is very expensive to manage the view for SQL Server when SELF JOIN is implemented in the query.

If any of you has a better explanation of this subject, please post it here through your comments, and I will publish it with due credit.

The complete script for the example is given below:

USE TempDb
GO
-- Create a Table
CREATE TABLE Employee(
EmployeeID INT PRIMARY KEY,
Name NVARCHAR(50),
ManagerID INT
)
GO
-- Insert Sample Data
INSERT INTO Employee
SELECT 1, 'Mike', 3
UNION ALL
SELECT 2, 'David', 3
UNION ALL
SELECT 3, 'Roger', NULL
UNION ALL
SELECT 4, 'Marry',2
UNION ALL
SELECT 5, 'Joseph',2
UNION ALL
SELECT 7, 'Ben',2
GO
-- Check the data
SELECT *
FROM Employee
GO
-- Inner Join
SELECT e1.Name EmployeeName, e2.name AS ManagerName
FROM Employee e1
INNER JOIN Employee e2
ON e1.ManagerID = e2.EmployeeID
GO
-- Create a View
CREATE VIEW myJoinView
WITH SCHEMABINDING
AS
SELECT
e1.Name EmployeeName, e2.name AS ManagerName
FROM dbo.Employee e1
INNER JOIN dbo.Employee e2
ON e1.ManagerID = e2.EmployeeID
GO
-- Attempt to Create Index on View will throw an error
CREATE UNIQUE CLUSTERED INDEX [IX_MyJoinView] ON [dbo].[myJoinView]
(
[EmployeeName] ASC
)
GO
/*
Msg 1947, Level 16, State 1, Line 2
Cannot create index on view "tempdb.dbo.myJoinView". The view contains a self join on "tempdb.dbo.Employee".
*/
-- Clean up
DROP VIEW myJoinView
DROP TABLE Employee
GO


SQL SERVER – Outer Join Not Allowed in Indexed Views – Limitation of the View 8

Update: Please read the summary post of all the 11 Limitations of the view SQL SERVER – The Limitations of the Views – Eleven and more…

This blog post was previously published over here. I am republishing it in the series Limitation of the Views with a few modifications.

While reading the white paper Improving Performance with SQL Server 2008 Indexed Views, I noticed that it says outer joins are NOT allowed in the indexed views. Here, I have created an example to demonstrate why this is so.

Rows can logically disappear from an Indexed View based on OUTER JOIN when you insert data into a base table. This makes it relatively difficult to implement incremental updates of an OUTER JOIN view. In addition, the performance of the implementation would be slower than for views based on standard (INNER) JOIN.

A reader was confused by this answer and wanted me to explain it further. Here is the example that I have quickly put together to demonstrate the behavior described in the above statement:

USE tempdb
GO
-- Create Two Tables
CREATE TABLE BaseTable (ID1 INT, Col1 VARCHAR(100))
CREATE TABLE JoinedTable (ID2 INT, Col2 VARCHAR(100))
GO
-- Insert Values in Tables
INSERT INTO BaseTable (ID1,Col1)
SELECT 1,'First'
UNION ALL
SELECT 2,'Second'
GO
INSERT INTO JoinedTable (ID2,Col2)
SELECT 1,'First'
UNION ALL
SELECT 2,'Second'
UNION ALL
SELECT 3,'Third'
UNION ALL
SELECT 4,'Fourth'
GO
-- Use Outer Join
SELECT jt.*
FROM BaseTable bt
RIGHT OUTER JOIN JoinedTable jt ON bt.ID1 = jt.ID2
WHERE bt.ID1 IS NULL
GO

The script above returns the two rows of JoinedTable that have no match in BaseTable (ID2 = 3 and ID2 = 4). Now let us continue:

-- Now Insert Rows in Base Table
INSERT INTO BaseTable (ID1,Col1)
SELECT 3,'Third'
GO
-- You will notice that the Join now retrieves one row less
SELECT jt.*
FROM BaseTable bt
RIGHT OUTER JOIN JoinedTable jt ON bt.ID1 = jt.ID2
WHERE bt.ID1 IS NULL
GO
-- Clean up
DROP TABLE BaseTable
DROP TABLE JoinedTable
GO

After running this script, you will notice that as the base table gains one row, the result loses one row. Going back to the white paper I mentioned earlier, I believe this kind of result set is expensive to maintain, which is why OUTER JOIN is not allowed in an Indexed View.
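To see the restriction itself rather than just the reason for it, you can try to index a view that contains an OUTER JOIN. The following is a sketch under a few assumptions: the view name myOuterJoinView and the index name are made up for illustration, the two tables are recreated because the script above drops them, and the session uses the default SET options that indexed views require (as in a standard SSMS query window). The CREATE VIEW succeeds; the CREATE INDEX statement is expected to be rejected with an error saying that the view uses an OUTER JOIN.

```sql
USE tempdb
GO
-- Recreate the two tables (the earlier script drops them during clean up)
CREATE TABLE dbo.BaseTable (ID1 INT NOT NULL, Col1 VARCHAR(100))
CREATE TABLE dbo.JoinedTable (ID2 INT NOT NULL, Col2 VARCHAR(100))
GO
-- A schema-bound view with an OUTER JOIN can be created
-- (view name is hypothetical)
CREATE VIEW dbo.myOuterJoinView
WITH SCHEMABINDING
AS
SELECT jt.ID2, jt.Col2, bt.Col1
FROM dbo.JoinedTable jt
LEFT OUTER JOIN dbo.BaseTable bt ON bt.ID1 = jt.ID2
GO
-- Expected to fail: an indexed view cannot contain an OUTER JOIN
CREATE UNIQUE CLUSTERED INDEX IX_myOuterJoinView ON dbo.myOuterJoinView (ID2)
GO
-- Clean up
DROP VIEW dbo.myOuterJoinView
DROP TABLE dbo.BaseTable
DROP TABLE dbo.JoinedTable
GO
```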

Additionally, SQL Server Expert Ramdas provided excellent explanations regarding NULL and why resultset maintenance is expensive, over here.

“A disadvantage of outer joins in SQL is that they generate nulls in the result set. Those nulls are indistinguishable from other nulls that are not generated by the outer join operation. There is no “standard” semantics for nulls in SQL but in many common situations, the appearance of nulls in outer joins doesn’t really correspond to the way nulls are returned and used in other places. Therefore, the presence of nulls in outer joins creates a certain amount of ambiguity.”

This series is indeed getting very interesting. What are your suggestions?


SQLAuthority News – 2 Sessions at TechInsight 2010 – June 29 – July 1, 2010

Earlier this month, I got the opportunity to visit Malaysia for community sessions on June 29 – July 1, 2010 at Kuala Lumpur, Malaysia, which I consider a valuable experience. I presented two different sessions at the event. The event was extremely popular in the local community, and I had a great time meeting people in Malaysia. I must say that the best thing about Kuala Lumpur is the people and their response.

Malaysia Twin Towers

Techinsights is a major technology conference to network with like-minded peers and also up-skill your knowledge on latest technologies. An event that offers opportunity to dabble in hardcore technologies with in-depth and hands-on demonstration by Microsoft MVPs and industry experts local and abroad. This three-day event will challenge what you think you already know. You’ll return to the office with cutting-edge insights and expertise that will make life easier for you (and everyone else) at work. This round, we have a special highlight on new technologies such as SharePoint 2010, Visual Studio 2010, SQL Server 2008 R2, Silverlight 4, Windows 7, Windows Server 2008 R2 and many more. TechInsight is an event created by techies for techies. There is no marketing involved. It is indeed an experience to rediscover the uber-geek within you. Sign up today to secure your seat.

Techinsight - 2 Sessions

I presented two sessions there. Both of my sessions were in the TOP 5 sessions of Development track. Additionally, my session on Join got the highest ranking ever in Dev Track.

1) My Join, Your Join and Our Joins – The Story of Joins

Joins are very mysterious; there are many myths and confusions. This session will address all of them and also tell the story of how joins behave when it comes to performance. Does the order of tables in a Join matter? Are right and left joins any different from each other? Does a Join increase IO? When is an outer join not an outer join but an inner join? All these questions are answered and many more stories of Joins are included. Learn the simple tricks to get the maximum out of this tool.

Session Evaluations

Overall session rating 7.5
How valuable was the content presented 7.467741935
How effectively did the presenter communicate the content 7.596774194

2) Spatial Database – The Indexing Story

The world was believed to be flat but no more. Now SQL Server supports the spatial datatypes and many more functions. This session addresses the most vital part of Spatial datatypes and talks about how to improve the performance for the application, which is already blazing fast. We will look at how indexes are behaving with different spatial datatypes and how they can help to improve the performance and also learn the pitfalls to avoid them affecting performance.

Session Evaluations

Overall session rating 7.237288136
How valuable was the content presented 7.322033898
How effectively did the presenter communicate the content 7.457627119

I must express my special thanks to all the organizers of the event – Ervin, Walter, Raymond, and Patrick (in no particular order). They did an excellent job, and all the attendees of the event had a great time as well. The food was awesome, and the response was excellent. One month later, as I am writing this review, I am still thinking of the wonderful experience I had at this event. It makes me not want to miss this event in any year.

Techinsight - Event Organizers

This event is truly a TechEd-quality event in Malaysia. Kudos to the organizers and Microsoft.

Techinsight - Kuala Lumpur, Malaysia


SQL SERVER – The Self Join – Inner Join and Outer Join

Self Join has always been a noteworthy case. It is interesting to ask questions on self join in a room full of developers. I often ask: if there are three kinds of joins, i.e. Inner Join, Outer Join and Cross Join, what type of join is Self Join? The usual answer is that it is an Inner Join. In fact, it can be classified under any type of join. I have previously written about this in my interview questions and answers series. I have also mentioned this subject when I explained the joins in detail in SQL SERVER – Introduction to JOINs – Basic of JOINs.

When I mention that a Self Join can be an outer join, I often get a request for an example. I created an example of Self Join using the AdventureWorks database earlier, but it covered only the inner join case. Let us create a new example today, where we will see how a Self Join can be implemented as an Inner Join as well as an Outer Join.

Let us first create the same table for an employee. One of the columns in the table contains the ID of the manager, who is also an employee of the same company. This way, all the employees and their managers are present in the same table. If we want to find the manager of a particular employee, we need to use a self join.

USE TempDb
GO
-- Create a Table
CREATE TABLE Employee(
EmployeeID INT PRIMARY KEY,
Name NVARCHAR(50),
ManagerID INT
)
GO
-- Insert Sample Data
INSERT INTO Employee
SELECT 1, 'Mike', 3
UNION ALL
SELECT 2, 'David', 3
UNION ALL
SELECT 3, 'Roger', NULL
UNION ALL
SELECT 4, 'Marry',2
UNION ALL
SELECT 5, 'Joseph',2
UNION ALL
SELECT 7, 'Ben',2
GO
-- Check the data
SELECT *
FROM Employee
GO

We will now use inner join to find the employees and their managers’ details.

-- Inner Join
SELECT e1.Name EmployeeName, e2.name AS ManagerName
FROM Employee e1
INNER JOIN Employee e2
ON e1.ManagerID = e2.EmployeeID
GO

From the result set, we can see that all the employees who have a manager are visible. However, we are unable to find the top manager of the company, as he is not visible in our resultset. The reason is that the inner join filters him out: an inner join does not return any row whose ManagerID has no matching EmployeeID, and the top manager's ManagerID is NULL. Let us convert the Inner Join to an Outer Join and then see the resultset.

-- Outer Join
SELECT e1.Name EmployeeName, ISNULL(e2.name, 'Top Manager') AS ManagerName
FROM Employee e1
LEFT JOIN Employee e2
ON e1.ManagerID = e2.EmployeeID
GO

Once we convert Inner Join to Outer Join, we can see the Top Manager as well. Here we have seen how Self Join can behave as an inner join as well as an outer join.

As I said earlier, many of you know these details, but there are many who are still confused about this concept. I hope that this concept is clear from this post.


SQL SERVER – Outer Join Not Allowed in Indexed Views

I recently received an email with a question from one of my readers. I have already replied to his email, but I would still like to bring it to your attention and ask if you think I could have done any better with the example I gave.

The question was raised when the email sender read the white paper, Improving Performance with SQL Server 2008 Indexed Views. If you scroll all the way down through the said white paper, there are several questions and answers.

Q: Why can’t I use OUTER JOIN in an Indexed view?

A: Rows can logically disappear from an Indexed view based on OUTER JOIN when you insert data into a base table. This makes it relatively difficult to implement incremental updates of an OUTER JOIN view. In addition, the performance of the implementation would be slower than for views based on standard (INNER) JOIN.

The reader was confused by this answer and wanted me to explain it further. Here is the example which I have quickly put together to demonstrate the behavior described in the above statement.

USE tempdb
GO
-- Create Two Tables
CREATE TABLE BaseTable (ID1 INT, Col1 VARCHAR(100))
CREATE TABLE JoinedTable (ID2 INT, Col2 VARCHAR(100))
GO
-- Insert Values in Tables
INSERT INTO BaseTable (ID1,Col1)
SELECT 1,'First'
UNION ALL
SELECT 2,'Second'
GO
INSERT INTO JoinedTable (ID2,Col2)
SELECT 1,'First'
UNION ALL
SELECT 2,'Second'
UNION ALL
SELECT 3,'Third'
UNION ALL
SELECT 4,'Fourth'
GO
-- Use Outer Join
SELECT jt.*
FROM BaseTable bt
RIGHT OUTER JOIN JoinedTable jt ON bt.ID1 = jt.ID2
WHERE bt.ID1 IS NULL
GO

The script above returns the two rows of JoinedTable that have no match in BaseTable (ID2 = 3 and ID2 = 4). Now let us continue:

-- Now Insert Rows in Base Table
INSERT INTO BaseTable (ID1,Col1)
SELECT 3,'Third'
GO
-- You will notice that the Join now retrieves one row less
SELECT jt.*
FROM BaseTable bt
RIGHT OUTER JOIN JoinedTable jt ON bt.ID1 = jt.ID2
WHERE bt.ID1 IS NULL
GO
-- Clean up
DROP TABLE BaseTable
DROP TABLE JoinedTable
GO

After running this script, you will notice that as the base table gains one row, the result loses one row. Going back to the white paper mentioned earlier, I believe this kind of result set is expensive to maintain, which is why OUTER JOIN is not allowed in an Indexed View.

Let me know if you have a better example to demonstrate this behavior in the Outer Join.


SQL SERVER – Default Statistics on Column – Automatic Statistics on Column

During SQL Server training, I frequently notice confusion about Statistics. Many people have no idea how Statistics work, and there are many misconceptions about them. I recently had an interesting conversation with one attendee who believed that Statistics exist on a Column only if there is an Index on the Column, or if we explicitly create Statistics on it.

The truth is, Statistics can exist on a column even though there is no Index on it. If you have the auto-create and/or auto-update Statistics feature turned on for the SQL Server database, Statistics will be automatically created on the Column based on a few conditions. Please read my previously posted article, SQL SERVER – When are Statistics Updated – What triggers Statistics to Update, for the specific conditions when Statistics are updated.
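Before running any experiment, it helps to confirm how these options are currently set. A small sketch using the sys.databases catalog view, where a value of 1 means the option is on; it reports on whichever database you are connected to:

```sql
-- Show the statistics-related settings of the current database
SELECT name,
       is_auto_create_stats_on,        -- auto-create statistics
       is_auto_update_stats_on,        -- auto-update statistics
       is_auto_update_stats_async_on   -- asynchronous auto-update
FROM sys.databases
WHERE name = DB_NAME();
```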

Let us see one example where we could observe how Statistics is created automatically.

/*
In this example we will see how SQL Server automatically creates statistics
on columns used in a WHERE clause, even when there is no index on them
*/
USE AdventureWorks
GO
ALTER DATABASE AdventureWorks
SET AUTO_CREATE_STATISTICS ON;
GO
-- Create Table
CREATE TABLE StatsTable (ID INT,
FirstName VARCHAR(100),
LastName VARCHAR(100),
City VARCHAR(100))
GO
-- Insert One Hundred Thousand Records
INSERT INTO StatsTable (ID,FirstName,LastName,City)
SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID,
'Bob',
CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith'
ELSE 'Brown' END,
CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York'
WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino'
WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles'
ELSE 'Houston' END
FROM
sys.all_objects a
CROSS JOIN sys.all_objects b
GO
/* Now Check the statistics on the Table
As the table is just created there should not be any statistics on it
and will display "This object does not have any statistics or indexes."
*/
sp_helpstats 'StatsTable', 'ALL'
GO

As this example shows, when the auto-create Statistics setting is enabled in the database, SQL Server creates the necessary Statistics on the columns that are used in query predicates.

-- Run following few queries on the table
SELECT *
FROM StatsTable
WHERE ID = 110
GO
SELECT *
FROM StatsTable
WHERE City = 'Houston'
GO
/* Now Check the statistics on the Table again
You will see two different statistics created on respective columns
used in WHERE clause.
*/
sp_helpstats 'StatsTable', 'ALL'
GO

/* Now let us try with multiple Column in WHERE clause */
SELECT *
FROM StatsTable
WHERE ID = 110 AND City = 'Houston' AND FirstName = 'Bob'
GO
/* Now Check the statistics on the Table again
You will see it will create statistics for the column
used in WHERE clause; if it was not created earlier.
*/
sp_helpstats 'StatsTable', 'ALL'
GO
-- Clean up Database
DROP TABLE StatsTable
GO
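If you prefer catalog views to sp_helpstats, the following sketch lists each statistics object on the table together with its column and whether it was created automatically. It assumes the StatsTable from the script above still exists, so run it before the clean-up step. Automatically created statistics usually have names starting with _WA_Sys_.

```sql
-- List statistics on StatsTable and flag the auto-created ones
-- (run before DROP TABLE StatsTable)
SELECT s.name AS StatsName,
       c.name AS ColumnName,
       s.auto_created,
       s.user_created
FROM sys.stats s
INNER JOIN sys.stats_columns sc
    ON sc.object_id = s.object_id AND sc.stats_id = s.stats_id
INNER JOIN sys.columns c
    ON c.object_id = sc.object_id AND c.column_id = sc.column_id
WHERE s.object_id = OBJECT_ID('dbo.StatsTable')
ORDER BY s.stats_id;
```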


SQL SERVER – Merge Operations – Insert, Update, Delete in Single Execution

This blog post is written in response to T-SQL Tuesday hosted by Jorge Segarra (aka SQLChicken).

I have been actively using Merge operations in my development. However, I have found from my consultancy work and from friends that these amazing operations are not utilized most of the time. Here is my attempt to bring the usefulness of the Merge Operation to the surface one more time.

MERGE is a new feature in SQL Server 2008 that provides an efficient way to perform multiple DML operations. In earlier versions of SQL Server, we had to write separate statements to INSERT, UPDATE, or DELETE data based on certain conditions. Now, using the MERGE statement, we can include the logic of such data changes in one statement: when the data is matched it is updated (or deleted), and when it is unmatched it is inserted.

One of the most important advantages of the MERGE statement is that all the data is read and processed only once. In earlier versions, three different statements had to be written to process three different activities (INSERT, UPDATE or DELETE); with the MERGE statement, all of these activities can be done in one pass over the table.

I have written about these Merge Operations earlier in my blog post SQL SERVER – 2008 – Introduction to Merge Statement – One Statement for INSERT, UPDATE, DELETE. One of the readers asked me how we can know that this operator does everything in a single pass and is not called multiple times.

Let us run the same example which I have used earlier; I am listing the same here again for convenience.

-- Let's create StudentDetails and StudentTotalMarks and insert some records.
USE tempdb
GO
CREATE TABLE StudentDetails
(
StudentID INTEGER PRIMARY KEY,
StudentName VARCHAR(15)
)
GO
INSERT INTO StudentDetails
VALUES(1,'SMITH')
INSERT INTO StudentDetails
VALUES(2,'ALLEN')
INSERT INTO StudentDetails
VALUES(3,'JONES')
INSERT INTO StudentDetails
VALUES(4,'MARTIN')
INSERT INTO StudentDetails
VALUES(5,'JAMES')
GO
CREATE TABLE StudentTotalMarks
(
StudentID INTEGER REFERENCES StudentDetails,
StudentMarks INTEGER
)
GO
INSERT INTO StudentTotalMarks
VALUES(1,230)
INSERT INTO StudentTotalMarks
VALUES(2,255)
INSERT INTO StudentTotalMarks
VALUES(3,200)
GO
-- Select from Table
SELECT *
FROM StudentDetails
GO
SELECT *
FROM StudentTotalMarks
GO
-- Merge Statement
MERGE StudentTotalMarks AS stm
USING
(SELECT StudentID,StudentName FROM StudentDetails) AS sd
ON stm.StudentID = sd.StudentID
WHEN MATCHED AND stm.StudentMarks > 250 THEN DELETE
WHEN MATCHED THEN UPDATE SET stm.StudentMarks = stm.StudentMarks + 25
WHEN NOT MATCHED THEN
INSERT
(StudentID,StudentMarks)
VALUES(sd.StudentID,25);
GO
-- Select from Table
SELECT *
FROM StudentDetails
GO
SELECT *
FROM StudentTotalMarks
GO
-- Clean up (drop the referencing table first because of the foreign key)
DROP TABLE StudentTotalMarks
GO
DROP TABLE StudentDetails
GO

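The MERGE statement performs very well and gives the expected result: student 2, whose marks were above 250, is deleted; students 1 and 3 each gain 25 marks; and students 4 and 5 are inserted with 25 marks.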

Let us check the execution plan for the merge operator.

Let us evaluate the execution plan for the Table Merge Operator only.

We can clearly see that the Number of Executions property shows the value 1. This makes it quite clear that the Merge Operation completes the Insert, Update and Delete operations in a single pass.
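Another way to watch one statement perform all three actions is the OUTPUT clause with $action, which reports what MERGE did to each row. The sketch below is a variation on the example above, not the original script: it uses hypothetical temp tables #Details and #Marks with the same sample data and no foreign key.

```sql
-- Hypothetical temp tables with the same sample data as above
CREATE TABLE #Details (StudentID INT PRIMARY KEY, StudentName VARCHAR(15));
CREATE TABLE #Marks (StudentID INT, StudentMarks INT);
INSERT INTO #Details VALUES (1,'SMITH'), (2,'ALLEN'), (3,'JONES'), (4,'MARTIN'), (5,'JAMES');
INSERT INTO #Marks VALUES (1,230), (2,255), (3,200);

-- Same MERGE logic; OUTPUT returns one row per affected row
MERGE #Marks AS stm
USING (SELECT StudentID FROM #Details) AS sd
ON stm.StudentID = sd.StudentID
WHEN MATCHED AND stm.StudentMarks > 250 THEN DELETE
WHEN MATCHED THEN UPDATE SET stm.StudentMarks = stm.StudentMarks + 25
WHEN NOT MATCHED THEN
INSERT (StudentID, StudentMarks)
VALUES (sd.StudentID, 25)
OUTPUT $action AS MergeAction,
       COALESCE(inserted.StudentID, deleted.StudentID) AS StudentID,
       deleted.StudentMarks AS OldMarks,
       inserted.StudentMarks AS NewMarks;

-- Clean up
DROP TABLE #Marks;
DROP TABLE #Details;
```

With this data, the output contains two UPDATE rows (students 1 and 3), one DELETE row (student 2) and two INSERT rows (students 4 and 5), all produced by the single statement.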

I strongly suggest that you all use this operation, where possible, in your development. I have seen this operation implemented in many data warehousing applications.
