Question: What is a Merge Join?
Answer: The Merge Join transformation produces an output by joining two sorted data sets using a FULL, LEFT, or INNER join. It requires that both inputs be sorted and that the joined columns have matching metadata. For example, you cannot join a column that has a numeric data type with a column that has a character data type. If the columns have a string data type, the length of the column in the second input must be less than or equal to the length of the column in the first input with which it is merged.
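As a rough illustration of the sorted-input requirement, the two inputs could be fed by source queries that already return rows ordered by the join column. This is only a sketch: it assumes both inputs read from the pubs sample database used elsewhere in this post, and the column lists are chosen purely for illustration.

```sql
-- Hypothetical source query for the first (left) Merge Join input,
-- sorted on the join column au_id
SELECT au_id, au_lname, au_fname
FROM authors
ORDER BY au_id

-- Hypothetical source query for the second (right) Merge Join input,
-- sorted on the same join column
SELECT au_id, title_id
FROM titleauthor
ORDER BY au_id
```

When the sort is done in the source query like this, the package also has to be told that the data is sorted, typically by setting IsSorted on the source output and SortKeyPosition on the join column in the Advanced Editor. Otherwise a Sort transformation is needed in front of each input.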
USE pubs
GO
SELECT a.au_id
FROM authors a
JOIN titleauthor b ON a.au_id = b.au_id
OPTION (MERGE JOIN) -- query hint: force the merge join algorithm
GO
A merge join itself is very fast, but it can be an expensive choice if sort operations are required. However, if the data volume is large and the desired data can be obtained presorted from existing B-tree indexes, merge join is often the fastest available join algorithm. Because the query optimizer usually selects the best execution plan for a given SELECT statement, it is rarely necessary to force a particular join type, but sometimes it can be useful. You can force the desired join type with a query hint in the OPTION clause.
The query above benefits from a merge join because authors and titleauthor both have a primary key index on au_id. SQL Server creates a clustered index for a primary key by default, so the rows in each table are physically stored in au_id order and no extra sort is needed.
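To check which join algorithm is actually used, one option is to ask SQL Server for the estimated plan as text. The sketch below does that for the same two pubs tables, and it also shows the join-level hint form (INNER MERGE JOIN) as an alternative to the OPTION clause. Treat it as an illustration rather than a recommendation. A join-level hint also fixes the join order, which may not be what you want in a larger query.

```sql
USE pubs
GO
-- Return the estimated execution plan as text instead of running the query.
-- SET SHOWPLAN_TEXT must be the only statement in its batch.
SET SHOWPLAN_TEXT ON
GO
-- Join-level hint: request a merge join for this specific join
SELECT a.au_id
FROM authors a
INNER MERGE JOIN titleauthor b ON a.au_id = b.au_id
GO
-- Switch back to normal execution
SET SHOWPLAN_TEXT OFF
GO
```

If the hint is honored, the plan text should list a Merge Join operator for the join between authors and titleauthor. Running the same script without the MERGE keyword shows which algorithm the optimizer picks on its own.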
Reference: Pinal Dave (https://blog.sqlauthority.com)
2 Comments
Great Post Dave!! I have used this a ton lately in SSIS!!
Is it the same as:
SELECT a.au_id
FROM authors a
JOIN titleauthor b ON a.au_id = b.au_id
ORDER BY a.au_id