SQL SERVER – The Self Join – Inner Join and Outer Join

Self Join has always been an note-worthy case. It is interesting to ask questions on self join in a room full of developers. I often ask – if there are three kind of joins, i.e.- Inner Join, Outer Join and Cross Join; what type of join is Self Join? The usual answer is that it is an Inner Join. In fact, it can be classified under any type of join. I have previously written about this in my interview questions and answers series. I have also mentioned this subject when I explained the joins in detail over SQL SERVER – Introduction to JOINs – Basic of JOINs.

When I mention that Self Join can be the outer join, I often get a request for an example for the same. I have created example using AdventureWorks Database of Self Join earlier, but that was meant for inner join as well. Let us create a new example today, where we will see how Self Join can be implemented as an Inner Join as well as Outer Join.

Let us first create the same table for an employee. One of the columns in the same table contains the ID of manger, who is also an employee for the same company. This way, all the employees and their managers are present in the same table. If we want to find the manager of a particular employee, we need use self join.

USE TempDb
GO
-- Create a Table
CREATE TABLE Employee(
EmployeeID INT PRIMARY KEY,
Name NVARCHAR(50),
ManagerID INT
)
GO
-- Insert Sample Data
INSERT INTO Employee
SELECT 1, 'Mike', 3
UNION ALL
SELECT 2, 'David', 3
UNION ALL
SELECT 3, 'Roger', NULL
UNION ALL
SELECT 4, 'Marry',2
UNION ALL
SELECT 5, 'Joseph',2
UNION ALL
SELECT 7, 'Ben',2
GO
-- Check the data
SELECT *
FROM Employee
GO

selfjoini1 SQL SERVER   The Self Join   Inner Join and Outer Join

We will now use inner join to find the employees and their managers’ details.

-- Inner Join
SELECT e1.Name EmployeeName, e2.name AS ManagerName
FROM Employee e1
INNER JOIN Employee e2
ON e1.ManagerID = e2.EmployeeID
GO

selfjoini2 SQL SERVER   The Self Join   Inner Join and Outer Join

From the result set, we can see that all the employees who have a manager are visible. However we are unable to find out the top manager of the company as he is not visible in our resultset. The reason for the same is that due to inner join, his name is filtered out. Inner join does not bring any result which does not have manager id. Let us convert Inner Join to Outer Join and then see the resultset.

-- Outer Join
SELECT e1.Name EmployeeName, ISNULL(e2.name, 'Top Manager') AS ManagerName
FROM Employee e1
LEFT JOIN Employee e2
ON e1.ManagerID = e2.EmployeeID
GO

selfjoini3 SQL SERVER   The Self Join   Inner Join and Outer Join

Once we convert Inner Join to Outer Join, we can see the Top Manager as well. Here we have seen how Self Join can behave as an inner join as well as an outer join.

As I said earlier, many of you know these details, but there are many who are still confused about this concept. I hope that this concept is clear from this post.

Reference: Pinal Dave (http://blog.SQLAuthority.com)

53 thoughts on “SQL SERVER – The Self Join – Inner Join and Outer Join

  1. SELECT e1.Name EmployeeName, ISNULL(e2.Name, ‘Top Manager’) AS ManagerName
    FROM Employee e1
    LEFT JOIN Employee e2
    ON e1.ManagerID = e2.EmployeeID

    Like

  2. Getting Group By Count In Self Join

    SELECT e1.ProductModelID ,count(*) TotalProductModelID,avg(e1.ProductDescriptionID ) as avgProductDescriptionID
    FROM Production.ProductModelProductDescriptionCulture e1
    inner JOIN Production.ProductModelProductDescriptionCulture e2
    ON e1.ProductModelID = e2.ProductModelID

    group by e1.ProductModelID

    Like

  3. I don’t quite understand this. A ‘self join’ is a concept. Just like a ‘join’ on multiple tables. It is divorced from any notion of the ‘type’ of join. They are separate concerns.

    The ‘self’ part refers to how the tables relate and the ‘inner’ or ‘outer’ part refers to the implementation you employ depending on business requirements.

    Like

  4. Well said Kam. This example is as much an exercise in realizing the need for an outer join as it is an exercise is the concept of a “self join”.

    Separate the two facts:

    1: The outer join is required because a record we desire in the result set is being trimmed by using an inner join. When joining two sets of data, an inner join requires the field(s) being joined on to have corresponding values in both sets of data for those records to show in the result set.

    We desire to see the complete list of employees, but one employee has a NULL value in the ManagerID field being joined on….so that employee record would be trimmed from the result set using an inner join. Hence, the need for the outer join.

    Use set based thinking – don’t get caught up on the fact that the two sets of related data are from the same table.

    2: The “self join” concept is expanding the database programmer’s thinking to include the fact that you can join a table to itself, and will need to if the data in the table is hierarchical. (Employee – Manager, Child – Parent, etc…)

    Like

  5. Hi All,

    How can we join two tables which having no common column. For ref:
    Table 1:
    ID, Name

    Table 2:
    Phone, Address

    I want to create a view to produce the data as like:

    Table:
    ID, NAme, Phone, Address

    pls help me regarding this.

    Sanjay

    Like

  6. Pingback: SQL SERVER – SELF JOIN Not Allowed in Indexed View – Limitation of the View 9 Journey to SQL Authority with Pinal Dave

  7. i’m looking for a query which can fetch all the employee in-detail where employee id is given as an input and the expected result should be his name, id[team member], his manager name, id[team lead], again his manager name, id[project manager], again his manager name, id[director]. the result should be from top hierarchy or all level of managers.

    Like

  8. Hi, i wold like to know how can we retrive Manager who is having more than 2 employee reference . According to the above table it should display

    mary – david
    joseph – david
    ben – david

    how can we do this using self join

    Like

  9. Pingback: SQL SERVER – Puzzle to Win Print Book – Write T-SQL Self Join Without Using LEAD and LAG « Journey to SQLAuthority

  10. how can i change product category and product subcategory and product tables into one parent child table using select statement is there a way to do these

    Like

  11. Hi,

    I want to know what is the impact of INNER JOIN with OUTER JOIN / OUTER APPLY. Also need to know the impact of adding the inner join inside the outer apply instead of using it outside the outer apply

    select few_columns
    from table1 a
    left outer join table2 b
    on b.col1 = a.col1
    inner join table3 c
    on c.col2 = b.col2

    or

    select few_columns
    from table1 a
    outer apply
    (select col1, col2, col3
    from table2 b
    where b.col1 = a.col1
    )z
    inner join table3 c
    on c.col2 = z.col2

    or

    select few_columns
    from table1 a
    outer apply
    (select col1, col2, col3
    from table2 b
    inner join table3 c
    on c.col2 = b.col2
    where b.col1 = a.col1
    )z

    Please let me know the difference in the performance & the resultset

    Like

  12. You dont need to use IsNULL when ur using outerjoin
    select a1.name as employee, a2.name as ManagerName from employee as a1
    left join employee as a2
    on
    a1.managerid = a2.employeeid

    Like

  13. My query is as below :

    select b.Booking_id,booked_by,Datetime,FP_no,Name,co_name,
    Booking_type,occupation,Nationality,Address,email_id,
    Pan_no,sitting_arrang,state,city,pin_code,Booking_from,Booking_to,
    Venue_type,Type_of_function,Food_Pickup_hour,Food_Pickup_min,Min_people,
    Cost_per_plate,Total,Av_item_total,Grant_amount,Advance,Due,Payment_mode,card_type,
    Card_no,av.av_item,av.av_rate, food.Food_menu
    from Booking_details b, av_details av, booking_food_details food
    where b.Booking_id=@Booking_id
    and b.Booking_id=av.Booking_id
    and b.Booking_id=food.Booking_id

    Now the problem here is that av_details and booking_food_details have different number of rows,Suppose
    booking_food_details has a,b,c and, av_details has d,e,f,g,h

    so when I run it with data it shows me 5 lines of entry and av_details rows repeats itself 5 times, like :

    booking_food_details av_details
    d a
    e b
    f c
    g a
    h b
    Rather it should display:

    booking_food_details av_details
    d a
    e b
    f c
    g
    h

    Like

  14. hello all,
    I have data in following format

    id productID price1 price2 price3
    1 100 4.0 null null
    2 100 NULL 8 null
    3 100 null null 12
    4 108 18 null NULL
    5 108 null 28 null
    6 108 null null 20

    However requirement of data is in following data..

    id productID price1 price2 price3 IDLIst
    1 100 4.0 8 12 1,2,3
    4 108 18 28 20 4,5,6

    Thanks In Advance

    Like

  15. Pingback: SQL SERVER – Weekly Series – Memory Lane – #037 | Journey to SQL Authority with Pinal Dave

  16. There are specific breeds lower susceptible to chocolate however never give your dog chocolate
    by choice. Often described as “a large dog in the body of a small dog,” it iis simipar to the Norwegian Buhund and related to modern Welshi Corgis as well
    aas Shetland Sheepdogs. Territorial aggression is common in
    canines.

    Like

  17. Select A.ID, A.FirstColumn, A.SecondColumn,SUM(B.FirstColumn) + SUM(B.SecondColumn) As Total
    From Employee A, Employee B
    Where B.ID <= A.ID
    Group By A.ID, A.FirstColumn, A.SecondColumn

    Like

  18. NEED HELP

    Suppose I have 2 Tables

    1. Transactions(Transactionid(pk),Transactiontypeid,Relatedtransactionid(which is also a transaction id from Transactions table))

    2. Transactiontypes (Transactiontypeid,Transactiontype_name)

    I want the result as follow:

    (Transactionid,Transaction name,Related TransactionType_name)

    PLZ suggest the Query.

    Like

  19. Dear Pinal,

    btw…You are the MAN of SqlServer help – I have received benefit from your posts many many times! Thanks you!

    I have a little more difficult self-join problem I’d like to see if you could solve.

    Below reflects a real situation for a source table in an application that I need to use for correcting data downstream to find the currently assigned id. It is a table that logs party merges that is a simple “from” party id column and a “to” party id column.

    Where its different than the typical employee/manager situation is that I need to be able to find the top last id in the chain of merges so that I can reference the correct party id when tying them to transactions. The source application has historically sometimes not merge correctly leaving child transactions still tied to the original party id. Therefore the data warehouse has to correct these as it assigns the dimensionality to the transactions coming in. The current process uses a TSql function that takes in an id and looks up the top parent for the returned value. But that’s causing performance problems so I need to re-think the approach to make it set-based so that it can be derived either in the query through join to a CTE or maintained as a separate lookup table for the join. Bottom line is I need turn a list of all single level from/to party ids into a new list of the every party_id involved in a merge along with its topmost parent party id.

    — Below builds sample party merge table and displays with typical
    — recursive outer join

    USE TempDb
    GO
    — Create a Table
    CREATE TABLE Party_merge(
    FromPartyID INT PRIMARY KEY,
    ToPartyID INT
    )
    GO
    — Insert Sample Data
    — 3 gets gets 4 children (1, 2, 8, 9) assigned to merge to it and then itself gets merged to id=10
    — 2 is the new parent for 4, 5, and 7
    — 11 gets 12 assigned. Only parent 12 has only one child (11)

    — Note:
    — 14 gets parent assigned as 14 (due to source system issue)

    INSERT INTO Party_merge
    SELECT 1, 3
    UNION ALL
    SELECT 2, 3
    UNION ALL
    SELECT 14, 14
    UNION ALL
    SELECT 4, 2
    UNION ALL
    SELECT 5, 2
    UNION ALL
    SELECT 7, 2
    UNION ALL
    SELECT 8, 3
    UNION ALL
    SELECT 9, 3
    UNION ALL
    SELECT 3, 10
    UNION ALL
    SELECT 11, 12
    GO
    — Check the data

    SELECT *
    FROM Party_merge
    ORDER BY FromPartyID, ToPartyID
    GO

    —————————————————–
    SELECT p1.FromPartyID, p1.ToPartyId
    FROM Party_merge p1
    LEFT OUTER JOIN Party_merge p2
    ON p1.ToPartyID = p2.FromPartyID
    WHERE p1.FromPartyID p1.ToPartyID
    —————————————————

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s