SQL SERVER – Introduction to JOINs – Basic of JOINs

The launch of Gandhinagar SQL Server User Group was a tremendous, astonishing success! It was overwhelming to see a large gathering of enthusiasts looking up to me (I was the Key Speaker) eager to enhance their knowledge and participate in some brainstorming discussions. Some members of User Group had requested me to write a simple article on JOINS elucidating its different types.

INNER JOIN

This join returns rows when there is at least one match in both the tables.
SQL SERVER - Introduction to JOINs - Basic of JOINs inner join

OUTER JOIN

There are three different Outer Join methods.

LEFT OUTER JOIN
This join returns all the rows from the left table in conjunction with the matching rows from the right table. If there are no columns matching in the right table, it returns NULL values.
SQL SERVER - Introduction to JOINs - Basic of JOINs left join

RIGHT OUTER JOIN
This join returns all the rows from the right table in conjunction with the matching rows from the left table. If there are no columns matching in the left table, it returns NULL values.
SQL SERVER - Introduction to JOINs - Basic of JOINs right join

FULL OUTER JOIN
This join combines left outer join and right outer join. It returns row from either table when the conditions are met and returns null value when there is no match.
SQL SERVER - Introduction to JOINs - Basic of JOINs outer join

CROSS JOIN

This join is a Cartesian join that does not necessitate any condition to join. The resultset contains records that are multiplication of record number from both the tables.

SQL SERVER - Introduction to JOINs - Basic of JOINs cross join - half

Additional Notes related to JOIN:

The following are three classic examples to display where Outer Join is useful. You will notice several instances where developers write query as given below.

SELECT t1.*
FROM Table1 t1
WHERE t1.ID NOT IN (SELECT t2.ID FROM Table2 t2)
GO

The query demonstrated above can be easily replaced by Outer Join. Indeed, replacing it by Outer Join is the best practice. The query that gives same result as above is displayed here using Outer Join and WHERE clause in join.

/* LEFT JOIN - WHERE NULL */
SELECT t1.*,t2.*
FROM Table1 t1
LEFT JOIN Table2 t2 ON t1.ID = t2.ID
WHERE t2.ID IS NULL

SQL SERVER - Introduction to JOINs - Basic of JOINs left join null

The above example can also be created using Right Outer Join.

SQL SERVER - Introduction to JOINs - Basic of JOINs right join null

NOT INNER JOIN
Remember, the term Not Inner Join does not exist in database terminology. However, when full Outer Join is used along with WHERE condition, as explained in the above two examples, it will give you exclusive result to Inner Join. This join will give all the results that were not present in Inner Join.

SQL SERVER - Introduction to JOINs - Basic of JOINs outer join null

You can download the complete SQL Script here, but for the sake of complicity I am including the same script here.

USE AdventureWorks
GO
CREATE TABLE table1
(ID INT, Value VARCHAR(10))
INSERT INTO Table1 (ID, Value)
SELECT 1,'First'
UNION ALL
SELECT 2,'Second'
UNION ALL
SELECT 3,'Third'
UNION ALL
SELECT 4,'Fourth'
UNION ALL
SELECT 5,'Fifth'
GO
CREATE TABLE table2
(ID INT, Value VARCHAR(10))
INSERT INTO Table2 (ID, Value)
SELECT 1,'First'
UNION ALL
SELECT 2,'Second'
UNION ALL
SELECT 3,'Third'
UNION ALL
SELECT 6,'Sixth'
UNION ALL
SELECT 7,'Seventh'
UNION ALL
SELECT 8,'Eighth'
GO
SELECT *
FROM Table1
SELECT *
FROM Table2
GO
USE AdventureWorks
GO
/* INNER JOIN */
SELECT t1.*,t2.*
FROM Table1 t1
INNER JOIN Table2 t2 ON t1.ID = t2.ID
GO
/* LEFT JOIN */
SELECT t1.*,t2.*
FROM Table1 t1
LEFT JOIN Table2 t2 ON t1.ID = t2.ID
GO
/* RIGHT JOIN */
SELECT t1.*,t2.*
FROM Table1 t1
RIGHT JOIN Table2 t2 ON t1.ID = t2.ID
GO
/* OUTER JOIN */
SELECT t1.*,t2.*
FROM Table1 t1
FULL OUTER JOIN Table2 t2 ON t1.ID = t2.ID
GO
/* LEFT JOIN - WHERE NULL */
SELECT t1.*,t2.*
FROM Table1 t1
LEFT JOIN Table2 t2 ON t1.ID = t2.ID
WHERE t2.ID IS NULL
GO
/* RIGHT JOIN - WHERE NULL */
SELECT t1.*,t2.*
FROM Table1 t1
RIGHT JOIN Table2 t2 ON t1.ID = t2.ID
WHERE t1.ID IS NULL
GO
/* OUTER JOIN - WHERE NULL */
SELECT t1.*,t2.*
FROM Table1 t1
FULL OUTER JOIN Table2 t2 ON t1.ID = t2.ID
WHERE t1.ID IS NULL OR t2.ID IS NULL
GO
/* CROSS JOIN */
SELECT t1.*,t2.*
FROM Table1 t1
CROSS JOIN Table2 t2
GO
DROP TABLE table1
DROP TABLE table2
GO

I hope this article fulfills its purpose. I would like to have feedback from my blog readers. Please suggest me where do you all want me to take this article next.

Reference : Pinal Dave (https://blog.sqlauthority.com)

Best Practices, Database, SQL Joins, SQL Scripts, SQL User Group
Previous Post
SQL SERVER – FIX : ERROR : The SQL Server System Configuration Checker cannot be executed due to WMI configuration on the machine Error:2147749896 (0×80041008)
Next Post
SQL SERVER – Check if Current Login is Part of Server Role Member

Related Posts

402 Comments. Leave new

  • Love it. Thanks.

    Reply
  • Really a nice article …

    One thing i need to clarify what is the difference b/w JOINING and NESTED QUERIES in terms of performance wise and internal execution.

    Thanks.

    Reply
  • Nilesh Molankar
    January 25, 2010 11:02 am

    Hello Pinal\All,
    Any update on the question posted by “Noel Stefan Stoyanoff” on August 25, 2009 at 2:00 am.
    ——————————————————
    I have a question concerning a statement you made above where rather than writing a query as
    –QUERY 1
    SELECT t1.*
    FROM Table1 t1
    WHERE t1.ID NOT IN (SELECT t2.ID FROM Table2 t2)

    I should write it as
    –QUERY 2
    SELECT t1.*,t2.*
    FROM Table1 t1
    LEFT JOIN Table2 t2 ON t1.ID = t2.ID
    WHERE t2.ID IS NULL

    as it is a best practice. Does this mean that the later statement will run more quickly than the former or are you calling it a best practice from a legibility stand point?
    ——————————————————

    The estimated execution plan for Query 1 and Query 2 do not look very different.
    When dealing with large tables which approach would prove better from performace point of view.
    Please let us know.

    Reply
    • Marko Parkkola
      January 26, 2010 2:06 am

      NOT IN is always a very bad operator. Especially if Table2.ID is not indexed. You will end up nested loops which takes a lot of CPU time. So execution goes like:

      for-each ID in Table1
      for-each ID in Table2
      if Table1.ID Table2.ID
      AddToResults Table1.ID

      Reply
      • What if Table 2 only has 10 records?

      • Execution takes count(Table1.ID) * count(Table2.ID) * t (where t = time to do the actual comparison operation and store results) to finish. In any case.

        It might be that Query Optimizer decides to do nested loop even with LEFT JOIN. Sometimes it’s faster than other join types. But I think that with NOT IN operator you lock yourself into a nested loop and optimizer can’t optimize it at all.

        And remember to use indexes if possible. My friend got the task to speed up a Oracle database which had NOT IN operator used with unindexed fields. He put two indexes to the tables and query execution time dropped down from something like 2 minutes to 30 seconds. Btw. it took almost 2 days from the DB to create those indexes ;)

        But if anyone has better knowledge about this I’m happy to be corrected.

  • Nilesh Molankar
    January 26, 2010 10:07 am

    Thank you Marko..

    Reply
  • Really it’s a nice article and easy to learn.
    Thank you Pinal.
    Keep writting…

    Reply
  • Hi Pinal,
    This is really an excellent article on joins. As am a beginner it cleared my confusion as it covered the basic knowledge on joins which is absolutely brilliant.

    Regards,
    Vish.

    Reply
    • Hi I am Irfan Ali and i was much doupt about joins in sql but after studing from this site my doupt was removed .So finally thanks of ur site.

      Reply
  • hi pinal,

    Excellent article appreciated.

    I have a question,which type join will be used here.

    Select T1.* ,T2.* from T1,T2
    where T1.col1=T2.col1

    Select T1.* ,T2.* from T1,T2
    where T1.col1T2.col1

    Can u explain few details on this type of queries.

    Thanks & Regards,

    Sameer

    Reply
    • Hello Sameer,

      Your second select statement is not correct.
      The type of join to be used, depends on the required output not on the output columns or where clause.

      Regards,
      Pinal Dave

      Reply
    • @sameer

      The first query is an inner join.

      The second statement is invalid.

      Reply
    • Considering that there is a WHERE clause the first statement is indeed an INNER JOIN or a LEFT OUTER JOIN which omits the rows having NULL values for the 2nd table, but if you will omit the WHERE clause it is a cross-join or cartesian product between the first and second table.

      You could imagine this the other way around:

      If the WHERE clause would be “where T1.col1 \T2.col1” the first statement would become a RIGHT OUTER JOIN where you omit the rows with any NULL values or a cartesian product between Table1 and Table2 where you omit the rows where the IDs have same value.

      This is how I see it.

      Please correct me if I’m wrong.

      PS: 2nd statement probably contains the less-than greater-than ( Not equal to ) statement which can’t be rendered on the website ;).

      Reply
  • Very good article regarding joins.
    Now though my question is not relevant here but I don’t know how to ask questions from you (Pinal).

    I am having sql server 2005 enterprise edition with sp3 which works smoothly but at some time it stops accepting new connections and the problem gets resolved when I restart the service. I have tried my level best to find out a solution for it but still not successful. Then one of the instructor of CBT Nuggets for the course of 70-432 mentioned your name.

    Now I am waiting for your answer, what could be the possible reason due to which I am having this problem.

    please help me.

    Reply
  • Hello Tahir,

    When server stop accepting new connections, does it return any error or just stop responding? If you get error that please let us know that.

    Regards,
    Pinal Dave

    Reply
  • Thanks a lot for responding.

    The server does not return any error, it just stop responding, cause I always check the error log when it stops accepting connection.

    Reply
  • how do i use outer joins in more than 3 tables?

    Reply
  • nice presentation. really a great oppertunity to learn the concept indetail. thanks a lot.

    Reply
  • Simply Superb , Nice Explanation
    Thank you very much

    Reply
  • hi, i have one problem

    table1

    salesperson,region,customercreated,createddate,closeddate,

    table2

    salesperson,region,targetassigned

    i want to join this 2 table so that i can view what is the target given to salesperson,region & how much he has achieved.
    target achieved is computed by sum of customercreated
    which join i must use.

    thanks in advance.

    Reply
    • Brian Tkatch
      March 9, 2010 6:44 pm

      @irfan

      how’s this?

      SELECT
      table1.salesperson,
      table1.region,
      table2.targetassigned,
      SUM(table1.customercreated) targetachieved,
      MIN(createddate) started,
      MAX(closeddate) completed
      FROM
      table1,
      Table2
      WHERE
      table2.salesperson = table1.salesperson
      AND table2.region = table1.region
      GROUP BY
      table1.salesperson,
      table1.region,
      table2.targetassigned;

      Reply
      • Thanks Brain Tkatch

        this is exactly what i had tried but in closed date (for every row)i’m getting null value….

        Is this because Some closeddate are null in original table as they are not yet confirmed or under progress.

  • i want date also in resultant table so that i can know when
    has started & completed it

    thanks

    Reply
  • Hi

    Excellent artical … i learnt joins very easily using these given examples.. nice way of explaning the joins using venn diagrams.

    Thanks a lot…

    Reply
  • Hello Pinal..

    I read ur JOIN article…

    But in LEFT OUTER JOIN

    If there are no columns matching in the right table, it returns NULL values.

    I didn’t get it…can u explain me in depth…

    Thanks..

    Reply
  • very good approach for joins using venn diagram.

    Reply
  • simply superb

    Reply
  • hi,
    i have one problem

    there are 2 table

    1) impressions (id, pub_id, ad_id, ip, click)
    2) clicks (id, pub_id, ad_id, ip)

    i want to find the following in single query

    1) total number of impressions
    2) total number of clicks
    3) total number of clicks where impressions IP matches clicks IP
    4) total number of clicks where impressions IP does not match clicks IP

    plz help me
    thanks

    Reply
    • Starts with this

      select
      count(*),
      count(c.id),
      count(c.ip),
      sum(case when c.ip is null then 1 else end)
      from impressions as i left join clicks as c
      on i.id=c.id

      Reply

Leave a Reply