SQL SERVER – INNER JOIN Returning More Records than Exists in Table

SQL SERVER - INNER JOIN Returning More Records than Exists in Table join

I blog and engage with the community because it gives me satisfaction when someone resolves an issue. A few days ago, I blogged about a DBA who began his first day at a new company and could not find out where the installation summary file was. He was very happy when I featured his story on our blog. Today he asked me another question and when I received his question my first reaction was – not possible. Later I said, may be possible, and when he shared more information, I said of course it is possible and natural. Let us go over these conversations in the exact order of how they occurred.

(I have modified the emails to chat format as it is easier to understand)

Jeff: Pinal, help needed. My query is returning more data than I need.
Pinal: Well, check your WHERE clause, JOINS etc.
Jeff: The problem is that INNER JOIN is returning more records than exist in the table.
Pinal: Not possible. How can a query return data that does not exist. Are you sure that you are connected to the correct database and running the query on the same server where you are validating the data?
Jeff: Yes, yes, everything is in the right place. Can Inner Join ever return more data than exists in the table?
Pinal: Well, maybe. Here is the story. Inner Join can for sure return more records than exists in the table, but it cannot return records that do not exist. Does that make sense?
Jeff: No.
Pinal: Okay, in simple words, if your table has three rows (values 1, 2, 3), your inner join can return 10 rows but it cannot return you the value 4 as part of the result.
Jeff: In simple English please…
Pinal: Honestly, I cannot make it more simple than what I said just now. Here is the example for you.
(Jeff goes through example and after few minutes)
Jeff: Exactly, I had no equal condition which did this. You helped me again.
Pinal: You’re welcome!
(After few minutes…)
Jeff: Pinal, help needed…

Well, this goes on. Anyway, here is the script that I sent him.

Let us create a scenario. We will create three sample tables and insert values in it.

USE TempDB
GO
-- Sample Table
CREATE TABLE FirstTable (Col1 INT)
CREATE TABLE SecondTable (Col1 INT)
CREATE TABLE ThirdTable (Col1 INT)
GO
INSERT INTO FirstTable (Col1)
VALUES (1), (2), (3), (NULL)
GO
INSERT INTO SecondTable (Col1)
VALUES (1), (2), (3), (NULL)
GO
INSERT INTO ThirdTable (Col1)
VALUES (2), (2), (2), (2), (2), (NULL)
GO

Here you will see the normal use of INNER JOIN where a query returns the results as the number of rows or fewer than the number of rows in the table. This happens because the data is unique in both of the tables.

Query 1: Inner Join with Equal to Sign

-- Query 1
SELECT f.Col1 fCol1
FROM FirstTable f
INNER JOIN SecondTable s ON s.Col1 = f.col1
GO

SQL SERVER - INNER JOIN Returning More Records than Exists in Table morequeryresult1

Query 2: Inner Join with Not Equal to Sign

Here is a query where the data is still unique (distinct) among tables but the result set contains more records than the numbers of rows the table contains. The condition <> (not equal to) returns the result which was excluded in the query where = (equal to) condition was used.

-- Query 2
SELECT f.Col1 fCol1
FROM FirstTable f
INNER JOIN SecondTable s ON s.Col1 <> f.col1
GO

SQL SERVER - INNER JOIN Returning More Records than Exists in Table morequeryresult2

Note: If you combine the data of Query 1 and Query 2, you will get the same result as CROSS JOIN of both the tables. If you want to understand further, read my earlier article Introduction to Join – Visual Explanation.

Query 3: Inner Join with Non-Distinct Row Table

Here is a query where data in another table is not unique and INNER JOIN returns more records than one of the tables.

-- Query 3
SELECT f.Col1 fCol1
FROM FirstTable f
INNER JOIN ThirdTable s ON s.Col1 = f.col1
GO

SQL SERVER - INNER JOIN Returning More Records than Exists in Table morequeryresult3

Summary

Inner Join can for sure return more records than the records of the table. Inner join returns the results based on the condition specified in the JOIN condition. If there are more rows that satisfy the condition (as seen in query 2), it will return you more results. Reference: Introduction to Join – Visual Explanation.

Well, feel free to send your questions or even solutions. If I find them interesting I will share them with everybody with due credit.

Reference: Pinal Dave (https://blog.sqlauthority.com)

,
Previous Post
SQL SERVER – Convert Subquery to CTE – SQL in Sixty Seconds #001 – Video
Next Post
SQL SERVER – A Quick Puzzle on JOIN and NULL – SQL Brain Teaser

Related Posts

28 Comments. Leave new

  • Thank you Dave! I found the same scenario where I had to join multiple tables using a Code (varchar) instead of an Id, falsely believing each code was unique. Of course as you said, the join condition was satisfied for these duplicate codes, therefore much more records were returned in the result table than I was expecting.

    You have saved my day once more :) Thanks again!

    Reply
  • Thank you for the explanation. So may i know how do i resolve such a situation? I am using SAS for my assignment. Please help. Thank you.

    Reply
  • This was an excellent article and it helped me.
    I had a series of joins that added these two new joins, below.
    They appear to have reproduced the authors Query 3 example above with huge increase in number of rows :

    INNER JOIN AdventureWorksDW2019.dbo.FactCurrencyRate as FCR
    ON FCR.DateKey = DD.DateKey
    INNER JOIN AdventureWorksDW2019.dbo.FactCurrencyRate
    ON FCR.CurrencyKey = FRS.CurrencyKey

    Rewrote the query like this to solve the problem,
    not sure if this is standard syntax, but it worked on SQL Server :

    INNER JOIN AdventureWorksDW2019.dbo.FactCurrencyRate as FCR
    ON ( FCR.DateKey = DD.DateKey
    AND FCR.CurrencyKey = FRS.CurrencyKey )

    Reply

Leave a Reply

Menu