I blog and engage with the community because it gives me satisfaction when someone resolves an issue. A few days ago, I blogged about a DBA who began his first day at a new company and could not find out where the installation summary file was. He was very happy when I featured his story on our blog. Today he asked me another question and when I received his question my first reaction was – not possible. Later I said, may be possible, and when he shared more information, I said of course it is possible and natural. Let us go over these conversations in the exact order of how they occurred.
(I have modified the emails to chat format as it is easier to understand)
Jeff: Pinal, help needed. My query is returning more data than I need.
Pinal: Well, check your WHERE clause, JOINS etc.
Jeff: The problem is that INNER JOIN is returning more records than exist in the table.
Pinal: Not possible. How can a query return data that does not exist. Are you sure that you are connected to the correct database and running the query on the same server where you are validating the data?
Jeff: Yes, yes, everything is in the right place. Can Inner Join ever return more data than exists in the table?
Pinal: Well, maybe. Here is the story. Inner Join can for sure return more records than exists in the table, but it cannot return records that do not exist. Does that make sense?
Jeff: No.
Pinal: Okay, in simple words, if your table has three rows (values 1, 2, 3), your inner join can return 10 rows but it cannot return you the value 4 as part of the result.
Jeff:Â In simple English please…
Pinal:Â Honestly, I cannot make it more simple than what I said just now. Here is the example for you.
(Jeff goes through example and after few minutes)
Jeff: Exactly, I had no equal condition which did this. You helped me again.
Pinal: You’re welcome!
(After few minutes…)
Jeff:Â Pinal, help needed…
Well, this goes on. Anyway, here is the script that I sent him.
Let us create a scenario. We will create three sample tables and insert values in it.
USE TempDB
GO
-- Sample Table
CREATE TABLE FirstTable (Col1 INT)
CREATE TABLE SecondTable (Col1 INT)
CREATE TABLE ThirdTable (Col1 INT)
GO
INSERT INTO FirstTable (Col1)
VALUES (1), (2), (3), (NULL)
GO
INSERT INTO SecondTable (Col1)
VALUES (1), (2), (3), (NULL)
GO
INSERT INTO ThirdTable (Col1)
VALUES (2), (2), (2), (2), (2), (NULL)
GO
Here you will see the normal use of INNER JOIN where a query returns the results as the number of rows or fewer than the number of rows in the table. This happens because the data is unique in both of the tables.
Query 1: Inner Join with Equal to Sign
-- Query 1
SELECT f.Col1 fCol1
FROM FirstTable f
INNER JOIN SecondTable s ON s.Col1 = f.col1
GO
Query 2: Inner Join with Not Equal to Sign
Here is a query where the data is still unique (distinct) among tables but the result set contains more records than the numbers of rows the table contains. The condition <> (not equal to) returns the result which was excluded in the query where = (equal to) condition was used.
-- Query 2
SELECT f.Col1 fCol1
FROM FirstTable f
INNER JOIN SecondTable s ON s.Col1 <> f.col1
GO
Note: If you combine the data of Query 1 and Query 2, you will get the same result as CROSS JOIN of both the tables. If you want to understand further, read my earlier article Introduction to Join – Visual Explanation.
Query 3: Inner Join with Non-Distinct Row Table
Here is a query where data in another table is not unique and INNER JOIN returns more records than one of the tables.
-- Query 3
SELECT f.Col1 fCol1
FROM FirstTable f
INNER JOIN ThirdTable s ON s.Col1 = f.col1
GO
Summary
Inner Join can for sure return more records than the records of the table. Inner join returns the results based on the condition specified in the JOIN condition. If there are more rows that satisfy the condition (as seen in query 2), it will return you more results. Reference: Introduction to Join – Visual Explanation.
Well, feel free to send your questions or even solutions. If I find them interesting I will share them with everybody with due credit.
Reference:Â Pinal Dave (https://blog.sqlauthority.com)
28 Comments. Leave new
It was really awsome…….PINAL
Hi pinal sir,
We are also faced such kind of senarion many times while using inner join for more than 2 tables. We then find out that using proper join and distinct we are able to avoid such result.
Thank$
Chirag Satasiya
Hi Pinal,
Is this kind of behavior observed only in inner join or even in other joins?
Hi Pinal,
it is very nice example.
I like very much that Querry.1 it tells that.
Null = Null = False
Null is Null = True.
Wow..!! What a romantic SQL language we have..!!!!!
One of my friends asking me like this “How can i learn SQL Easily…!!!! is there any trick for this..???”
I told him that Do like this then you will get it easily..
Eat SQL,
Drink SQL,
Play With SQL,
Dream SQL,
Sing SQL,
Sleep with SQL,
Wake up with SQL……!!!!!
Hi pinal,
i send you one mail please do needful….!!!!!
Hope you help me….!!!
I can visualize what is happening here, but for those who cannot, try
SELECT f.Col1 fCol1, s.Col1 sCol1
so you can see the column that is matching. This may help you to understand why certain rows are returned.
I saw the challange and didn’t look at the article. Of course, it’s easy. When I did it, I used the “=” query instead of the != query which was my first choice, then, no, I want every table row picked in every join.:
DECLARE @tbl TABLE(i1 int)
INSERT INTO @tbl VALUES (1),(2),(3),(4),(5)
SELECT a.i1 first, b.i1 second
FROM @tbl a
INNER JOIN @tbl b
ON 1=1
Interesting that the result is the same as adding “ORDER BY b.i1, a.i1 so it processes from right to left. IE get the first record of the second table, find all the matches in the first table for that row… If I changed the insert order it would match that insert order since it is a heap table that hasn’t had a chance to fragment.
Hi Pinal,
If Not Equal to Sign is giving the duplicate records then why we are using it. What is use of Not Equal to Sign in SQL?
Please tell me.
Pinal Dave,
Sincere Thanks. Extremely good article.
We will come across Query3 type often (Ex : One to many Relationship)
Hi Pinal,
I have query related to joining. If I am using a OR condition to join 2tables then I am getting more than the expected number of records if the value matches with both conditions. For example.
select s.id, s.name1,s.name2,d.[desc]
from dbo.Test_Join_Source s
inner join dbo.Test_Join_Destination d
on
(s.name1=d.name1 or s.name2=d.name2)
The above query will produce 2 records if the source record matches by both the joining condition.
Question is, How do I prevent this.
hi Pinal,
My name is srikanthreddy
How to use join more than three tables with in one query
Thank you for your article. If you don’t mind, I have a question. I have two tables with information about a sales order. The left hand table contains one row of the sales order header information, sales order number and total etc. I need the required date for the order from the second table. The second table contains multiple rows per sales order as a customer may order an item with different delivery dates. However I only want to return the first required date from that table so that the query only ever outputs one row per sales order. How could I achieve this?
Somewhat misleading – since you allowed nulls – a dubious practise in my opinion – sacrilege to others such as Chris Date. Does however illustrate nicely why base tables should never have nulls – the only time a find any benefit in nulls is in views / queries resulting from outer joins.
“This happens because the data is unique in both of the tables.”
Man you are the man! this part solved my doubt! thanks!
I had two records on each table and it was bring me 4 as result.
lacks more one key on inner join condition. thanks.
it Helped me a lot than q
Thanks Pinal.. Your explanation helped me understand a behaviour that’s been killing me for 30mins..
fcbarcelonatalks – I am glad that it helped you
I’m in a very weird bug….. Using a with table1 as (540 rows with id field), table2 (540 rows with id field) Select t1 inner join t2 on t1.id=t2.id —- and getting 630 rows; another F5, now 634 rows; another F5 638 rows returned; another F5, 633 rows returned…….. Same query! Rows are created with partial value…
Sounds like there may be triggers or other auditing functions in play — would help if we could see the actual script(s)
How to solve 3rd scenario? I am stuck with it for a while now
Thank you Dave! I found the same scenario where I had to join multiple tables using a Code (varchar) instead of an Id, falsely believing each code was unique. Of course as you said, the join condition was satisfied for these duplicate codes, therefore much more records were returned in the result table than I was expecting.
You have saved my day once more :) Thanks again!