
I blog and engage with the community because it gives me satisfaction when someone resolves an issue. A few days ago, I blogged about a DBA who began his first day at a new company and could not find out where the installation summary file was. He was very happy when I featured his story on our blog. Today he asked me another question and when I received his question my first reaction was – not possible. Later I said, may be possible, and when he shared more information, I said of course it is possible and natural. Let us go over these conversations in the exact order of how they occurred.
(I have modified the emails to chat format as it is easier to understand)
Jeff: Pinal, help needed. My query is returning more data than I need.
Pinal: Well, check your WHERE clause, JOINS etc.
Jeff: The problem is that INNER JOIN is returning more records than exist in the table.
Pinal: Not possible. How can a query return data that does not exist. Are you sure that you are connected to the correct database and running the query on the same server where you are validating the data?
Jeff: Yes, yes, everything is in the right place. Can Inner Join ever return more data than exists in the table?
Pinal: Well, maybe. Here is the story. Inner Join can for sure return more records than exists in the table, but it cannot return records that do not exist. Does that make sense?
Jeff: No.
Pinal: Okay, in simple words, if your table has three rows (values 1, 2, 3), your inner join can return 10 rows but it cannot return you the value 4 as part of the result.
Jeff: In simple English please…
Pinal: Honestly, I cannot make it more simple than what I said just now. Here is the example for you.
(Jeff goes through example and after few minutes)
Jeff: Exactly, I had no equal condition which did this. You helped me again.
Pinal: You’re welcome!
(After few minutes…)
Jeff: Pinal, help needed…
Well, this goes on. Anyway, here is the script that I sent him.
Let us create a scenario. We will create three sample tables and insert values in it.
USE TempDB
GO
-- Sample Table
CREATE TABLE FirstTable (Col1 INT)
CREATE TABLE SecondTable (Col1 INT)
CREATE TABLE ThirdTable (Col1 INT)
GO
INSERT INTO FirstTable (Col1)
VALUES (1), (2), (3), (NULL)
GO
INSERT INTO SecondTable (Col1)
VALUES (1), (2), (3), (NULL)
GO
INSERT INTO ThirdTable (Col1)
VALUES (2), (2), (2), (2), (2), (NULL)
GO
Here you will see the normal use of INNER JOIN where a query returns the results as the number of rows or fewer than the number of rows in the table. This happens because the data is unique in both of the tables.
Query 1: Inner Join with Equal to Sign
-- Query 1
SELECT f.Col1 fCol1
FROM FirstTable f
INNER JOIN SecondTable s ON s.Col1 = f.col1
GO

Query 2: Inner Join with Not Equal to Sign
Here is a query where the data is still unique (distinct) among tables but the result set contains more records than the numbers of rows the table contains. The condition <> (not equal to) returns the result which was excluded in the query where = (equal to) condition was used.
-- Query 2
SELECT f.Col1 fCol1
FROM FirstTable f
INNER JOIN SecondTable s ON s.Col1 <> f.col1
GO

Note: If you combine the data of Query 1 and Query 2, you will get the same result as CROSS JOIN of both the tables. If you want to understand further, read my earlier article Introduction to Join – Visual Explanation.
Query 3: Inner Join with Non-Distinct Row Table
Here is a query where data in another table is not unique and INNER JOIN returns more records than one of the tables.
-- Query 3
SELECT f.Col1 fCol1
FROM FirstTable f
INNER JOIN ThirdTable s ON s.Col1 = f.col1
GO

Summary
Inner Join can for sure return more records than the records of the table. Inner join returns the results based on the condition specified in the JOIN condition. If there are more rows that satisfy the condition (as seen in query 2), it will return you more results. Reference: Introduction to Join – Visual Explanation.
Well, feel free to send your questions or even solutions. If I find them interesting I will share them with everybody with due credit.
Reference: Pinal Dave (http://blog.sqlauthority.com)
It was really awsome…….PINAL
Hi pinal sir,
We are also faced such kind of senarion many times while using inner join for more than 2 tables. We then find out that using proper join and distinct we are able to avoid such result.
Thank$
Chirag Satasiya
Hi Pinal,
Is this kind of behavior observed only in inner join or even in other joins?
Hi Pinal,
it is very nice example.
I like very much that Querry.1 it tells that.
Null = Null = False
Null is Null = True.
Wow..!! What a romantic SQL language we have..!!!!!
One of my friends asking me like this “How can i learn SQL Easily…!!!! is there any trick for this..???”
I told him that Do like this then you will get it easily..
Eat SQL,
Drink SQL,
Play With SQL,
Dream SQL,
Sing SQL,
Sleep with SQL,
Wake up with SQL……!!!!!
Excellent comment my friend. Love it!
Note that “Null = Null is False” is the default behavior, not when you change that behavior with:
SET ANSI_NULLS OFF
Also “Null != Null is False” is always the same behavior, but the reason it is false changes because of the ANSI_NULLS setting.
Hi pinal,
i send you one mail please do needful….!!!!!
Hope you help me….!!!
Pingback: SQL SERVER – A Quick Puzzle on JOIN and NULL – SQL Brain Teaser « SQL Server Journey with SQL Authority
I can visualize what is happening here, but for those who cannot, try
SELECT f.Col1 fCol1, s.Col1 sCol1
so you can see the column that is matching. This may help you to understand why certain rows are returned.
I saw the challange and didn’t look at the article. Of course, it’s easy. When I did it, I used the “=” query instead of the != query which was my first choice, then, no, I want every table row picked in every join.:
DECLARE @tbl TABLE(i1 int)
INSERT INTO @tbl VALUES (1),(2),(3),(4),(5)
SELECT a.i1 first, b.i1 second
FROM @tbl a
INNER JOIN @tbl b
ON 1=1
Interesting that the result is the same as adding “ORDER BY b.i1, a.i1 so it processes from right to left. IE get the first record of the second table, find all the matches in the first table for that row… If I changed the insert order it would match that insert order since it is a heap table that hasn’t had a chance to fragment.
Hi Pinal,
If Not Equal to Sign is giving the duplicate records then why we are using it. What is use of Not Equal to Sign in SQL?
Please tell me.
Pingback: SQL SERVER – Case Sensitive Database and Database User – Fix: Error: 15151 – Cannot find the user , because it does not exist or you do not have permission. « SQL Server Journey with SQL Authority
Pinal Dave,
Sincere Thanks. Extremely good article.
We will come across Query3 type often (Ex : One to many Relationship)
Hi Pinal,
I have query related to joining. If I am using a OR condition to join 2tables then I am getting more than the expected number of records if the value matches with both conditions. For example.
select s.id, s.name1,s.name2,d.[desc]
from dbo.Test_Join_Source s
inner join dbo.Test_Join_Destination d
on
(s.name1=d.name1 or s.name2=d.name2)
The above query will produce 2 records if the source record matches by both the joining condition.
Question is, How do I prevent this.
hi Pinal,
My name is srikanthreddy
How to use join more than three tables with in one query
Pingback: SQL SERVER – Weekly Series – Memory Lane – #015 « SQL Server Journey with SQL Authority
Thank you for your article. If you don’t mind, I have a question. I have two tables with information about a sales order. The left hand table contains one row of the sales order header information, sales order number and total etc. I need the required date for the order from the second table. The second table contains multiple rows per sales order as a customer may order an item with different delivery dates. However I only want to return the first required date from that table so that the query only ever outputs one row per sales order. How could I achieve this?
Somewhat misleading – since you allowed nulls – a dubious practise in my opinion – sacrilege to others such as Chris Date. Does however illustrate nicely why base tables should never have nulls – the only time a find any benefit in nulls is in views / queries resulting from outer joins.