If you have not understood what I am going to talk in this blog post by just reading the title, I think you are a normal human like me. Just the other day I received an email from a blog reader with subject line as “Select Unique Data From a Column Exist in Two Different Table”.
After exchanging multiple emails with him, I finally understood his business needs.
For example, he had two tables t1 with data (1,2,3) and t2 with data (1,4,5). In his resultset he wanted to remove the value 1 as it exists in both the tables and wanted to keep the final resultset as 2,3,4,5.
Let me explain you what he really needed with an image.
Now the task was to write a code which will remove the duplicates value from both the table and present the unique values in a single column. Initially I tried to write this query with some complex logic in a single SELECT statement. However, as I had another meeting coming up, I quickly wrote a code where I have combined both the table’s unique data with UNION.
Let me show you my solution.
First, we will create two sample tables.
CREATE TABLE t1 (ID INT); INSERT INTO t1 SELECT 1 UNION SELECT 2 UNION SELECT 3 GO CREATE TABLE t2 (ID INT); INSERT INTO t2 SELECT 1 UNION SELECT 4 UNION SELECT 5 GO SELECT * FROM t1; SELECT * FROM t2 GO
Next, let us see my quick and dirty solution (no way this is optimal solution).
SELECT t1.ID FROM t1 LEFT JOIN t2 ON t1.ID = t2.ID WHERE t2.ID IS NULL UNION SELECT t2.ID FROM t2 LEFT JOIN t1 ON t1.ID = t2.ID WHERE t1.ID IS NULL GO
When I sent the result back to him, he was pretty happy. However, I was personally not happy with my query. I believe there will be an optimal way to write this query which will give us better performance.
Can you think of the a simpler way to write this query?
Reference:Â Pinal Dave (https://blog.sqlauthority.com)
46 Comments. Leave new
;with c1 as (
select id from t1
Union
select id from t2)–,t1 where t1.IDt2.ID
, c2 as (
select id from t1
intersect
select id from t2)–,t1 where t1.IDt2.ID
select c1.* from c1,c2 where c1.id c2.id
When I was reading, I was thinking of a query with the union – except – intersect approach, very similar to what some folks like Prithvi Nath Pandey put on the comments. And after the read, I have made some testing on my local setup, and that was the worst of the 3 cases I have tried :-/ The test was made with a million rows on each table, running sql server on a local station.
I’ve tried:
1) Pinal Dave’s joins – union
2) My union – except – intersect
3) Manoj Sahoo’s union all – group – count
The best performance I got was using the third option, Manoj Sahoo’s script. It have made half the scans of the other solutions, and I believe that’s what makes it so better.
The intermediate was Pinal Dave’s, and the worst was my idea. The statistics of them shows the same amount of reads, but some CPU differences.
Based on your solution, are you implying that `id` is not unique in each table? Why else would you use `union` instead of `union all`?
Your query checks that a given `id` does not exist in the other table, for each table, and after that it returns those distinct values via `union`. If that is what is necessary, then most of the other answers here would need to add `distinct` to their query…
Comparison of some options posted here, with indexes:
https://dbfiddle.uk/?rdbms=sqlserver_2016&fiddle=7f8e294242ac6e6428134615df8f5447
Here is a pure set operator answer. I have no idea what kind of performance it has in SQL Server.
CREATE TABLE T1
(foo_id CHAR(1) NOT NULL PRIMARY KEY,
something_value INTEGER NOT NULL);
INSERT INTO T1 VALUES (‘1’, 1), (‘2’, 2), (‘3’, 3);
CREATE TABLE T2
(foo_id CHAR(1) NOT NULL PRIMARY KEY,
something_value INTEGER NOT NULL);
INSERT INTO T1 VALUES (‘1’, 1), (‘2’, 4), (‘3’, 5);
SELECT T12.something_value
FROM ((SELECT something_value FROM T1
UNION ALL
SELECT something_value FROM T2)
EXCEPT
(SELECT something_value FROM T1
INTERSECT
SELECT something_value FROM T2)) AS T12(something_value)
This is another way of doing the same:–
SELECT isnull(t1.ID,t2.id) as id
FROM t1
full outer JOIN t2 ON t1.ID = t2.ID
WHERE t2.ID IS NULL or t1.ID IS NULL order by 1
SELECT * from ( select id FROM t1
union
SELECT id FROM t2 )a where a.id not in (
SELECT a.id FROM t1 a join t2 b on a.id = b.id)
CREATE TABLE T1(ID INT)
CREATE TABLE T2(ID INT)
INSERT INTO T1(ID)
SELECT 1
UNION
SELECT 2
UNION
SELECT 3
INSERT INTO T2(ID)
SELECT 1
UNION
SELECT 4
UNION
SELECT 5
SELECT * FROM (SELECT * FROM T1
UNION
SELECT * FROM T2
) A
WHERE ID NOT IN (
SELECT T1.ID FROM T1,T2
WHERE T1.ID =T2.ID
)
Simple and Best Solution is
SELECT isnull(t1.ID,t2.id) as id
FROM t1
full outer JOIN t2 ON t1.ID = t2.ID
WHERE t2.ID IS NULL or t1.ID IS NULL
select * from t1 where not exists (select t2.ID from t2 where t2.id = t1.id)
union all
select * from t2 where not exists (select t1.ID from t1 where t2.id = t1.id)
I like this one:
select * from t1
except
select * from t2
union
(
select * from t2
except
select * from t1
)
SELECT isnull(t1.ID,t2.id) as id
FROM t1
full outer JOIN t2 ON t1.ID = t2.ID
WHERE t2.ID IS NULL or t1.ID IS NULL order by id
Declare @t1 TABLE (ID INT);
INSERT INTO @t1
SELECT 1
UNION
SELECT 2
UNION
SELECT 3
DECLARE @t2 TABLE (ID INT);
INSERT INTO @t2
SELECT 1
UNION
SELECT 4
UNION
SELECT 5
SELECT *
FROM @t1 T;
SELECT *
FROM @t2
SELECT DISTINCT id FROM (
SELECT id FROM @t1
UNION
SELECT id FROM @t2
EXCEPT
(
SELECT id FROM @t1
INTERSECT
SELECT id FROM @t2
)
) AS A
select id from (select id from t1
union all
select id from t2) main
group by id
having count(id)<=1
This will give expected result in above scenario.
This is how I will do in such a scenario :
WITH C1
AS
(
SELECT
ID
FROM T1
UNION ALL
SELECT
ID
FROM T2
),C2
(
SELECT
ID,
Cnt = COUNT(1)
FROM C1
)
SELECT
*
FROM C2 WHERE Cnt = 1
Forgot to add Group By Id in C2
;WITH CTE
AS
(
SELECT *
FROM T1
UNION
SELECT *
FROM T2
)
SELECT *
FROM CTE C
WHERE NOT EXISTS (SELECT 1 FROM T1 JOIN T2 ON T1.ID = T2.ID WHERE C.ID = T1.ID)
SELECT * FROM t1
UNION ALL
SELECT * FROM t2
EXCEPT
SELECT * FROM t1
INTERSECT
SELECT * FROM t2
How about this?
SELECT coelesce(t1.ID,t2.id) as id
FROM t1
full outer JOIN t2 ON t1.ID = t2.ID
WHERE ( t1.ID IS NULL or t2.ID IS NULL );
Select t.id from test t where t.id not in (Select id from test2)
Union
Select t1.id from test2 t2 where t2.id not In (Select id from test)
with c1 as ( select id from t1 Union select id from t2)
, c2 as (select id from t1 intersect select id from t2)
select * from c1 where c1.id not in (select distinct id from c2)
SELECT * FROM t1
UNION
SELECT * FROM t2
EXCEPT
SELECT * FROM t1
INTERSECT
SELECT * FROM t2