SQL SERVER – Difference Between Union vs. Union All – Optimal Performance Comparison

More than a year ago I wrote the article SQL SERVER – Union vs. Union All – Which is better for performance? I have received many requests to update that article. It is not fair to change an article that is already published, so I am rewriting it here with additional information.

UNION
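The UNION command combines the result sets of two or more SELECT statements into a single result set. Unlike the JOIN command, which places columns from related tables side by side, UNION stacks the rows of one result on top of the other. Every SELECT in a UNION must return the same number of columns, and the corresponding columns must have compatible data types. With UNION, only distinct rows are returned.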
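As a minimal sketch of how UNION stacks rows, the script below uses two hypothetical table variables, @Customers and @Suppliers, whose names, columns, and data are assumptions made up for illustration. The multi-row VALUES syntax assumes SQL Server 2008 or later.

```sql
-- Hypothetical sample data; table and column names are illustrative only.
DECLARE @Customers TABLE (Name VARCHAR(20), City VARCHAR(20));
DECLARE @Suppliers TABLE (CompanyName VARCHAR(30), Town VARCHAR(20));

INSERT INTO @Customers VALUES ('Alice', 'Pune'), ('Bob', 'Delhi');
INSERT INTO @Suppliers VALUES ('Acme', 'Pune'), ('Bob', 'Delhi');

-- Both SELECTs return two character columns, so they can be combined.
-- The result column names (Name, City) come from the first SELECT.
SELECT Name, City FROM @Customers
UNION
SELECT CompanyName, Town FROM @Suppliers;
-- ('Bob', 'Delhi') exists in both inputs but appears only once in the result.
```

The column names differ between the two tables, which is fine: UNION matches columns by position, not by name.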

UNION ALL
The UNION ALL command works like the UNION command, except that UNION ALL returns all rows, including duplicates.

The difference between UNION and UNION ALL is that UNION ALL does not eliminate duplicate rows; it simply pulls every row that matches each query and combines them into a single result set.

A UNION statement effectively does a SELECT DISTINCT on the result set. If you know that all the rows returned by your union are already unique, use UNION ALL instead; it gives faster results because it skips the duplicate-removal step.
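One situation where the rows are known to be unique is when each branch filters the same table on a different value of a column and the table has a unique key. The following is a sketch under that assumption; the @Orders table and its data are hypothetical.

```sql
-- Hypothetical table; OrderId is unique, so a row cannot repeat within a branch.
DECLARE @Orders TABLE (OrderId INT PRIMARY KEY, Region VARCHAR(10));

INSERT INTO @Orders VALUES (1, 'East'), (2, 'West'), (3, 'East'), (4, 'North');

-- The two branches select different Region values, so no row can come
-- from both. UNION would return the same rows here, but it would still
-- pay for a duplicate check that can never remove anything.
SELECT OrderId, Region FROM @Orders WHERE Region = 'East'
UNION ALL
SELECT OrderId, Region FROM @Orders WHERE Region = 'West';
```

If the branches could overlap, or the selected columns did not include a unique key, UNION ALL might return duplicates and UNION would be the safer choice.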

Run the following script in SQL Server Management Studio to see the difference between the results of UNION ALL and UNION. Download complete script from here.

/* Declare First Table */
DECLARE @Table1 TABLE (ColDetail VARCHAR(10))
INSERT INTO @Table1
SELECT 'First'
UNION ALL
SELECT 'Second'
UNION ALL
SELECT 'Third'
UNION ALL
SELECT 'Fourth'
UNION ALL
SELECT 'Fifth'
/* Declare Second Table */
DECLARE @Table2 TABLE (ColDetail VARCHAR(10))
INSERT INTO @Table2
SELECT 'First'
UNION ALL
SELECT 'Third'
UNION ALL
SELECT 'Fifth'
/* Check the data using SELECT */
SELECT *
FROM @Table1
SELECT *
FROM @Table2
/* UNION ALL */
SELECT *
FROM @Table1
UNION ALL
SELECT *
FROM @Table2
/* UNION */
SELECT *
FROM @Table1
UNION
SELECT *
FROM @Table2
GO

In our example we have two tables: @Table1 and @Table2.

Now let us run UNION ALL and UNION together and look at the result sets as well as the execution plan, where the cost of each query is shown relative to the complete batch. You can turn on the actual execution plan in SQL Server Management Studio with CTRL+M.

We can see from the result set of UNION ALL that it returns everything from both tables, while UNION clearly returns only the DISTINCT rows from both tables.
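A quick way to confirm the difference is to count the rows each operator returns. The sketch below redeclares the same two table variables, because table variables only live for the batch in which they are declared; the multi-row VALUES syntax assumes SQL Server 2008 or later.

```sql
-- Same data as the script above, redeclared so this batch runs on its own.
DECLARE @Table1 TABLE (ColDetail VARCHAR(10));
DECLARE @Table2 TABLE (ColDetail VARCHAR(10));

INSERT INTO @Table1 VALUES ('First'), ('Second'), ('Third'), ('Fourth'), ('Fifth');
INSERT INTO @Table2 VALUES ('First'), ('Third'), ('Fifth');

-- UNION ALL keeps every row: 5 + 3 = 8
SELECT COUNT(*) AS UnionAllRows
FROM (SELECT ColDetail FROM @Table1
      UNION ALL
      SELECT ColDetail FROM @Table2) AS x;

-- UNION removes the three values that appear in both tables: 5 rows remain
SELECT COUNT(*) AS UnionRows
FROM (SELECT ColDetail FROM @Table1
      UNION
      SELECT ColDetail FROM @Table2) AS x;
```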

Additionally, comparing the execution plans of UNION ALL and UNION makes it quite clear that UNION ALL is much less expensive than UNION here, as its plan does not contain the DISTINCT SORT operation.
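The plan percentages are cost estimates, so it can be useful to check them against measured time on a larger data set. The following is a rough sketch, not a benchmark: the temp tables #Big1 and #Big2, their row counts, and the way the rows are generated are all assumptions for illustration.

```sql
-- Hypothetical temp tables filled with generated numbers.
SELECT TOP (100000)
       ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS Id
INTO #Big1
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;

-- Half of the values overlap with #Big1.
SELECT Id INTO #Big2 FROM #Big1 WHERE Id % 2 = 0;

-- Report CPU and elapsed time for each statement in the Messages tab.
SET STATISTICS TIME ON;

SELECT Id FROM #Big1
UNION ALL
SELECT Id FROM #Big2;

SELECT Id FROM #Big1
UNION
SELECT Id FROM #Big2;

SET STATISTICS TIME OFF;

DROP TABLE #Big1;
DROP TABLE #Big2;
```

Timings vary with hardware and caching, and part of the elapsed time is spent returning rows to the client. On larger inputs the optimizer may also remove duplicates with a different operator than the one seen in the small example, so treat the numbers as an indication only.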

Let me know what you think about this article. If you have any suggestions for improvement, please let me know and I will update the article accordingly.

Reference : Pinal Dave (http://blog.SQLAuthority.com)

59 thoughts on “SQL SERVER – Difference Between Union vs. Union All – Optimal Performance Comparison”

  1. Pingback: SQL SERVER - Union vs. Union All - Which is better for performance? Journey to SQL Authority with Pinal Dave

  2. Very nice article :)
    Just one question about the execution plans. I have observed that sometimes the actual execution time is different from what the plan suggests. For example, in a batch I see query 1 at 40% and query 2 at 60%.
    But when I actually execute both of them, query 1 takes more time than query 2.
    Is that possible?

  3. Great article! thank you!

    I have a question – on the query cost diagram it shows 50% for the table scan for UNION ALL and 18% for UNION. Why is it so different? Is it because of the number of records?

    If yes, then shouldn't UNION be faster, since it saves so many more resources at that stage?

    Thank you in advance!

    • No, those numbers are percentages, and they have to add up to 100. They are smaller than 50% each because a new operation has entered the flow: distinct sort. This operation now takes 63% of the processing time.

      Keep in mind that these percentages provide no indication of the actual processing time involved; they just indicate the relative share of the estimated cost that each operation takes within its plan.

      But, if we assume that the table scans in UNION, and the table scans in UNION ALL take the same amount of actual time, then we can try to estimate how much more time the UNION operation will take.
      Let’s assume the UNION ALL takes 1000msec. Each table scan takes 500msec since each is 50% of the execution.
      Now if we go over to UNION and assume that a table scan will take 500msec here too, this 500msec actually only corresponds to 18% of the execution. So, 500msec/.18 = 2778msec.
      Thus, the UNION should take approx 2.8 times as long, if the assumption holds true and the execution plan is accurate.

  4. Hi
    How can I select data from two different tables that have no relationship between them? How can I use only one SELECT statement to obtain the data from both tables?

    • I would also like to know this clearly. I wrote a query, but it returns something like the Cartesian product of the rows inserted in both tables.

      student table(sid,sname),course table(course)
      select * from student,course
      sid sname course
      1 aaa Java
      2 bbb Java
      3 ccc Java
      1 aaa .Net
      2 bbb .Net
      3 ccc .Net
      1 aaa Sql
      2 bbb Sql
      3 ccc Sql

  5. @Sajjad Ali,

    For example,

    create table employees ( eid int, ename varchar(10))

    insert into employees values ( 1, 'imran')
    insert into employees values ( 2, 'Taher')

    -- We inserted two employees' information.

    -- Let's create another table that stores salary information

    create table Salary ( eid int, salary money)

    insert into Salary values ( 1, 1000)
    insert into Salary values ( 2, 2000)

    -- We inserted two employees' salary information.

    -- Now if I want to get the complete data out of these two tables, I will use the script below

    select A.eid
    ,A.ename
    ,B.salary
    from employees A
    join Salary B on A.eid = B.eid

    You need to have one common column on which you can join multiple tables.

    ~ IM.

  6. @Dasha Salo

    The percentages are relative, which means 2×18% of the resources go to the table scans versus 63% for the distinct sort.

    You cannot compare the % values themselves across 2 different queries.

    The point to note is that the distinct sort uses more resources compared to no distinct sort.

    In simple words, read 50% as '50% of the overall CPU resources/time' for that execution plan, compared to 18%.

  7. Hello,
    Nice article!
    This information was helpful in reducing the time required to insert multiple records into a table. But it seems to have triggered a new issue. We have observed in our production system that SQL Server private memory keeps increasing until it fails to reserve virtual memory to run a query! We are using SQL Server 2005 SP3.
    I wonder whether anybody else has had this issue? Is there any hotfix/SP which addresses it?
    Thanks

  8. Hello Pinal,

    Nice article, it helped a lot in performance tuning the queries. I just had a question: if a query can be written in three ways:
    (1) Having an “OR” operator
    (2) Having an “Union ALL” operator
    (3) having an “IN” operator

    Then which method should be used, keeping performance in mind.

    Regards,
    Deepesh

  9. Brilliant article – simple, yet clearly explained – I like the demo with the actual execution plan giving details of performance differences between Union and Union All

  10. Nice Article,
    I really got short and sweet description about union & union all with examples.
    I also learned something new about the execution plan in Management Studio from this article, so I can tune some of my queries for faster execution.

    Thanks.

  11. I have a doubt. To retrieve distinct values from union, I could either do
    1)
    SELECT * FROM A
    UNION
    SELECT * FROM B

    I could also do
    SELECT DISTINCT * FROM (
    SELECT * FROM A
    UNION ALL
    SELECT * FROM B
    ) x

    I am trying to tune a query which uses
    SELECT DISTINCT * FROM A
    UNION
    SELECT DISTINCT * FROM B
    In this case, the DISTINCTs do not serve any purpose. I am trying to figure out how I should rewrite it.

    Any guidelines on which one is better under which circumstances?
    Thanks

  12. I think the title of this article could be improved

    for e.g., instead of using Difference and vs in the same line

    “Difference Between Union vs. Union All – Optimal Performance Comparison”

    you could use

    “Difference Between Union and Union All – Optimal Performance Comparison”

    It's a wonderful article though!

  13. Hi,

    Very nice article… it clearly explains the difference.
    But there is one thing that I would like to ask.

    As UNION sorts the output records, on which column will it sort if there is more than one column and none of them is the primary key column?

    Please reply. Thanks in advance.

  14. There is an error in the script at http://www.pinaldave.com/bimg/unionquery.zip. The last query uses UNION ALL but should be UNION. Please correct it (the query in the article is correct).

    P.S. – Your blog is very helpful. Nearly every time I look for an answer to a SQL Server problem (I’ve been forced to convert from Oracle), I see your face in the results. :) Keep up the good work.

  15. Pingback: SQL SERVER – Weekly Series – Memory Lane – #020 | SQL Server Journey with SQL Authority

  16. Best Example of Union Vs Union ALL

    SELECT 1 AS A
    UNION
    SELECT 1 AS A

    = one row

    SELECT 1 AS A
    UNION ALL
    SELECT 1 AS A

    = two rows

  17. I have tried this and it works well,
    $xque1 =
    "SELECT DISTINCT * from s_optional_que WHERE std_id = '$std' && sub_code = '$subcode' && chap_code = '$chap_code' && quetype_code = '$qt1code' && opdif_code = '$dif'
    UNION ALL
    SELECT DISTINCT * from s_optional_que WHERE std_id = '$std' && sub_code = '$subcode' && chap_code = '$chap_code3' && quetype_code = '$qt1code' && opdif_code = '$dif'
    UNION ALL
    SELECT DISTINCT * from s_optional_que WHERE std_id = '$std' && sub_code = '$subcode' && chap_code = '$chap_code2' && quetype_code = '$qt1code' && opdif_code = '$dif' ORDER BY RAND() LIMIT $qtn1";

  18. Pingback: SQL SERVER – UNION ALL and UNION are Different Operation | SQL Server Journey with SQL Authority

  19. I have a question about the UNION statement. I realize that it pulls a distinct list.
    If I have a union using two select statements and each table has a value, it returns only one of the two. My question is, which value is returned, the value from the first select or the value from the second select? Or, does it vary?
    The reason I am asking is that I have a User table and a user history table. Each table has an email column. If my union returns results from both the user table and the user history table, I want the record in the user table returned, not the history table. Does the union, by default, pick the match in the first select when there are duplicates, and if not, is there any way to force it to do so?
    My query would be something like:

    SELECT Email FROM User
    UNION
    SELECT Email FROM UserHistory

    • Hm – why would you bother whether it’s from first or second table? The value (Email in your example) will be either indistinguishable (you won’t be able to tell where it came from) because the value in both tables is identical – OR – you would have two values anyway – from both tables – because the values are different…..

  20. I realized my previous question is irrelevant. the value being returned would be the same regardless of which table it got pulled from.

  21. Hi to all,
    I have a doubt: we have two tables, one with the columns (sid, sname) and another with only the column (course).
    I want to merge both tables into a single table with the columns (sid, sname, course). There is no common key column between the tables. How can I achieve this?
