SQL SERVER – SELECT 1 vs SELECT * – An Interesting Observation

Many times I have seen the issue of SELECT 1 vs SELECT * discussed in terms of performance or readability while checking for the existence of rows in the table. I ran quick 4 tests about this observed that I am getting same result when used SELECT 1 and SELECT *. I think smart readers of this blog will come up the situation when SELECT 1 and SELECT * have a different execution plan when used to find the existence of rows.

Let us see 4 tests I ran and note their result is same and their Execution Plan take the same amount of resources (50% and 50%)

You can click on the images to see larger images.

Test 1: Whole Table SELECT

USE AdventureWorks
GO
IF EXISTS(
SELECT 1
FROM Production.Product)
SELECT 'SELECT 1'
GO
IF EXISTS(
SELECT *
FROM Production.Product)
SELECT 'SELECT *'
GO

SQL SERVER - SELECT 1 vs SELECT * - An Interesting Observation select1s

Test 2: Condition WHERE on Indexed Column

USE AdventureWorks
GO
IF EXISTS(
SELECT 1
FROM Production.Product
WHERE Name = 'Adjustable Race')
SELECT 'SELECT 1'
GO
IF EXISTS(
SELECT *
FROM Production.Product
WHERE Name = 'Adjustable Race')
SELECT 'SELECT *'
GO

SQL SERVER - SELECT 1 vs SELECT * - An Interesting Observation select2s

Test 3: Using Aggregate function COUNT

USE AdventureWorks
GO
IF (
SELECT 1
FROM Production.Product
WHERE Name = 'Adjustable Race') = 1
SELECT 'SELECT 1'
GO
IF (
SELECT COUNT(*)
FROM Production.Product
WHERE Name = 'Adjustable Race') = 1
SELECT 'SELECT *'
GO


SQL SERVER - SELECT 1 vs SELECT * - An Interesting Observation select3s

Test 4: Using COUNT with search on non-indexed Column

USE AdventureWorks
GO
IF (
SELECT COUNT(1)
FROM Production.Product
WHERE SafetyStockLevel = '800') > 1
SELECT 'SELECT 1'
GO
IF (
SELECT COUNT(*)
FROM Production.Product
WHERE SafetyStockLevel = '800') > 1
SELECT 'SELECT *'
GO

SQL SERVER - SELECT 1 vs SELECT * - An Interesting Observation select4s

Note: I have used single quotes around 800, which is numeric and I am aware of that fact. Above queries are for testing purpose only.

I have been using SELECT 1 instead of SELECT * when checking existence of rows. I would like to see what my readers have an opinion about this. Please have your opinion and make your comment here.

Reference: Pinal Dave (https://blog.sqlauthority.com)

,
Previous Post
SQL SERVER – Index Reorganize or Index Rebuild
Next Post
SQL SERVER – Dynamic Case Statement – FIX : ERROR 156 : Incorrect syntax near the keyword

Related Posts

51 Comments. Leave new

  • What is the equallent in SQL Server for

    select * from table1 a
    where (a.x, a.y) in (
    select b.x, min(b.y)
    from table1 b
    group by b.x
    )

    Reply
  • how do we get back the data that we ve deleted from a table??

    Reply
  • Hi Pinal,
    How can we see the “excution Plan”????

    Regards,
    Palanisamy

    Reply
  • Greg Jorgensen
    July 20, 2009 1:56 am

    This shouldn’t be such a mysterious of contentious subject. SELECT * in an EXISTS condition is a SQL idiom. In this context * means “I don’t care about the columns,” not “fetch all columns.” It may or may not be optimized by the compiler depending on the dialect of SQL. Apparently T-SQL is smart in this regard but other RDBMSs may special-case SELECT * in subqueries.

    The Microsoft SQL Server documentation says this:

    “The select list of a subquery introduced with EXISTS, by convention, has an asterisk (*) instead of a single column name. The rules for a subquery introduced with EXISTS are the same as those for a standard select list, because a subquery introduced with EXISTS creates an existence test and returns TRUE or FALSE, instead of data.” (T-SQL reference, Subquery Rules).

    Anyone who knows SQL should recognize the idiom EXISTS(SELECT * …). The alternative EXISTS(SELECT 1 …) is obvious but not really idiomatic. Anything else is just showing off and making the SQL harder for the next person to understand.

    Reply
  • Greg Jorgensen
    July 20, 2009 2:06 am

    Lutz:

    EXISTS and NOT EXISTS do the same work: the db engine only has to find a single row that matches the WHERE clause in the subquery to satisfy the test. The syntax gives it aways: NOT EXISTS is the logical NOT of the EXISTS test.

    Depending on the subquery WHERE condition the existence of at least one matching row may be answered by scanning an index, or it may require a table scan. In either case as soon as a single matching row is found the result of EXISTS is known. The result of NOT EXISTS is simply the logical inverse of the result of EXISTS.

    In your example the subquery inside the EXISTS() doesn’t “return approx. 1 million rows.” It doesn’t return any rows — it returns a TRUE or FALSE. The db engine can give the result when it encounters the first row matching the WHERE clause. There is no result set to discard for the subquery.

    If you are seeing different results using SELECT 1 vs. SELECT * it’s probably due to SELECT * having done all the work and cached the indexes so SELECT 1 has less work to do.

    Reply
  • Sangam Uprety
    July 20, 2009 3:13 pm

    Simple issue and much discussions. And good one! Since in much cases we have to execute the EXISTS, using faster one [may be only fractions of second faster] matters. My preference is using SELECT 1, however. Thanks.

    Reply
  • Evan Carroll
    July 20, 2009 8:53 pm

    I would call it a fairly major bug in the query engine if SELECT 1, and SELECT * were not optimized to do the same thing when the context is within the EXIST() statement. Certainly, there is no practical reason why you would visit any extra meta data at all if exists is just going to return a SQL bool..

    Reply
  • I would not call it a bug if they were not the same under the hood. The optimizer can only guess so much. It really depends on how the statements are written. The ones above are pretty simple and should optimize well, but not all are that easily second guessed as to their intent.

    Reply
  • How do we get back the data that we have deleted from a table?

    Reply
  • Siva,

    You need to have a valid backup in advance or you may need some third party tools

    Reply
  • Greg Jorgensen said:

    “EXISTS and NOT EXISTS do the same work”

    This is not necessarily true. Depending on the indexes being used (or not used), an EXISTS may be faster, because it can stop as soon as it finds one matching row. A NOT EXISTS may have to table scan the entire table (if there’s no index that helps the query).

    Reply
  • @Dan

    I still believe Greg Jorgensen was right on that one.

    To be sure that a value DOESN’T exist, the engine must do a full scan. To be sure that a value exists, finding the first is enough. And if there is proper index, the scan will use it.

    So, if you execute a WHERE X = 2 on a table completely filled of 1’s, a full scan will happen either way – doesn’t matter if you were using a EXISTS or a NOT EXISTS. The engine should work exactly the same way in both cases.

    Reply
  • Yeoh Ray Mond
    July 25, 2010 9:30 pm

    Well, I’ve always used EXISTS 1, simply because it is easier to type compared to *!

    Reply
  • I believe, select 1 is faster if the condition is checking the indexed columns.
    In that case, we don’t need to fetch the data from the FS, only read the index(which is the step used anyway) and if true, then output 1. This reduces the step to fetch data from underlying FS

    -R

    Reply
  • An advantage of using select 1 instead of select * is that you can do a simple find in your code base and any ‘select *’ will raise a red flag and a visit to the offending developer’s desk :-) A workaround is searching with a regular expression to exclude cases with ‘exists (‘ but my regular expression brain can’t figure this out in 2 minutes :-)

    Reply
  • A bit of history that we all know that SQL Server came from Sybase and that optimizers were not always as good as they are now. So these products have evoloved and been continually enhance over a period of over 20 years. These “Myths” started many moons ago. I clearly rmember having issues with SQL 6.5 doing EXISTS(SELECT * FROM …) and fixing these issues by using SELECT 1. The Optimizers now don’t have these issues but I still use SELECT 1 because vehemently dislike the use of SELECT * in any scenario.

    Reply
  • To add to this comment on the history of why select 1 is used. select * certainly used to have physical disk IO implications that take time reading the data where as selecting a constant didn’t. This probably isn’t the case now but I still use select 1

    Reply
  • I don’t believe this debate will ever end :)
    I have been asked more than once about the performance difference between EXISTS(SELECT * and EXISTS(SELECT 1. Since I deal almost exclusively with Microsoft SQL Server I have always pointed to the documentation Greg Jorgensen referred to (July 20, 2009), i.e. these days it comes down to personal preference. The whole point here is that the parser knows whether or not any real data is needed when the subquery is in an EXISTS, AFAIK SQL Server has treated EXISTS SELECT * as “SELECT anything” since at least version 7.0 (6.5 was the Sybase root/MS split). Any SQL database optimizer should be able to predict if retrieving meta-data is needed ot not.
    So for me there are 2 interpretations of SELECT *, if inside an EXISTS it is “SELECT anything”, outside is “SELECT ALL”, and any instance of “SELECT ALL” better have a good reason for its existance.

    BTW, I recently asked a newer co-worker why he used “SELECT 1”, his response was “I was taught that way”. The good news is he was also taught that “SELECT *” is bad, but unfortunately teaching good or bad with no explanation as to when or why is one of the reasons this thread exists.

    Reply
  • SELECT 1 helped me fairly well

    Reply
  • i just run select 1 from in DB2 it is showing 1,1,1,1,1… nothing more then that,even i tryd at SQL server too, same result , i dont knw how any y u guys comparing results ???

    Reply

Leave a Reply

Menu