SQL Puzzle – Correct the Incorrect Query – Win Price Worth USD 1000 – Aggregate and Subquery

SQL Puzzle - Correct the Incorrect Query - Win Price Worth USD 1000 - Aggregate and Subquery puzzle It has been a while since we have seen a puzzle so let us do a small puzzle today. Actually, this is a very simple puzzle if you have worked with SQL Server for a while. I got an email from one of the readers from the blog post and he wanted me to correct the following query which he had written. Well, I have been working with SQL Server for over 19 years and I often do not remember the syntax and have to look up online. Here is a very simple puzzle for you which involves Aggregate and Subquery.

Important Update: The Workshop is now over and it was one of the landmark workshops, where I shared business secrets of successful Performance Tuning Expert. You can watch the recording of the workshop for a limited period of time.

Puzzle – Aggregate and Subquery

Here is the query when we run against the AdventureWorks database, it gives us an error. Remember AdventureWorks database is a sample database and you can download and install AdventureWorks over here.

USE AdventureWorks2014
GO
SELECT pch.StandardCost, p.ProductID
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > AVG(p.StandardCost)
GO

Now when you run the above query, it gives us the following error:

Msg 147, Level 15, State 1, Line 17
An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference.

Now here is your task, you need to help me to fix the above query. The hint to resolve the query is already in the blog post’s title and in the query error. You need to understand what my query is attempting to do and use the hint which is already pointed out to you. After you successfully fix the query it will produce the following results.

SQL Puzzle - Correct the Incorrect Query - Win Price Worth USD 1000 - Aggregate and Subquery correctresult

Rules

Well, there are no real rules, as long as you modify my query to get me above results. Please post your query in the comments section below before April 22, 2019. I will keep all the answers hidden till April 22, 2019 11:59:00 PM.

Winner

One Winner will get 1 free seat for my upcoming class, which is worth USD 1000.

SQL Server Performance Tuning Practical Workshop for EVERYONE
Date: April 23, 2019, Tuesday
8:00 AM Pacific Time | 11:00 AM Eastern Time

If you do not want to take part in the contest, you can directly register for the class here.

Update: April 23, 2019

The contest is closed and the winning Answer is here:

USE AdventureWorks2014
GO
SELECT pch.ProductID, pch.StartDate,pch.EndDate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > p.StandardCost
GO

There is no need for the AVG on the StandardCost as there will be always a single entry for the product. Even if you change the query with group by + having you will still see the query uses the simple join.

Congratulations to the winner Nicholas Månsson. He is very excited to join the SQL Server Performance Tuning Practical Workshop for EVERYONE.

Everyone who participated you got to solve this difficult puzzle so indirectly you are a winner (of knowledge) as well. This was an intense competition with over 500 comments.

Reference: Pinal Dave (https://blog.sqlauthority.com)

Contest, SQL Scripts, SQL Server
Previous Post
SQL SERVER – Relating Unrelated Tables – A Question from Reader
Next Post
Puzzle – DELETE Qualified Rows From Multiple Tables – Win USD 1000 Worth Class

Related Posts

527 Comments. Leave new

  • USE AdventureWorks2014
    GO
    ;with AvgCost as (
    SELECT DISTINCT
    pch.ProductID
    ,AVG(pch.StandardCost) as StandardCost
    FROM Production.ProductCostHistory as pch
    Group by pch.ProductID
    )
    SELECT ac.StandardCost, p.ProductID
    FROM AvgCost ac
    INNER JOIN Production.Product p ON ac.ProductID = p.ProductID
    WHERE ac.StandardCost > p.StandardCost
    GO

    Reply
  • Srikanth Chindam
    April 19, 2019 5:09 pm

    SELECT pch.*
    FROM Production.ProductCostHistory pch
    WHERE pch.StandardCost > (SELECT AVG(p.StandardCost) FROM Production.Product p WHERE pch.ProductID = p.ProductID)

    Reply
  • You don’t need the AVG function. Product table already has the average standar cost. So you can just modify the query this way:

    SELECT pch.ProductID,pch.StartDate,pch.EndDate,pch.StandardCost,pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > p.StandardCost

    Reply
  • need to use having clause instead of where

    Reply
  • SELECT *
    FROM ( SELECT pch.StandardCost, p.ProductID, AVG(p.StandardCost) OVER (PARTITION BY NULL) AS AvgStandardCost
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p
    ON pch.ProductID=p.ProductID) Windowed
    WHERE StandardCost>Windowed.AvgStandardCost;

    I like this because it only requires a single read of the underlying tables.

    Reply
  • USE AdventureWorks2014
    GO
    SELECT p.ProductID,StartDate,EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    group by p.ProductID,startdate,enddate, pch.StandardCost, pch.ModifiedDate
    having pch.StandardCost > AVG(p.StandardCost)
    GO

    Reply
  • Hi

    I think this is the query

    SELECT p.ProductID,pch.StartDate,pch.EndDate,pch.StandardCost,pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    GROUP BY p.ProductID,pch.StartDate,pch.EndDate,pch.StandardCost,pch.ModifiedDate
    HAVING pch.StandardCost > AVG(p.StandardCost)
    GO
    Regards
    Antonio

    Reply
  • Hi Pinal,

    Interesting query.
    Being an Oracle SQL and PL/SQL developer for 13+ years, this is my first attempt to do any coding in MS SQL Server.
    Let’s see how it goes.

    I tried to post on comment section of your blog post but it seems there is some issue so I am emailing you my answer.

    Option-1:
    Ideally Product.ProductID column must be a Primary Key. If that is the case then there is no point to use AVG function with Product.StandardCost column. In that case query can simply be written as:

    SELECT p.ProductID, pch.StandardCost
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > p.StandardCost
    ORDER BY p.ProductID;

    Or may be as below to get all the columns you specified in the output you provide.

    SELECT pch.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > p.StandardCost
    ORDER BY pch.ProductID;

    Option-2:
    Let’s assume Product.ProductID is not a Primary Key column.
    In that case we can use correlated sub-query to achieve the desired output.

    SELECT pch.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    WHERE pch.StandardCost > ( SELECT AVG(p.StandardCost)
    FROM Production.Product p
    WHERE p.ProductID = pch.ProductID )
    ORDER BY pch.ProductID;

    Option-3:
    Using Inline View.

    SELECT pch.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN ( SELECT p1.ProductID, AVG(p1.StandardCost) StandardCost
    FROM Production.Product p1
    GROUP BY p1.ProductID ) p
    ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > p.StandardCost
    ORDER BY pch.ProductID;

    Thanks,
    Rahul Gandhi,
    Associate Manager – PD,
    Revitas (now Model N), Ahmedabad

    Reply
  • SELECT p.productid,pch.StartDate,pch.EndDate,pch.StandardCost,pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID and (pch.StandardCost>p.standardCost)

    Reply
  • Deepraj Singh
    April 19, 2019 7:48 pm

    USE AdventureWorks2014
    GO
    SELECT p.productid,pch.StartDate,pch.EndDate,pch.StandardCost,pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    group by p.productid,pch.StartDate,pch.EndDate,pch.StandardCost,pch.ModifiedDate
    having pch.StandardCost>avg(p.standardCost)

    Reply
  • Philip van Gass
    April 19, 2019 7:49 pm

    I have rewritten my answer using a subquery (see below) but my previous answer works just as well and produces exactly the same execution plan.

    USE AdventureWorks2014
    GO
    SELECT pch.ProductID, StartDate, EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    WHERE EXISTS (SELECT ProductID FROM Production.Product p WHERE p.ProductID = pch.ProductID
    GROUP BY p.ProductID
    HAVING pch.StandardCost > AVG(p.StandardCost))
    GROUP BY pch.ProductID, StartDate, EndDate, pch.StandardCost, pch.ModifiedDate
    GO

    Reply
  • Mark Marshall
    April 19, 2019 8:07 pm

    And another one from me….

    SELECT p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN
    (
    SELECT ProductID, AVG(StandardCost) AS AvgCost
    FROM Production.Product
    GROUP BY ProductID
    ) p
    ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > p.AvgCost

    Reply
  • Robert E McCormick
    April 19, 2019 8:20 pm

    USE AdventureWorks
    GO
    SELECT pch.*
    FROM Production.ProductCostHistory pch
    INNER JOIN (
    select ProductID ,avg(StandardCost) as StandardCost
    from Production.Product
    group by productid) p
    ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > p.StandardCost
    GO

    Reply
  • Robert E McCormick
    April 19, 2019 8:33 pm

    To add on my my previous entry, my second solution uses a correlated subquery, but it shows the same execution plan; which surprised me.
    USE AdventureWorks
    GO
    SELECT pch.*
    FROM Production.ProductCostHistory pch
    INNER JOIN (
    select ProductID ,avg(StandardCost) as StandardCost
    from Production.Product
    group by productid) p
    ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > p.StandardCost
    GO
    — Or another way…
    select pch.*
    from Production.ProductCostHistory pch
    where pch.StandardCost > (
    Select Avg(p.StandardCost) as StandardCost
    from Production.Product p
    where p.productID = pch.productID)
    GO

    Reply
  • USE AdventureWorks2014
    GO
    SELECT p.ProductID, pch.StandardCost
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    GROUP BY ProductID
    HAVING pch.StandardCost > AVG(p.StandardCost)
    GO

    Reply
  • WE are at 282 comments… some really good comments.

    Reply
  • Here is the best I came up with:
    USE AdventureWorks2012
    GO
    SELECT pch.*
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost>p.StandardCost
    GO

    I started by introducing a subquery lookup like this:
    SELECT pch.*
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > (SELECT AVG(p2.StandardCost) FROM Production.Product p2 WHERE p2.ProductID=pch.ProductID)
    GO

    But saw that the original ‘p’ table was no longer needed, so took it out and let the subquery take the alias:
    SELECT pch.*
    FROM Production.ProductCostHistory pch
    WHERE pch.StandardCost > (SELECT avg(p.StandardCost) FROM Production.Product p WHERE p.ProductID=pch.ProductID)
    GO

    And then figured that since there would always only ever be one p.ProductID for a given pch.ProductID, the AVG(p.StandardCost) would just be p.StandardCost, so I could remove the subquery altogether and just use p.StandardCost. The query execution plans for all three of the above are identical.

    Thank you for the learning.

    Reply
  • John J DiMartino
    April 19, 2019 9:19 pm

    Pinal, please provide the URL/ link where I should place my puzzle response. Thanks.

    Reply
  • Gopinath Srirangan
    April 19, 2019 9:45 pm

    SELECT pch.* FROM Production.ProductCostHistory pch INNER JOIN
    (
    SELECT
    AVG(p.StandardCost) AVGStandardCost
    ,p.ProductID
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    GROUP BY p.ProductID
    ) T1 ON pch.ProductID = T1.ProductID
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > T1.AVGStandardCost
    GO

    Reply
  • ;with STD as (
    SELECT p.ProductID,AVG(p.StandardCost) as AVGCOST
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    GROUP BY p.ProductID
    )
    SELECT p.ProductID,startdate,enddate,pch.StandardCost,ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN STD p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > AVGCOST

    Reply

Leave a Reply