SQL Puzzle – Correct the Incorrect Query – Win Price Worth USD 1000 – Aggregate and Subquery

SQL Puzzle - Correct the Incorrect Query - Win Price Worth USD 1000 - Aggregate and Subquery puzzle It has been a while since we have seen a puzzle so let us do a small puzzle today. Actually, this is a very simple puzzle if you have worked with SQL Server for a while. I got an email from one of the readers from the blog post and he wanted me to correct the following query which he had written. Well, I have been working with SQL Server for over 19 years and I often do not remember the syntax and have to look up online. Here is a very simple puzzle for you which involves Aggregate and Subquery.

Important Update: The Workshop is now over and it was one of the landmark workshops, where I shared business secrets of successful Performance Tuning Expert. You can watch the recording of the workshop for a limited period of time.

Puzzle – Aggregate and Subquery

Here is the query when we run against the AdventureWorks database, it gives us an error. Remember AdventureWorks database is a sample database and you can download and install AdventureWorks over here.

USE AdventureWorks2014
GO
SELECT pch.StandardCost, p.ProductID
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > AVG(p.StandardCost)
GO

Now when you run the above query, it gives us the following error:

Msg 147, Level 15, State 1, Line 17
An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference.

Now here is your task, you need to help me to fix the above query. The hint to resolve the query is already in the blog post’s title and in the query error. You need to understand what my query is attempting to do and use the hint which is already pointed out to you. After you successfully fix the query it will produce the following results.

SQL Puzzle - Correct the Incorrect Query - Win Price Worth USD 1000 - Aggregate and Subquery correctresult

Rules

Well, there are no real rules, as long as you modify my query to get me above results. Please post your query in the comments section below before April 22, 2019. I will keep all the answers hidden till April 22, 2019 11:59:00 PM.

Winner

One Winner will get 1 free seat for my upcoming class, which is worth USD 1000.

SQL Server Performance Tuning Practical Workshop for EVERYONE
Date: April 23, 2019, Tuesday
8:00 AM Pacific Time | 11:00 AM Eastern Time

If you do not want to take part in the contest, you can directly register for the class here.

Update: April 23, 2019

The contest is closed and the winning Answer is here:

USE AdventureWorks2014
GO
SELECT pch.ProductID, pch.StartDate,pch.EndDate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > p.StandardCost
GO

There is no need for the AVG on the StandardCost as there will be always a single entry for the product. Even if you change the query with group by + having you will still see the query uses the simple join.

Congratulations to the winner Nicholas Månsson. He is very excited to join the SQL Server Performance Tuning Practical Workshop for EVERYONE.

Everyone who participated you got to solve this difficult puzzle so indirectly you are a winner (of knowledge) as well. This was an intense competition with over 500 comments.

Reference: Pinal Dave (https://blog.sqlauthority.com)

Contest, SQL Scripts, SQL Server
Previous Post
SQL SERVER – Relating Unrelated Tables – A Question from Reader
Next Post
Puzzle – DELETE Qualified Rows From Multiple Tables – Win USD 1000 Worth Class

Related Posts

527 Comments. Leave new

  • Gopinath Srirangan
    April 19, 2019 9:58 pm

    SELECT T1.ProductID
    ,T1.StartDate
    ,T1.EndDate
    ,T1.StandardCost
    ,T1.ModifiedDate
    FROM (
    SELECT DISTINCT
    AVG(p.StandardCost) over(partition BY p.ProductID) AVGStandardCost
    ,pch.ProductID
    ,pch.StartDate
    ,pch.EndDate
    ,pch.StandardCost
    ,pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    ) T1
    WHERE T1.StandardCost > T1.AVGStandardCost

    Reply
  • /*
    Production.Product.ProductID is primary key, hence one record for one product, hence no need to compare the cost with average function
    */
    SELECT pch.StandardCost, p.ProductID
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > p.StandardCost

    Reply
  • Gopinath Srirangan
    April 19, 2019 10:42 pm

    SELECT * FROM Production.ProductCostHistory pch
    WHERE StandardCost > (SELECT AVG(StandardCost) FROM Production.Product p WHERE p.ProductID = pch.ProductID)

    Reply
  • Because ProductID in Production.Product is a primary key there will never be duplicate ProductID’s in the Production.Product table rendering “AVG(p.StandardCost)” irrelevant. You can modify the query by adding the additional fields you requested in your recordset and removing AVG in the where clause to get the desired results as follows:

    SELECT p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > p.StandardCost

    But lets say that there are multiple product ID’s in Production.Product and you want to find the hisortical rates that are above the average in the Product table you could use average in a subquery as follows:

    SELECT p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > (select avg(z.StandardCost) from Production.Product z where z.ProductID = p.ProductID)

    Both of these queries give me the following record set:
    ProductID StartDate EndDate StandardCost ModifiedDate
    707 2012-05-30 00:00:00.000 2013-05-29 00:00:00.000 13.8782 2013-05-29 00:00:00.000
    708 2012-05-30 00:00:00.000 2013-05-29 00:00:00.000 13.8782 2013-05-29 00:00:00.000
    711 2012-05-30 00:00:00.000 2013-05-29 00:00:00.000 13.8782 2013-05-29 00:00:00.000
    858 2012-05-30 00:00:00.000 2013-05-29 00:00:00.000 9.7136 2013-05-29 00:00:00.000
    859 2012-05-30 00:00:00.000 2013-05-29 00:00:00.000 9.7136 2013-05-29 00:00:00.000
    860 2012-05-30 00:00:00.000 2013-05-29 00:00:00.000 9.7136 2013-05-29 00:00:00.000

    Reply
  • Chandrabhan Singh Gurjar
    April 19, 2019 11:34 pm

    Hey Guys,
    Here are few ways to solve the above error and get the expected result.

    1) Using subquery if you would like to avoid join & group

    Select pch.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    from Production.ProductCostHistory pch
    WHERE pch.StandardCost > (Select AVG(p.StandardCost) from Production.Product p Where pch.ProductID = p.ProductID)

    OR

    2) Using subquery with wildcard select statement
    Select pch.*
    from Production.ProductCostHistory pch
    WHERE pch.StandardCost > (Select AVG(p.StandardCost) from Production.Product p Where pch.ProductID = p.ProductID)

    OR

    3) Using Join & Group
    Select pch.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    from Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    GROUP BY pch.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    Having pch.StandardCost > AVG(p.StandardCost)

    Reply
  • Chandrabhan Singh Gurjar
    April 19, 2019 11:52 pm

    Hi Pinal,
    One more quick solution to resolve the issue is to understand the data, as Production.Product table is having primary key constraint on ProductID, As there will not be any change in doing AVG of Standard Cost from production.product table. However by simply removing the aggregate function will resolve the issue without any effort.

    USE AdventureWorks2014
    GO
    SELECT p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > p.StandardCost
    GO

    Please share your thought.

    Reply
  • Hi,

    As Product table having primary constraint on productId, avg of each product will be the same cost…

    1. Sub Query:

    SELECT PCH.PRODUCTID, PCH.STARTDATE, PCH.ENDDATE, PCH.STANDARDCOST, PCH.MODIFIEDDATE
    FROM PRODUCTION.PRODUCTCOSTHISTORY PCH
    WHERE PCH.STANDARDCOST > (SELECT STANDARDCOST FROM PRODUCTION.PRODUCT P WHERE P.PRODUCTID = PCH.PRODUCTID)

    2nd: Remove the avg keyword.

    SELECT pch.StandardCost, p.ProductID
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > p.StandardCost

    Query plan is same on both way.. because it scan for product all rows…

    Reply
  • SELECT p.ProductID
    ,pch.StartDate
    ,pch.EndDate
    ,pch.StandardCost
    ,pch.ModifiedDate
    FROM Production.ProductCostHistory AS pch
    INNER JOIN Production.Product AS p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > (SELECT AVG(a.StandardCost)
    FROM Production.ProductCostHistory AS a
    WHERE a.ProductID = pch.ProductID
    GROUP BY a.ProductID
    )
    AND pch.EndDate IS NOT NULL
    ORDER BY pch.ProductID
    GO

    Reply
  • Devaraj Govender
    April 20, 2019 1:20 am

    Hi There, first time I’m entering one of your puzzles. Here you go:

    USE AdventureWorks2014
    GO
    SELECT p.ProductID, pch.StartDate, pch.EndDate, pch.ModifiedDate, pch.StandardCost
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    –WHERE pch.StandardCost > AVG(p.StandardCost)
    WHERE pch.StandardCost > (
    Select AVG(p2.StandardCost)
    From Production.Product p2
    Where p2.ProductID = p.ProductID)
    GO

    Reply
  • SELECT pch.*
    FROM Production.ProductCostHistory pch
    WHERE pch.StandardCost > (SELECT AVG(p.StandardCost) FROM Production.Product p WHERE pch.ProductID = p.ProductID)
    GO

    Reply
  • select pch.*
    from Production.ProductCostHistory pch
    inner join (select ProductID, avg(StandardCost) as avgStandardCost from Production.Product group by StandardCost,ProductID ) as p on pch.ProductID= p.ProductID
    WHERE pch.StandardCost > p.avgStandardCost

    Reply
  • I think, subquery is enough in this case until unless the performance aspects are not taken into consideration. Please correct me, if I am going wrong.

    SELECT pch.StandardCost, p.ProductID
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost >
    ( SELECT AVG(p.StandardCost)
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID)

    Reply
  • SELECT p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > (SELECT AVG(StandardCost) FROM Production.Product WHERE ProductID = p.ProductID)

    Reply
  • Maxime Veilleux
    April 20, 2019 2:03 am

    A product exists only once in product table…
    Therefore, there is no need to calculate the average of field p.StandardCost because the average will be the value in the sole records anyway. Getting expected result is a simple where criteria….

    SELECT PCH.ProductID,
    PCH.StartDate,
    PCH.EndDate,
    PCH.StandardCost,
    PCH.ModifiedDate
    FROM Production.ProductCostHistory AS PCH
    INNER JOIN Production.Product AS P ON PCH.ProductID = P.ProductID
    AND PCH.StandardCost > P.StandardCost

    If somehow you still want an average on some random field I’m unaware of, then common table expression and windows function are always an option to handle it.

    Reply
  • John DiMartino
    April 20, 2019 2:07 am

    Puzzle – Aggregate and Subquery. From Penal on SQL Authority.

    USE AdventureWorks2014
    GO
    –>> incorrect query script < AVG(p.StandardCost)
    GO

    –>> April 18, 2019
    –>> Select statement contains the required columns per the results, which display when executing the query.
    –>> Inner Join using sub query that contains the Avg() function used in the ON clause.
    USE AdventureWorks2014
    GO
    SELECT pp.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost,pch.modifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN (select p.ProductID, AVG(p.StandardCost) avgStndrdCost
    FROM Production.Product p
    GROUP BY p.ProductID) pp
    ON pch.ProductID = pp.ProductID
    WHERE pch.StandardCost > pp.avgStndrdCost
    GO

    Reply
  • Carol Siddall
    April 20, 2019 2:14 am

    SELECT pch.ProductId, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN (select distinct ProductId, avg(StandardCost) over (partition by ProductId) as AvgCost from Production.ProductCostHistory ) as pAvg
    ON pch.ProductID = pAvg.ProductID
    INNER JOIN Production.Product p
    on pch.ProductID = p.ProductID
    WHERE pch.StandardCost > pAvg.AvgCost
    and p.SellEndDate is null
    and pch.EndDate is not null
    order by pch.ProductId

    This gives you the result set you posted. I think the guy’s original request would have included the product name and wouldn’t have removed items where pch.EndDate was null, but for the purposes of the puzzle, I guess this makes sense since it gives a nice simple result set.

    Reply
  • Carol Siddall
    April 20, 2019 2:26 am

    Sorry, here it is with a subquery:

    SELECT pch.ProductId, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p
    on pch.ProductID = p.ProductID
    WHERE p.SellEndDate is null
    and pch.EndDate is not null
    and pch.StandardCost > (select avg(StandardCost) as AvgCost
    from Production.ProductCostHistory
    where Production.ProductCostHistory.ProductId = pch.ProductId)
    order by pch.ProductId

    I really didn’t need a windowing function for my other solution. This works fine and gives a better execution plan:
    SELECT pch.ProductId, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN (select ProductId, avg(StandardCost) as AvgCost
    from Production.ProductCostHistory
    group by ProductId ) as pAvg
    ON pch.ProductID = pAvg.ProductID
    INNER JOIN Production.Product p
    on pch.ProductID = p.ProductID
    WHERE pch.StandardCost > pAvg.AvgCost
    and p.SellEndDate is null
    and pch.EndDate is not null
    order by pch.ProductId

    Of course, you can go over the top with it and use a With clause, as well:

    With pAvg as (select ProductId, avg(StandardCost) as AvgCost
    from Production.ProductCostHistory
    group by ProductId )
    SELECT pch.ProductId, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN pAvg
    ON pch.ProductID = pAvg.ProductID
    INNER JOIN Production.Product p
    on pch.ProductID = p.ProductID
    WHERE pch.StandardCost > pAvg.AvgCost
    and p.SellEndDate is null
    and pch.EndDate is not null
    order by pch.ProductId

    But the subquery version definitely has the best execution plan.

    Reply
  • SELECT P.PRODUCTID,STARTDATE,ENDDATE,PCH.STANDARDCOST,PCH.MODIFIEDDATE
    FROM PRODUCTION.PRODUCTCOSTHISTORY PCH
    INNER JOIN PRODUCTION.PRODUCT P ON PCH.PRODUCTID = P.PRODUCTID
    GROUP BY P.PRODUCTID,PCH.STANDARDCOST,STARTDATE,ENDDATE,PCH.MODIFIEDDATE
    HAVING PCH.STANDARDCOST > AVG(P.STANDARDCOST)
    GO

    Reply
  • Michael Roudebush
    April 20, 2019 6:03 am

    USE AdventureWorks2014
    GO
    SELECT p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > p.StandardCost
    GO

    Reply
  • USE AdventureWorks2014
    GO
    SELECT pch.ProductID,pch.StartDate,pch.EndDate,pch.StandardCost,pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    WHERE pch.StandardCost IN
    (
    SELECT pch.StandardCost
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    GROUP BY pch.StandardCost,p.ProductID
    HAVING pch.StandardCost > AVG(p.StandardCost)
    )
    GO

    Reply

Leave a Reply