SQL Puzzle – Correct the Incorrect Query – Win Price Worth USD 1000 – Aggregate and Subquery

SQL Puzzle - Correct the Incorrect Query - Win Price Worth USD 1000 - Aggregate and Subquery puzzle It has been a while since we have seen a puzzle so let us do a small puzzle today. Actually, this is a very simple puzzle if you have worked with SQL Server for a while. I got an email from one of the readers from the blog post and he wanted me to correct the following query which he had written. Well, I have been working with SQL Server for over 19 years and I often do not remember the syntax and have to look up online. Here is a very simple puzzle for you which involves Aggregate and Subquery.

Important Update: The Workshop is now over and it was one of the landmark workshops, where I shared business secrets of successful Performance Tuning Expert. You can watch the recording of the workshop for a limited period of time.

Puzzle – Aggregate and Subquery

Here is the query when we run against the AdventureWorks database, it gives us an error. Remember AdventureWorks database is a sample database and you can download and install AdventureWorks over here.

USE AdventureWorks2014
GO
SELECT pch.StandardCost, p.ProductID
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > AVG(p.StandardCost)
GO

Now when you run the above query, it gives us the following error:

Msg 147, Level 15, State 1, Line 17
An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference.

Now here is your task, you need to help me to fix the above query. The hint to resolve the query is already in the blog post’s title and in the query error. You need to understand what my query is attempting to do and use the hint which is already pointed out to you. After you successfully fix the query it will produce the following results.

SQL Puzzle - Correct the Incorrect Query - Win Price Worth USD 1000 - Aggregate and Subquery correctresult

Rules

Well, there are no real rules, as long as you modify my query to get me above results. Please post your query in the comments section below before April 22, 2019. I will keep all the answers hidden till April 22, 2019 11:59:00 PM.

Winner

One Winner will get 1 free seat for my upcoming class, which is worth USD 1000.

SQL Server Performance Tuning Practical Workshop for EVERYONE
Date: April 23, 2019, Tuesday
8:00 AM Pacific Time | 11:00 AM Eastern Time

If you do not want to take part in the contest, you can directly register for the class here.

Update: April 23, 2019

The contest is closed and the winning Answer is here:

USE AdventureWorks2014
GO
SELECT pch.ProductID, pch.StartDate,pch.EndDate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > p.StandardCost
GO

There is no need for the AVG on the StandardCost as there will be always a single entry for the product. Even if you change the query with group by + having you will still see the query uses the simple join.

Congratulations to the winner Nicholas Månsson. He is very excited to join the SQL Server Performance Tuning Practical Workshop for EVERYONE.

Everyone who participated you got to solve this difficult puzzle so indirectly you are a winner (of knowledge) as well. This was an intense competition with over 500 comments.

Reference: Pinal Dave (https://blog.sqlauthority.com)

Contest, SQL Scripts, SQL Server
Previous Post
SQL SERVER – Relating Unrelated Tables – A Question from Reader
Next Post
Puzzle – DELETE Qualified Rows From Multiple Tables – Win USD 1000 Worth Class

Related Posts

527 Comments. Leave new

  • Dean James Gabriel
    April 18, 2019 5:58 pm

    USE AdventureWorks2014
    GO
    SELECT p.ProductID,startDate,EndDate,pch.StandardCost,pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    group by p.ProductID,startDate,EndDate,pch.StandardCost,pch.ModifiedDate
    having pch.StandardCost > AVG(p.StandardCost)

    GO

    Reply
  • HI, Pinal.

    I’m Ignez, from Brazil. I’d love to win a seat in your class, so there’s my answer:

    SELECT pch.*
    FROM Production.ProductCostHistory pch
    CROSS APPLY (SELECT p.ProductId, AVG(p.StandardCost) AS StandardCost
    FROM Production.Product p
    WHERE pch.ProductID = p.ProductID
    GROUP BY p.ProductID) as P
    WHERE pch.StandardCost > p.StandardCost
    GO

    Best regards.

    Reply
  • SELECT p.ProductID,pch.startdate, pch.enddate, P.StandardCost, p.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > p.StandardCost

    Reply
  • Here we are:

    SELECT p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID and pch.StandardCost > p.StandardCost

    Reply
  • Patrick McCollum
    April 18, 2019 6:03 pm

    USE AdventureWorks2014
    GO
    SELECT p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost >
    (SELECT AVG(p2.StandardCost) FROM Production.Product p2
    WHERE p2.ProductID = p.ProductID)
    GO

    Reply
  • USE AdventureWorks2014
    GO
    SELECT p.ProductID,pch.StartDate,pch.EndDate,pch.StandardCost,pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > (select AVG(StandardCost) from production.Product)
    GO

    Reply
  • Dave Merideth
    April 18, 2019 6:10 pm

    USE AdventureWorks
    GO
    SELECT p.ProductID, StartDate, EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    LEFT JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > (SELECT AVG(P1.StandardCost)
    FROM Production.Product AS P1
    WHERE P1.rowguid = p.rowguid)

    Reply
  • Use AdventureWorks2012
    Go
    SELECT p1.ProductID,StartDate,EndDate,ProductCostHis.StandardCost, ProductCostHis.ModifiedDate
    FROM Production.Product p1
    INNER JOIN (
    SELECT pch.ProductID,pch.StartDate,pch.EndDate, pch.StandardCost, AVG(pch.StandardCost) AS AVGStandardCost,pch.ModifiedDate
    FROM Production.ProductCostHistory pch

    GROUP BY ProductID,pch.StartDate,pch.EndDate,pch.StandardCost,pch.ModifiedDate
    ) ProductCostHis
    ON ProductCostHis.ProductID = p1.ProductID
    WHERE ProductCostHis.AVGStandardCost > p1.StandardCost

    Reply
    • formatted one:

      USE AdventureWorks2012
      GO

      SELECT p1.ProductID
      ,StartDate
      ,EndDate
      ,ProductCostHis.StandardCost
      ,ProductCostHis.ModifiedDate
      FROM Production.Product p1
      INNER JOIN (
      SELECT pch.ProductID
      ,pch.StartDate
      ,pch.EndDate
      ,pch.StandardCost
      ,AVG(pch.StandardCost) AS AVGStandardCost
      ,pch.ModifiedDate
      FROM Production.ProductCostHistory pch
      GROUP BY ProductID
      ,pch.StartDate
      ,pch.EndDate
      ,pch.StandardCost
      ,pch.ModifiedDate
      ) ProductCostHis ON ProductCostHis.ProductID = p1.ProductID
      WHERE ProductCostHis.AVGStandardCost > p1.StandardCost

      Reply
  • It is hard to understand the requirement as the query provided is wrong. However, analysing the data and your expected output, it does not need to be a group by query.

    This query will give your expected output:
    SELECT pch.*
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > p.StandardCost

    Reply
  • USE AdventureWorks2014
    GO
    SELECT p.ProductID, pch.StartDate, pch.EndDate, pch.standardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    GROUP BY p.ProductID, pch.StartDate, pch.EndDate, pch.ModifiedDate, pch.StandardCost
    HAVING pch.StandardCost > AVG(p.StandardCost)

    Reply
  • SELECT p.ProductID,p.StartDate,p.EndDate,pch.StandardCost,p.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    GROUP BY p.ProductID
    HAVING pch.StandardCost > AVG(p.StandardCost)

    Reply
  • Hi Dave,

    Not sure but look at it. The result is almost the same except for the nulls in the Enddate.

    USE AdventureWorks
    GO
    SELECT ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    WHERE pch.StandardCost > (select AVG(StandardCost) from production.ProductCostHistory
    where pch.ProductID = ProductID)
    order by ProductID
    GO

    Reply
  • girish thakur
    April 18, 2019 6:23 pm

    as error says “HAVING clause…”
    Having said that “Having” did the magic

    USE AdventureWorks2014
    GO
    SELECT p.ProductID,pch.StartDate,pch.EndDate,pch.StandardCost,pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    group by p.ProductID,pch.StartDate,pch.EndDate,pch.StandardCost,pch.ModifiedDate
    having pch.StandardCost > AVG(p.StandardCost)
    GO

    Reply
  • girish thakur
    April 18, 2019 6:24 pm

    I don’t see my comments

    Reply
  • Albertus van den Berg
    April 18, 2019 6:28 pm

    There you go.

    SELECT pch.*
    FROM Production.ProductCostHistory pch
    INNER JOIN (SELECT ProductID, AVG(StandardCost) as AVG_Cost from Production.Product GROUP BY ProductID) as p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > AVG_Cost

    Reply
  • USE AdventureWorks2016
    GO
    ;
    with AvgStdCost (ProductID , AvgStdCost )
    as (
    Select P.ProductID , Avg( P.StandardCost )
    From Production.product as P
    Group By P.ProductID
    )
    SELECT pch.StandardCost, p.ProductID
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    Inner join AvgStdCost A on a.productID = p.ProductID
    WHERE pch.StandardCost > A.AvgStdCost
    GO

    Reply
  • Vasantha Guru
    April 18, 2019 6:30 pm

    Hi,

    Please review my answer below.

    SELECT pch.*
    FROM ProductCostHistory pch
    INNER JOIN (SELECT PRODUCTID,AVG(STANDARDCOST) AS STDCOST FROM Product GROUP BY PRODUCTID) p
    ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost >p.STDCOST

    Thanks for the oppurtunity.

    Best Regards,
    Vasantha Guru

    Reply
  • Corey Hambrick
    April 18, 2019 6:30 pm

    I created a subquery in 2 ways. First I tried it in the where clause directly. It has a low subtree cost. Then I put the subquery in the FROM Clause essentially creating what I would call a derived table. I put the aggregate field in the derived table and then was able to use as a field within the WHERE clause. This produced and even lower subtree cost. So that is my solution. A derived table subquery in the FROM Clause.

    My query…

    SELECT pch.StandardCost, p.ProductID
    FROM Production.ProductCostHistory pch
    INNER JOIN
    (SELECT PS.ProductID,AVG(PS.StandardCost) AS AVGCost FROM Production.Product PS GROUP BY PS.PRODUCTID) AS p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > AVGCost
    GO

    Reply
  • SELECT p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > (SELECT AVG(StandardCost) FROM Production.Product)
    GO

    Reply
  • Joe Gakenheimer
    April 18, 2019 6:33 pm

    Hi Pinal,

    I have 3 solutions below; each with different execution plans. The output of each query is exactly as your output. The last solution is probably the one you’d be most interested in as I’ve implemented a correlated subquery and removed the HAVING clause. I’d guess the 3rd solution would be the fastest.

    Thanks, Joe

    –method 1 the quick and simple way utilizing HAVING, this would be my solution in a prod env
    SELECT p.ProductID, StartDate, EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p
    ON pch.ProductID = p.ProductID
    GROUP BY pch.StandardCost, p.ProductID, StartDate, EndDate, pch.StandardCost, pch.ModifiedDate
    HAVING pch.StandardCost > AVG(p.StandardCost)
    ORDER BY p.ProductID

    –method 2 correlated subquery
    SELECT p.ProductID, StartDate, EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p
    ON pch.ProductID = p.ProductID
    GROUP BY pch.StandardCost, p.ProductID, StartDate, EndDate, pch.StandardCost, pch.ModifiedDate
    HAVING pch.StandardCost > (
    SELECT AVG(p2.StandardCost)
    FROM Production.Product p2
    WHERE p2.ProductID = p.ProductID
    )
    ORDER BY p.ProductID

    –method 3 correlated subquery optimized with no HAVING clause
    SELECT p.ProductID, StartDate, EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p
    ON pch.ProductID = p.ProductID
    INNER JOIN (
    SELECT p2.ProductID, AVG(p2.StandardCost) AS StandardCost
    FROM Production.Product p2
    GROUP BY p2.ProductID
    ) p3
    ON p3.ProductID = p.ProductID
    WHERE pch.StandardCost > p3.StandardCost

    Reply

Leave a Reply