SQL Puzzle – Correct the Incorrect Query – Win Price Worth USD 1000 – Aggregate and Subquery

SQL Puzzle - Correct the Incorrect Query - Win Price Worth USD 1000 - Aggregate and Subquery puzzle It has been a while since we have seen a puzzle so let us do a small puzzle today. Actually, this is a very simple puzzle if you have worked with SQL Server for a while. I got an email from one of the readers from the blog post and he wanted me to correct the following query which he had written. Well, I have been working with SQL Server for over 19 years and I often do not remember the syntax and have to look up online. Here is a very simple puzzle for you which involves Aggregate and Subquery.

Important Update: The Workshop is now over and it was one of the landmark workshops, where I shared business secrets of successful Performance Tuning Expert. You can watch the recording of the workshop for a limited period of time.

Puzzle – Aggregate and Subquery

Here is the query when we run against the AdventureWorks database, it gives us an error. Remember AdventureWorks database is a sample database and you can download and install AdventureWorks over here.

USE AdventureWorks2014
GO
SELECT pch.StandardCost, p.ProductID
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > AVG(p.StandardCost)
GO

Now when you run the above query, it gives us the following error:

Msg 147, Level 15, State 1, Line 17
An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference.

Now here is your task, you need to help me to fix the above query. The hint to resolve the query is already in the blog post’s title and in the query error. You need to understand what my query is attempting to do and use the hint which is already pointed out to you. After you successfully fix the query it will produce the following results.

SQL Puzzle - Correct the Incorrect Query - Win Price Worth USD 1000 - Aggregate and Subquery correctresult

Rules

Well, there are no real rules, as long as you modify my query to get me above results. Please post your query in the comments section below before April 22, 2019. I will keep all the answers hidden till April 22, 2019 11:59:00 PM.

Winner

One Winner will get 1 free seat for my upcoming class, which is worth USD 1000.

SQL Server Performance Tuning Practical Workshop for EVERYONE
Date: April 23, 2019, Tuesday
8:00 AM Pacific Time | 11:00 AM Eastern Time

If you do not want to take part in the contest, you can directly register for the class here.

Update: April 23, 2019

The contest is closed and the winning Answer is here:

USE AdventureWorks2014
GO
SELECT pch.ProductID, pch.StartDate,pch.EndDate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > p.StandardCost
GO

There is no need for the AVG on the StandardCost as there will be always a single entry for the product. Even if you change the query with group by + having you will still see the query uses the simple join.

Congratulations to the winner Nicholas Månsson. He is very excited to join the SQL Server Performance Tuning Practical Workshop for EVERYONE.

Everyone who participated you got to solve this difficult puzzle so indirectly you are a winner (of knowledge) as well. This was an intense competition with over 500 comments.

Reference: Pinal Dave (https://blog.sqlauthority.com)

Contest, SQL Scripts, SQL Server
Previous Post
SQL SERVER – Relating Unrelated Tables – A Question from Reader
Next Post
Puzzle – DELETE Qualified Rows From Multiple Tables – Win USD 1000 Worth Class

Related Posts

527 Comments. Leave new

  • Hi Pinal Dave,

    We could use,
    USE AdventureWorks2014
    GO
    SELECT pch.*
    FROM Production.ProductCostHistory pch
    WHERE pch.StandardCost > (Select AVG(p.StandardCost) From Production.Product p Where pch.ProductID = p.ProductID)
    GO

    Usage of COALESCE is optional, depends on the necessity.

    Great job, keep rocking!

    Thanks,
    Praveen V

    Reply
  • Abdul Qayyum Babulkhair
    April 20, 2019 1:14 pm

    SELECT pch.StandardCost, p.ProductID, pch.StartDate, pch.EndDate, p.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > (p.StandardCost)

    Reply
  • Rakesh Tarapara
    April 20, 2019 2:44 pm

    Select pch.*
    From Production.ProductCostHistory pch
    Join (Select P.ProductID,AVG(p.StandardCost) StandardCost
    From Production.Product P
    Group By p.ProductID) p On p.ProductID = pch.ProductID
    WHERE pch.StandardCost > p.StandardCost

    Reply
  • Hi Pinal!

    Here’s my resubmission. This assumes the intention of the query is to find periods when each product’s cost was higher than the current product standard cost.

    SELECT ProductCostHistory.ProductID,
    ProductCostHistory.StartDate,
    ProductCostHistory.EndDate,
    ProductCostHistory.StandardCost,
    ProductCostHistory.ModifiedDate
    FROM Production.ProductCostHistory
    WHERE EXISTS (
    SELECT 1
    FROM Production.Product
    WHERE ProductCostHistory.ProductID = Product.ProductID
    AND ProductCostHistory.StandardCost > Product.StandardCost
    )
    ORDER BY ProductCostHistory.ProductID

    This has an estimated query plan cost of 0.0258527

    An alternative with an INNER JOIN (below) uses a little more – 0.0258636 – with the same number of reads, and the execution time is approximately the same.

    SELECT ProductCostHistory.ProductID,
    ProductCostHistory.StartDate,
    ProductCostHistory.EndDate,
    ProductCostHistory.StandardCost,
    ProductCostHistory.ModifiedDate
    FROM Production.ProductCostHistory
    INNER JOIN Production.Product
    ON ProductCostHistory.ProductID = Product.ProductID
    AND ProductCostHistory.StandardCost > Product.StandardCost
    ORDER BY ProductCostHistory.ProductID

    I read the sample query’s intention as finding where the product cost history records where the cost was either higher than the product’s historical average standard price, or when a product’s cost was higher than the current average cost for all products. Neither produce the sample output, so I’ve had to ignore both the hint to use an aggregate, and the use of the AVG command in the sample query.

    I look forward to seeing the solution and intention of the query.

    Best,
    AndrewDP23

    Reply
  • SELECT p.ProductID,pch.StartDate,pch.EndDate,pch.StandardCost,Pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    GROUP BY p.ProductID,pch.StartDate,pch.EndDate,pch.StandardCost,Pch.ModifiedDate
    HAVING pch.StandardCost > AVG(p.StandardCost)

    Reply
  • We are over 313 comments. I really will need a volunteer to help me select a winner on Monday.

    Reply
  • Charles Loprinzo
    April 20, 2019 6:53 pm

    USE AdventureWorks2014
    GO
    SELECT p.ProductID, StartDate, EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    group by p.ProductID, StartDate, EndDate, pch.StandardCost, pch.ModifiedDate
    having pch.StandardCost > AVG(p.StandardCost)
    GO

    Reply
  • Jija Dahifale
    April 20, 2019 9:09 pm

    Hi Pinal, In your query, you put Average of Standard cost of Table Product. I can see ProductID is primary key of Product table so each productID having only one StandardCost.
    Why you have used Average of Product.StandardCost instead of without averaging not getting; Let me know if I mis-understood. So I can try my luck. :) Thank you.

    Regards,
    Jija Dahifale

    Reply
    • This is a great question and I think you are very close to answering. Maybe it is not needed or maybe – it is not needed and you have understood it correctly.

      Reply
  • SELECT pch.StandardCost, p.ProductID
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    where pch.StandardCost > (select avg(p.StandardCost) from Production.Product p)—-AVG(p.StandardCost)

    Reply
  • Nicholas MÃ¥nsson
    April 21, 2019 9:32 am

    Hi Pinal,

    There are multiple answers to this query but after looking at the query, it is clear that there is 1 product and there is a single standard cost associated with it. So there is no need to use AVG there. Removing the average will help you to get your query quickly.

    USE AdventureWorks2014
    GO
    SELECT pch.ProductID, pch.StartDate,pch.EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > p.StandardCost
    GO

    Please note that using CTE, Subquery or Having may get the same answer but it was important to understand the data and its property. Getting correct answer should not be priority when re-writing the query but getting correct answer by understanding the query.

    Nicholas

    Reply
  • No sub query is required and below query o/p is same as your o/p. Please correct me if I’m wrong.

    SELECT PCH.ProductID, PCH.StartDate, PCH.EndDate, PCH.StandardCost, PCH.ModifiedDate
    FROM Production.ProductCostHistory PCH
    INNER JOIN Production.Product P ON PCH.ProductID = P.ProductID
    WHERE PCH.StandardCost > P.StandardCost
    GO

    Reply
  • Can not use aggregate function in WHERE clause unless it is in a subquery contained in a HAVING clause or a select list.

    If we need to get the desired output then we need to change the select list and also we need to use sub query for AVg function.

    Reply
  • Hi, Pinal
    Below you can find a solution to your query puzzle. It produces the results you want to.

    USE AdventureWorks2014
    GO

    SELECT pp.ProductID
    ,pp.StandardCost
    –,p2.StandardCost AS [Avg. Standard Cost]
    ,pp.StartDate
    ,pp.EndDate
    ,pp.ModifiedDate
    FROM Production.ProductCostHistory pp INNER JOIN (
    SELECT ProductID
    ,AVG(StandardCost) AS StandardCost
    FROM Production.Product
    GROUP BY ProductID
    ) p2
    ON p2.ProductID = pp.ProductID
    WHERE pp.StandardCost > p2.StandardCost
    ORDER BY pp.ProductID

    Reply
  • If it’s ok with you, I have two answers for this (although I prefer one answer more than the other). Each of them have similar query plans (one uses Inner Join plan and the other Right Semi Join plan), same IO statistics, and the same results. But first, my thoughts when I was trying to fix the query.

    First thing I noticed is seeing the expected result of the query, the query does not need to get or display columns from Production.Product table. All of the columns in the expected result are from Production.ProductCostHistory. So from there, we can already look into putting it in a sub query instead of joining the table. Which lead me to the two answers.

    First answer (which I prefer more for being simpler)
    SELECT pch.ProductID
    ,pch.StartDate
    ,pch.EndDate
    ,pch.StandardCost
    ,pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    WHERE pch.StandardCost > (SELECT AVG(p.StandardCost)
    FROM Production.Product p
    WHERE p.ProductID = pch.ProductID)

    Second answer
    SELECT pch.ProductID
    ,pch.StartDate
    ,pch.EndDate
    ,pch.StandardCost
    ,pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    WHERE EXISTS (SELECT p.ProductID
    FROM Production.Product p
    WHERE pch.ProductID = p.ProductID
    GROUP BY p.ProductID
    HAVING pch.StandardCost > AVG(p.StandardCost)
    )

    I hope my answers are correct. :)

    Reply
  • Robin Lantigua
    April 22, 2019 7:54 am

    USE AdventureWorks
    GO
    SELECT
    p.ProductID,
    pch.StartDate ,
    pch.EndDate ,
    pch.StandardCost,
    pch.ModifiedDate

    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p
    ON pch.ProductID = p.ProductID

    INNER JOIN (
    SELECT p.ProductID,
    AVG(p.StandardCost) StandardCostAvg
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p
    ON pch.ProductID = p.ProductID
    group by p.ProductID,
    p.StandardCost
    ) T
    ON T.ProductID = P.ProductID
    WHERE pch.StandardCost > T.StandardCostAvg –AVG(p.StandardCost)

    Reply
  • Dhammika Devarathne
    April 22, 2019 8:09 am

    — Re-comment as your request

    USE AdventureWorks2014
    GO
    SELECT PCH.ProductID,
    [StartDate],
    [EndDate],
    [StandardCost],
    [ModifiedDate]
    FROM [Production].[ProductCostHistory] PCH
    INNER JOIN
    (
    SELECT [ProductID],
    AVG([StandardCost]) AS [AVG_StandardCost]
    FROM [Production].[Product]
    GROUP BY [ProductID]
    ) P ON PCH.ProductID = P.ProductID
    WHERE PCH.StandardCost > P.AVG_StandardCost;

    Reply
  • SELECT pch.StandardCost, p.ProductID
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    where pch.StandardCost > (select AVG(p.StandardCost) from Production.Product p)

    Reply
  • Ashish Trivedi
    April 22, 2019 10:28 am

    ProductID is Identity column in “Production.Product” thus we do not need to calculate AVG of p.StandardCost.

    Below is my solution.

    SELECT p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    –pch.StandardCost, p.ProductID
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > p.StandardCost

    Reply
  • Shahu Thorawat
    April 22, 2019 11:27 am

    SELECT p.ProductID,pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    group by p.ProductID,pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    HAVING pch.StandardCost > AVG(p.StandardCost)

    Reply
  • SELECT p.ProductID,
    pch.StartDate,
    pch.EndDate,
    pch.StandardCost,
    pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN (
    SELECT p.ProductID,
    p.StandardCost
    FROM Production.Product p
    INNER JOIN Production.ProductCostHistory pch
    ON pch.ProductID = p.ProductID
    GROUP BY
    p.ProductID, p.StandardCost
    ) AS p
    ON p.ProductID = pch.ProductID
    WHERE pch.StandardCost > p.StandardCost

    Reply

Leave a Reply