SQL Puzzle – Correct the Incorrect Query – Win Price Worth USD 1000 – Aggregate and Subquery

SQL Puzzle - Correct the Incorrect Query - Win Price Worth USD 1000 - Aggregate and Subquery puzzle It has been a while since we have seen a puzzle so let us do a small puzzle today. Actually, this is a very simple puzzle if you have worked with SQL Server for a while. I got an email from one of the readers from the blog post and he wanted me to correct the following query which he had written. Well, I have been working with SQL Server for over 19 years and I often do not remember the syntax and have to look up online. Here is a very simple puzzle for you which involves Aggregate and Subquery.

Important Update: The Workshop is now over and it was one of the landmark workshops, where I shared business secrets of successful Performance Tuning Expert. You can watch the recording of the workshop for a limited period of time.

Puzzle – Aggregate and Subquery

Here is the query when we run against the AdventureWorks database, it gives us an error. Remember AdventureWorks database is a sample database and you can download and install AdventureWorks over here.

USE AdventureWorks2014
GO
SELECT pch.StandardCost, p.ProductID
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > AVG(p.StandardCost)
GO

Now when you run the above query, it gives us the following error:

Msg 147, Level 15, State 1, Line 17
An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference.

Now here is your task, you need to help me to fix the above query. The hint to resolve the query is already in the blog post’s title and in the query error. You need to understand what my query is attempting to do and use the hint which is already pointed out to you. After you successfully fix the query it will produce the following results.

SQL Puzzle - Correct the Incorrect Query - Win Price Worth USD 1000 - Aggregate and Subquery correctresult

Rules

Well, there are no real rules, as long as you modify my query to get me above results. Please post your query in the comments section below before April 22, 2019. I will keep all the answers hidden till April 22, 2019 11:59:00 PM.

Winner

One Winner will get 1 free seat for my upcoming class, which is worth USD 1000.

SQL Server Performance Tuning Practical Workshop for EVERYONE
Date: April 23, 2019, Tuesday
8:00 AM Pacific Time | 11:00 AM Eastern Time

If you do not want to take part in the contest, you can directly register for the class here.

Update: April 23, 2019

The contest is closed and the winning Answer is here:

USE AdventureWorks2014
GO
SELECT pch.ProductID, pch.StartDate,pch.EndDate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > p.StandardCost
GO

There is no need for the AVG on the StandardCost as there will be always a single entry for the product. Even if you change the query with group by + having you will still see the query uses the simple join.

Congratulations to the winner Nicholas Månsson. He is very excited to join the SQL Server Performance Tuning Practical Workshop for EVERYONE.

Everyone who participated you got to solve this difficult puzzle so indirectly you are a winner (of knowledge) as well. This was an intense competition with over 500 comments.

Reference: Pinal Dave (https://blog.sqlauthority.com)

Contest, SQL Scripts, SQL Server
Previous Post
SQL SERVER – Relating Unrelated Tables – A Question from Reader
Next Post
Puzzle – DELETE Qualified Rows From Multiple Tables – Win USD 1000 Worth Class

Related Posts

527 Comments. Leave new

  • Boris Shimonov
    April 18, 2019 7:46 pm

    Hi Pinal,

    I modified query, and result is below:

    ;With CTE AS
    (
    SELECT p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate,
    AVG(p.StandardCost) AvarageStdCost
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    GROUP BY p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    )
    SELECT ProductID, StartDate, EndDate, StandardCost, ModifiedDate
    FROM CTE
    WHERE StandardCost > AvarageStdCost

    Thank you.

    Boris

    Reply
  • USE AdventureWorks2014
    GO
    SELECT pch.StandardCost, p.ProductID
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > (SELECT AVG(StandardCost) FROM Production.Product)
    GO

    Reply
  • SELECT p.ProductID, StartDate, EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > (select AVG(pr.StandardCost) from Production.Product pr where pr.ProductID = p.ProductID)

    Reply
  • Vishwa Deepak
    April 18, 2019 7:56 pm

    /* Hi Pinal
    Please find my query below , This is based on your output result.
    */

    USE AdventureWorks2014
    GO
    with tmp as
    (
    SELECT pch.StandardCost, p.ProductID, AVG(p.StandardCost) avg_StandardCost
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    group by pch.StandardCost, p.ProductID
    ) select pch.ProductID,max(pch.startDate) StartDate ,max(pch.EndDate) EndDate,max(pch.StandardCost) StandardCost,
    Max(ModifiedDate) ModifiedDate from tmp t inner join Production.ProductCostHistory pch on t.ProductID =pch.ProductID
    where t.StandardCost > t.avg_StandardCost and pch.EndDate is not null group by pch.ProductID
    GO

    Reply
  • –avoiding pch.* for obvious reasons
    –execution plan gives a beautiful merge join because of equal number (almost) of rows from both tables
    SELECT pch.ProductId, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN
    (select p.productid, avg(p.standardcost) as StandardCost from Production.Product p
    group by p.ProductID) p
    on pch.productid = p.ProductID
    and pch.StandardCost > p.StandardCost

    Reply
  • USE AdventureWorks2012
    GO
    SELECT pch.StandardCost, p.ProductID
    FROM Production.ProductCostHistory pch
    INNER JOIN (SELECT ProductID, AVG(StandardCost) AS StandardCost FROM Production.Product GROUP BY ProductID ) p
    ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > (p.StandardCost)
    GO

    Reply
  • Andrew Haugen
    April 18, 2019 8:04 pm

    ;WITH cte_P(ProductID, AVG_StandardCost)
    AS
    (
    SELECT ProductID, AVG(StandardCost) AVG_StandardCost
    FROM Production.Product
    GROUP BY ProductID
    )
    SELECT p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN cte_P p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > AVG_StandardCost

    Reply
  • Not knowing that the ProductId in the table Production.Product is the Primary key, I would probably go ahead and create an aggregate sub-query (which I have submitted earlier on). But now, knowing the data, and that the ProductId is unique, there is no point doing the Average because it will just be the same.
    The table ProductionCostHistory may have an average but the table Product will just be the same. So we can get the desired result by even just removing the “AVG” on the WHERE clause.

    SELECT pch.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > p.StandardCost

    Reply
  • You actually don’t need a subquery to achieve the same results as you want…

    SELECT p.ProductID,
    pch.StartDate,
    pch.EndDate,
    pch.StandardCost,
    pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    GROUP BY p.ProductID,
    pch.StartDate,
    pch.EndDate,
    pch.StandardCost,
    pch.ModifiedDate
    HAVING pch.StandardCost > AVG(p.StandardCost)
    GO

    Reply
  • Hermann Cardenas
    April 18, 2019 8:11 pm

    USE AdventureWorks2014
    GO
    SELECT p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    INNER JOIN
    (
    select ProductID, AvgStandardCost = avg(StandardCost)
    from Production.Product
    group by ProductID
    ) pavg on pavg.ProductId = p.ProductID
    WHERE pch.StandardCost > pavg.AvgStandardCost
    GO

    Reply
  • I did not use a sub-query, but will be interested to learn how (or if) others did. Thanks for stretching my brain this morning. Here’s my solution:

    USE AdventureWorks2014
    GO
    SELECT p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    GROUP BY p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
    HAVING pch.StandardCost > AVG(p.StandardCost)
    GO

    Reply
  • Desirae McArdle
    April 18, 2019 8:20 pm

    Resubmitting:

    USE AdventureWorks
    GO
    SELECT pch.StandardCost, p.ProductID
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE exists(Select AVG(p.StandardCost)
    from Production.Product p
    where pch.ProductID = p.ProductID
    HAVING pch.StandardCost > AVG(p.StandardCost))
    GO

    Reply
  • SELECT p.ProductID
    ,pch.StartDate
    ,Pch.EndDate
    ,pch.StandardCost
    ,PCH.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN (Select ProductID, AVG(StandardCost) as StandardCost from Production Group By ProductID ) p
    ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > p.StandardCost
    GO

    I had to guess what table the dates came from as I don’t have the database, but you should be able to get the idea. Keep it up Dave!

    Reply
  • Karthick Babu
    April 18, 2019 8:25 pm

    USE AdventureWorks2014
    GO
    SELECT p.ProductID,pch.StartDate,pch.EndDate,pch.StandardCost, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    GROUP BY p.StandardCost, pch.StandardCost, p.ProductID,pch.StartDate,pch.EndDate, pch.ModifiedDate
    HAVING pch.StandardCost > AVG(p.StandardCost)
    GO

    Reply
  • Ajoy Majumder
    April 18, 2019 8:26 pm

    USE AdventureWorks2014
    GO
    SELECT pch.*
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > (select AVG(StandardCost) from Production.Product)
    GO

    Reply
  • Jason Van Schagen
    April 18, 2019 8:27 pm

    USE AdventureWorks2017;
    GO

    WITH CostCTE (ProductID, StartDate, EndDate, ModifiedDate, HistoryStandardCost, AverageProductStandardCost)
    AS (SELECT p.ProductID,
    pch.StartDate,
    pch.EndDate,
    p.ModifiedDate,
    pch.StandardCost,
    AVG(p.StandardCost)
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p
    ON pch.ProductID = p.ProductID
    GROUP BY p.ProductID,
    pch.StartDate,
    pch.EndDate,
    p.ModifiedDate,
    pch.StandardCost)
    SELECT c.ProductID,
    c.StartDate,
    c.EndDate,
    c.HistoryStandardCost,
    c.ModifiedDate
    FROM CostCTE c
    WHERE c.HistoryStandardCost > c.AverageProductStandardCost;

    GO

    Reply
  • SELECT pch.StandardCost, p.ProductID
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > select AVG(p.StandardCost)

    Reply
  • Victor Escott
    April 18, 2019 8:28 pm

    Hi Pinal, nice puzzle :).

    There are different ways to fix the query, but only one of those solutions is optimal.

    The wrong way: We can fix the query by replacing the “where” with “having”, then we have to “group by ” all the columns requested, the problem with this solution is that SQL server has to perform an aggregation, sort , etc. What is the problem with this? I think the AVG is not necessary for this particular query.

    The right way: After analyzing your image with the results and compare vs the original query, I noticed the following:

    1- Dataset comes from the “ProductCostHistory” table.
    2- The Aggregation “AVG” is done in the table “Product”, which is incorrect since this particular table cannot have duplicate “ProductID”, because “ProductID” is the primary key, so the result for “AVG(StandardCost)” is always the same as “StandardCost” (for each “ProductID”).

    So you can fix this query by removing the AVG:

    SELECT pch.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost,pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    WHERE pch.StandardCost > p.StandardCost

    Also, we can perform a semi join:)

    SELECT pch.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost,pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    WHERE EXISTS ( SELECT 1/0
    FROM Production.Product P
    WHERE P.ProductID=pch.ProductID
    AND pch.StandardCost > p.StandardCost
    )

    Both queries should have the same execution plan due to simplicity.

    Reply
  • Select pch.StandardCost , X.ProductID
    from Production.ProductCostHistory PCH
    cross apply
    (
    Select p.ProductID
    from Production.Product P
    where pch.ProductID = p.ProductID
    Group BY p.ProductID
    Having pch.StandardCost > AVG(p.StandardCost)
    ) x

    Reply
  • Saurabh Shakyawar
    April 18, 2019 8:35 pm

    Use AdventureWorks2012;
    GO
    SELECT p.ProductID,pch.StandardCost, pch.StartDate, pch.EndDate, pch.ModifiedDate
    FROM Production.ProductCostHistory pch
    INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
    Group by p.ProductID,pch.StandardCost,pch.StartDate, pch.EndDate, pch.ModifiedDate
    Having pch.StandardCost > AVG(p.StandardCost)
    Order By p.ProductID ASC
    GO

    Reply

Leave a Reply