It has been a while since we have seen a puzzle so let us do a small puzzle today. Actually, this is a very simple puzzle if you have worked with SQL Server for a while. I got an email from one of the readers from the blog post and he wanted me to correct the following query which he had written. Well, I have been working with SQL Server for over 19 years and I often do not remember the syntax and have to look up online. Here is a very simple puzzle for you which involves Aggregate and Subquery.
Important Update: The Workshop is now over and it was one of the landmark workshops, where I shared business secrets of successful Performance Tuning Expert. You can watch the recording of the workshop for a limited period of time. |
Puzzle – Aggregate and Subquery
Here is the query when we run against the AdventureWorks database, it gives us an error. Remember AdventureWorks database is a sample database and you can download and install AdventureWorks over here.
USE AdventureWorks2014 GO SELECT pch.StandardCost, p.ProductID FROM Production.ProductCostHistory pch INNER JOIN Production.Product p ON pch.ProductID = p.ProductID WHERE pch.StandardCost > AVG(p.StandardCost) GO
Now when you run the above query, it gives us the following error:
Msg 147, Level 15, State 1, Line 17
An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference.
Now here is your task, you need to help me to fix the above query. The hint to resolve the query is already in the blog post’s title and in the query error. You need to understand what my query is attempting to do and use the hint which is already pointed out to you. After you successfully fix the query it will produce the following results.
Rules
Well, there are no real rules, as long as you modify my query to get me above results. Please post your query in the comments section below before April 22, 2019. I will keep all the answers hidden till April 22, 2019 11:59:00 PM.
Winner
One Winner will get 1 free seat for my upcoming class, which is worth USD 1000.
SQL Server Performance Tuning Practical Workshop for EVERYONE
Date:Â April 23, 2019, Tuesday
8:00 AM Pacific Time | 11:00 AM Eastern Time
If you do not want to take part in the contest, you can directly register for the class here.
Update: April 23, 2019
The contest is closed and the winning Answer is here:
USE AdventureWorks2014 GO SELECT pch.ProductID, pch.StartDate,pch.EndDate, pch.StandardCost, pch.ModifiedDate FROM Production.ProductCostHistory pch INNER JOIN Production.Product p ON pch.ProductID = p.ProductID WHERE pch.StandardCost > p.StandardCost GO
There is no need for the AVG on the StandardCost as there will be always a single entry for the product. Even if you change the query with group by + having you will still see the query uses the simple join.
Congratulations to the winner Nicholas Månsson. He is very excited to join the SQL Server Performance Tuning Practical Workshop for EVERYONE.
Everyone who participated you got to solve this difficult puzzle so indirectly you are a winner (of knowledge) as well. This was an intense competition with over 500 comments.
Reference:Â Pinal Dave (https://blog.sqlauthority.com)
527 Comments. Leave new
Some really nice comments are already here – thank you guys! Keep it coming!
My o/p is exactly what is needed , please confirm Sir :)
As mentioned in the original blog post I will publish all the answers together on 22nd of this month.
I have query ready, I just added subquery to get the result
SELECT pch.StandardCost, p.ProductID
FROM Production.[ProductCostHistory] pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
Where pch.StandardCost >
(Select AVG(StandardCost) from Production.Product)
USE AdventureWorks2014
GO
SELECT p.ProductID, pch.StartDate [StartDate], pch.EndDate [EndDate], pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > (SELECT AVG(p.StandardCost) FROM Production.Product p WHERE pch.ProductID = p.ProductID)
ORDER BY p.ProductID ASC
hi, resubmitting:
SELECT p.productid, pch.startdate, pch.enddate, p.standardcost, pch.modifieddate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost >p.StandardCost
Go
I am from cell phone now, hope that columns in select are right named
cheers, Ondrej
Hi All,
I see lots of people are just replacing the WHERE with another keywords HAVING – NO, that will still give you error and it will not work. Please try out yourself the query before posting it. Even though if you do not have AdventureWorks database, you can still valid and check if your query is correct or not.
Thanks again!
I have corrected it through sub query. thanks for guide!
Even your that query is incorrect. You should try out the query first yourself and see if the answer works or not.
Dave,
One cannot correct the query without knowing what the user is asking for. If the user is asking for the costs over the average for each product, that’s one thing. But if the user is asking for the products over the average price of all products, that’s another. This problem is not solvable without the missing information.
You have the results there, but even with those results the query could still be “incorrect” if you get my meaning.
-Phil
USE AdventureWorks2014
GO
SELECT pch.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch
WHERE pch.StandardCost >
(SELECT AVG(p.StandardCost)
FROM Production.Product AS p
WHERE p.ProductID = pch.ProductID
)
GO
Over 50 entries… and some of them are really great. They did understood that this puzzle is not as simple as it looks.
Let me give you one more hint… check the execution plan :-)
Pinal – I uploaded my reply already with the code that produces the result set you indicated. But I hope you post more of these, even without a prize. It was a good challenge and I need this kind of real world exercise to help me in my career change to SQL Developer. Thank you, John
Thank you John for your kind words.
USE AdventureWorks2014
GO
SELECT p.ProductID,pch.StartDate,pch.EndDate,AVG(pch.StandardCost) as StandardCost ,pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
GROUP BY p.ProductID,pch.StandardCost,StartDate,EndDate,pch.ModifiedDate
having pch.StandardCost > AVG(p.StandardCost)
USE AdventureWorks2014
GO
SELECT p.ProductID,pch.StartDate,pch.EndDate,AVG(pch.StandardCost) as StandardCost ,pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
GROUP BY p.ProductID,pch.StandardCost,StartDate,EndDate,pch.ModifiedDate
having pch.StandardCost > AVG(p.StandardCost)
GO
Hi All,
If you have previously submitted comment here I request you to resubmit your solution. As due to one of the comment had HTML TAG which deleted few of the very interesting comment. I have fixed the small issue but again, I do not want you to loose your chance so please resubmit the solution.
I truly apologize but I had already identified few really interesting comments there to publish and now they are gone due to so SQL Injection via HTML.
USE AdventureWorks2014
GO
SELECT pch.*
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > (select AVG(p.StandardCost) from Production.Product p)
GO
SELECT PCH.ProductID,
[StartDate],
[EndDate],
[StandardCost],
[ModifiedDate]
FROM [AdventureWorks2014].[Production].[ProductCostHistory] PCH
INNER JOIN
(
SELECT [ProductID],
AVG([StandardCost]) AS [AVG_StandardCost]
FROM [AdventureWorks2014].[Production].[Product]
GROUP BY [ProductID]
) P ON PCH.ProductID = P.ProductID
WHERE PCH.StandardCost > P.AVG_StandardCost;
WITH AVGStandardCost (ProductID, AVGStandardCost)
AS (SELECT p.ProductID, AVG(p.StandardCost)
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
GROUP BY p.ProductID)
SELECT pch.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN AVGStandardCost pp ON pch.ProductID = pp.ProductID
WHERE pch.StandardCost > pp.AVGStandardCost
Hereby:
SELECT pch.ProductId, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch INNER JOIN
(
SELECT ProductID, AVG(StandardCost) AS AverageCost
FROM Production.Product
GROUP BY ProductID
) AS p ON pch.ProductID = p.ProductID
GROUP BY pch.ProductId, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate, AverageCost
HAVING pch.StandardCost > AverageCost
First, get the average in a subquery, then join them, group them and ask for the desired result after the grouping
WITH p
AS (SELECT ProductID,
AVG(StandardCost) OVER (PARTITION BY ProductID) AS AverageCost
FROM Production.Product)
SELECT pch.ProductID,
pch.StartDate,
pch.EndDate,
pch.StandardCost,
pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN p
ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > p.AverageCost;
SELECT pch.StandardCost, p.ProductID
FROM Production.ProductCostHistory pch
INNER JOIN (select ProductID, AVG(StandardCost) as AVG_StandardCost from Production.Product group by ProductID ) p
ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > p.AVG_StandardCost
SELECT pr.ProductID,pch.StartDate,pch.EndDate,AVG(pch.StandardCost) as StandardCost ,pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product pr ON pch.ProductID = pr.ProductID
GROUP BY pr.ProductID,pch.StandardCost,StartDate,EndDate,pch.ModifiedDate
having pch.StandardCost > AVG(pr.StandardCost)
Correction on my previous comment. This one is better I suppose, the last group by / having clause not needed.
SELECT pch.ProductId, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch INNER JOIN
(
SELECT ProductID, AVG(StandardCost) AS AverageCost
FROM Production.Product
GROUP BY ProductID
) AS p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > AverageCost
SELECT pch.StandardCost, p.ProductID
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID GROUP BY p.ProductID WHERE pch.StandardCost > AVG(p.StandardCost)
Hi Pinal,
As per your comment, i am posting answer again. I don’t want to miss the chance.
SELECT p.ProductID, pch.StartDate,EndDate,pch.StandardCost,pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > (Select AVG(StandardCost) from Production.Product where ProductID=P.ProductID )
Well right nw I don’t have sql available to me. But I trying to correct this query based on my previous knowledge.
SELECT pch.StandardCost, p.ProductID
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > (select AVG(p1.StandardCost) from Production.Product p1);
–RESULT HERE:
SELECT p.ProductID,pch.StandardCost
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
group by p.ProductID,pch.StandardCost
HAVING pch.StandardCost <= AVG(p.StandardCost)
GO
Below is the answer from myside
SELECT *
FROM Production.ProductCostHistory pch
INNER JOIN (SELECT AVG_StandardCost=AVG(StandardCost),ProductID
from Production.Product
GROUP BY ProductID) p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > AVG_StandardCost
My answer is as below.
USE AdventureWorks2014
GO
SELECT pch.StandardCost, p.ProductID
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
GROUP BY p.ProductID
HAVING pch.StandardCost > AVG(p.StandardCost)
GO
USE AdventureWorks2012
GO
SELECT pch.StandardCost, p.ProductID,AVG(p.StandardCost),p.StandardCost,pch.startdate,pch.EndDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
group by pch.StandardCost, p.ProductID,p.StandardCost,pch.startdate,pch.EndDate
having pch.StandardCost > AVG(p.StandardCost)
GO
USE AdventureWorks2012
GO
SELECT pch.StandardCost, p.ProductID,AVG(p.StandardCost),p.StandardCost,pch.startdate,pch.EndDate,pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
group by pch.StandardCost, p.ProductID,p.StandardCost,pch.startdate,pch.EndDate,pch.ModifiedDate
having pch.StandardCost > AVG(p.StandardCost)
GO
SELECT p.ProductID, pch.StartDate,pch.EndDate,pch.StandardCost, p.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p
ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > (SELECT AVG(p.StandardCost) FROM Production.Product p WHERE pch.ProductID = p.ProductID)