It has been a while since we have seen a puzzle so let us do a small puzzle today. Actually, this is a very simple puzzle if you have worked with SQL Server for a while. I got an email from one of the readers from the blog post and he wanted me to correct the following query which he had written. Well, I have been working with SQL Server for over 19 years and I often do not remember the syntax and have to look up online. Here is a very simple puzzle for you which involves Aggregate and Subquery.
Important Update: The Workshop is now over and it was one of the landmark workshops, where I shared business secrets of successful Performance Tuning Expert. You can watch the recording of the workshop for a limited period of time. |
Puzzle – Aggregate and Subquery
Here is the query when we run against the AdventureWorks database, it gives us an error. Remember AdventureWorks database is a sample database and you can download and install AdventureWorks over here.
USE AdventureWorks2014 GO SELECT pch.StandardCost, p.ProductID FROM Production.ProductCostHistory pch INNER JOIN Production.Product p ON pch.ProductID = p.ProductID WHERE pch.StandardCost > AVG(p.StandardCost) GO
Now when you run the above query, it gives us the following error:
Msg 147, Level 15, State 1, Line 17
An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference.
Now here is your task, you need to help me to fix the above query. The hint to resolve the query is already in the blog post’s title and in the query error. You need to understand what my query is attempting to do and use the hint which is already pointed out to you. After you successfully fix the query it will produce the following results.
Rules
Well, there are no real rules, as long as you modify my query to get me above results. Please post your query in the comments section below before April 22, 2019. I will keep all the answers hidden till April 22, 2019 11:59:00 PM.
Winner
One Winner will get 1 free seat for my upcoming class, which is worth USD 1000.
SQL Server Performance Tuning Practical Workshop for EVERYONE
Date:Â April 23, 2019, Tuesday
8:00 AM Pacific Time | 11:00 AM Eastern Time
If you do not want to take part in the contest, you can directly register for the class here.
Update: April 23, 2019
The contest is closed and the winning Answer is here:
USE AdventureWorks2014 GO SELECT pch.ProductID, pch.StartDate,pch.EndDate, pch.StandardCost, pch.ModifiedDate FROM Production.ProductCostHistory pch INNER JOIN Production.Product p ON pch.ProductID = p.ProductID WHERE pch.StandardCost > p.StandardCost GO
There is no need for the AVG on the StandardCost as there will be always a single entry for the product. Even if you change the query with group by + having you will still see the query uses the simple join.
Congratulations to the winner Nicholas Månsson. He is very excited to join the SQL Server Performance Tuning Practical Workshop for EVERYONE.
Everyone who participated you got to solve this difficult puzzle so indirectly you are a winner (of knowledge) as well. This was an intense competition with over 500 comments.
Reference:Â Pinal Dave (https://blog.sqlauthority.com)
527 Comments. Leave new
SELECT
p.ProductID,
StartDate,
EndDate,
pch.StandardCost,
pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p
ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > (select AVG(StandardCost) from Production.ProductCostHistory where ProductID = p.ProductID)
and pch.StandardCost p.StandardCost
USE AdventureWorks2014
GO
SELECT pch.StandardCost, p.ProductID
FROM Production.ProductCostHistory pch
INNER JOIN
(
Select ProductID,AVG(p.StandardCost) As StandardCost
From Production
Group By ProductID
) As m On m.ProductID = p.ProductID
WHERE pch.StandardCost > m.StandardCost
GO
USE AdventureWorks2017
GO
SELECT pch.ProductID, pch.[StartDate], pch.[EndDate], pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > p.StandardCost
GO
USE AdventureWorks2014
GO
SELECT pch.StandardCost, p.ProductID
FROM Production.ProductCostHistory pch
INNER JOIN (select prod.ProductID
,AVG(prod.StandardCost) as AvgStandardCost
from Production.Product as prod
group by prod.ProductID
) as p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > p.AvgStandardCost
GO
Here’s my solution:
USE AdventureWorks2014
GO
SELECT pch.StandardCost, p.ProductID
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
CROSS APPLY (SELECT AVG(p.StandardCost) avgStandardCost FROM Production.Product p GROUP BY p.ProductID WHERE pch.ProductID = p.ProductID) pAvg
WHERE pch.StandardCost > pAvg.avgStandardCost
GO
SELECT
p.ProductID,
StartDate,
EndDate,
pch.StandardCost,
pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p
ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > (select AVG(StandardCost) from Production.ProductCostHistory where ProductID = p.ProductID)
and pch.StandardCost != p.StandardCost
Hi Pinal,
Here is my query for the required result.
SELECT p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > (select AVG(prd.StandardCost) from Production.Product prd where prd.ProductID = p.ProductID)
Thanks
select pch.* –, DateBasedAvgStdCost
from
(
SELECT StartDate, EndDate, AVG(pch.StandardCost) as DateBasedAvgStdCost
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
GROUP BY StartDate, EndDate
) a
INNER JOIN Production.ProductCostHistory pch on a.StartDate = pch.StartDate and (a.EndDate = pch.EndDate or (a.EndDate IS NULL and pch.EndDate IS NULL))
where pch.StandardCost > a.DateBasedAvgStdCost
and pch.StartDate = ‘2012-05-30 00:00:00.000’ and pch.EndDate = ‘2013-05-29 00:00:00.000’
ORDER BY 1
I’m using AdventureWorks 2016. I think i have more rows. The question didn’t specify, but I did average standard cost per year. I didn’t think it was applicable to compare one year of individual cost to three years of average cost. But I didn’t have a customer to ask these questions. :-)
USE AdventureWorks2014
GO
SELECT pch.StandardCost, m.ProductID
FROM Production,ProductCostHistory pch
INNER JOIN
(
Select ProductID,AVG(p.StandardCost) As StandardCost
From Production As p
Group By ProductID
) As m On m.ProductID = pch.ProductID
WHERE pch.StandardCost > m.StandardCost
GO
how can i send my query
Leaving the comment here was the process to follow.
SELECT pch.ProductID,
pch.EndDate,
pch.StandardCost,
pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN (SELECT p.ProductID,
AVG(p.StandardCost) AvgCost
FROM Production.Product p
GROUP BY p.ProductID) p
ON p.ProductID = pch.ProductID
WHERE pch.StandardCost > AvgCost
USE AdventureWorks2014
GO
SELECT p.ProductID,
pch.StartDate,
pch.EndDate,
pch.StandardCost,
p.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > (SELECT AVG(StandardCost) FROM Production.Product)
GO
oops, didn’t read the remit properly. here is my corrected soution
SELECT
p.ProductID,
pch.StartDate,
pch.EndDate,
pch.StandardCost,
pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
CROSS APPLY (SELECT AVG(StandardCost) AS StandardCostAvg
FROM Production.Product
WHERE Production.Product.ProductID = pch.ProductID
) avgCost
WHERE pch.StandardCost > avgCost.StandardCostAvg
GO
–Solution 1
Select pch.StandardCost, p.ProductId
From Production.ProductCostHistory pch
Inner Join
(
Select ProductId,AVG(StandardCost) As AvgStandardCost From Production.Product
Group By ProductId
)p
ON pch.ProductId=p.ProductId
Where pch.StandardCost>p.AvgStandardCost
–Solution 2
;WITH CTE
AS
(
Select ProductId,AVG(StandardCost) As AvgStandardCost From Production.Product
Group By ProductId
)
Select pch.StandardCost, p.ProductId
From Production.ProductCostHistory pch
Inner Join CTE p
ON pch.ProductId=p.ProductId
Where pch.StandardCost>p.AvgStandardCost
USE AdventureWorks2014
GO
SELECT pch.*
FROM Production.ProductCostHistory pch
WHERE pch.StandardCost > (select AVG(p.StandardCost) from Production.Product p where pch.ProductID = p.ProductID)
GO
USE AdventureWorks2014;
GO
SELECT pch.StandardCost,
p.ProductID
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
GROUP BY pch.StandardCost,
p.ProductID
HAVING pch.StandardCost > AVG(p.StandardCost);
GO
Hi Pinal,
I have created the query in 4 different ways and also identified the query execution plan. Please see and verify the below query/details and let me know if anything wrong.
Note: Only 1st query execution plan is difference (this is obvious) and will take time. Rest all are same.
–1.
SELECT p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch WITH (NOLOCK)
INNER JOIN Production.Product p WITH (NOLOCK) ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > (SELECT AVG(p.StandardCost) FROM Production.Product p WITH (NOLOCK) WHERE pch.ProductID = p.ProductID GROUP BY p.ProductID)
–2.
SELECT pch.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch WITH (NOLOCK)
, (SELECT ProductID, AVG(StandardCost) AvgStandardCost FROM Production.Product WITH (NOLOCK) GROUP BY ProductID) p WHERE pch.ProductID = p.ProductID AND pch.StandardCost > p.AvgStandardCost
–3.
SELECT pch.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch WITH (NOLOCK)
WHERE pch.StandardCost > (SELECT AVG(p.StandardCost) FROM Production.Product p WITH (NOLOCK) WHERE pch.ProductID = p.ProductID GROUP BY p.ProductID)
–4.
SELECT pch.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate FROM Production.ProductCostHistory pch WITH (NOLOCK)
CROSS APPLY
(SELECT AVG(p.StandardCost) AvgStandardCost FROM Production.Product p WITH (NOLOCK) WHERE pch.ProductID = p.ProductID GROUP BY p.ProductID) AS A WHERE pch.StandardCost>AvgStandardCost
Thanks,
Sanjay Kumar
USE AdventureWorks2014
GO
SELECT
p.ProductID
, pch.StartDate
, pch.EndDate
, pch.StandardCost
, pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p
ON pch.ProductID = p.ProductID
INNER JOIN (
SELECT ProductID
, AVG(StandardCost) AS AvgStandardCost
FROM Production.Product
GROUP BY ProductID
) avgStdCost
ON avgStdCost.ProductID = p.ProductID
WHERE pch.StandardCost > avgStdCost.AvgStandardCost
GO
WITH CTE_Cost
AS
(
SELECT ProductID, AVG(StandardCost) AS AvgCost
FROM Production.Product
GROUP BY ProductID
)
SELECT p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN CTE_Cost p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > p.AvgCost
USE AdventureWorks2014
GO
SELECT pch.StandardCost, p.ProductID
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
GROUP BY pch.StandardCost,
p.ProductID
HAVING pch.StandardCost > AVG(p.StandardCost)
GO
….and the one below if you want it to look like the screenshot of your results;
SELECT p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
GROUP BY p.ProductID,
pch.StartDate,
pch.EndDate,
pch.StandardCost,
pch.ModifiedDate
HAVING pch.StandardCost > AVG(p.StandardCost)
GO