It has been a while since we have seen a puzzle so let us do a small puzzle today. Actually, this is a very simple puzzle if you have worked with SQL Server for a while. I got an email from one of the readers from the blog post and he wanted me to correct the following query which he had written. Well, I have been working with SQL Server for over 19 years and I often do not remember the syntax and have to look up online. Here is a very simple puzzle for you which involves Aggregate and Subquery.
Important Update: The Workshop is now over and it was one of the landmark workshops, where I shared business secrets of successful Performance Tuning Expert. You can watch the recording of the workshop for a limited period of time. |
Puzzle – Aggregate and Subquery
Here is the query when we run against the AdventureWorks database, it gives us an error. Remember AdventureWorks database is a sample database and you can download and install AdventureWorks over here.
USE AdventureWorks2014 GO SELECT pch.StandardCost, p.ProductID FROM Production.ProductCostHistory pch INNER JOIN Production.Product p ON pch.ProductID = p.ProductID WHERE pch.StandardCost > AVG(p.StandardCost) GO
Now when you run the above query, it gives us the following error:
Msg 147, Level 15, State 1, Line 17
An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference.
Now here is your task, you need to help me to fix the above query. The hint to resolve the query is already in the blog post’s title and in the query error. You need to understand what my query is attempting to do and use the hint which is already pointed out to you. After you successfully fix the query it will produce the following results.
Rules
Well, there are no real rules, as long as you modify my query to get me above results. Please post your query in the comments section below before April 22, 2019. I will keep all the answers hidden till April 22, 2019 11:59:00 PM.
Winner
One Winner will get 1 free seat for my upcoming class, which is worth USD 1000.
SQL Server Performance Tuning Practical Workshop for EVERYONE
Date:Â April 23, 2019, Tuesday
8:00 AM Pacific Time | 11:00 AM Eastern Time
If you do not want to take part in the contest, you can directly register for the class here.
Update: April 23, 2019
The contest is closed and the winning Answer is here:
USE AdventureWorks2014 GO SELECT pch.ProductID, pch.StartDate,pch.EndDate, pch.StandardCost, pch.ModifiedDate FROM Production.ProductCostHistory pch INNER JOIN Production.Product p ON pch.ProductID = p.ProductID WHERE pch.StandardCost > p.StandardCost GO
There is no need for the AVG on the StandardCost as there will be always a single entry for the product. Even if you change the query with group by + having you will still see the query uses the simple join.
Congratulations to the winner Nicholas Månsson. He is very excited to join the SQL Server Performance Tuning Practical Workshop for EVERYONE.
Everyone who participated you got to solve this difficult puzzle so indirectly you are a winner (of knowledge) as well. This was an intense competition with over 500 comments.
Reference:Â Pinal Dave (https://blog.sqlauthority.com)
527 Comments. Leave new
Also, I do not have Adventurworks 2014 but 2016CP3, so I used that.
That is fine.
Hi,
thanks for homework :) brain synapse speed up test :)
…this is what I think it’s ok to work with, defenetly Query1 because it’s simpler and faster, Query2 is an example when it need to do it that way…
Best regards from a very small dot on a globe…43.8860701,20.3330983
Query 1: Query cost: 36%
SELECT p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > (select AVG(StandardCost) from Production.Product where ProductID = p.ProductID)
GO
/*
Table ‘ProductCostHistory’. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Product’. Scan count 2, logical reads 30, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
*/
Query 2: Query cost: 64%
SELECT p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
INNER JOIN (select ProductID, AVG(StandardCost) as StandardCost from Production.Product group by ProductID) x on x.ProductID = p.ProductID
WHERE pch.StandardCost > x.StandardCost
GO
/*
Table ‘Workfile’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Product’. Scan count 2, logical reads 19, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘ProductCostHistory’. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
*/
USE AdventureWorks2014
GO
SELECT p.ProductID, pch.StartDate,pch.EndDate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
GROUP BY p.ProductID, pch.StartDate,pch.EndDate,pch.StandardCost, pch.ModifiedDate
HAVING pch.StandardCost > AVG(p.StandardCost)
–Here is the query with a CROSS APPLY instead of Having clause.
USE AdventureWorks2014
GO
SELECT p.ProductID, pch.StartDate,pch.EndDate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
CROSS APPLY(SELECT ProductID, AVG(StandardCost) AS AvgStandardCost FROM Production.Product
WHERE ProductID = pch.ProductID Group BY ProductID) AS pcha
WHERE pch.ProductID = pcha.ProductID AND pch.StandardCost > pcha.AvgStandardCost
–Here is the Query with one less table join and the Cross Apply.
SET STATISTICS IO ON
GO
USE AdventureWorks2014
GO
SELECT pcha.ProductID, pch.StartDate,pch.EndDate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch
CROSS APPLY(SELECT ProductID, AVG(StandardCost) AS AvgStandardCost FROM Production.Product
WHERE ProductID = pch.ProductID Group BY ProductID) AS pcha
WHERE pch.ProductID = pcha.ProductID AND pch.StandardCost > pcha.AvgStandardCost
–>> April 18, 2019
–>> Select statement contains the required columns per the results, which display when executing the query.
–>> Inner Join using sub query that contains the Avg() function used in the ON clause.
USE AdventureWorks2014
GO
SELECT pp.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost,pch.modifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN (select p.ProductID, AVG(p.StandardCost) avgStndrdCost
FROM Production.Product p
GROUP BY p.ProductID) pp
ON pch.ProductID = pp.ProductID
WHERE pch.StandardCost > pp.avgStndrdCost
GO
USE AdventureWorks2014
GO
SELECT pch.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost , pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
where pch.StandardCost > p.StandardCost
Go
USE AdventureWorks
GO
SELECT p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost,pch.ModifiedDate
FROM Production.ProductCostHistory pch
JOIN Production.Product p ON pch.ProductID = p.ProductID
GROUP BY p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost,pch.ModifiedDate
HAVING pch.StandardCost > AVG(p.StandardCost)
This gave me the same output that you have. I am using SQL Server 2017
USE AdventureWorks2014
GO
SELECT p.ProductID,pch.StartDate,pch.EndDate,pch.StandardCost,pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > (select AVG(p2.StandardCost) from Production.Product p2 where p2.ProductID=p.productid)
group by p.ProductID,pch.StartDate,pch.EndDate,pch.StandardCost,pch.ModifiedDate
GO
/* I love your tips and advice on SQL for many years would love chance to learn more.
I want to move from a 20yr exp developer/tech currently as L4 SQL analyst/SSIS job and svr resource guy to a DBA. I work for a big 4 and they have begun move to azure/docker and need for typical DBA is not as in demand. Do you feel DBA as we know it will be less desirable in next 7yr?
*/
My solution, which uses two CTE’s (Common Table Expressions). I find it easier to read CTE’s versus subqueries. By the way, I’m very interested to see the crafty HTML injection attack! Can you consider doing a post that shows exactly what the person submitted?
With
APC (productid, AvgCost)
as
(select p.productid, AVG(p.StandardCost) as “AvgCost”
from Production.Product p
group by p.ProductID),
StdCost (StartDate, EndDate, StandardCost, ModifiedDate, ProductID)
as
(select pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate, p.ProductID
from Production.ProductCostHistory pch
join Production.Product p
on pch.ProductID = p.ProductID)
select StdCost.ProductID, StartDate, EndDate, StandardCost, ModifiedDate
from StdCost
join APC
on StdCost.ProductID = APC.productid
where StdCost.StandardCost > APC.AvgCost
order by StdCost.ProductID;
go
Hmmm…
Something like this…? Subquery… Aggregate…Copy results as shown…?
SELECT p.ProductID, StartDate, EndDate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost >
(
SELECT AVG(x.StandardCost) as avgStandCost
FROM Production.ProductCostHistory y
INNER JOIN Production.Product x ON y.ProductID = x.ProductID
WHERE pch.ProductID = p.ProductID
)
USE AdventureWorks2014
GO
SELECT
p.ProductID,
pch.StartDate,
pch.EndDate,
pch.StandardCost,
pch.ModifiedDate
FROM
Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
group by p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
having pch.StandardCost > AVG(p.StandardCost)
order by p.ProductID
GO
Thank you for resubmitting the comments. We are over 240 comments at this point of time.
However still only 15 valid answers :-) Guys check the execution plan of query hint and see if you need subquery!
Here is a solution to your puzzle:
SELECT pch.ProductID
,pch.StartDate
,pch.EndDate
,pch.StandardCost
,pch.ModifiedDate
FROM Production.ProductCostHistory AS pch
WHERE pch.StandardCost > (
SELECT AVG(p.StandardCost)
FROM Production.Product AS p
WHERE pch.ProductID = p.ProductID
)
ORDER BY pch.ProductID
USE AdventureWorks2014
GO
SELECT pch.StandardCost, p.ProductID
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p
ON pch.ProductID = p.ProductID
WHERE pch.StandardCost >
(Select AVG(p2.StandardCost) From Production.Product p2 where p.ProductID=p2.ProductID)
GO
There you go…
1) Solutions:1
USE AdventureWorks
go
SELECT p.ProductID,pch.StartDate,pch.EndDate,min(pch.StandardCost) as StandardCost,pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
group by p.ProductID,pch.StartDate,pch.EndDate,pch.ModifiedDate
having min(pch.StandardCost)>AVG(p.StandardCost)
GO
USE AdventureWorks
go
2) Solutions:2
SELECT pch.ProductID,pch.StartDate,pch.EndDate,pch.StandardCost,pch.ModifiedDate
FROM Production.ProductCostHistory pch
where pch.StandardCost > (select AVG(p.StandardCost) from Production.Product p where p.ProductID=pch.ProductID)
GO
Please review sir and share your thought.
The query for above puzzle is
SELECT p.ProductID,
pch.StartDate,
pch.EndDate,
pch.StandardCost,
pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN (
SELECT p.ProductID,
AVG(p.StandardCost) AS StandardCost
FROM Production.Product p
INNER JOIN Production.ProductCostHistory pch
ON pch.ProductID = p.ProductID
GROUP BY
p.ProductID
) AS p
ON p.ProductID = pch.ProductID
WHERE pch.StandardCost > p.StandardCost
Hi. I’m interested in your puzzle. I’m a beginner in sql, hope you can point out where is my weakness.
here is my solution :
SELECT pch.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch
CROSS APPLY (
SELECT AVG(x.StandardCost) avg_price
FROM Production.product x
WHERE x.productid = pch.productid
) tblB
WHERE pch.StandardCost > tblB.avg_price
ORDER BY pch.productid
sorry, first time posting here, might get the format wrong.
Thanks alot for this site and thanks alot for all your knowledge.
Sorry, don’t know if this is wrong, but thought i want to give my second solution, well not much i change but just want to send it to you.
SELECT pch.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch
WHERE pch.StandardCost > (
SELECT AVG(x.StandardCost) avg_price
FROM Production.product x
WHERE x.productid = pch.productid
)
ORDER BY pch.productid
–resubmitting the answer
USE AdventureWorks2014
GO
SELECT p.ProductID,StartDate,EndDate,pch.StandardCost,ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN (select ProductID,AVG(StandardCost) StandardCost from Production.Product group by ProductID) P ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > (p.StandardCost)
USE AdventureWorksDW2016
GO
SELECT * pch.StandardCost, p.ProductID
FROM Production.ProductCostHistory as pch
INNER JOIN Production.Product as p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost >(select AVG(p.StandardCost) from Production.Product as p)
GO
Below is the solution :
Use AdventureWorks2012_Data
go
SELECT p.ProductID,pch.StartDate,pch.EndDate,AVG(pch.StandardCost) as StandardCost,pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
group by pch.StandardCost, p.ProductID,pch.startdate,pch.EndDate,pch.ModifiedDate
having pch.StandardCost > AVG(p.StandardCost)
GO
USE AdventureWorks2014
GO
SELECT pch.StandardCost, p.ProductID
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > (select AVG(SP.StandardCost) from Production.Product SP)
GO