It has been a while since we have seen a puzzle so let us do a small puzzle today. Actually, this is a very simple puzzle if you have worked with SQL Server for a while. I got an email from one of the readers from the blog post and he wanted me to correct the following query which he had written. Well, I have been working with SQL Server for over 19 years and I often do not remember the syntax and have to look up online. Here is a very simple puzzle for you which involves Aggregate and Subquery.
Important Update: The Workshop is now over and it was one of the landmark workshops, where I shared business secrets of successful Performance Tuning Expert. You can watch the recording of the workshop for a limited period of time. |
Puzzle – Aggregate and Subquery
Here is the query when we run against the AdventureWorks database, it gives us an error. Remember AdventureWorks database is a sample database and you can download and install AdventureWorks over here.
USE AdventureWorks2014 GO SELECT pch.StandardCost, p.ProductID FROM Production.ProductCostHistory pch INNER JOIN Production.Product p ON pch.ProductID = p.ProductID WHERE pch.StandardCost > AVG(p.StandardCost) GO
Now when you run the above query, it gives us the following error:
Msg 147, Level 15, State 1, Line 17
An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference.
Now here is your task, you need to help me to fix the above query. The hint to resolve the query is already in the blog post’s title and in the query error. You need to understand what my query is attempting to do and use the hint which is already pointed out to you. After you successfully fix the query it will produce the following results.
Rules
Well, there are no real rules, as long as you modify my query to get me above results. Please post your query in the comments section below before April 22, 2019. I will keep all the answers hidden till April 22, 2019 11:59:00 PM.
Winner
One Winner will get 1 free seat for my upcoming class, which is worth USD 1000.
SQL Server Performance Tuning Practical Workshop for EVERYONE
Date:Â April 23, 2019, Tuesday
8:00 AM Pacific Time | 11:00 AM Eastern Time
If you do not want to take part in the contest, you can directly register for the class here.
Update: April 23, 2019
The contest is closed and the winning Answer is here:
USE AdventureWorks2014 GO SELECT pch.ProductID, pch.StartDate,pch.EndDate, pch.StandardCost, pch.ModifiedDate FROM Production.ProductCostHistory pch INNER JOIN Production.Product p ON pch.ProductID = p.ProductID WHERE pch.StandardCost > p.StandardCost GO
There is no need for the AVG on the StandardCost as there will be always a single entry for the product. Even if you change the query with group by + having you will still see the query uses the simple join.
Congratulations to the winner Nicholas Månsson. He is very excited to join the SQL Server Performance Tuning Practical Workshop for EVERYONE.
Everyone who participated you got to solve this difficult puzzle so indirectly you are a winner (of knowledge) as well. This was an intense competition with over 500 comments.
Reference:Â Pinal Dave (https://blog.sqlauthority.com)
527 Comments. Leave new
USE AdventureWorks2014
GO
SELECT pch.StandardCost, p.ProductID
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p
WHERE pch.StandardCost > AVG(p.StandardCost)
and pch.ProductID = p.ProductID
GO
SELECT pch.ProductID,
pch.StartDate,
pch.EndDate,
pch.StandardCost,
pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost >
(
SELECT AVG(pc.StandardCost)
FROM Production.ProductCostHistory pc
WHERE pc.ProductID = pch.ProductID
)
AND pch.EndDate IS NOT NULL;
SELECT pch.StandardCost, p.ProductID
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > (
SELECT AVG(p2.StandardCost)
FROM Production.Product p2
WHERE p2.ProductID = p.ProductID
)
did I post it in the right place? :)
Yes you did.
select a.* from(
(select ProductID,StartDate,EndDate,StandardCost,ModifiedDate from Production.ProductCostHistory) a
inner join
(select productid, AVG(StandardCost) as costavg from Production.Product
group by productid) b
on a.productid=b.productid)
where a.standardcost>b.costavg
USE AdventureWorks2014
GO
SELECT p.ProductID, pch.StandardCost
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
GROUP BY p.ProductID, pch.StandardCost
HAVING pch.StandardCost > AVG(p.StandardCost)
ORDER BY p.ProductID
GO
select pch.StandardCost,p.ProductID
from Production.ProductCostHistory pch
Inner join Production.Product p on pch.ProductID=p.ProductID
where pch.StandardCost>(select avg(c.StandardCost) from Production.Product c)
Hi – hows this?
SELECT p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
JOIN (SELECT ProductID, AVG(StandardCost) AS AvgStdCost FROM Production.Product GROUP BY ProductID) a ON p.ProductID = a.ProductID
WHERE pch.StandardCost > a.AvgStdCost
ORDER BY 1
GO
Hello Pinal,
Please find the below query which is giving expected output. Please let me know if this is how you expecting the query to be tuned.
USE AdventureWorks2014
GO
SELECT p.ProductID,pch.StartDate,pch.EndDate,pch.StandardCost,pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
GROUP BY p.ProductID,pch.StartDate,pch.EndDate,pch.StandardCost,pch.ModifiedDate
HAVING pch.StandardCost > AVG(p.StandardCost)
Hello Sir, I do like your SQL puzzle postings – this one seems super easy, couldn’t control myself from responding.
here, AVG(p.StandardCost) is a aggregate function which require to use group by clause likely below:
USE AdventureWorks2014
GO
SELECT pch.StandardCost, p.ProductID,AVG(p.StandardCost)
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
group by pch.StandardCost, p.ProductID having
pch.StandardCost > AVG(p.StandardCost)
GO
Feel appreciated to be participated. Thank you!
I have added one extra column which may consider wrong – the correct will be as below (by excluded the extra column:
USE AdventureWorks2014
GO
SELECT pch.StandardCost, p.ProductID
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
group by pch.StandardCost, p.ProductID having
pch.StandardCost > AVG(p.StandardCost)
GO
Requires a Correlated Subquery:
SELECT p.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > (SELECT AVG(P1.StandardCost)
FROM Production.Product P1
GROUP BY P1.ProductID
HAVING P1.ProductID = p.ProductID);
select p.ProductID,pch.StartDate, pch.EndDate,pch.StandardCost,pch.ModifiedDate
from
((select avg(Pr.StandardCost) as av ,ProductID from Production.Product pr
group by ProductID )p join
(select StandardCost,StartDate,EndDate,ModifiedDate,ProductID from Production.ProductCostHistory ) pch
on pch.ProductID = p.ProductID) where pch.StandardCost> p.av
select pch.productid, pch.StartDate, pch.enddate, pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN (select productid, avg(standardcost) as avgcost from Production.Product group by ProductID ) p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > p.avgcost
/******************************************
Created by: Pinal Dave
Created to: Confound blog readers
Created on: 17 April 2019?
Updated by: Stephen Lauzon
Updated to: a) Get the expected results:
ProductID StartDate EndDate StandardCost ModifiedDate
———– ———————– ———————– ——————— ———————–
707 2012-05-30 00:00:00.000 2013-05-29 00:00:00.000 13.8782 2014-02-08 10:01:36.827
708 2012-05-30 00:00:00.000 2013-05-29 00:00:00.000 13.8782 2014-02-08 10:01:36.827
711 2012-05-30 00:00:00.000 2013-05-29 00:00:00.000 13.8782 2014-02-08 10:01:36.827
858 2012-05-30 00:00:00.000 2013-05-29 00:00:00.000 9.7136 2014-02-08 10:01:36.827
859 2012-05-30 00:00:00.000 2013-05-29 00:00:00.000 9.7136 2014-02-08 10:01:36.827
860 2012-05-30 00:00:00.000 2013-05-29 00:00:00.000 9.7136 2014-02-08 10:01:36.827
b) Win the prize
Updated on: 18 April 2019 at 9:30 am (UTC-7)
******************************************/
USE AdventureWorks2014
GO
SELECT pout.ProductID
, pch.StartDate
, pch.EndDate
, pch.StandardCost
, pout.ModifiedDate
FROM Production.ProductCostHistory AS pch
INNER
JOIN Production.Product AS pout
ON pch.ProductID = pout.ProductID
WHERE pch.StandardCost > ( SELECT AVG(pin.StandardCost)
FROM Production.Product AS pin
WHERE pin.ProductID = pout.ProductID
);
GO
Hi Pinal.. I provided both a CTE, and nonCTE version just for fun. lol They optimize down to the same execution plan, but I generally prefer CTE’s as there is less redundant code written. You can optionally add an order to sort the list but I believe that wasn’t a requirement.
— CTE Version (#1)
;with costtally as
(select pch.StandardCost , p.ProductId
from Production.ProductCostHistory pch
inner join Production.Product p ON pch.ProductID = p.ProductID)
select StandardCost, ProductId, ( select avg(StandardCost) from Production.ProductCostHistory ) as avgStdCost
from costtally
where StandardCost > ( select avg(StandardCost) from Production.ProductCostHistory );
go
— #2nd version (with subquery)
select pch.StandardCost , p.ProductId, ( select avg(StandardCost) from Production.ProductCostHistory) as avgStdCost
from Production.ProductCostHistory pch
inner join Production.Product p ON pch.ProductID = p.ProductID
where pch.StandardCost > ( select avg(StandardCost) from Production.ProductCostHistory );
go
USE AdventureWorks2014
GO
SELECT pch.StandardCost, p.ProductID
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
WHERE pch.StandardCost > (Select AVG(a.StandardCost) from Production.Product a)
GO
USE AdventureWorks2014
GO
SELECT pp.ProductID, pch.StartDate, pch.EndDate, pch.StandardCost,pch.modifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN (select p.ProductID, AVG(p.StandardCost) avgStndrdCost
from Production.Product p
group by p.ProductID) pp
ON pch.ProductID = pp.ProductID
and pch.StandardCost > pp.avgStndrdCost
GO
This is in response to your Query Puzzle. My results are as required.
ProdId Start Date End Date Stnd Cost Modified Date
707 2012-05-30 00:00:00.000 2013-05-29 00:00:00.000 13.8782 2013-05-29 00:00:00.000
708 2012-05-30 00:00:00.000 2013-05-29 00:00:00.000 13.8782 2013-05-29 00:00:00.000
711 2012-05-30 00:00:00.000 2013-05-29 00:00:00.000 13.8782 2013-05-29 00:00:00.000
858 2012-05-30 00:00:00.000 2013-05-29 00:00:00.000 9.7136 2013-05-29 00:00:00.000
859 2012-05-30 00:00:00.000 2013-05-29 00:00:00.000 9.7136 2013-05-29 00:00:00.000
860 2012-05-30 00:00:00.000 2013-05-29 00:00:00.000 9.7136 2013-05-29 00:00:00.000
SELECT pch.ProductID,pch.StartDate,pch.EndDate,pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
INNER JOIN (SELECT ProductID, AvgStandardCost=AVG(StandardCost) FROM Production.Product GROUP BY ProductID) ac
ON pch.ProductID=ac.ProductID
WHERE pch.StandardCost > ac.AvgStandardCost
GO
OR
SELECT pch.ProductID,pch.StartDate,pch.EndDate,pch.StandardCost, pch.ModifiedDate
FROM Production.ProductCostHistory pch
INNER JOIN Production.Product p ON pch.ProductID = p.ProductID
INNER JOIN (SELECT ProductID, AvgStandardCost=AVG(StandardCost) FROM Production.Product GROUP BY ProductID) ac
ON pch.ProductID=ac.ProductID and pch.StandardCost > ac.AvgStandardCost
GO
–This is a correction or corrected version i submitted……
— i corrected the syntax but you had a separate
–comment on the output so this included the extra
–column to give you the expected results
use [AdventureWorks2014]
go
select pch.[ProductID], pch.[StartDate],pch.[EndDate],
pch.StandardCost,
pch.ModifiedDate
FROM
[Production].[ProductCostHistory] pch INNER JOIN
(Select ProductID, AVG(StandardCost) [AVGStandardCost] from [Production].[Product] group by ProductID) p
ON
pch.[ProductID] = p.ProductID
WHERE
pch.StandardCost > p.[AVGStandardCost]
USE AdventureWorks2014
SELECT
p.ProductID
, pch.StandardCost
, pch.startdate
, pch.enddate
, pch.modifieddate
FROM
Production.ProductCostHistory pch
INNER JOIN
Production.Product p
ON
pch.ProductID = p.ProductID
where
(
pch.enddate is not null
OR
pch.enddate not like ‘%NULL%’
)
group by
p.ProductID
, pch.StandardCost
, pch.startdate
, pch.enddate
, pch.modifieddate
having
pch.StandardCost > (
select AVG(pch2.StandardCost)
from Production.ProductCostHistory pch2
where
p.productid = pch2.productid
and
(
pch2.enddate is not null
OR
pch2.enddate not like ‘%NULL%’
)
group by pch2.productid
)
order by p.productid;