Last week we asked a puzzle SQL SERVER – Puzzle to Win Print Book – Functions FIRST_VALUE and LAST_VALUE with OVER clause and ORDER BY . This puzzle got very interesting participation. The details of the winner is listed here.
In this puzzle we received two very important feedback.
- This puzzle cleared the concepts of First_Value and Last_Value to the participants.
- As this was based on SQL Server 2012 many could not participate it as they have yet not installed SQL Server 2012.
I really appreciate the feedback of user and decided to come up something as fun and helps learn new feature of SQL Server 2012.
Please read yesterday’s blog post SQL SERVER – Introduction to LEAD and LAG – Analytic Functions Introduced in SQL Server 2012 before continuing this puzzle as it is based on yesterday’s post.
Let us fun following query.
USE AdventureWorks
GO
SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty,
LEAD(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID
) LeadValue,
LAG(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID
) LagValue
FROM Sales.SalesOrderDetail s
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty
GO
Above query will give us following result.
Puzzle:
Now use T-SQL Self Join where same table is joined to itself and get the same result without using LEAD or LAG functions.
Hint:
- Introduction to JOINs – Basic of JOINs
- Self Join
- A new analytic functions in SQL Server Denali CTP3 – LEAD() and LAG()
Rules
- Leave a comment with your detailed answer by Nov 21′s blog post.
- Open world-wide (where Amazon ships books)
- If you blog about puzzle’s solution and if you win, you win additional surprise gift as well.
Prizes
Print copy of my new book SQL Server Interview Questions Amazon|Flipkart
If you already have this book, you can opt for any of my other books SQL Wait Stats [Amazon|Flipkart|Kindle] and SQL Programming [Amazon|Flipkart|Kindle].
Reference: Pinal Dave (https://blog.sqlauthority.com)
21 Comments. Leave new
Your article and the puzzle say they refer to LEAD() and LAG() from yesterday’s post, but the sample query and results use FIRST_VALUE() and LAST_VALUE(). I’m not sure which set of functions you want to simulate in your puzzle. Here are two queries that I believe emulate the desired functionality for either pair of functions. Although you didn’t ask for it in the puzzle, the second of these two queries also supports the equivalent of the optional 2nd and 3rd parameters of LEAD() and LAG().
— a query to emulate FIRST_VALUE() and LAST_VALUE()
;with s as (
select
SalesOrderID,
SalesOrderDetailID,
OrderQty
from Sales.SalesOrderDetail
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
)
SELECT s.SalesOrderID,
s.SalesOrderDetailID,
s.OrderQty,
(select top 1 SalesOrderDetailID
from s order BY SalesOrderDetailID ) as FstValue,
s.SalesOrderDetailID as LstValue
FROM s
ORDER BY s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty
— a query to emulate LEAD() and LAG()
;with s as (
select
0 as ldOffset, — equiv to 2nd param of LEAD
0 as lgOffset, — equiv to 2nd param of LAG
null as ldDefVal, — equiv to 3rd param of LEAD
null as lgDefVal, — equiv to 3rd param of LAG
ROW_NUMBER() OVER (ORDER BY SalesOrderDetailID) as row,
SalesOrderID,
SalesOrderDetailID,
OrderQty
from Sales.SalesOrderDetail
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
)
SELECT s.SalesOrderID,
s.SalesOrderDetailID,
s.OrderQty,
isnull( sLd.SalesOrderDetailID, s.LdDefVal) as LeadValue,
isnull( sLg.SalesOrderDetailID, s.LgDefVal) as LagValue
FROM s
left outer join s as sLd on s.row = sLd.row – s.ldOffset
left outer join s as sLg on s.row = sLg.row + s.lgOffset
ORDER BY s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty
Here’s a variation of my comment yesterday. This one includes the PARTITION BY behavior in addition to support for the 2nd and 3rd optional parameters for LEAD() and LAG()
/* a query to emulate LEAD() and LAG() */
;with s as (
select
0 as LeadOffset, /* equiv to 2nd param of LEAD */
0 as LagOffset, /* equiv to 2nd param of LAG */
null as LeadDefVal, /* equiv to 3rd param of LEAD */
null as LagDefVal, /* equiv to 3rd param of LAG */
/* Try changing the values of the 4 integer values above to see their effect on the results */
/* The values given above of 0, 0, null and null
behave the same as the default 2nd and 3rd parameters to LEAD() and LAG() */
ROW_NUMBER() OVER (ORDER BY SalesOrderDetailID) as row,
SalesOrderID,
SalesOrderDetailID,
OrderQty
from Sales.SalesOrderDetail
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
)
SELECT s.SalesOrderID,
s.SalesOrderDetailID,
s.OrderQty,
isnull( sLead.SalesOrderDetailID, s.LeadDefVal) as LeadValue,
isnull( sLag.SalesOrderDetailID, s.LagDefVal) as LagValue
FROM s
left outer join s as sLead
on s.row = sLead.row – s.LeadOffset
/* Try commenting out this next line when LeadOffset != 0 */
and s.SalesOrderID = sLead.SalesOrderID
/* The additional join criteria on SalesOrderID above
is equivalent to PARTITION BY SalesOrderID
in the OVER clause of the LEAD() function */
left outer join s as sLag
on s.row = sLag.row + s.LagOffset
/* Try commenting out this next line when LagOffset != 0 */
and s.SalesOrderID = sLag.SalesOrderID
/* The additional join criteria on SalesOrderID above
is equivalent to PARTITION BY SalesOrderID
in the OVER clause of the LAG() function */
ORDER BY s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty
Hi Pinal,
I think there is some typo error as the article refer Lead(), Leag() function and the sample query you provided is referring to First_Value(), Last_value() function.
I have written query that gives Lead value and Leag value using self join.,
–CTE based solution
;
with cteMain
As
(
select
SalesOrderID,
SalesOrderDetailID,
OrderQty,
ROW_NUMBER() over (order by salesorderdetailid) as sn
from
Sales.SalesOrderDetail
WHERE
SalesOrderID IN (43670, 43669, 43667, 43663)
)
select
m.SalesOrderID, m.SalesOrderDetailID, m.OrderQty,
sLead.SalesOrderDetailID as leadvalue,
sLeg.SalesOrderDetailID as leagvalue
from
cteMain as m
left outer join cteMain as sLead
on sLead.sn = m.sn+1
left outer join cteMain as sLeg
on sLeg.sn = m.sn-1
order by
m.SalesOrderID, m.SalesOrderDetailID, m.OrderQty
— The same can be achieved using Co-Related subquery…
select
m.SalesOrderID,
m.SalesOrderDetailID,
m.OrderQty,
( select MIN(salesorderdetailid)
from Sales.SalesOrderDetail as l
where l.SalesOrderID in (43670, 43669, 43667, 43663)
and l.SalesOrderID >= m.SalesOrderID and l.SalesOrderDetailID > m.SalesOrderDetailID
) as lead,
( select MAX(salesorderdetailid)
from Sales.SalesOrderDetail as l
where l.SalesOrderID in (43670, 43669, 43667, 43663)
and l.SalesOrderID <= m.SalesOrderID and l.SalesOrderDetailID < m.SalesOrderDetailID
) as leag
from
Sales.SalesOrderDetail as m
where
m.SalesOrderID in (43670, 43669, 43667, 43663)
order by
m.SalesOrderID, m.SalesOrderDetailID, m.OrderQty
Thanks,
Pravin Patel.
Dear Pinal Sir,
Yesterday your blog article was saying about LEAD or LAG functions. That’s right, but the sample query you are used with FIRST_VALUE and LAST_VALUE for today’s puzzle.
I’m confused.. expecting your reply..
Thanks and regards,
Nikhildas
Hi Sir,
bit confused over the blog puzzle question and the required answer.
However to execute the result same as lead and lag function using self join, i wrote the following query
USE AdventureWorks
DROP TABLE #TempSalesOrderDetail
CREATE TABLE #TempSalesOrderDetail
(
Rownumber INT
,LEAD INT
,LAG INT
,SalesOrderID INT
,SalesOrderDetailID INT
,OrderQty INT
)
INSERT INTO #TempSalesOrderDetail
SELECT ROW_NUMBER() OVER (ORDER BY S.SalesOrderID)
,ROW_NUMBER() OVER (ORDER BY S.SalesOrderID)-1
,ROW_NUMBER() OVER (ORDER BY S.SalesOrderID)+1
,s.SalesOrderID
,s.SalesOrderDetailID
,s.OrderQty
FROM Sales.SalesOrderDetail s
WHERE S.SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty
SELECT A.SalesOrderID
,A.SalesOrderDetailID
,A.OrderQty
,B.SalesOrderDetailID AS Lead
,C.SalesOrderDetailID AS Lag
FROM #TempSalesOrderDetail AS A
LEFT JOIN #TempSalesOrderDetail AS B ON A.Rownumber = B.LEAD
LEFT JOIN #TempSalesOrderDetail AS C ON A.Rownumber = C.LAG
The output will be similar to the blog post on Lead and Lag functions described yesterday.
Thanks and Regards,
P.Anish Shenoy.
Hi, plz check below query. This will show result as you like.
SELECT SalesOrderID,SalesOrderDetailID, OrderQty,
( SELECT MIN(SalesOrderDetailID) AS FirstValue
FROM Sales.SalesOrderDetail WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
)AS FirstValue
,SalesOrderDetailID AS LastValue
FROM Sales.SalesOrderDetail
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
Vinay
Hi Dave,
This is my trail please have a look at it
1.
SELECT a1.salesorderid,
a1.salesorderdetailsid,
a1.orderqty,
min(a2.salesorderdetailsid) over () as FstValue,
a2.salesorderdetailsid as LstValue
FROM adventureworks a1
INNER JOIN adventureworks a2
ON a1.salesorderdetailsid=a2.salesorderdetailsid
WHERE a1.salesorderid in (43670,43669,43667,43663)
ORDER BY a1.salesorderid,a1.salesorderdetailsid,a1.orderqty
SalesOrderId SalesOrderDetailsId OrderQty FstValue LstValue
_________________________________________________________
43663 52 1 52 52
43667 77 3 52 77
43667 78 1 52 78
43667 79 1 52 79
43667 80 1 52 80
43669 110 1 52 110
43670 111 1 52 111
43670 112 2 52 112
43670 113 2 52 113
43670 114 1 52 114
___________________________________________________________
2. When use the same thing with PARTITION BY
SalesOrderId SalesOrderDetailsId OrderQty FstValue LstValue
___________________________________________________________
43663 52 1 52 52
43667 77 3 77 77
43667 78 1 77 78
43667 79 1 77 79
43667 80 1 77 80
43669 110 1 110 110
43670 111 1 111 111
43670 112 2 111 112
43670 113 2 111 113
43670 114 1 111 114
___________________________________________________________
Great Solution
Dear Sir,
Please have a look on the below query.This will give the required result.
USE AdventureWorks
GO
WITH tempSalesOrderDetail AS
(
SELECT s.SalesOrderID,
s.SalesOrderDetailID,
s.OrderQty,
ROW_NUMBER() OVER (ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty) AS rownum
FROM Sales.SalesOrderDetail s
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
)
SELECT currow.SalesOrderID,
currow.SalesOrderDetailID,
currow.OrderQty,
nextrow.SalesOrderDetailID AS LeadValue,
prevrow.SalesOrderDetailID AS LagVAlue
FROM tempSalesOrderDetail currow
LEFT JOIN tempSalesOrderDetail nextrow
ON currow.rownum = nextrow.rownum – 1
LEFT JOIN tempSalesOrderDetail prevrow
ON currow.rownum = prevrow.rownum + 1
GO
——————————————————————
Sir,
Please correct me, if I’m wrong.
Thanks and Regards,
SIJIN KUMAR V P
INDIA
Following is the solution for your puzzle without using LEAD and LAG functions
USE AdventureWorks
GO
WITH cteSalesOrderDetail AS
(
SELECT s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty, ROW_NUMBER() OVER (ORDER BY s.SalesOrderDetailID) AS rownum
FROM Sales.SalesOrderDetail s WITH(NOLOCK)
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
)
SELECT c1.SalesOrderID, c1.SalesOrderDetailID, c1.OrderQty, c2.SalesOrderDetailID AS LeadValue, c3.SalesOrderDetailID AS LagVAlue
FROM cteSalesOrderDetail c1
LEFT JOIN cteSalesOrderDetail c2
ON c1.rownum = c2.rownum – 1
LEFT JOIN cteSalesOrderDetail c3
ON c1.rownum = c3.rownum + 1
ORDER BY c1.SalesOrderID,c1.SalesOrderDetailID,c1.OrderQty
2. When using ” partition by ”
select a1.salesorderid,a1.salesorderdetailsid,a1.orderqty,
case a1.salesorderdetailsid
when min(a2.salesorderdetailsid) over (partition by a2.salesorderid) then min(a2.salesorderdetailsid) over (partition by a2.salesorderid)
else
min(a2.salesorderdetailsid) over (partition by a2.salesorderid)
end
as FstValue,a2.salesorderdetailsid as LstValue
from adventureworks a1
inner join adventureworks a2
on a1.salesorderdetailsid=a2.salesorderdetailsid
WHERE a1.salesorderid in (43670,43669,43667,43663)
order by a1.salesorderid,a1.salesorderdetailsid,a1.orderqty
USE AdventureWorks
GO
WITH SalesOrderDetailOrdered as
(
SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty, ROW_NUMBER() OVER (ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty) RowNumber
FROM Sales.SalesOrderDetail s
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
)
SELECT sbase.SalesOrderID,sbase.SalesOrderDetailID,sbase.OrderQty,slead.SalesOrderDetailID,slag.SalesOrderDetailID
FROM SalesOrderDetailOrdered sbase
LEFT OUTER JOIN SalesOrderDetailOrdered slead ON sbase.RowNumber=slead.RowNumber-1
LEFT OUTER JOIN SalesOrderDetailOrdered slag ON sbase.RowNumber=slag.RowNumber+1
Hi Pinal,
This is my solution to the puzzle.
;with Sales
as
(
SELECT SalesOrderId,SalesOrderDetail,OrderQty,row_number() over(order by salesorderid) as Row FROM SALESORDERDETAIL
)
select a.SalesOrderId,a.SalesOrderDetail,a.OrderQty,b.SalesOrderDetail As LeadValue,c.SalesOrderDetail As LagValue from sales a
left join Sales b
on a.Row +1 = b.Row
left join Sales c
on a.Row = c.Row+1
Regards,
Rajkumar,
Bangalore.
I have one without using SQL2012 new functions
And without any Join:
With T1 As
(Select Row_Number() Over(Order By SalesOrderDetailID) N,
s.SalesOrderID,
s.SalesOrderDetailID,
s.OrderQty
From Sales.SalesOrderDetail s
Where SalesOrderID IN (43670, 43669, 43667, 43663))
Select SalesOrderID,SalesOrderDetailID,OrderQty,
Case When N%2=1 Then Max(Case When N%2=0 Then SalesOrderDetailID End) Over (Partition By (N+1)/2) Else Max(Case When N%2=1 Then SalesOrderDetailID End) Over (Partition By N/2) End LeadVal,
Case When N%2=1 Then Max(Case When N%2=0 Then SalesOrderDetailID End) Over (Partition By N/2) Else Max(Case When N%2=1 Then SalesOrderDetailID End) Over (Partition By (N+1)/2) End LagVal
From T1
Order BY SalesOrderID,
SalesOrderDetailID,
OrderQty;
Go
I tried a few things, but this seemed to be the cleanest method:
USE AdventureWorks
GO
WITH tmpSalesOrderDetail AS
(
SELECT
SalesOrderID,
SalesOrderDetailID,
OrderQty,
ROW_NUMBER() OVER (ORDER BY SalesOrderDetailID) AS rownum
FROM Sales.SalesOrderDetail
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
)
SELECT
t.SalesOrderID,
t.SalesOrderDetailID,
t.OrderQty,
t_lead.SalesOrderDetailID AS LeadValue,
t_lag.SalesOrderDetailID AS LagValue
FROM tmpSalesOrderDetail t
LEFT OUTER JOIN tmpSalesOrderDetail t_lead
ON t.rownum = t_lead.rownum – 1
LEFT OUTER JOIN tmpSalesOrderDetail t_lag
ON t.rownum = t_lag.rownum + 1
ORDER BY t.SalesOrderID, t.SalesOrderDetailID, t.OrderQty;
USE AdventureWorks
GO
;WITH cteSalesOrderDetail AS
(
SELECT s.SalesOrderID,
s.SalesOrderDetailID,
s.OrderQty,
ROW_NUMBER() OVER (ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty) AS rownum
FROM Sales.SalesOrderDetail s
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
)
SELECT [Current Row].SalesOrderID,
[Current Row].SalesOrderDetailID,
[Current Row].OrderQty,
[Next Row].SalesOrderDetailID AS LeadValue,
[Previous Row].SalesOrderDetailID AS LagVAlue
FROM cteSalesOrderDetail [Current Row]
LEFT JOIN cteSalesOrderDetail [Next Row]
ON [Next Row].rownum = [Current Row].rownum + 1
LEFT JOIN cteSalesOrderDetail [Previous Row]
ON [Previous Row].rownum = [Current Row].rownum – 1
Hi Pinal,
Here is my solution to your puzzle :
select SalesOrderID , SalesOrderDetailID , OrderQty , Lead , Lag from [SalesLT].[SalesOrderDetail] S1
Outer Apply(
select top 1 min(f.SalesOrderDetailID) “Lead” from [SalesLT].[SalesOrderDetail] s inner join [SalesLT].[SalesOrderDetail] f on (s.SalesOrderID = f.SalesOrderID AND s.SalesOrderDetailID f.SalesOrderDetailID ) where s.SalesOrderDetailID=S1.SalesOrderDetailID group by s.SalesOrderDetailID order by s.SalesOrderDetailID) B
I have not used any built in windows functions . All the replies posted above are using Row_Number , I have not used even that.
Let me know your feedback.
select SalesOrderID , SalesOrderDetailID , OrderQty , Lead , Lag from [SalesLT].[SalesOrderDetail] S1
Outer Apply(
select top 1 min(f.SalesOrderDetailID) “Lead” from [SalesLT].[SalesOrderDetail] s inner join [SalesLT].[SalesOrderDetail] f on (s.SalesOrderID = f.SalesOrderID AND s.SalesOrderDetailID f.SalesOrderDetailID ) where s.SalesOrderDetailID=S1.SalesOrderDetailID group by s.SalesOrderDetailID order by s.SalesOrderDetailID) B
select SalesOrderID , SalesOrderDetailID , OrderQty , Lead , Lag from [SalesLT].[SalesOrderDetail] S1 Outer Apply(
select top 1 min(f.SalesOrderDetailID) “Lead” from [SalesLT].[SalesOrderDetail] s inner join [SalesLT].[SalesOrderDetail] f on (s.SalesOrderID = f.SalesOrderID AND s.SalesOrderDetailID f.SalesOrderDetailID ) where s.SalesOrderDetailID=S1.SalesOrderDetailID group by s.SalesOrderDetailID order by s.SalesOrderDetailID) B
Looks like the query is being truncated automaticaly.