SQL SERVER – Puzzle to Win Print Book – Write T-SQL Self Join Without Using LEAD and LAG

Last week we asked a puzzle SQL SERVER – Puzzle to Win Print Book – Functions FIRST_VALUE and LAST_VALUE with OVER clause and ORDER BY . This puzzle got very interesting participation. The details of the winner is listed here.

In this puzzle we received two very important feedback.

  1. This puzzle cleared the concepts of First_Value and Last_Value to the participants.
  2. As this was based on SQL Server 2012 many could not participate it as they have yet not installed SQL Server 2012.

I really appreciate the feedback of user and decided to come up something as fun and helps learn new feature of SQL Server 2012.

Please read yesterday’s blog post SQL SERVER – Introduction to LEAD and LAG – Analytic Functions Introduced in SQL Server 2012 before continuing this puzzle as it is based on yesterday’s post.

Let us fun following query.
USE AdventureWorks
GO
SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty,
LEAD(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID
) LeadValue,
LAG(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID
) LagValue
FROM Sales.SalesOrderDetail s
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty
GO

Above query will give us following result.

Puzzle:

Now use T-SQL Self Join where same table is joined to itself and get the same result without using LEAD or LAG functions.

Hint:

Rules

  • Leave a comment with your detailed answer by Nov 21′s blog post.
  • Open world-wide (where Amazon ships books)
  • If you blog about puzzle’s solution and if you win, you win additional surprise gift as well.

Prizes

Print copy of my new book SQL Server Interview Questions Amazon|Flipkart

If you already have this book, you can opt for any of my other books SQL Wait Stats [Amazon|Flipkart|Kindle] and SQL Programming [Amazon|Flipkart|Kindle].

Reference: Pinal Dave (http://blog.SQLAuthority.com)

About these ads

20 thoughts on “SQL SERVER – Puzzle to Win Print Book – Write T-SQL Self Join Without Using LEAD and LAG

  1. Your article and the puzzle say they refer to LEAD() and LAG() from yesterday’s post, but the sample query and results use FIRST_VALUE() and LAST_VALUE(). I’m not sure which set of functions you want to simulate in your puzzle. Here are two queries that I believe emulate the desired functionality for either pair of functions. Although you didn’t ask for it in the puzzle, the second of these two queries also supports the equivalent of the optional 2nd and 3rd parameters of LEAD() and LAG().

    — a query to emulate FIRST_VALUE() and LAST_VALUE()
    ;with s as (
    select
    SalesOrderID,
    SalesOrderDetailID,
    OrderQty
    from Sales.SalesOrderDetail
    WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
    )
    SELECT s.SalesOrderID,
    s.SalesOrderDetailID,
    s.OrderQty,
    (select top 1 SalesOrderDetailID
    from s order BY SalesOrderDetailID ) as FstValue,
    s.SalesOrderDetailID as LstValue
    FROM s
    ORDER BY s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty

    — a query to emulate LEAD() and LAG()
    ;with s as (
    select
    0 as ldOffset, — equiv to 2nd param of LEAD
    0 as lgOffset, — equiv to 2nd param of LAG
    null as ldDefVal, — equiv to 3rd param of LEAD
    null as lgDefVal, — equiv to 3rd param of LAG
    ROW_NUMBER() OVER (ORDER BY SalesOrderDetailID) as row,
    SalesOrderID,
    SalesOrderDetailID,
    OrderQty
    from Sales.SalesOrderDetail
    WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
    )
    SELECT s.SalesOrderID,
    s.SalesOrderDetailID,
    s.OrderQty,
    isnull( sLd.SalesOrderDetailID, s.LdDefVal) as LeadValue,
    isnull( sLg.SalesOrderDetailID, s.LgDefVal) as LagValue
    FROM s
    left outer join s as sLd on s.row = sLd.row – s.ldOffset
    left outer join s as sLg on s.row = sLg.row + s.lgOffset
    ORDER BY s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty

    • Here’s a variation of my comment yesterday. This one includes the PARTITION BY behavior in addition to support for the 2nd and 3rd optional parameters for LEAD() and LAG()

      /* a query to emulate LEAD() and LAG() */
      ;with s as (
      select
      0 as LeadOffset, /* equiv to 2nd param of LEAD */
      0 as LagOffset, /* equiv to 2nd param of LAG */
      null as LeadDefVal, /* equiv to 3rd param of LEAD */
      null as LagDefVal, /* equiv to 3rd param of LAG */
      /* Try changing the values of the 4 integer values above to see their effect on the results */
      /* The values given above of 0, 0, null and null
      behave the same as the default 2nd and 3rd parameters to LEAD() and LAG() */
      ROW_NUMBER() OVER (ORDER BY SalesOrderDetailID) as row,
      SalesOrderID,
      SalesOrderDetailID,
      OrderQty
      from Sales.SalesOrderDetail
      WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
      )
      SELECT s.SalesOrderID,
      s.SalesOrderDetailID,
      s.OrderQty,
      isnull( sLead.SalesOrderDetailID, s.LeadDefVal) as LeadValue,
      isnull( sLag.SalesOrderDetailID, s.LagDefVal) as LagValue
      FROM s
      left outer join s as sLead
      on s.row = sLead.row – s.LeadOffset
      /* Try commenting out this next line when LeadOffset != 0 */
      and s.SalesOrderID = sLead.SalesOrderID
      /* The additional join criteria on SalesOrderID above
      is equivalent to PARTITION BY SalesOrderID
      in the OVER clause of the LEAD() function */
      left outer join s as sLag
      on s.row = sLag.row + s.LagOffset
      /* Try commenting out this next line when LagOffset != 0 */
      and s.SalesOrderID = sLag.SalesOrderID
      /* The additional join criteria on SalesOrderID above
      is equivalent to PARTITION BY SalesOrderID
      in the OVER clause of the LAG() function */
      ORDER BY s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty

  2. Hi Pinal,

    I think there is some typo error as the article refer Lead(), Leag() function and the sample query you provided is referring to First_Value(), Last_value() function.

    I have written query that gives Lead value and Leag value using self join.,

    –CTE based solution

    ;
    with cteMain
    As
    (
    select
    SalesOrderID,
    SalesOrderDetailID,
    OrderQty,
    ROW_NUMBER() over (order by salesorderdetailid) as sn
    from
    Sales.SalesOrderDetail
    WHERE
    SalesOrderID IN (43670, 43669, 43667, 43663)
    )

    select
    m.SalesOrderID, m.SalesOrderDetailID, m.OrderQty,
    sLead.SalesOrderDetailID as leadvalue,
    sLeg.SalesOrderDetailID as leagvalue

    from
    cteMain as m
    left outer join cteMain as sLead
    on sLead.sn = m.sn+1
    left outer join cteMain as sLeg
    on sLeg.sn = m.sn-1
    order by
    m.SalesOrderID, m.SalesOrderDetailID, m.OrderQty

    — The same can be achieved using Co-Related subquery…

    select
    m.SalesOrderID,
    m.SalesOrderDetailID,
    m.OrderQty,
    ( select MIN(salesorderdetailid)
    from Sales.SalesOrderDetail as l
    where l.SalesOrderID in (43670, 43669, 43667, 43663)
    and l.SalesOrderID >= m.SalesOrderID and l.SalesOrderDetailID > m.SalesOrderDetailID
    ) as lead,

    ( select MAX(salesorderdetailid)
    from Sales.SalesOrderDetail as l
    where l.SalesOrderID in (43670, 43669, 43667, 43663)
    and l.SalesOrderID <= m.SalesOrderID and l.SalesOrderDetailID < m.SalesOrderDetailID
    ) as leag

    from
    Sales.SalesOrderDetail as m
    where
    m.SalesOrderID in (43670, 43669, 43667, 43663)
    order by
    m.SalesOrderID, m.SalesOrderDetailID, m.OrderQty

    Thanks,
    Pravin Patel.

  3. Dear Pinal Sir,

    Yesterday your blog article was saying about LEAD or LAG functions. That’s right, but the sample query you are used with FIRST_VALUE and LAST_VALUE for today’s puzzle.

    I’m confused.. expecting your reply..

    Thanks and regards,

    Nikhildas

  4. Hi Sir,

    bit confused over the blog puzzle question and the required answer.

    However to execute the result same as lead and lag function using self join, i wrote the following query

    USE AdventureWorks

    DROP TABLE #TempSalesOrderDetail

    CREATE TABLE #TempSalesOrderDetail
    (
    Rownumber INT
    ,LEAD INT
    ,LAG INT
    ,SalesOrderID INT
    ,SalesOrderDetailID INT
    ,OrderQty INT
    )

    INSERT INTO #TempSalesOrderDetail
    SELECT ROW_NUMBER() OVER (ORDER BY S.SalesOrderID)
    ,ROW_NUMBER() OVER (ORDER BY S.SalesOrderID)-1
    ,ROW_NUMBER() OVER (ORDER BY S.SalesOrderID)+1
    ,s.SalesOrderID
    ,s.SalesOrderDetailID
    ,s.OrderQty
    FROM Sales.SalesOrderDetail s
    WHERE S.SalesOrderID IN (43670, 43669, 43667, 43663)
    ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty

    SELECT A.SalesOrderID
    ,A.SalesOrderDetailID
    ,A.OrderQty
    ,B.SalesOrderDetailID AS Lead
    ,C.SalesOrderDetailID AS Lag
    FROM #TempSalesOrderDetail AS A
    LEFT JOIN #TempSalesOrderDetail AS B ON A.Rownumber = B.LEAD
    LEFT JOIN #TempSalesOrderDetail AS C ON A.Rownumber = C.LAG

    The output will be similar to the blog post on Lead and Lag functions described yesterday.

    Thanks and Regards,
    P.Anish Shenoy.

  5. Hi, plz check below query. This will show result as you like.

    SELECT SalesOrderID,SalesOrderDetailID, OrderQty,
    ( SELECT MIN(SalesOrderDetailID) AS FirstValue
    FROM Sales.SalesOrderDetail WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
    )AS FirstValue
    ,SalesOrderDetailID AS LastValue
    FROM Sales.SalesOrderDetail
    WHERE SalesOrderID IN (43670, 43669, 43667, 43663)

    Vinay

  6. Hi Dave,
    This is my trail please have a look at it

    1.

    SELECT a1.salesorderid,
    a1.salesorderdetailsid,
    a1.orderqty,
    min(a2.salesorderdetailsid) over () as FstValue,
    a2.salesorderdetailsid as LstValue
    FROM adventureworks a1
    INNER JOIN adventureworks a2
    ON a1.salesorderdetailsid=a2.salesorderdetailsid
    WHERE a1.salesorderid in (43670,43669,43667,43663)
    ORDER BY a1.salesorderid,a1.salesorderdetailsid,a1.orderqty

    SalesOrderId SalesOrderDetailsId OrderQty FstValue LstValue
    _________________________________________________________
    43663 52 1 52 52
    43667 77 3 52 77
    43667 78 1 52 78
    43667 79 1 52 79
    43667 80 1 52 80
    43669 110 1 52 110
    43670 111 1 52 111
    43670 112 2 52 112
    43670 113 2 52 113
    43670 114 1 52 114
    ___________________________________________________________

    2. When use the same thing with PARTITION BY

    SalesOrderId SalesOrderDetailsId OrderQty FstValue LstValue
    ___________________________________________________________

    43663 52 1 52 52
    43667 77 3 77 77
    43667 78 1 77 78
    43667 79 1 77 79
    43667 80 1 77 80
    43669 110 1 110 110
    43670 111 1 111 111
    43670 112 2 111 112
    43670 113 2 111 113
    43670 114 1 111 114
    ___________________________________________________________

  7. Dear Sir,
    Please have a look on the below query.This will give the required result.

    USE AdventureWorks
    GO

    WITH tempSalesOrderDetail AS
    (
    SELECT s.SalesOrderID,
    s.SalesOrderDetailID,
    s.OrderQty,
    ROW_NUMBER() OVER (ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty) AS rownum
    FROM Sales.SalesOrderDetail s
    WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
    )

    SELECT currow.SalesOrderID,
    currow.SalesOrderDetailID,
    currow.OrderQty,
    nextrow.SalesOrderDetailID AS LeadValue,
    prevrow.SalesOrderDetailID AS LagVAlue
    FROM tempSalesOrderDetail currow
    LEFT JOIN tempSalesOrderDetail nextrow
    ON currow.rownum = nextrow.rownum – 1
    LEFT JOIN tempSalesOrderDetail prevrow
    ON currow.rownum = prevrow.rownum + 1
    GO

    ——————————————————————
    Sir,
    Please correct me, if I’m wrong.

    Thanks and Regards,
    SIJIN KUMAR V P
    INDIA

  8. Following is the solution for your puzzle without using LEAD and LAG functions

    USE AdventureWorks
    GO
    WITH cteSalesOrderDetail AS
    (
    SELECT s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty, ROW_NUMBER() OVER (ORDER BY s.SalesOrderDetailID) AS rownum
    FROM Sales.SalesOrderDetail s WITH(NOLOCK)
    WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
    )

    SELECT c1.SalesOrderID, c1.SalesOrderDetailID, c1.OrderQty, c2.SalesOrderDetailID AS LeadValue, c3.SalesOrderDetailID AS LagVAlue
    FROM cteSalesOrderDetail c1
    LEFT JOIN cteSalesOrderDetail c2
    ON c1.rownum = c2.rownum – 1
    LEFT JOIN cteSalesOrderDetail c3
    ON c1.rownum = c3.rownum + 1
    ORDER BY c1.SalesOrderID,c1.SalesOrderDetailID,c1.OrderQty

  9. 2. When using ” partition by ”

    select a1.salesorderid,a1.salesorderdetailsid,a1.orderqty,
    case a1.salesorderdetailsid
    when min(a2.salesorderdetailsid) over (partition by a2.salesorderid) then min(a2.salesorderdetailsid) over (partition by a2.salesorderid)
    else
    min(a2.salesorderdetailsid) over (partition by a2.salesorderid)
    end
    as FstValue,a2.salesorderdetailsid as LstValue
    from adventureworks a1
    inner join adventureworks a2
    on a1.salesorderdetailsid=a2.salesorderdetailsid
    WHERE a1.salesorderid in (43670,43669,43667,43663)
    order by a1.salesorderid,a1.salesorderdetailsid,a1.orderqty

  10. USE AdventureWorks
    GO
    WITH SalesOrderDetailOrdered as
    (
    SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty, ROW_NUMBER() OVER (ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty) RowNumber
    FROM Sales.SalesOrderDetail s
    WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
    )

    SELECT sbase.SalesOrderID,sbase.SalesOrderDetailID,sbase.OrderQty,slead.SalesOrderDetailID,slag.SalesOrderDetailID
    FROM SalesOrderDetailOrdered sbase
    LEFT OUTER JOIN SalesOrderDetailOrdered slead ON sbase.RowNumber=slead.RowNumber-1
    LEFT OUTER JOIN SalesOrderDetailOrdered slag ON sbase.RowNumber=slag.RowNumber+1

  11. Hi Pinal,

    This is my solution to the puzzle.

    ;with Sales
    as
    (
    SELECT SalesOrderId,SalesOrderDetail,OrderQty,row_number() over(order by salesorderid) as Row FROM SALESORDERDETAIL
    )
    select a.SalesOrderId,a.SalesOrderDetail,a.OrderQty,b.SalesOrderDetail As LeadValue,c.SalesOrderDetail As LagValue from sales a
    left join Sales b
    on a.Row +1 = b.Row
    left join Sales c
    on a.Row = c.Row+1

    Regards,

    Rajkumar,
    Bangalore.

  12. I have one without using SQL2012 new functions
    And without any Join:

    With T1 As
    (Select Row_Number() Over(Order By SalesOrderDetailID) N,
    s.SalesOrderID,
    s.SalesOrderDetailID,
    s.OrderQty
    From Sales.SalesOrderDetail s
    Where SalesOrderID IN (43670, 43669, 43667, 43663))
    Select SalesOrderID,SalesOrderDetailID,OrderQty,
    Case When N%2=1 Then Max(Case When N%2=0 Then SalesOrderDetailID End) Over (Partition By (N+1)/2) Else Max(Case When N%2=1 Then SalesOrderDetailID End) Over (Partition By N/2) End LeadVal,
    Case When N%2=1 Then Max(Case When N%2=0 Then SalesOrderDetailID End) Over (Partition By N/2) Else Max(Case When N%2=1 Then SalesOrderDetailID End) Over (Partition By (N+1)/2) End LagVal
    From T1
    Order BY SalesOrderID,
    SalesOrderDetailID,
    OrderQty;
    Go

  13. I tried a few things, but this seemed to be the cleanest method:

    USE AdventureWorks
    GO

    WITH tmpSalesOrderDetail AS
    (
    SELECT
    SalesOrderID,
    SalesOrderDetailID,
    OrderQty,
    ROW_NUMBER() OVER (ORDER BY SalesOrderDetailID) AS rownum
    FROM Sales.SalesOrderDetail
    WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
    )

    SELECT
    t.SalesOrderID,
    t.SalesOrderDetailID,
    t.OrderQty,
    t_lead.SalesOrderDetailID AS LeadValue,
    t_lag.SalesOrderDetailID AS LagValue
    FROM tmpSalesOrderDetail t
    LEFT OUTER JOIN tmpSalesOrderDetail t_lead
    ON t.rownum = t_lead.rownum – 1
    LEFT OUTER JOIN tmpSalesOrderDetail t_lag
    ON t.rownum = t_lag.rownum + 1
    ORDER BY t.SalesOrderID, t.SalesOrderDetailID, t.OrderQty;

  14. USE AdventureWorks
    GO
    ;WITH cteSalesOrderDetail AS
    (
    SELECT s.SalesOrderID,
    s.SalesOrderDetailID,
    s.OrderQty,
    ROW_NUMBER() OVER (ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty) AS rownum
    FROM Sales.SalesOrderDetail s
    WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
    )
    SELECT [Current Row].SalesOrderID,
    [Current Row].SalesOrderDetailID,
    [Current Row].OrderQty,
    [Next Row].SalesOrderDetailID AS LeadValue,
    [Previous Row].SalesOrderDetailID AS LagVAlue
    FROM cteSalesOrderDetail [Current Row]
    LEFT JOIN cteSalesOrderDetail [Next Row]
    ON [Next Row].rownum = [Current Row].rownum + 1
    LEFT JOIN cteSalesOrderDetail [Previous Row]
    ON [Previous Row].rownum = [Current Row].rownum – 1

  15. Pingback: SQL SERVER – 2012 – Summary of All the Analytic Functions – MSDN and SQLAuthority « Journey to SQLAuthority

  16. Pingback: SQL SERVER – Solution to Puzzle – Simulate LEAD() and LAG() without Using SQL Server 2012 Analytic Function « Journey to SQLAuthority

  17. Pingback: SQL SERVER – Puzzle to Win Print Book and Free 30 Days Online Training Material « SQL Server Journey with SQL Authority

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s