SQL SERVER – Introduction to LEAD and LAG – Analytic Functions Introduced in SQL Server 2012

SQL Server 2012 introduces new analytical function LEAD() and LAG(). These functions accesses data from a subsequent row (for lead) and previous row (for lag) in the same result set without the use of a self-join . It will be very difficult to explain this in words so I will attempt small example to explain you this function. Instead of creating new table, I will be using AdventureWorks sample database as most of the developer uses that for experiment.

Let us fun following query.
USE AdventureWorks
GO
SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty,
LEAD(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID
) LeadValue,
LAG(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID
) LagValue
FROM Sales.SalesOrderDetail s
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty
GO

Above query will give us following result.

SQL SERVER - Introduction to LEAD and LAG - Analytic Functions Introduced in SQL Server 2012 leadlag1

When we look at above resultset it is very clear that LEAD function gives us value which is going to come in next line and LAG function gives us value which was encountered in previous line. If we have to generate the same result without using this function we will have to use self join. In future blog post we will see the same. Let us explore this function a bit more. This function not only provide previous or next line but it can also access any line before or after using offset.

Let us fun following query, where LEAD and LAG function accesses the row with offset of 2.

Solarwinds

USE AdventureWorks
GO
SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty,
LEAD(SalesOrderDetailID,2) OVER (ORDER BY SalesOrderDetailID
) LeadValue,
LAG(SalesOrderDetailID,2) OVER (ORDER BY SalesOrderDetailID
) LagValue
FROM Sales.SalesOrderDetail s
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty
GO

Above query will give us following result.

SQL SERVER - Introduction to LEAD and LAG - Analytic Functions Introduced in SQL Server 2012 leadlag2

You can see the LEAD and LAG functions  now have interval of  rows when they are returning results. As there is interval of two rows the first two rows in LEAD function and last two rows in LAG function will return NULL value. You can easily replace this NULL Value with any other default value by passing third parameter in LEAD and LAG function.

Let us fun following query.

USE AdventureWorks
GO
SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty,
LEAD(SalesOrderDetailID,2,0) OVER (ORDER BY SalesOrderDetailID
) LeadValue,
LAG(SalesOrderDetailID,2,0) OVER (ORDER BY SalesOrderDetailID
) LagValue
FROM Sales.SalesOrderDetail s
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty
GO

Above query will give us following result, where NULL are now replaced with value 0.

SQL SERVER - Introduction to LEAD and LAG - Analytic Functions Introduced in SQL Server 2012 leadlag3

Just like any other analytic function we can easily partition this function as well. Let us see the use of PARTITION BY in this clause.

USE AdventureWorks
GO
SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty,
LEAD(SalesOrderDetailID) OVER (PARTITION BY SalesOrderID
ORDER BY SalesOrderDetailID
) LeadValue,
LAG(SalesOrderDetailID) OVER (PARTITION BY SalesOrderID
ORDER BY SalesOrderDetailID
) LagValue
FROM Sales.SalesOrderDetail s
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty
GO

Above query will give us following result, where now the data is partitioned by SalesOrderID and LEAD and LAG functions are returning the appropriate result in that window. As now there are smaller partition in my query, you will see higher presence of NULL.

SQL SERVER - Introduction to LEAD and LAG - Analytic Functions Introduced in SQL Server 2012 leadlag4

In future blog post we will see how this functions are compared to SELF JOIN.

Reference: Pinal Dave (https://blog.sqlauthority.com)

Solarwinds
,
Previous Post
SQLAuthority News – A Real Story of Book Getting ‘Out of Stock’ to A 25% Discount Story Available
Next Post
SQL SERVER – Puzzle to Win Print Book – Write T-SQL Self Join Without Using LEAD and LAG

Related Posts

45 Comments. Leave new

  • Error Message:

    Msg 195, Level 15, State 10, Line 2
    ‘LEAD’ is not a recognized built-in function name.

    Sorry this is the error I am getting when I am trying to execute the below query.

    USE AdventureWorks2008R2
    GO
    SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty,
    LEAD(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID
    ) LeadValue,
    LAG(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID
    ) LagValue
    FROM Sales.SalesOrderDetail s
    WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
    ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty
    GO

    Thanks
    Anil

    Reply
  • Vary good
    thinks

    Reply
  • very good article

    Reply
  • kirti sardana
    May 21, 2013 11:27 am

    Very easy and straight forward explanation Pinal :)
    Cheers !
    Kirti Sardana

    Reply
  • Arturo Hernandez
    August 1, 2013 3:20 am

    BTW, Oracle introduced this exact functionality in 2004.

    Reply
  • Excellent examples, very clear and understandable.

    Reply
  • Thanks Pinal Dave. I was looking for something similar.

    Reply
  • The lag-function is one those I use since many years in the statistical software SPSS – great that it is included in MS-SQL also now. Nevertheless I have problems using it to clean duplicates in my table “titles”. It consists of music titles and has a PK named “TID” (title-id), a column “IID” for the Artist (“Interpret” in German) and “Title”, which keeps the Name of the title as a string.
    Some titles have different TID’s although they equal, some others differ only a little due to a wrong spelling (i.e. “Rock N Roll Train” and “Rock`N`Roll Train”).
    SELECT IID, TID, Titel,
    SUBSTRING(Titel, 1, 10) AS ShortTitleCurrent,
    SUBSTRING(LAG(Titel) OVER(ORDER BY IID, Titel), 1, 10) AS ShortTitleLast,
    IIF(IID=LAG(IID) OVER(ORDER BY IID, Titel) AND
    SUBSTRING(Titel, 1, 10)=SUBSTRING(LAG(Titel) OVER(ORDER BY IID, Titel), 1, 10),
    LAG(TID) OVER(ORDER BY IID, Titel), NULL) AS LAG_TID
    FROM dbo.Titel
    ORDER BY IID, Titel ;

    This works fine, for all titles of every artist which are similar on the first 10 characters I get the correct “LAG_TID”. The problem begins when I try to save this LAG_TID in a column 2 “TID_NEU”. Because I can’t use “windowed functions” like LAG directly in UPDATE (as far as I know) I think I have to include a subquery. My idea to solve this (in a simplified version) looks like this (including the Partition-technique I just have learned in this article):
    UPDATE Titel
    SET TID_LAG =
    (SELECT LAG(TID) OVER(PARTITION BY IID ORDER BY Titel) AS TID_LAG
    FROM GEMA.dbo.Titel)

    But SQL has another opinion: “Msg 512, Level 16, State 1, Line 7
    Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, , >= or when the subquery is used as an expression.
    The statement has been terminated.”

    Does anybody has a solution for me? Any hint would be appreciated!
    Dirk

    Reply
  • How can achieve same functionality with SQL Server 2008

    Reply
    • I use a CTE (WITH statement) to build the set of data I want to work with, and in that I include a row_number with no partition, but ORDERED BY in the order I want.
      I then
      SELECT ….
      FROM CTE A
      LEFT JOIN CTE B ON B.rownum = A.rownum – 1
      LEFT JOIN CTE C ON C.rownum = A.rownum + 1

      B is then LAG and C is then LEAD, with nulls where LAG and LEAD cannot exist (end conditions)

      Reply
  • Kaushik,
    Have you seen

    Reply
  • Pinal,

    Can you the elaborate the same with any arbitrary expressions.

    Thanks
    Mahadev

    Reply
  • Is this possible to use aggregate functions while using Lead function

    Reply
  • Sunil Srivastav
    October 28, 2015 7:51 am

    Oracle Does have LEAD / LAG Analytical functions from begining.
    but what i love about oracle it multiversion ability.

    Reply
  • Narayan Prasad Dahal
    September 13, 2016 4:18 pm

    I have two tables

    tblA(sn, ID int pk, name varchar(50), amountA decimal(18,2))
    and

    tblB(ID int fk, amountB decimal(18,2))
    here: tblA occures only once and tblB may occure multiple time I need the query to display data like:

    sn | ID | name | AmountA | amountB | Balance
    1 | 1001 | abc | 5000.00 | | 5000.00
    2 | 1002 | xyz | 10000.00 | |
    | 1002 | | | 4000.00 | 6000.00 (AmountA-AmountB)
    3 | 1003 | pqr | 15000.00 | |
    | 1003 | | | 4000.00 |
    |1003 | | |3000.00 |
    |1003 | | |2000.00 | 6000.00 (AmountA-sum(AmountB))

    Please ask if any confusion I tried using lag and lead function but I couldnot get the desire result, Please help.

    Reply
  • Sujith Augustine
    January 4, 2017 1:09 pm

    I have a table in SQL 2008 with the below sample

    EmployeeID | Employee | PunchDateTime
    ——————|—————-|———————-

    11 | Sujith | 22/12/2016 08:16:00
    11 | Sujith | 22/12/2016 13:35:00
    11 | Sujith | 22/12/2016 17:23:00
    11 | Sujith | 22/12/2016 21:09:00
    12 | Tony | 22/12/2016 14:06:00
    12 | Tony | 22/12/2016 22:39:00
    13 | Jimmy | 22/12/2016 08:00:00
    13 | Jimmy | 22/12/2016 17:12:00
    14 | Nitha | 22/12/2016 18:50:00
    16 | Juby | 22/12/2016 09:00:00
    14 | Nitha | 23/12/2016 07:05:00
    16 | Juby | 23/12/2016 09:05:00

    I want to build a query to the below results:

    EmployeeID| Employee | DutydateTimeIn | DutyDateTimeOut | WorkingHours
    —————|————–|—————————|—————————|——————-

    11 | Sujith | 22/12/2016 08:16:00 | 22/12/2016 13:35:00 |05:19
    11 | Sujith | 22/12/2016 17:23:00 | 22/12/2016 21:09:00 |03:46
    12 | Tony | 22/12/2016 14:06:00 | 22/12/2016 22:39:00 |08:33
    13 | Jimmy | 22/12/2016 08:00:00 | 22/12/2016 17:12:00 |09:12
    14 | Nitha | 22/12/2016 18:50:00 | 23/12/2016 07:05:00 |12:15
    16 | Juby | 22/12/2016 09:00:00 | |
    16 | Juby | 23/12/2016 09:05:00 | |

    I have try a query, but it showing an error message

    Query
    ——–
    select EmployeeID, EmployeeName, in_time, out_time
    from(
    select Attendance.EmployeeID,
    EmployeeName,
    LogDate + ‘ ‘ + LogTime in_time,
    lead(LogDate + ‘ ‘ + LogTime) over(order by Attendance.EmployeeID) out_time,
    ROW_NUMBER() over(partition by Attendance.EmployeeID order by Attendance.EmployeeID) row_num
    from dbo.HR_AttendanceLog Attendance
    JOIN dbo.HR_Employee Employee ON Employee.EmployeeID = Attendance.EmployeeID) as cte
    where row_num % 2 != 0

    Error Message
    ——————
    Msg 195, Level 15, State 10, Line 6
    ‘lead’ is not a recognized built-in function name.

    Reply
  • Thanks

    Reply
  • Hello, is it possible to use lead and lag to gather last 15 days of data, to list or pull out, only those rows that have changed in the last 15 days? Or am I barking up the wrong tree, and there is something out there that would be better suited? Thank you.

    Reply
    • Also, the differences will be between two tables, so not sure if LEAD and LAG are specifically designed to be used on just the one table.

      Reply
  • Simply examples makes huge difference. Thx pinal. U rock always….:)

    Reply

Leave a Reply

Menu