SQL Server 2012 introduces new analytical function LEAD() and LAG(). These functions accesses data from a subsequent row (for lead) and previous row (for lag) in the same result set without the use of a self-join . It will be very difficult to explain this in words so I will attempt small example to explain you this function. Instead of creating new table, I will be using AdventureWorks sample database as most of the developer uses that for experiment.
Let us fun following query.
USE AdventureWorks
GO
SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty,
LEAD(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID
) LeadValue,
LAG(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID
) LagValue
FROM Sales.SalesOrderDetail s
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty
GO
Above query will give us following result.
When we look at above resultset it is very clear that LEAD function gives us value which is going to come in next line and LAG function gives us value which was encountered in previous line. If we have to generate the same result without using this function we will have to use self join. In future blog post we will see the same. Let us explore this function a bit more. This function not only provide previous or next line but it can also access any line before or after using offset.
Let us fun following query, where LEAD and LAG function accesses the row with offset of 2.
USE AdventureWorks
GO
SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty,
LEAD(SalesOrderDetailID,2) OVER (ORDER BY SalesOrderDetailID
) LeadValue,
LAG(SalesOrderDetailID,2) OVER (ORDER BY SalesOrderDetailID
) LagValue
FROM Sales.SalesOrderDetail s
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty
GO
Above query will give us following result.
You can see the LEAD and LAG functions now have interval of rows when they are returning results. As there is interval of two rows the first two rows in LEAD function and last two rows in LAG function will return NULL value. You can easily replace this NULL Value with any other default value by passing third parameter in LEAD and LAG function.
Let us fun following query.
USE AdventureWorks
GO
SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty,
LEAD(SalesOrderDetailID,2,0) OVER (ORDER BY SalesOrderDetailID
) LeadValue,
LAG(SalesOrderDetailID,2,0) OVER (ORDER BY SalesOrderDetailID
) LagValue
FROM Sales.SalesOrderDetail s
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty
GO
Above query will give us following result, where NULL are now replaced with value 0.
Just like any other analytic function we can easily partition this function as well. Let us see the use of PARTITION BY in this clause.
USE AdventureWorks
GO
SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty,
LEAD(SalesOrderDetailID) OVER (PARTITION BY SalesOrderID
ORDER BY SalesOrderDetailID
) LeadValue,
LAG(SalesOrderDetailID) OVER (PARTITION BY SalesOrderID
ORDER BY SalesOrderDetailID
) LagValue
FROM Sales.SalesOrderDetail s
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty
GO
Above query will give us following result, where now the data is partitioned by SalesOrderID and LEAD and LAG functions are returning the appropriate result in that window. As now there are smaller partition in my query, you will see higher presence of NULL.
In future blog post we will see how this functions are compared to SELF JOIN.
Reference: Pinal Dave (https://blog.sqlauthority.com)
45 Comments. Leave new
Error Message:
Msg 195, Level 15, State 10, Line 2
‘LEAD’ is not a recognized built-in function name.
Sorry this is the error I am getting when I am trying to execute the below query.
USE AdventureWorks2008R2
GO
SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty,
LEAD(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID
) LeadValue,
LAG(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID
) LagValue
FROM Sales.SalesOrderDetail s
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty
GO
Thanks
Anil
Anil
Seeing “USE AdventureWorks2008R2” makes me think you’re using SQL 2008R2. LEAD/LAG is only available in SQL 2012
Scott
I have server 2012 and it is still producing the same error for me…
Vary good
thinks
Very good article
very good article
Very easy and straight forward explanation Pinal :)
Cheers !
Kirti Sardana
BTW, Oracle introduced this exact functionality in 2004.
Excellent examples, very clear and understandable.
Thanks Pinal Dave. I was looking for something similar.
The lag-function is one those I use since many years in the statistical software SPSS – great that it is included in MS-SQL also now. Nevertheless I have problems using it to clean duplicates in my table “titles”. It consists of music titles and has a PK named “TID” (title-id), a column “IID” for the Artist (“Interpret” in German) and “Title”, which keeps the Name of the title as a string.
Some titles have different TID’s although they equal, some others differ only a little due to a wrong spelling (i.e. “Rock N Roll Train” and “Rock`N`Roll Train”).
SELECT IID, TID, Titel,
SUBSTRING(Titel, 1, 10) AS ShortTitleCurrent,
SUBSTRING(LAG(Titel) OVER(ORDER BY IID, Titel), 1, 10) AS ShortTitleLast,
IIF(IID=LAG(IID) OVER(ORDER BY IID, Titel) AND
SUBSTRING(Titel, 1, 10)=SUBSTRING(LAG(Titel) OVER(ORDER BY IID, Titel), 1, 10),
LAG(TID) OVER(ORDER BY IID, Titel), NULL) AS LAG_TID
FROM dbo.Titel
ORDER BY IID, Titel ;
This works fine, for all titles of every artist which are similar on the first 10 characters I get the correct “LAG_TID”. The problem begins when I try to save this LAG_TID in a column 2 “TID_NEU”. Because I can’t use “windowed functions” like LAG directly in UPDATE (as far as I know) I think I have to include a subquery. My idea to solve this (in a simplified version) looks like this (including the Partition-technique I just have learned in this article):
UPDATE Titel
SET TID_LAG =
(SELECT LAG(TID) OVER(PARTITION BY IID ORDER BY Titel) AS TID_LAG
FROM GEMA.dbo.Titel)
But SQL has another opinion: “Msg 512, Level 16, State 1, Line 7
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, , >= or when the subquery is used as an expression.
The statement has been terminated.”
Does anybody has a solution for me? Any hint would be appreciated!
Dirk
How can achieve same functionality with SQL Server 2008
I use a CTE (WITH statement) to build the set of data I want to work with, and in that I include a row_number with no partition, but ORDERED BY in the order I want.
I then
SELECT ….
FROM CTE A
LEFT JOIN CTE B ON B.rownum = A.rownum – 1
LEFT JOIN CTE C ON C.rownum = A.rownum + 1
B is then LAG and C is then LEAD, with nulls where LAG and LEAD cannot exist (end conditions)
Kaushik,
Have you seen
Incorrect syntax near ‘row_number’, expected ‘OVER’.
Pinal,
Can you the elaborate the same with any arbitrary expressions.
Thanks
Mahadev
Is this possible to use aggregate functions while using Lead function
Oracle Does have LEAD / LAG Analytical functions from begining.
but what i love about oracle it multiversion ability.
I have two tables
tblA(sn, ID int pk, name varchar(50), amountA decimal(18,2))
and
tblB(ID int fk, amountB decimal(18,2))
here: tblA occures only once and tblB may occure multiple time I need the query to display data like:
sn | ID | name | AmountA | amountB | Balance
1 | 1001 | abc | 5000.00 | | 5000.00
2 | 1002 | xyz | 10000.00 | |
| 1002 | | | 4000.00 | 6000.00 (AmountA-AmountB)
3 | 1003 | pqr | 15000.00 | |
| 1003 | | | 4000.00 |
|1003 | | |3000.00 |
|1003 | | |2000.00 | 6000.00 (AmountA-sum(AmountB))
Please ask if any confusion I tried using lag and lead function but I couldnot get the desire result, Please help.
I have a table in SQL 2008 with the below sample
EmployeeID | Employee | PunchDateTime
——————|—————-|———————-
11 | Sujith | 22/12/2016 08:16:00
11 | Sujith | 22/12/2016 13:35:00
11 | Sujith | 22/12/2016 17:23:00
11 | Sujith | 22/12/2016 21:09:00
12 | Tony | 22/12/2016 14:06:00
12 | Tony | 22/12/2016 22:39:00
13 | Jimmy | 22/12/2016 08:00:00
13 | Jimmy | 22/12/2016 17:12:00
14 | Nitha | 22/12/2016 18:50:00
16 | Juby | 22/12/2016 09:00:00
14 | Nitha | 23/12/2016 07:05:00
16 | Juby | 23/12/2016 09:05:00
I want to build a query to the below results:
EmployeeID| Employee | DutydateTimeIn | DutyDateTimeOut | WorkingHours
—————|————–|—————————|—————————|——————-
11 | Sujith | 22/12/2016 08:16:00 | 22/12/2016 13:35:00 |05:19
11 | Sujith | 22/12/2016 17:23:00 | 22/12/2016 21:09:00 |03:46
12 | Tony | 22/12/2016 14:06:00 | 22/12/2016 22:39:00 |08:33
13 | Jimmy | 22/12/2016 08:00:00 | 22/12/2016 17:12:00 |09:12
14 | Nitha | 22/12/2016 18:50:00 | 23/12/2016 07:05:00 |12:15
16 | Juby | 22/12/2016 09:00:00 | |
16 | Juby | 23/12/2016 09:05:00 | |
I have try a query, but it showing an error message
Query
——–
select EmployeeID, EmployeeName, in_time, out_time
from(
select Attendance.EmployeeID,
EmployeeName,
LogDate + ‘ ‘ + LogTime in_time,
lead(LogDate + ‘ ‘ + LogTime) over(order by Attendance.EmployeeID) out_time,
ROW_NUMBER() over(partition by Attendance.EmployeeID order by Attendance.EmployeeID) row_num
from dbo.HR_AttendanceLog Attendance
JOIN dbo.HR_Employee Employee ON Employee.EmployeeID = Attendance.EmployeeID) as cte
where row_num % 2 != 0
Error Message
——————
Msg 195, Level 15, State 10, Line 6
‘lead’ is not a recognized built-in function name.
Thanks
Hello, is it possible to use lead and lag to gather last 15 days of data, to list or pull out, only those rows that have changed in the last 15 days? Or am I barking up the wrong tree, and there is something out there that would be better suited? Thank you.
Also, the differences will be between two tables, so not sure if LEAD and LAG are specifically designed to be used on just the one table.
Simply examples makes huge difference. Thx pinal. U rock always….:)