[Note from Pinal]: In this episode of the Notes from the Field series database expert Kathi Kellenberger explains about T-SQL Window Functions. Kathi is an amazing instructor, she was the SQL author I have read in my early career. The reason, I love SQL Server because her writing has instigated love for this technology in me. Today she brings a relatively interesting topic for database experts. Read the experience of Kathi in her own words.
Microsoft introduced the first T-SQL window functions ten years ago with SQL Server 2005! They introduced more T-SQL window functions with SQL Server 2012. I’ve been on a mission to educate the SQL Server and developer communities about T-SQL window functions since then.
Feedback from my sessions tell me that ROW_NUMBER is the most used T-SQL window function. ROW_NUMBER returns a unique number for every row in the results. (NOTE: If the PARTITION BY option is used, it returns a unique number within each partition.)
Did you know that two other similar functions, RANK and DENSE_RANK, were introduced at the same time? RANK and DENSE_RANK are comparable to ROW_NUMBER, but they handle ties in the ORDER BY expression differently.
To demonstrate the differences, I have chosen a customer from AdventureWorks that placed two orders on the same day. These three functions require an ORDER BY expression in the OVER clause. This tells SQL Server in which order to apply the numbers generated by the window function. In this case, I am using OrderDate, which is not unique.
SELECT CustomerID, SalesOrderID, CAST(OrderDate AS DATE) AS OrderDate,
ROW_NUMBER() OVER(ORDER BY OrderDate) AS [ROW_NUMBER],
RANK() OVER(ORDER BY OrderDate) AS [RANK],
DENSE_RANK() OVER(ORDER BY OrderDate) AS [DENSE_RANK] FROM Sales.SalesOrderHeader
WHERE CustomerID = 11300;
Take a look at the partial results, starting where the numbers returned by the functions begin to get interesting.
The OrderDate values were unique until the 7th row, and all three functions return 7 on row 7. The 8th order has the same OrderDate as row 7 so it’s a tie. ROW_NUMBER doesn’t care; it continues to provide unique incrementing numbers. RANK and DENSE_RANK both return 7 again because both rows 7 and 8 are ranked the same.
Notice that on row 9, RANK “catches up” with ROW_NUMBER by skipping the value 8. RANK catches up with ROW_NUMBER every time once it’s past the tie. DENSE_RANK, on the other hand, does not. Instead of catching up, it returns the next possible value. DENSE_RANK doesn’t skip any numbers.
Take a look at row 9 again. ROW_NUMBER returns the position of the row. RANK returns the rank of the row based on its position. DENSE_RANK returns the logical rank: the rank over unique OrderDate values. The row with OrderDate 2913-11-14 is in the 9th position. It is ranked 9th over the set of rows. It is ranked 8th over the set of unique OrderDate values.
If you use a unique ORDER BY expression, all three functions will return the same values because there will be no ties.
These three functions are usually a step along the way to a more complex solution. Understanding how they work and how they differ will help you decide which one to use when faced with the choice.
If you would like to learn more about T-SQL window functions, be sure to check out my latest book Expert T-SQL Window Functions in SQL Server or my Pluralsight course T-SQL Window Functions.
If you want to get started with SQL Server with the help of experts, read more over at Fix Your SQL Server.
Reference: Pinal Dave (https://blog.sqlauthority.com)