Data Analytics with T-SQL: LEAD ()

This will be the first of a series of posts where I showcase some functions you can use in SQL Server to analyse your data. At the end of this post we will look at using the LEAD() function in a real world use case.

This week we will be checking out the LEAD() function.

What does it do?

LEAD() selects a value from a subsequent row in the same result set. You can specify the row by declaring an offset. This is useful to compare values in the current row with values in a following row.

The Syntax

LEAD ( scalar_expression [ ,offset ] , [ default ] )   
    OVER ( [ partition_by_clause ] order_by_clause )

Arguments

scalar_expression

The column you want to select values from.

offset

The subsequent row you want to select from the current row.

default

If there are no subsequent rows in the partition, you can declare a default value to be returned.

partition_by_clause

This allows you to separate the result set into partitions. If not specified, the entire result set is treated as a single group.

order_by_clause

This one’s pretty straightforward. This argument allows you to order the data according to your needs.

Example

Let’s see a quick demonstration of LEAD() in action.

First, below is the script to create the test table ‘Employees’, so you can try out this example in your local instance.

--Creating the table
CREATE TABLE Employees(
[Name] VARCHAR(50),
Gender VARCHAR(6),
Salary int
)

--Insert some records
INSERT INTO Employees VALUES ('Saman', 'Male', 1000)
INSERT INTO Employees VALUES ('Dostol', 'Male', 2000)
INSERT INTO Employees VALUES ('Eric', 'Male', 3000)
INSERT INTO Employees VALUES ('Chanuka', 'Male', 4000)
INSERT INTO Employees VALUES ('Amanda', 'Female', 5000)
INSERT INTO Employees VALUES ('Sandani', 'Female', 6000)
INSERT INTO Employees VALUES ('Hasini', 'Female', 7000)
INSERT INTO Employees VALUES ('Anthony', 'Male', 8000)
INSERT INTO Employees VALUES ('Hensiny', 'Female', 9000)
INSERT INTO Employees VALUES ('Rashmi', 'Female', 10000)

Run the below query against the created table.

SELECT [Name], Gender, Salary,
LEAD(Salary, 1, -1) OVER (ORDER BY Salary) AS [Lead]
FROM Employees

The LEAD() function derived the Lead column by grabbing the next subsequent row from the current row. Let’s change the offset value to 2 and see the results.

SELECT [Name], Gender, Salary,
LEAD(Salary, 2, -1) OVER (ORDER BY Salary) AS [Lead]
FROM Employees

The last two records have -1 in the Lead column as there are no subsequent rows that are two ahead.

Now let’s try partitioning the data on the Gender column and change the offset value back to 1.

SELECT [Name], Gender, Salary,
LEAD(Salary, 1, -1) OVER (PARTITION BY Gender ORDER BY Salary) AS [Lead]
FROM Employees

Notice how the Lead column values are now working in scope of the partitions placed on the Gender column.

We will now use LEAD() in a real world scenario.

Real World Use Case

Scenario

Adventure Works Cycles is a bicycle manufacturing company that has been in operation from 2011 to 2014. They create various products for the bicycle industry. You are tasked with finding out the profit of each of their products on a yearly basis from 2011 to 2014.

Solve the Issue

First you wrangle up the product names and the sales made in each year, then order it by product name.

	SELECT c.Name AS [Product Name], YEAR(b.OrderDate) AS [Year], SUM(CONVERT(int,a.LineTotal)) as [Current Year Sales]
	FROM sales.SalesOrderDetail AS a
	INNER JOIN sales.SalesOrderHeader AS b ON a.SalesOrderID = b.SalesOrderID
	INNER JOIN Production.Product AS c ON a.ProductID = c.ProductID
	GROUP BY c.Name, YEAR(b.OrderDate)
	ORDER BY C.Name

A snippet of the result is show below:

Now we want to compare each year of sales with the following year of sales and then afterwards calculate the difference.

To do this, we place the above query (removing the ORDER BY) in a Common Table Expression (CTE) and use the LEAD() function to partition the products together and have each partition ordered by the year.

WITH cte_ProductSalesByYear
AS
(
	SELECT c.Name AS [Product Name], SUM(CONVERT(int,a.LineTotal)) as [Current Year Sales], YEAR(b.OrderDate) AS [Year]
	FROM sales.SalesOrderDetail AS a
	INNER JOIN sales.SalesOrderHeader AS b ON a.SalesOrderID = b.SalesOrderID
	INNER JOIN Production.Product AS c ON a.ProductID = c.ProductID
	GROUP BY c.Name, YEAR(b.OrderDate)
)
SELECT [Product Name],[Year], [Current Year Sales], 
LEAD([Current Year Sales], 1, NULL) OVER(PARTITION BY [Product Name] ORDER BY [Year]) AS [Next Year Sales]
FROM cte_ProductSalesByYear

So far so good. However, we still need to calculate the difference between Current Year Sales and Next Year Sales in order to know the profit or loss between the years.

You will also notice that some products are missing in some years. This is simply because some products were not produced during those missing years.

We add a new derived column ‘Yearly Increase In Sales’, in the query to calculate the difference between the subsequent years :

WITH cte_ProductSalesByYear
AS
(
	SELECT c.Name AS [Product Name], YEAR(b.OrderDate) AS [Year], SUM(CONVERT(int,a.LineTotal)) as [Current Year Sales]
	FROM sales.SalesOrderDetail AS a
	INNER JOIN sales.SalesOrderHeader AS b ON a.SalesOrderID = b.SalesOrderID
	INNER JOIN Production.Product AS c ON a.ProductID = c.ProductID
	GROUP BY c.Name, YEAR(b.OrderDate)
)
SELECT [Product Name],[Year], [Current Year Sales], 
LEAD([Current Year Sales], 1, NULL) OVER(PARTITION BY [Product Name] ORDER BY [Year]) AS [Next Year Sales],
LEAD([Current Year Sales], 1, NULL) OVER(PARTITION BY [Product Name] ORDER BY [Year]) - [Current Year Sales]  AS [Yearly Increase In Sales]
FROM cte_ProductSalesByYear

We finally have arrived at the solution. Using the Yearly Increase In Sales column, we can analyse how well each product performed over the years, whether they made a profit or a loss.

Hopefully the LEAD() function becomes a handy addition to your data analysis toolbox!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s