SQL SERVER – Making Recursive Parent-Child Queries Efficient

August 14, 2023

SQL SERVER - Making Recursive Parent-Child Queries Efficient manager-800x578 In the world of managing data, every query tells a story. During a recent Comprehensive Database Performance Health Check, we encountered a puzzle – a costly recursive query. This query had the task of finding employees who report to managers within a company’s structure. One interesting puzzle is finding out how many employees each manager is responsible for in a company. Let us see how we can Make Recursive Parent-Child Queries Efficient.

The Challenge – Parent-Child Queries

Imagine a company where employees work for different managers. Our task is to figure out how many employees each manager has. We need a special table to store important information about the employees to do this.

Creating the Employee Table

Let’s start by creating a table to hold all the information about employees. Here’s how it looks:

CREATE TABLE Employees (
EmployeeID INT,
Name VARCHAR(50),
ManagerID INT
);

INSERT INTO Employees (EmployeeID, Name, ManagerID)
VALUES
(1, 'John Smith', NULL),
(2, 'Jane Doe', 1),
(3, 'Mark Johnson', 1),
(4, 'Lisa Brown', 2),
(5, 'Mike Davis', 3),
(6, 'Sarah White', 4);

This table has three columns: EmployeeID (a special number for each employee), Name (the employee’s name), and ManagerID (which shows who their manager is).

Solving the Puzzle

Now, let’s use SQL to solve the puzzle. Take a look at this code:

SELECT
Manager.EmployeeID AS ManagerID,
Manager.Name AS ManagerName,
COUNT(DISTINCT Employee.EmployeeID) AS TotalEmployees
FROM Employees AS Manager
LEFT JOIN Employees AS Employee ON Manager.EmployeeID = Employee.ManagerID
GROUP BY Manager.EmployeeID, Manager.Name
ORDER BY Manager.EmployeeID;

How the Solution Works

1. We start by picking the EmployeeID and Name of each manager from the table.
2. Using something called a left join, we connect managers with their employees.
3. The COUNT function helps us count how many different EmployeeIDs are linked to each manager. This tells us the number of employees each manager has.
4. We organize our results by Manager’s EmployeeID and Name using GROUP BY.
5. Lastly, we arrange everything in order using ORDER BY.

The Result

When we run this special SQL code, we get a list. Each line in the list shows the Manager’s ID, their name, and the total number of employees they manage.

ManagerID   ManagerName                                        TotalEmployees
----------- -------------------------------------------------- --------------
1           John Smith                                         2
2           Jane Doe                                           1
3           Mark Johnson                                       1
4           Lisa Brown                                         1
5           Mike Davis                                         0
6           Sarah White                                        0

Conclusion

By using SQL, we’ve solved the puzzle of counting employees for each manager in a company. This not only helps us solve puzzles but also lets us understand how companies work. Now you know how data can help us uncover interesting facts about the real world!

Reference: Pinal Dave (https://blog.sqlauthority.com)

Challenge, Recursive, SQL Joins

MySQL – Identifying Unused Indexes in Databases

MySQL’s Index Condition Pushdown (ICP) Optimization

2 Comments. Leave new

Joe Celko
August 14, 2023 11:30 pm
I think this approach to tree structured data is wrong for RDBMS. Were supposed to be doing declarative, set oriented programming. This is procedural (traversal) and violates the principles of basic data modeling. Each column in a properly designed relational table will model a scalar atomic attribute of the entity represented by the row. This is straight old-fashioned Dr. Codd RDBMS but your approach confuses an attribute with another entity and relationship. My boss is not an attribute of me! We have a relationship called “subordination” that we are part of.
Over the decades, I’ve discussed the nested set model for such data. I found that it will work quite well, and I don’t have to write really elaborate constraints. For example, you didn’t show us how we can avoid allowing cycles with constraints in your sample data. With the nested set model, data integrity is automatically done with few minor check constraints.
The count of any particular employee’s subordinates will look something like this.
SELECT @in_boss_id, COUNT(*) AS subordinate_count
FROM Personnel AS Boss, Personnel AS Employees
WHERE Employees.lft BETWEEN Boss.lft AND Boss.rgt
AND Boss.emp_name = @in_boss_id;
Obviously, I flipped out a good bit of simple code here, but it’s easy enough to find with a quick Google.
Reply
- Pinal Dave
  August 15, 2023 4:33 am
  Thank you sir. This is a great help.
  Reply