SQL SERVER – Introduction to Hierarchical Query using a Recursive CTE – A Primer

SQL SERVER - Introduction to Hierarchical Query using a Recursive CTE - A Primer book2 This blog post is inspired from SQL Queries Joes 2 Pros: SQL Query Techniques For Microsoft SQL Server 2008 – SQL Exam Prep Series 70-433 – Volume 2.

[Amazon] | [Flipkart] | [Kindle] | [IndiaPlaza]


What is a Common Table Expression (CTE)

A CTE can be thought of as a temporary result set and are similar to a derived table in that it is not stored as an object and lasts only for the duration of the query. A CTE is generally considered to be more readable than a derived table and does not require the extra effort of declaring a Temp Table while providing the same benefits to the user. However; a CTE is more powerful than a derived table as it can also be self-referencing, or even referenced multiple times in the same query.

The basic syntax structure for a CTE is shown below:
WITH MyCTE
AS ( SELECT EmpID, FirstName, LastName, ManagerID
FROM Employee
WHERE ManagerID IS NULL )
SELECT *
FROM MyCTE

Building a Recursive CTE

In the following examples, you will learn how to harness the power of a recursive CTE query by fulfilling a common business requirement, retrieving hierarchical data. By the time the final query is complete you will be able to easily determine how many levels from the top executive each employee is.

A recursive CTE requires four elements in order to work properly.

  1. Anchor query (runs once and the results ‘seed’ the Recursive query)
  2. Recursive query (runs multiple times and is the criteria for the remaining results)
  3. UNION ALL statement to bind the Anchor and Recursive queries together.
  4. INNER JOIN statement to bind the Recursive query to the results of the CTE.

WITH MyCTE
AS ( SELECT EmpID, FirstName, LastName, ManagerID
FROM Employee
WHERE ManagerID IS NULL
UNION ALL
SELECT EmpID, FirstName, LastName, ManagerID
FROM Employee
INNER JOIN MyCTE ON Employee.ManagerID = MyCTE.EmpID
WHERE Employee.ManagerID IS NOT NULL )
SELECT *
FROM MyCTE

Identify the Anchor and Recursive Query

Anyone who does not have a boss is considered to be at the top level of the company and everyone who does have a boss either works for the person(s) at the top level (upper management), or the people that work for them (mid-management thru base employees).

For example, a CEO is at the top level and thus has a ManagerID of null. Likewise, everyone below the CEO will have a ManagerID. This is demonstrated in the two queries below:

SQL SERVER - Introduction to Hierarchical Query using a Recursive CTE - A Primer j2p-day2-image-1

The first SELECT statement will become your Anchor query as it will find the employee that has a ManagerID of null (representing Level 1 of the organization). The second SELECT statement will become your Recursive query and it will find all employees that do have a ManagerID (representing Level 2-3 of this organization).

As you can see from the results so far, these queries are unable to give hierarchical data on which level each employee is at within the organization.

Add the Anchor and Recursive query to a CTE

Begin transforming this entire query into a CTE by placing a UNION ALL statement between the Anchor and Recursive queries. Now add parentheses around the entire query, indenting it, moving it down, and adding the declaration WITH EmployeeList AS before the open parenthesis, and then add SELECT * FROM EmployeeList on the next line after the close parenthesis.

Your query should now look like the screenshot below:

SQL SERVER - Introduction to Hierarchical Query using a Recursive CTE - A Primer j2p-day2-image-2

As you can see, the results from your CTE are exactly the same as the results returned from running the anchor and Recursive queries simultaneously in the previous example.

Add an expression to track hierarchical level

The Anchor query (aliased as ‘Boss’) inside the CTE represents everyone at Level 1 (i.e. Sally Smith). The Recursive query (aliased as ‘Emp’) represents everyone at Levels 2 and 3. In order to visualize each level in a result set, you will need to add an expression field to each query.

Add the expression “1 AS EmpLevel” to the Anchor query and the expression “2 AS EmpLevel” to the Recursive query. Before executing the entire query, look closely at the expression field. The EmpLevel expressions in the Anchor query will hard-code the numeral 1 (for Sally Smith’s level), while the EmpLevel expressions in the Recursive query will hard-code the numeral 2 for everyone else.

Your query should now look like the screenshot below:

SQL SERVER - Introduction to Hierarchical Query using a Recursive CTE - A Primer j2p-day2-image-3

The two new expression fields were a helpful step. In fact, they show the correct EmpLevel information for Sally Smith and for the people at Level 2 (i.e., Adams, Bender, Brown, Kennson, Lonning and Osako). However, the 2 is just a hard-coded placeholder to help visualize your next step. Lisa Kendall and several other employees need to be at Level 3.

Ideally you would like to make the expression dynamic by replacing “2 AS EmpLevel” with the expression “EmpLevel + 1”.

Add a self-referencing INNER JOIN statement

Let’s take a moment and recognise why this is not going to work quite so simply. The idea to increment EmpLevel in the recursive query of the CTE is on the right track. Unfortunately, the recursive query is trying to reference a field called EmpLevel but can’t find one, since it has only been materialized in the result set of the Anchor query and does not yet exist in the recursive set.

How can you materialize the EmpLevel field for the recursive query? We can use the CTE for this! Remember, a recursive CTE requires an INNER JOIN to connect the recursive query to the CTE itself. Go ahead and write an INNER JOIN statement binding the recursive query ‘Emp’ to the CTE ‘EmployeeList AS EL’ ON Emp.ManagerID = EL.EmpID.

Your query should now look like the screenshot below:

SQL SERVER - Introduction to Hierarchical Query using a Recursive CTE - A Primer j2p-day2-image-4

Success! You can now see that Sally is at the first level, Alex is at the second level and Lisa appears at the third level. Since a CTE can reference itself, the ‘Emp’ recursive query can now access the EmpLevel field materialized in the EmployeeList CTE.


This blog post is inspired from SQL Queries Joes 2 Pros: SQL Query Techniques For Microsoft SQL Server 2008 – SQL Exam Prep Series 70-433 – Volume 2.

[Amazon] | [Flipkart] | [Kindle] | [IndiaPlaza]

Reference: Pinal Dave (https://blog.sqlauthority.com)

Joes 2 Pros
Previous Post
SQL SERVER – Introduction to SQL Server Security – A Primer
Next Post
SQL SERVER – Introduction to Basics of a Query Hint – A Primer

Related Posts

74 Comments. Leave new

  • Pinal Sir,
    Thank you so much for writing this article.
    You have explained it so easily.. Excellent.

    I have a question, we can replace recursive functions with recursive CTE, performance wise which is better to use ? Any thumb rule?

    Reply
  • Christer Jönsson
    August 27, 2013 2:12 pm

    Great article – thanks a lot for a very good help to understand recursive CTE!

    Reply
  • Finally! I have been looking a long time for a clear and thorough explanation of how CTE queries work. You’ve done a great job.

    In fact, you always do. I find your blog extremely helpful. Keep up the good work; you are such a valuable part of the community.

    Reply
  • Now is clear for me how recursive CTE works but can I write other way around, i.e. for an employer to find all managers directly or indirectly

    Reply
  • Hey thanks for this! It makes CTE’s a bit clearer than good old MS can figure out.

    Reply
  • I need one more thing. Essentially an indented BOM. How can I take the employee list above and sort it so that Each employee comes out immediately under their Boss??
    11 Sally
    1 Alex
    2 Barry
    3 Lee
    9 James
    10 Terry
    4 David
    6 Lisa
    8 John
    12 Barbara
    12 Phil

    This would be HUGE for me.

    Reply
  • how to find number of level under given manager
    Thanks

    Reply
  • simple way and good

    Reply
  • Great example, thank you.
    Here’s a query using the above concept to get an account’s hoerarchy (parent accounts above and child accounts below) from Dynamics CRM 2011. You need the recursive CTE twice because it doesn’t survive after the initial select. Wrap up in a stored procedure with the accountid as a parameter for ease of use :):

    declare @accountId uniqueidentifier,@anchorid uniqueidentifier
    set @accountid = ”

    –recursive cte for all accounts
    ;with parentAccs as
    (select accountid as anchorid,
    name as anchorName,
    accountid,parentaccountid ,name, 1 as accLevel,statecode
    from account
    where account.parentaccountid is null and statecode = 0
    union all
    select pacc.anchorid, pacc.anchorName,
    acc.accountid,acc.parentaccountid ,acc.name, pacc.accLevel + 1 as accLevel,acc.statecode
    from account acc
    join parentAccs pacc on acc.parentAccountId = pacc.AccountId
    where acc.parentaccountid is not null and acc.statecode = 0 )

    –get the anchor accountid

    select @anchorid = (select anchorid from parentAccs where accountid = @accountid)

    –recursive cte for all accounts – again
    ;with parentAccs as
    (select accountid as anchorid,
    name as anchorName,
    accountid,parentaccountid ,name, 1 as accLevel,statecode
    from account
    where account.parentaccountid is null and statecode = 0
    union all
    select pacc.anchorid, pacc.anchorName,
    acc.accountid,acc.parentaccountid ,acc.name, pacc.accLevel + 1 as accLevel,acc.statecode
    from account acc
    join parentAccs pacc on acc.parentAccountId = pacc.AccountId
    where acc.parentaccountid is not null and acc.statecode = 0 )

    –select the hierarchy
    select * from parentAccs where anchorid = @anchorid
    order by accLevel

    Reply
  • James Eduard Espitia Coronado
    September 10, 2014 8:15 am

    Hello Dave!

    Please first of all, forgive my English.

    For me have been really dificult to understand how recursive CTE works. Now I have several tables each one with a hierachyid structure and datatypes. For example, I have a Table of Companies because one company may be on charge to others companies (or subsidiaries). I have another table of areas or departaments this one has a hierarchyid structure too, each area or departament has a foreign key referencing the company (or subsidiary) to wich it belongs. I have another table of positions with a hierarchyid structure too, each position has a foreign key referencing the area or departament to wich it belongs and finally I have another table of employees with a hierachyid structure too, each employee has a foreigh key referencing the position to wich he belongs. How to build a query to draw all organization chart going over all tables in hierarchyid structure? I thing i have to use a CTE to go for eache node and its children from zero node to each node with no children. Is ther another easier way to do it keepping hierarcyid datatype an structure in SQL Server? Please help me how.

    Thanks for your help.

    Reply
  • Dear Pinal:
    For every google search I do when I have a SQL problem, if I see your name in the results I always go there first. This post in particular was a great help when I needed it the most. Thank you very much for your great examples and clear explanations! The world is a better place because of you!

    Reply
  • looking to answer following:
    I am an employee Kamlesh with empid = 101.
    I want to generate rows that shows my manager, her manager, and so on with level value.
    example:

    Kamlesh with empid = 101
    Tracy is Kamlesh’s manager with empid 90
    Karen is Tracy’s manager with empid 80
    John is Karen’s manager with empid 70

    so how do i query to produce following output?

    name, empid, level
    Kamlesh, 101, 0
    Tracy, 90, 1
    Karen, 80, 2
    John, 70, 3

    Reply
  • Thanks this helped a lot

    Reply
  • Thanks a lot

    Reply
  • I have similar kind of hierarchy but it can have cycles. How to prevent cycles in result?

    Reply
  • I have part table contains 3 columns -Part_No,Part_Predecessor,Part_Successor,
    How to find get all 3 columns data in a chain using recursive query?

    Reply
  • The Bossdrum
    May 29, 2015 8:14 pm

    Pinal, Thank you for sharing this method. It was very clear and easy to understand. Really I can’t thank you enough, the nightmare I had is now gone and I have my hierarchical tables under total control.

    Reply
  • hi
    how can i use with in in statement like this query

    select actions.COMMENT, actions.ACTIVEOBJECTID, actions.DATETIME, users.USERNAME
    from PDBNEWAUDITTRAIL_TABLE actions, PDBUSER users
    WHERE actions.ACTIONID = 321 AND CONVERT(DATETIME,actions.DATETIME,103) BETWEEN CONVERT(DATETIME,’29/07/2001′,103)
    AND CONVERT(DATETIME,’29/07/2015′,103)
    AND actions.ACTIVEOBJECTID in (

    WITH N (FOLDERINDEX)
    AS (
    SELECT FOLDERINDEX FROM PDBFOLDER WHERE FOLDERINDEX = 14
    UNION ALL SELECT np.FOLDERINDEX
    FROM PDBFOLDER AS np JOIN N ON N.FOLDERINDEX = np.PARENTFOLDERINDEX )
    SELECT FOLDERINDEX FROM N

    )

    Reply
  • Hello,

    In MySQL it’s giving error for “with”, to run this query in MySQL what i need to change??

    Thanks
    Rozina

    Reply
    • I got the query which gives me the parent and it’s child and grand child up to the unknown level (MySQL) within the same table.
      I have the table in which have the column as ID, FirstName, LastName, ParentUserId (which has the value from Id column).
      Lets have one example
      Id FirstName LastName ParentUserId
      1 Rozi Nakhwa Null
      2 Chetan Gaonkar 1
      3 Namrata Pillai 2
      4 Guru Matharu 3
      5 Amit Kumar 3
      6 Sumit Chavan Null
      7 Rohan Patkar 6
      when I am passing the Id 1 I want the hierarchy of only this Id, I dont want the the other parent s. Following is My SP which gives me the expected result.

      CREATE PROCEDURE `partner_hierarchy` (
      in id int unsigned
      )
      BEGIN

      declare v_done tinyint unsigned default 0;
      declare v_depth smallint unsigned default 0;

      create temporary table hier(
      parent_id smallint unsigned,
      user_id smallint unsigned,
      depth smallint unsigned default 0
      )engine = memory;

      insert into hier select ParentUserId, Id, v_depth from pp_users where Id=id;

      create temporary table tmp engine=memory select * from hier;

      while not v_done do

      if exists( select 1 from pp_users p inner join hier on p.ParentUserId = hier.user_id and hier.depth = v_depth) then

      insert into hier
      select p.ParentUserId, p.Id, v_depth + 1 from pp_users p
      inner join tmp on p.ParentUserId = tmp.user_id and tmp.depth = v_depth;

      set v_depth = v_depth + 1;

      truncate table tmp;
      insert into tmp select * from hier where depth = v_depth;

      else
      set v_done = 1;
      end if;

      end while;

      select distinct p.Id,
      concat(p.FirstName, ‘ ‘, p.Lastname) as UserName,
      p.ParentUserId as ParentUserId
      from
      hier
      inner join pp_users p on hier.user_id = p.Id;

      drop temporary table if exists hier;
      drop temporary table if exists tmp;

      END

      Reply
    • I have not worked with MySQL in depth. Sorry.

      Reply
  • Good explanation which helped me understanding an important feature of SQL Server.
    I have a question
    Is there a referential integrity in place between EmpID and ManagerID?

    If EmpID is the primary key then it is not possible for an employee to report more than two managers and to make a foreign key constraint on ManagerID to reference EmpID it is necessary to make EmpID a Primary Key column. I used the same approach on SQLite and there it works fine because there I can include columns in table definition which can be null. Here is the definition:
    CREATE TABLE [bcp_users] (
    [user_email] nvarchar(100) NOT NULL
    , [user_name] nvarchar(100) NULL
    , [manager_email] nvarchar(100) NULL
    , [dept_id] int NOT NULL
    , CONSTRAINT [sqlite_autoindex_bcp_users_1] PRIMARY KEY ([user_email],[manager_email],[dept_id])
    , FOREIGN KEY ([dept_id]) REFERENCES [departments] ([dept_id]) ON DELETE NO ACTION ON UPDATE NO ACTION
    , FOREIGN KEY ([manager_email]) REFERENCES [bcp_users] ([user_email]) ON DELETE NO ACTION ON UPDATE NO ACTION
    )

    Thanks & regards,

    Reply

Leave a Reply