PostgreSQL – Recursive CTEs
Common Table Expressions (CTEs) are a powerful feature in SQL that allows you to create temporary result sets that can be referenced within a SELECT, INSERT, UPDATE, or DELETE statement. Recursive CTEs extend this capability by allowing a CTE to reference itself, enabling the execution of recursive queries. Let us delve into understanding recursive cte’s.
1. What is a Recursive CTE?
A recursive CTE is a CTE that references itself, typically used to traverse hierarchical or tree-structured data. It consists of two parts:
- The anchor member is a non-recursive query.
- The recursive member is a recursive query that references the CTE itself.
The recursive query repeatedly applies the recursive member to its output until no more rows are returned.
1.1 Advantages of Recursive CTEs
- Simplicity and Readability: Recursive CTEs allow complex hierarchical and recursive queries to be written in a simple and readable manner, improving maintainability.
- Modularity: CTEs break down complex queries into modular parts, making them easier to understand and debug.
- Performance: Recursive CTEs can be more efficient than other methods for processing hierarchical data, such as self-joins or loops in application code.
- Flexibility: They provide a flexible way to perform complex recursive operations directly within SQL, without the need for procedural code.
- Temporary Storage: CTEs use temporary storage that is automatically managed, freeing developers from the overhead of managing temporary tables.
1.2 Use Cases of Recursive CTEs
- Hierarchical Data: Managing and querying hierarchical data such as organizational structures, file systems, and bills of materials.
- Graph Traversal: Performing graph traversal operations, such as finding all paths in a network, identifying connected components, or detecting cycles.
- Data Aggregation: Aggregating data at different levels of a hierarchy, such as calculating total sales for a company, departments, and individual employees.
- Generating Sequences: Creating sequences of numbers or dates, such as generating a Fibonacci series or producing a list of dates for a given range.
- Tree Structures: Working with tree structures, including querying and manipulating trees in decision support systems or game development.
2. Working Example
Let us dive into some practice implementation on the postgresql database.
2.1 Pre-requirement – Postgres Setup
Usually, setting up the database is a tedious step but with Docker, it is a simple process. You can watch the video available at this link to understand the Docker installation on Windows OS. Once done open the terminal and trigger the below command to set and run postgresql.
Docker commands
-- command to run Postgres on docker – -- remember to change the password -- docker run -d -p 5433:5432 -e POSTGRES_PASSWORD= --name postgres postgres -- command to stop the Postgres docker container -- docker stop postgres -- command to remove the Postgres docker container -- docker rm postgres
Remember to enter the password of your choice. If everything goes well the postgresql database server will be up and running on a port number – 5433
and you can connect with the Dbeaver GUI tool to connect to the server.
2.2 Example: Organization Hierarchy
Consider an organization hierarchy stored in a table employees
:
create table employees ( employee_id SERIAL primary key, employee_name VARCHAR(100), manager_id INTEGER references employees (employee_id) ); insert into employees ( employee_name, manager_id ) values ('Alice', null), ('Bob', 1), ('Charlie', 1), ('David', 2), ('Eve', 2), ('Frank', 3);
This table defines a simple hierarchy with Alice as the top-level manager. Bob and Charlie report to Alice, David and Eve report to Bob, and Frank reports to Charlie.
We can use a recursive CTE to list all employees in the hierarchy, starting from the top-level manager:
with recursive employee_hierarchy as ( -- Anchor member select employee_id, employee_name, manager_id, 1 as level from employees where manager_id is null union all -- Recursive member select e.employee_id, e.employee_name, e.manager_id, eh.level + 1 as level from employees e inner join employee_hierarchy eh on e.manager_id = eh.employee_id ) select employee_id, employee_name, manager_id, level from employee_hierarchy order by level, manager_id;
In this query:
- The anchor member selects the top-level manager(s) (those without a manager).
- The recursive member joins the employees’ table with the CTE to find subordinates of the current level employees, incrementing the level each time.
2.3 Example: Fibonacci Sequence
Recursive CTEs can also be used for generating sequences. For example, the Fibonacci sequence:
with recursive fibonacci as ( select 1 as n, 0 as fib1, 1 as fib2 union all select n + 1, fib2, fib1 + fib2 from fibonacci where n < 10 ) select n, fib1 as fibonacci_number from fibonacci;
In this query:
- The anchor member starts the sequence with the first Fibonacci number (0).
- The recursive member generates the next Fibonacci number by summing the last two numbers in the sequence.
3. Conclusion
Recursive CTEs in PostgreSQL are a powerful tool for working with hierarchical and recursive data structures. By understanding and leveraging recursive CTEs, you can perform complex queries that would otherwise require more intricate and less efficient solutions.