Skip to main content

Recursive Queries in PostgreSQL

Published

At some point someone sent me a link to the PostgreSQL documentation describing how to use Common Table Expressions (CTEs) to perform recursive queries. The documentation is pretty good but it doesn’t really include any examples of what the data should look like. Here’s an example that uses some actual data.

Say you have a table with some data that looks like this:

CREATE TABLE public.participants (
    id int NOT NULL,
    name text NOT NULL,
    key text NOT NULL,
    parent_id int
);

-- define the constraints after because we are referencing ourselves
ALTER TABLE participants ADD CONSTRAINT participants_pk PRIMARY KEY (id);
ALTER TABLE participants ADD CONSTRAINT participants_participants_fk FOREIGN KEY (parent_id) REFERENCES participants (id);

INSERT INTO participants (id, name, key, parent_id) VALUES (1, 'K20', 'k20', NULL);
INSERT INTO participants (id, name, key, parent_id) VALUES (2, 'CWU', 'k20-cwu', 1);
INSERT INTO participants (id, name, key, parent_id) VALUES (3, 'CWU Residence Halls', 'k20-cwu-res', 2);
INSERT INTO participants (id, name, key, parent_id) VALUES (4, 'UW', 'uw', NULL);

We have four entries in our table, two of which are at the “top level”. You know that they are at the top level because they do not have a value for parent_id. Our two top level rows are: K20, UW. Our first child is “CWU” which is a child of “K20” and then “CWU” has its own child called “CWU Residence Halls”.

So I want to run a query where the children would be the next row after their parent and also tell me how deep into the tree I am. We also want to generate unique identifiers for each row based on the keys. This can be done with a simple recursive query:

WITH RECURSIVE nodes AS (
    -- non-recursive term
    SELECT p1.id, p1.name, p1.key, NULL::integer AS parent_id, NULL::varchar AS parent_name, NULL::varchar AS parent_key, p1.key AS order_by, 1 AS level
    FROM public.participants p1
    WHERE p1.parent_id IS NULL
    UNION
    -- recursive term
    SELECT p2.id, p2.name, p2.key, n.id, n.name, n.key, n.order_by||'-'||p2.key AS order_by, n.level + 1 AS level
    FROM nodes n, public.participants p2
    WHERE p2.parent_id = n.id
)
SELECT * FROM nodes
ORDER BY order_by ASC

Running this query gives me all of the results that I need:

 id |        name         |     key     | parent_id | parent_name | parent_key |        order_by         | level
----+---------------------+-------------+-----------+-------------+------------+-------------------------+-------
  1 | K20                 | k20         |           |             |            | k20                     |     1
  2 | CWU                 | k20-cwu     |         1 | K20         | k20        | k20-k20-cwu             |     2
  3 | CWU Residence Halls | k20-cwu-res |         2 | CWU         | k20-cwu    | k20-k20-cwu-k20-cwu-res |     3
  4 | UW                  | uw          |           |             |            | uw                      |     1