Developer @ altsol · reasonablegraph.org
~30 years with PostgreSQL
Started with Informix — never looked back
Greece PostgreSQL Users Group Meetup #2
{u, v} means: u is connected to v.(u, v) = connection from u to v.Directed graph WITHOUT cycles.
Example:
E has two parents (C and D).
Every tree is a DAG,
but not every DAG is a tree.
Connected graph without cycles.
N vertices → exactly N - 1 edges.In databases → rooted tree:
A collection of trees.
In databases this means:
Traversal = the process by which we visit all nodes of a tree in a specific order.
| Strategy | Idea |
|---|---|
| Depth-first (DFS) | We go as deep as possible down one branch before backtracking. |
| Breadth-first (BFS) | We visit all nodes of the same level first. |
Up to here we have been talking mathematics.
In PostgreSQL the question becomes:
How do we store these relationships so that we can read quickly subtrees, ancestors and paths?
That is where the models come in:
ltree (materialized path)A table with 2 columns for the hierarchy:
id — primary keyparent_id — pointer to the parent (self-FK) id | parent_id | name
----+-----------+------
1 | NULL | A
2 | 1 | B
3 | 1 | C
4 | 2 | D
5 | 3 | E
6 | 3 | F
(6 rows)
Goal:
We want to traverse the tree level by level.
We would have to write one SELECT per level
The problem:
SELECT per level.SELECTs. 10 levels → 10 SELECTs.WITH RECURSIVEInstead of:
▶ root, children, children of children, …
we write:
▶ start from the root, and as long as you find children, keep going
This query implements the chain of SELECTs from the previous slide:
— the anchor (lines 2-5) = the first SELECT (WHERE id=1)
— the recursive term (lines 9-12) = each subsequent SELECT
The difference: instead of writing n SELECTs, PG runs the recursive term until it produces no new rows.
id | parent_id | name | depth
----+-----------+------+-------
1 | NULL | A | 0
2 | 1 | B | 1
3 | 1 | C | 1
4 | 2 | D | 2
5 | 3 | E | 2
6 | 3 | F | 2
(6 rows)
Explanatory — not production code.
A recursive function has:
These two pieces are exactly the two arms of a recursive CTE (anchor + recursive term).
CREATE FUNCTION traverse_tree(node_id BIGINT, depth INT DEFAULT 1)
RETURNS TABLE (id BIGINT, name TEXT, level INT)
LANGUAGE plpgsql AS $$
DECLARE child RECORD;
BEGIN
-- BASE: return the CURRENT node
RETURN QUERY
SELECT n.id, n.name, depth FROM node n WHERE n.id = node_id;
-- RECURSIVE: for each child, call YOURSELF
FOR child IN
SELECT n.id FROM node n WHERE n.parent_id = node_id ORDER BY n.name
LOOP
RETURN QUERY
SELECT * FROM traverse_tree(child.id, depth + 1);
END LOOP;
END;
$$;
SELECT * FROM traverse_tree(1);RECURSION STEP: Calls itself on line 15:
traverse_tree(child.id, depth + 1)
SEARCH clause (PG 14+) — what it doesSEARCH DEPTH FIRST | BREADTH FIRST — DFS or BFSBY <columns> — order among siblingsSET <name> — opaque sortable column for the ORDER BYSEARCH clause (PG 14+)PostgreSQL ≥14 generates the sort column automatically:
WITH RECURSIVE tree AS (
SELECT id, parent_id, name, 0 AS depth
FROM node
WHERE parent_id IS NULL
UNION ALL
SELECT n.id, n.parent_id, n.name, tree.depth + 1
FROM node n
JOIN tree ON n.parent_id = tree.id
)
SEARCH DEPTH FIRST BY name SET ordercol
SELECT id, parent_id, name, depth FROM tree
ORDER BY ordercol;→ standard WITH RECURSIVE query with two extra lines:
12: traversal type defined via ordercol
14: ordering via ORDER BY ordercol
→ Cleaner, more declarative code — instead of the manual path array you had to carry yourself before PG14.
PostgreSQL has no default recursion depth limit.
If the data has a cycle (broken parent_id, or actually a DAG instead of a tree), the recursive CTE runs forever — until statement_timeout or Out Of Memory.
CYCLE clause (PG 14+)BEGIN;
SET LOCAL statement_timeout = '5s'; -- fallback timeout
-- B (id=2) becomes child of D (id=4) → cycle B ⇄ D
UPDATE node SET parent_id = 4 WHERE id = 2;
WITH RECURSIVE descendants AS (
SELECT id, parent_id, name, 1 AS level FROM node WHERE id = 2
UNION ALL
SELECT n.id, n.parent_id, n.name, d.level + 1
FROM node n JOIN descendants d ON n.parent_id = d.id
)
CYCLE id SET is_cycle USING cycle_path
SELECT level, id, name, is_cycle, cycle_path FROM descendants;
ROLLBACK;→ standard WITH RECURSIVE query with one extra line:
13: CYCLE clause
- CYCLE id — which column identifies the node
- SET is_cycle — true on the row that closes the cycle
- USING cycle_path — array with the path (the column is populated for every row, not just cycle-closing ones)
→ Cycle-safe by construction — no manual visited[] array to carry.
CYCLE — result level | id | name | is_cycle | cycle_path
-------+----+------+----------+-----------------
1 | 2 | B | f | {(2)}
2 | 4 | D | f | {(2),(4)}
3 | 2 | B | t | {(2),(4),(2)}
(3 rows)
is_cycle — marks rows where the path closes a cyclecycle_path — records the visited path for every row (not just cycle-closing ones).All of the above is about reads. Writes are trivial:
→ No extra structure to keep in sync. parent_id is the single source of truth for the hierarchy.
✅ Pros
⚠️ Cons
In the next models (closure / ltree / nested sets) the trade-off
is reversed: cheaper reads, more expensive writes.
node (the nodes)| Column | Description |
|---|---|
id |
unique identifier of the node |
node can have additional payload columns (name, JSON, timestamps, …) — the model only cares about id.
node_closure (all ancestor–descendant pairs)| Column | Description |
|---|---|
ancestor_id |
ancestor node (or the node itself) |
descendant_id |
descendant node (or the node itself) |
depth |
distance in edges: 0 = self, 1 = child, 2 = grandchild… |
The table contains all (ancestor, descendant) pairs of the tree along with the self-pairs (depth = 0).
In this model the fact that a tree is at the same time a graph is highlighted.
CREATE TABLE node (
id BIGSERIAL PRIMARY KEY,
name TEXT NOT NULL
);
CREATE TABLE node_closure (
ancestor_id BIGINT NOT NULL REFERENCES node(id) ON DELETE CASCADE,
descendant_id BIGINT NOT NULL REFERENCES node(id) ON DELETE CASCADE,
depth INT NOT NULL CHECK (depth >= 0),
PRIMARY KEY (ancestor_id, descendant_id)
);
closure:
id | name
----+------
1 | A
2 | B
3 | C
4 | D
5 | E
6 | F
(6 rows)
node_closure:
ancestor_id | descendant_id | depth
-------------+---------------+-------
1 | 1 | 0
2 | 2 | 0
1 | 2 | 1
3 | 3 | 0
1 | 3 | 1
4 | 4 | 0
2 | 4 | 1
1 | 4 | 2
5 | 5 | 0
3 | 5 | 1
1 | 5 | 2
6 | 6 | 0
3 | 6 | 1
1 | 6 | 2
(14 rows)
The dashed arrows in the tree correspond to the rows of the node_closure table.
node_closure:
ancestor_id | descendant_id | depth
-------------+---------------+-------
A | A | 0
B | B | 0
A | B | 1
C | C | 0
A | C | 1
D | D | 0
B | D | 1
A | D | 2
E | E | 0
C | E | 1
A | E | 2
F | F | 0
C | F | 1
A | F | 2
(14 rows)
For each node we build a path from the root as an array
(all the ancestors, bydepth DESC)
name | depth
------+-------
A | 0
B | 1
D | 2
C | 1
E | 2
F | 2
(6 rows)
The same array that goes into the ORDER BY of the previous slide — here we see it as a column.
paths with id:
name | path
------+---------
A | {1}
B | {1,2}
D | {1,2,4}
C | {1,3}
E | {1,3,5}
F | {1,3,6}
(6 rows)
paths join with label:
name | path
------+---------
A | {A}
B | {A,B}
D | {A,B,D}
C | {A,C}
E | {A,C,E}
F | {A,C,F}
(6 rows)
G under B→
-- 1) The node itself
INSERT INTO node (id, name) VALUES (7, 'G');
-- 2) Self-row: every node is ancestor of itself (depth=0)
INSERT INTO node_closure VALUES (7, 7, 0);
-- 3) Inherit ALL the ancestors of the parent, with depth+1
INSERT INTO node_closure (ancestor_id, descendant_id, depth)
SELECT c.ancestor_id, 7, c.depth + 1
FROM node_closure c
WHERE c.descendant_id = 2; -- the parent (B)→ →
New rows in node:
id | name
----+------
7 | G
New rows in node_closure:
ancestor_id | descendant_id | depth
-------------+---------------+-------
7 | 7 | 0 -- self (G)
2 | 7 | 1 -- parent (B)
1 | 7 | 2 -- grandparent (A)
1 row in
node
3 rows innode_closure
( =depth(parent) + 1)
✅ Pros
depth is ready in the table⚠️ Cons
depth(parent)+1 closure rowsIn the next model (
ltree— materialized path) the hierarchy is stored as one column on the same table — a different reads/writes balance.
node (the whole hierarchy lives in the path column)| Column | Description |
|---|---|
id |
unique identifier of the node |
path |
type LTREE — the path root → node as labels with . |
node can have additional payload columns (name, JSON, timestamps, …)
— the model only cares about path (+ a stable label per node).
LTREE isltree, in contrib).'1.2.4', 'shop.electronics.phones'._ (no spaces, no special characters).@> (is ancestor of), <@ (is descendant of), ~ (lquery match), @ (ltxtquery).No extra structure — the hierarchy is one column on the same table.
ltree.node:
id | name | path
----+------+--------
1 | A | 1
2 | B | 1.2
3 | C | 1.3
4 | D | 1.2.4
5 | E | 1.3.5
6 | F | 1.3.6
(6 rows)
ltree.node:
id | name | path
----+------+--------
1 | A | A
2 | B | A.B
3 | C | A.C
4 | D | A.B.D
5 | E | A.C.E
6 | F | A.C.F
(6 rows)
name | level
------+-------
A | 0
B | 1
C | 1
D | 2
E | 2
F | 2
(6 rows)
No recursion —
<@(“descendant of”) is served by the GiST index.
The level comes out ofnlevel(path)— number of labels.
name | level
------+-------
A | 0
B | 1
D | 2
C | 1
E | 2
F | 2
(6 rows)
Pre-order is free: the path of the parent is a prefix of the child
→ a singleORDER BY pathis enough.
G under B→
→
New row in node:
id | name | path
----+------+--------
7 | G | 1.2.7
1 row in
node
( = independent of depth )
✅ Pros
ORDER BY path (free)@> / <@ (GiST)nlevel(path) → level is ready<@⚠️ Cons
ltree) — not portablepath on all descendants_)Compared to closure: the insert/move trade-off is reversed
(closure struggles on insert, ltree on move).
Compared to adjacency: cheaper reads (no recursive CTE),
but writes go beyond “one row” when the structure changes.
Each node gets 2 numbers from a DFS traversal:
lftrgtThe subtree of a node = the nodes whose lft is inside the interval of the parent:
→ C, E, F with one indexed range scan. Zero recursion, zero joins (apart from the self-join for the range).
⚠️ Catastrophic writes
lft/rgt values in the table shiftUPDATE of ~N/2 rowsConcurrency
🧨 Fragility
lft/rgt invariant breaks easily🚀 PG has better options
WITH RECURSIVE → adjacency readsltree → indexed subtree read without renumberingltree does exactly the job of Nested Sets, without any of the drawbacks.
Score on the read/write spectrum:
Adjacency = cheap writes, expensive reads ·
Nested Sets = the opposite extreme: maximum reads, catastrophic writes ·
ltree& closure sit in the middle.
| Criterion | Adjacency | ltree | Closure |
|---|---|---|---|
| Subtree read | recursive CTE (N iter.) | 1 indexed predicate (<@) |
1 indexed join |
| Leaf insert | 1 row | 1 row | depth+1 rows |
| Subtree move | 1 UPDATE (1 row) | 1 UPDATE (N paths) | N × depth edges |
| Cycle check on move | recursive CTE | O(1) with <@ |
O(1) PK lookup |
| Storage | O(N) | O(N) | O(N × depth) |
| DAG / multiple parents | ✗ | ✗ | ✓ |
| Integrity from the DB | ✓ (1 FK) | ✗ (denormalized string) | ✓ (FK edges) |
| Extension | — | ltree (contrib) |
— |
| Query ergonomics | medium | high | medium |
| Model | Time | Buffers (shared hit) | What the read path does |
|---|---|---|---|
| ltree | ~13 ms | 372 | Seq scan + hash joins; plain ORDER BY path |
| closure | ~35 ms | 868 | 2× aggregation over 46,812 node_closure rows |
| adjacency | ~39 ms | 1545 | 2× Recursive Union — scans node per level |
The buffers (8KB pages the query touched) are deterministic and
predict scaling — time is noisy.
| Model | Rows written | Buffers | What the write path does |
|---|---|---|---|
| adjacency | 1 | ~26 | one UPDATE of parent_id — the subtree follows |
| ltree | 1,793 | ~22,300 | rewrites path of every node + GiST entries |
| closure | 14,344 | ~97,700 | DELETE 5,379 + INSERT 8,965 edges + ~17,930 FK trg |
parent_id.subtree × depth edges + triggers.| Model | Read (pre-order) | Write (move_subtree) |
|---|---|---|
| adjacency | worst (1545 buffers) | best (1 row) |
| ltree | best (372 buffers) | medium (1,793 rows) |
| closure | medium (868 buffers) | worst (14,344 rows) |
No model wins everywhere — the cost simply moves around.
The central message of the whole presentation: you choose where to pay.
| You have… | Pick |
|---|---|
| Strict tree, simple keys, read-heavy | ltree — the good default in PG |
| DAG / multiple parents | closure — ltree cannot |
| Integrity from the DB (FKs on edges) | closure |
| Edge data (weights, dates, “primary parent”) | closure — path doesn’t fit it |
Keys outside [A-Za-z0-9_] (UUID, Unicode) |
closure — ltree labels are restricted |
| No extension allowed | closure or adjacency |
| Write-heavy, simple modeling | adjacency — 1 row writes |
| None of the above — it just runs | adjacency — the simplest, often enough |
The comparison is not just ltree-vs-closure. For many applications, the adjacency list with a recursive CTE is simply “enough” — zero extension, one FK, cheapest writes.
Greece PostgreSQL Users Group — open to anyone.