Hierarchical data is everywhere in SaaS products: comment threads, folder structures, organizational charts, product categories, task dependencies, bill of materials. PostgreSQL's WITH RECURSIVE lets you traverse these structures in pure SQL — no application-level loops, no multiple round-trips, no graph database required.

This guide covers CTE fundamentals, recursive query mechanics, the three patterns for hierarchical data storage, and the performance pitfalls to avoid.

CTE Fundamentals

A Common Table Expression (CTE) is a named subquery scoped to a single statement. Think of it as a temporary view that only exists during query execution.

-- Basic CTE: named subquery
WITH active_workspaces AS (
  SELECT id, name, plan
  FROM workspaces
  WHERE status = 'active'
    AND created_at > NOW() - INTERVAL '90 days'
),
workspace_member_counts AS (
  SELECT workspace_id, COUNT(*) AS member_count
  FROM workspace_members
  GROUP BY workspace_id
)
SELECT
  aw.name,
  aw.plan,
  COALESCE(wmc.member_count, 0) AS members
FROM active_workspaces aw
LEFT JOIN workspace_member_counts wmc ON wmc.workspace_id = aw.id
ORDER BY members DESC;

When to use CTEs vs subqueries:

Use CTEs when you reference the same subquery multiple times
Use CTEs to break complex queries into readable named steps
Use subqueries when PostgreSQL's optimizer needs to push predicates down (CTEs can be optimization fences in older PostgreSQL — this changed in PG 12+)
Use WITH MATERIALIZED to force CTE materialization when you want it evaluated once

-- Force materialization (evaluated once, not inlined)
WITH MATERIALIZED expensive_calculation AS (
  SELECT user_id, complex_aggregate_function(data)
  FROM large_table
  GROUP BY user_id
)
SELECT * FROM expensive_calculation WHERE user_id = $1;

-- Allow inlining (PostgreSQL 12+ default for non-recursive)
WITH NOT MATERIALIZED recent_users AS (
  SELECT * FROM users WHERE created_at > NOW() - INTERVAL '7 days'
)
SELECT * FROM recent_users WHERE status = 'active';

Hierarchical Data Storage Patterns

Pattern 1: Adjacency List (Parent ID)

The simplest approach: each row stores its parent's ID.

CREATE TABLE categories (
  id          BIGSERIAL PRIMARY KEY,
  parent_id   BIGINT REFERENCES categories(id) ON DELETE CASCADE,
  name        TEXT NOT NULL,
  slug        TEXT NOT NULL UNIQUE,
  sort_order  INTEGER NOT NULL DEFAULT 0,
  created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_categories_parent ON categories(parent_id);

-- Sample data
INSERT INTO categories (id, parent_id, name, slug) VALUES
  (1, NULL,  'Electronics',           'electronics'),
  (2, 1,     'Computers',             'computers'),
  (3, 1,     'Mobile Phones',         'mobile-phones'),
  (4, 2,     'Laptops',               'laptops'),
  (5, 2,     'Desktops',              'desktops'),
  (6, 3,     'Smartphones',           'smartphones'),
  (7, 3,     'Feature Phones',        'feature-phones'),
  (8, 4,     'Gaming Laptops',        'gaming-laptops'),
  (9, 4,     'Business Laptops',      'business-laptops');

Pros: Simple to understand, easy inserts/updates. Cons: Querying the full tree requires recursive SQL.

Pattern 2: Materialized Path (ltree)

Store the full path as a string: electronics.computers.laptops.

CREATE EXTENSION ltree;

CREATE TABLE categories_ltree (
  id        BIGSERIAL PRIMARY KEY,
  path      LTREE NOT NULL,
  name      TEXT NOT NULL
);

CREATE INDEX idx_categories_path_gist ON categories_ltree USING GIST(path);
CREATE INDEX idx_categories_path_btree ON categories_ltree USING BTREE(path);

INSERT INTO categories_ltree (path, name) VALUES
  ('electronics',                           'Electronics'),
  ('electronics.computers',                 'Computers'),
  ('electronics.computers.laptops',         'Laptops'),
  ('electronics.computers.laptops.gaming',  'Gaming Laptops');

-- Get all descendants of 'computers' — no recursion needed!
SELECT * FROM categories_ltree
WHERE path <@ 'electronics.computers';

-- Get direct children only
SELECT * FROM categories_ltree
WHERE path ~ 'electronics.computers.*{1}';

-- Get path from root to a node
SELECT * FROM categories_ltree
WHERE 'electronics.computers.laptops' @> path;

-- Get depth
SELECT nlevel(path) AS depth, name FROM categories_ltree;

Pros: Fast subtree queries without recursion, depth is trivial. Cons: Moving a subtree requires updating all descendant paths.

Pattern 3: Nested Sets (Modified Preorder Tree Traversal)

CREATE TABLE categories_nested (
  id      BIGSERIAL PRIMARY KEY,
  name    TEXT NOT NULL,
  lft     INTEGER NOT NULL,
  rgt     INTEGER NOT NULL,
  depth   INTEGER NOT NULL DEFAULT 0
);

CREATE INDEX idx_cat_nested_lft ON categories_nested(lft);
CREATE INDEX idx_cat_nested_rgt ON categories_nested(rgt);

-- Electronics: lft=1, rgt=18
-- Computers: lft=2, rgt=11 (children: lft 3–10)
-- A node and all its descendants: WHERE lft BETWEEN parent.lft AND parent.rgt

-- Get all descendants of Computers (lft=2, rgt=11)
SELECT * FROM categories_nested
WHERE lft BETWEEN 2 AND 11
ORDER BY lft;

-- Get ancestors (breadcrumb)
SELECT * FROM categories_nested n
JOIN categories_nested node ON node.lft BETWEEN n.lft AND n.rgt
WHERE node.id = 8  -- Gaming Laptops
ORDER BY n.lft;

Pros: Ultra-fast subtree and ancestor queries. Cons: Inserts are expensive (requires renumbering). Best for read-heavy, rarely-modified trees.

☁️ Is Your Cloud Costing Too Much?

Most teams overspend 30–40% on cloud — wrong instance types, no reserved pricing, bloated storage. We audit, right-size, and automate your infrastructure.

AWS, GCP, Azure certified engineers
Infrastructure as Code (Terraform, CDK)
Docker, Kubernetes, GitHub Actions CI/CD
Typical audit recovers $500–$3,000/month in savings

Get a Free Cloud Audit WhatsApp

WITH RECURSIVE: Mechanics

-- Anatomy of a recursive CTE
WITH RECURSIVE cte_name AS (
  -- 1. Anchor member: starting rows (non-recursive)
  SELECT ...

  UNION ALL  -- (or UNION to deduplicate)

  -- 2. Recursive member: references cte_name
  -- Executes repeatedly until no new rows are added
  SELECT ... FROM cte_name JOIN ...
)
SELECT * FROM cte_name;

How PostgreSQL executes this:

Evaluates the anchor member, stores results in a "working table"
Evaluates the recursive member using the working table
Appends results to the working table, replaces working table with new results
Repeats until the recursive member returns zero rows
Returns all accumulated results

Traversing the Adjacency List Tree

Get All Descendants

-- All descendants of category 'Computers' (id=2)
WITH RECURSIVE descendants AS (
  -- Anchor: start at the target node
  SELECT id, parent_id, name, slug, 0 AS depth
  FROM categories
  WHERE id = 2

  UNION ALL

  -- Recursive: find children of current level
  SELECT c.id, c.parent_id, c.name, c.slug, d.depth + 1
  FROM categories c
  INNER JOIN descendants d ON c.parent_id = d.id
)
SELECT * FROM descendants ORDER BY depth, name;

-- Result:
-- id=2  Computers      (depth 0)
-- id=4  Laptops        (depth 1)
-- id=5  Desktops       (depth 1)
-- id=8  Gaming Laptops (depth 2)
-- id=9  Business Lapt. (depth 2)

Get All Ancestors (Breadcrumb Path)

-- Full path from root to 'Gaming Laptops' (id=8)
WITH RECURSIVE ancestors AS (
  -- Anchor: start at the target node
  SELECT id, parent_id, name, 0 AS level
  FROM categories
  WHERE id = 8

  UNION ALL

  -- Recursive: go UP to parent
  SELECT c.id, c.parent_id, c.name, a.level + 1
  FROM categories c
  INNER JOIN ancestors a ON c.id = a.parent_id
)
SELECT * FROM ancestors ORDER BY level DESC;

-- Result (reversed for breadcrumb display):
-- Electronics → Computers → Laptops → Gaming Laptops

Build Full Path String

WITH RECURSIVE paths AS (
  SELECT id, parent_id, name,
         name AS path,
         ARRAY[id] AS id_path
  FROM categories
  WHERE parent_id IS NULL  -- Start at roots

  UNION ALL

  SELECT c.id, c.parent_id, c.name,
         p.path || ' > ' || c.name,
         p.id_path || c.id
  FROM categories c
  INNER JOIN paths p ON c.parent_id = p.id
)
SELECT id, name, path, id_path
FROM paths
ORDER BY path;

-- Result:
-- Electronics
-- Electronics > Computers
-- Electronics > Computers > Laptops
-- Electronics > Computers > Laptops > Gaming Laptops
-- etc.

⚙️ DevOps Done Right — Zero Downtime, Full Automation

Ship faster without breaking things. We build CI/CD pipelines, monitoring stacks, and auto-scaling infrastructure that your team can actually maintain.

Staging + production environments with feature flags
Automated security scanning in the pipeline
Uptime monitoring + alerting + runbook automation
On-call support handover docs included

Modernize My DevOps WhatsApp

Real-World Examples

Organizational Chart

CREATE TABLE employees (
  id           BIGSERIAL PRIMARY KEY,
  name         TEXT NOT NULL,
  title        TEXT NOT NULL,
  manager_id   BIGINT REFERENCES employees(id),
  department   TEXT,
  salary       NUMERIC(10,2)
);

-- Everyone who reports to CEO (id=1), at any depth
WITH RECURSIVE org_chart AS (
  SELECT id, name, title, manager_id, department, 0 AS depth,
         ARRAY[id] AS path
  FROM employees
  WHERE id = 1

  UNION ALL

  SELECT e.id, e.name, e.title, e.manager_id, e.department,
         oc.depth + 1,
         oc.path || e.id
  FROM employees e
  INNER JOIN org_chart oc ON e.manager_id = oc.id
  WHERE NOT e.id = ANY(oc.path)  -- Cycle detection!
)
SELECT
  repeat('  ', depth) || name AS indented_name,
  title,
  department,
  depth
FROM org_chart
ORDER BY path;

-- Total headcount and salary budget per manager
WITH RECURSIVE reports AS (
  SELECT id, name, manager_id, salary, id AS root_manager_id
  FROM employees

  UNION ALL

  SELECT e.id, e.name, e.manager_id, e.salary, r.root_manager_id
  FROM employees e
  INNER JOIN reports r ON e.manager_id = r.id
)
SELECT
  root_manager_id,
  m.name AS manager_name,
  COUNT(*) AS total_reports,
  SUM(salary) AS total_salary_budget
FROM reports r
JOIN employees m ON m.id = r.root_manager_id
GROUP BY root_manager_id, m.name
ORDER BY total_reports DESC;

Comment Threads (Reddit-Style)

CREATE TABLE comments (
  id         BIGSERIAL PRIMARY KEY,
  post_id    BIGINT NOT NULL,
  parent_id  BIGINT REFERENCES comments(id),
  author_id  BIGINT NOT NULL,
  body       TEXT NOT NULL,
  upvotes    INTEGER NOT NULL DEFAULT 0,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Fetch entire comment tree for a post, ordered by thread
WITH RECURSIVE comment_tree AS (
  -- Top-level comments
  SELECT id, parent_id, author_id, body, upvotes, created_at,
         0 AS depth,
         LPAD(id::text, 10, '0') AS sort_path
  FROM comments
  WHERE post_id = $1 AND parent_id IS NULL

  UNION ALL

  SELECT c.id, c.parent_id, c.author_id, c.body, c.upvotes, c.created_at,
         ct.depth + 1,
         ct.sort_path || '/' || LPAD(c.id::text, 10, '0')
  FROM comments c
  INNER JOIN comment_tree ct ON c.parent_id = ct.id
  WHERE ct.depth < 10  -- Limit nesting depth
)
SELECT
  id,
  depth,
  repeat('  ', depth) || body AS indented_body,
  upvotes,
  created_at
FROM comment_tree
ORDER BY sort_path;  -- Sort by thread path preserves reply order

Bill of Materials (Parts Explosion)

CREATE TABLE parts (
  id           BIGSERIAL PRIMARY KEY,
  name         TEXT NOT NULL,
  part_number  TEXT NOT NULL UNIQUE,
  unit_cost    NUMERIC(10,4) NOT NULL
);

CREATE TABLE bom (
  parent_part_id  BIGINT NOT NULL REFERENCES parts(id),
  child_part_id   BIGINT NOT NULL REFERENCES parts(id),
  quantity        NUMERIC(10,3) NOT NULL,
  PRIMARY KEY (parent_part_id, child_part_id)
);

-- Total cost explosion for a product (includes all sub-components)
WITH RECURSIVE cost_explosion AS (
  -- Anchor: the top-level product
  SELECT b.child_part_id, b.quantity, p.name, p.unit_cost,
         b.quantity AS cumulative_quantity
  FROM bom b
  JOIN parts p ON p.id = b.child_part_id
  WHERE b.parent_part_id = $1  -- product id

  UNION ALL

  -- Recursive: sub-components
  SELECT b.child_part_id, b.quantity, p.name, p.unit_cost,
         ce.cumulative_quantity * b.quantity
  FROM bom b
  JOIN parts p ON p.id = b.child_part_id
  JOIN cost_explosion ce ON b.parent_part_id = ce.child_part_id
)
SELECT
  name,
  SUM(cumulative_quantity) AS total_quantity,
  unit_cost,
  SUM(cumulative_quantity * unit_cost) AS total_cost
FROM cost_explosion
GROUP BY child_part_id, name, unit_cost
ORDER BY total_cost DESC;

Task Dependency Graph (Critical Path)

CREATE TABLE tasks (
  id       BIGSERIAL PRIMARY KEY,
  name     TEXT NOT NULL,
  duration INTEGER NOT NULL  -- days
);

CREATE TABLE task_dependencies (
  task_id      BIGINT NOT NULL REFERENCES tasks(id),
  depends_on   BIGINT NOT NULL REFERENCES tasks(id),
  PRIMARY KEY (task_id, depends_on)
);

-- Find longest path (critical path) through dependency graph
WITH RECURSIVE critical_path AS (
  -- Anchor: tasks with no dependencies (start nodes)
  SELECT t.id, t.name, t.duration, t.duration AS earliest_finish,
         ARRAY[t.id] AS path, FALSE AS has_cycle
  FROM tasks t
  WHERE NOT EXISTS (
    SELECT 1 FROM task_dependencies WHERE task_id = t.id
  )

  UNION ALL

  SELECT t.id, t.name, t.duration,
         cp.earliest_finish + t.duration,
         cp.path || t.id,
         t.id = ANY(cp.path)  -- Cycle detection
  FROM tasks t
  JOIN task_dependencies td ON td.task_id = t.id
  JOIN critical_path cp ON cp.id = td.depends_on
  WHERE NOT cp.has_cycle
)
SELECT
  path,
  name AS final_task,
  earliest_finish AS total_duration_days,
  array_length(path, 1) AS task_count
FROM critical_path
ORDER BY earliest_finish DESC
LIMIT 1;  -- The critical path is the longest

Performance Considerations

-- Prevent infinite loops with depth limit
WITH RECURSIVE tree AS (
  SELECT id, parent_id, 0 AS depth
  FROM nodes WHERE id = $1

  UNION ALL

  SELECT n.id, n.parent_id, t.depth + 1
  FROM nodes n
  JOIN tree t ON n.parent_id = t.id
  WHERE t.depth < 20  -- Hard limit prevents runaway recursion
)
SELECT * FROM tree;

-- Cycle detection with path array
WITH RECURSIVE tree AS (
  SELECT id, parent_id, ARRAY[id] AS visited
  FROM nodes WHERE id = $1

  UNION ALL

  SELECT n.id, n.parent_id, t.visited || n.id
  FROM nodes n
  JOIN tree t ON n.parent_id = t.id
  WHERE NOT n.id = ANY(t.visited)  -- Stop if cycle detected
)
SELECT * FROM tree;

-- Check for index usage with EXPLAIN
EXPLAIN (ANALYZE, BUFFERS)
WITH RECURSIVE ...

-- Critical index: parent_id must be indexed for recursive joins to be fast
CREATE INDEX idx_nodes_parent_id ON nodes(parent_id);
-- Without this index, each recursive step does a seq scan

When to Use Each Pattern

Scenario	Best Pattern	Reason
Unlimited nesting, reads dominant	ltree	O(1) subtree queries, no recursion
Nesting rarely changes, fast reads	Nested sets	Subtree = range scan
Frequent moves/inserts	Adjacency list + WITH RECURSIVE	Simple updates, any-depth traversal
Depth ≤ 5, simple	Adjacency list	Minimal complexity
Graph (not tree — multi-parent)	Adjacency list	Only pattern supporting DAGs
Need breadcrumbs frequently	ltree or path-building CTE	Path is stored/computed once

Cost and Timeline Estimates

Scope	Team	Timeline	Cost Range
Adjacency list schema + basic CTE queries	1 dev	0.5–1 day	$200–500
Full hierarchical system (ltree + UI)	1 dev	2–3 days	$600–1,200
Complex graph traversal (BOM, critical path)	1–2 devs	1 week	$2,000–4,000
Migration from flat to hierarchical structure	1–2 devs	1–2 weeks	$3,000–6,000

Working With Viprasol

Hierarchical data modelling is one of those database decisions that's easy to get wrong and expensive to migrate later. Our team has designed and queried tree structures in PostgreSQL for org charts, e-commerce categories, multi-level comment systems, and supply chain bill-of-materials — choosing the right pattern for each access profile.

What we deliver:

Schema design for your specific hierarchy use case
WITH RECURSIVE queries optimized with proper indexing
ltree migration for read-heavy trees
Cycle detection and depth-limiting for production safety
Query performance benchmarks across approaches

Talk to our team about your data modelling challenge →

Or explore our cloud infrastructure services.

PostgreSQL CTEs and Recursive Queries: Tree Traversal, Hierarchical Data, and WITH Patterns