Back to Blog

PostgreSQL CTEs and Recursive Queries: Tree Traversal, Hierarchical Data, and WITH Patterns

Master PostgreSQL Common Table Expressions (CTEs) and recursive queries. Covers WITH clauses, recursive tree traversal, organizational charts, bill of materials, path enumeration, and query optimization.

Viprasol Tech Team
March 15, 2027
13 min read

Hierarchical data is everywhere in SaaS products: comment threads, folder structures, organizational charts, product categories, task dependencies, bill of materials. PostgreSQL's WITH RECURSIVE lets you traverse these structures in pure SQL — no application-level loops, no multiple round-trips, no graph database required.

This guide covers CTE fundamentals, recursive query mechanics, the three patterns for hierarchical data storage, and the performance pitfalls to avoid.

CTE Fundamentals

A Common Table Expression (CTE) is a named subquery scoped to a single statement. Think of it as a temporary view that only exists during query execution.

-- Basic CTE: named subquery
WITH active_workspaces AS (
  SELECT id, name, plan
  FROM workspaces
  WHERE status = 'active'
    AND created_at > NOW() - INTERVAL '90 days'
),
workspace_member_counts AS (
  SELECT workspace_id, COUNT(*) AS member_count
  FROM workspace_members
  GROUP BY workspace_id
)
SELECT
  aw.name,
  aw.plan,
  COALESCE(wmc.member_count, 0) AS members
FROM active_workspaces aw
LEFT JOIN workspace_member_counts wmc ON wmc.workspace_id = aw.id
ORDER BY members DESC;

When to use CTEs vs subqueries:

  • Use CTEs when you reference the same subquery multiple times
  • Use CTEs to break complex queries into readable named steps
  • Use subqueries when PostgreSQL's optimizer needs to push predicates down (CTEs can be optimization fences in older PostgreSQL — this changed in PG 12+)
  • Use WITH MATERIALIZED to force CTE materialization when you want it evaluated once
-- Force materialization (evaluated once, not inlined)
WITH MATERIALIZED expensive_calculation AS (
  SELECT user_id, complex_aggregate_function(data)
  FROM large_table
  GROUP BY user_id
)
SELECT * FROM expensive_calculation WHERE user_id = $1;

-- Allow inlining (PostgreSQL 12+ default for non-recursive)
WITH NOT MATERIALIZED recent_users AS (
  SELECT * FROM users WHERE created_at > NOW() - INTERVAL '7 days'
)
SELECT * FROM recent_users WHERE status = 'active';

Hierarchical Data Storage Patterns

Pattern 1: Adjacency List (Parent ID)

The simplest approach: each row stores its parent's ID.

CREATE TABLE categories (
  id          BIGSERIAL PRIMARY KEY,
  parent_id   BIGINT REFERENCES categories(id) ON DELETE CASCADE,
  name        TEXT NOT NULL,
  slug        TEXT NOT NULL UNIQUE,
  sort_order  INTEGER NOT NULL DEFAULT 0,
  created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_categories_parent ON categories(parent_id);

-- Sample data
INSERT INTO categories (id, parent_id, name, slug) VALUES
  (1, NULL,  'Electronics',           'electronics'),
  (2, 1,     'Computers',             'computers'),
  (3, 1,     'Mobile Phones',         'mobile-phones'),
  (4, 2,     'Laptops',               'laptops'),
  (5, 2,     'Desktops',              'desktops'),
  (6, 3,     'Smartphones',           'smartphones'),
  (7, 3,     'Feature Phones',        'feature-phones'),
  (8, 4,     'Gaming Laptops',        'gaming-laptops'),
  (9, 4,     'Business Laptops',      'business-laptops');

Pros: Simple to understand, easy inserts/updates. Cons: Querying the full tree requires recursive SQL.

Pattern 2: Materialized Path (ltree)

Store the full path as a string: electronics.computers.laptops.

CREATE EXTENSION ltree;

CREATE TABLE categories_ltree (
  id        BIGSERIAL PRIMARY KEY,
  path      LTREE NOT NULL,
  name      TEXT NOT NULL
);

CREATE INDEX idx_categories_path_gist ON categories_ltree USING GIST(path);
CREATE INDEX idx_categories_path_btree ON categories_ltree USING BTREE(path);

INSERT INTO categories_ltree (path, name) VALUES
  ('electronics',                           'Electronics'),
  ('electronics.computers',                 'Computers'),
  ('electronics.computers.laptops',         'Laptops'),
  ('electronics.computers.laptops.gaming',  'Gaming Laptops');

-- Get all descendants of 'computers' — no recursion needed!
SELECT * FROM categories_ltree
WHERE path <@ 'electronics.computers';

-- Get direct children only
SELECT * FROM categories_ltree
WHERE path ~ 'electronics.computers.*{1}';

-- Get path from root to a node
SELECT * FROM categories_ltree
WHERE 'electronics.computers.laptops' @> path;

-- Get depth
SELECT nlevel(path) AS depth, name FROM categories_ltree;

Pros: Fast subtree queries without recursion, depth is trivial. Cons: Moving a subtree requires updating all descendant paths.

Pattern 3: Nested Sets (Modified Preorder Tree Traversal)

CREATE TABLE categories_nested (
  id      BIGSERIAL PRIMARY KEY,
  name    TEXT NOT NULL,
  lft     INTEGER NOT NULL,
  rgt     INTEGER NOT NULL,
  depth   INTEGER NOT NULL DEFAULT 0
);

CREATE INDEX idx_cat_nested_lft ON categories_nested(lft);
CREATE INDEX idx_cat_nested_rgt ON categories_nested(rgt);

-- Electronics: lft=1, rgt=18
-- Computers: lft=2, rgt=11 (children: lft 3–10)
-- A node and all its descendants: WHERE lft BETWEEN parent.lft AND parent.rgt

-- Get all descendants of Computers (lft=2, rgt=11)
SELECT * FROM categories_nested
WHERE lft BETWEEN 2 AND 11
ORDER BY lft;

-- Get ancestors (breadcrumb)
SELECT * FROM categories_nested n
JOIN categories_nested node ON node.lft BETWEEN n.lft AND n.rgt
WHERE node.id = 8  -- Gaming Laptops
ORDER BY n.lft;

Pros: Ultra-fast subtree and ancestor queries. Cons: Inserts are expensive (requires renumbering). Best for read-heavy, rarely-modified trees.

☁️ Is Your Cloud Costing Too Much?

Most teams overspend 30–40% on cloud — wrong instance types, no reserved pricing, bloated storage. We audit, right-size, and automate your infrastructure.

  • AWS, GCP, Azure certified engineers
  • Infrastructure as Code (Terraform, CDK)
  • Docker, Kubernetes, GitHub Actions CI/CD
  • Typical audit recovers $500–$3,000/month in savings

WITH RECURSIVE: Mechanics

-- Anatomy of a recursive CTE
WITH RECURSIVE cte_name AS (
  -- 1. Anchor member: starting rows (non-recursive)
  SELECT ...

  UNION ALL  -- (or UNION to deduplicate)

  -- 2. Recursive member: references cte_name
  -- Executes repeatedly until no new rows are added
  SELECT ... FROM cte_name JOIN ...
)
SELECT * FROM cte_name;

How PostgreSQL executes this:

  1. Evaluates the anchor member, stores results in a "working table"
  2. Evaluates the recursive member using the working table
  3. Appends results to the working table, replaces working table with new results
  4. Repeats until the recursive member returns zero rows
  5. Returns all accumulated results

Traversing the Adjacency List Tree

Get All Descendants

-- All descendants of category 'Computers' (id=2)
WITH RECURSIVE descendants AS (
  -- Anchor: start at the target node
  SELECT id, parent_id, name, slug, 0 AS depth
  FROM categories
  WHERE id = 2

  UNION ALL

  -- Recursive: find children of current level
  SELECT c.id, c.parent_id, c.name, c.slug, d.depth + 1
  FROM categories c
  INNER JOIN descendants d ON c.parent_id = d.id
)
SELECT * FROM descendants ORDER BY depth, name;

-- Result:
-- id=2  Computers      (depth 0)
-- id=4  Laptops        (depth 1)
-- id=5  Desktops       (depth 1)
-- id=8  Gaming Laptops (depth 2)
-- id=9  Business Lapt. (depth 2)

Get All Ancestors (Breadcrumb Path)

-- Full path from root to 'Gaming Laptops' (id=8)
WITH RECURSIVE ancestors AS (
  -- Anchor: start at the target node
  SELECT id, parent_id, name, 0 AS level
  FROM categories
  WHERE id = 8

  UNION ALL

  -- Recursive: go UP to parent
  SELECT c.id, c.parent_id, c.name, a.level + 1
  FROM categories c
  INNER JOIN ancestors a ON c.id = a.parent_id
)
SELECT * FROM ancestors ORDER BY level DESC;

-- Result (reversed for breadcrumb display):
-- Electronics → Computers → Laptops → Gaming Laptops

Build Full Path String

WITH RECURSIVE paths AS (
  SELECT id, parent_id, name,
         name AS path,
         ARRAY[id] AS id_path
  FROM categories
  WHERE parent_id IS NULL  -- Start at roots

  UNION ALL

  SELECT c.id, c.parent_id, c.name,
         p.path || ' > ' || c.name,
         p.id_path || c.id
  FROM categories c
  INNER JOIN paths p ON c.parent_id = p.id
)
SELECT id, name, path, id_path
FROM paths
ORDER BY path;

-- Result:
-- Electronics
-- Electronics > Computers
-- Electronics > Computers > Laptops
-- Electronics > Computers > Laptops > Gaming Laptops
-- etc.

⚙️ DevOps Done Right — Zero Downtime, Full Automation

Ship faster without breaking things. We build CI/CD pipelines, monitoring stacks, and auto-scaling infrastructure that your team can actually maintain.

  • Staging + production environments with feature flags
  • Automated security scanning in the pipeline
  • Uptime monitoring + alerting + runbook automation
  • On-call support handover docs included

Real-World Examples

Organizational Chart

CREATE TABLE employees (
  id           BIGSERIAL PRIMARY KEY,
  name         TEXT NOT NULL,
  title        TEXT NOT NULL,
  manager_id   BIGINT REFERENCES employees(id),
  department   TEXT,
  salary       NUMERIC(10,2)
);

-- Everyone who reports to CEO (id=1), at any depth
WITH RECURSIVE org_chart AS (
  SELECT id, name, title, manager_id, department, 0 AS depth,
         ARRAY[id] AS path
  FROM employees
  WHERE id = 1

  UNION ALL

  SELECT e.id, e.name, e.title, e.manager_id, e.department,
         oc.depth + 1,
         oc.path || e.id
  FROM employees e
  INNER JOIN org_chart oc ON e.manager_id = oc.id
  WHERE NOT e.id = ANY(oc.path)  -- Cycle detection!
)
SELECT
  repeat('  ', depth) || name AS indented_name,
  title,
  department,
  depth
FROM org_chart
ORDER BY path;

-- Total headcount and salary budget per manager
WITH RECURSIVE reports AS (
  SELECT id, name, manager_id, salary, id AS root_manager_id
  FROM employees

  UNION ALL

  SELECT e.id, e.name, e.manager_id, e.salary, r.root_manager_id
  FROM employees e
  INNER JOIN reports r ON e.manager_id = r.id
)
SELECT
  root_manager_id,
  m.name AS manager_name,
  COUNT(*) AS total_reports,
  SUM(salary) AS total_salary_budget
FROM reports r
JOIN employees m ON m.id = r.root_manager_id
GROUP BY root_manager_id, m.name
ORDER BY total_reports DESC;

Comment Threads (Reddit-Style)

CREATE TABLE comments (
  id         BIGSERIAL PRIMARY KEY,
  post_id    BIGINT NOT NULL,
  parent_id  BIGINT REFERENCES comments(id),
  author_id  BIGINT NOT NULL,
  body       TEXT NOT NULL,
  upvotes    INTEGER NOT NULL DEFAULT 0,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Fetch entire comment tree for a post, ordered by thread
WITH RECURSIVE comment_tree AS (
  -- Top-level comments
  SELECT id, parent_id, author_id, body, upvotes, created_at,
         0 AS depth,
         LPAD(id::text, 10, '0') AS sort_path
  FROM comments
  WHERE post_id = $1 AND parent_id IS NULL

  UNION ALL

  SELECT c.id, c.parent_id, c.author_id, c.body, c.upvotes, c.created_at,
         ct.depth + 1,
         ct.sort_path || '/' || LPAD(c.id::text, 10, '0')
  FROM comments c
  INNER JOIN comment_tree ct ON c.parent_id = ct.id
  WHERE ct.depth < 10  -- Limit nesting depth
)
SELECT
  id,
  depth,
  repeat('  ', depth) || body AS indented_body,
  upvotes,
  created_at
FROM comment_tree
ORDER BY sort_path;  -- Sort by thread path preserves reply order

Bill of Materials (Parts Explosion)

CREATE TABLE parts (
  id           BIGSERIAL PRIMARY KEY,
  name         TEXT NOT NULL,
  part_number  TEXT NOT NULL UNIQUE,
  unit_cost    NUMERIC(10,4) NOT NULL
);

CREATE TABLE bom (
  parent_part_id  BIGINT NOT NULL REFERENCES parts(id),
  child_part_id   BIGINT NOT NULL REFERENCES parts(id),
  quantity        NUMERIC(10,3) NOT NULL,
  PRIMARY KEY (parent_part_id, child_part_id)
);

-- Total cost explosion for a product (includes all sub-components)
WITH RECURSIVE cost_explosion AS (
  -- Anchor: the top-level product
  SELECT b.child_part_id, b.quantity, p.name, p.unit_cost,
         b.quantity AS cumulative_quantity
  FROM bom b
  JOIN parts p ON p.id = b.child_part_id
  WHERE b.parent_part_id = $1  -- product id

  UNION ALL

  -- Recursive: sub-components
  SELECT b.child_part_id, b.quantity, p.name, p.unit_cost,
         ce.cumulative_quantity * b.quantity
  FROM bom b
  JOIN parts p ON p.id = b.child_part_id
  JOIN cost_explosion ce ON b.parent_part_id = ce.child_part_id
)
SELECT
  name,
  SUM(cumulative_quantity) AS total_quantity,
  unit_cost,
  SUM(cumulative_quantity * unit_cost) AS total_cost
FROM cost_explosion
GROUP BY child_part_id, name, unit_cost
ORDER BY total_cost DESC;

Task Dependency Graph (Critical Path)

CREATE TABLE tasks (
  id       BIGSERIAL PRIMARY KEY,
  name     TEXT NOT NULL,
  duration INTEGER NOT NULL  -- days
);

CREATE TABLE task_dependencies (
  task_id      BIGINT NOT NULL REFERENCES tasks(id),
  depends_on   BIGINT NOT NULL REFERENCES tasks(id),
  PRIMARY KEY (task_id, depends_on)
);

-- Find longest path (critical path) through dependency graph
WITH RECURSIVE critical_path AS (
  -- Anchor: tasks with no dependencies (start nodes)
  SELECT t.id, t.name, t.duration, t.duration AS earliest_finish,
         ARRAY[t.id] AS path, FALSE AS has_cycle
  FROM tasks t
  WHERE NOT EXISTS (
    SELECT 1 FROM task_dependencies WHERE task_id = t.id
  )

  UNION ALL

  SELECT t.id, t.name, t.duration,
         cp.earliest_finish + t.duration,
         cp.path || t.id,
         t.id = ANY(cp.path)  -- Cycle detection
  FROM tasks t
  JOIN task_dependencies td ON td.task_id = t.id
  JOIN critical_path cp ON cp.id = td.depends_on
  WHERE NOT cp.has_cycle
)
SELECT
  path,
  name AS final_task,
  earliest_finish AS total_duration_days,
  array_length(path, 1) AS task_count
FROM critical_path
ORDER BY earliest_finish DESC
LIMIT 1;  -- The critical path is the longest

Performance Considerations

-- Prevent infinite loops with depth limit
WITH RECURSIVE tree AS (
  SELECT id, parent_id, 0 AS depth
  FROM nodes WHERE id = $1

  UNION ALL

  SELECT n.id, n.parent_id, t.depth + 1
  FROM nodes n
  JOIN tree t ON n.parent_id = t.id
  WHERE t.depth < 20  -- Hard limit prevents runaway recursion
)
SELECT * FROM tree;

-- Cycle detection with path array
WITH RECURSIVE tree AS (
  SELECT id, parent_id, ARRAY[id] AS visited
  FROM nodes WHERE id = $1

  UNION ALL

  SELECT n.id, n.parent_id, t.visited || n.id
  FROM nodes n
  JOIN tree t ON n.parent_id = t.id
  WHERE NOT n.id = ANY(t.visited)  -- Stop if cycle detected
)
SELECT * FROM tree;

-- Check for index usage with EXPLAIN
EXPLAIN (ANALYZE, BUFFERS)
WITH RECURSIVE ...

-- Critical index: parent_id must be indexed for recursive joins to be fast
CREATE INDEX idx_nodes_parent_id ON nodes(parent_id);
-- Without this index, each recursive step does a seq scan

When to Use Each Pattern

ScenarioBest PatternReason
Unlimited nesting, reads dominantltreeO(1) subtree queries, no recursion
Nesting rarely changes, fast readsNested setsSubtree = range scan
Frequent moves/insertsAdjacency list + WITH RECURSIVESimple updates, any-depth traversal
Depth ≤ 5, simpleAdjacency listMinimal complexity
Graph (not tree — multi-parent)Adjacency listOnly pattern supporting DAGs
Need breadcrumbs frequentlyltree or path-building CTEPath is stored/computed once

Cost and Timeline Estimates

ScopeTeamTimelineCost Range
Adjacency list schema + basic CTE queries1 dev0.5–1 day$200–500
Full hierarchical system (ltree + UI)1 dev2–3 days$600–1,200
Complex graph traversal (BOM, critical path)1–2 devs1 week$2,000–4,000
Migration from flat to hierarchical structure1–2 devs1–2 weeks$3,000–6,000

See Also


Working With Viprasol

Hierarchical data modelling is one of those database decisions that's easy to get wrong and expensive to migrate later. Our team has designed and queried tree structures in PostgreSQL for org charts, e-commerce categories, multi-level comment systems, and supply chain bill-of-materials — choosing the right pattern for each access profile.

What we deliver:

  • Schema design for your specific hierarchy use case
  • WITH RECURSIVE queries optimized with proper indexing
  • ltree migration for read-heavy trees
  • Cycle detection and depth-limiting for production safety
  • Query performance benchmarks across approaches

Talk to our team about your data modelling challenge →

Or explore our cloud infrastructure services.

Share this article:

About the Author

V

Viprasol Tech Team

Custom Software Development Specialists

The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 100+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement. Based in India, serving clients globally.

MT4/MT5 EA DevelopmentAI Agent SystemsSaaS DevelopmentAlgorithmic Trading

Need DevOps & Cloud Expertise?

Scale your infrastructure with confidence. AWS, GCP, Azure certified team.

Free consultation • No commitment • Response within 24 hours

Viprasol · Big Data & Analytics

Making sense of your data at scale?

Viprasol builds end-to-end big data analytics solutions — ETL pipelines, data warehouses on Snowflake or BigQuery, and self-service BI dashboards. One reliable source of truth for your entire organisation.