Back to Blog

PostgreSQL Advanced Indexing: Covering Indexes

Master advanced PostgreSQL indexing. Covers covering indexes with INCLUDE, expression indexes for computed values, index-only scans, partial indexes, index bloat monitoring and REINDEX CONCURRENTLY, and multi-column index column ordering.

Viprasol Tech Team
13 min read
Updated 2027

Quick answer. A covering index uses INCLUDE to store extra columns alongside the key, enabling index-only scans that return results without touching the table. Combined with expression indexes for computed values, partial indexes that index only queried rows, and regular bloat monitoring, these techniques cut query times by orders of magnitude at scale. Adding a B-tree index on a single column is just the beginning. Advanced PostgreSQL indexing — covering indexes that eliminate table lookups, expression indexes for computed values, partial indexes that index only the rows you query, and bloat monitoring to catch indexes that silently degrade — can transform query performance by orders of magnitude.

This guide goes beyond basic indexing to the techniques that make a real difference at scale.

1. Covering Indexes (INCLUDE)

A covering index stores additional columns alongside the index key, enabling index-only scans — the query fetches everything from the index without touching the table.

-- Query: fetch invoice amount and paid_at for a workspace
-- Without covering index:
EXPLAIN ANALYZE
SELECT id, amount, paid_at
FROM invoices
WHERE workspace_id = $1 AND status = 'paid'
ORDER BY paid_at DESC;

-- Execution plan:
-- Index Scan using idx_invoices_workspace on invoices
--   Filter: (status = 'paid')
--   Heap Fetches: 45231    ← reads 45K table pages to get amount and paid_at

-- WITH covering index: INCLUDE adds amount and paid_at to the index leaf pages
CREATE INDEX idx_invoices_workspace_paid
  ON invoices(workspace_id, status, paid_at DESC)
  INCLUDE (amount);
  -- Can also: INCLUDE (amount, currency, customer_id) for wider coverage

-- Now EXPLAIN shows:
-- Index Only Scan using idx_invoices_workspace_paid on invoices
--   Heap Fetches: 0    ← zero table reads!

-- When to use INCLUDE:
-- ✅ Query selects a few extra columns beyond the filter/sort columns
-- ✅ Table is large (row fetches are expensive)
-- ❌ Don't include columns that are updated frequently (index maintenance cost)
-- ❌ Don't include columns you ORDER BY or filter on (put those in key, not INCLUDE)

2. Expression Indexes

-- Case-insensitive email lookup
-- ❌ Without expression index: full seq scan (can't use idx on email for lower())
SELECT * FROM users WHERE lower(email) = lower($1);

-- ✅ Expression index: index the result of the expression
CREATE INDEX idx_users_email_lower ON users(lower(email));

-- Query must use the same expression to use the index:
SELECT * FROM users WHERE lower(email) = lower($1);  -- Uses index ✅
SELECT * FROM users WHERE email = $1;                 -- Doesn't use it ❌ (different expression)

-- Date-based partitioning without partitioned tables
CREATE INDEX idx_events_year_month ON events(
  EXTRACT(YEAR FROM created_at)::int,
  EXTRACT(MONTH FROM created_at)::int
);
SELECT * FROM events
WHERE EXTRACT(YEAR FROM created_at) = 2027
  AND EXTRACT(MONTH FROM created_at) = 4;

-- JSON field extraction
CREATE INDEX idx_metadata_user_id
  ON events((metadata->>'userId'));
-- Indexes the text value of the JSON field
SELECT * FROM events WHERE metadata->>'userId' = $1;

-- Computed string for full-text search
CREATE INDEX idx_products_search
  ON products USING GIN(
    to_tsvector('english', name || ' ' || coalesce(description, ''))
  );
-- Query:
SELECT * FROM products
WHERE to_tsvector('english', name || ' ' || coalesce(description, '')) @@ plainto_tsquery('english', $1);

☁️ Is Your Cloud Costing Too Much?

Most teams overspend 30–40% on cloud — wrong instance types, no reserved pricing, bloated storage. We audit, right-size, and automate your infrastructure.

  • AWS, GCP, Azure certified engineers
  • Infrastructure as Code (Terraform, CDK)
  • Docker, Kubernetes, GitHub Actions CI/CD
  • Typical audit recovers $500–$3,000/month in savings

3. Partial Indexes

-- Index only the rows you actually query — much smaller, faster to maintain

-- Active records only (most queries filter deleted_at IS NULL)
CREATE INDEX idx_projects_active
  ON projects(workspace_id, updated_at DESC)
  WHERE deleted_at IS NULL;
-- Index size: orders of magnitude smaller when most records are deleted

-- Unread notifications (active processing queue)
CREATE INDEX idx_notifications_unread
  ON notifications(user_id, created_at DESC)
  WHERE read_at IS NULL;
-- This index might be 1% the size of a full index

-- Pending jobs (hot path in job processing)
CREATE INDEX idx_jobs_pending
  ON background_jobs(scheduled_at ASC, priority DESC)
  WHERE status = 'pending';
-- Only indexes the small fraction of jobs not yet processed

-- Partial unique constraint: one active record per user per type
CREATE UNIQUE INDEX idx_subscriptions_active_unique
  ON subscriptions(workspace_id, plan)
  WHERE status = 'active' AND cancelled_at IS NULL;
-- Allows multiple cancelled subscriptions (historical records) per workspace
-- Enforces only one active subscription per workspace per plan

-- IMPORTANT: Query WHERE clause must match index WHERE clause
-- This uses the partial index:
SELECT * FROM projects WHERE workspace_id = $1 AND deleted_at IS NULL ORDER BY updated_at DESC;
-- This does NOT use it (missing deleted_at IS NULL):
SELECT * FROM projects WHERE workspace_id = $1 ORDER BY updated_at DESC;

4. Multi-Column Index Column Order

-- Rule: equality filters first, range/sort last

-- Query pattern: WHERE status = ? AND created_at > ? ORDER BY created_at DESC
-- ✅ Correct order: status (equality) before created_at (range + sort)
CREATE INDEX idx_invoices_status_created
  ON invoices(status, created_at DESC);

-- ❌ Wrong order: created_at first means status filter can't use the index efficiently
CREATE INDEX idx_invoices_created_status
  ON invoices(created_at DESC, status);  -- status lookup requires full index scan per date range

-- For workspace-scoped multi-tenant apps: workspace_id ALWAYS goes first
CREATE INDEX idx_projects_ws_status_created
  ON projects(workspace_id, status, created_at DESC)
  WHERE deleted_at IS NULL;
-- Supports: WHERE workspace_id = ? AND status = ? ORDER BY created_at DESC
-- Also supports: WHERE workspace_id = ? ORDER BY ... (leftmost prefix)
-- Does NOT support efficiently: WHERE status = ? (missing workspace_id)

-- The leading column rule: an index on (A, B, C) supports queries filtering on:
-- A           ✅
-- A, B        ✅
-- A, B, C     ✅
-- B           ❌ (no A prefix — not efficient)
-- B, C        ❌
-- A, C        ✅ but only A is selective; C may need re-check
PostgreSQL - PostgreSQL Advanced Indexing: Covering Indexes

⚙️ DevOps Done Right — Zero Downtime, Full Automation

Ship faster without breaking things. We build CI/CD pipelines, monitoring stacks, and auto-scaling infrastructure that your team can actually maintain.

  • Staging + production environments with feature flags
  • Automated security scanning in the pipeline
  • Uptime monitoring + alerting + runbook automation
  • On-call support handover docs included

5. Index Bloat Monitoring

-- Detect bloated indexes (dead index entries from UPDATE/DELETE)
-- Requires pgstattuple extension
CREATE EXTENSION IF NOT EXISTS pgstattuple;

SELECT
  i.relname                                AS index_name,
  pg_size_pretty(pg_relation_size(i.oid)) AS index_size,
  s.leaf_fragmentation                    AS bloat_pct
FROM
  pg_class t
  JOIN pg_index x ON t.oid = x.indrelid
  JOIN pg_class i ON i.oid = x.indexrelid
  CROSS JOIN LATERAL pgstatindex(i.relname) s
WHERE t.relkind = 'r'
  AND t.relname NOT LIKE 'pg_%'
  AND s.leaf_fragmentation > 30   -- More than 30% bloat
ORDER BY pg_relation_size(i.oid) DESC;

-- Simple bloat estimate (without pgstattuple — no extension needed)
SELECT
  schemaname,
  tablename,
  indexname,
  pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
  idx_scan,
  idx_tup_read,
  idx_tup_fetch
FROM pg_stat_user_indexes
WHERE pg_relation_size(indexrelid) > 10 * 1024 * 1024  -- Indexes > 10MB
  AND idx_scan < 100                                     -- Barely used
ORDER BY pg_relation_size(indexrelid) DESC;

6. REINDEX Without Downtime

-- REINDEX TABLE blocks all reads and writes — never use on production
-- REINDEX TABLE CONCURRENTLY: builds new index while allowing reads/writes

-- Rebuild a bloated index without locking
REINDEX INDEX CONCURRENTLY idx_invoices_workspace_paid;

-- Rebuild all indexes on a table
REINDEX TABLE CONCURRENTLY invoices;

-- Note: CONCURRENTLY takes longer and temporarily has twice the index storage
-- It also cannot run inside a transaction block

-- Alternative: CREATE INDEX CONCURRENTLY + DROP old index
-- This is safer because you can verify the new index before dropping the old one
CREATE INDEX CONCURRENTLY idx_invoices_workspace_paid_v2
  ON invoices(workspace_id, status, paid_at DESC) INCLUDE (amount);

-- Verify it works (check EXPLAIN uses the new index)
EXPLAIN SELECT id, amount FROM invoices WHERE workspace_id = $1 AND status = 'paid';

-- Drop the old bloated index
DROP INDEX CONCURRENTLY idx_invoices_workspace_paid;

-- Rename new to canonical name
ALTER INDEX idx_invoices_workspace_paid_v2 RENAME TO idx_invoices_workspace_paid;

7. Index Usage Statistics

-- Find the most-used indexes (worth protecting from bloat)
SELECT
  schemaname,
  tablename,
  indexname,
  idx_scan                                       AS times_used,
  pg_size_pretty(pg_relation_size(indexrelid))   AS size,
  idx_tup_read                                   AS tuples_read,
  idx_tup_fetch                                  AS tuples_fetched
FROM pg_stat_user_indexes
ORDER BY idx_scan DESC
LIMIT 20;

-- Find tables with high seq scan rate (likely missing indexes)
SELECT
  relname                                   AS table_name,
  seq_scan,
  idx_scan,
  n_live_tup                                AS live_rows,
  round(seq_scan::numeric / nullif(seq_scan + idx_scan, 0) * 100, 1) AS seq_scan_pct
FROM pg_stat_user_tables
WHERE seq_scan > 0 AND n_live_tup > 10000
ORDER BY seq_scan DESC
LIMIT 20;

-- Check if index-only scans are being used
-- High heap_fetches with low idx_scan suggests visibility map isn't up to date
-- Run VACUUM to fix
SELECT
  relname,
  indexrelname,
  idx_scan,
  idx_tup_fetch,
  idx_tup_read
FROM pg_stat_user_indexes
WHERE indexrelname LIKE '%workspace%'
ORDER BY idx_scan DESC;

8. Visibility Map and Index-Only Scans

-- Index-only scans require up-to-date visibility map
-- If heap_fetches is high, VACUUM more aggressively

SELECT
  relname,
  n_dead_tup,
  last_autovacuum,
  last_vacuum
FROM pg_stat_user_tables
WHERE relname = 'invoices';

-- Force VACUUM to update visibility map
VACUUM (ANALYZE, VERBOSE) invoices;

-- Check visibility map status
SELECT
  relname,
  all_visible,     -- Pages where all rows are visible to all transactions
  all_frozen       -- Pages with all rows frozen
FROM pg_class c
JOIN pg_visibility_map_summary(c.oid) v ON true
WHERE relname = 'invoices';
-- If all_visible is low, index-only scans will fall back to heap fetches

Indexing Decision Matrix

Query PatternIndex Type
WHERE col = ?B-tree on col
WHERE col = ? ORDER BY other DESCB-tree on (col, other DESC)
WHERE col IN (?, ?, ?)B-tree on col
WHERE col LIKE 'prefix%'B-tree on col
WHERE col LIKE '%suffix'GIN with pg_trgm
WHERE lower(col) = ?Expression: lower(col)
WHERE jsonb_col->>'key' = ?Expression: (jsonb_col->>'key')
WHERE col @@ tsqueryGIN on tsvector
WHERE array_col @> ARRAY[?]GIN on array
WHERE status = 'active' (most rows inactive)Partial WHERE status = 'active'
ORDER BY col on large tableB-tree on col with sort direction
WHERE a = ? AND b = ? — select c, dCovering: (a, b) INCLUDE (c, d)

Cost and Timeline

Indexing is inexpensive — the cost is analyst time and the disk space for index storage:

ActivityTimeStorage Overhead
Add a covering index (CONCURRENTLY)5–30 min (build time)10–50% of table size
Expression index5–20 minVaries
Partial index1–10 min (only indexes matching rows)1–50% of full index
REINDEX CONCURRENTLY (bloat fix)10 min – 2 hours2× index during build

You Might Also Like


Why Clients Trust Viprasol

The right index can turn a 30-second query into 30 milliseconds. Our team audits your query patterns using pg_stat_statements and EXPLAIN ANALYZE, identifies missing covering indexes, expression indexes for case-insensitive lookups, and partial indexes for your active-record query patterns — all applied CONCURRENTLY with zero production downtime.

What we deliver:

  • Covering index audit: identify queries with high heap_fetches that benefit from INCLUDE
  • Expression index setup for lower(email), JSONB fields, and computed columns
  • Partial index strategy for soft-deleted records and status-filtered hot paths
  • Index bloat scan + REINDEX CONCURRENTLY runbook
  • Column ordering review for all composite indexes (equality before range)

Talk to our team about your PostgreSQL indexing strategy →

Or explore our cloud and database services.

PostgreSQLIndexingPerformanceSQLDatabaseOptimizationTypeScript
Share this article:

About the Author

V

Viprasol Tech Team

Custom Software Development Specialists

The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 1000+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement.

MT4/MT5 EA DevelopmentAI Agent SystemsSaaS DevelopmentAlgorithmic Trading

Need DevOps & Cloud Expertise?

Scale your infrastructure with confidence. AWS, GCP, Azure certified team.

Free consultation • No commitment • Response within 24 hours

Viprasol · Big Data & Analytics

Making sense of your data at scale?

Viprasol builds end-to-end big data analytics solutions — ETL pipelines, data warehouses on Snowflake or BigQuery, and self-service BI dashboards. One reliable source of truth for your entire organisation.