
PostgreSQL Performance Tuning: EXPLAIN ANALYZE, Index Types, Vacuum, and pg_stat

Deep-dive PostgreSQL performance tuning guide. Read EXPLAIN ANALYZE output, choose the right index type (B-tree, GIN, BRIN, partial), configure autovacuum, and use pg_stat views to find bottlenecks.

Viprasol Tech Team
March 10, 2027
14 min read

PostgreSQL performance problems follow a predictable pattern: the application works fine at launch, slows down as data grows, and becomes a crisis when the team has no idea where to look. The good news is that PostgreSQL's tooling for diagnosing problems is excellent — EXPLAIN ANALYZE, pg_stat_statements, pg_stat_user_tables, and pg_stat_user_indexes give you everything you need to find and fix bottlenecks systematically.

This guide covers reading query plans, choosing indexes, and tuning autovacuum — the three areas that solve 90% of PostgreSQL performance problems.

Reading EXPLAIN ANALYZE Output

EXPLAIN shows the query plan. EXPLAIN ANALYZE executes the query and shows actual vs. estimated row counts and timing.
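One caution before profiling writes: because ANALYZE actually executes the statement, running it on an INSERT, UPDATE, or DELETE changes data. Wrap such statements in a transaction you roll back:

```sql
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
UPDATE orders SET status = 'shipped' WHERE id = 42;
ROLLBACK;  -- the UPDATE ran (so timings are real), but its changes are discarded
```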

-- Always use ANALYZE + BUFFERS together
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT
  u.id,
  u.email,
  COUNT(o.id) AS order_count,
  SUM(o.total_amount) AS total_spent
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2027-01-01'
  AND u.status = 'active'
GROUP BY u.id, u.email
ORDER BY total_spent DESC
LIMIT 20;

Sample output:

Limit  (cost=4521.32..4521.37 rows=20 width=52)
       (actual time=312.455..312.461 rows=20 loops=1)
  Buffers: shared hit=1842 read=3201
  ->  Sort  (cost=4521.32..4534.82 rows=5400 width=52)
            (actual time=312.453..312.455 rows=20 loops=1)
        Sort Key: (sum(o.total_amount)) DESC
        Sort Method: top-N heapsort  Memory: 27kB
        Buffers: shared hit=1842 read=3201
        ->  HashAggregate  (cost=4194.00..4302.00 rows=5400 width=52)
                           (actual time=308.211..311.994 rows=5382 loops=1)
              Group Key: u.id
              Batches: 1  Memory Usage: 1553kB
              Buffers: shared hit=1842 read=3201
              ->  Hash Left Join  (cost=1823.00..3897.00 rows=59400 width=36)
                                  (actual time=45.122..289.452 rows=58941 loops=1)
                    Hash Cond: (o.user_id = u.id)
                    Buffers: shared hit=1842 read=3201
                    ->  Seq Scan on orders o  (cost=0.00..1823.00 rows=82300 width=16)
                                              (actual time=0.021..89.344 rows=82300 loops=1)
                          Buffers: shared hit=842 read=1201
                    ->  Hash  (cost=1598.00..1598.00 rows=18000 width=28)
                               (actual time=44.918..44.919 rows=18000 loops=1)
                          Buckets: 32768  Batches: 1  Memory Usage: 1130kB
                          Buffers: shared hit=1000 read=2000
                          ->  Seq Scan on users u  (cost=0.00..1598.00 rows=18000 width=28)
                                                   (actual time=0.015..39.213 rows=18000 loops=1)
                                Filter: ((status = 'active') AND (created_at > '2027-01-01'))
                                Rows Removed by Filter: 27000
                                Buffers: shared hit=1000 read=2000
Planning Time: 1.234 ms
Execution Time: 312.761 ms

What to look for:

| Signal | Meaning | Action |
| --- | --- | --- |
| Seq Scan on a large table | No usable index | Add an index on the filter columns |
| High Rows Removed by Filter | Index exists but is not selective enough | Partial or composite index |
| Estimated rows far from actual rows | Stale statistics | Run ANALYZE on the table |
| Buffers: read=N dominates hit=N | Cache misses (reading from disk) | Increase shared_buffers, or add an index to read fewer rows |
| loops=N on an inner plan | Nested loop with many iterations | Add an index on the join key so the planner can pick a better join |
| Sort Method: external merge Disk | Sort spilling to disk | Increase work_mem for the session |
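Applied to the plan above, the users Seq Scan (27,000 of 45,000 rows removed by filter) is the obvious target. A sketch of one possible fix; the index name is illustrative:

```sql
-- Partial index matching the query's predicate: only active users,
-- with created_at as the key so the range condition can use it.
-- CONCURRENTLY avoids blocking writes during the build.
CREATE INDEX CONCURRENTLY idx_users_active_created
  ON users (created_at)
  WHERE status = 'active';

-- Refresh statistics, then re-run the EXPLAIN to confirm the Seq Scan
-- on users became an Index Scan (or Bitmap Heap Scan) with fewer buffers read.
ANALYZE users;
```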

Index Types

B-tree (Default)

B-tree indexes support =, <, >, <=, >=, BETWEEN, IN, IS NULL, and anchored prefix searches (LIKE 'prefix%' uses the index only under the C collation or with a text_pattern_ops operator class).

-- Standard B-tree
CREATE INDEX idx_orders_user_id ON orders(user_id);

-- Composite B-tree — column order matters!
-- Put equality columns first, range column last
CREATE INDEX idx_orders_user_status_created ON orders(user_id, status, created_at DESC);
-- Supports: WHERE user_id = X AND status = Y AND created_at > Z
-- Does NOT support: WHERE status = Y (without user_id)

-- Covering index (INCLUDE avoids table heap fetch)
CREATE INDEX idx_orders_user_covering
  ON orders(user_id)
  INCLUDE (status, total_amount, created_at);
-- SELECT status, total_amount, created_at WHERE user_id = X
-- can be satisfied from index alone (index-only scan)
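Worth noting: index-only scans also depend on the visibility map. If many heap pages are not yet marked all-visible, PostgreSQL falls back to heap fetches even with a covering index, so vacuum matters here too:

```sql
-- After bulk writes, refresh the visibility map so index-only scans work
VACUUM orders;
-- In EXPLAIN ANALYZE, look for "Index Only Scan" with a low "Heap Fetches: N"
```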

Partial Index

Index a subset of rows — much smaller, faster to maintain:

-- Only index active subscriptions (most queries filter on active)
CREATE INDEX idx_subscriptions_active
  ON subscriptions(workspace_id, plan)
  WHERE status = 'active';

-- Only index recent orders
CREATE INDEX idx_orders_recent
  ON orders(user_id, created_at DESC)
  WHERE created_at > '2026-01-01';

-- Only index pending jobs
CREATE INDEX idx_jobs_pending
  ON background_jobs(created_at, priority)
  WHERE status = 'pending';
-- Tiny index: only unprocessed rows, not millions of completed ones

GIN Index (Generalized Inverted Index)

For full-text search, JSONB, and arrays:

-- Full-text search
CREATE INDEX idx_articles_fts ON articles
  USING GIN(to_tsvector('english', title || ' ' || body));

-- JSONB containment (@>) and existence (?)
CREATE INDEX idx_users_metadata ON users USING GIN(metadata);
-- Supports: WHERE metadata @> '{"plan": "pro"}'
-- Supports: WHERE metadata ? 'custom_field'

-- Array containment
CREATE INDEX idx_posts_tags ON posts USING GIN(tags);
-- Supports: WHERE tags @> ARRAY['postgresql']
-- Supports: WHERE tags && ARRAY['postgresql', 'mysql']

-- pg_trgm for ILIKE and fuzzy search
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX idx_users_email_trgm ON users USING GIN(email gin_trgm_ops);
-- Supports: WHERE email ILIKE '%smith%'
-- Supports: similarity(email, 'johm') > 0.3

BRIN Index (Block Range INdex)

For naturally-ordered append-only data (timestamps, serial IDs):

-- Often orders of magnitude smaller than a B-tree for time-series data
-- Good for: WHERE created_at > '2027-01-01' on a 100M row table
CREATE INDEX idx_events_created_brin
  ON events USING BRIN(created_at)
  WITH (pages_per_range = 128);

-- BRIN works because new rows always have later timestamps
-- Row order in heap must correlate with index column
-- NOT suitable for random UUIDs or frequently updated columns
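BRIN only pays off when the physical row order tracks the indexed column. The planner's view of that relationship is visible as the correlation statistic in pg_stats (values near 1.0 or -1.0 mean well-ordered; near 0 means scattered):

```sql
-- correlation near ±1.0: BRIN will prune block ranges effectively
-- correlation near 0: rows are scattered in the heap; use a B-tree instead
SELECT tablename, attname, correlation
FROM pg_stats
WHERE tablename = 'events' AND attname = 'created_at';
```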

Expression Index

Index the result of an expression:

-- Case-insensitive email lookup
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
-- Supports: WHERE LOWER(email) = LOWER($1)

-- Date part extraction
-- Note: this works for timestamp WITHOUT time zone; for timestamptz,
-- EXTRACT is not immutable (it depends on the session timezone)
-- and the index creation will fail
CREATE INDEX idx_events_year_month ON events(
  EXTRACT(YEAR FROM created_at),
  EXTRACT(MONTH FROM created_at)
);
-- Supports: WHERE EXTRACT(YEAR FROM created_at) = 2027

-- Computed status priority
CREATE INDEX idx_jobs_priority ON jobs(
  CASE status
    WHEN 'critical' THEN 1
    WHEN 'high' THEN 2
    WHEN 'normal' THEN 3
    ELSE 4
  END
);
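One follow-up worth knowing: PostgreSQL gathers statistics on indexed expressions, but only when the table is analyzed, so a freshly created expression index can mislead the planner until then:

```sql
-- Collect statistics for the new expression "columns"
ANALYZE users;

-- The query must use the exact same expression to hit the index
EXPLAIN SELECT id FROM users WHERE LOWER(email) = LOWER('Ada@Example.com');
```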


Finding Missing Indexes

-- Sequential scans on large tables (missing index candidates)
SELECT
  schemaname,
  relname AS table_name,
  seq_scan,
  seq_tup_read,
  idx_scan,
  seq_tup_read / NULLIF(seq_scan, 0) AS avg_rows_per_seq_scan,
  pg_size_pretty(pg_total_relation_size(relid)) AS table_size
FROM pg_stat_user_tables
WHERE seq_scan > 100
  AND seq_tup_read > 100000
ORDER BY seq_tup_read DESC
LIMIT 20;

-- Unused indexes (wasting write overhead and storage)
-- Note: pg_stat_user_indexes exposes relname/indexrelname, and already
-- excludes system schemas; also skip unique indexes, which enforce constraints
SELECT
  ui.schemaname,
  ui.relname AS table_name,
  ui.indexrelname AS index_name,
  ui.idx_scan AS scans,
  pg_size_pretty(pg_relation_size(ui.indexrelid)) AS index_size
FROM pg_stat_user_indexes ui
JOIN pg_index i ON i.indexrelid = ui.indexrelid
WHERE ui.idx_scan = 0
  AND NOT i.indisunique
ORDER BY pg_relation_size(ui.indexrelid) DESC;

-- Duplicate indexes (same definition, different names)
-- pg_indexes.indexdef embeds the index name, so grouping by indexdef alone
-- never matches; strip everything before USING first
SELECT
  tablename,
  array_agg(indexname) AS duplicate_indexes,
  substring(indexdef FROM position(' USING ' IN indexdef)) AS definition
FROM pg_indexes
WHERE schemaname = 'public'
GROUP BY tablename, substring(indexdef FROM position(' USING ' IN indexdef))
HAVING COUNT(*) > 1;

pg_stat_statements: Finding Slow Queries

-- Enable pg_stat_statements (add to postgresql.conf)
-- shared_preload_libraries = 'pg_stat_statements'
-- pg_stat_statements.max = 10000
-- pg_stat_statements.track = all

CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Top 20 slowest queries by total time
SELECT
  round(total_exec_time::numeric, 2) AS total_ms,
  round(mean_exec_time::numeric, 2) AS mean_ms,
  round(stddev_exec_time::numeric, 2) AS stddev_ms,
  calls,
  round((100 * total_exec_time / sum(total_exec_time) OVER ())::numeric, 2) AS pct_of_total,
  LEFT(query, 120) AS query_preview
FROM pg_stat_statements
WHERE calls > 10
ORDER BY total_exec_time DESC
LIMIT 20;

-- Queries with highest row reads (table scan candidates)
SELECT
  calls,
  round(mean_exec_time::numeric, 2) AS mean_ms,
  rows / NULLIF(calls, 0) AS avg_rows,
  shared_blks_read / NULLIF(calls, 0) AS avg_blocks_read,
  LEFT(query, 120) AS query
FROM pg_stat_statements
WHERE calls > 10
ORDER BY shared_blks_read DESC
LIMIT 20;

-- Cache hit ratio per query (low = too many disk reads)
SELECT
  round(mean_exec_time::numeric, 2) AS mean_ms,
  calls,
  round(
    100.0 * shared_blks_hit /
    NULLIF(shared_blks_hit + shared_blks_read, 0),
    2
  ) AS cache_hit_pct,
  LEFT(query, 120) AS query
FROM pg_stat_statements
WHERE calls > 10
  AND (shared_blks_hit + shared_blks_read) > 0
ORDER BY cache_hit_pct ASC
LIMIT 20;

-- Reset statistics after analysis
SELECT pg_stat_statements_reset();


Autovacuum Tuning

Dead tuples accumulate from UPDATEs and DELETEs. Autovacuum reclaims space and updates statistics, but default settings are optimized for small tables:

-- Check table bloat and autovacuum stats
SELECT
  schemaname,
  relname AS table_name,
  n_live_tup,
  n_dead_tup,
  round(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 1) AS dead_pct,
  last_vacuum,
  last_autovacuum,
  last_analyze,
  last_autoanalyze,
  autovacuum_count,
  pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC
LIMIT 20;

-- Table-level autovacuum tuning for high-churn tables
ALTER TABLE orders SET (
  autovacuum_vacuum_scale_factor = 0.01,    -- vacuum when 1% is dead (default: 20%)
  autovacuum_analyze_scale_factor = 0.005,  -- analyze when 0.5% changed (default: 10%)
  autovacuum_vacuum_cost_delay = 2,         -- ms pause per cost batch (default: 2 since PG 12; 20 before)
  autovacuum_vacuum_cost_limit = 400        -- work per cycle (default: 200)
);

-- For write-heavy tables, reduce cost delay further
ALTER TABLE events SET (
  autovacuum_vacuum_scale_factor = 0.02,
  autovacuum_vacuum_cost_delay = 0,         -- no throttling for urgent vacuums
  autovacuum_vacuum_cost_limit = 800
);

-- Manual vacuum (non-blocking) for an urgent case
VACUUM (ANALYZE, VERBOSE) orders;

-- VACUUM FULL reclaims space but takes full table lock — use sparingly
-- Only run in maintenance window with no active queries
VACUUM FULL orders;

postgresql.conf Autovacuum Settings

# postgresql.conf

# Global autovacuum settings (override per-table for high-churn tables)
autovacuum = on
autovacuum_max_workers = 5           # Default 3 — increase for many tables
autovacuum_naptime = 15s             # How often autovacuum checks (default: 1min)
autovacuum_vacuum_threshold = 50     # Min dead tuples before vacuum
autovacuum_vacuum_scale_factor = 0.1 # % of table (default: 0.2 = 20%)
autovacuum_analyze_threshold = 50
autovacuum_analyze_scale_factor = 0.05 # (default: 0.1 = 10%)

autovacuum_vacuum_cost_delay = 2ms   # Throttle to avoid I/O saturation
autovacuum_vacuum_cost_limit = 400   # Work units per cost delay cycle (default: 200)

# Table statistics — affects query planning
default_statistics_target = 200      # Default 100 — increase for complex queries

Memory Configuration

# postgresql.conf — key memory settings

# Shared buffer pool (data cache)
# Rule: 25% of RAM (not more than 40%)
shared_buffers = 4GB                  # For 16GB RAM server

# Work memory per sort/hash operation
# WARNING: each sort/hash uses up to this much
# Rule: (RAM - shared_buffers) / max_connections / 2
work_mem = 64MB                       # For low-concurrency workloads
                                      # Reduce to 4-16MB for high-concurrency

# Maintenance operations (VACUUM, CREATE INDEX)
maintenance_work_mem = 512MB

# OS cache estimate (PostgreSQL does not control this)
# Helps planner decide between index scan vs seq scan
effective_cache_size = 12GB           # ~75% of RAM

# WAL buffering (write performance)
wal_buffers = 64MB

# Checkpoint behavior (trade durability for write performance)
checkpoint_completion_target = 0.9
max_wal_size = 4GB
min_wal_size = 1GB
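To verify whether work_mem is actually too small in practice, temp-file logging records every sort or hash that spilled to disk (the 10MB threshold here is an illustrative choice):

```sql
-- Log every temporary file larger than 10MB
ALTER SYSTEM SET log_temp_files = '10MB';
SELECT pg_reload_conf();
-- Then watch the server log for "temporary file: ... size" entries and
-- cross-check against "Sort Method: external merge" lines in EXPLAIN ANALYZE
```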

Connection Pool Sizing

-- Current connection usage
SELECT
  count(*) FILTER (WHERE state = 'active') AS active,
  count(*) FILTER (WHERE state = 'idle') AS idle,
  count(*) FILTER (WHERE state = 'idle in transaction') AS idle_in_transaction,
  count(*) AS total,
  (SELECT setting::int FROM pg_settings WHERE name = 'max_connections') AS max_connections,
  round(100.0 * count(*) / (SELECT setting::int FROM pg_settings WHERE name = 'max_connections'), 1) AS pct_used
FROM pg_stat_activity
WHERE backend_type = 'client backend';

-- Queries running longer than 30 seconds
SELECT
  pid,
  now() - query_start AS duration,
  state,
  LEFT(query, 200) AS query
FROM pg_stat_activity
WHERE state != 'idle'
  AND query_start < now() - INTERVAL '30 seconds'
  AND backend_type = 'client backend'
ORDER BY duration DESC;

-- Kill a long-running query (use carefully)
SELECT pg_cancel_backend(pid);    -- sends SIGINT (graceful)
SELECT pg_terminate_backend(pid); -- sends SIGTERM (force)
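To stop runaway queries before they ever need killing, session- or role-level timeouts are a common guard (the role name app_user and the values are illustrative):

```sql
-- No query from this role may run longer than 30 seconds
ALTER ROLE app_user SET statement_timeout = '30s';
-- Kill sessions that sit idle inside an open transaction (holds locks, blocks vacuum)
ALTER ROLE app_user SET idle_in_transaction_session_timeout = '60s';
```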

Monitoring Query: Database Health Dashboard

-- One-stop health check view
SELECT
  'Cache Hit Ratio' AS metric,
  round(
    100.0 * sum(blks_hit) / NULLIF(sum(blks_hit + blks_read), 0),
    2
  )::text || '%' AS value,
  CASE
    WHEN 100.0 * sum(blks_hit) / NULLIF(sum(blks_hit + blks_read), 0) >= 99 THEN '✅'
    WHEN 100.0 * sum(blks_hit) / NULLIF(sum(blks_hit + blks_read), 0) >= 95 THEN '⚠️'
    ELSE '❌'
  END AS status
FROM pg_stat_database
WHERE datname = current_database()

UNION ALL

SELECT
  'Total Dead Tuples',
  to_char(sum(n_dead_tup), 'FM999,999,999'),
  CASE WHEN sum(n_dead_tup) < 1000000 THEN '✅' ELSE '⚠️' END
FROM pg_stat_user_tables

UNION ALL

SELECT
  'Index Hit Ratio',
  round(
    100.0 * sum(idx_blks_hit) / NULLIF(sum(idx_blks_hit + idx_blks_read), 0),
    2
  )::text || '%',
  CASE
    WHEN 100.0 * sum(idx_blks_hit) / NULLIF(sum(idx_blks_hit + idx_blks_read), 0) >= 99 THEN '✅'
    ELSE '⚠️'
  END
FROM pg_statio_user_indexes

UNION ALL

SELECT
  'Tables Needing Vacuum',
  count(*)::text,
  CASE WHEN count(*) = 0 THEN '✅' ELSE '⚠️' END
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
  AND (last_autovacuum < now() - INTERVAL '1 hour' OR last_autovacuum IS NULL);

Cost and Timeline Estimates

| Engagement Type | Team | Duration | Cost Range |
| --- | --- | --- | --- |
| Performance audit (find top 10 bottlenecks) | 1 DBA | 1–2 days | $500–1,500 |
| Index strategy implementation | 1 DBA/dev | 1 week | $2,000–4,000 |
| Full tuning (indexes + autovacuum + memory + partitioning) | 1–2 engineers | 2–3 weeks | $6,000–14,000 |
| Ongoing DBA retainer | 1 DBA | Monthly | $2,000–6,000/month |

Working With Viprasol

PostgreSQL performance problems rarely announce themselves clearly — a query that runs fine at 100K rows falls apart at 10M. Our team has tuned PostgreSQL databases from startup scale to hundreds of millions of rows, using the same systematic approach: pg_stat_statements to find the queries, EXPLAIN ANALYZE to understand the plans, and targeted indexes and autovacuum configuration to fix them.

What we deliver:

  • Query plan analysis and index recommendations
  • Autovacuum and memory configuration tuned to your workload
  • pg_stat_statements setup with alerting on slow query regressions
  • Partitioning strategy for large tables
  • Written optimization report with before/after benchmarks

Talk to our team about PostgreSQL performance →

Or explore our cloud infrastructure services.


About the Author

Viprasol Tech Team

Custom Software Development Specialists

The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 100+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement. Based in India, serving clients globally.

