
PostgreSQL Replication: Streaming Replication, Read Replicas, Logical Replication, and Failover

Set up PostgreSQL replication in 2026 — streaming replication configuration, read replica routing, logical replication for zero-downtime migrations, replication lag monitoring, and automatic failover.

Viprasol Tech Team
June 28, 2026
12 min read


A single PostgreSQL primary is a single point of failure and a single write bottleneck. Replication solves both: read replicas distribute read load, and standby servers provide failover capability. Understanding the types of replication and their tradeoffs is essential for production database architecture.


Replication Types

Type                 | What Replicates                              | Granularity                   | Use Case
Streaming (physical) | All WAL records — entire cluster             | Row-level, byte-for-byte copy | Read replicas, standby failover
Logical              | Decoded change events (INSERT/UPDATE/DELETE) | Table-level, selective        | Cross-version migration, CDC, partial replication
Cascading            | Replica replicates from another replica      | Like streaming                | Reduce primary load (many replicas)

Streaming Replication Setup

# Primary server: postgresql.conf
wal_level = replica                    # Minimum for replication
max_wal_senders = 10                   # Max concurrent replication connections
wal_keep_size = 1GB                    # Keep 1GB of WAL for replicas to catch up
synchronous_commit = on                # Commits wait for a standby only when
                                       # synchronous_standby_names is set; without it,
                                       # replication is asynchronous regardless

# For synchronous replication (zero data loss, but slower writes):
# synchronous_standby_names = 'FIRST 1 (replica1, replica2)'
# Primary server: pg_hba.conf — allow replication connections
host  replication  replicator  10.0.0.0/8  scram-sha-256
# On primary: create replication user
psql -c "CREATE USER replicator REPLICATION LOGIN ENCRYPTED PASSWORD 'strong-password';"

# On replica: create base backup from primary
pg_basebackup \
  --host=primary-db.internal \
  --username=replicator \
  --pgdata=/var/lib/postgresql/17/main \
  --wal-method=stream \
  --checkpoint=fast \
  --progress \
  --verbose
# Replica server: postgresql.conf
hot_standby = on                       # Allow reads on replica (essential for read replicas)
primary_conninfo = 'host=primary-db.internal user=replicator password=strong-password application_name=replica1'
primary_slot_name = 'replica1_slot'    # Replication slot (primary keeps WAL until replica confirms receipt)
recovery_target_timeline = 'latest'
# Signal the replica to start as standby
touch /var/lib/postgresql/17/main/standby.signal
systemctl start postgresql
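One detail the config above depends on: primary_slot_name = 'replica1_slot' references a replication slot, but nothing creates it. Either create it on the primary before starting the replica, or let pg_basebackup create it by adding --create-slot --slot=replica1_slot. A minimal sketch:

```sql
-- On primary: create the physical replication slot the replica will use
SELECT pg_create_physical_replication_slot('replica1_slot');

-- Verify: "active" becomes true once the replica connects
SELECT slot_name, slot_type, active FROM pg_replication_slots;
```

Be aware that a slot whose replica disappears retains WAL indefinitely and can fill the primary's disk; max_slot_wal_keep_size (PostgreSQL 13+) caps that retention.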


Read Replica Routing in Your Application

// lib/db.ts — primary/replica routing with Prisma
import { PrismaClient } from '@prisma/client';

// Separate clients for primary (writes) and replica (reads)
const primaryClient = new PrismaClient({
  datasources: { db: { url: process.env.DATABASE_URL } },
  log: ['error'],
});

const replicaClient = new PrismaClient({
  datasources: { db: { url: process.env.DATABASE_REPLICA_URL } },
  log: ['error'],
});

// Wrapper that routes by operation type
export const db = {
  // Read operations → replica
  users: {
    findUnique: (...args: Parameters<typeof primaryClient.users.findUnique>) =>
      replicaClient.users.findUnique(...args),
    findMany: (...args: Parameters<typeof primaryClient.users.findMany>) =>
      replicaClient.users.findMany(...args),
    // Write operations → primary
    create: (...args: Parameters<typeof primaryClient.users.create>) =>
      primaryClient.users.create(...args),
    update: (...args: Parameters<typeof primaryClient.users.update>) =>
      primaryClient.users.update(...args),
    delete: (...args: Parameters<typeof primaryClient.users.delete>) =>
      primaryClient.users.delete(...args),
    upsert: (...args: Parameters<typeof primaryClient.users.upsert>) =>
      primaryClient.users.upsert(...args),
  },

  // Transactions always go to primary
  $transaction: (...args: Parameters<typeof primaryClient.$transaction>) =>
    primaryClient.$transaction(...args),

  $executeRaw: (...args: Parameters<typeof primaryClient.$executeRaw>) =>
    primaryClient.$executeRaw(...args),
};

Important caveat: After a write on the primary, a subsequent read from the replica may not see the new data yet (replication lag). For reads that must be consistent with a just-completed write, route them to the primary:

// After writing: read from primary to avoid stale replica read
async function createOrderAndReturnFresh(data: CreateOrderData): Promise<Order> {
  const order = await primaryClient.orders.create({ data });

  // Read from PRIMARY (same connection guarantees fresh data)
  // If you read from replica, you might get stale data for ~100ms
  return (await primaryClient.orders.findUnique({ where: { id: order.id } }))!;
}
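A common way to generalize this pattern is short-lived primary pinning: after any write on behalf of a user, route that user's reads to the primary for a window longer than your typical replication lag. A minimal sketch — the class name and the 2-second default window are illustrative, not from the article:

```typescript
// Sketch: pin a user's reads to the primary briefly after each write.
// pinMs should exceed your observed replication lag (assumed ~2s here).
export class ReadYourWritesRouter {
  private pinnedUntil = new Map<string, number>();

  constructor(
    private pinMs = 2000,
    private now: () => number = Date.now, // injectable clock for testing
  ) {}

  recordWrite(userId: string): void {
    this.pinnedUntil.set(userId, this.now() + this.pinMs);
  }

  // 'primary' while inside the pin window, 'replica' otherwise
  target(userId: string): 'primary' | 'replica' {
    const until = this.pinnedUntil.get(userId);
    if (until !== undefined && this.now() < until) return 'primary';
    this.pinnedUntil.delete(userId); // expire stale entries lazily
    return 'replica';
  }
}
```

Call recordWrite() from the write wrappers of the db object and consult target() in the read wrappers to pick a client.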

Monitoring Replication Lag

-- On primary: check replica lag
SELECT
  application_name,
  state,
  sent_lsn,
  write_lsn,
  flush_lsn,
  replay_lsn,
  -- Lag in bytes
  (sent_lsn - replay_lsn)::bigint AS replay_lag_bytes,
  -- Lag in time
  write_lag,
  flush_lag,
  replay_lag
FROM pg_stat_replication;

-- On replica: check own lag
SELECT
  EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp()) AS replication_lag_seconds,
  pg_is_in_recovery() AS is_replica,
  pg_last_wal_receive_lsn() AS received_lsn,
  pg_last_wal_replay_lsn() AS replayed_lsn;

// lib/replication-monitor.ts — alert on high lag
export async function checkReplicationLag(): Promise<void> {
  // ::bigint columns come back from $queryRaw as JS bigint, so convert before float math
  const result = await primaryClient.$queryRaw<Array<{ replay_lag_bytes: bigint; application_name: string }>>`
    SELECT application_name, (sent_lsn - replay_lsn)::bigint AS replay_lag_bytes
    FROM pg_stat_replication
  `;

  for (const replica of result) {
    const lagMB = Number(replica.replay_lag_bytes) / (1024 * 1024);

    if (lagMB > 100) {
      await alerting.send({
        severity: 'warning',
        title: `Replica ${replica.application_name} lag: ${lagMB.toFixed(0)}MB`,
        description: 'Replica is falling behind. Check for: slow network, long queries, autovacuum contention.',
      });
    }
  }
}
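If your driver hands you raw LSN strings (like sent_lsn above) rather than a byte diff, you can compute lag client-side. A pg_lsn value such as '16/B374D848' is two hex numbers: the high 32 bits and the low 32 bits of a 64-bit byte position. This helper is an illustration; PostgreSQL's pg_wal_lsn_diff() is the authoritative server-side equivalent:

```typescript
// Convert a pg_lsn string like "16/B374D848" to an absolute byte position
export function lsnToBytes(lsn: string): bigint {
  const [hi, lo] = lsn.split('/');
  if (hi === undefined || lo === undefined) throw new Error(`invalid LSN: ${lsn}`);
  return (BigInt(`0x${hi}`) << 32n) + BigInt(`0x${lo}`);
}

// Byte lag between two LSNs, e.g. sent_lsn vs replay_lsn
export function lsnDiffBytes(ahead: string, behind: string): bigint {
  return lsnToBytes(ahead) - lsnToBytes(behind);
}
```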


Logical Replication

Logical replication decodes WAL into SQL-level events. Unlike physical replication, it's selective (table-level) and works across major PostgreSQL versions — making it essential for zero-downtime major version upgrades.

-- On source (old version): create publication
CREATE PUBLICATION migration_pub FOR TABLE orders, users, products;

-- On target (new version): create subscription
CREATE SUBSCRIPTION migration_sub
  CONNECTION 'host=old-db.internal user=replicator password=xxx dbname=app'
  PUBLICATION migration_pub;

-- Monitor subscription progress
SELECT
  subname,
  pid,
  received_lsn,
  latest_end_lsn,
  latest_end_time
FROM pg_stat_subscription;
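On the source side, the subscription shows up as a logical replication slot, and the WAL retained by that slot is a good proxy for how far the subscriber is behind:

```sql
-- On source: how far behind is the subscriber's slot?
SELECT
  slot_name,
  active,
  pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes
FROM pg_replication_slots
WHERE slot_type = 'logical';
```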

Zero-downtime major version migration steps:

1. Spin up new PostgreSQL 17 instance alongside existing PostgreSQL 15
2. Create schema on new instance (migrations only, no data)
3. Start logical replication: new subscribes to old's publication
4. Wait for replication lag to drop below 1 second
5. Maintenance window (seconds):
   a. Block writes on old instance (set to read-only)
   b. Wait for replication to fully catch up (lag = 0)
   c. Sync sequence values (logical replication does not replicate sequences)
   d. Drop the subscription on the new instance (it is now the primary)
   e. Update DATABASE_URL in app config
   f. Restart app servers
6. Monitor for 1 hour
7. Decommission old instance
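Sequences deserve emphasis: logical replication copies table rows but not sequence positions, so serial/identity columns on the new instance would hand out already-used IDs after cutover. During the write-blocked window, copy sequence positions across; a sketch that generates the setval() statements on the old instance, to be run on the new one (the +1000 safety margin is an illustrative buffer, not a fixed rule):

```sql
-- On old instance: emit one setval() call per sequence; run the output on the new instance.
SELECT format(
  'SELECT setval(%L, %s);',
  schemaname || '.' || sequencename,
  coalesce(last_value, 1) + 1000
)
FROM pg_sequences;
```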

Automatic Failover with Patroni

Patroni manages PostgreSQL high availability — detecting primary failure and promoting a replica automatically:

# patroni.yml — on each PostgreSQL node
scope: postgres-cluster
name: pg-node-1

restapi:
  listen: 0.0.0.0:8008
  connect_address: 10.0.1.1:8008

etcd3:
  hosts:
    - etcd-1:2379
    - etcd-2:2379
    - etcd-3:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 30
    maximum_lag_on_failover: 10485760  # 10MB — don't promote replica with > 10MB lag
    postgresql:
      use_pg_rewind: true
      parameters:
        max_connections: 200
        shared_buffers: 4GB
        wal_level: replica

postgresql:
  listen: 0.0.0.0:5432
  connect_address: 10.0.1.1:5432
  data_dir: /var/lib/postgresql/17/main
  authentication:
    replication:
      username: replicator
      password: strong-password
    superuser:
      username: postgres
      password: postgres-password

# Monitor cluster health
patronictl -c /etc/patroni.yml list
# Member     Host           Role     State    TL  Lag in MB
# pg-node-1  10.0.1.1:5432  Leader   running  3
# pg-node-2  10.0.1.2:5432  Replica  running  3   0
# pg-node-3  10.0.1.3:5432  Replica  running  3   0

# Planned switchover (e.g., for maintenance on the current leader)
patronictl -c /etc/patroni.yml switchover postgres-cluster --leader pg-node-1 --candidate pg-node-2
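Load balancers typically route traffic using Patroni's REST API on port 8008 rather than guessing which node is primary: GET /primary returns 200 only on the current leader, and GET /replica returns 200 only on a healthy streaming replica. A quick manual check against the example hosts above:

```shell
# 200 only on the current leader
curl -s -o /dev/null -w '%{http_code}\n' http://10.0.1.1:8008/primary

# 200 only on a healthy streaming replica
curl -s -o /dev/null -w '%{http_code}\n' http://10.0.1.2:8008/replica

# Full status JSON for any member
curl -s http://10.0.1.1:8008/patroni
```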

RDS Multi-AZ vs Read Replicas

For teams on AWS RDS, the two options:

                       | RDS Multi-AZ                            | RDS Read Replica
Purpose                | High availability (failover)            | Read scaling
Sync                   | Synchronous                             | Asynchronous
Failover time          | 60–120 seconds                          | Not automatic
Can read from standby? | ❌ (Multi-AZ standby is not readable)   | ✅
Cost                   | 2× instance cost                        | Additional instance
Region                 | Standby in a different AZ, same region  | Same or different region

# terraform/rds.tf
resource "aws_db_instance" "primary" {
  identifier        = "app-postgres"
  engine            = "postgres"
  engine_version    = "17.2"
  instance_class    = "db.r6g.large"
  allocated_storage = 100

  multi_az = true  # Creates synchronous standby in different AZ

  backup_retention_period = 7  # Days of automated backups
  deletion_protection     = true
}

resource "aws_db_instance" "read_replica" {
  identifier          = "app-postgres-read"
  replicate_source_db = aws_db_instance.primary.identifier  # use the ARN instead for cross-region replicas
  instance_class      = "db.r6g.large"
  # No storage needed — derived from source
}
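During a Multi-AZ failover the endpoint's DNS flips to the standby, but in-flight connections die, so the application sees roughly 60–120 seconds of connection errors. A retry wrapper with exponential backoff keeps requests alive through that window. This is a sketch; which errors count as transient depends on your driver, and the defaults here are illustrative:

```typescript
// Retry a DB operation through a failover window with exponential backoff.
export async function withFailoverRetry<T>(
  fn: () => Promise<T>,
  attempts = 5,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err; // assumes the error is transient (connection reset, read-only); refine per driver
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
  throw lastError;
}
```

Usage: wrap individual queries, not whole transactions, so a retried call does not replay partial work.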

Working With Viprasol

We architect and implement PostgreSQL replication infrastructure — streaming replication, read replica routing, logical replication for zero-downtime migrations, Patroni HA setup, and RDS Multi-AZ configuration.

Talk to our team about database reliability and infrastructure.



About the Author

Viprasol Tech Team

Custom Software Development Specialists

The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 100+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement. Based in India, serving clients globally.

MT4/MT5 EA Development · AI Agent Systems · SaaS Development · Algorithmic Trading
