PostgreSQL Backup and Restore in 2026: pg_dump, WAL Archiving, PITR, and Restore Testing

Production PostgreSQL backup strategy: pg_dump for logical backups, WAL archiving with pgBackRest, point-in-time recovery, RDS automated backups, and restore testing automation.

Viprasol Tech Team
March 4, 2027
14 min read

Most teams think about backup strategy after their first data loss incident. The correct time to think about it is now, when the cost of getting it wrong is zero. Backups that aren't tested aren't backups — they're a false sense of security. And RDS automated backups, while necessary, aren't sufficient on their own for most SaaS products.

This post covers the full backup strategy: logical backups with pg_dump, WAL-based continuous archiving with pgBackRest, point-in-time recovery, AWS RDS snapshot configuration, and — most importantly — automated restore testing.


Backup Strategy Layers

| Layer | Tool | RPO | What It Protects Against |
|---|---|---|---|
| Logical (schema + data) | pg_dump / pg_dumpall | Daily | Accidental DROP TABLE, corruption |
| Continuous WAL archiving | pgBackRest | Seconds | Server failure, need for PITR |
| Cloud snapshots | RDS automated backups | 5 min | AZ failure, instance corruption |
| Cross-region copy | S3 replication / RDS copy | RPO + copy time | Region failure |
| Point-in-time recovery | WAL replay | Any point | "Restore to 10 min ago" |

pg_dump: Logical Backups

#!/bin/bash
# scripts/pg-dump-backup.sh
# Run daily via Lambda cron or EC2 cron

set -euo pipefail

DB_HOST="${PGHOST}"
DB_NAME="${PGDATABASE}"
DB_USER="${PGUSER}"
S3_BUCKET="${BACKUP_S3_BUCKET}"
DATE=$(date +%Y-%m-%d-%H%M%S)
BACKUP_FILE="/tmp/backup-${DB_NAME}-${DATE}.dump"

echo "Starting pg_dump backup: ${DB_NAME}"

# Custom format: compressed, parallel-restore capable, and faster to restore
# than plain SQL. Password comes from the PGPASSWORD env var (--no-password).
# Note: comments cannot follow "\" line continuations — they break the command.
pg_dump \
  --host="${DB_HOST}" \
  --username="${DB_USER}" \
  --dbname="${DB_NAME}" \
  --format=custom \
  --compress=9 \
  --no-password \
  --verbose \
  --file="${BACKUP_FILE}"

# Upload to S3 with server-side encryption. STANDARD_IA is cheaper for
# infrequently accessed objects like backups.
aws s3 cp "${BACKUP_FILE}" \
  "s3://${S3_BUCKET}/logical/${DATE:0:7}/${DATE}.dump" \
  --storage-class STANDARD_IA \
  --sse aws:kms

# Clean up local file
rm -f "${BACKUP_FILE}"

echo "Backup complete: s3://${S3_BUCKET}/logical/${DATE:0:7}/${DATE}.dump"

# Restore from a pg_dump custom-format backup.
# --jobs runs the restore in parallel (roughly one job per CPU core);
# --no-owner skips ownership, useful when restoring as a different user.
pg_restore \
  --host="${RESTORE_HOST}" \
  --username="${DB_USER}" \
  --dbname="${RESTORE_DB}" \
  --format=custom \
  --jobs=4 \
  --verbose \
  --no-owner \
  /path/to/backup.dump
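The upload path in the script buckets dumps by month (`${DATE:0:7}` yields `YYYY-MM`), so a retention job only needs to list keys and drop the ones past the retention window. A sketch of that pruning logic as a pure function — the actual deletion (via `aws s3 rm` or an SDK) is deliberately left out:

```python
from datetime import date, timedelta

def keys_to_prune(keys: list[str], today: date, retention_days: int = 30) -> list[str]:
    """Given S3 keys like 'logical/2027-03/2027-03-04-020000.dump',
    return those whose date component is older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    stale = []
    for key in keys:
        # The filename starts with YYYY-MM-DD; parse just the date part.
        name = key.rsplit("/", 1)[-1]
        y, m, d = name.split("-")[:3]
        if date(int(y), int(m), int(d)) < cutoff:
            stale.append(key)
    return stale
```

Keeping the selection logic separate from the deletion call makes it trivially testable, which matters for code that deletes backups. An S3 lifecycle rule on the `logical/` prefix achieves the same thing with less code, if you don't need custom logic.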


pgBackRest: Continuous WAL Archiving to S3

pgBackRest is the modern replacement for pg_basebackup + WAL-E. It supports:

  • Full, differential, and incremental backups
  • WAL archiving to S3
  • Parallel backup and restore
  • Backup verification
# /etc/pgbackrest/pgbackrest.conf

[global]
repo1-type=s3
repo1-s3-bucket=myapp-production-pgbackrest
repo1-s3-endpoint=s3.us-east-1.amazonaws.com
repo1-s3-region=us-east-1
repo1-s3-key=<aws-access-key>
repo1-s3-key-secret=<aws-secret-key>
# (or repo1-s3-key-type=auto to use IAM instance credentials instead of static keys)
repo1-path=/pgbackrest
repo1-retention-full=7           # Keep 7 full backups
repo1-retention-diff=14          # Keep 14 differential backups

# Encryption at rest
repo1-cipher-type=aes-256-cbc
repo1-cipher-pass=your-strong-passphrase

log-level-console=info
log-level-file=detail

[myapp]
pg1-path=/var/lib/postgresql/15/main
pg1-port=5432
# postgresql.conf changes for WAL archiving
archive_mode = on
archive_command = 'pgbackrest --stanza=myapp archive-push %p'
wal_level = replica   # minimum for WAL archiving
# Initial setup
pgbackrest --stanza=myapp stanza-create

# First full backup
pgbackrest --stanza=myapp --type=full backup

# Schedule via cron:
# Full backup: weekly
# 0 2 * * 0 pgbackrest --stanza=myapp --type=full backup

# Differential backup: daily
# 0 2 * * 1-6 pgbackrest --stanza=myapp --type=diff backup

# Verify backup integrity
pgbackrest --stanza=myapp verify
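The cron schedule above (full on Sunday, differential Monday–Saturday) is the kind of thing that silently drifts when edited by hand. A tiny helper that mirrors it makes the intent explicit and testable — a sketch, assuming cron's 0 = Sunday convention:

```python
def backup_type_for_weekday(cron_dow: int) -> str:
    """Mirror the cron schedule above: 0 (Sunday) -> full, 1-6 -> diff."""
    if not 0 <= cron_dow <= 6:
        raise ValueError("cron day-of-week must be 0-6")
    return "full" if cron_dow == 0 else "diff"
```

With `repo1-retention-full=7` and weekly fulls, this schedule keeps roughly seven weeks of restorable history; shorten the full interval if that window is too long for your compliance needs.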

Point-In-Time Recovery (PITR)

# Scenario: user deleted all orders at 14:32 UTC
# Restore to 14:30 UTC (2 minutes before)

# Step 1: Stop the current database
systemctl stop postgresql

# Step 2: Restore base backup. --delta reuses files already in the data
# directory (otherwise pgBackRest expects an empty one); --target-action=promote
# promotes once the target time is reached.
pgbackrest --stanza=myapp \
  --delta \
  --type=time \
  --target="2027-03-04 14:30:00+00" \
  --target-action=promote \
  restore

# Step 3: Start PostgreSQL (it will replay WAL up to 14:30)
systemctl start postgresql

# Step 4: Verify data is as expected
psql -c "SELECT COUNT(*) FROM orders WHERE created_at < '2027-03-04 14:30:00';"

# Step 5: Once verified, promote to normal operation
# (already promoted via --target-action=promote)
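Choosing the `--target` value is where PITR mistakes happen: restore a little *before* the incident, in UTC, in a format pgBackRest accepts. A small sketch that derives the target string from an incident timestamp — the two-minute margin matches the scenario above but is a judgment call, not a pgBackRest requirement:

```python
from datetime import datetime, timedelta, timezone

def pitr_target(incident_utc: datetime, margin_minutes: int = 2) -> str:
    """Return a pgBackRest --target string a safety margin before the incident.
    Naive datetimes are assumed to already be UTC."""
    if incident_utc.tzinfo is None:
        incident_utc = incident_utc.replace(tzinfo=timezone.utc)
    target = incident_utc - timedelta(minutes=margin_minutes)
    return target.strftime("%Y-%m-%d %H:%M:%S+00")
```

For the 14:32 UTC deletion in the scenario, this yields the `2027-03-04 14:30:00+00` target used above.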


AWS RDS: Automated Backup Configuration

# terraform/rds.tf

resource "aws_rds_cluster" "main" {
  cluster_identifier      = "${var.name}-${var.environment}"
  engine                  = "aurora-postgresql"
  engine_version          = "16.2"
  database_name           = var.db_name
  master_username         = var.db_username
  manage_master_user_password = true  # Rotate via Secrets Manager

  # Backup configuration
  backup_retention_period   = var.environment == "production" ? 35 : 7
  # 35 days retention in production (max for Aurora)
  # PITR available for any point in this window

  preferred_backup_window   = "03:00-04:00"  # Off-peak UTC
  copy_tags_to_snapshot     = true
  deletion_protection       = var.environment == "production"
  skip_final_snapshot       = var.environment != "production"
  final_snapshot_identifier = "${var.name}-${var.environment}-final-${formatdate("YYYY-MM-DD", timestamp())}"

  db_subnet_group_name   = aws_db_subnet_group.main.name
  vpc_security_group_ids = [aws_security_group.rds.id]

  tags = var.common_tags

  lifecycle {
    # timestamp() re-evaluates on every plan; ignore it to avoid perpetual diffs
    ignore_changes = [availability_zones, final_snapshot_identifier]
  }
}

# Manual cluster snapshot for DR. Note: this resource creates the snapshot in
# the *same* region; actually copying it cross-region still requires
# `aws rds copy-db-cluster-snapshot --source-region ...` or an AWS Backup copy rule.
resource "aws_db_cluster_snapshot" "cross_region" {
  count                          = var.environment == "production" ? 1 : 0
  db_cluster_identifier          = aws_rds_cluster.main.id
  db_cluster_snapshot_identifier = "${var.name}-${var.environment}-cross-region"
}

RDS PITR Restore

# AWS CLI: restore RDS cluster to a point in time
aws rds restore-db-cluster-to-point-in-time \
  --db-cluster-identifier myapp-production-restored \
  --source-db-cluster-identifier myapp-production \
  --restore-to-time "2027-03-04T14:30:00Z" \
  --db-subnet-group-name myapp-production-subnet-group \
  --vpc-security-group-ids sg-0abc123 \
  --tags Key=Purpose,Value=PITRRestore

# Wait for the cluster restore to complete
aws rds wait db-cluster-available \
  --db-cluster-identifier myapp-production-restored

# A restored Aurora cluster has no instances yet — create one before connecting
aws rds create-db-instance \
  --db-instance-identifier myapp-production-restored-1 \
  --db-cluster-identifier myapp-production-restored \
  --db-instance-class db.r6g.large \
  --engine aurora-postgresql

# Connect and verify
psql "postgresql://user:pass@myapp-production-restored.cluster-xxxx.us-east-1.rds.amazonaws.com/myapp"
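RDS only accepts a `--restore-to-time` inside the backup retention window, and no later than the cluster's latest restorable time. Validating that locally before calling the API gives a clearer error than the service's. A sketch of the check as a pure function — in practice the latest restorable time comes from `describe-db-clusters`:

```python
from datetime import datetime, timedelta

def restore_time_is_valid(
    target: datetime,
    latest_restorable: datetime,
    retention_days: int = 35,
) -> bool:
    """True if target falls inside the PITR window:
    (latest_restorable - retention) <= target <= latest_restorable."""
    earliest = latest_restorable - timedelta(days=retention_days)
    return earliest <= target <= latest_restorable
```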

Automated Restore Testing

A backup that isn't tested isn't a backup. Run this weekly:

// scripts/test-backup-restore.ts
// Run via Lambda cron: every Sunday at 4am UTC

import {
  RDSClient,
  RestoreDBClusterToPointInTimeCommand,
  DeleteDBClusterCommand,
  DescribeDBClustersCommand,
} from "@aws-sdk/client-rds";
// getClusterEndpoint, createTestConnection and sendAlert are app-specific
// helpers (RDS DescribeDBClusters lookup, DB client factory, Slack/SNS alert).

const rds = new RDSClient({ region: process.env.AWS_REGION });

async function testRestoreAndValidate() {
  const testClusterId = `${process.env.SOURCE_CLUSTER_ID}-restore-test-${Date.now()}`;

  try {
    console.log("Step 1: Restoring cluster to 1 hour ago...");
    await rds.send(new RestoreDBClusterToPointInTimeCommand({
      DBClusterIdentifier: testClusterId,
      SourceDBClusterIdentifier: process.env.SOURCE_CLUSTER_ID!,
      RestoreToTime: new Date(Date.now() - 60 * 60 * 1000), // 1 hour ago
      DBSubnetGroupName: process.env.SUBNET_GROUP!,
      VpcSecurityGroupIds: [process.env.SECURITY_GROUP_ID!],
    }));

    // Note: an Aurora cluster restored this way has no instances yet — a real
    // job must also send CreateDBInstanceCommand before the cluster is queryable.
    // Wait up to 20 minutes for restore
    await waitForCluster(testClusterId, 20 * 60 * 1000);
    console.log("Step 2: Restore complete.");

    // Step 3: Run validation queries against restored cluster
    const endpoint = await getClusterEndpoint(testClusterId);
    const testDb = createTestConnection(endpoint);

    const [userCount, orderCount, recentOrder] = await Promise.all([
      testDb.user.count(),
      testDb.order.count(),
      testDb.order.findFirst({ orderBy: { createdAt: "desc" } }),
    ]);

    console.log(`Validation: ${userCount} users, ${orderCount} orders`);
    console.log(`Most recent order: ${recentOrder?.id} at ${recentOrder?.createdAt}`);

    if (userCount === 0) throw new Error("Restore validation failed: no users found");
    if (orderCount === 0) throw new Error("Restore validation failed: no orders found");

    console.log("✅ Restore test PASSED");

    // Send success notification
    await sendAlert({
      type: "success",
      message: `Weekly restore test passed: ${userCount} users, ${orderCount} orders`,
    });
  } catch (err) {
    console.error("❌ Restore test FAILED:", err);
    await sendAlert({ type: "failure", message: String(err) });
    throw err;
  } finally {
    // Always clean up the test cluster
    console.log("Cleaning up test cluster...");
    await rds.send(new DeleteDBClusterCommand({
      DBClusterIdentifier: testClusterId,
      SkipFinalSnapshot: true,
    }));
    console.log("Test cluster deleted.");
  }
}

async function waitForCluster(clusterId: string, timeoutMs: number) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const { DBClusters } = await rds.send(new DescribeDBClustersCommand({
      DBClusterIdentifier: clusterId,
    }));
    if (DBClusters?.[0]?.Status === "available") return;
    await new Promise((r) => setTimeout(r, 30_000)); // Poll every 30s
  }
  throw new Error(`Cluster ${clusterId} did not become available within ${timeoutMs}ms`);
}

Backup Monitoring

-- Check WAL archiving status
SELECT
  archived_count,
  last_archived_wal,
  last_archived_time,
  failed_count,
  last_failed_wal,
  last_failed_time
FROM pg_stat_archiver;

-- Alert if failed_count > 0 or last_archived_time > 1 hour ago

Cost Estimate

| Backup Type | Storage | Cost/Month |
|---|---|---|
| pg_dump (daily, 30-day retention, 10 GB compressed) | 300 GB S3-IA | ~$5 |
| WAL archiving (continuous, 7-day retention) | ~50 GB | ~$1.15 |
| RDS automated backup (35-day, 100 GB DB) | 3.5 TB | ~$87 (backup storage up to DB size is free) |
| Cross-region snapshot copy | Same as above | +$87 |
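The estimates come from straightforward arithmetic on stored volume, so the same formula works with your own numbers. A sketch using an assumed us-east-1 S3 Standard-IA rate (prices change — check the current S3 pricing page); request and KMS charges add a little on top, which is why the table rounds up:

```python
def monthly_s3_cost(daily_dump_gb: float, retention_days: int,
                    price_per_gb: float = 0.0125) -> float:
    """Approximate S3 Standard-IA storage cost for rotated logical dumps.
    price_per_gb is an assumed us-east-1 Standard-IA rate (USD/GB-month)."""
    stored_gb = daily_dump_gb * retention_days
    return round(stored_gb * price_per_gb, 2)
```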

Working With Viprasol

We design and implement PostgreSQL backup strategies for production SaaS databases — from simple pg_dump schedules through continuous WAL archiving with automated weekly restore testing. Our team has implemented backup systems that have successfully restored production data multiple times.

What we deliver:

  • Layered backup strategy design (logical + WAL + snapshots)
  • pgBackRest configuration with S3 archiving
  • RDS backup retention and cross-region copy configuration
  • Automated restore testing Lambda (weekly, validates row counts)
  • CloudWatch alarms for WAL archiving failures and backup age

Explore our cloud infrastructure services or contact us to implement a tested backup strategy.

About the Author

Viprasol Tech Team

Custom Software Development Specialists

The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 100+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement. Based in India, serving clients globally.
