
PostgreSQL Backup Strategy: pg_dump, WAL Archiving, Point-in-Time Recovery, and RDS Automated Backups

Design a production PostgreSQL backup strategy. Covers pg_dump for logical backups, WAL archiving with pgBackRest for continuous backup, point-in-time recovery (PITR), RDS automated backups, backup testing, and retention policies.

Viprasol Tech Team
April 30, 2027
13 min read

Every PostgreSQL backup strategy fails silently until you need it. Teams discover their backups are corrupt, incomplete, or simply not restorable at the worst possible moment: after a data loss incident. A backup strategy isn't complete until it includes regular restore tests and recovery time objectives (RTOs) written down before the incident, not during it.

This guide covers the three layers of PostgreSQL backup, how to implement each, and how to test them.

The Three Backup Layers

| Layer | What It Covers | RPO | RTO |
| --- | --- | --- | --- |
| Logical (pg_dump) | Schema + data snapshot | Hours to days | 30 min – 4 hours |
| Physical + WAL (pgBackRest) | Continuous block-level backup | Seconds to minutes | 15 min – 2 hours |
| RDS Automated | Managed backup + PITR | 5 minutes | 15 min – 1 hour |

RPO (Recovery Point Objective): the maximum amount of data, measured in time, you can afford to lose; a nightly-only pg_dump implies a worst-case RPO of 24 hours. RTO (Recovery Time Objective): how long restoration can take before the outage becomes unacceptable.

Layer 1: Logical Backups with pg_dump

#!/bin/bash
# scripts/pg-backup.sh — daily logical backup to S3

set -euo pipefail

DATE=$(date +%Y-%m-%d)
TIMESTAMP=$(date +%Y-%m-%dT%H-%M-%S)
DB_NAME="${POSTGRES_DB:-myapp}"
BACKUP_FILE="${DB_NAME}-${TIMESTAMP}.dump"
S3_PATH="s3://${S3_BACKUPS_BUCKET}/postgres/logical/${DATE}/${BACKUP_FILE}"

echo "Starting backup: ${DB_NAME} -> ${S3_PATH}"

# Custom format (-Fc): compressed, supports parallel restore
# --no-owner: don't include ownership (useful when restoring to different user)
# --no-acl: don't include ACL (permissions), cleaner for restores
pg_dump \
  --host="${POSTGRES_HOST}" \
  --port="${POSTGRES_PORT:-5432}" \
  --username="${POSTGRES_USER}" \
  --dbname="${DB_NAME}" \
  --format=custom \
  --compress=9 \
  --no-owner \
  --no-acl \
  --verbose \
  --file="/tmp/${BACKUP_FILE}"

# Verify dump is valid before uploading
pg_restore --list "/tmp/${BACKUP_FILE}" > /dev/null
echo "Dump integrity check: PASSED"

# Upload to S3 with server-side encryption
aws s3 cp "/tmp/${BACKUP_FILE}" "${S3_PATH}" \
  --sse AES256 \
  --metadata "db=${DB_NAME},timestamp=${TIMESTAMP}"

# Clean up local file
rm "/tmp/${BACKUP_FILE}"

echo "Backup complete: ${S3_PATH}"

# Cleanup old backups (keep 30 days)
aws s3 ls "s3://${S3_BACKUPS_BUCKET}/postgres/logical/" | \
  awk '{print $2}' | \
  sort | \
  head -n -30 | \
  xargs -I{} aws s3 rm "s3://${S3_BACKUPS_BUCKET}/postgres/logical/{}" --recursive
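
Neither script sets a password: pg_dump and the other client tools are assumed to authenticate via a ~/.pgpass file readable only by the backup user. The host, database, and user below are placeholders:

```
# ~/.pgpass (must be chmod 600)
# format: hostname:port:database:username:password
db.internal.example.com:5432:myapp:backup_user:REPLACE_WITH_PASSWORD
```
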

#!/bin/bash
# scripts/pg-restore.sh — restore from logical backup
set -euo pipefail

S3_PATH="$1"  # e.g., s3://my-backups/postgres/logical/2027-04-30/myapp-2027-04-30T02-00-00.dump
TARGET_DB="${2:-myapp_restore}"

echo "Restoring ${S3_PATH} -> ${TARGET_DB}"

# Download
aws s3 cp "${S3_PATH}" "/tmp/restore.dump"

# Create target DB (must not exist)
createdb --host="${POSTGRES_HOST}" --username="${POSTGRES_USER}" "${TARGET_DB}"

# Parallel restore: -j 4 uses 4 workers (faster for large DBs)
pg_restore \
  --host="${POSTGRES_HOST}" \
  --username="${POSTGRES_USER}" \
  --dbname="${TARGET_DB}" \
  --jobs=4 \
  --no-owner \
  --no-acl \
  --verbose \
  "/tmp/restore.dump"

rm "/tmp/restore.dump"
echo "Restore complete: ${TARGET_DB}"
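
To actually run the backup daily, a cron entry along these lines works (the 02:00 UTC slot and the script path are assumptions; pick your own low-traffic window):

```
# /etc/cron.d/pg-backup: daily logical backup
0 2 * * * postgres /opt/scripts/pg-backup.sh >> /var/log/pg-backup.log 2>&1
```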


Layer 2: Continuous Backup with pgBackRest

pgBackRest provides incremental physical backups + WAL archiving for point-in-time recovery. Configure it on your PostgreSQL server:

# /etc/pgbackrest/pgbackrest.conf

[global]
# S3 repository
repo1-type=s3
repo1-s3-bucket=my-app-pg-backups
repo1-s3-region=us-east-1
repo1-s3-key=<ACCESS_KEY>
repo1-s3-key-secret=<SECRET_KEY>
repo1-path=/pgbackrest

# Encryption
repo1-cipher-type=aes-256-cbc
repo1-cipher-pass=<STRONG_RANDOM_PASSPHRASE>

# Retention
repo1-retention-full=4          # Keep 4 full backups (~4 weeks at one per week)
repo1-retention-diff=14         # Keep 14 differential backups (~2 weeks at one per day)
repo1-retention-archive=14      # WAL retention, counted in backups (see repo1-retention-archive-type)

# Compression
compress-type=lz4               # Fast compression (lz4 or zst)

[myapp]
pg1-path=/var/lib/postgresql/16/main
pg1-user=postgres

# postgresql.conf additions for WAL archiving
archive_mode = on
archive_command = 'pgbackrest --stanza=myapp archive-push %p'
wal_level = replica       # replica or higher is required for WAL archiving

# Initial setup (run once)
pgbackrest --stanza=myapp stanza-create

# Schedule in cron
# Full backup: weekly (Sunday 2am)
0 2 * * 0  postgres /usr/bin/pgbackrest --stanza=myapp --type=full backup

# Differential backup: daily (2am)
0 2 * * 1-6 postgres /usr/bin/pgbackrest --stanza=myapp --type=diff backup

# Check backup status
pgbackrest --stanza=myapp info
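
pgbackrest info also accepts --output=json, which makes freshness monitoring straightforward. A minimal sketch: the payload below is an abridged, made-up stand-in for the real structure (one stanza object with a "backup" array, oldest first), and the 26-hour threshold is an arbitrary choice for a daily schedule:

```shell
#!/bin/bash
set -euo pipefail

# Abridged stand-in for: pgbackrest --stanza=myapp --output=json info
INFO_JSON='[{"name":"myapp","backup":[
  {"type":"full","timestamp":{"stop":1999900000}},
  {"type":"diff","timestamp":{"stop":2000000000}}]}]'

# Stop time (epoch seconds) of the newest backup in the repository
LATEST_EPOCH=$(echo "${INFO_JSON}" | jq -r '.[0].backup[-1].timestamp.stop')

# Alert if the newest backup is older than 26 hours (daily diff + slack)
AGE=$(( $(date +%s) - LATEST_EPOCH ))
if [ "${AGE}" -gt $(( 26 * 3600 )) ]; then
  echo "STALE: last backup finished ${AGE}s ago"
  exit 1
fi
echo "OK: newest backup finished at epoch ${LATEST_EPOCH}"
```

Swap the real pgbackrest call in for INFO_JSON and run it from cron alongside the backups.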

Point-in-Time Recovery (PITR)

# Restore to a specific point in time (e.g., 1 minute before a bad migration)
# This is your "undo button" for database disasters

# 1. Stop the database
systemctl stop postgresql

# 2. Restore from the most recent backup before target time
pgbackrest --stanza=myapp restore \
  --target="2027-04-30 14:29:59" \
  --target-action=promote \
  --type=time \
  --delta                    # Only replace changed files (faster)

# 3. Start PostgreSQL — it will replay WAL up to the target time
systemctl start postgresql

# 4. Verify
psql -c "SELECT NOW();" -c "SELECT COUNT(*) FROM orders;"
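
For planned risky changes, a timestamp is not the only target: PostgreSQL can mark a named restore point in the WAL, and pgBackRest can restore to it by name. A sketch (the name before_migration is an arbitrary example):

```
# Before the risky migration, mark the current WAL position
psql -c "SELECT pg_create_restore_point('before_migration');"

# If it goes wrong, restore to the named point instead of guessing a timestamp
systemctl stop postgresql
pgbackrest --stanza=myapp restore \
  --type=name \
  --target=before_migration \
  --target-action=promote \
  --delta
systemctl start postgresql
```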


Layer 3: RDS Automated Backups

If you're on AWS RDS or Aurora, automated backups are built in. Configure them properly:

# terraform/rds.tf

resource "aws_db_instance" "postgres" {
  identifier        = "${var.app_name}-${var.environment}"
  engine            = "postgres"
  engine_version    = "16.2"
  instance_class    = "db.t4g.medium"

  # Storage
  allocated_storage     = 100
  max_allocated_storage = 500    # Auto-scaling storage
  storage_type          = "gp3"
  storage_encrypted     = true
  kms_key_id            = aws_kms_key.rds.arn

  # Backup configuration
  backup_retention_period   = 30       # Days (max 35)
  backup_window             = "03:00-04:00"  # UTC — low-traffic window
  maintenance_window        = "sun:04:00-sun:05:00"
  copy_tags_to_snapshot     = true
  delete_automated_backups  = false    # Keep backups even after instance deletion

  # Enable PITR (enabled by default when backup_retention_period > 0)
  # RDS PITR granularity: 5 minutes

  # Multi-AZ for production (automatic failover)
  multi_az = var.environment == "production"

  # Final snapshot before destroy
  final_snapshot_identifier = "${var.app_name}-${var.environment}-final"
  skip_final_snapshot       = var.environment != "production"

  tags = { Environment = var.environment }
}

# Cross-region backup copy (disaster recovery)
resource "aws_db_instance_automated_backups_replication" "cross_region" {
  count                   = var.environment == "production" ? 1 : 0
  source_db_instance_arn  = aws_db_instance.postgres.arn
  retention_period        = 7     # Keep 7 days of cross-region backups
  kms_key_id              = aws_kms_key.rds_dr.arn
  provider                = aws.dr_region   # Secondary region provider
}

RDS: Point-in-Time Restore

# Restore RDS to a specific point in time via AWS CLI
aws rds restore-db-instance-to-point-in-time \
  --source-db-instance-identifier myapp-production \
  --target-db-instance-identifier myapp-production-restore-20270430 \
  --restore-time 2027-04-30T14:29:59Z \
  --db-instance-class db.t4g.medium \
  --no-multi-az      # Don't need HA for a temporary restore

# Wait for restoration to complete
aws rds wait db-instance-available \
  --db-instance-identifier myapp-production-restore-20270430

echo "Restore complete — connect to myapp-production-restore-20270430"
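
A point-in-time restore creates a separate instance and never touches production. A common follow-up is to pull just the damaged table out of the restored copy and reconcile it by hand. A sketch (RESTORE_HOST, PROD_HOST, and the orders table are placeholders):

```
# Dump only the affected table from the restored instance
pg_dump \
  --host="${RESTORE_HOST}" \
  --table=orders \
  --format=custom \
  --file=/tmp/orders.dump \
  myapp

# Load it into a scratch database on production, then reconcile with SQL
createdb --host="${PROD_HOST}" orders_recovered
pg_restore \
  --host="${PROD_HOST}" \
  --dbname=orders_recovered \
  --no-owner \
  /tmp/orders.dump
```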

Backup Testing: The Critical Step Most Teams Skip

#!/bin/bash
# scripts/test-restore.sh — run monthly in CI or manually
# Tests that backups are actually restorable

set -euo pipefail

RESTORE_DB="backup_test_$(date +%s)"
LATEST_BACKUP=$(aws s3 ls "s3://${S3_BACKUPS_BUCKET}/postgres/logical/" \
  --recursive | sort | tail -1 | awk '{print $4}')

echo "Testing restore from: ${LATEST_BACKUP}"

# Restore to temporary database
./pg-restore.sh "s3://${S3_BACKUPS_BUCKET}/${LATEST_BACKUP}" "${RESTORE_DB}"

# Sanity checks
ROW_COUNT=$(psql \
  --host="${POSTGRES_HOST}" \
  --username="${POSTGRES_USER}" \
  --dbname="${RESTORE_DB}" \
  --tuples-only \
  --no-align \
  --command="SELECT COUNT(*) FROM users;")

echo "User count in restored backup: ${ROW_COUNT}"

if [ "${ROW_COUNT}" -lt "1" ]; then
  echo "ERROR: Restore test failed — no users found"
  exit 1
fi

# Run a few more checks
psql \
  --host="${POSTGRES_HOST}" \
  --username="${POSTGRES_USER}" \
  --dbname="${RESTORE_DB}" \
  --command="SELECT MAX(created_at) FROM orders; SELECT relname, n_live_tup FROM pg_stat_user_tables ORDER BY n_live_tup DESC LIMIT 5;"

# Clean up
dropdb --host="${POSTGRES_HOST}" --username="${POSTGRES_USER}" "${RESTORE_DB}"
echo "Restore test PASSED — backup from ${LATEST_BACKUP} is valid"
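
A restore test only counts if it actually runs. A cron entry like this (the path and schedule are assumptions) keeps the monthly drill honest:

```
# /etc/cron.d/pg-restore-test: monthly restore drill, 1st of the month
0 3 1 * * postgres /opt/scripts/test-restore.sh >> /var/log/pg-restore-test.log 2>&1
```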

Backup Cost Estimates

| Approach | Storage | Monthly Cost (100 GB DB) |
| --- | --- | --- |
| pg_dump to S3 (daily, 30-day retention) | ~3 TB compressed | ~$70/month |
| pgBackRest to S3 (full weekly + WAL) | ~500 GB | ~$12/month |
| RDS automated backups (30-day retention) | ~3 TB | Free up to 100% of DB size; ~$70/month above |
| RDS cross-region copy (7-day) | ~700 GB | ~$17/month + data transfer |

Working With Viprasol

A backup strategy that's never been tested isn't a strategy — it's wishful thinking. Our team designs multi-layer backup architectures with tested restore procedures, documented RTOs, and monthly restore drills. We configure pg_dump for logical backups, pgBackRest for continuous WAL archiving and PITR, and RDS automated backups with cross-region replication for regulated workloads.

What we deliver:

  • pg_dump backup script with S3 upload, integrity check, and 30-day retention
  • pgBackRest configuration: full weekly + differential daily + WAL archiving
  • PITR restore procedure with runbook (tested before you need it)
  • Terraform: RDS with 30-day backup retention, Multi-AZ, cross-region copy
  • Monthly restore test script with sanity checks (row counts, table sizes)

Talk to our team about your database backup and recovery strategy →

Or explore our cloud infrastructure services.

About the Author

Viprasol Tech Team

Custom Software Development Specialists

The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 100+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement. Based in India, serving clients globally.
