PostgreSQL Backup Strategy: pg_dump, WAL Archiving, Point-in-Time Recovery, and RDS Automated Backups
Design a production PostgreSQL backup strategy. Covers pg_dump for logical backups, WAL archiving with pgBackRest for continuous backup, point-in-time recovery (PITR), RDS automated backups, backup testing, and retention policies.
Every PostgreSQL backup strategy fails silently until you need it. Teams discover their backups are corrupt, incomplete, or simply not restorable at the worst possible moment: after a data loss incident. A backup strategy isn't complete until it includes regular restore tests and recovery time objectives (RTOs) written down before the incident, not during it.
This guide covers the three layers of PostgreSQL backup, how to implement each, and how to test them.
The Three Backup Layers
| Layer | What It Covers | RPO | RTO |
|---|---|---|---|
| Logical (pg_dump) | Schema + data snapshot | Hours to days | 30 min – 4 hours |
| Physical + WAL (pgBackRest) | Continuous block-level backup | Seconds to minutes | 15 min – 2 hours |
| RDS Automated | Managed backup + PITR | 5 minutes | 15 min – 1 hour |
RPO (Recovery Point Objective): the maximum window of data loss you can tolerate, measured in time. RTO (Recovery Time Objective): how long restoration can take before the outage becomes unacceptable.
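These two numbers drive every decision below, and the RPO side is easy to monitor: it reduces to an age check on the newest backup. A minimal sketch (the function name and thresholds are ours, not part of any tool in this guide):

```shell
#!/bin/bash
# check_rpo — fail when the newest backup is older than the RPO budget.
# backup_epoch and rpo_seconds are illustrative parameters.
check_rpo() {
  local backup_epoch="$1"   # epoch seconds of the newest backup
  local rpo_seconds="$2"    # e.g. 86400 for a daily pg_dump
  local age=$(( $(date +%s) - backup_epoch ))
  if (( age > rpo_seconds )); then
    echo "RPO VIOLATION: newest backup is ${age}s old (budget: ${rpo_seconds}s)"
    return 1
  fi
  echo "RPO OK: newest backup is ${age}s old"
}
```

Wire this into monitoring by feeding it, for example, the LastModified timestamp of the newest object in your backup bucket.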
Layer 1: Logical Backups with pg_dump
#!/bin/bash
# scripts/pg-backup.sh — daily logical backup to S3
set -euo pipefail
DATE=$(date +%Y-%m-%d)
TIMESTAMP=$(date +%Y-%m-%dT%H-%M-%S)
DB_NAME="${POSTGRES_DB:-myapp}"
BACKUP_FILE="${DB_NAME}-${TIMESTAMP}.dump"
S3_PATH="s3://${S3_BACKUPS_BUCKET}/postgres/logical/${DATE}/${BACKUP_FILE}"
echo "Starting backup: ${DB_NAME} → ${S3_PATH}"
# Custom format (-Fc): compressed, supports parallel restore
# --no-owner: don't include ownership (useful when restoring to different user)
# --no-acl: don't include ACL (permissions), cleaner for restores
pg_dump \
--host="${POSTGRES_HOST}" \
--port="${POSTGRES_PORT:-5432}" \
--username="${POSTGRES_USER}" \
--dbname="${DB_NAME}" \
--format=custom \
--compress=9 \
--no-owner \
--no-acl \
--verbose \
--file="/tmp/${BACKUP_FILE}"
# Verify dump is valid before uploading
pg_restore --list "/tmp/${BACKUP_FILE}" > /dev/null
echo "Dump integrity check: PASSED"
# Upload to S3 with server-side encryption
aws s3 cp "/tmp/${BACKUP_FILE}" "${S3_PATH}" \
--sse AES256 \
--metadata "db=${DB_NAME},timestamp=${TIMESTAMP}"
# Clean up local file
rm "/tmp/${BACKUP_FILE}"
echo "Backup complete: ${S3_PATH}"
# Cleanup old backups (keep 30 days)
aws s3 ls "s3://${S3_BACKUPS_BUCKET}/postgres/logical/" | \
awk '{print $2}' | \
sort | \
head -n -30 | \
xargs -I{} aws s3 rm "s3://${S3_BACKUPS_BUCKET}/postgres/logical/{}" --recursive
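The script above is meant to run unattended. A minimal scheduling entry, assuming the script lives at /opt/scripts/ and runs as the postgres user (both are assumptions, adjust for your layout):

```shell
# /etc/cron.d/pg-backup — daily logical backup at 02:00 server time
0 2 * * * postgres /opt/scripts/pg-backup.sh >> /var/log/pg-backup.log 2>&1
```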
#!/bin/bash
# scripts/pg-restore.sh — restore from logical backup
set -euo pipefail
S3_PATH="$1" # e.g., s3://my-backups/postgres/logical/2027-04-30/myapp-2027-04-30T02-00-00.dump
TARGET_DB="${2:-myapp_restore}"
echo "Restoring ${S3_PATH} → ${TARGET_DB}"
# Download
aws s3 cp "${S3_PATH}" "/tmp/restore.dump"
# Create target DB (must not exist)
createdb --host="${POSTGRES_HOST}" --username="${POSTGRES_USER}" "${TARGET_DB}"
# Parallel restore: -j 4 uses 4 workers (faster for large DBs)
pg_restore \
--host="${POSTGRES_HOST}" \
--username="${POSTGRES_USER}" \
--dbname="${TARGET_DB}" \
--jobs=4 \
--no-owner \
--no-acl \
--verbose \
"/tmp/restore.dump"
rm "/tmp/restore.dump"
echo "Restore complete: ${TARGET_DB}"
Layer 2: Continuous Backup with pgBackRest
pgBackRest provides incremental physical backups + WAL archiving for point-in-time recovery. Configure it on your PostgreSQL server:
# /etc/pgbackrest/pgbackrest.conf
[global]
# S3 repository (the endpoint option is required for the s3 repo type)
repo1-type=s3
repo1-s3-bucket=my-app-pg-backups
repo1-s3-endpoint=s3.us-east-1.amazonaws.com
repo1-s3-region=us-east-1
repo1-s3-key=<ACCESS_KEY>
repo1-s3-key-secret=<SECRET_KEY>
repo1-path=/pgbackrest
# Encryption
repo1-cipher-type=aes-256-cbc
repo1-cipher-pass=<STRONG_RANDOM_PASSPHRASE>
# Retention: 4 full backups, 14 differential backups (counts, not days),
# and continuous WAL for the 14 most recent backups
repo1-retention-full=4
repo1-retention-diff=14
repo1-retention-archive=14
# Compression: lz4 is fast; zst trades CPU for a better ratio
compress-type=lz4
[myapp]
pg1-path=/var/lib/postgresql/16/main
pg1-user=postgres
# postgresql.conf additions for WAL archiving
archive_mode = on
archive_command = 'pgbackrest --stanza=myapp archive-push %p'
wal_level = replica  # 'replica' or higher is required for WAL archiving
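Once archiving is on, confirm segments are actually shipping rather than silently piling up. One way (the helper is ours, not part of pgBackRest) is to read PostgreSQL's standard pg_stat_archiver view, which counts archive_command successes and failures; connection settings come from the usual PG* environment variables:

```shell
#!/bin/bash
# archiver_healthy — returns non-zero if archive_command has been failing.
# Note: failed_count is cumulative since the last statistics reset.
archiver_healthy() {
  local counts failed
  counts=$(psql --tuples-only --no-align \
    --command="SELECT archived_count || '|' || failed_count FROM pg_stat_archiver;")
  failed="${counts#*|}"
  if [ "${failed}" -gt 0 ]; then
    echo "WAL archiving FAILING: ${failed} failed archive attempts"
    return 1
  fi
  echo "WAL archiving healthy: ${counts%|*} segments archived"
}
```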
# Initial setup (run once)
pgbackrest --stanza=myapp stanza-create
pgbackrest --stanza=myapp check   # verifies archive_command and repo connectivity
# Schedule in /etc/cron.d (system crontab format: schedule, user, command)
# Full backup: weekly (Sunday 2am)
0 2 * * 0 postgres /usr/bin/pgbackrest --stanza=myapp --type=full backup
# Differential backup: daily (2am)
0 2 * * 1-6 postgres /usr/bin/pgbackrest --stanza=myapp --type=diff backup
# Check backup status
pgbackrest --stanza=myapp info
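The info command also emits JSON (--output=json), which makes backup freshness scriptable. A sketch, assuming jq is installed (the function name is ours; timestamps in the payload are epoch seconds):

```shell
#!/bin/bash
# latest_backup_age — seconds since the newest backup in a
# `pgbackrest info --output=json` payload read from stdin.
latest_backup_age() {
  local stop
  stop=$(jq -r '.[0].backup[-1].timestamp.stop')
  echo $(( $(date +%s) - stop ))
}
# Example wiring (not run here):
#   pgbackrest --stanza=myapp --output=json info | latest_backup_age
```

Alert when the age exceeds your differential interval plus some slack, e.g. 26 hours for a daily schedule.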
Point-in-Time Recovery (PITR)
# Restore to a specific point in time (e.g., 1 minute before a bad migration)
# This is your "undo button" for database disasters
# 1. Stop the database
systemctl stop postgresql
# 2. Restore from the most recent backup before target time
pgbackrest --stanza=myapp restore \
--target="2027-04-30 14:29:59" \
--target-action=promote \
--type=time \
--delta # Only replace changed files (faster)
# 3. Start PostgreSQL — it will replay WAL up to the target time
systemctl start postgresql
# 4. Verify
psql -c "SELECT NOW();" -c "SELECT COUNT(*) FROM orders;"
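One more verification step worth scripting (the helper name is ours): pg_is_in_recovery() should return false once --target-action=promote has completed and the server has left recovery mode.

```shell
#!/bin/bash
# promoted — true once the restored server has exited recovery mode.
# psql connection settings come from the usual PG* environment variables.
promoted() {
  local flag
  flag=$(psql --tuples-only --no-align --command="SELECT pg_is_in_recovery();")
  [ "${flag}" = "f" ]
}
```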
Layer 3: RDS Automated Backups
If you're on AWS RDS or Aurora, automated backups are built in. Configure them properly:
# terraform/rds.tf
resource "aws_db_instance" "postgres" {
identifier = "${var.app_name}-${var.environment}"
engine = "postgres"
engine_version = "16.2"
instance_class = "db.t4g.medium"
# Storage
allocated_storage = 100
max_allocated_storage = 500 # Auto-scaling storage
storage_type = "gp3"
storage_encrypted = true
kms_key_id = aws_kms_key.rds.arn
# Backup configuration
backup_retention_period = 30 # Days (max 35)
backup_window = "03:00-04:00" # UTC — low-traffic window
maintenance_window = "sun:04:00-sun:05:00"
copy_tags_to_snapshot = true
delete_automated_backups = false # Keep backups even after instance deletion
# Enable PITR (enabled by default when backup_retention_period > 0)
# RDS PITR granularity: 5 minutes
# Multi-AZ for production (automatic failover)
multi_az = var.environment == "production"
# Final snapshot before destroy
final_snapshot_identifier = "${var.app_name}-${var.environment}-final"
skip_final_snapshot = var.environment != "production"
tags = { Environment = var.environment }
}
# Cross-region backup copy (disaster recovery)
resource "aws_db_instance_automated_backups_replication" "cross_region" {
count = var.environment == "production" ? 1 : 0
source_db_instance_arn = aws_db_instance.postgres.arn
retention_period = 7 # Keep 7 days of cross-region backups
kms_key_id = aws_kms_key.rds_dr.arn
provider = aws.dr_region # Secondary region provider
}
RDS: Point-in-Time Restore
# Restore RDS to a specific point in time via AWS CLI
aws rds restore-db-instance-to-point-in-time \
--source-db-instance-identifier myapp-production \
--target-db-instance-identifier myapp-production-restore-20270430 \
--restore-time 2027-04-30T14:29:59Z \
--db-instance-class db.t4g.medium \
--no-multi-az # Don't need HA for a temporary restore
# Wait for restoration to complete
aws rds wait db-instance-available \
--db-instance-identifier myapp-production-restore-20270430
echo "Restore complete — connect to myapp-production-restore-20270430"
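The restored instance gets a fresh endpoint; nothing connects to it until you look that endpoint up and point your checks (or application) at it. A lookup sketch (the instance identifier is the hypothetical one from above; the --query JMESPath expression is standard AWS CLI):

```shell
#!/bin/bash
# restored_endpoint — print the hostname of a restored RDS instance.
restored_endpoint() {
  aws rds describe-db-instances \
    --db-instance-identifier "$1" \
    --query 'DBInstances[0].Endpoint.Address' \
    --output text
}
# Example (not run here):
#   psql "host=$(restored_endpoint myapp-production-restore-20270430) dbname=myapp"
```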
Backup Testing: The Critical Step Most Teams Skip
#!/bin/bash
# scripts/test-restore.sh — run monthly in CI or manually
# Tests that backups are actually restorable
set -euo pipefail
RESTORE_DB="backup_test_$(date +%s)"
LATEST_BACKUP=$(aws s3 ls "s3://${S3_BACKUPS_BUCKET}/postgres/logical/" \
--recursive | sort | tail -1 | awk '{print $4}')
echo "Testing restore from: ${LATEST_BACKUP}"
# Restore to temporary database
./pg-restore.sh "s3://${S3_BACKUPS_BUCKET}/${LATEST_BACKUP}" "${RESTORE_DB}"
# Sanity checks
ROW_COUNT=$(psql \
--host="${POSTGRES_HOST}" \
--username="${POSTGRES_USER}" \
--dbname="${RESTORE_DB}" \
--tuples-only \
--command="SELECT COUNT(*) FROM users;" | tr -d '[:space:]')
echo "User count in restored backup: ${ROW_COUNT}"
if [ "${ROW_COUNT}" -lt "1" ]; then
echo "ERROR: Restore test failed — no users found"
exit 1
fi
# Run a few more checks
psql \
--host="${POSTGRES_HOST}" \
--username="${POSTGRES_USER}" \
--dbname="${RESTORE_DB}" \
--command="SELECT MAX(created_at) FROM orders; SELECT relname, n_live_tup FROM pg_stat_user_tables ORDER BY n_live_tup DESC LIMIT 5;"
# Clean up
dropdb --host="${POSTGRES_HOST}" --username="${POSTGRES_USER}" "${RESTORE_DB}"
echo "Restore test PASSED — backup from ${LATEST_BACKUP} is valid"
Backup Cost Estimates
| Approach | Storage | Monthly Cost (100GB DB) |
|---|---|---|
| pg_dump to S3 (daily, 30-day retention) | ~3TB (30 daily 100GB dumps; less with compression) | ~$70/month |
| pgBackRest to S3 (full weekly + WAL) | ~500GB | ~$12/month |
| RDS Automated Backup (30-day retention) | ~3TB | Free up to 100% of DB size; ~$70/month above |
| RDS Cross-region copy (7-day) | ~700GB | ~$17/month + data transfer |
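Storage dominates these estimates. A quick sanity check on the S3 rows, assuming S3 Standard at roughly $0.023/GB-month (us-east-1 list price; request and transfer costs excluded):

```shell
#!/bin/bash
# s3_monthly_cost_usd — GB stored × ~$0.023/GB-month (S3 Standard)
s3_monthly_cost_usd() {
  awk -v g="$1" 'BEGIN { printf "%.2f\n", g * 0.023 }'
}
s3_monthly_cost_usd 3000   # 30 daily 100GB dumps → 69.00
s3_monthly_cost_usd 500    # pgBackRest full + WAL → 11.50
```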
See Also
- PostgreSQL Performance Tuning
- PostgreSQL Schema Migrations with Prisma
- AWS RDS Aurora Serverless
- AWS RDS Aurora Multi-Region
- Terraform State Management
Working With Viprasol
A backup strategy that's never been tested isn't a strategy — it's wishful thinking. Our team designs multi-layer backup architectures with tested restore procedures, documented RTOs, and monthly restore drills. We configure pg_dump for logical backups, pgBackRest for continuous WAL archiving and PITR, and RDS automated backups with cross-region replication for regulated workloads.
What we deliver:
- pg_dump backup script with S3 upload, integrity check, and 30-day retention
- pgBackRest configuration: full weekly + differential daily + WAL archiving
- PITR restore procedure with runbook (tested before you need it)
- Terraform: RDS with 30-day backup retention, Multi-AZ, cross-region copy
- Monthly restore test script with sanity checks (row counts, table sizes)
Talk to our team about your database backup and recovery strategy →
Or explore our cloud infrastructure services.
About the Author
Viprasol Tech Team
Custom Software Development Specialists
The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 100+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement. Based in India, serving clients globally.