Database Schema Migration: Zero-Downtime Patterns for Production PostgreSQL
Database schema migrations are the highest-risk operation in most production systems. A poorly planned migration can lock tables for minutes, take a read replica offline, or corrupt data in ways that aren't immediately obvious.
The techniques for zero-downtime migrations are well-established but not widely practiced. This guide covers the patterns that work — expand/contract, online index builds, safe column operations, and table restructuring — with production-grade PostgreSQL examples.
Why Migrations Cause Downtime
Most migration-related outages come from:
- Exclusive table locks — ALTER TABLE ... ADD COLUMN ... NOT NULL takes an ACCESS EXCLUSIVE lock that blocks all reads and writes while it runs
- Long-running migrations — a large UPDATE that touches millions of rows holds its row locks for the entire transaction
- Version incompatibility — new code deployed before the migration runs (or after the migration runs, but before old code is retired)
- Constraint validation scans — adding a NOT NULL or CHECK constraint forces a full sequential scan to validate existing rows, and validating a foreign key against an unindexed column is slower still
Understanding PostgreSQL's locking model is the foundation:
-- What locks are active right now?
SELECT
pid,
relation::regclass AS table,
mode,
granted,
query_start,
query
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE relation IS NOT NULL
ORDER BY query_start;
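A standard defensive measure before any production DDL is to cap how long the statement may wait for its lock. A blocked ALTER TABLE queues behind long-running transactions and, while queued, blocks every query behind it; with a lock_timeout it fails fast and can simply be retried. A minimal sketch — the timeout values and the last_login column are illustrative, not from this guide's examples:

```sql
-- Session-local settings: abort the DDL if it cannot acquire its lock
-- within 2 seconds, instead of queueing and blocking traffic behind it.
SET lock_timeout = '2s';
SET statement_timeout = '5min';

-- Retry in a loop from your migration runner if this times out.
ALTER TABLE users ADD COLUMN last_login TIMESTAMPTZ;
```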
The Expand/Contract Pattern
The core pattern for zero-downtime schema changes. Split every breaking change into three phases across three separate deployments.
Phase 1: EXPAND — Add new structure, keep old structure
Phase 2: MIGRATE — Backfill data, update application to use new structure
Phase 3: CONTRACT — Remove old structure
At each phase boundary, a full deployment cycle occurs. Old and new application versions run simultaneously. The schema must be compatible with both.
Worked Example: Renaming a Column
Renaming a column (user_name → username) is one of the most common "simple" operations that breaks production:
-- ❌ WRONG: Single-step rename — breaks any running app instances using old name
ALTER TABLE users RENAME COLUMN user_name TO username;
-- ✅ CORRECT: Three phases
-- PHASE 1: EXPAND — Add new column, keep old column
ALTER TABLE users ADD COLUMN username TEXT;
-- Sync via trigger: writes to user_name automatically propagate to username
CREATE OR REPLACE FUNCTION sync_username()
RETURNS TRIGGER AS $$
BEGIN
IF TG_OP = 'INSERT' OR NEW.user_name IS DISTINCT FROM OLD.user_name THEN
NEW.username := NEW.user_name;
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trg_sync_username
BEFORE INSERT OR UPDATE ON users
FOR EACH ROW EXECUTE FUNCTION sync_username();
-- Application: write to user_name (old behavior preserved), can read from either
-- PHASE 2: MIGRATE — Backfill + update application
-- Backfill existing rows in batches so no single transaction holds locks
-- for long. The per-batch COMMIT requires PostgreSQL 11+ and the DO block
-- must run outside an explicit transaction (e.g. psql without BEGIN).
DO $$
DECLARE
  batch_size INT := 10000;
  last_id BIGINT := 0;
  max_id BIGINT;
BEGIN
  SELECT COALESCE(MAX(id), 0) INTO max_id FROM users;
  WHILE last_id < max_id LOOP
    UPDATE users
    SET username = user_name
    WHERE id > last_id
      AND id <= last_id + batch_size
      AND username IS NULL;
    COMMIT; -- release locks after each batch (PG 11+)
    last_id := last_id + batch_size;
    PERFORM pg_sleep(0.1); -- brief pause to reduce lock and replication pressure
  END LOOP;
END $$;
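If you drive the backfill from application code rather than a DO block (often preferable, since each batch commits separately and can report progress), the id-range walk reduces to a small generator. A sketch — the function name is mine, not from any migration library:

```python
def batch_ranges(max_id: int, batch_size: int):
    """Yield half-open (exclusive_lo, inclusive_hi] id ranges up to max_id.

    Each range maps to one UPDATE ... WHERE id > lo AND id <= hi statement,
    committed independently so locks are held only per batch.
    """
    lo = 0
    while lo < max_id:
        hi = min(lo + batch_size, max_id)
        yield lo, hi
        lo = hi
```

For 25,000 rows with a batch size of 10,000 this yields (0, 10000), (10000, 20000), (20000, 25000).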
-- Add NOT NULL after backfill. Note: SET NOT NULL validates with a full
-- table scan while holding an ACCESS EXCLUSIVE lock.
ALTER TABLE users ALTER COLUMN username SET NOT NULL;
-- Application: now reads from username, writes to both (still backward compatible)
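On very large tables, that SET NOT NULL step can be made cheaper: PostgreSQL 12+ skips its validation scan if a validated CHECK constraint already proves the column is non-null, and the CHECK constraint itself can be validated without blocking writes. The constraint name below is arbitrary:

```sql
-- Instant: NOT VALID skips checking existing rows
ALTER TABLE users ADD CONSTRAINT users_username_not_null
CHECK (username IS NOT NULL) NOT VALID;

-- Scans the table, but with only a SHARE UPDATE EXCLUSIVE lock
-- (reads and writes continue)
ALTER TABLE users VALIDATE CONSTRAINT users_username_not_null;

-- Fast on PG 12+: the validated constraint proves non-nullness, no scan
ALTER TABLE users ALTER COLUMN username SET NOT NULL;

-- The CHECK constraint is now redundant
ALTER TABLE users DROP CONSTRAINT users_username_not_null;
```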
-- PHASE 3: CONTRACT — Remove old structure
-- Drop trigger first
DROP TRIGGER trg_sync_username ON users;
DROP FUNCTION sync_username();
-- Drop old column
ALTER TABLE users DROP COLUMN user_name;
-- Application: writes only to username
Safe Index Creation
CREATE INDEX takes a SHARE lock, which blocks writes. On a large table, this is unacceptable in production.
-- ❌ Blocks writes during index creation
CREATE INDEX idx_orders_customer_id ON orders(customer_id);
-- ✅ CONCURRENT build — no write lock, runs in background
CREATE INDEX CONCURRENTLY idx_orders_customer_id ON orders(customer_id);
CONCURRENTLY is safe for production but:
- Takes roughly twice as long to build (two table scans plus a wait phase)
- Cannot be run inside a transaction block
- Waits out long-running transactions, and a failure leaves an INVALID index behind that must be dropped before retrying
-- If CONCURRENT index creation fails, clean up and retry
DROP INDEX CONCURRENTLY IF EXISTS idx_orders_customer_id;
CREATE INDEX CONCURRENTLY idx_orders_customer_id ON orders(customer_id);
-- Verify the index was created successfully (indisvalid = false means a
-- failed CONCURRENTLY build left an invalid index behind)
SELECT
  c.relname AS indexname,
  i.indisvalid
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_class t ON t.oid = i.indrelid
WHERE t.relname = 'orders';
Adding a Unique Constraint
Same pattern — build index first, then add constraint:
-- Phase 1: Build unique index concurrently
CREATE UNIQUE INDEX CONCURRENTLY idx_users_email_unique ON users(email);
-- Phase 2: Add constraint using the pre-built index (fast — no scan required)
ALTER TABLE users ADD CONSTRAINT uq_users_email UNIQUE USING INDEX idx_users_email_unique;
Adding NOT NULL Columns
The classic trap: on PostgreSQL 10 and older, adding a NOT NULL column with a default rewrites the entire table under an exclusive lock.
-- ❌ RISKY on PG 10 and older, or with a volatile default:
-- full table rewrite under an ACCESS EXCLUSIVE lock
ALTER TABLE events ADD COLUMN processed BOOLEAN NOT NULL DEFAULT FALSE;
-- (On PostgreSQL 11+ with a constant default like FALSE, this same
-- statement is a metadata-only change and is actually safe.)
-- ✅ CORRECT everywhere: three steps
-- Step 1: Add nullable column with default (fast)
ALTER TABLE events ADD COLUMN processed BOOLEAN DEFAULT FALSE;
-- Step 2: Backfill any rows that predate the default, then confirm none remain:
-- SELECT COUNT(*) FROM events WHERE processed IS NULL;
-- Step 3: Add the NOT NULL constraint. This validates with a full scan;
-- on PG 12+ a validated CHECK (processed IS NOT NULL) constraint lets
-- SET NOT NULL skip the scan.
ALTER TABLE events ALTER COLUMN processed SET NOT NULL;
PostgreSQL 11+ optimization: for columns with a constant default value, adding the column is a metadata-only operation (no table rewrite). Volatile defaults (e.g. random(), clock_timestamp()) still force a full rewrite.
Table Splitting
When a table has grown too large and needs to be decomposed:
-- Original: users table with profile + auth data mixed
-- Target: users (auth) + user_profiles (display data)
-- PHASE 1: Create new table + sync trigger
CREATE TABLE user_profiles (
user_id UUID PRIMARY KEY REFERENCES users(id) ON DELETE CASCADE,
display_name TEXT,
avatar_url TEXT,
bio TEXT,
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- Sync trigger: new/updated users automatically propagate to profiles
CREATE OR REPLACE FUNCTION sync_user_profile()
RETURNS TRIGGER AS $$
BEGIN
INSERT INTO user_profiles (user_id, display_name, avatar_url, bio)
VALUES (NEW.id, NEW.display_name, NEW.avatar_url, NEW.bio)
ON CONFLICT (user_id) DO UPDATE SET
display_name = EXCLUDED.display_name,
avatar_url = EXCLUDED.avatar_url,
bio = EXCLUDED.bio;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trg_sync_user_profile
AFTER INSERT OR UPDATE ON users
FOR EACH ROW EXECUTE FUNCTION sync_user_profile();
-- PHASE 2: Backfill profiles table
INSERT INTO user_profiles (user_id, display_name, avatar_url, bio)
SELECT id, display_name, avatar_url, bio FROM users
ON CONFLICT (user_id) DO NOTHING;
-- Application updated to JOIN users + user_profiles for reads
-- Application writes to both tables
-- PHASE 3: Remove old columns from users table
DROP TRIGGER trg_sync_user_profile ON users;
DROP FUNCTION sync_user_profile();
ALTER TABLE users DROP COLUMN display_name;
ALTER TABLE users DROP COLUMN avatar_url;
ALTER TABLE users DROP COLUMN bio;
Migration Tooling Comparison
| Tool | Language | Strengths | Limitations |
|---|---|---|---|
| Flyway | Java/JVM | Mature, enterprise-ready, strong versioning | Java dependency, less flexible |
| Liquibase | Java/JVM | XML/YAML/JSON migrations, diff support | Complexity, XML verbosity |
| Prisma Migrate | Node.js | Auto-generated from schema, dev-friendly | Less control for complex migrations |
| golang-migrate | Go | Lightweight, CLI-friendly, multiple drivers | Minimal — plain up/down SQL files, no diffing or dependency tracking |
| Alembic | Python | SQLAlchemy integration, autogenerate | Python-specific |
| Sqitch | Perl/SQL | Pure SQL, dependency-aware, change-based | Less adoption, Perl runtime |
For most Node.js/TypeScript projects: Prisma Migrate for development, raw SQL migrations for complex zero-downtime operations.
-- prisma/migrations/20260323_add_username/migration.sql
-- Custom migration file for complex zero-downtime patterns;
-- Prisma applies migration directories in order.
-- Phase 1 of the username migration (deploy separately)
ALTER TABLE users ADD COLUMN username TEXT;
-- Note: Prisma wraps PostgreSQL migrations in a transaction, and
-- CREATE INDEX CONCURRENTLY cannot run inside one. Apply the index
-- out of band (e.g. manually or via a separate script).
CREATE INDEX CONCURRENTLY idx_users_username ON users(username);
For Python projects: Alembic with SQLAlchemy.
# alembic/versions/2026_03_23_add_username.py
from alembic import op
import sqlalchemy as sa

def upgrade():
    # Expand phase: add nullable column
    op.add_column('users', sa.Column('username', sa.Text(), nullable=True))
    # Note: CREATE INDEX CONCURRENTLY can't run inside a transaction.
    # Run it via op.execute() on a connection configured with
    # execution_options(isolation_level="AUTOCOMMIT")

def downgrade():
    op.drop_column('users', 'username')
CI/CD Integration
Migrations must run before application deployment in most cases:
# GitHub Actions deployment pipeline
name: Deploy
on:
  push:
    branches: [main]
jobs:
  migrate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run database migrations
        env:
          DATABASE_URL: ${{ secrets.PRODUCTION_DATABASE_URL }}
        run: |
          npx prisma migrate deploy
          # Or: flyway migrate
  deploy:
    needs: migrate # Deploy only after migrations succeed
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Deploy application
        run: ./scripts/deploy.sh production
Exception: For expand/contract, the expand migration runs before the application deploy, but the contract migration runs after all old application versions are retired.
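One hypothetical way to keep that ordering honest in CI is a naming convention: tag contract-phase migration files explicitly and have the pipeline filter them into a separately gated job that runs only after old app versions are retired. A sketch — the naming scheme and helper are illustrative, not part of any tool above:

```python
def partition_migrations(filenames):
    """Split migration filenames into (expand, contract) lists.

    Files containing "_contract_" are deferred to the post-retirement job;
    everything else is treated as expand-safe and runs before the deploy.
    Sorted so timestamp-prefixed names apply in order.
    """
    expand, contract = [], []
    for name in sorted(filenames):
        (contract if "_contract_" in name else expand).append(name)
    return expand, contract
```

Given ["20260410_contract_drop_user_name.sql", "20260323_expand_add_username.sql"], the expand file runs pre-deploy and the contract file is held back for the gated job.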
Cost of Getting It Wrong
| Incident Type | Typical Impact |
|---|---|
| Table lock during business hours | 5–30 minutes full outage |
| Failed migration mid-deploy | 1–4 hours rollback + rework |
| Data corruption from racing migrations | 1–7 days investigation + remediation |
| Missing index on new constraint | Ongoing performance degradation |
Migration incidents are expensive but almost entirely preventable with proper technique.
Working With Viprasol
We handle complex database migrations for production systems — schema redesigns, table splits, column renames, and index rebuilds — all without downtime.
Our database migration services:
- Migration strategy design and review
- Expand/contract implementation
- Backfill job development and execution
- Post-migration performance validation
→ Talk to us about your database migration →
→ Software Development Services →
See Also
- Zero-Downtime Deployment
- Technical Debt Management
- DevOps as a Service
- Database Development Services
About the Author
Viprasol Tech Team
Custom Software Development Specialists
The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 100+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement. Based in India, serving clients globally.