Database Schema Migration: Zero-Downtime Patterns for Production PostgreSQL
Database schema migrations are the highest-risk operation in most production systems. A poorly planned migration can lock tables for minutes, take a read replica offline, or corrupt data in ways that aren't immediately obvious.
The techniques for zero-downtime migrations are well-established but not widely practiced. This guide covers the patterns that work — expand/contract, online index builds, safe column operations, and table restructuring — with production-grade PostgreSQL examples.
Why Migrations Cause Downtime
Most migration-related outages come from:
- Exclusive table locks — `ALTER TABLE ... ADD COLUMN ... NOT NULL` takes an `ACCESS EXCLUSIVE` lock that blocks all reads and writes
- Long-running migrations — a large `UPDATE` that touches millions of rows holds its row locks for the duration of the transaction
- Version incompatibility — new code deployed before the migration runs (or after the migration runs, but before old code is retired)
- Constraint validation scans — adding a `NOT NULL` or foreign-key constraint triggers a full table scan to validate existing rows while a lock is held
Understanding PostgreSQL's locking model is the foundation:
-- What locks are active right now?
-- (pid exists in both catalogs, so qualify it to avoid an ambiguity error)
SELECT
  l.pid,
  l.relation::regclass AS table,
  l.mode,
  l.granted,
  a.query_start,
  a.query
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.relation IS NOT NULL
ORDER BY a.query_start;
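Beyond observing locks, it helps to cap how long any migration statement may wait for one. A common guard, sketched here with illustrative timeout values: if the DDL cannot acquire its lock within the window, it fails fast and can be retried off-peak instead of queueing behind (and blocking) normal traffic.

```sql
-- Fail fast instead of queueing: if the ALTER cannot acquire its lock
-- within 5 seconds, it errors out and can be retried later.
SET lock_timeout = '5s';
SET statement_timeout = '60s';  -- also cap the total runtime of the DDL
ALTER TABLE users ADD COLUMN username TEXT;
```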
The Expand/Contract Pattern
The core pattern for zero-downtime schema changes. Split every breaking change into three phases across three separate deployments.
Phase 1: EXPAND — Add new structure, keep old structure
Phase 2: MIGRATE — Backfill data, update application to use new structure
Phase 3: CONTRACT — Remove old structure
At each phase boundary, a full deployment cycle occurs. Old and new application versions run simultaneously. The schema must be compatible with both.
Worked Example: Renaming a Column
Renaming a column (user_name → username) is one of the most common "simple" operations that break production:
-- ❌ WRONG: Single-step rename — breaks any running app instances using old name
ALTER TABLE users RENAME COLUMN user_name TO username;
-- ✅ CORRECT: Three phases
-- PHASE 1: EXPAND — Add new column, keep old column
ALTER TABLE users ADD COLUMN username TEXT;
-- Sync via trigger: writes to user_name automatically propagate to username
CREATE OR REPLACE FUNCTION sync_username()
RETURNS TRIGGER AS $$
BEGIN
  -- OLD is not assigned on INSERT, so branch on TG_OP before touching it
  IF TG_OP = 'INSERT' THEN
    NEW.username := NEW.user_name;
  ELSIF NEW.user_name IS DISTINCT FROM OLD.user_name THEN
    NEW.username := NEW.user_name;
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trg_sync_username
BEFORE INSERT OR UPDATE ON users
FOR EACH ROW EXECUTE FUNCTION sync_username();
-- Application: write to user_name (old behavior preserved), can read from either
-- PHASE 2: MIGRATE — Backfill + update application
-- Backfill existing rows in id-range batches.
-- Note: a DO block runs in a single transaction, so row locks accumulate
-- until the whole block commits. For very large tables, drive this loop
-- from an external script that commits after each batch instead.
DO $$
DECLARE
  batch_size INT := 10000;
  last_id BIGINT := 0;
  max_id BIGINT;
BEGIN
  SELECT MAX(id) INTO max_id FROM users;
  WHILE last_id < max_id LOOP
    UPDATE users
    SET username = user_name
    WHERE id > last_id
      AND id <= last_id + batch_size
      AND username IS NULL;
    last_id := last_id + batch_size;
    PERFORM pg_sleep(0.1); -- Brief pause to reduce lock contention
  END LOOP;
END $$;
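Because a DO block commits only once at the end, a common alternative is to drive the backfill from the application side, committing after every batch so row locks are released as the job progresses. A minimal sketch, assuming a psycopg-style connection object; the `run_backfill` helper and its parameters are illustrative, not part of any library:

```python
import time

def batch_ranges(max_id: int, batch_size: int):
    """Yield half-open (low, high] id ranges covering 1..max_id."""
    low = 0
    while low < max_id:
        yield low, low + batch_size
        low += batch_size

# Parameterized batch UPDATE, matching the SQL backfill above
BACKFILL_SQL = """
    UPDATE users
       SET username = user_name
     WHERE id > %s AND id <= %s
       AND username IS NULL
"""

def run_backfill(conn, batch_size: int = 10_000, pause: float = 0.1) -> None:
    with conn.cursor() as cur:
        cur.execute("SELECT COALESCE(MAX(id), 0) FROM users")
        max_id = cur.fetchone()[0]
    for low, high in batch_ranges(max_id, batch_size):
        with conn.cursor() as cur:
            cur.execute(BACKFILL_SQL, (low, high))
        conn.commit()      # commit per batch: row locks released immediately
        time.sleep(pause)  # brief pause to reduce contention
```

The per-batch commit is the point: unlike the DO block, a crash or cancel partway through loses only the current batch, and concurrent writers are never blocked for longer than one batch.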
-- Add NOT NULL after backfill (this validates with a full table scan under
-- an ACCESS EXCLUSIVE lock, so run it off-peak)
ALTER TABLE users ALTER COLUMN username SET NOT NULL;
-- Application: now reads from username, writes to both (still backward compatible)
-- PHASE 3: CONTRACT — Remove old structure
-- Drop trigger first
DROP TRIGGER trg_sync_username ON users;
DROP FUNCTION sync_username();
-- Drop old column
ALTER TABLE users DROP COLUMN user_name;
-- Application: writes only to username
Safe Index Creation
CREATE INDEX takes a SHARE lock, which blocks writes. On a large table, this is unacceptable in production.
-- ❌ Blocks writes during index creation
CREATE INDEX idx_orders_customer_id ON orders(customer_id);
-- ✅ CONCURRENT build — no write lock, runs in background
CREATE INDEX CONCURRENTLY idx_orders_customer_id ON orders(customer_id);
CONCURRENTLY is safe for production but:
- Takes 2–3x longer to build
- Cannot be run inside a transaction
- May fail if there are long-running transactions (retryable)
-- If CONCURRENT index creation fails, clean up and retry
DROP INDEX CONCURRENTLY IF EXISTS idx_orders_customer_id;
CREATE INDEX CONCURRENTLY idx_orders_customer_id ON orders(customer_id);
-- Verify the index was created successfully: an invalid index means the
-- CONCURRENT build failed and must be dropped and retried.
-- (pg_indexes does not expose indisvalid, so query the catalogs directly.)
SELECT
  c.relname AS indexname,
  i.indisvalid
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_class t ON t.oid = i.indrelid
WHERE t.relname = 'orders';
Adding a Unique Constraint
Same pattern — build index first, then add constraint:
-- Phase 1: Build unique index concurrently
CREATE UNIQUE INDEX CONCURRENTLY idx_users_email_unique ON users(email);
-- Phase 2: Add constraint using the pre-built index (fast — no scan required)
ALTER TABLE users ADD CONSTRAINT uq_users_email UNIQUE USING INDEX idx_users_email_unique;
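One pre-flight check worth running: the concurrent unique index build fails (and leaves an invalid index behind) if duplicate values already exist, so find and resolve them first. A sketch against the users table from above:

```sql
-- Find duplicate emails that would make the unique index build fail
SELECT email, COUNT(*) AS occurrences
FROM users
GROUP BY email
HAVING COUNT(*) > 1;
```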
Adding NOT NULL Columns
The classic trap: adding a NOT NULL column with a default to a large table. Before PostgreSQL 11 this rewrote the entire table under an ACCESS EXCLUSIVE lock; on PG 11+ a constant default is a metadata-only change, but the SET NOT NULL step still scans the table to validate existing rows.
-- ❌ RISKY: full table rewrite + lock on PostgreSQL 10 and earlier
ALTER TABLE events ADD COLUMN processed BOOLEAN NOT NULL DEFAULT FALSE;
-- ✅ SAFER: three steps that behave well on any version
-- Step 1: Add nullable column with default (metadata-only change on PG 11+)
ALTER TABLE events ADD COLUMN processed BOOLEAN DEFAULT FALSE;
-- Step 2: Backfill older rows if needed, then confirm nothing is NULL:
-- SELECT COUNT(*) FROM events WHERE processed IS NULL;
-- Step 3: Add the NOT NULL constraint. This takes a brief ACCESS EXCLUSIVE
-- lock and scans the table to validate, unless PostgreSQL 12+ can prove
-- non-nullness from an existing validated CHECK constraint.
ALTER TABLE events ALTER COLUMN processed SET NOT NULL;
PostgreSQL 11+ optimization: for columns with a constant default value, adding the column is a metadata-only operation (no table rewrite). Volatile defaults (e.g. random() or clock_timestamp()) still force a rewrite.
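On very large tables, the validation scan in Step 3 can be avoided on PostgreSQL 12+ by proving non-nullness with a CHECK constraint first. A sketch (the constraint name is illustrative):

```sql
-- Instant: NOT VALID skips validation of existing rows
ALTER TABLE events
  ADD CONSTRAINT events_processed_not_null
  CHECK (processed IS NOT NULL) NOT VALID;

-- Scans the table, but only under SHARE UPDATE EXCLUSIVE:
-- reads and writes continue while it runs
ALTER TABLE events VALIDATE CONSTRAINT events_processed_not_null;

-- PG 12+ uses the validated CHECK to skip its own scan
ALTER TABLE events ALTER COLUMN processed SET NOT NULL;

-- The CHECK constraint is now redundant
ALTER TABLE events DROP CONSTRAINT events_processed_not_null;
```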

Table Splitting
When a table has grown too large and needs to be decomposed:
-- Original: users table with profile + auth data mixed
-- Target: users (auth) + user_profiles (display data)
-- PHASE 1: Create new table + sync trigger
CREATE TABLE user_profiles (
  user_id UUID PRIMARY KEY REFERENCES users(id) ON DELETE CASCADE,
  display_name TEXT,
  avatar_url TEXT,
  bio TEXT,
  created_at TIMESTAMPTZ DEFAULT NOW()
);
-- Sync trigger: new/updated users automatically propagate to profiles
CREATE OR REPLACE FUNCTION sync_user_profile()
RETURNS TRIGGER AS $$
BEGIN
  INSERT INTO user_profiles (user_id, display_name, avatar_url, bio)
  VALUES (NEW.id, NEW.display_name, NEW.avatar_url, NEW.bio)
  ON CONFLICT (user_id) DO UPDATE SET
    display_name = EXCLUDED.display_name,
    avatar_url = EXCLUDED.avatar_url,
    bio = EXCLUDED.bio;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trg_sync_user_profile
AFTER INSERT OR UPDATE ON users
FOR EACH ROW EXECUTE FUNCTION sync_user_profile();
-- PHASE 2: Backfill profiles table
INSERT INTO user_profiles (user_id, display_name, avatar_url, bio)
SELECT id, display_name, avatar_url, bio FROM users
ON CONFLICT (user_id) DO NOTHING;
-- Application updated to JOIN users + user_profiles for reads
-- Application writes to both tables
-- PHASE 3: Remove old columns from users table
DROP TRIGGER trg_sync_user_profile ON users;
DROP FUNCTION sync_user_profile();
ALTER TABLE users DROP COLUMN display_name;
ALTER TABLE users DROP COLUMN avatar_url;
ALTER TABLE users DROP COLUMN bio;
Migration Tooling Comparison
| Tool | Language | Strengths | Limitations |
|---|---|---|---|
| Flyway | Java/JVM | Mature, enterprise-ready, strong versioning | Java dependency, less flexible |
| Liquibase | Java/JVM | XML/YAML/JSON migrations, diff support | Complexity, XML verbosity |
| Prisma Migrate | Node.js | Auto-generated from schema, dev-friendly | Less control for complex migrations |
| golang-migrate | Go | Lightweight, CLI-friendly, multiple drivers | Minimal — plain up/down SQL files, no diffing or autogeneration |
| Alembic | Python | SQLAlchemy integration, autogenerate | Python-specific |
| Sqitch | Perl/SQL | Pure SQL, dependency-aware, change-based | Less adoption, Perl runtime |
For most Node.js/TypeScript projects: Prisma Migrate for development, raw SQL migrations for complex zero-downtime operations.
-- prisma/migrations/20260323_add_username/migration.sql
-- Custom migration file for complex zero-downtime patterns
-- Prisma Migrate applies migration directories in order
-- Phase 1 of the username migration (deploy separately)
ALTER TABLE users ADD COLUMN username TEXT;
-- Note: CREATE INDEX CONCURRENTLY cannot run inside a transaction; if your
-- migration runner wraps each file in one, apply this step out of band
CREATE INDEX CONCURRENTLY idx_users_username ON users(username);
For Python projects: Alembic with SQLAlchemy.
# alembic/versions/2026_03_23_add_username.py
from alembic import op
import sqlalchemy as sa

# Alembic requires revision identifiers (values here are illustrative)
revision = 'add_username_expand'
down_revision = None

def upgrade():
    # Expand phase — add nullable column
    op.add_column('users', sa.Column('username', sa.Text(), nullable=True))
    # CREATE INDEX CONCURRENTLY can't run in a transaction; use Alembic's
    # autocommit block for it:
    # with op.get_context().autocommit_block():
    #     op.create_index('idx_users_username', 'users', ['username'],
    #                     postgresql_concurrently=True)

def downgrade():
    op.drop_column('users', 'username')
CI/CD Integration
Migrations must run before application deployment in most cases:
# GitHub Actions deployment pipeline
name: Deploy
on:
  push:
    branches: [main]
jobs:
  migrate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run database migrations
        env:
          DATABASE_URL: ${{ secrets.PRODUCTION_DATABASE_URL }}
        run: |
          npx prisma migrate deploy
          # Or: flyway migrate
  deploy:
    needs: migrate # Deploy only after migrations succeed
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Deploy application
        run: ./scripts/deploy.sh production
Exception: For expand/contract, the expand migration runs before the application deploy, but the contract migration runs after all old application versions are retired.
Cost of Getting It Wrong
| Incident Type | Typical Impact |
|---|---|
| Table lock during business hours | 5–30 minutes full outage |
| Failed migration mid-deploy | 1–4 hours rollback + rework |
| Data corruption from racing migrations | 1–7 days investigation + remediation |
| Missing index on new constraint | Ongoing performance degradation |
Migration incidents are expensive but almost entirely preventable with proper technique.
About the Author
Viprasol Tech Team
Custom Software Development Specialists
The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 1000+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement.