
Zero-Downtime Database Migrations: Expand-Contract, Shadow Tables, and Safe Deploys

Run database migrations without downtime using expand-contract patterns, shadow tables, and safe deploy sequences. Covers Flyway, Liquibase, Prisma, and PostgreSQL-specific techniques.

Viprasol Tech Team
September 4, 2026
14 min read

The most dangerous moment in a software deploy is when the new application code and old database schema are simultaneously in use — or vice versa. A naive migration strategy locks tables, blocks reads, and turns a 5-minute deploy into a 45-minute outage window.

Zero-downtime migrations aren't magic. They're a set of sequenced steps that ensure the database and application code are always compatible, regardless of what's currently deployed.


Why Migrations Break Production

Three failure modes cause most migration-related outages:

1. Table locks on ALTER TABLE

Adding a NOT NULL column with a default on PostgreSQL versions before 11 (or with a volatile default on any version) rewrites the whole table while holding an ACCESS EXCLUSIVE lock — the strongest lock, blocking all reads and writes for the duration of the rewrite. On a 100GB table, that's 10–30 minutes of downtime. (A cheap guard against lock pile-ups is sketched after this list.)

2. Deploying code before the schema is ready

New code references users.phone_number before the column exists. Every request fails with a SQL error until the migration completes.

3. Deploying schema changes that break old code

You drop the users.username column, but old app instances still running (blue/green deploy mid-cutover) immediately start throwing errors.
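
For the first failure mode, a simple guard (assuming PostgreSQL) is to set lock_timeout before running DDL, so a migration that can't acquire its lock fails fast instead of queueing every other statement behind it:

-- Abort the DDL after 5s instead of stalling all traffic behind the lock
SET lock_timeout = '5s';
ALTER TABLE users ADD COLUMN phone_number TEXT;
-- On timeout, retry during a quieter window or clear the blocker first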


The Expand-Contract Pattern

The solution is to decouple schema changes from application changes across three phases:

Phase 1: EXPAND   → Add new structure, keep old structure
Phase 2: MIGRATE  → Backfill data, deploy new application code
Phase 3: CONTRACT → Remove old structure

Example: Renaming a Column

Scenario: rename users.fullname → users.full_name

-- PHASE 1: EXPAND (deploy with old app running)
-- Add new column, keep old one
ALTER TABLE users ADD COLUMN full_name TEXT;

-- Copy existing data (batch this on large tables; see the backfill
-- pattern in the NOT NULL section below)
UPDATE users SET full_name = fullname WHERE full_name IS NULL;

-- Trigger to keep both columns in sync during transition
CREATE OR REPLACE FUNCTION sync_fullname()
RETURNS TRIGGER AS $$
BEGIN
  IF TG_OP = 'INSERT' THEN
    -- OLD doesn't exist on INSERT: copy whichever column was provided
    IF NEW.full_name IS NULL THEN
      NEW.full_name := NEW.fullname;
    ELSIF NEW.fullname IS NULL THEN
      NEW.fullname := NEW.full_name;
    END IF;
  ELSIF NEW.fullname IS DISTINCT FROM OLD.fullname THEN
    NEW.full_name := NEW.fullname;
  ELSIF NEW.full_name IS DISTINCT FROM OLD.full_name THEN
    NEW.fullname := NEW.full_name;
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER sync_user_name
  BEFORE INSERT OR UPDATE ON users
  FOR EACH ROW EXECUTE FUNCTION sync_fullname();

// PHASE 2: MIGRATE — deploy new app code that reads/writes full_name.
// The trigger mirrors those writes into fullname, so old app instances
// (which still use fullname) keep working during the rollout.

// New application code:
export async function getUser(id: string): Promise<User> {
  const { rows } = await db.query(
    "SELECT id, email, full_name, created_at FROM users WHERE id = $1",
    [id]
  );
  return rows[0];
}
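
The write path is the mirror image: new code writes only full_name, and the trigger keeps fullname populated for old instances. A minimal sketch (updateUserName is a hypothetical helper using the same db client as above):

export async function updateUserName(id: string, fullName: string): Promise<void> {
  // Write only the new column; the sync trigger copies it into fullname
  await db.query(
    "UPDATE users SET full_name = $1 WHERE id = $2",
    [fullName, id]
  );
}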
-- PHASE 3: CONTRACT (after all old app instances are drained)
-- Drop the trigger and old column
DROP TRIGGER sync_user_name ON users;
DROP FUNCTION sync_fullname();
ALTER TABLE users DROP COLUMN fullname;


Adding NOT NULL Columns Safely

The naive approach can rewrite the whole table under the strongest lock:

-- ❌ DANGEROUS on PostgreSQL < 11 (or with a volatile default on any
-- version): full table rewrite while holding an ACCESS EXCLUSIVE lock
ALTER TABLE orders ADD COLUMN processed_at TIMESTAMPTZ NOT NULL DEFAULT NOW();

The safe approach, which you also need on newer versions whenever the value must be computed per row rather than filled with a single default:

-- STEP 1: Add nullable column (metadata-only change; the brief ACCESS EXCLUSIVE lock lasts milliseconds)
ALTER TABLE orders ADD COLUMN processed_at TIMESTAMPTZ;

-- STEP 2: Backfill in batches (runs online; a PostgreSQL 11+ procedure
-- can COMMIT between batches, releasing locks and limiting bloat)
CREATE OR REPLACE PROCEDURE backfill_processed_at()
LANGUAGE plpgsql AS $$
DECLARE
  batch_size INT := 5000;
  last_id BIGINT := 0;
  max_id BIGINT;
BEGIN
  SELECT MAX(id) INTO max_id FROM orders;

  WHILE last_id <= max_id LOOP
    UPDATE orders
    SET processed_at = created_at  -- or an appropriate default
    WHERE id > last_id
      AND id <= last_id + batch_size
      AND processed_at IS NULL;

    last_id := last_id + batch_size;
    COMMIT;                 -- release row locks after each batch
    PERFORM pg_sleep(0.1);  -- throttle to limit replication lag
  END LOOP;
END $$;

CALL backfill_processed_at();

-- STEP 3: Add the constraint as NOT VALID (skips scanning existing rows at add time)
ALTER TABLE orders
  ADD CONSTRAINT orders_processed_at_not_null
  CHECK (processed_at IS NOT NULL) NOT VALID;

-- STEP 4: Validate the constraint (takes only a SHARE UPDATE EXCLUSIVE lock, so reads and writes continue)
ALTER TABLE orders VALIDATE CONSTRAINT orders_processed_at_not_null;

-- STEP 5 (optional): Convert to a real NOT NULL after validation
-- Instant on PostgreSQL 12+, which reuses the validated CHECK constraint
-- instead of rescanning the table
ALTER TABLE orders ALTER COLUMN processed_at SET NOT NULL;
ALTER TABLE orders DROP CONSTRAINT orders_processed_at_not_null;

Adding Indexes Without Locking

-- ❌ Takes a SHARE lock: all writes are blocked for the entire index build
CREATE INDEX idx_orders_user_id ON orders(user_id);

-- ✅ Builds concurrently, taking only a SHARE UPDATE EXCLUSIVE lock (reads/writes continue)
CREATE INDEX CONCURRENTLY idx_orders_user_id ON orders(user_id);

-- Note: CONCURRENTLY is slower and can't run inside a transaction
-- If it fails midway, clean up with:
DROP INDEX CONCURRENTLY IF EXISTS idx_orders_user_id;
-- Then retry
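
A failed CONCURRENTLY build leaves an INVALID index behind that still adds write overhead, so it's worth checking for leftovers before retrying:

-- List indexes left invalid by failed CONCURRENTLY builds
SELECT c.relname AS index_name
FROM pg_class c
JOIN pg_index i ON i.indexrelid = c.oid
WHERE NOT i.indisvalid;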


Shadow Table Pattern for Structural Changes

When you need to change a table's structure dramatically (partition it, change primary key type, add a new sort column), shadow tables let you do it online:

-- Scenario: migrate users table from BIGINT id to UUID id

-- STEP 1: Create new table with target structure
CREATE TABLE users_new (
  id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  legacy_id   BIGINT UNIQUE NOT NULL,  -- keep for foreign key mapping
  email       TEXT UNIQUE NOT NULL,
  name        TEXT NOT NULL,
  created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- STEP 2: Install CDC trigger on old table
CREATE OR REPLACE FUNCTION mirror_to_users_new()
RETURNS TRIGGER AS $$
BEGIN
  IF TG_OP = 'INSERT' THEN
    INSERT INTO users_new (legacy_id, email, name, created_at)
    VALUES (NEW.id, NEW.email, NEW.name, NEW.created_at)
    ON CONFLICT (legacy_id) DO NOTHING;
  ELSIF TG_OP = 'UPDATE' THEN
    UPDATE users_new SET email = NEW.email, name = NEW.name
    WHERE legacy_id = NEW.id;
  ELSIF TG_OP = 'DELETE' THEN
    DELETE FROM users_new WHERE legacy_id = OLD.id;
  END IF;
  RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER mirror_users
  AFTER INSERT OR UPDATE OR DELETE ON users
  FOR EACH ROW EXECUTE FUNCTION mirror_to_users_new();

-- STEP 3: Backfill historical data (one pass here; batch it on large tables)
INSERT INTO users_new (legacy_id, email, name, created_at)
SELECT id, email, name, created_at FROM users
ON CONFLICT (legacy_id) DO NOTHING;

-- STEP 4: Verify row counts match
SELECT
  (SELECT COUNT(*) FROM users) AS old_count,
  (SELECT COUNT(*) FROM users_new) AS new_count;
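
-- Optional deeper check: count rows whose contents diverged
SELECT COUNT(*) AS mismatched
FROM users o
JOIN users_new n ON n.legacy_id = o.id
WHERE o.email IS DISTINCT FROM n.email
   OR o.name IS DISTINCT FROM n.name;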

-- STEP 5: Cutover (brief lock window)
BEGIN;
  LOCK TABLE users IN ACCESS EXCLUSIVE MODE;
  -- Final sync of any rows that came in during backfill
  INSERT INTO users_new (legacy_id, email, name, created_at)
  SELECT id, email, name, created_at FROM users
  ON CONFLICT (legacy_id) DO NOTHING;
  -- Swap tables
  ALTER TABLE users RENAME TO users_old;
  ALTER TABLE users_new RENAME TO users;
  -- Update sequences, views, FK references here
COMMIT;

-- STEP 6: After full validation, drop old table
-- DROP TABLE users_old;  -- Wait at least 1 week to be safe

Migration Tool Comparison

Tool           | Language     | Approach                  | Version Control     | Cloud Native
Flyway         | SQL/Java     | Versioned (V1__name.sql)  | Git-friendly        | AWS RDS support
Liquibase      | XML/SQL/YAML | Changeset-based           | Checksum validation | Docker-native
Prisma         | TypeScript   | Auto-generated            | Migration files     | Full-stack JS
golang-migrate | SQL          | Sequential versioned      | Git-friendly        | Minimal deps
Alembic        | Python       | Auto + manual             | Git-friendly        | SQLAlchemy native

Prisma Migration Safety

// schema.prisma — Prisma generates migrations from schema diffs
model User {
  id          String   @id @default(uuid())
  email       String   @unique
  fullName    String?  // Nullable first (expand phase)
  legacyName  String?  // Old column kept during transition
  createdAt   DateTime @default(now())
}

# Generate the migration in development; add --create-only to write the
# SQL file without applying it, so it can be reviewed first
npx prisma migrate dev --name add_full_name_column

# Review the generated SQL before applying
cat prisma/migrations/20260904_add_full_name_column/migration.sql

# Apply in production (no prompts)
npx prisma migrate deploy

-- prisma/migrations/20260904_add_full_name_column/migration.sql
-- Prisma generates this; review and modify if needed
ALTER TABLE "User" ADD COLUMN "fullName" TEXT;

-- Add your backfill manually before making it NOT NULL:
-- UPDATE "User" SET "fullName" = "legacyName" WHERE "fullName" IS NULL;

Deploy Sequence for Zero Downtime

// scripts/safe-deploy.ts
// Enforces correct migration + deploy ordering

interface DeployStep {
  name: string;
  phase: "pre-deploy" | "deploy" | "post-deploy";
  run: () => Promise<void>;
  rollback?: () => Promise<void>;
}

const deploySteps: DeployStep[] = [
  {
    name: "Run expand migrations",
    phase: "pre-deploy",
    run: async () => {
      // Only additive migrations: new tables, new nullable columns, new indexes
      await runMigrations({ tags: ["expand"] });
    },
    rollback: async () => {
      await rollbackMigrations({ tags: ["expand"] });
    },
  },
  {
    name: "Deploy application",
    phase: "deploy",
    run: async () => {
      await deployToECS({ desiredCount: 2, maxSurge: "100%", minHealthy: "50%" });
      await waitForHealthCheck({ timeout: 300_000 });
    },
    rollback: async () => {
      await rollbackECS();
    },
  },
  {
    name: "Run contract migrations",
    phase: "post-deploy",
    run: async () => {
      // Only after all old instances are drained
      await waitForOldInstanceDrain({ timeout: 60_000 });
      await runMigrations({ tags: ["contract"] });
    },
  },
];

async function safeDeploy() {
  const executed: DeployStep[] = [];

  for (const step of deploySteps) {
    console.log(`[${step.phase}] ${step.name}`);
    try {
      await step.run();
      executed.push(step);
    } catch (error) {
      console.error(`Step failed: ${step.name}`, error);

      // Rollback in reverse order
      for (const completedStep of executed.reverse()) {
        if (completedStep.rollback) {
          console.log(`Rolling back: ${completedStep.name}`);
          await completedStep.rollback().catch(console.error);
        }
      }
      throw error;
    }
  }
}
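
Wired into CI, the script only needs an entry point that fails the pipeline when a step throws (after the rollbacks have run):

// Entry point: a non-zero exit code fails the CI job
safeDeploy()
  .then(() => console.log("Deploy complete"))
  .catch(() => process.exit(1));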

Migration Tagging with Flyway

-- V1__create_users.sql (baseline)
CREATE TABLE users (
  id BIGSERIAL PRIMARY KEY,
  email TEXT UNIQUE NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- V2__add_full_name_expand.sql (tagged: expand)
-- flyway:tag=expand
ALTER TABLE users ADD COLUMN full_name TEXT;
ALTER TABLE users ADD COLUMN legacy_name TEXT;
UPDATE users SET legacy_name = email;  -- example backfill; substitute the real source column

-- V3__add_full_name_index.sql
-- (CONCURRENTLY can't run inside a transaction; pair this script with
--  V3__add_full_name_index.sql.conf containing executeInTransaction=false)
CREATE INDEX CONCURRENTLY idx_users_full_name ON users(full_name);

-- V4__drop_legacy_name_contract.sql (tagged: contract — run after deploy)
-- flyway:tag=contract
ALTER TABLE users DROP COLUMN legacy_name;
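
Note that open-source Flyway has no built-in tag filter; the comments above are a convention for your own deploy tooling to parse. A common alternative (a sketch, assuming migrations are split into sql/expand and sql/contract directories) is to point each phase at its own location:

# Pre-deploy: run only additive migrations
flyway -locations=filesystem:sql/expand migrate

# Post-deploy: run destructive migrations once old instances are drained
flyway -locations=filesystem:sql/contract migrate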

Cost of Migration Downtime

Table Size | Naive ALTER TABLE  | pg_repack       | Shadow Table     | CONCURRENTLY index
1 GB       | 30–120s downtime   | ~5 min online   | ~10 min online   | ~3 min online
10 GB      | 5–30 min downtime  | ~45 min online  | ~90 min online   | ~25 min online
100 GB     | 1–6h downtime      | ~8h online      | ~15h online      | ~4h online
1 TB       | Days of downtime   | 3–5 days online | 5–7 days online  | 2–3 days online

pg_repack is the best tool for large table rewrites — it creates a shadow table internally, syncs via triggers, then does a brief lock swap.

# Enable the extension (the pg_repack package must also be installed on the DB host)
psql -c "CREATE EXTENSION pg_repack;"

# Repack a table online (only a brief lock at the final table swap)
pg_repack --host=db.prod.internal --dbname=myapp --table=orders --jobs=2
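
pg_repack also supports a dry run if you want to preview the operation first:

# Print what would be repacked without doing anything
pg_repack --dry-run --host=db.prod.internal --dbname=myapp --table=orders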

Monitoring During Migrations

-- Monitor long-running migrations in real time
SELECT
  pid,
  now() - pg_stat_activity.query_start AS duration,
  query,
  state,
  wait_event_type,
  wait_event
FROM pg_stat_activity
WHERE state != 'idle'
  AND query_start < NOW() - INTERVAL '30 seconds'
ORDER BY duration DESC;

-- Check lock contention
SELECT
  blocked_locks.pid AS blocked_pid,
  blocked_activity.usename AS blocked_user,
  blocking_locks.pid AS blocking_pid,
  blocking_activity.usename AS blocking_user,
  blocked_activity.query AS blocked_statement,
  blocking_activity.query AS blocking_statement
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_stat_activity blocked_activity ON blocked_activity.pid = blocked_locks.pid
JOIN pg_catalog.pg_locks blocking_locks
  ON blocking_locks.locktype = blocked_locks.locktype
  AND blocking_locks.relation = blocked_locks.relation
  AND blocking_locks.pid != blocked_locks.pid
JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_activity.pid = blocking_locks.pid
WHERE NOT blocked_locks.granted;
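
If the lock query shows your migration stuck behind a stray long-running transaction, you can clear the blocker, gently first:

-- Cancel the blocking query (use blocking_pid from the query above);
-- escalate to pg_terminate_backend only if cancel doesn't work
SELECT pg_cancel_backend(12345);
-- SELECT pg_terminate_backend(12345);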


Working With Viprasol

Migration accidents are among the most costly engineering incidents — and they're almost entirely preventable with the right sequencing strategy. Our team has run zero-downtime migrations on PostgreSQL databases ranging from 10GB to 8TB, using expand-contract, shadow tables, and pg_repack.
