
Software Scalability: Horizontal Scaling Patterns for Web Applications

Software scalability patterns in 2026 — horizontal vs vertical scaling, database sharding, caching strategies, async job queues, and how to architect web applications that scale.

Viprasol Tech Team
March 27, 2026
13 min read


Most applications don't fail under load because of bad code. They fail because of architectural decisions made when traffic was 100 users/day — decisions that work perfectly until they don't.

Scalability isn't about rewriting everything in a faster language. It's about identifying where your system breaks under load, then addressing those bottlenecks systematically. The techniques that absorb a 10x traffic increase usually absorb 100x when applied more aggressively.

This guide covers the practical patterns: stateless services, caching layers, database read scaling, async job processing, and the signals that tell you what to fix next.


The Scalability Stack

Every web application has the same basic scaling stack:

Load Balancer (distributes requests)
        ↓
Application Servers (stateless, horizontally scalable)
        ↓
Cache Layer (Redis — avoid hitting DB for common reads)
        ↓
Database (primary for writes, replicas for reads)
        ↓
Job Queue (async work — don't block HTTP requests)
        ↓
Object Storage (S3 — files, large assets, never on disk)

Scaling any layer is straightforward once the architecture is right. Scaling the wrong layer wastes money and doesn't solve the problem.


Principle 1: Stateless Application Servers

The foundation of horizontal scaling. If your application server stores any state in memory (sessions, uploads, local file cache), you can't add more servers — requests will go to different instances and miss the state.

Common stateful patterns that block scaling:

// ❌ BAD: Session stored in server memory
app.use(session({
  secret: 'mysecret',
  resave: false,
  saveUninitialized: false,
  // No store configured = in-memory = not scalable
}));

// ✅ GOOD: Session stored in Redis — works across any number of instances
import session from 'express-session';
import RedisStore from 'connect-redis';
import { createClient } from 'redis';

const redisClient = createClient({ url: process.env.REDIS_URL });
await redisClient.connect();

app.use(session({
  store: new RedisStore({ client: redisClient }),
  secret: process.env.SESSION_SECRET!,
  resave: false,
  saveUninitialized: false,
  cookie: { secure: true, httpOnly: true, maxAge: 86400000 },
}));

// ❌ BAD: File uploads stored on local filesystem
app.post('/upload', upload.single('file'), (req, res) => {
  // req.file.path points to local disk — inaccessible to other servers
  res.json({ path: req.file.path });
});

// ✅ GOOD: Files go directly to S3
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';
import multer from 'multer';

const s3 = new S3Client({ region: 'us-east-1' });
const upload = multer({ storage: multer.memoryStorage() }); // buffer, not disk

app.post('/upload', upload.single('file'), async (req, res) => {
  const key = `uploads/${Date.now()}-${req.file.originalname}`;
  
  await s3.send(new PutObjectCommand({
    Bucket: process.env.S3_BUCKET!,
    Key: key,
    Body: req.file.buffer,
    ContentType: req.file.mimetype,
  }));

  res.json({ url: `https://${process.env.CDN_DOMAIN}/${key}` });
});


Principle 2: Caching Strategy

The fastest query is the one you don't make. A well-designed cache layer can eliminate 80–95% of database reads for read-heavy workloads.
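The leverage here is easy to quantify. A quick back-of-the-envelope helper (the function name is ours, purely illustrative):

```typescript
// How many queries per second actually reach the database at a given
// cache hit rate. Illustrative helper, not a library API.
function dbQueriesPerSecond(totalQps: number, hitRate: number): number {
  if (hitRate < 0 || hitRate > 1) throw new RangeError('hitRate must be between 0 and 1');
  return totalQps * (1 - hitRate);
}

// At 10,000 req/s, a 95% hit rate leaves roughly 500 req/s on the database.
console.log(Math.round(dbQueriesPerSecond(10_000, 0.95))); // → 500
```

Raising the hit rate from 90% to 95% halves database load — which is why cache design usually pays off more than query tuning.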

Cache-Aside Pattern (Lazy Loading)

class UserService {
  private readonly CACHE_TTL = 300; // 5 minutes

  async getUser(userId: string): Promise<User | null> {
    const cacheKey = `user:${userId}`;
    
    // 1. Check cache first
    const cached = await redis.get(cacheKey);
    if (cached) {
      return JSON.parse(cached) as User;
    }

    // 2. Cache miss — query database
    const user = await db('users').where({ id: userId }).first();
    
    if (user) {
      // 3. Populate cache for next request
      await redis.setex(cacheKey, this.CACHE_TTL, JSON.stringify(user));
    }

    return user ?? null;
  }

  async updateUser(userId: string, data: Partial<User>): Promise<User> {
    const user = await db('users').where({ id: userId }).update(data).returning('*');
    
    // Invalidate cache on write
    await redis.del(`user:${userId}`);
    
    return user[0];
  }
}

Write-Through Cache

For data that must be consistent immediately after writes:

async updateUserProfile(userId: string, profile: ProfileUpdate): Promise<void> {
  // Write to the DB, then refresh the cache from the row the DB returned —
  // merging a separately-read profile into the cache can race with concurrent writes
  const [updated] = await db('user_profiles')
    .where({ user_id: userId })
    .update(profile)
    .returning('*');

  await redis.setex(`profile:${userId}`, 300, JSON.stringify(updated));
}

What to Cache (and What Not To)

| Data | Cache? |
|------|--------|
| User profile data | ✅ Yes |
| Session data | ✅ Yes |
| Expensive aggregations | ✅ Yes |
| Rate limit counters | ✅ Yes |
| Feature flag state | ✅ Yes |
| Financial balances | ❌ No (must be consistent) |
| Inventory counts | ❌ No (stale = overselling) |
| Actively written records | ❌ No (invalidation complexity) |

Principle 3: Database Read Scaling

The database is almost always the first bottleneck. Two strategies work together:

Read Replicas

Route read queries to replicas, writes to primary:

// Two separate Knex connections — primary for writes, replica for reads
import Knex from 'knex';

const dbPrimary = Knex({
  client: 'pg',
  connection: process.env.DATABASE_PRIMARY_URL,
});

const dbReplica = Knex({
  client: 'pg',
  connection: process.env.DATABASE_REPLICA_URL,
  // Fail fast if a connection can't be acquired from the pool
  pool: { acquireTimeoutMillis: 5000 },
});

class OrderService {
  // Reads from replica
  async getOrders(userId: string): Promise<Order[]> {
    return dbReplica('orders').where({ user_id: userId }).orderBy('created_at', 'desc');
  }

  // Writes to primary
  async createOrder(data: CreateOrderData): Promise<Order> {
    const [order] = await dbPrimary('orders').insert(data).returning('*');
    return order;
  }
}

Replication lag consideration: Replica lag is typically 10–500ms. For reads immediately after a write (e.g., "show me the order I just placed"), read from primary or add a brief delay.
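One common mitigation is read-your-writes routing: remember when each user last wrote, and pin that user's reads to the primary for a window longer than your worst observed lag. A minimal in-memory sketch (in production the timestamp would live in the session or a cookie; all names here are ours):

```typescript
const PIN_WINDOW_MS = 1_000; // should exceed worst observed replica lag

const lastWriteAt = new Map<string, number>();

// Call after every write performed on behalf of a user
function recordWrite(userId: string, now: number = Date.now()): void {
  lastWriteAt.set(userId, now);
}

// Route to the primary if this user wrote within the pin window
function pickConnection(userId: string, now: number = Date.now()): 'primary' | 'replica' {
  const writtenAt = lastWriteAt.get(userId);
  return writtenAt !== undefined && now - writtenAt < PIN_WINDOW_MS
    ? 'primary'
    : 'replica';
}
```

In `OrderService` above, `createOrder` would call `recordWrite(userId)`, and `getOrders` would choose `dbPrimary` or `dbReplica` based on `pickConnection(userId)`.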

Index Optimization

Missing indexes are the most common cause of database performance problems. Every foreign key and every WHERE clause column should be indexed unless you've explicitly decided not to.

-- Find slow queries (requires pg_stat_statements extension)
SELECT
  query,
  calls,
  total_exec_time / calls AS avg_ms,
  rows / calls AS avg_rows
FROM pg_stat_statements
WHERE calls > 100
ORDER BY avg_ms DESC
LIMIT 20;

-- Find tables with sequential scans (usually means missing index)
SELECT
  relname AS table,
  seq_scan,
  idx_scan,
  ROUND(seq_scan::numeric / NULLIF(seq_scan + idx_scan, 0) * 100, 1) AS seq_scan_pct
FROM pg_stat_user_tables
WHERE seq_scan > 0
ORDER BY seq_scan DESC;

-- Check existing indexes on a table
SELECT
  indexname,
  indexdef,
  pg_size_pretty(pg_relation_size(indexname::regclass)) AS index_size
FROM pg_indexes
WHERE tablename = 'orders';
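Once these queries surface a hot sequential scan, the fix is usually a composite index matching the query's WHERE and ORDER BY columns. Table and column names below are illustrative:

```sql
-- CONCURRENTLY builds the index without blocking writes (slower build,
-- and it can't run inside a transaction block)
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_orders_user_id_created_at
  ON orders (user_id, created_at DESC);
```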


Principle 4: Async Job Processing

Any operation that takes more than 100ms should not run synchronously in an HTTP request. This includes:

  • Sending emails / SMS
  • Generating reports or exports
  • Processing images or files
  • Calling external APIs
  • Running machine learning inference
  • Sending webhooks

// ❌ BAD: Email sent synchronously — user waits for SMTP roundtrip
app.post('/signup', async (req, res) => {
  const user = await createUser(req.body);
  await sendWelcomeEmail(user.email); // blocks response for 200–800ms
  res.json({ user });
});

// ✅ GOOD: Email queued, response returned immediately
import Bull from 'bull';

const emailQueue = new Bull('email', process.env.REDIS_URL!);

app.post('/signup', async (req, res) => {
  const user = await createUser(req.body);
  
  // Queue job — returns in <5ms
  await emailQueue.add('welcome', { userId: user.id, email: user.email });
  
  res.json({ user }); // Response sent immediately
});

// Worker process (separate dyno/container)
emailQueue.process('welcome', async (job) => {
  const { userId, email } = job.data;
  await sendWelcomeEmail(email);
  await db('users').where({ id: userId }).update({ welcome_email_sent: true });
});
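Jobs fail — SMTP hiccups, rate limits — so configure retries with exponential backoff rather than letting a job die on its first error. Bull accepts `attempts` and `backoff` options at enqueue time; the resulting delay schedule looks roughly like this (pure sketch, our own helper name):

```typescript
// Exponential backoff schedule: the nth retry (1-based) waits
// baseDelayMs * 2^(n - 1) milliseconds. Illustrative helper.
function backoffDelayMs(retry: number, baseDelayMs: number): number {
  return baseDelayMs * 2 ** (retry - 1);
}

// With { attempts: 5, backoff: { type: 'exponential', delay: 1000 } }
// passed to emailQueue.add(), retries wait roughly 1s, 2s, 4s, 8s
// before the job lands in the failed set.
console.log([1, 2, 3, 4].map((n) => backoffDelayMs(n, 1000))); // → [ 1000, 2000, 4000, 8000 ]
```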

Job Queue Options (2026)

| Tool | Best For | Scaling |
|------|----------|---------|
| BullMQ (Redis) | Node.js, high throughput | Horizontal workers |
| Celery (Python) | Python, complex workflows | Horizontal workers |
| Sidekiq (Ruby) | Ruby/Rails ecosystem | Horizontal workers |
| AWS SQS + Lambda | Serverless, event-driven | Auto-scales |
| AWS SQS + ECS | Controlled scaling, cost | Manual worker scaling |
| Temporal | Complex workflows, durability | Managed or self-hosted |

Principle 5: Connection Pooling

Database connections are expensive to create (10–50ms each). Without pooling, high-concurrency applications exhaust database connections.

// PostgreSQL behind PgBouncer (connection pooler) in transaction mode,
// or pg's built-in pool directly for moderate scale
import { Pool } from 'pg';

const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 20,           // Max connections in pool
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
});

// Monitor pool health
setInterval(() => {
  console.log({
    totalCount: pool.totalCount,
    idleCount: pool.idleCount,
    waitingCount: pool.waitingCount,
  });
}, 60000);

For serverless (Lambda, Vercel Edge) — use RDS Proxy or PgBouncer. Serverless functions can create thousands of connections simultaneously; without a proxy, you'll hit PostgreSQL's connection limit.
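A rough way to pick `max` for the pool: by Little's law, the connections you need ≈ query throughput × average query duration, plus headroom for bursts. An illustrative helper (our own, not a pg API):

```typescript
// Little's law sketch: concurrent connections ≈ queries/sec × average
// query time. Headroom covers bursts; illustrative only.
function suggestedPoolSize(
  queriesPerSec: number,
  avgQueryMs: number,
  headroom = 1.5,
): number {
  return Math.ceil((queriesPerSec * avgQueryMs / 1000) * headroom);
}

// 400 queries/sec at 25ms each ≈ 10 busy connections; ×1.5 headroom → 15
console.log(suggestedPoolSize(400, 25)); // → 15
```

If the suggested size, multiplied across all app instances, exceeds what your database allows, that's the signal to put PgBouncer or RDS Proxy in front.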


Knowing What to Scale Next

Use these signals to identify your next bottleneck:

# Application layer bottleneck signals:
# - CPU > 80% on app servers while DB is idle
# - Response times increase linearly with concurrent users
# Solution: Add more app server instances (horizontal scale)

# Database bottleneck signals:
# - DB CPU > 70%
# - Slow query log filling up
# - Connection wait times increasing
# Solution: Add read replicas, optimize indexes, add caching

# Cache bottleneck signals:
# - Redis memory > 80% used
# - Cache hit rate < 70%
# - Redis CPU spikes
# Solution: Increase Redis memory, review eviction policy, add Redis cluster

# Network bottleneck signals:
# - Large response payloads (> 500KB per request)
# - Many small requests to same service
# Solution: Pagination, compression, HTTP/2, CDN for static assets

Scalability Cost Ranges (AWS, 2026)

| Tier | Traffic | Architecture | Monthly Cost |
|------|---------|--------------|--------------|
| Starter | <100K req/day | 1 ECS task + RDS t3.small | $80–$150 |
| Growth | 1M req/day | 3 ECS tasks + RDS t3.medium + ElastiCache | $300–$600 |
| Scale | 10M req/day | 5–10 ECS tasks + RDS r6g.large + replicas + ElastiCache | $1,200–$2,500 |
| Enterprise | 100M+ req/day | Multi-region, auto-scaling, Aurora + Redis cluster | $8,000–$25,000 |
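A useful way to compare tiers is cost per million requests — simple arithmetic over the table above, assuming a 30-day month (illustrative helper, not real AWS pricing math):

```typescript
// Normalize a tier's monthly cost to dollars per million requests,
// assuming a 30-day month. Illustrative helper.
function costPerMillionRequests(monthlyCostUsd: number, reqPerDay: number): number {
  return (monthlyCostUsd * 1_000_000) / (reqPerDay * 30);
}

// Growth tier midpoint: $450/month at 1M req/day → $15 per million requests
console.log(costPerMillionRequests(450, 1_000_000)); // → 15
```

Unit cost typically falls as you scale: the Scale tier midpoint ($1,850 at 10M req/day) works out to about $6 per million requests.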

Working With Viprasol

We architect and implement scalable backend systems — from stateless service refactors through caching layers, read replica setups, and async job pipelines.

Architecture review →
Cloud Solutions →

