API Gateway Patterns: Rate Limiting, Auth, Routing, and When to Build Your Own

An API gateway is the front door to your backend. Every request your clients make passes through it — which means it's also your best opportunity to enforce security, control traffic, and decouple services without scattering that logic across every microservice.

Done right, an API gateway handles authentication, rate limiting, request transformation, and routing in one place. Done wrong, it becomes a monolithic bottleneck that's harder to maintain than the services behind it.

This guide covers the patterns that work, the tradeoffs between managed and custom solutions, and the code to implement them.

What an API Gateway Actually Does

Before comparing tools, it helps to be precise about responsibilities. An API gateway sits between clients and backend services and handles:

Concern	Gateway Handles	Service Should Not Handle
Authentication	Verify JWT/API key	Re-implement token validation
Rate Limiting	Per-client request quotas	Per-endpoint throttling logic
Request Routing	Path → service mapping	Knowing about other services
SSL Termination	HTTPS → HTTP internally	Certificate management
Request/Response Transform	Header injection, body rewrite	Client-specific formatting
Observability	Centralized access logs, metrics	Per-service request logging
Circuit Breaking	Fail fast on downstream errors	Cascading retry storms

The key insight: cross-cutting concerns belong at the gateway, not inside every service.

Core Pattern 1: JWT Authentication at the Edge

A common mistake is validating JWTs inside each microservice. Every service then needs the verification key (or the JWKS endpoint configuration) and a JWT library, and a key rotation or library upgrade means touching every service.

Push auth to the gateway instead:

// gateway/middleware/auth.ts
import { FastifyRequest, FastifyReply } from 'fastify';
import { jwtVerify, createRemoteJWKSet } from 'jose';

const JWKS_URL = process.env.JWKS_URL!; // e.g. https://auth.example.com/.well-known/jwks.json

// createRemoteJWKSet fetches and caches the key set and selects the key that
// matches the token's `kid`, so verification keeps working across key rotation
// (picking keys[0] by hand breaks as soon as the issuer publishes a second key).
const jwks = createRemoteJWKSet(new URL(JWKS_URL));

// Identity headers are set only by the gateway, never accepted from clients
const IDENTITY_HEADERS = ['x-user-id', 'x-user-email', 'x-user-roles'];

export async function authMiddleware(
  request: FastifyRequest,
  reply: FastifyReply
) {
  // Drop anything the client sent under these names to prevent spoofing
  for (const name of IDENTITY_HEADERS) delete request.headers[name];

  const authHeader = request.headers.authorization;
  if (!authHeader?.startsWith('Bearer ')) {
    return reply.code(401).send({ error: 'Missing authorization header' });
  }

  const token = authHeader.slice(7);
  try {
    const { payload } = await jwtVerify(token, jwks, {
      issuer: process.env.JWT_ISSUER,
      audience: process.env.JWT_AUDIENCE,
    });
    if (!payload.sub) throw new Error('Token has no subject');

    // Inject identity headers for downstream services
    request.headers['x-user-id'] = payload.sub;
    if (typeof payload.email === 'string') {
      request.headers['x-user-email'] = payload.email;
    }
    request.headers['x-user-roles'] = JSON.stringify(payload.roles ?? []);
  } catch (err) {
    return reply.code(401).send({ error: 'Invalid or expired token' });
  }
}

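The downstream services receive x-user-id, x-user-email, and x-user-roles headers and trust that the gateway already validated the token, so they need no JWT library. That trust only holds if services are reachable solely through the gateway and the gateway discards any identity headers sent by the client.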
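
On the service side, reading the injected identity could look like the sketch below. The helper name `readGatewayIdentity` and the `GatewayIdentity` shape are hypothetical, not part of any library; the sketch assumes the header names used above and that the roles header carries a JSON array.

```typescript
// Hypothetical helper for a downstream service. It does no token validation:
// it only parses the headers the gateway is assumed to have injected.
interface GatewayIdentity {
  userId: string;
  email?: string;
  roles: string[];
}

type HeaderMap = Record<string, string | string[] | undefined>;

// Node may deliver a repeated header as an array; take the first value.
function firstValue(value: string | string[] | undefined): string | undefined {
  return Array.isArray(value) ? value[0] : value;
}

function readGatewayIdentity(headers: HeaderMap): GatewayIdentity | null {
  const userId = firstValue(headers['x-user-id']);
  if (!userId) return null; // request did not come through an authenticated route

  let roles: string[] = [];
  const rawRoles = firstValue(headers['x-user-roles']);
  if (rawRoles) {
    try {
      const parsed: unknown = JSON.parse(rawRoles);
      if (Array.isArray(parsed)) {
        roles = parsed.filter((r): r is string => typeof r === 'string');
      }
    } catch {
      // Malformed roles header: fall back to no roles rather than failing
    }
  }

  return { userId, email: firstValue(headers['x-user-email']), roles };
}
```

A service would call this once per request and return 401 or 403 itself when the result is null or lacks a required role.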

Core Pattern 2: Rate Limiting with Redis Sliding Window

Per-client rate limiting prevents abuse without punishing everyone when one bad actor spikes traffic. The sliding window algorithm is more accurate than fixed windows (which can allow 2× burst at window boundaries).

// gateway/middleware/rateLimiter.ts
import { FastifyRequest, FastifyReply } from 'fastify';
import Redis from 'ioredis';

const redis = new Redis(process.env.REDIS_URL!);

interface RateLimitOptions {
  windowMs: number;   // e.g. 60_000 for 1 minute
  maxRequests: number; // e.g. 100
}

export async function rateLimiter(
  clientId: string,
  options: RateLimitOptions
): Promise<{ allowed: boolean; remaining: number; resetAt: number }> {
  const now = Date.now();
  const windowStart = now - options.windowMs;
  const key = `rl:${clientId}`;

  // MULTI/EXEC runs the commands as one atomic unit, so concurrent requests
  // from the same client cannot interleave between the trim and the count.
  // Rejected requests are recorded too, so a client that keeps retrying stays limited.
  const tx = redis.multi();
  tx.zremrangebyscore(key, '-inf', windowStart);           // Remove old entries
  tx.zadd(key, now.toString(), `${now}-${Math.random()}`); // Add current request
  tx.zcard(key);                                           // Count requests in window
  tx.expire(key, Math.ceil(options.windowMs / 1000));      // TTL cleanup

  const results = await tx.exec();
  if (!results) throw new Error('Rate limiter transaction was aborted');
  const [countError, count] = results[2];
  if (countError) throw countError;
  const requestCount = count as number;

  const allowed = requestCount <= options.maxRequests;
  const remaining = Math.max(0, options.maxRequests - requestCount);
  const resetAt = now + options.windowMs;

  return { allowed, remaining, resetAt };
}

const DEFAULT_LIMIT: RateLimitOptions = { windowMs: 60_000, maxRequests: 100 };

// Usage in route handler
export async function rateLimitMiddleware(
  request: FastifyRequest,
  reply: FastifyReply
) {
  // Authenticated requests are keyed by user ID, anonymous ones by IP
  const clientId =
    (request.headers['x-user-id'] as string | undefined) ?? request.ip;

  const { allowed, remaining, resetAt } = await rateLimiter(clientId, DEFAULT_LIMIT);

  reply.header('X-RateLimit-Limit', DEFAULT_LIMIT.maxRequests.toString());
  reply.header('X-RateLimit-Remaining', remaining.toString());
  reply.header('X-RateLimit-Reset', resetAt.toString());

  if (!allowed) {
    const retryAfter = Math.ceil((resetAt - Date.now()) / 1000);
    reply.header('Retry-After', retryAfter.toString()); // standard header for 429
    return reply.code(429).send({
      error: 'Rate limit exceeded',
      retryAfter,
    });
  }
}

For per-route limits (e.g., stricter limits on login endpoints), pass different maxRequests values per path prefix and include the route group in the Redis key so each limit gets its own counter.
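
One way to sketch that lookup is a longest-prefix match over a small table. The prefixes and numbers below are illustrative assumptions, and `limitFor` and `rateLimitKey` are hypothetical helper names.

```typescript
interface RouteLimit {
  prefix: string;      // path prefix the limit applies to ('' = default)
  windowMs: number;
  maxRequests: number;
}

// Illustrative values only; tune them to your own traffic.
const routeLimits: RouteLimit[] = [
  { prefix: '/api/auth/login', windowMs: 60_000, maxRequests: 5 },
  { prefix: '/api/payments', windowMs: 60_000, maxRequests: 30 },
];
const defaultLimit: RouteLimit = { prefix: '', windowMs: 60_000, maxRequests: 100 };

// Longest matching prefix wins; unmatched paths fall back to the default.
function limitFor(url: string): RouteLimit {
  const path = url.split('?', 1)[0] ?? '';
  let best = defaultLimit;
  for (const limit of routeLimits) {
    const matches = path === limit.prefix || path.startsWith(limit.prefix + '/');
    if (matches && limit.prefix.length > best.prefix.length) best = limit;
  }
  return best;
}

// Separate Redis key per route group, so a strict login limit does not
// share a counter with the client's general quota.
function rateLimitKey(clientId: string, url: string): string {
  const { prefix } = limitFor(url);
  return prefix ? `rl:${prefix}:${clientId}` : `rl:${clientId}`;
}
```

The middleware would then pass `limitFor(request.url)` as the options and use `rateLimitKey` in place of the fixed `rl:${clientId}` key.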

Core Pattern 3: Request Routing with Circuit Breaking

Routing maps incoming paths to upstream services. A circuit breaker prevents cascading failures when a downstream service is unhealthy.

// gateway/router.ts
import { FastifyInstance } from 'fastify';
import CircuitBreaker from 'opossum';
import httpProxy from '@fastify/http-proxy';
import { authMiddleware } from './middleware/auth';
import { rateLimitMiddleware } from './middleware/rateLimiter';

interface ServiceConfig {
  prefix: string;
  upstream: string;
  timeout: number;
  rateLimit?: { windowMs: number; maxRequests: number };
  auth?: boolean;
}

const services: ServiceConfig[] = [
  { prefix: '/api/users',    upstream: 'http://user-service:3001',    timeout: 5000, auth: true },
  { prefix: '/api/orders',   upstream: 'http://order-service:3002',   timeout: 10000, auth: true },
  { prefix: '/api/products', upstream: 'http://product-service:3003', timeout: 5000, auth: false },
  { prefix: '/api/payments', upstream: 'http://payment-service:3004', timeout: 15000, auth: true },
];

// The breaker wraps a health probe rather than the proxied call itself.
// Assumption: each upstream exposes GET /health; adjust the path to your services.
function makeCircuitBreaker(upstream: string, timeout: number) {
  const breaker = new CircuitBreaker(
    async () => {
      const res = await fetch(`${upstream}/health`, { signal: AbortSignal.timeout(timeout) });
      if (!res.ok) throw new Error(`Upstream ${upstream} returned ${res.status}`);
      return res.status;
    },
    {
      timeout,
      errorThresholdPercentage: 50, // Open after 50% failures
      resetTimeout: 30000,           // Try again after 30s
      volumeThreshold: 5,            // Need 5 requests before counting
    }
  );

  breaker.on('open', () => {
    console.warn(`Circuit OPEN for ${upstream}`);
  });
  breaker.on('halfOpen', () => {
    console.info(`Circuit HALF-OPEN for ${upstream}: testing`);
  });

  return breaker;
}

export function registerRoutes(app: FastifyInstance) {
  for (const service of services) {
    const breaker = makeCircuitBreaker(service.upstream, service.timeout);

    app.register(httpProxy, {
      upstream: service.upstream,
      prefix: service.prefix,
      preHandler: async (request, reply) => {
        // Fail fast while the circuit is open instead of waiting on a dead upstream
        if (breaker.opened) {
          return reply.code(503).send({
            error: 'Service temporarily unavailable',
            service: service.prefix,
          });
        }
        if (service.auth) {
          await authMiddleware(request, reply);
          if (reply.sent) return reply; // auth already answered with 401
        } else {
          // Public route: drop any client-supplied identity header
          delete request.headers['x-user-id'];
        }
        await rateLimitMiddleware(request, reply);
      },
      replyOptions: {
        onError: (reply, error) => {
          // The breaker only changes state when fired: a failed proxy call
          // triggers a probe, and repeated probe failures open the circuit
          breaker.fire().catch(() => {});
          reply.code(502).send({ error: 'Bad gateway' });
        },
      },
    });
  }
}
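
To make the closed, open, and half-open states concrete, here is a minimal in-memory sketch of a count-based breaker. It is not how opossum is implemented (opossum tracks a rolling failure percentage); the class name `SimpleBreaker` and its methods are hypothetical, and the clock is injectable so the transitions can be exercised without waiting.

```typescript
type BreakerState = 'closed' | 'open' | 'halfOpen';

class SimpleBreaker {
  private state: BreakerState = 'closed';
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly resetTimeoutMs: number;
  private readonly now: () => number;

  constructor(failureThreshold: number, resetTimeoutMs: number, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;
  }

  // An open circuit becomes half-open once the reset timeout has elapsed.
  currentState(): BreakerState {
    if (this.state === 'open' && this.now() - this.openedAt >= this.resetTimeoutMs) {
      this.state = 'halfOpen';
    }
    return this.state;
  }

  // Callers check this before contacting the upstream.
  canRequest(): boolean {
    return this.currentState() !== 'open';
  }

  // Any success closes the circuit and clears the failure count.
  recordSuccess(): void {
    this.state = 'closed';
    this.failures = 0;
  }

  // A failure while half-open reopens immediately; otherwise the circuit
  // opens once consecutive failures reach the threshold.
  recordFailure(): void {
    this.failures += 1;
    if (this.currentState() === 'halfOpen' || this.failures >= this.failureThreshold) {
      this.state = 'open';
      this.openedAt = this.now();
    }
  }
}
```

A count of consecutive failures is the simplest trigger; a percentage over a rolling window, as opossum uses, reacts better when traffic volume varies.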

Managed vs Custom: When Each Makes Sense

Option	Best For	Avoid When
AWS API Gateway	AWS-native apps, serverless backends, pay-per-request	High throughput (costs spike), complex routing logic
Kong Gateway	Multi-cloud, rich plugin ecosystem, team wants GUI	Small teams, overhead of managing Kong itself
Nginx	Simple reverse proxy, SSL termination, static rules	Dynamic routing, per-user rate limits
Envoy	Service mesh sidecars, Kubernetes-native	Standalone gateway use case
Custom (Fastify/Express)	Full control, unique auth flows, embedded business logic	Teams without backend expertise to maintain it
Traefik	Docker/Kubernetes auto-discovery, simple setup	Enterprise-grade auth/rate limiting needs

The honest recommendation: Most teams should start with AWS API Gateway or Kong before building custom. The exception is when your auth or routing logic is complex enough that configuring it in a managed tool costs more engineering time than writing it.

AWS API Gateway Cost Model

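AWS API Gateway charges per request plus data transfer. The figures below are approximate list prices and change over time, so confirm them on the AWS pricing page before budgeting (the first REST API tier ends at 333M requests rather than 300M):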

Tier	HTTP API	REST API
First 300M requests/month	$1.00/million	$3.50/million
300M–1B requests/month	$0.90/million	$2.80/million
Data transfer out	$0.09/GB	$0.09/GB
Custom domain	Free	$0.025/hour

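Example: 10M requests/month on HTTP API = ~$10/month. At 1B requests/month = ~$930/month (300M at $1.00/million plus 700M at $0.90/million), before data transfer. At that scale, a self-managed Kong or Nginx on ECS becomes cheaper.

Kong on ECS Fargate (2 vCPU, 4GB): ~$70/month for the container, with no per-request charge. Throughput is limited by the task size, so budget for additional tasks as traffic grows.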
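
The tiered arithmetic can be worked through with a small function. This is a sketch using the HTTP API prices quoted in the table, with request charges only (no data transfer); `monthlyRequestCost` is a hypothetical helper name.

```typescript
interface PriceTier {
  upTo: number;            // cumulative request count where this tier ends
  pricePerMillion: number; // USD per million requests within the tier
}

// HTTP API tiers as quoted in the table above
const httpApiTiers: PriceTier[] = [
  { upTo: 300_000_000, pricePerMillion: 1.0 },
  { upTo: Infinity, pricePerMillion: 0.9 },
];

// Charge each slice of traffic at the price of the tier it falls into.
function monthlyRequestCost(requests: number, tiers: PriceTier[]): number {
  let cost = 0;
  let billed = 0;
  for (const tier of tiers) {
    if (requests <= billed) break;
    const inTier = Math.min(requests, tier.upTo) - billed;
    cost += (inTier / 1_000_000) * tier.pricePerMillion;
    billed += inTier;
  }
  return cost;
}

console.log(Math.round(monthlyRequestCost(10_000_000, httpApiTiers)));    // 10
console.log(Math.round(monthlyRequestCost(1_000_000_000, httpApiTiers))); // 930
```

Against a fixed container cost of about $70/month, the request charges alone cross that line at roughly 70M requests/month, though a single Fargate task will not necessarily serve that volume.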

Production Gateway Checklist

Before going live with any API gateway:

JWT validation at the edge, user identity injected as headers
Per-client rate limiting with Redis (not in-memory — won't survive restarts or multi-instance)
Circuit breakers on all upstream service calls
Request timeouts set per service (payments need longer than reads)
Structured access logs with request_id, user_id, upstream, latency_ms
Health check endpoint at /health bypasses auth + rate limiting
CORS headers configured once at gateway (not in each service)
Security headers (Strict-Transport-Security, X-Content-Type-Options, etc.)
mTLS for internal service-to-service (optional but recommended for sensitive services)
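
Two of these items are easy to get subtly wrong, so here is a minimal sketch of each: a structured access-log line with the fields listed above, and a check that lets /health skip auth and rate limiting. The function names and the input shape are assumptions for illustration, not part of Fastify or any logging library.

```typescript
interface AccessLogInput {
  requestId: string;
  userId?: string;      // absent on unauthenticated routes
  method: string;
  path: string;
  upstream: string;
  status: number;
  startedAtMs: number;  // timestamps from the same clock, in milliseconds
  finishedAtMs: number;
}

// One JSON object per line so log aggregators can index the fields directly.
function formatAccessLog(input: AccessLogInput): string {
  return JSON.stringify({
    request_id: input.requestId,
    user_id: input.userId ?? null,
    method: input.method,
    path: input.path,
    upstream: input.upstream,
    status: input.status,
    latency_ms: input.finishedAtMs - input.startedAtMs,
  });
}

// Paths that skip auth and rate limiting. Exact match on the path only,
// so '/health?x=1' bypasses but '/healthcheck/admin' does not.
const BYPASS_PATHS = new Set(['/health']);

function bypassesAuthAndRateLimit(url: string): boolean {
  const path = url.split('?', 1)[0] ?? '';
  return BYPASS_PATHS.has(path);
}
```

Matching the bypass list exactly, rather than by prefix, avoids accidentally exposing a protected route that happens to start with the same characters.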

Cost to Build and Operate

Approach	Setup Cost	Monthly Infra	Maintenance
AWS API Gateway	$0 setup	$10–930 (request-based)	Low
Kong on ECS (self-managed)	$2,000–5,000 config	$70–200	Medium
Custom Fastify gateway	$8,000–20,000 dev	$50–150 (ECS)	High
Nginx reverse proxy	$500–2,000 config	$20–80	Low–Medium

For most startups, AWS API Gateway HTTP API + Lambda authorizer covers the first $1M ARR with minimal ops overhead. For scale-ups handling >100M requests/month with complex routing, self-managed Kong or a custom gateway pays for itself quickly.

Working With Viprasol

We design and implement production API gateways as part of backend architecture engagements. Whether you need a managed solution configured correctly or a custom gateway built around your auth and routing requirements, our team handles the full implementation — including Redis-backed rate limiting, circuit breakers, observability, and zero-downtime deploys.

Our clients typically start seeing reduced API abuse and faster auth within the first week of deployment.

→ Talk to our backend team about your API architecture.
