Advanced API Rate Limiting in 2026: Token Bucket, Sliding Window, Redis Lua, and Tier-Based Limits
Implement production API rate limiting: token bucket vs sliding window comparison, Redis Lua atomic scripts, tier-based limits per API key, burst allowance, and rate limit headers.
Rate limiting protects your API from abuse, prevents one customer from degrading service for others, and gives you a monetization lever (tier-based limits). Getting it wrong hurts in three ways: limits that are too strict frustrate legitimate users, limits that are too loose enable abuse, and a non-atomic implementation creates race conditions that let clients bypass limits entirely.
This post covers the four main algorithms, their tradeoffs, atomic Redis implementation that eliminates race conditions, and the tier-based system that aligns limits with your pricing model.
Algorithm Comparison
| Algorithm | Memory | Smoothness | Burst Handling | Implementation |
|---|---|---|---|---|
| Fixed window | O(1) | Poor (edge spikes) | Allows 2× burst at window edge | Simple |
| Sliding window log | O(requests) | Perfect | No burst | Complex, high memory |
| Sliding window counter | O(1) | Good (approx.) | Minimal burst | Medium |
| Token bucket | O(1) | Excellent | Controlled burst | Medium |
| Leaky bucket | O(1) | Perfect | No burst | Simple |
The fixed window problem: If your limit is 100 requests/minute, a user can make 100 requests at 12:00:59 and 100 more at 12:01:00, for 200 requests in 2 seconds.
For most APIs: Token bucket (allows burst, smooth refill) or sliding window counter (simple approximation of sliding log).
Fixed Window (Simple Baseline)
```typescript
// src/middleware/rate-limit-fixed.ts
import { redis } from '@/lib/redis';

export async function fixedWindowRateLimit(
  key: string,
  limit: number,
  windowSeconds: number,
): Promise<{ allowed: boolean; remaining: number; resetAt: number }> {
  // Key rotates every window: all requests in the same window share one counter
  const window = Math.floor(Date.now() / 1000 / windowSeconds);
  const windowKey = `rl:fixed:${key}:${window}`;

  const pipeline = redis.pipeline();
  pipeline.incr(windowKey);
  pipeline.expire(windowKey, windowSeconds); // resetting the TTL is safe: the key is window-scoped
  const [[, count]] = (await pipeline.exec()) as [[null, number]];

  const remaining = Math.max(0, limit - count);
  const resetAt = (window + 1) * windowSeconds;

  return { allowed: count <= limit, remaining, resetAt };
}
```
Sliding Window Counter (Atomic via Redis Lua)
The sliding window counter approximates a true sliding window using two adjacent fixed windows weighted by how far through the current window we are. Critically, the check-and-increment must be atomic to prevent race conditions:
```lua
-- scripts/sliding-window.lua
-- Returns: {allowed (1/0), count, remaining}
-- KEYS[1] = rate limit key prefix
-- ARGV[1] = current window timestamp (floor of now/windowSeconds)
-- ARGV[2] = previous window timestamp (current - 1)
-- ARGV[3] = window duration in seconds
-- ARGV[4] = limit
-- ARGV[5] = current time in milliseconds
local current_window = KEYS[1] .. ":" .. ARGV[1]
local prev_window = KEYS[1] .. ":" .. ARGV[2]
local window_ms = tonumber(ARGV[3]) * 1000
local limit = tonumber(ARGV[4])
local now_ms = tonumber(ARGV[5])

-- How far through the current window are we? (0.0 to 1.0)
local current_window_start_ms = tonumber(ARGV[1]) * window_ms
local elapsed_fraction = (now_ms - current_window_start_ms) / window_ms

-- Weighted count: prev_window * (1 - elapsed) + current_window
local prev_count = tonumber(redis.call('GET', prev_window) or 0)
local current_count = tonumber(redis.call('GET', current_window) or 0)
local weighted = math.floor(prev_count * (1 - elapsed_fraction)) + current_count

if weighted >= limit then
  -- Rate limited: do not increment
  return {0, weighted, 0}
end

-- Increment current window; keep the key around long enough to
-- serve as the next window's "previous" counter
local new_count = redis.call('INCR', current_window)
redis.call('EXPIRE', current_window, tonumber(ARGV[3]) * 2)
return {1, weighted + 1, limit - weighted - 1}
```
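To make the weighting concrete, here is the same arithmetic in plain TypeScript (illustrative only; the production path must stay in Lua so the check-and-increment remains atomic, and `weightedCount` is our name, not part of the library above):

```typescript
// Approximate sliding-window count from two adjacent fixed windows.
// prevCount: requests in the previous window; currentCount: requests so far in this one.
// elapsedFraction: how far through the current window we are (0.0 to 1.0).
function weightedCount(prevCount: number, currentCount: number, elapsedFraction: number): number {
  return Math.floor(prevCount * (1 - elapsedFraction)) + currentCount;
}

// Halfway through the current window, half of the previous window still "counts":
// 80 previous requests contribute floor(80 * 0.5) = 40, plus 30 current = 70.
const estimate = weightedCount(80, 30, 0.5);
// estimate === 70
```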
```typescript
// src/lib/rate-limiter.ts
import { redis } from './redis';
import { readFileSync } from 'fs';

// Load the Lua script once and cache its SHA so we can call it with EVALSHA
const luaScript = readFileSync('./scripts/sliding-window.lua', 'utf-8');
let scriptSha: string | null = null;

async function getScriptSha(): Promise<string> {
  if (scriptSha) return scriptSha;
  scriptSha = (await redis.script('LOAD', luaScript)) as string;
  return scriptSha;
}

export interface RateLimitResult {
  allowed: boolean;
  count: number;
  remaining: number;
  limit: number;
  resetAt: number;     // Unix timestamp when the window resets
  retryAfter?: number; // Seconds until the next allowed request
}

export async function slidingWindowRateLimit(
  identifier: string, // e.g., "user:123" or "api_key:abc"
  limit: number,
  windowSeconds: number,
): Promise<RateLimitResult> {
  const now = Date.now();
  const windowStart = Math.floor(now / 1000 / windowSeconds);
  const sha = await getScriptSha();

  try {
    const [allowed, count, remaining] = (await redis.evalsha(
      sha,
      1,                          // numkeys
      `rl:sliding:${identifier}`, // KEYS[1]
      String(windowStart),        // ARGV[1]: current window
      String(windowStart - 1),    // ARGV[2]: previous window
      String(windowSeconds),      // ARGV[3]: window duration
      String(limit),              // ARGV[4]: limit
      String(now),                // ARGV[5]: current time ms
    )) as [number, number, number];

    const resetAt = (windowStart + 1) * windowSeconds;
    return {
      allowed: allowed === 1,
      count,
      remaining: Math.max(0, remaining),
      limit,
      resetAt,
      retryAfter: allowed === 0 ? resetAt - Math.floor(now / 1000) : undefined,
    };
  } catch (err: any) {
    // Script not loaded (e.g., after a Redis restart): reload and retry once
    if (err.message?.includes('NOSCRIPT')) {
      scriptSha = null;
      return slidingWindowRateLimit(identifier, limit, windowSeconds);
    }
    throw err;
  }
}
```
Token Bucket (Allows Controlled Burst)
```lua
-- scripts/token-bucket.lua
-- Refills at `refillRate` tokens per second, max `capacity` tokens
-- Returns: {allowed (1/0), tokens remaining, retry-after seconds}
-- KEYS[1] = bucket key
-- ARGV[1] = capacity (max tokens)
-- ARGV[2] = refill rate (tokens per second)
-- ARGV[3] = requested tokens (usually 1)
-- ARGV[4] = current time in milliseconds
local key = KEYS[1]
local capacity = tonumber(ARGV[1])
local refill_rate = tonumber(ARGV[2])
local requested = tonumber(ARGV[3])
local now_ms = tonumber(ARGV[4])

local bucket = redis.call('HMGET', key, 'tokens', 'last_refill_ms')
local tokens = tonumber(bucket[1] or capacity) -- a missing bucket starts full
local last_refill_ms = tonumber(bucket[2] or now_ms)

-- Refill tokens based on elapsed time, capped at capacity
local elapsed_seconds = (now_ms - last_refill_ms) / 1000
local refilled = math.min(capacity, tokens + elapsed_seconds * refill_rate)

if refilled < requested then
  -- Not enough tokens: update state but deny
  redis.call('HMSET', key, 'tokens', refilled, 'last_refill_ms', now_ms)
  redis.call('EXPIRE', key, math.ceil(capacity / refill_rate) + 10)
  -- Seconds until enough tokens accumulate:
  local wait = (requested - refilled) / refill_rate
  return {0, math.floor(refilled), math.ceil(wait)}
end

local new_tokens = refilled - requested
redis.call('HMSET', key, 'tokens', new_tokens, 'last_refill_ms', now_ms)
redis.call('EXPIRE', key, math.ceil(capacity / refill_rate) + 10)
return {1, math.floor(new_tokens), 0}
```
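The refill arithmetic is easier to sanity-check outside Redis. Below is a minimal in-memory mirror of the script's logic, useful for unit-testing the math; the class name `TokenBucket` and its shape are ours, and production traffic should always go through the atomic Lua path:

```typescript
// In-memory mirror of the token-bucket Lua script (for testing the math only).
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;

  constructor(private capacity: number, private refillRate: number, nowMs: number) {
    this.tokens = capacity; // buckets start full
    this.lastRefillMs = nowMs;
  }

  take(requested: number, nowMs: number): { allowed: boolean; tokens: number; retryAfter: number } {
    // Refill based on elapsed time, capped at capacity
    const elapsedSeconds = (nowMs - this.lastRefillMs) / 1000;
    const refilled = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillRate);
    this.lastRefillMs = nowMs;

    if (refilled < requested) {
      this.tokens = refilled;
      const wait = (requested - refilled) / this.refillRate;
      return { allowed: false, tokens: Math.floor(refilled), retryAfter: Math.ceil(wait) };
    }
    this.tokens = refilled - requested;
    return { allowed: true, tokens: Math.floor(this.tokens), retryAfter: 0 };
  }
}

// Capacity 5, refill 1 token/s: a burst of 5 is allowed, the 6th request is denied,
// and 2 seconds later 2 tokens have refilled.
const bucket = new TokenBucket(5, 1, 0);
for (let i = 0; i < 5; i++) bucket.take(1, 0); // drains the bucket
const denied = bucket.take(1, 0);              // allowed: false, retryAfter: 1
const later = bucket.take(1, 2000);            // allowed: true, 1 token left
```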
Tier-Based Rate Limits
```typescript
// src/lib/tier-rate-limits.ts
// Different limits per subscription tier
import { slidingWindowRateLimit, RateLimitResult } from './rate-limiter';

export type Tier = 'free' | 'pro' | 'enterprise';

interface TierLimits {
  requestsPerMinute: number;
  requestsPerHour: number;
  requestsPerDay: number;
  burstCapacity: number;        // Token bucket capacity (burst allowance)
  burstRefillPerSecond: number; // Token bucket refill rate
}

export const TIER_LIMITS: Record<Tier, TierLimits> = {
  free: {
    requestsPerMinute: 30,
    requestsPerHour: 500,
    requestsPerDay: 2_000,
    burstCapacity: 50,
    burstRefillPerSecond: 0.5,
  },
  pro: {
    requestsPerMinute: 200,
    requestsPerHour: 5_000,
    requestsPerDay: 50_000,
    burstCapacity: 300,
    burstRefillPerSecond: 3,
  },
  enterprise: {
    requestsPerMinute: 2_000,
    requestsPerHour: 50_000,
    requestsPerDay: 1_000_000,
    burstCapacity: 3_000,
    burstRefillPerSecond: 33,
  },
};

// Multi-layer rate limit check (minute + hour + day)
export async function checkTierRateLimit(
  apiKey: string,
  tier: Tier,
): Promise<RateLimitResult> {
  const limits = TIER_LIMITS[tier];

  // Check all windows in parallel. Note: a denied window does not increment its
  // own counter, but the other windows still count the attempt.
  const [minute, hour, day] = await Promise.all([
    slidingWindowRateLimit(`apikey:${apiKey}:min`, limits.requestsPerMinute, 60),
    slidingWindowRateLimit(`apikey:${apiKey}:hour`, limits.requestsPerHour, 3600),
    slidingWindowRateLimit(`apikey:${apiKey}:day`, limits.requestsPerDay, 86400),
  ]);

  // Most restrictive limit wins
  const binding = [minute, hour, day].reduce((most, current) =>
    current.remaining < most.remaining ? current : most,
  );
  const allowed = minute.allowed && hour.allowed && day.allowed;

  return {
    ...binding,
    allowed,
    limit: binding.limit,
  };
}
```
Fastify Rate Limit Middleware with Headers
```typescript
// src/middleware/rate-limit.ts
import { FastifyRequest, FastifyReply } from 'fastify';
import { getApiKeyFromRequest, getApiKeyTier } from '@/lib/auth';
import { checkTierRateLimit } from '@/lib/tier-rate-limits';
import { slidingWindowRateLimit, RateLimitResult } from '@/lib/rate-limiter';

export async function rateLimitMiddleware(
  req: FastifyRequest,
  reply: FastifyReply,
): Promise<void> {
  const apiKey = getApiKeyFromRequest(req);

  if (!apiKey) {
    // Unauthenticated: apply a strict IP-based limit
    const result = await slidingWindowRateLimit(`ip:${req.ip}`, 10, 60);
    setRateLimitHeaders(reply, result);
    if (!result.allowed) {
      reply.status(429).send(rateLimitResponse(result));
    }
    return;
  }

  const tier = await getApiKeyTier(apiKey);
  const result = await checkTierRateLimit(apiKey, tier);

  // Always set rate limit headers, even on success
  setRateLimitHeaders(reply, result);

  if (!result.allowed) {
    reply.status(429).send(rateLimitResponse(result));
  }
}

function setRateLimitHeaders(reply: FastifyReply, result: RateLimitResult): void {
  reply.header('X-RateLimit-Limit', result.limit);
  reply.header('X-RateLimit-Remaining', result.remaining);
  reply.header('X-RateLimit-Reset', result.resetAt);
  reply.header('X-RateLimit-Policy', `${result.limit};w=60`); // advertises the per-minute window
  if (!result.allowed && result.retryAfter !== undefined) {
    reply.header('Retry-After', result.retryAfter);
  }
}

function rateLimitResponse(result: RateLimitResult) {
  return {
    error: 'Too Many Requests',
    code: 'RATE_LIMIT_EXCEEDED',
    limit: result.limit,
    remaining: 0,
    resetAt: result.resetAt,
    retryAfter: result.retryAfter,
    message: `Rate limit exceeded. Retry after ${result.retryAfter} seconds.`,
    upgradeUrl: 'https://myapp.com/pricing',
  };
}
```
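On the consumer side, well-behaved clients should honor these headers rather than retrying blindly. A small helper that derives a wait time from a 429 response's headers (the function name `retryDelayMs` and the fallback values are our illustration, not part of any standard):

```typescript
// Compute how long (in ms) a client should wait after a 429 response.
// Prefers Retry-After; falls back to X-RateLimit-Reset (a Unix timestamp in seconds).
function retryDelayMs(headers: Record<string, string>, nowMs: number): number {
  const retryAfter = headers['retry-after'];
  if (retryAfter !== undefined) {
    return Number(retryAfter) * 1000;
  }
  const reset = headers['x-ratelimit-reset'];
  if (reset !== undefined) {
    return Math.max(0, Number(reset) * 1000 - nowMs);
  }
  return 1000; // no hint from the server: fall back to a conservative 1s
}

const delay = retryDelayMs({ 'retry-after': '7' }, Date.now());
// delay === 7000
```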
Rate Limit Testing
```typescript
// src/__tests__/rate-limit.test.ts
import { slidingWindowRateLimit } from '@/lib/rate-limiter';
import { redis } from '@/lib/redis';

beforeEach(async () => {
  await redis.flushdb(); // Clean state for each test
});

describe('slidingWindowRateLimit', () => {
  it('allows requests under the limit', async () => {
    for (let i = 0; i < 5; i++) {
      const result = await slidingWindowRateLimit('test:user', 10, 60);
      expect(result.allowed).toBe(true);
      expect(result.remaining).toBe(10 - (i + 1));
    }
  });

  it('blocks requests over the limit', async () => {
    for (let i = 0; i < 10; i++) {
      await slidingWindowRateLimit('test:over', 10, 60);
    }
    const result = await slidingWindowRateLimit('test:over', 10, 60);
    expect(result.allowed).toBe(false);
    expect(result.remaining).toBe(0);
    expect(result.retryAfter).toBeGreaterThan(0);
  });

  it('isolates rate limits by key', async () => {
    for (let i = 0; i < 10; i++) {
      await slidingWindowRateLimit('user:1', 10, 60);
    }
    // user:2 should be unaffected
    const result = await slidingWindowRateLimit('user:2', 10, 60);
    expect(result.allowed).toBe(true);
  });
});
```
Working With Viprasol
We implement production-grade rate limiting for APIs โ from Lua script design through tier-based limits, header standards, and monitoring dashboards.
What we deliver:
- Sliding window and token bucket Lua scripts (atomic, race-condition-free)
- Tier-based rate limits aligned with your pricing model
- Fastify/Express middleware with RFC-compliant rate limit headers
- Rate limit monitoring dashboard (requests blocked by tier, top consumers)
- Webhook rate limit handling guidance for customers
→ Discuss your API rate limiting needs: API development services
About the Author
Viprasol Tech Team
Custom Software Development Specialists
The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 100+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement. Based in India, serving clients globally.