Advanced API Rate Limiting in 2026: Token Bucket, Sliding Window, Redis Lua, and Tier-Based Limits
Implement production API rate limiting: token bucket vs sliding window comparison, Redis Lua atomic scripts, tier-based limits per API key, burst allowance, and rate limit headers.
Rate limiting protects your API from abuse, prevents one customer from degrading service for others, and gives you a monetization lever (tier-based limits). Getting it wrong hurts in three ways: limits that are too strict frustrate legitimate users, limits that are too loose enable abuse, and a non-atomic implementation creates race conditions that let clients bypass limits entirely.
This post covers the four main algorithms, their tradeoffs, atomic Redis implementation that eliminates race conditions, and the tier-based system that aligns limits with your pricing model.
Algorithm Comparison
| Algorithm | Memory | Smoothness | Burst Handling | Implementation |
|---|---|---|---|---|
| Fixed window | O(1) | Poor (edge spikes) | Allows 2× burst at window edge | Simple |
| Sliding window log | O(requests) | Perfect | No burst | Complex, high memory |
| Sliding window counter | O(1) | Good (approx.) | Minimal burst | Medium |
| Token bucket | O(1) | Excellent | Controlled burst | Medium |
| Leaky bucket | O(1) | Perfect | No burst | Simple |
The fixed window problem: If your limit is 100 requests/minute, a user can make 100 requests at 12:00:59 and 100 more at 12:01:00, for 200 requests in 2 seconds.
For most APIs: Token bucket (allows burst, smooth refill) or sliding window counter (simple approximation of sliding log).
Fixed Window (Simple Baseline)
```typescript
// src/middleware/rate-limit-fixed.ts
import { redis } from '@/lib/redis';

export async function fixedWindowRateLimit(
  key: string,
  limit: number,
  windowSeconds: number,
): Promise<{ allowed: boolean; remaining: number; resetAt: number }> {
  // Key rotates every window: all requests in the same window share one counter
  const window = Math.floor(Date.now() / 1000 / windowSeconds);
  const windowKey = `rl:fixed:${key}:${window}`;

  const pipeline = redis.pipeline();
  pipeline.incr(windowKey);
  pipeline.expire(windowKey, windowSeconds); // resetting the TTL is safe: the key is window-scoped
  const [[, count]] = (await pipeline.exec()) as [[null, number]];

  const remaining = Math.max(0, limit - count);
  const resetAt = (window + 1) * windowSeconds;

  return { allowed: count <= limit, remaining, resetAt };
}
```
Sliding Window Counter (Atomic via Redis Lua)
The sliding window counter approximates a true sliding window using two adjacent fixed windows weighted by how far through the current window we are. Critically, the check-and-increment must be atomic to prevent race conditions:
```lua
-- scripts/sliding-window.lua
-- Returns: {allowed (1/0), count, remaining}
-- KEYS[1] = rate limit key prefix
-- ARGV[1] = current window timestamp (floor of now/windowSeconds)
-- ARGV[2] = previous window timestamp (current - 1)
-- ARGV[3] = window duration in seconds
-- ARGV[4] = limit
-- ARGV[5] = current time in milliseconds
local current_window = KEYS[1] .. ":" .. ARGV[1]
local prev_window = KEYS[1] .. ":" .. ARGV[2]
local window_ms = tonumber(ARGV[3]) * 1000
local limit = tonumber(ARGV[4])
local now_ms = tonumber(ARGV[5])

-- How far through the current window are we? (0.0 to 1.0)
local current_window_start_ms = tonumber(ARGV[1]) * window_ms
local elapsed_fraction = (now_ms - current_window_start_ms) / window_ms

-- Weighted count: prev_window * (1 - elapsed) + current_window
local prev_count = tonumber(redis.call('GET', prev_window) or 0)
local current_count = tonumber(redis.call('GET', current_window) or 0)
local weighted = math.floor(prev_count * (1 - elapsed_fraction)) + current_count

if weighted >= limit then
  -- Rate limited: do not increment
  return {0, weighted, 0}
end

-- Increment current window; keep the key around long enough to
-- serve as the next window's "previous" counter
local new_count = redis.call('INCR', current_window)
redis.call('EXPIRE', current_window, tonumber(ARGV[3]) * 2)
return {1, weighted + 1, limit - weighted - 1}
```
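To make the weighting concrete, here is the same arithmetic in plain TypeScript (illustrative only; the production path must stay in Lua so the check-and-increment remains atomic, and `weightedCount` is our name, not part of the library above):

```typescript
// Approximate sliding-window count from two adjacent fixed windows.
// prevCount: requests in the previous window; currentCount: requests so far in this one.
// elapsedFraction: how far through the current window we are (0.0 to 1.0).
function weightedCount(prevCount: number, currentCount: number, elapsedFraction: number): number {
  return Math.floor(prevCount * (1 - elapsedFraction)) + currentCount;
}

// Halfway through the current window, half of the previous window still "counts":
// 80 previous requests contribute floor(80 * 0.5) = 40, plus 30 current = 70.
const estimate = weightedCount(80, 30, 0.5);
// estimate === 70
```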
```typescript
// src/lib/rate-limiter.ts
import { redis } from './redis';
import { readFileSync } from 'fs';

// Load the Lua script once and cache its SHA so we can call it with EVALSHA
const luaScript = readFileSync('./scripts/sliding-window.lua', 'utf-8');
let scriptSha: string | null = null;

async function getScriptSha(): Promise<string> {
  if (scriptSha) return scriptSha;
  scriptSha = (await redis.script('LOAD', luaScript)) as string;
  return scriptSha;
}

export interface RateLimitResult {
  allowed: boolean;
  count: number;
  remaining: number;
  limit: number;
  resetAt: number;     // Unix timestamp when the window resets
  retryAfter?: number; // Seconds until the next allowed request
}

export async function slidingWindowRateLimit(
  identifier: string, // e.g., "user:123" or "api_key:abc"
  limit: number,
  windowSeconds: number,
): Promise<RateLimitResult> {
  const now = Date.now();
  const windowStart = Math.floor(now / 1000 / windowSeconds);
  const sha = await getScriptSha();

  try {
    const [allowed, count, remaining] = (await redis.evalsha(
      sha,
      1,                          // numkeys
      `rl:sliding:${identifier}`, // KEYS[1]
      String(windowStart),        // ARGV[1]: current window
      String(windowStart - 1),    // ARGV[2]: previous window
      String(windowSeconds),      // ARGV[3]: window duration
      String(limit),              // ARGV[4]: limit
      String(now),                // ARGV[5]: current time ms
    )) as [number, number, number];

    const resetAt = (windowStart + 1) * windowSeconds;
    return {
      allowed: allowed === 1,
      count,
      remaining: Math.max(0, remaining),
      limit,
      resetAt,
      retryAfter: allowed === 0 ? resetAt - Math.floor(now / 1000) : undefined,
    };
  } catch (err: any) {
    // Script not loaded (e.g., after a Redis restart): reload and retry once
    if (err.message?.includes('NOSCRIPT')) {
      scriptSha = null;
      return slidingWindowRateLimit(identifier, limit, windowSeconds);
    }
    throw err;
  }
}
```
Token Bucket (Allows Controlled Burst)
```lua
-- scripts/token-bucket.lua
-- Refills at `refillRate` tokens per second, max `capacity` tokens
-- Returns: {allowed (1/0), tokens remaining, retry-after seconds}
-- KEYS[1] = bucket key
-- ARGV[1] = capacity (max tokens)
-- ARGV[2] = refill rate (tokens per second)
-- ARGV[3] = requested tokens (usually 1)
-- ARGV[4] = current time in milliseconds
local key = KEYS[1]
local capacity = tonumber(ARGV[1])
local refill_rate = tonumber(ARGV[2])
local requested = tonumber(ARGV[3])
local now_ms = tonumber(ARGV[4])

local bucket = redis.call('HMGET', key, 'tokens', 'last_refill_ms')
local tokens = tonumber(bucket[1] or capacity) -- a missing bucket starts full
local last_refill_ms = tonumber(bucket[2] or now_ms)

-- Refill tokens based on elapsed time, capped at capacity
local elapsed_seconds = (now_ms - last_refill_ms) / 1000
local refilled = math.min(capacity, tokens + elapsed_seconds * refill_rate)

if refilled < requested then
  -- Not enough tokens: update state but deny
  redis.call('HMSET', key, 'tokens', refilled, 'last_refill_ms', now_ms)
  redis.call('EXPIRE', key, math.ceil(capacity / refill_rate) + 10)
  -- Seconds until enough tokens accumulate:
  local wait = (requested - refilled) / refill_rate
  return {0, math.floor(refilled), math.ceil(wait)}
end

local new_tokens = refilled - requested
redis.call('HMSET', key, 'tokens', new_tokens, 'last_refill_ms', now_ms)
redis.call('EXPIRE', key, math.ceil(capacity / refill_rate) + 10)
return {1, math.floor(new_tokens), 0}
```
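The refill arithmetic is easier to sanity-check outside Redis. Below is a minimal in-memory mirror of the script's logic, useful for unit-testing the math; the class name `TokenBucket` and its shape are ours, and production traffic should always go through the atomic Lua path:

```typescript
// In-memory mirror of the token-bucket Lua script (for testing the math only).
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;

  constructor(private capacity: number, private refillRate: number, nowMs: number) {
    this.tokens = capacity; // buckets start full
    this.lastRefillMs = nowMs;
  }

  take(requested: number, nowMs: number): { allowed: boolean; tokens: number; retryAfter: number } {
    // Refill based on elapsed time, capped at capacity
    const elapsedSeconds = (nowMs - this.lastRefillMs) / 1000;
    const refilled = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillRate);
    this.lastRefillMs = nowMs;

    if (refilled < requested) {
      this.tokens = refilled;
      const wait = (requested - refilled) / this.refillRate;
      return { allowed: false, tokens: Math.floor(refilled), retryAfter: Math.ceil(wait) };
    }
    this.tokens = refilled - requested;
    return { allowed: true, tokens: Math.floor(this.tokens), retryAfter: 0 };
  }
}

// Capacity 5, refill 1 token/s: a burst of 5 is allowed, the 6th request is denied,
// and 2 seconds later 2 tokens have refilled.
const bucket = new TokenBucket(5, 1, 0);
for (let i = 0; i < 5; i++) bucket.take(1, 0); // drains the bucket
const denied = bucket.take(1, 0);              // allowed: false, retryAfter: 1
const later = bucket.take(1, 2000);            // allowed: true, 1 token left
```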
Tier-Based Rate Limits
```typescript
// src/lib/tier-rate-limits.ts
// Different limits per subscription tier
import { slidingWindowRateLimit, RateLimitResult } from './rate-limiter';

export type Tier = 'free' | 'pro' | 'enterprise';

interface TierLimits {
  requestsPerMinute: number;
  requestsPerHour: number;
  requestsPerDay: number;
  burstCapacity: number;        // Token bucket capacity (burst allowance)
  burstRefillPerSecond: number; // Token bucket refill rate
}

export const TIER_LIMITS: Record<Tier, TierLimits> = {
  free: {
    requestsPerMinute: 30,
    requestsPerHour: 500,
    requestsPerDay: 2_000,
    burstCapacity: 50,
    burstRefillPerSecond: 0.5,
  },
  pro: {
    requestsPerMinute: 200,
    requestsPerHour: 5_000,
    requestsPerDay: 50_000,
    burstCapacity: 300,
    burstRefillPerSecond: 3,
  },
  enterprise: {
    requestsPerMinute: 2_000,
    requestsPerHour: 50_000,
    requestsPerDay: 1_000_000,
    burstCapacity: 3_000,
    burstRefillPerSecond: 33,
  },
};

// Multi-layer rate limit check (minute + hour + day)
export async function checkTierRateLimit(
  apiKey: string,
  tier: Tier,
): Promise<RateLimitResult> {
  const limits = TIER_LIMITS[tier];

  // Check all windows in parallel. Note: a denied window does not increment its
  // own counter, but the other windows still count the attempt.
  const [minute, hour, day] = await Promise.all([
    slidingWindowRateLimit(`apikey:${apiKey}:min`, limits.requestsPerMinute, 60),
    slidingWindowRateLimit(`apikey:${apiKey}:hour`, limits.requestsPerHour, 3600),
    slidingWindowRateLimit(`apikey:${apiKey}:day`, limits.requestsPerDay, 86400),
  ]);

  // Most restrictive limit wins
  const binding = [minute, hour, day].reduce((most, current) =>
    current.remaining < most.remaining ? current : most,
  );
  const allowed = minute.allowed && hour.allowed && day.allowed;

  return {
    ...binding,
    allowed,
    limit: binding.limit,
  };
}
```
Fastify Rate Limit Middleware with Headers
```typescript
// src/middleware/rate-limit.ts
import { FastifyRequest, FastifyReply } from 'fastify';
import { getApiKeyFromRequest, getApiKeyTier } from '@/lib/auth';
import { checkTierRateLimit } from '@/lib/tier-rate-limits';
import { slidingWindowRateLimit, RateLimitResult } from '@/lib/rate-limiter';

export async function rateLimitMiddleware(
  req: FastifyRequest,
  reply: FastifyReply,
): Promise<void> {
  const apiKey = getApiKeyFromRequest(req);

  if (!apiKey) {
    // Unauthenticated: apply a strict IP-based limit
    const result = await slidingWindowRateLimit(`ip:${req.ip}`, 10, 60);
    setRateLimitHeaders(reply, result);
    if (!result.allowed) {
      reply.status(429).send(rateLimitResponse(result));
    }
    return;
  }

  const tier = await getApiKeyTier(apiKey);
  const result = await checkTierRateLimit(apiKey, tier);

  // Always set rate limit headers, even on success
  setRateLimitHeaders(reply, result);

  if (!result.allowed) {
    reply.status(429).send(rateLimitResponse(result));
  }
}

function setRateLimitHeaders(reply: FastifyReply, result: RateLimitResult): void {
  reply.header('X-RateLimit-Limit', result.limit);
  reply.header('X-RateLimit-Remaining', result.remaining);
  reply.header('X-RateLimit-Reset', result.resetAt);
  reply.header('X-RateLimit-Policy', `${result.limit};w=60`); // advertises the per-minute window
  if (!result.allowed && result.retryAfter !== undefined) {
    reply.header('Retry-After', result.retryAfter);
  }
}

function rateLimitResponse(result: RateLimitResult) {
  return {
    error: 'Too Many Requests',
    code: 'RATE_LIMIT_EXCEEDED',
    limit: result.limit,
    remaining: 0,
    resetAt: result.resetAt,
    retryAfter: result.retryAfter,
    message: `Rate limit exceeded. Retry after ${result.retryAfter} seconds.`,
    upgradeUrl: 'https://myapp.com/pricing',
  };
}
```
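On the consumer side, well-behaved clients should honor these headers rather than retrying blindly. A small helper that derives a wait time from a 429 response's headers (the function name `retryDelayMs` and the fallback values are our illustration, not part of any standard):

```typescript
// Compute how long (in ms) a client should wait after a 429 response.
// Prefers Retry-After; falls back to X-RateLimit-Reset (a Unix timestamp in seconds).
function retryDelayMs(headers: Record<string, string>, nowMs: number): number {
  const retryAfter = headers['retry-after'];
  if (retryAfter !== undefined) {
    return Number(retryAfter) * 1000;
  }
  const reset = headers['x-ratelimit-reset'];
  if (reset !== undefined) {
    return Math.max(0, Number(reset) * 1000 - nowMs);
  }
  return 1000; // no hint from the server: fall back to a conservative 1s
}

const delay = retryDelayMs({ 'retry-after': '7' }, Date.now());
// delay === 7000
```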
Rate Limit Testing
```typescript
// src/__tests__/rate-limit.test.ts
import { slidingWindowRateLimit } from '@/lib/rate-limiter';
import { redis } from '@/lib/redis';

beforeEach(async () => {
  await redis.flushdb(); // Clean state for each test
});

describe('slidingWindowRateLimit', () => {
  it('allows requests under the limit', async () => {
    for (let i = 0; i < 5; i++) {
      const result = await slidingWindowRateLimit('test:user', 10, 60);
      expect(result.allowed).toBe(true);
      expect(result.remaining).toBe(10 - (i + 1));
    }
  });

  it('blocks requests over the limit', async () => {
    for (let i = 0; i < 10; i++) {
      await slidingWindowRateLimit('test:over', 10, 60);
    }
    const result = await slidingWindowRateLimit('test:over', 10, 60);
    expect(result.allowed).toBe(false);
    expect(result.remaining).toBe(0);
    expect(result.retryAfter).toBeGreaterThan(0);
  });

  it('isolates rate limits by key', async () => {
    for (let i = 0; i < 10; i++) {
      await slidingWindowRateLimit('user:1', 10, 60);
    }
    // user:2 should be unaffected
    const result = await slidingWindowRateLimit('user:2', 10, 60);
    expect(result.allowed).toBe(true);
  });
});
```
Working With Viprasol
We implement production-grade rate limiting for APIs โ from Lua script design through tier-based limits, header standards, and monitoring dashboards.
What we deliver:
- Sliding window and token bucket Lua scripts (atomic, race-condition-free)
- Tier-based rate limits aligned with your pricing model
- Fastify/Express middleware with RFC-compliant rate limit headers
- Rate limit monitoring dashboard (requests blocked by tier, top consumers)
- Webhook rate limit handling guidance for customers
→ Discuss your API rate limiting needs: API development services
About the Author
Viprasol Tech Team
Custom Software Development Specialists
The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 100+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement. Based in India, serving clients globally.