Next.js API Rate Limiting with Upstash Redis: Per-User, Per-IP, and Sliding Window Algorithms
Implement API rate limiting in Next.js with Upstash Redis. Covers sliding window and fixed window algorithms, per-user and per-IP limits, rate limit headers, middleware-based limiting, and cost-per-endpoint limits for LLM APIs.
Without rate limiting, a single misbehaving client can exhaust your database connections, run up your LLM API bill, or take down your service for everyone else. Rate limiting is not optional: it's a baseline requirement for any public-facing API.
Upstash Redis is the ideal choice for Next.js rate limiting: it's serverless (no persistent connection needed), globally distributed for edge deployments, and has first-class TypeScript support with an atomic sliding window implementation.
Installation
npm install @upstash/ratelimit @upstash/redis
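The limiters below read credentials from two environment variables. The variable names match what the Upstash console provides; the values shown here are placeholders:

```bash
# .env.local (values below are placeholders; copy the real ones
# from your database's page in the Upstash console)
UPSTASH_REDIS_REST_URL=https://your-database.upstash.io
UPSTASH_REDIS_REST_TOKEN=your-rest-token
```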
Rate Limiter Setup
// lib/rate-limit.ts
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";
// Shared Redis client (reused across all limiters)
const redis = new Redis({
url: process.env.UPSTASH_REDIS_REST_URL!,
token: process.env.UPSTASH_REDIS_REST_TOKEN!,
});
// Sliding window: requests distributed evenly within the window
// More accurate than fixed window (no burst at window boundary)
export const rateLimiters = {
// General API: 100 requests per minute per user
api: new Ratelimit({
redis,
limiter: Ratelimit.slidingWindow(100, "1 m"),
analytics: true, // Track usage in Upstash console
prefix: "rl:api",
}),
// Auth endpoints: stricter, 10 attempts per 15 minutes per IP
auth: new Ratelimit({
redis,
limiter: Ratelimit.slidingWindow(10, "15 m"),
analytics: true,
prefix: "rl:auth",
}),
// LLM / AI endpoints: cost-based, 20 requests per hour per user
ai: new Ratelimit({
redis,
limiter: Ratelimit.slidingWindow(20, "1 h"),
analytics: true,
prefix: "rl:ai",
}),
// Exports / heavy operations: 5 per day per user
exports: new Ratelimit({
redis,
limiter: Ratelimit.fixedWindow(5, "24 h"),
analytics: true,
prefix: "rl:exports",
}),
// IP-based: catch unauthenticated abuse
ip: new Ratelimit({
redis,
limiter: Ratelimit.slidingWindow(200, "1 m"),
prefix: "rl:ip",
}),
};
// Helper: extract IP from Next.js request
export function getIp(req: Request): string {
const forwarded = req.headers.get("x-forwarded-for");
if (forwarded) return forwarded.split(",")[0].trim();
return req.headers.get("x-real-ip") ?? "unknown";
}
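Because getIp is pure header parsing, its behavior is easy to sanity-check in isolation. A self-contained sketch (the helper is duplicated here so the snippet runs standalone; Headers is a web-standard global in Node 18+):

```typescript
// Standalone copy of the getIp helper for illustration; in the app it
// lives in lib/rate-limit.ts and receives the incoming Request.
function getIp(req: { headers: Headers }): string {
  const forwarded = req.headers.get("x-forwarded-for");
  // x-forwarded-for may hold a chain: "client, proxy1, proxy2".
  // The left-most entry is the original client.
  if (forwarded) return forwarded.split(",")[0].trim();
  return req.headers.get("x-real-ip") ?? "unknown";
}

// Simulated requests; no server needed
const proxied = { headers: new Headers({ "x-forwarded-for": "203.0.113.7, 10.0.0.1" }) };
const direct = { headers: new Headers({ "x-real-ip": "198.51.100.2" }) };
const bare = { headers: new Headers() };

console.log(getIp(proxied)); // "203.0.113.7"
console.log(getIp(direct));  // "198.51.100.2"
console.log(getIp(bare));    // "unknown"
```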
Route Handler: Per-User Rate Limiting
// lib/with-rate-limit.ts: reusable wrapper for route handlers
import { NextRequest, NextResponse } from "next/server";
import { Ratelimit } from "@upstash/ratelimit";
import { auth } from "@/auth";
import { rateLimiters, getIp } from "./rate-limit";
interface RateLimitOptions {
limiter: Ratelimit;
// Identifier: "user" (auth required), "ip", or custom fn
by?: "user" | "ip" | ((req: NextRequest) => string | Promise<string>);
}
export function withRateLimit(
handler: (req: NextRequest, ...args: any[]) => Promise<NextResponse>,
options: RateLimitOptions
) {
return async (req: NextRequest, ...args: any[]): Promise<NextResponse> => {
let identifier: string;
if (!options.by || options.by === "user") {
const session = await auth();
if (!session?.user) {
return NextResponse.json({ error: "Unauthorized" }, { status: 401 });
}
identifier = session.user.id;
} else if (options.by === "ip") {
identifier = getIp(req);
} else {
identifier = await options.by(req);
}
const { success, limit, remaining, reset } = await options.limiter.limit(identifier);
// Rate limit headers on every response help clients implement backoff
const headers: Record<string, string> = {
"X-RateLimit-Limit": String(limit),
"X-RateLimit-Remaining": String(remaining),
"X-RateLimit-Reset": String(Math.floor(reset / 1000)), // Unix seconds
};
if (!success) {
const retryAfter = Math.ceil((reset - Date.now()) / 1000);
return NextResponse.json(
{
error: "Rate limit exceeded",
message: `Too many requests. Try again in ${retryAfter} seconds.`,
retryAfter,
},
// Retry-After is only meaningful on the 429, not on successful responses
{ status: 429, headers: { ...headers, "Retry-After": String(retryAfter) } }
);
}
const response = await handler(req, ...args);
// Attach headers to successful responses too
Object.entries(headers).forEach(([key, value]) => {
response.headers.set(key, value);
});
return response;
};
}
Usage in Route Handlers
// app/api/ai/chat/route.ts: LLM endpoint with strict limits
import { NextRequest, NextResponse } from "next/server";
import { withRateLimit } from "@/lib/with-rate-limit";
import { rateLimiters } from "@/lib/rate-limit";
import { auth } from "@/auth";
async function chatHandler(req: NextRequest): Promise<NextResponse> {
const session = await auth();
// ... LLM logic
return NextResponse.json({ message: "..." });
}
export const POST = withRateLimit(chatHandler, {
limiter: rateLimiters.ai,
by: "user",
});
// app/api/auth/login/route.ts: auth endpoint with IP-based limiting
import { NextRequest, NextResponse } from "next/server";
import { withRateLimit } from "@/lib/with-rate-limit";
import { rateLimiters } from "@/lib/rate-limit";
async function loginHandler(req: NextRequest): Promise<NextResponse> {
// ... login logic
return NextResponse.json({ ok: true });
}
export const POST = withRateLimit(loginHandler, {
limiter: rateLimiters.auth,
by: "ip",
});
Middleware-Based Global Rate Limiting
// middleware.ts: apply the IP rate limit to all routes before handlers run
import { NextRequest, NextResponse } from "next/server";
import { rateLimiters, getIp } from "@/lib/rate-limit";
export async function middleware(req: NextRequest) {
const ip = getIp(req);
// Skip rate limiting for static files
const pathname = req.nextUrl.pathname;
if (pathname.startsWith("/_next") || pathname.startsWith("/images")) {
return NextResponse.next();
}
// Only rate-limit API routes at middleware level
if (pathname.startsWith("/api/")) {
const { success, reset } = await rateLimiters.ip.limit(ip);
if (!success) {
return NextResponse.json(
{ error: "Rate limit exceeded" },
{
status: 429,
headers: {
"Retry-After": String(Math.ceil((reset - Date.now()) / 1000)),
},
}
);
}
}
return NextResponse.next();
}
export const config = {
matcher: ["/((?!_next/static|_next/image|favicon.ico).*)"],
};
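The skip logic in the middleware is a pure function of the pathname, which makes it easy to factor out and test. This shouldRateLimit helper is a hypothetical extraction, not part of the original middleware:

```typescript
// Hypothetical extraction of the middleware's path logic: true only
// for API routes, false for static assets and regular pages.
function shouldRateLimit(pathname: string): boolean {
  if (pathname.startsWith("/_next") || pathname.startsWith("/images")) {
    return false;
  }
  return pathname.startsWith("/api/");
}

console.log(shouldRateLimit("/api/ai/chat"));          // true
console.log(shouldRateLimit("/_next/static/chunk.js")); // false
console.log(shouldRateLimit("/dashboard"));             // false
```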
Custom Identifier: Per Workspace
// Rate limit by workspace instead of individual user
// Useful for team plans: shared quota across all team members
export const workspaceRateLimiter = new Ratelimit({
redis,
limiter: Ratelimit.slidingWindow(1000, "1 m"),
prefix: "rl:workspace",
});
// In a route handler:
export const POST = withRateLimit(handler, {
limiter: workspaceRateLimiter,
by: async (req) => {
const session = await auth();
return session?.user?.workspaceId ?? getIp(req);
},
});
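The same custom-identifier hook extends to plan-tier-aware quotas. The sketch below is illustrative (the tier names and numbers are assumptions, not from this codebase): a pure lookup selects the quota, and at startup you would build one Ratelimit per tier from it, then pick the limiter by the user's plan:

```typescript
// Illustrative quotas per plan tier; adjust to your pricing model.
const tierQuotas = {
  free: { requests: 100, window: "1 m" },
  pro: { requests: 1_000, window: "1 m" },
  enterprise: { requests: 10_000, window: "1 m" },
} as const;

type Tier = keyof typeof tierQuotas;

// Pure lookup with a safe default: unknown or missing tiers get "free".
function quotaFor(tier: string | undefined): { requests: number; window: string } {
  const key: Tier = tier && tier in tierQuotas ? (tier as Tier) : "free";
  return tierQuotas[key];
}

// At startup, build one limiter per tier, e.g.
//   Ratelimit.slidingWindow(quotaFor("pro").requests, quotaFor("pro").window)
// then select it from session.user.plan inside withRateLimit's `by` hook.

console.log(quotaFor("pro").requests);     // 1000
console.log(quotaFor("unknown").requests); // 100
```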
Token Bucket for Burst Allowance
// Token bucket: allows short bursts while enforcing long-term rate
// e.g., 10 requests burst, refill 1 token per second
export const burstLimiter = new Ratelimit({
redis,
limiter: Ratelimit.tokenBucket(1, "1 s", 10),
// Args: refillRate (tokens added per interval), interval, maxTokens
// A user can burst up to 10 requests, then is limited to 1/second
prefix: "rl:burst",
});
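To build intuition for those semantics, here is a minimal in-memory token bucket with an injectable clock. It is illustrative only; the Upstash limiter implements the same idea atomically in Redis, so it holds across serverless instances:

```typescript
// Minimal token bucket. maxTokens caps the burst; refillRatePerSec
// restores long-term throughput. The clock is injectable so refill
// behavior is deterministic in tests.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private maxTokens: number,
    private refillRatePerSec: number,
    private now: () => number = Date.now
  ) {
    this.tokens = maxTokens;
    this.lastRefill = now();
  }

  take(): boolean {
    const current = this.now();
    const elapsedSec = (current - this.lastRefill) / 1000;
    // Refill proportionally to elapsed time, capped at the burst size.
    this.tokens = Math.min(this.maxTokens, this.tokens + elapsedSec * this.refillRatePerSec);
    this.lastRefill = current;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// Simulated clock: a burst of 10 succeeds, the 11th is denied,
// and after one simulated second a token is available again.
let fakeTime = 0;
const bucket = new TokenBucket(10, 1, () => fakeTime);
const burst = Array.from({ length: 10 }, () => bucket.take());
console.log(burst.every(Boolean)); // true
console.log(bucket.take());        // false
fakeTime += 1000;
console.log(bucket.take());        // true
```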
Client-Side: Handle 429 Gracefully
// lib/api-client.ts: retry on 429, honoring Retry-After with jitter
async function fetchWithRetry(
url: string,
options: RequestInit,
maxRetries = 3
): Promise<Response> {
for (let attempt = 0; attempt < maxRetries; attempt++) {
const response = await fetch(url, options);
if (response.status !== 429) return response;
// Read Retry-After header
const retryAfter = parseInt(response.headers.get("Retry-After") ?? "5", 10);
const jitter = Math.random() * 1000; // Add jitter to prevent thundering herd
if (attempt < maxRetries - 1) {
await new Promise((resolve) =>
setTimeout(resolve, retryAfter * 1000 + jitter)
);
}
}
throw new Error("Rate limit exceeded after retries");
}
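One caveat: fetchWithRetry assumes a numeric Retry-After, but RFC 9110 also allows an HTTP-date. A slightly more defensive parser (a sketch, not part of the original client) handles both forms:

```typescript
// Parse Retry-After, which may be delta-seconds ("7") or an HTTP-date.
// Falls back to a default when the header is missing or unparseable.
function parseRetryAfter(header: string | null, fallbackSeconds = 5): number {
  if (!header) return fallbackSeconds;
  const asSeconds = Number(header);
  if (Number.isFinite(asSeconds) && asSeconds >= 0) return asSeconds;
  const asDate = Date.parse(header);
  if (!Number.isNaN(asDate)) {
    // Clamp to zero: a date in the past means "retry now"
    return Math.max(0, Math.ceil((asDate - Date.now()) / 1000));
  }
  return fallbackSeconds;
}

console.log(parseRetryAfter("7"));  // 7
console.log(parseRetryAfter(null)); // 5
```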
React: Rate Limit Error Display
// components/rate-limit-toast.tsx
"use client";
import { useEffect, useState } from "react";
import { Clock } from "lucide-react";
interface RateLimitToastProps {
retryAfterSeconds: number;
onDismiss: () => void;
}
export function RateLimitToast({ retryAfterSeconds, onDismiss }: RateLimitToastProps) {
const [secondsLeft, setSecondsLeft] = useState(retryAfterSeconds);
useEffect(() => {
if (secondsLeft <= 0) { onDismiss(); return; }
// A one-shot timeout suffices: the effect re-runs whenever secondsLeft changes
const timer = setTimeout(() => setSecondsLeft((s) => s - 1), 1000);
return () => clearTimeout(timer);
}, [secondsLeft, onDismiss]);
return (
<div className="fixed bottom-4 right-4 z-50 flex items-center gap-3 bg-orange-50 border border-orange-200 rounded-xl px-4 py-3 shadow-lg max-w-sm">
<Clock className="w-5 h-5 text-orange-500 flex-shrink-0" />
<div>
<p className="text-sm font-medium text-orange-900">Slow down</p>
<p className="text-xs text-orange-700">
Try again in {secondsLeft}s
</p>
</div>
</div>
);
}
Cost and Timeline Estimates
| Scope | Team | Timeline | Cost Range |
|---|---|---|---|
| Basic per-user rate limit (3 endpoints) | 1 dev | 1 day | $300–600 |
| Full rate limit system (per-user + per-IP + middleware) | 1 dev | 2–3 days | $600–1,200 |
| + Custom limits per plan tier + analytics | 1 dev | 3–5 days | $1,200–2,500 |
Upstash pricing (2026): Pay-per-request. Free tier: 10,000 requests/day. Pro: $0.20/100K requests. A SaaS with 1M API calls/day: ~$2/day.
See Also
- Next.js Middleware Auth Patterns
- SaaS API Rate Limiting Patterns
- SaaS API Versioning Strategy
- Next.js App Router Caching Strategies
- AWS WAF for Application Security
Working With Viprasol
Rate limiting protects your infrastructure, your LLM budget, and your other users from a single bad actor. Our team implements multi-layer rate limiting (per-user, per-IP, per-workspace, and per-endpoint) with proper Retry-After headers, graceful client-side backoff, and plan-tier-aware limits that scale with your pricing model.
What we deliver:
- Upstash Redis rate limiters: sliding window, fixed window, token bucket
- withRateLimit higher-order function for route handlers
- Middleware-level IP rate limiting for all /api/ routes
- Rate limit headers on all responses (X-RateLimit-Limit/Remaining/Reset)
- Client-side retry honoring Retry-After, with jitter
Talk to our team about your API security architecture →
Or explore our web development services.
About the Author
Viprasol Tech Team
Custom Software Development Specialists
The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 100+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement. Based in India, serving clients globally.
Need a Modern Web Application?
From landing pages to complex SaaS platforms: we build it all with Next.js and React.
Free consultation • No commitment • Response within 24 hours