
SaaS API Analytics: Usage Tracking, Rate Limit Dashboards, and Per-Endpoint Latency Metrics

Build production API analytics for SaaS. Covers API key usage tracking with PostgreSQL, per-endpoint latency percentiles, rate limit consumption dashboards, time-series rollup tables, TypeScript middleware, and Grafana-compatible metrics.

Viprasol Tech Team
June 10, 2027
12 min read

Developer-facing SaaS products live or die by their API quality. But most teams only find out the API is slow after a customer complaint. API analytics built into your platform surface the problem earlier: you can see that the /v1/reports/generate endpoint has a p99 latency of 12 seconds before your enterprise customer escalates. You can see which API keys are burning through rate limits before they hit the wall.

Database Schema

-- Raw API request log (write-optimized, range-partitioned by occurred_at)
CREATE TABLE api_requests (
  id            UUID         DEFAULT gen_random_uuid(),
  api_key_id    UUID         NOT NULL,
  workspace_id  UUID         NOT NULL,
  endpoint      TEXT         NOT NULL,  -- e.g. "POST /v1/reports/generate"
  status_code   INTEGER      NOT NULL,
  latency_ms    INTEGER      NOT NULL,
  request_size  INTEGER      NOT NULL DEFAULT 0,  -- bytes
  response_size INTEGER      NOT NULL DEFAULT 0,
  occurred_at   TIMESTAMPTZ  NOT NULL DEFAULT NOW(),
  PRIMARY KEY   (id, occurred_at)
) PARTITION BY RANGE (occurred_at);

-- Create monthly partitions (automate with pg_partman in production)
CREATE TABLE api_requests_2027_06 PARTITION OF api_requests
  FOR VALUES FROM ('2027-06-01') TO ('2027-07-01');

-- Indexes on the partition
CREATE INDEX ON api_requests (api_key_id, occurred_at DESC);
CREATE INDEX ON api_requests (workspace_id, endpoint, occurred_at DESC);
CREATE INDEX ON api_requests (endpoint, occurred_at DESC);
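Partitions must exist before rows arrive for their range; pg_partman automates this, but a small worker-side helper can also generate the DDL ahead of time. A sketch (the helper name monthlyPartitionDDL is ours, not from any library):

```typescript
// Generate the CREATE TABLE ... PARTITION OF statement for the month
// containing `date`, matching the api_requests_YYYY_MM naming above.
export function monthlyPartitionDDL(date: Date): string {
  const y = date.getUTCFullYear();
  const m = date.getUTCMonth(); // 0-based month
  const pad = (n: number) => String(n).padStart(2, "0");
  const from = `${y}-${pad(m + 1)}-01`;
  // First day of the following month (Date handles the year rollover)
  const next = new Date(Date.UTC(y, m + 1, 1));
  const to = `${next.getUTCFullYear()}-${pad(next.getUTCMonth() + 1)}-01`;
  const name = `api_requests_${y}_${pad(m + 1)}`;
  return (
    `CREATE TABLE IF NOT EXISTS ${name} PARTITION OF api_requests\n` +
    `  FOR VALUES FROM ('${from}') TO ('${to}');`
  );
}
```

Run it from a monthly cron a few days ahead of the boundary; IF NOT EXISTS makes it safe to re-run.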

-- Hourly rollup for fast dashboard queries
CREATE TABLE api_request_hourly (
  workspace_id  UUID    NOT NULL,
  api_key_id    UUID    NOT NULL,
  endpoint      TEXT    NOT NULL,
  hour          TIMESTAMPTZ NOT NULL,  -- Truncated to hour
  request_count INTEGER NOT NULL DEFAULT 0,
  error_count   INTEGER NOT NULL DEFAULT 0,  -- 4xx + 5xx
  total_latency BIGINT  NOT NULL DEFAULT 0,  -- Sum for avg calculation
  p50_latency   INTEGER,
  p95_latency   INTEGER,
  p99_latency   INTEGER,
  PRIMARY KEY (workspace_id, api_key_id, endpoint, hour)
);

CREATE INDEX ON api_request_hourly (workspace_id, hour DESC);

TypeScript Middleware: Request Logging

// middleware/api-analytics.ts
import { NextRequest, NextResponse } from "next/server";
import { db } from "@/lib/db";
import { getApiKeyFromRequest } from "@/lib/auth/api-keys";

export async function withApiAnalytics(
  request:  NextRequest,
  handler:  (req: NextRequest) => Promise<NextResponse>,
  endpoint: string
): Promise<NextResponse> {
  const startTime = Date.now();
  const apiKey = await getApiKeyFromRequest(request);

  let response: NextResponse;
  try {
    response = await handler(request);
  } catch (err) {
    console.error("[api-analytics] Handler threw:", err);
    response = NextResponse.json({ error: "Internal server error" }, { status: 500 });
  }

  const latencyMs = Date.now() - startTime;

  // Fire-and-forget: don't block the response on the analytics write.
  // Skip unauthenticated requests: api_key_id and workspace_id are NOT NULL
  // UUID columns, so placeholder strings would fail the insert.
  if (apiKey) {
    void logApiRequest({
      apiKeyId:     apiKey.id,
      workspaceId:  apiKey.workspaceId,
      endpoint,
      statusCode:   response.status,
      latencyMs,
    }).catch((err) => console.error("[api-analytics] Failed to log request:", err));
  }

  // Add latency header for debugging
  response.headers.set("X-Response-Time", `${latencyMs}ms`);

  return response;
}

async function logApiRequest(data: {
  apiKeyId:    string;
  workspaceId: string;
  endpoint:    string;
  statusCode:  number;
  latencyMs:   number;
}): Promise<void> {
  await db.execute(
    `INSERT INTO api_requests (api_key_id, workspace_id, endpoint, status_code, latency_ms)
     VALUES ($1, $2, $3, $4, $5)`,
    [data.apiKeyId, data.workspaceId, data.endpoint, data.statusCode, data.latencyMs]
  );
}
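One caveat on the endpoint column: log the route template ("POST /v1/reports/generate"), never the raw URL, or per-record IDs will explode the cardinality of every GROUP BY endpoint query. If a handler can't pass a template, a normalizer like this sketch (normalizeEndpoint is our own helper, not a Next.js API) collapses variable segments:

```typescript
// Collapse concrete path segments (numeric IDs, UUIDs) into a placeholder so
// "GET /v1/reports/42" and "GET /v1/reports/99" aggregate as one endpoint.
const UUID_RE =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

export function normalizeEndpoint(method: string, path: string): string {
  const normalized = path
    .split("/")
    .map((seg) => (/^\d+$/.test(seg) || UUID_RE.test(seg) ? ":id" : seg))
    .join("/");
  return `${method.toUpperCase()} ${normalized}`;
}
```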

🚀 SaaS MVP in 8 Weeks. Seriously.

We have launched 50+ SaaS platforms. Multi-tenant architecture, Stripe billing, auth, role-based access, and cloud deployment, all handled by one senior team.

  • Week 1–2: Architecture design + wireframes
  • Week 3–6: Core features built + tested
  • Week 7–8: Launch-ready on AWS/Vercel with CI/CD
  • Post-launch: Maintenance plans from month 3

Hourly Rollup Job

// workers/api-analytics-rollup.ts β€” run every 15 minutes
import { db } from "@/lib/db";

export async function rollupApiRequests(): Promise<void> {
  // Roll up the last 2 hours: the overlap re-aggregates the current partial
  // hour and picks up late-arriving rows (the upsert below is idempotent)
  const twoHoursAgo = new Date(Date.now() - 2 * 3600_000);

  await db.execute(`
    INSERT INTO api_request_hourly
      (workspace_id, api_key_id, endpoint, hour,
       request_count, error_count, total_latency,
       p50_latency, p95_latency, p99_latency)
    SELECT
      workspace_id,
      api_key_id,
      endpoint,
      date_trunc('hour', occurred_at) AS hour,
      COUNT(*)                                              AS request_count,
      COUNT(*) FILTER (WHERE status_code >= 400)           AS error_count,
      SUM(latency_ms)                                       AS total_latency,
      (percentile_cont(0.50) WITHIN GROUP (ORDER BY latency_ms))::int AS p50_latency,
      (percentile_cont(0.95) WITHIN GROUP (ORDER BY latency_ms))::int AS p95_latency,
      (percentile_cont(0.99) WITHIN GROUP (ORDER BY latency_ms))::int AS p99_latency
    FROM api_requests
    WHERE occurred_at >= $1
    GROUP BY workspace_id, api_key_id, endpoint, date_trunc('hour', occurred_at)
    ON CONFLICT (workspace_id, api_key_id, endpoint, hour)
    DO UPDATE SET
      request_count = EXCLUDED.request_count,
      error_count   = EXCLUDED.error_count,
      total_latency = EXCLUDED.total_latency,
      p50_latency   = EXCLUDED.p50_latency,
      p95_latency   = EXCLUDED.p95_latency,
      p99_latency   = EXCLUDED.p99_latency
  `, [twoHoursAgo]);
}
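percentile_cont interpolates linearly between the two nearest order statistics. For intuition, and for unit-testing rollup expectations without a database, the same calculation in TypeScript (a sketch; percentileCont is our name):

```typescript
// Linear-interpolation percentile, matching PostgreSQL's percentile_cont.
// Expects p in [0, 1]; sorts a copy so the caller's array is untouched.
export function percentileCont(values: number[], p: number): number {
  if (values.length === 0) throw new Error("percentileCont: empty input");
  const sorted = [...values].sort((a, b) => a - b);
  const idx = p * (sorted.length - 1);
  const lo = Math.floor(idx);
  const hi = Math.ceil(idx);
  // Interpolate between the two neighbouring order statistics
  return sorted[lo] + (idx - lo) * (sorted[hi] - sorted[lo]);
}
```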

Dashboard Queries

-- 1. Per-endpoint latency for a workspace (last 24h, from hourly rollup)
SELECT
  endpoint,
  SUM(request_count)                        AS total_requests,
  ROUND(SUM(error_count)::numeric / NULLIF(SUM(request_count), 0) * 100, 2) AS error_rate_pct,
  -- Request-weighted average of hourly p50/p95/p99 (an approximation of the
  -- true 24h percentile; cast to numeric to avoid integer division)
  ROUND(SUM(p50_latency::numeric * request_count) / NULLIF(SUM(request_count), 0)) AS avg_p50_ms,
  ROUND(SUM(p95_latency::numeric * request_count) / NULLIF(SUM(request_count), 0)) AS avg_p95_ms,
  ROUND(SUM(p99_latency::numeric * request_count) / NULLIF(SUM(request_count), 0)) AS avg_p99_ms
FROM api_request_hourly
WHERE workspace_id = $1
  AND hour >= NOW() - INTERVAL '24 hours'
GROUP BY endpoint
ORDER BY total_requests DESC;
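One honest caveat: a request-weighted mean of hourly percentiles is an approximation; the true 24-hour p95 can only be computed from the raw rows. The expression the query computes reduces to this (weightedAvg is our illustrative name):

```typescript
// Request-weighted mean of per-hour percentile values, mirroring the
// SUM(p * count) / SUM(count) expression in the dashboard query.
export function weightedAvg(
  pairs: Array<[value: number, weight: number]>
): number | null {
  const totalWeight = pairs.reduce((sum, [, w]) => sum + w, 0);
  if (totalWeight === 0) return null; // same role as NULLIF(SUM(...), 0)
  const weightedSum = pairs.reduce((sum, [v, w]) => sum + v * w, 0);
  return weightedSum / totalWeight;
}
```

A busy hour with p95 of 200 ms counts for more than a quiet hour with p95 of 100 ms, which is usually what a dashboard reader expects.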

-- 2. Rate limit consumption per API key (current hour, from raw table)
-- Assumes api_keys has a rate_limit column (requests per hour per key)
SELECT
  ak.name          AS key_name,
  ak.id            AS api_key_id,
  ak.rate_limit    AS hourly_limit,
  COUNT(ar.id)     AS requests_used,
  ROUND(COUNT(ar.id)::numeric / NULLIF(ak.rate_limit, 0) * 100, 1) AS pct_used,
  ak.rate_limit - COUNT(ar.id) AS remaining
FROM api_keys ak
LEFT JOIN api_requests ar
  ON ar.api_key_id = ak.id
  AND ar.occurred_at >= date_trunc('hour', NOW())
WHERE ak.workspace_id = $1
GROUP BY ak.id, ak.name, ak.rate_limit
ORDER BY pct_used DESC NULLS LAST;
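Because the query counts from date_trunc('hour', NOW()), the limit window is a fixed hourly bucket that resets on the hour, not a sliding 60 minutes. If the dashboard surfaces a "resets in" value, the arithmetic is just (helper name is ours):

```typescript
// For the fixed hourly window used by query 2 (date_trunc('hour', NOW())),
// compute how many seconds remain until the counter resets. Works because
// the Unix epoch is aligned to UTC hour boundaries.
export function secondsUntilHourlyReset(now: Date): number {
  const msIntoHour = now.getTime() % 3_600_000;
  return Math.ceil((3_600_000 - msIntoHour) / 1000);
}
```

Fixed windows allow a brief burst of up to 2x the limit across a boundary; if that matters, a sliding-window counter is the usual upgrade.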

-- 3. Request volume time-series (hourly buckets for last 7 days)
SELECT
  hour,
  SUM(request_count) AS requests,
  SUM(error_count)   AS errors
FROM api_request_hourly
WHERE workspace_id = $1
  AND hour >= NOW() - INTERVAL '7 days'
GROUP BY hour
ORDER BY hour;

-- 4. Top endpoints by error rate (last 24h)
SELECT
  endpoint,
  SUM(request_count) AS total,
  SUM(error_count)   AS errors,
  ROUND(SUM(error_count)::numeric / NULLIF(SUM(request_count), 0) * 100, 2) AS error_rate_pct,
  MAX(p99_latency)   AS max_p99_ms
FROM api_request_hourly
WHERE workspace_id = $1
  AND hour >= NOW() - INTERVAL '24 hours'
  AND request_count > 0
GROUP BY endpoint
HAVING SUM(request_count) > 10  -- Exclude low-volume noise
ORDER BY error_rate_pct DESC
LIMIT 10;

💡 The Difference Between a SaaS Demo and a SaaS Business

Anyone can build a demo. We build SaaS products that handle real load, real users, and real payments, with architecture that does not need to be rewritten at 1,000 users.

  • Multi-tenant PostgreSQL with row-level security
  • Stripe subscriptions, usage billing, annual plans
  • SOC2-ready infrastructure from day one
  • We own zero equity; you own everything

TypeScript: Dashboard Data API

// app/api/analytics/overview/route.ts
import { NextRequest, NextResponse } from "next/server";
import { prisma } from "@/lib/prisma";
import { requireWorkspaceAuth } from "@/lib/auth/middleware";

export async function GET(req: NextRequest) {
  const { workspaceId } = await requireWorkspaceAuth(req);

  const [endpointStats, rateLimitStatus, volumeSeries] = await Promise.all([
    // Per-endpoint latency summary
    prisma.$queryRaw<{
      endpoint:      string;
      total_requests: bigint;
      error_rate_pct: number;
      avg_p95_ms:    number;
    }[]>`
      SELECT
        endpoint,
        SUM(request_count) AS total_requests,
        ROUND(SUM(error_count)::numeric / NULLIF(SUM(request_count), 0) * 100, 2) AS error_rate_pct,
        ROUND(SUM(p95_latency::numeric * request_count) / NULLIF(SUM(request_count), 0)) AS avg_p95_ms
      FROM api_request_hourly
      WHERE workspace_id = ${workspaceId}::uuid
        AND hour >= NOW() - INTERVAL '24 hours'
      GROUP BY endpoint
      ORDER BY total_requests DESC
      LIMIT 20
    `,

    // Rate limit status for all keys
    prisma.$queryRaw<{
      key_name:     string;
      api_key_id:   string;
      hourly_limit: number;
      requests_used: bigint;
      pct_used:     number;
    }[]>`
      SELECT
        ak.name          AS key_name,
        ak.id::text      AS api_key_id,
        ak.rate_limit    AS hourly_limit,
        COUNT(ar.id)     AS requests_used,
        ROUND(COUNT(ar.id)::numeric / NULLIF(ak.rate_limit, 0) * 100, 1) AS pct_used
      FROM api_keys ak
      LEFT JOIN api_requests ar
        ON ar.api_key_id = ak.id
        AND ar.occurred_at >= date_trunc('hour', NOW())
      WHERE ak.workspace_id = ${workspaceId}::uuid
      GROUP BY ak.id, ak.name, ak.rate_limit
      ORDER BY pct_used DESC NULLS LAST
    `,

    // 7-day request volume time-series
    prisma.$queryRaw<{ hour: Date; requests: bigint; errors: bigint }[]>`
      SELECT hour, SUM(request_count) AS requests, SUM(error_count) AS errors
      FROM api_request_hourly
      WHERE workspace_id = ${workspaceId}::uuid
        AND hour >= NOW() - INTERVAL '7 days'
      GROUP BY hour
      ORDER BY hour
    `,
  ]);

  return NextResponse.json({
    endpointStats: endpointStats.map((r) => ({
      ...r,
      total_requests: Number(r.total_requests),
    })),
    rateLimitStatus: rateLimitStatus.map((r) => ({
      ...r,
      requests_used: Number(r.requests_used),
    })),
    volumeSeries: volumeSeries.map((r) => ({
      hour:     r.hour.toISOString(),
      requests: Number(r.requests),
      errors:   Number(r.errors),
    })),
  });
}
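The Number() mapping above exists because COUNT and SUM come back from the driver as bigint, and JSON.stringify throws on BigInt. A generic converter (our own sketch, not a Prisma utility) makes the pattern reusable; note it does not preserve Date or other class instances, so map those fields first, as the route does with hour:

```typescript
// Recursively replace BigInt values with Number so the result is JSON-safe.
// Safe here because request counts fit comfortably within 2^53. Dates and
// other class instances are NOT handled by this sketch; convert them first.
export function jsonSafe(value: unknown): unknown {
  if (typeof value === "bigint") return Number(value);
  if (Array.isArray(value)) return value.map(jsonSafe);
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>).map(([k, v]) => [
        k,
        jsonSafe(v),
      ])
    );
  }
  return value;
}
```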

Cost and Timeline Estimates

Scope                                      | Team     | Timeline | Cost Range
Raw logging middleware + partitioned table | 1 dev    | 1–2 days | $400–800
Hourly rollup job + latency percentiles    | 1 dev    | 1–2 days | $400–800
Dashboard queries + API route              | 1 dev    | 2 days   | $600–1,200
Full analytics dashboard UI                | 1–2 devs | 3–5 days | $1,200–2,500

Working With Viprasol

API analytics need a two-tier storage strategy: a raw event log for accuracy (range-partitioned so old data is cheap to drop), and an hourly rollup for fast dashboard reads (p50/p95/p99 per endpoint). The fire-and-forget logging pattern keeps your API latency clean: analytics writes never block the response path.

What we deliver:

  • api_requests: partitioned table by occurred_at, indexes on api_key_id+occurred_at and workspace_id+endpoint+occurred_at
  • api_request_hourly: rollup with request_count, error_count, total_latency, p50/p95/p99
  • withApiAnalytics middleware: fire-and-forget void logApiRequest().catch(), X-Response-Time header
  • rollupApiRequests: percentile_cont(0.95) WITHIN GROUP, ON CONFLICT DO UPDATE upsert, 2-hour overlap window
  • Dashboard SQL: weighted avg latency across hours, rate limit % consumption per key, 7-day time series
  • GET /api/analytics/overview: Promise.all for 3 queries, bigint→Number serialization

Talk to our team about your API observability stack →

Or explore our SaaS development services.

About the Author

Viprasol Tech Team

Custom Software Development Specialists

The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 100+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement. Based in India, serving clients globally.

MT4/MT5 EA Development · AI Agent Systems · SaaS Development · Algorithmic Trading

Building a SaaS Product?

We've helped launch 50+ SaaS platforms. Let's build yours, fast.

Free consultation • No commitment • Response within 24 hours

Viprasol · AI Agent Systems

Want to add AI automation to your SaaS product?

Viprasol builds custom AI agent crews that plug into any SaaS workflow: automating repetitive tasks, qualifying leads, and responding across every channel your customers use.