SaaS Analytics Architecture: Clickstream, Cohort Analysis, and Self-Serve Dashboards

Build a production SaaS analytics stack: implement clickstream collection with event schemas, design cohort retention tables in PostgreSQL or BigQuery, build funnel analysis queries, and deliver self-serve BI dashboards.

Viprasol Tech Team
September 30, 2026
13 min read

Most SaaS companies start analytics the wrong way: they drop an analytics SDK into the frontend, watch events appear in a dashboard, and call it done. Then the questions start arriving ("What's our D30 retention for users who completed onboarding?") and the answer is "we'd need to build that."

A proper analytics architecture starts with the questions you need to answer, then works backward to the schema design, collection pipeline, and dashboard layer.


The Analytics Architecture Stack

Collection Layer:   Client events → server events → webhook events
Transport Layer:    Event queue (Kafka/SQS) or direct write
Storage Layer:      Raw events → transformed tables → aggregations
Query Layer:        SQL + dbt models + cached materialized views
Presentation Layer: BI tool (Metabase/Superset) or custom dashboards

Event Schema Design

The most important architectural decision is your event schema. A poor schema is expensive to fix later.

// src/analytics/event-schemas.ts
// Events follow the noun-verb pattern: object_action

// Base event: every event has these fields
interface BaseEvent {
  event_id: string;        // UUID, for deduplication
  event_type: string;      // "user_signed_up", "project_created", etc.
  occurred_at: string;     // ISO 8601, when the event happened (not when received)
  received_at: string;     // ISO 8601, when the server received it
  session_id: string;      // Groups events within a session
  user_id: string | null;  // Null for anonymous events
  anonymous_id: string;    // Always present, even before login
  tenant_id: string | null;
  properties: Record<string, unknown>; // Event-specific data
  context: EventContext;
}

interface EventContext {
  page: {
    url: string;
    path: string;
    referrer: string;
    title: string;
  };
  user_agent: string;
  ip: string;              // Used for geo enrichment, then hashed for privacy
  library: {
    name: string;
    version: string;
  };
  campaign: {
    source: string | null;   // UTM source
    medium: string | null;
    name: string | null;
    term: string | null;
    content: string | null;
  };
}

// Typed event definitions
type UserSignedUpEvent = BaseEvent & {
  event_type: "user_signed_up";
  properties: {
    email: string;
    plan: "free" | "starter" | "growth";
    signup_method: "email" | "google" | "github";
    referral_code: string | null;
    time_to_signup_seconds: number; // Time from landing to signup
  };
};

type ProjectCreatedEvent = BaseEvent & {
  event_type: "project_created";
  properties: {
    project_id: string;
    project_type: string;
    is_first_project: boolean;
    template_used: string | null;
  };
};

type SubscriptionUpgradedEvent = BaseEvent & {
  event_type: "subscription_upgraded";
  properties: {
    from_plan: string;
    to_plan: string;
    mrr_delta_usd: number;  // Change in monthly recurring revenue
    days_since_signup: number;
    trigger: "organic" | "in-app-prompt" | "sales-outreach";
  };
};
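These typed definitions exist only at compile time. Before enqueueing, a runtime guard can reject malformed payloads; the sketch below is our own illustration (validateEvent and its rules are not part of the stack above) and checks the noun-verb naming convention plus the always-required fields:

```typescript
// Minimal runtime guard for BaseEvent-shaped payloads (illustrative only).
interface ValidationResult {
  ok: boolean;
  errors: string[];
}

const NOUN_VERB = /^[a-z]+(_[a-z]+)*$/; // e.g. "user_signed_up"

function validateEvent(e: Record<string, unknown>): ValidationResult {
  const errors: string[] = [];

  if (typeof e.event_type !== "string" || !NOUN_VERB.test(e.event_type)) {
    errors.push("event_type must be snake_case object_action");
  }
  if (typeof e.occurred_at !== "string" || Number.isNaN(Date.parse(e.occurred_at))) {
    errors.push("occurred_at must be an ISO 8601 timestamp");
  }
  if (typeof e.anonymous_id !== "string" || e.anonymous_id.length === 0) {
    errors.push("anonymous_id is required, even before login");
  }
  if (e.properties !== undefined && (typeof e.properties !== "object" || e.properties === null)) {
    errors.push("properties must be an object");
  }

  return { ok: errors.length === 0, errors };
}
```

Rejected events are worth dead-lettering rather than dropping, so schema regressions in a client release are visible.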

Server-Side Event Collection

// src/analytics/collector.ts
import type { Request, Response, NextFunction } from "express";
import { SQSClient, SendMessageCommand } from "@aws-sdk/client-sqs";
import { v4 as uuidv4 } from "uuid";

const sqs = new SQSClient({ region: "us-east-1" });

export class EventCollector {
  private readonly queueUrl: string;

  // Identity fields: set by the middleware below, read by route handlers
  userId: string | null = null;
  anonymousId: string | null = null;
  tenantId: string | null = null;

  constructor(queueUrl: string) {
    this.queueUrl = queueUrl;
  }

  async track<T extends BaseEvent>(
    event: Omit<T, "event_id" | "received_at">
  ): Promise<void> {
    const enrichedEvent: T = {
      ...event,
      event_id: uuidv4(),
      received_at: new Date().toISOString(),
    } as T;

    // Critical business events: always server-side to avoid ad-blocker losses
    // (signup, payment, upgrade events must be 100% captured).
    // MessageGroupId and MessageDeduplicationId only apply to FIFO queues.
    await sqs.send(
      new SendMessageCommand({
        QueueUrl: this.queueUrl,
        MessageBody: JSON.stringify(enrichedEvent),
        MessageGroupId: enrichedEvent.user_id ?? enrichedEvent.anonymous_id,
        MessageDeduplicationId: enrichedEvent.event_id,
      })
    );
  }
}

// Middleware: auto-track page views and identify users
export async function analyticsMiddleware(
  req: Request,
  res: Response,
  next: NextFunction
) {
  const session = await getSession(req);

  // Attach collector to request for use in route handlers
  req.analytics = new EventCollector(process.env.ANALYTICS_QUEUE_URL!);
  req.analytics.userId = session?.user?.id ?? null;
  req.analytics.tenantId = session?.user?.tenantId ?? null;

  // Persist the anonymous ID so pre-login events can be stitched to the user later
  const anonId = req.cookies["anon_id"] ?? uuidv4();
  res.cookie("anon_id", anonId, { httpOnly: true, sameSite: "lax" });
  req.analytics.anonymousId = anonId;

  next();
}
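One gap worth closing: track() expects the caller to repeat identity fields on every call. A small helper, sketched below with names of our own choosing, merges the middleware's identity into each payload so route handlers supply only the event-specific parts:

```typescript
// Sketch: merge collector identity into a partial event payload.
// The Identity shape and withIdentity helper are our own, not part of
// the collector above.
interface Identity {
  userId: string | null;
  anonymousId: string;
  tenantId: string | null;
}

function withIdentity(
  id: Identity,
  event_type: string,
  properties: Record<string, unknown>,
  session_id: string
) {
  return {
    event_type,
    properties,
    session_id,
    occurred_at: new Date().toISOString(),
    user_id: id.userId,
    anonymous_id: id.anonymousId,
    tenant_id: id.tenantId,
  };
}
```

A route handler would then spread the result into req.analytics.track() and add only the context object.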


Raw Events Storage Schema

-- Raw events table โ€” append-only, never update
-- Partitioned by date for query performance
CREATE TABLE events (
  event_id          UUID         NOT NULL,
  event_type        TEXT         NOT NULL,
  occurred_at       TIMESTAMPTZ  NOT NULL,
  received_at       TIMESTAMPTZ  NOT NULL DEFAULT NOW(),
  user_id           UUID,
  anonymous_id      TEXT         NOT NULL,
  tenant_id         UUID,
  session_id        TEXT,
  properties        JSONB        NOT NULL DEFAULT '{}',
  context           JSONB        NOT NULL DEFAULT '{}',
  -- Enriched fields (added by ETL pipeline)
  country_code      CHAR(2),
  city              TEXT,
  device_type       TEXT,        -- 'mobile' | 'tablet' | 'desktop'
  browser           TEXT,
  os                TEXT
) PARTITION BY RANGE (occurred_at);

-- Create monthly partitions (automate this with pg_partman)
CREATE TABLE events_2026_09
  PARTITION OF events
  FOR VALUES FROM ('2026-09-01') TO ('2026-10-01');

-- Indexes on common query patterns
CREATE INDEX ON events (user_id, occurred_at DESC);
CREATE INDEX ON events (tenant_id, occurred_at DESC);
CREATE INDEX ON events (event_type, occurred_at DESC);
CREATE INDEX ON events USING GIN (properties); -- For JSONB property queries
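On the write path, event_id is what makes the pipeline idempotent when SQS redelivers a message. A sketch of one way to use it follows; note it assumes a unique index on (event_id, occurred_at), which the DDL above does not yet create, and on a partitioned table any unique index must include the partition key:

```typescript
// Sketch: turn a raw event into a parameterized INSERT for the events
// table. ON CONFLICT DO NOTHING drops replayed SQS messages silently.
interface RawEvent {
  event_id: string;
  event_type: string;
  occurred_at: string;
  user_id: string | null;
  anonymous_id: string;
  tenant_id: string | null;
  session_id: string | null;
  properties: Record<string, unknown>;
  context: Record<string, unknown>;
}

function buildInsert(e: RawEvent): { text: string; values: unknown[] } {
  return {
    text: `
      INSERT INTO events
        (event_id, event_type, occurred_at, user_id, anonymous_id,
         tenant_id, session_id, properties, context)
      VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9)
      ON CONFLICT (event_id, occurred_at) DO NOTHING`,
    values: [
      e.event_id, e.event_type, e.occurred_at, e.user_id, e.anonymous_id,
      e.tenant_id, e.session_id,
      JSON.stringify(e.properties), JSON.stringify(e.context),
    ],
  };
}
```

A queue consumer would pass the result straight to pg's pool.query({ text, values }).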

Cohort Retention Analysis

Cohort retention is the most important metric for understanding product health. The question: of users who signed up in week W, what % are still active in week W+N?

-- Build a retention cohort table
-- "Active" = performed any key action in that week

WITH cohorts AS (
  -- Assign each user to their signup cohort (weekly buckets)
  SELECT
    user_id,
    DATE_TRUNC('week', occurred_at)::DATE AS cohort_week
  FROM events
  WHERE event_type = 'user_signed_up'
    AND occurred_at >= NOW() - INTERVAL '16 weeks'
),
weekly_activity AS (
  -- Find every week each user was active (performed any key event)
  SELECT DISTINCT
    e.user_id,
    DATE_TRUNC('week', e.occurred_at)::DATE AS activity_week
  FROM events e
  WHERE e.event_type IN (
    'project_created', 'export_completed', 'collaborator_invited',
    'integration_connected', 'report_viewed'
  )
    AND e.occurred_at >= NOW() - INTERVAL '16 weeks'
),
retention_data AS (
  SELECT
    c.cohort_week,
    wa.activity_week,
    -- Week number since signup (0 = signup week, 1 = first full week after, etc.)
    -- activity_week and cohort_week are DATEs, so subtraction yields integer days
    (wa.activity_week - c.cohort_week) / 7 AS week_number,
    COUNT(DISTINCT wa.user_id) AS active_users
  FROM cohorts c
  JOIN weekly_activity wa ON wa.user_id = c.user_id
    AND wa.activity_week >= c.cohort_week
  GROUP BY c.cohort_week, wa.activity_week
),
cohort_sizes AS (
  -- COUNT(DISTINCT ...) guards against duplicate signup events per user
  SELECT cohort_week, COUNT(DISTINCT user_id) AS cohort_size
  FROM cohorts
  GROUP BY cohort_week
)
SELECT
  rd.cohort_week,
  cs.cohort_size,
  rd.week_number,
  rd.active_users,
  ROUND(100.0 * rd.active_users / cs.cohort_size, 1) AS retention_pct
FROM retention_data rd
JOIN cohort_sizes cs ON cs.cohort_week = rd.cohort_week
ORDER BY rd.cohort_week, rd.week_number;

-- Result format (pivot for heatmap display):
-- cohort_week | cohort_size | week_0 | week_1 | week_2 | ... | week_12
-- 2026-07-07  |     412     | 100%   | 68%    | 52%    | ... | 28%
-- 2026-07-14  |     389     | 100%   | 71%    | 55%    | ... | --
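Pivoting the long-format result into one heatmap row per cohort is a small client-side transform; a sketch, with types of our own naming:

```typescript
// Sketch: pivot long-format retention rows into one row per cohort,
// indexed by week_number, for heatmap rendering.
interface RetentionRow {
  cohort_week: string;
  cohort_size: number;
  week_number: number;
  retention_pct: number;
}

function pivotRetention(
  rows: RetentionRow[]
): Map<string, { cohort_size: number; weeks: number[] }> {
  const byCohort = new Map<string, { cohort_size: number; weeks: number[] }>();
  for (const r of rows) {
    let entry = byCohort.get(r.cohort_week);
    if (!entry) {
      entry = { cohort_size: r.cohort_size, weeks: [] };
      byCohort.set(r.cohort_week, entry);
    }
    entry.weeks[r.week_number] = r.retention_pct; // missing weeks stay undefined
  }
  return byCohort;
}
```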


Funnel Analysis

-- Conversion funnel: signup → first project → first export → upgrade
-- Shows drop-off at each step

WITH funnel_base AS (
  SELECT user_id, MIN(occurred_at) AS signed_up_at
  FROM events
  WHERE event_type = 'user_signed_up'
    AND occurred_at >= NOW() - INTERVAL '30 days'
  GROUP BY user_id
),
step2 AS (
  SELECT DISTINCT e.user_id
  FROM events e
  JOIN funnel_base f ON f.user_id = e.user_id
  WHERE e.event_type = 'project_created'
    AND e.occurred_at > f.signed_up_at
    AND e.occurred_at < f.signed_up_at + INTERVAL '7 days'
),
step3 AS (
  SELECT DISTINCT e.user_id
  FROM events e
  JOIN step2 s ON s.user_id = e.user_id
  WHERE e.event_type = 'export_completed'
    AND e.occurred_at >= (
      SELECT MIN(occurred_at) FROM events
      WHERE user_id = e.user_id AND event_type = 'project_created'
    )
    AND e.occurred_at < (
      SELECT MIN(occurred_at) FROM events
      WHERE user_id = e.user_id AND event_type = 'user_signed_up'
    ) + INTERVAL '14 days'
),
step4 AS (
  SELECT DISTINCT e.user_id
  FROM events e
  JOIN step3 s ON s.user_id = e.user_id
  WHERE e.event_type = 'subscription_upgraded'
    AND e.occurred_at >= (
      SELECT MIN(occurred_at) FROM events
      WHERE user_id = e.user_id AND event_type = 'user_signed_up'
    )
)
SELECT 1 AS step_order, 'signed_up' AS step, COUNT(*) AS users, 100.0 AS pct
FROM funnel_base
UNION ALL
SELECT 2, 'created_project', COUNT(*),
       ROUND(100.0 * COUNT(*) / (SELECT COUNT(*) FROM funnel_base), 1)
FROM step2
UNION ALL
SELECT 3, 'completed_export', COUNT(*),
       ROUND(100.0 * COUNT(*) / (SELECT COUNT(*) FROM funnel_base), 1)
FROM step3
UNION ALL
SELECT 4, 'upgraded', COUNT(*),
       ROUND(100.0 * COUNT(*) / (SELECT COUNT(*) FROM funnel_base), 1)
FROM step4
ORDER BY step_order;  -- ordering by step name would sort alphabetically

Self-Serve Dashboard Architecture

// src/analytics/dashboard.service.ts
// Serve analytics data to frontend dashboards with caching

interface DashboardQuery {
  metric: "cohort_retention" | "funnel" | "dau_mau" | "mrr_growth";
  tenantId?: string;   // omitted = platform-wide (admin only)
  period: "7d" | "30d" | "90d";
  granularity: "day" | "week" | "month";
}

export class DashboardService {
  constructor(
    private readonly db: Pool,
    private readonly cache: Redis
  ) {}

  async query(params: DashboardQuery): Promise<unknown> {
    const cacheKey = `analytics:${JSON.stringify(params)}`;

    // Cache dashboard queries โ€” they're expensive and slightly stale is fine
    const cached = await this.cache.get(cacheKey);
    if (cached) return JSON.parse(cached);

    const result = await this.executeQuery(params);

    // Cache for 15 minutes (adjust based on data freshness needs)
    await this.cache.setex(cacheKey, 900, JSON.stringify(result));

    return result;
  }

  private async executeQuery(params: DashboardQuery): Promise<unknown> {
    switch (params.metric) {
      case "dau_mau":
        return this.getDauMauRatio(params);
      case "mrr_growth":
        return this.getMrrGrowth(params);
      case "funnel":
        return this.getFunnelData(params);
      case "cohort_retention":
        return this.getCohortRetention(params);
      default:
        throw new Error(`Unknown metric: ${params.metric}`);
    }
  }

  private async getDauMauRatio(params: DashboardQuery) {
    // params.period is a closed union ("7d" | "30d" | "90d"), so interpolating
    // it into the SQL string is safe; never interpolate free-form user input
    const { rows } = await this.db.query(`
      WITH daily_active AS (
        SELECT
          DATE_TRUNC('day', occurred_at)::DATE AS date,
          COUNT(DISTINCT user_id) AS dau
        FROM events
        WHERE occurred_at >= NOW() - INTERVAL '${params.period}'
          ${params.tenantId ? "AND tenant_id = $1" : ""}
        GROUP BY 1
      ),
      monthly_active AS (
        SELECT
          DATE_TRUNC('month', occurred_at)::DATE AS month,
          COUNT(DISTINCT user_id) AS mau
        FROM events
        WHERE occurred_at >= NOW() - INTERVAL '${params.period}'
          ${params.tenantId ? "AND tenant_id = $1" : ""}
        GROUP BY 1
      )
      SELECT
        d.date,
        d.dau,
        m.mau,
        ROUND(100.0 * d.dau / NULLIF(m.mau, 0), 1) AS dau_mau_ratio
      FROM daily_active d
      JOIN monthly_active m ON DATE_TRUNC('month', d.date) = m.month
      ORDER BY d.date DESC
    `, params.tenantId ? [params.tenantId] : []);

    return rows;
  }
}
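One subtlety in the caching above: JSON.stringify serializes fields in insertion order, so two callers passing the same parameters in a different order would miss each other's cache entries. A stable key helper (our own sketch) sorts the fields first:

```typescript
// Sketch: order-independent cache key for dashboard query params.
// Sorting the field names makes { metric, period } and { period, metric }
// hash to the same Redis key.
function stableCacheKey(params: Record<string, unknown>): string {
  const sorted = Object.keys(params)
    .sort()
    .map((k) => `${k}=${JSON.stringify(params[k])}`);
  return `analytics:${sorted.join("&")}`;
}
```

DashboardService.query would use stableCacheKey(params as Record<string, unknown>) in place of the raw JSON.stringify call.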

BigQuery for Scale

When event volume exceeds ~100M rows/month, move analytics storage to BigQuery:

-- BigQuery schema (uses STRUCT for nested data)
CREATE TABLE IF NOT EXISTS `project.analytics.events` (
  event_id        STRING        NOT NULL,
  event_type      STRING        NOT NULL,
  occurred_at     TIMESTAMP     NOT NULL,
  user_id         STRING,
  tenant_id       STRING,
  properties      JSON,
  context STRUCT<
    page STRUCT<
      url       STRING,
      path      STRING,
      referrer  STRING
    >,
    campaign STRUCT<
      source    STRING,
      medium    STRING,
      name      STRING
    >,
    device_type STRING,
    country     STRING
  >
)
PARTITION BY DATE(occurred_at)
CLUSTER BY event_type, tenant_id;  -- Cluster for common filter patterns

-- BigQuery retention query (same logic, BigQuery syntax)
SELECT
  cohort_week,
  week_number,
  active_users,
  cohort_size,
  ROUND(100.0 * active_users / cohort_size, 1) AS retention_pct
FROM (
  SELECT
    c.cohort_week,
    -- DATE_DIFF supports the WEEK part (TIMESTAMP_DIFF only goes up to DAY)
    DATE_DIFF(wa.activity_week, c.cohort_week, WEEK) AS week_number,
    COUNT(DISTINCT wa.user_id) AS active_users,
    MAX(cs.cohort_size) AS cohort_size
  FROM `project.analytics.cohorts` c
  JOIN `project.analytics.weekly_activity` wa ON wa.user_id = c.user_id
  JOIN `project.analytics.cohort_sizes` cs ON cs.cohort_week = c.cohort_week
  GROUP BY 1, 2
)
ORDER BY cohort_week, week_number;
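Rows need reshaping on the way in: the Postgres table stores enriched fields as flat columns, while the BigQuery DDL nests them under context. A mapping sketch (the helper is our own; field names follow the DDL above):

```typescript
// Sketch: reshape a flat Postgres-style event row into the nested STRUCT
// layout of the BigQuery events table above.
interface FlatEvent {
  event_id: string;
  event_type: string;
  occurred_at: string;
  user_id: string | null;
  tenant_id: string | null;
  properties: Record<string, unknown>;
  context: {
    page?: { url?: string; path?: string; referrer?: string };
    campaign?: { source?: string | null; medium?: string | null; name?: string | null };
  };
  device_type: string | null;
  country_code: string | null;
}

function toBigQueryRow(e: FlatEvent) {
  return {
    event_id: e.event_id,
    event_type: e.event_type,
    occurred_at: e.occurred_at,
    user_id: e.user_id,
    tenant_id: e.tenant_id,
    properties: JSON.stringify(e.properties), // serialize for the JSON column
    context: {
      page: {
        url: e.context.page?.url ?? null,
        path: e.context.page?.path ?? null,
        referrer: e.context.page?.referrer ?? null,
      },
      campaign: {
        source: e.context.campaign?.source ?? null,
        medium: e.context.campaign?.medium ?? null,
        name: e.context.campaign?.name ?? null,
      },
      device_type: e.device_type,
      country: e.country_code,
    },
  };
}
```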


Working With Viprasol

Analytics architecture is more than dropping an SDK into your app. We design event schemas that answer the questions you'll need 12 months from now, build collection pipelines that don't miss critical business events, and deliver self-serve dashboards that your team actually uses to make decisions.

Data engineering services → | Talk to our engineers →

About the Author

Viprasol Tech Team

Custom Software Development Specialists

The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 100+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement. Based in India, serving clients globally.

MT4/MT5 EA Development · AI Agent Systems · SaaS Development · Algorithmic Trading
