
SaaS CSV Export: Streaming Large Datasets, Background Jobs, and Presigned Download URLs

Build production CSV export for your SaaS. Covers streaming CSV generation with Node.js streams, background job processing for large datasets, S3 presigned download URLs, progress tracking, and email notification on completion.

Viprasol Tech Team
April 26, 2027
12 min read

CSV export is one of those features that works fine in development (50 rows) and breaks in production (500,000 rows). A naive implementation reads all records into memory, times out the HTTP request, and crashes the Node.js process. The right approach: stream the data through a cursor, write to S3 in chunks, and notify the user when the file is ready.

This guide covers streaming CSV generation, background job processing, progress tracking, and presigned download URLs.

Database Schema

CREATE TABLE export_jobs (
  id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  workspace_id    UUID NOT NULL REFERENCES workspaces(id),
  requested_by    UUID NOT NULL REFERENCES users(id),
  export_type     TEXT NOT NULL,           -- 'contacts' | 'invoices' | 'events'
  filters         JSONB NOT NULL DEFAULT '{}',
  status          TEXT NOT NULL DEFAULT 'pending',  -- 'pending' | 'processing' | 'complete' | 'failed'
  row_count       INTEGER,
  file_key        TEXT,                    -- S3 object key
  error_message   TEXT,
  started_at      TIMESTAMPTZ,
  completed_at    TIMESTAMPTZ,
  expires_at      TIMESTAMPTZ,            -- When the S3 file will be deleted
  created_at      TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_export_jobs_workspace ON export_jobs(workspace_id, created_at DESC);
CREATE INDEX idx_export_jobs_status    ON export_jobs(status) WHERE status IN ('pending', 'processing');
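The expires_at column implies a periodic sweep that deletes stale files. A sketch of the selection logic, runnable as-is; the prisma and S3 calls it would drive are shown only as hypothetical wiring in comments, not a tested job:

```typescript
// Sketch of the expiry sweep implied by expires_at. The selection logic below
// is runnable as-is; the prisma/S3 calls it would drive are hypothetical wiring.
interface ExportFileRef {
  id: string;
  status: string;
  fileKey: string | null;
  expiresAt: Date | null;
}

// Pick completed jobs whose S3 files are past expiry
function selectExpired(jobs: ExportFileRef[], now: Date): ExportFileRef[] {
  return jobs.filter(
    (j) =>
      j.status === "complete" &&
      j.fileKey !== null &&
      j.expiresAt !== null &&
      j.expiresAt < now
  );
}

const sample: ExportFileRef[] = [
  { id: "a", status: "complete", fileKey: "exports/w/a.csv", expiresAt: new Date("2027-04-30") },
  { id: "b", status: "complete", fileKey: "exports/w/b.csv", expiresAt: new Date("2027-05-02") },
];
console.log(selectExpired(sample, new Date("2027-05-01")).map((j) => j.id)); // [ 'a' ]

// In a real cron worker (hypothetical wiring, matching the article's schema):
//   const jobs = await prisma.exportJob.findMany({ where: { status: "complete" } });
//   for (const job of selectExpired(jobs, new Date())) {
//     await s3.send(new DeleteObjectCommand({ Bucket: process.env.S3_EXPORTS_BUCKET!, Key: job.fileKey! }));
//     await prisma.exportJob.update({ where: { id: job.id }, data: { fileKey: null } });
//   }
```

Clearing fileKey after the S3 delete means the download route naturally returns 404/410 instead of handing out a presigned URL to a missing object.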

API: Create Export Job

// app/api/exports/route.ts
import { NextRequest, NextResponse } from "next/server";
import { auth } from "@/auth";
import { prisma } from "@/lib/prisma";
import { exportQueue } from "@/lib/queues/export-queue";
import { z } from "zod";

const CreateExportSchema = z.object({
  exportType: z.enum(["contacts", "invoices", "events", "users"]),
  filters: z.object({
    startDate: z.string().optional(),
    endDate:   z.string().optional(),
    status:    z.string().optional(),
  }).optional(),
});

export async function POST(req: NextRequest) {
  const session = await auth();
  if (!session?.user) {
    return NextResponse.json({ error: "Unauthorized" }, { status: 401 });
  }

  const body = await req.json();
  const parsed = CreateExportSchema.safeParse(body);
  if (!parsed.success) {
    return NextResponse.json({ error: parsed.error.issues[0].message }, { status: 400 });
  }

  // Create the job record
  const job = await prisma.exportJob.create({
    data: {
      workspaceId:  session.user.workspaceId,
      requestedBy:  session.user.id,
      exportType:   parsed.data.exportType,
      filters:      parsed.data.filters ?? {},
      expiresAt:    new Date(Date.now() + 24 * 60 * 60 * 1000), // 24h
    },
  });

  // Enqueue background processing
  await exportQueue.add("process-export", { jobId: job.id });

  return NextResponse.json({
    jobId:   job.id,
    status:  "pending",
    message: "Your export is being prepared. You'll receive an email when it's ready.",
  }, { status: 202 });
}

// GET: Poll job status
export async function GET(req: NextRequest) {
  const session = await auth();
  if (!session?.user) return NextResponse.json({ error: "Unauthorized" }, { status: 401 });

  const jobId = req.nextUrl.searchParams.get("jobId");
  if (!jobId) return NextResponse.json({ error: "Missing jobId" }, { status: 400 });

  const job = await prisma.exportJob.findFirst({
    where: { id: jobId, workspaceId: session.user.workspaceId },
    select: { id: true, status: true, rowCount: true, completedAt: true, errorMessage: true },
  });

  if (!job) return NextResponse.json({ error: "Job not found" }, { status: 404 });

  return NextResponse.json(job);
}

🚀 SaaS MVP in 8 Weeks. Seriously.

We have launched 50+ SaaS platforms. Multi-tenant architecture, Stripe billing, auth, role-based access, and cloud deployment, all handled by one senior team.

  • Week 1–2: Architecture design + wireframes
  • Week 3–6: Core features built + tested
  • Week 7–8: Launch-ready on AWS/Vercel with CI/CD
  • Post-launch: Maintenance plans from month 3

Background Worker: Stream CSV to S3

// workers/export-worker.ts
import { Worker } from "bullmq";
import { S3Client } from "@aws-sdk/client-s3";
import { Readable } from "stream";
import { prisma } from "@/lib/prisma";
import { sendExportReadyEmail } from "@/lib/email/export-ready";

const s3 = new S3Client({ region: process.env.AWS_REGION! });

// CSV column definitions per export type
const EXPORT_COLUMNS: Record<string, { header: string; key: string }[]> = {
  contacts: [
    { header: "ID",         key: "id" },
    { header: "Name",       key: "name" },
    { header: "Email",      key: "email" },
    { header: "Company",    key: "company" },
    { header: "Status",     key: "status" },
    { header: "Created At", key: "createdAt" },
  ],
  invoices: [
    { header: "Invoice #",    key: "number" },
    { header: "Customer",     key: "customerName" },
    { header: "Amount",       key: "amount" },
    { header: "Currency",     key: "currency" },
    { header: "Status",       key: "status" },
    { header: "Due Date",     key: "dueDate" },
    { header: "Paid At",      key: "paidAt" },
  ],
};

// Escape a CSV field
function csvEscape(value: unknown): string {
  if (value == null) return "";
  const str = String(value);
  if (str.includes(",") || str.includes('"') || str.includes("\n") || str.includes("\r")) {
    return `"${str.replace(/"/g, '""')}"`;
  }
  return str;
}

// Convert a record to a CSV row
function toCsvRow(record: Record<string, unknown>, columns: { key: string }[]): string {
  return columns.map((col) => csvEscape(record[col.key])).join(",") + "\n";
}
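csvEscape implements RFC 4180 quoting. A standalone check of the edge cases it must handle; the function is repeated here, with a carriage-return check included, so the snippet runs on its own:

```typescript
// Same quoting rules as csvEscape above (with "\r" included for strict RFC 4180)
function csvEscape(value: unknown): string {
  if (value == null) return "";
  const str = String(value);
  if (str.includes(",") || str.includes('"') || str.includes("\n") || str.includes("\r")) {
    return `"${str.replace(/"/g, '""')}"`;
  }
  return str;
}

console.log(csvEscape("Acme Inc"));      // Acme Inc          (no special chars, unquoted)
console.log(csvEscape("Doe, Jane"));     // "Doe, Jane"       (comma forces quoting)
console.log(csvEscape('He said "hi"'));  // "He said ""hi"""  (quotes doubled, then quoted)
console.log(csvEscape(null));            //                   (null becomes an empty field)
```

Embedded newlines stay inside the quoted field rather than breaking the row, which is exactly what spreadsheet importers expect.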

// Fetch data in pages using cursor pagination
async function* streamExportData(
  exportType: string,
  workspaceId: string,
  filters: Record<string, string>
): AsyncGenerator<Record<string, unknown>[]> {
  const PAGE_SIZE = 1000;
  let cursor: string | undefined;

  while (true) {
    let records: any[];

    if (exportType === "contacts") {
      records = await prisma.contact.findMany({
        where: {
          workspaceId,
          ...(filters.status ? { status: filters.status } : {}),
          ...(filters.startDate ? { createdAt: { gte: new Date(filters.startDate) } } : {}),
        },
        take: PAGE_SIZE,
        skip: cursor ? 1 : 0,
        cursor: cursor ? { id: cursor } : undefined,
        orderBy: { id: "asc" },
        select: { id: true, name: true, email: true, company: true, status: true, createdAt: true },
      });
    } else if (exportType === "invoices") {
      records = await prisma.invoice.findMany({
        where: { workspaceId },
        take: PAGE_SIZE,
        skip: cursor ? 1 : 0,
        cursor: cursor ? { id: cursor } : undefined,
        orderBy: { id: "asc" },
      });
    } else {
      break;
    }

    if (!records.length) break;
    yield records;
    if (records.length < PAGE_SIZE) break;
    cursor = records[records.length - 1].id;
  }
}
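Cursor pagination is what keeps this fast at page 500: each page is an index seek from the last seen id, whereas OFFSET would scan and discard every skipped row. The same loop modeled on an in-memory array, purely for illustration:

```typescript
// Minimal in-memory model of the cursor loop above (illustrative only)
interface Row { id: string }

// Equivalent of: WHERE id > cursor ORDER BY id LIMIT pageSize
function fetchPage(rows: Row[], cursor: string | undefined, pageSize: number): Row[] {
  const sorted = [...rows].sort((a, b) => a.id.localeCompare(b.id));
  const start = cursor ? sorted.findIndex((r) => r.id === cursor) + 1 : 0;
  return sorted.slice(start, start + pageSize);
}

function* paginate(rows: Row[], pageSize: number): Generator<Row[]> {
  let cursor: string | undefined;
  while (true) {
    const page = fetchPage(rows, cursor, pageSize);
    if (!page.length) break;
    yield page;
    if (page.length < pageSize) break;       // short page = last page
    cursor = page[page.length - 1].id;       // resume after the last seen id
  }
}

const rows = Array.from({ length: 7 }, (_, i) => ({ id: `id-${i}` }));
const pages = Array.from(paginate(rows, 3));
console.log(pages.map((p) => p.length)); // [ 3, 3, 1 ]
```

The short-page check also saves one empty round trip when the row count is an exact multiple of the page size is not hit.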

// Upload a stream to S3
async function uploadStreamToS3(
  key: string,
  stream: Readable
): Promise<void> {
  // For large files, use multipart upload via @aws-sdk/lib-storage
  const { Upload } = await import("@aws-sdk/lib-storage");

  const upload = new Upload({
    client: s3,
    params: {
      Bucket:      process.env.S3_EXPORTS_BUCKET!,
      Key:         key,
      Body:        stream,
      ContentType: "text/csv",
    },
    queueSize:   4,   // Parallel upload parts
    partSize:    5 * 1024 * 1024, // 5MB parts (S3 minimum)
  });

  await upload.done();
}

// Main worker
const worker = new Worker(
  "exports",
  async (job) => {
    const { jobId } = job.data;

    const exportJob = await prisma.exportJob.findUniqueOrThrow({
      where: { id: jobId },
      include: { requestedBy: { select: { email: true, name: true } } },
    });

    await prisma.exportJob.update({
      where: { id: jobId },
      data: { status: "processing", startedAt: new Date() },
    });

    const columns = EXPORT_COLUMNS[exportJob.exportType];
    if (!columns) throw new Error(`Unknown export type: ${exportJob.exportType}`);

    const fileKey = `exports/${exportJob.workspaceId}/${jobId}.csv`;
    let rowCount = 0;

    try {
      // Create a PassThrough stream for piping
      const { PassThrough } = await import("stream");
      const passThrough = new PassThrough();

      // Write CSV asynchronously while uploading
      const uploadPromise = uploadStreamToS3(fileKey, passThrough);

      // Write header
      passThrough.write(columns.map((c) => c.header).join(",") + "\n");

      // Stream records
      for await (const page of streamExportData(
        exportJob.exportType,
        exportJob.workspaceId,
        exportJob.filters as Record<string, string>
      )) {
        for (const record of page) {
          // Respect backpressure: if the upload side falls behind, wait for
          // 'drain' instead of letting the PassThrough buffer grow in memory
          if (!passThrough.write(toCsvRow(record, columns))) {
            await new Promise<void>((resolve) => passThrough.once("drain", resolve));
          }
          rowCount++;
        }
      }

      passThrough.end();
      await uploadPromise;

      await prisma.exportJob.update({
        where: { id: jobId },
        data: {
          status:      "complete",
          rowCount,
          fileKey,
          completedAt: new Date(),
        },
      });

      // Notify user
      await sendExportReadyEmail({
        to:         exportJob.requestedBy.email,
        userName:   exportJob.requestedBy.name,
        exportType: exportJob.exportType,
        rowCount,
        // Email links must be absolute; APP_URL (e.g. https://app.example.com) is an assumed env var
        downloadUrl: `${process.env.APP_URL}/api/exports/${jobId}/download`,
      });

    } catch (err) {
      await prisma.exportJob.update({
        where: { id: jobId },
        data: {
          status:       "failed",
          errorMessage: err instanceof Error ? err.message : "Unknown error",
        },
      });
      throw err;
    }
  },
  {
    connection: { host: process.env.REDIS_HOST!, port: 6379 },
    concurrency: 3,
  }
);

Presigned Download URL

// app/api/exports/[jobId]/download/route.ts
import { NextRequest, NextResponse } from "next/server";
import { auth } from "@/auth";
import { prisma } from "@/lib/prisma";
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

const s3 = new S3Client({ region: process.env.AWS_REGION! });

export async function GET(
  req: NextRequest,
  { params }: { params: { jobId: string } }
) {
  const session = await auth();
  if (!session?.user) return NextResponse.json({ error: "Unauthorized" }, { status: 401 });

  const job = await prisma.exportJob.findFirst({
    where: {
      id:          params.jobId,
      workspaceId: session.user.workspaceId,
      status:      "complete",
    },
  });

  if (!job || !job.fileKey) {
    return NextResponse.json({ error: "Export not found or not ready" }, { status: 404 });
  }

  // Check expiry
  if (job.expiresAt && job.expiresAt < new Date()) {
    return NextResponse.json({ error: "Export has expired; please generate a new one" }, { status: 410 });
  }

  // Generate a 15-minute presigned download URL
  const url = await getSignedUrl(
    s3,
    new GetObjectCommand({
      Bucket:                     process.env.S3_EXPORTS_BUCKET!,
      Key:                        job.fileKey,
      ResponseContentDisposition: `attachment; filename="${job.exportType}-export-${new Date(job.createdAt).toISOString().slice(0, 10)}.csv"`,
      ResponseContentType:        "text/csv",
    }),
    { expiresIn: 900 } // 15 minutes
  );

  // Redirect to presigned URL (browser downloads directly from S3)
  return NextResponse.redirect(url);
}
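The inline ResponseContentDisposition string is easy to get subtly wrong: a quote or newline in a filename breaks the header. A sketch that isolates it into pure helpers (buildExportFilename and contentDisposition are hypothetical names, not part of the route above):

```typescript
// Hypothetical helpers mirroring the filename built inline in the route
function buildExportFilename(exportType: string, createdAt: Date): string {
  const date = createdAt.toISOString().slice(0, 10); // YYYY-MM-DD
  return `${exportType}-export-${date}.csv`;
}

function contentDisposition(filename: string): string {
  // Replace characters that would break the quoted-string header value
  const safe = filename.replace(/["\\\r\n]/g, "_");
  return `attachment; filename="${safe}"`;
}

console.log(contentDisposition(buildExportFilename("contacts", new Date("2027-04-26T10:00:00Z"))));
// attachment; filename="contacts-export-2027-04-26.csv"
```

Pure helpers like these are trivially unit-testable, unlike a string literal buried inside a GetObjectCommand call.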

💡 The Difference Between a SaaS Demo and a SaaS Business

Anyone can build a demo. We build SaaS products that handle real load, real users, and real payments, with architecture that does not need to be rewritten at 1,000 users.

  • Multi-tenant PostgreSQL with row-level security
  • Stripe subscriptions, usage billing, annual plans
  • SOC2-ready infrastructure from day one
  • We own zero equity; you own everything

Frontend: Export Button with Status Polling

// components/export-button.tsx
"use client";

import { useState, useEffect, useRef } from "react";
import { Download, Loader2 } from "lucide-react";

type ExportStatus = "idle" | "pending" | "processing" | "complete" | "failed";

interface ExportButtonProps {
  exportType: string;
  filters?: Record<string, string>;
  label?: string;
}

export function ExportButton({ exportType, filters, label = "Export CSV" }: ExportButtonProps) {
  const [status, setStatus] = useState<ExportStatus>("idle");
  const [jobId, setJobId] = useState<string | null>(null);
  const [rowCount, setRowCount] = useState<number | null>(null);
  const pollRef = useRef<ReturnType<typeof setInterval> | undefined>(undefined);

  async function startExport() {
    setStatus("pending");
    try {
      const res = await fetch("/api/exports", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ exportType, filters }),
      });
      const data = await res.json();
      if (!res.ok || !data.jobId) throw new Error(data.error ?? "Export request failed");
      setJobId(data.jobId);
    } catch {
      setStatus("failed");
    }
  }

  // Poll until complete
  useEffect(() => {
    if (!jobId || status === "complete" || status === "failed") return;

    pollRef.current = setInterval(async () => {
      try {
        const res = await fetch(`/api/exports?jobId=${jobId}`);
        const data = await res.json();
        setStatus(data.status);
        if (data.rowCount) setRowCount(data.rowCount);
      } catch { /* non-fatal */ }
    }, 2000);

    return () => clearInterval(pollRef.current);
  }, [jobId, status]);

  const isLoading = status === "pending" || status === "processing";

  return (
    <div className="inline-flex items-center gap-2">
      {status === "idle" || status === "failed" ? (
        <button
          onClick={startExport}
          className="inline-flex items-center gap-2 px-3 py-1.5 text-sm font-medium text-gray-700 border border-gray-300 rounded-lg hover:bg-gray-50"
        >
          <Download className="w-4 h-4" />
          {status === "failed" ? "Retry export" : label}
        </button>
      ) : isLoading ? (
        <div className="inline-flex items-center gap-2 text-sm text-gray-500">
          <Loader2 className="w-4 h-4 animate-spin" />
          {status === "pending" ? "Preparing…" : "Generating CSV…"}
        </div>
      ) : status === "complete" ? (
        <a
          href={`/api/exports/${jobId}/download`}
          className="inline-flex items-center gap-2 px-3 py-1.5 text-sm font-medium text-blue-600 border border-blue-300 rounded-lg hover:bg-blue-50"
        >
          <Download className="w-4 h-4" />
          Download CSV {rowCount ? `(${rowCount.toLocaleString()} rows)` : ""}
        </a>
      ) : null}
    </div>
  );
}
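The component polls at a fixed 2 seconds no matter how long the export runs. For multi-minute exports, a capped exponential backoff cuts request volume; a sketch (nextPollDelay is a hypothetical helper) that could drive a setTimeout chain instead of setInterval:

```typescript
// Capped exponential backoff for status polling (hypothetical helper)
function nextPollDelay(attempt: number, baseMs = 2000, maxMs = 15000): number {
  // 2s, 4s, 8s, then capped at 15s
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

console.log([0, 1, 2, 3, 4].map((a) => nextPollDelay(a)));
// [ 2000, 4000, 8000, 15000, 15000 ]
```

To use it, replace setInterval with a recursive setTimeout that reschedules itself with nextPollDelay(attempt++) until the status reaches complete or failed.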

Cost and Timeline Estimates

Scope                                        | Team  | Timeline  | Cost Range
Simple in-memory CSV (< 10K rows)            | 1 dev | 0.5–1 day | $200–400
Streaming background export + S3 + email     | 1 dev | 3–5 days  | $1,000–2,000
+ Progress tracking + retry + expiry cleanup | 1 dev | 1–2 weeks | $2,500–5,000

S3 storage cost: at S3 Standard rates (~$0.023/GB-month), a 100MB export file costs ~$0.0023 per month, so holding it for 24 hours costs well under a hundredth of a cent. Negligible.
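The arithmetic behind that estimate, assuming S3 Standard pricing of roughly $0.023 per GB-month:

```typescript
// Back-of-envelope S3 storage cost (assumes S3 Standard ≈ $0.023/GB-month)
const sizeGb = 0.1;                       // 100 MB export file
const pricePerGbMonth = 0.023;
const monthly = sizeGb * pricePerGbMonth; // cost if kept a full month
const oneDay = monthly / 30;              // cost for the 24h retention window

console.log(monthly.toFixed(4)); // 0.0023
console.log(oneDay < 0.0001);    // true — a fraction of a hundredth of a cent
```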

Working With Viprasol

CSV export sounds simple until you hit 500,000 rows. Our team builds streaming export pipelines that never load the full dataset into memory, upload directly to S3 in parallel chunks, and notify users via email when their download is ready, with presigned URLs that expire after 15 minutes for security.

What we deliver:

  • BullMQ background worker with cursor-paginated PostgreSQL reads
  • PassThrough stream → S3 multipart upload (no memory spike)
  • Presigned download URL with correct Content-Disposition filename
  • Frontend ExportButton with 2-second polling and status transitions
  • 24h file expiry with cleanup job

Talk to our team about your data export architecture →

Or explore our SaaS development services.


About the Author


Viprasol Tech Team

Custom Software Development Specialists

The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 100+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement. Based in India, serving clients globally.

MT4/MT5 EA Development · AI Agent Systems · SaaS Development · Algorithmic Trading

Building a SaaS Product?

We've helped launch 50+ SaaS platforms. Let's build yours, fast.

Free consultation • No commitment • Response within 24 hours
