
SaaS AI Assistant: Streaming Chat, Tool Calls, and Conversation History

Build a production SaaS AI assistant with Claude Sonnet 4.6 (claude-sonnet-4-6). Covers streaming responses with the Vercel AI SDK, tool calls for data access, conversation history with PostgreSQL, and context injection.

Viprasol Tech Team
March 17, 2027
14 min read

An embedded AI assistant turns your SaaS product into something users actually talk to: asking about their data, getting help with workflows, running actions via natural language. The baseline is a chat box that calls an LLM. The production version handles streaming for perceived speed, tool calls so the AI can read your actual data, conversation history so context persists across sessions, and usage limits so you don't go bankrupt on API costs.

This guide builds a complete embedded SaaS AI assistant using the Vercel AI SDK and Claude Sonnet 4.6 (claude-sonnet-4-6).

Architecture

User types message
  → POST /api/assistant/chat
  → Build system prompt (workspace context)
  → Load conversation history
  → Claude Sonnet 4.6 with tools
  → Stream response chunks to client
  → Tool calls: query DB for user's data
  → Persist assistant response to DB
  → Client renders streaming text

Database Schema

CREATE TABLE ai_conversations (
  id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  workspace_id    UUID NOT NULL REFERENCES workspaces(id) ON DELETE CASCADE,
  user_id         UUID NOT NULL REFERENCES users(id),
  title           TEXT,
  created_at      TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at      TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE ai_messages (
  id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  conversation_id UUID NOT NULL REFERENCES ai_conversations(id) ON DELETE CASCADE,
  role            TEXT NOT NULL CHECK (role IN ('user', 'assistant', 'tool')),
  content         TEXT NOT NULL,
  tool_calls      JSONB,        -- tool invocations from assistant
  tool_results    JSONB,        -- results returned to assistant
  input_tokens    INTEGER,
  output_tokens   INTEGER,
  created_at      TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE ai_usage (
  id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  workspace_id    UUID NOT NULL REFERENCES workspaces(id),
  user_id         UUID NOT NULL,
  model           TEXT NOT NULL,
  input_tokens    INTEGER NOT NULL DEFAULT 0,
  output_tokens   INTEGER NOT NULL DEFAULT 0,
  cost_usd        NUMERIC(10,6) NOT NULL DEFAULT 0,
  period_start    DATE NOT NULL DEFAULT CURRENT_DATE,
  created_at      TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_ai_messages_conv ON ai_messages(conversation_id, created_at);
CREATE INDEX idx_ai_usage_workspace ON ai_usage(workspace_id, period_start DESC);

Vercel AI SDK Setup

npm install ai @ai-sdk/anthropic zod
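The `@ai-sdk/anthropic` provider reads its API key from the environment, so no explicit client construction is needed in the route code. A minimal env file (the `.env.local` name is the Next.js convention; the key value is a placeholder):

```shell
# .env.local: the @ai-sdk/anthropic provider picks up ANTHROPIC_API_KEY
# automatically when you call anthropic("claude-sonnet-4-6").
ANTHROPIC_API_KEY=sk-ant-your-key-here
```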

Tool Definitions

Tools let the AI query your actual product data instead of hallucinating it:

// lib/ai/tools.ts
import { tool } from "ai";
import { z } from "zod";
import { prisma } from "@/lib/prisma";

// Tools are scoped to a workspace and injected at request time
export function createWorkspaceTools(workspaceId: string) {
  return {
    getProjectStats: tool({
      description:
        "Get statistics about projects in the workspace: counts, status breakdown, recent activity.",
      parameters: z.object({
        includeArchived: z
          .boolean()
          .optional()
          .default(false)
          .describe("Whether to include archived projects"),
      }),
      execute: async ({ includeArchived }) => {
        const [total, byStatus, recentActivity] = await Promise.all([
          prisma.project.count({
            where: {
              workspaceId,
              ...(includeArchived ? {} : { archivedAt: null }),
            },
          }),
          prisma.project.groupBy({
            by: ["status"],
            where: {
              workspaceId,
              ...(includeArchived ? {} : { archivedAt: null }),
            },
            _count: true,
          }),
          prisma.project.findMany({
            where: { workspaceId, archivedAt: null },
            orderBy: { updatedAt: "desc" },
            take: 5,
            select: { name: true, status: true, updatedAt: true },
          }),
        ]);

        return {
          total,
          byStatus: Object.fromEntries(
            byStatus.map((s) => [s.status, s._count])
          ),
          recentActivity,
        };
      },
    }),

    getTaskSummary: tool({
      description:
        "Get a summary of tasks: overdue, due today, assigned to specific people, or by project.",
      parameters: z.object({
        filter: z
          .enum(["overdue", "due_today", "due_this_week", "all"])
          .default("all"),
        assigneeId: z.string().optional().describe("Filter by specific user ID"),
        projectId: z.string().optional().describe("Filter by specific project"),
        limit: z.number().min(1).max(50).default(20),
      }),
      execute: async ({ filter, assigneeId, projectId, limit }) => {
        const now = new Date();
        const todayEnd = new Date(now);
        todayEnd.setHours(23, 59, 59, 999);
        const weekEnd = new Date(now);
        weekEnd.setDate(weekEnd.getDate() + 7);

        const dateFilter =
          filter === "overdue"
            ? { dueDate: { lt: now }, completedAt: null }
            : filter === "due_today"
            ? { dueDate: { gte: now, lte: todayEnd } }
            : filter === "due_this_week"
            ? { dueDate: { gte: now, lte: weekEnd } }
            : {};

        const tasks = await prisma.task.findMany({
          where: {
            workspaceId,
            ...dateFilter,
            ...(assigneeId ? { assigneeId } : {}),
            ...(projectId ? { projectId } : {}),
            isDeleted: false,
          },
          orderBy: { dueDate: "asc" },
          take: limit,
          select: {
            id: true,
            title: true,
            status: true,
            priority: true,
            dueDate: true,
            assignee: { select: { name: true } },
            project: { select: { name: true } },
          },
        });

        return {
          count: tasks.length,
          tasks: tasks.map((t) => ({
            id: t.id,
            title: t.title,
            status: t.status,
            priority: t.priority,
            dueDate: t.dueDate?.toISOString(),
            assignee: t.assignee?.name ?? "Unassigned",
            project: t.project?.name ?? "No project",
          })),
        };
      },
    }),

    getMemberList: tool({
      description: "Get workspace members with their roles and recent activity.",
      parameters: z.object({
        includeInactive: z.boolean().optional().default(false),
      }),
      execute: async ({ includeInactive }) => {
        const members = await prisma.workspaceMember.findMany({
          where: { workspaceId },
          include: {
            user: {
              select: {
                id: true,
                name: true,
                email: true,
                image: true,
                lastActiveAt: true,
              },
            },
          },
          orderBy: { user: { lastActiveAt: "desc" } },
        });

        const thirtyDaysAgo = new Date();
        thirtyDaysAgo.setDate(thirtyDaysAgo.getDate() - 30);

        return members
          .filter(
            (m) =>
              includeInactive ||
              !m.user.lastActiveAt ||
              m.user.lastActiveAt > thirtyDaysAgo
          )
          .map((m) => ({
            id: m.user.id,
            name: m.user.name,
            email: m.user.email,
            role: m.role,
            lastActive: m.user.lastActiveAt?.toISOString() ?? null,
          }));
      },
    }),

    searchContent: tool({
      description:
        "Full-text search across projects, tasks, and documents in the workspace.",
      parameters: z.object({
        query: z.string().min(2).describe("Search query"),
        types: z
          .array(z.enum(["projects", "tasks", "documents"]))
          .default(["projects", "tasks"]),
        limit: z.number().min(1).max(20).default(10),
      }),
      execute: async ({ query, types, limit }) => {
        const results: Record<string, any[]> = {};

        if (types.includes("projects")) {
          results.projects = await prisma.project.findMany({
            where: {
              workspaceId,
              archivedAt: null,
              OR: [
                { name: { contains: query, mode: "insensitive" } },
                { description: { contains: query, mode: "insensitive" } },
              ],
            },
            take: limit,
            select: { id: true, name: true, status: true },
          });
        }

        if (types.includes("tasks")) {
          results.tasks = await prisma.task.findMany({
            where: {
              workspaceId,
              isDeleted: false,
              title: { contains: query, mode: "insensitive" },
            },
            take: limit,
            select: {
              id: true,
              title: true,
              status: true,
              project: { select: { name: true } },
            },
          });
        }

        return results;
      },
    }),
  };
}
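A design choice worth underlining: `createWorkspaceTools` captures `workspaceId` in a closure, so tenant scoping is decided by your server, never by model-supplied arguments. A toy sketch of the same pattern (illustrative code, not the SDK's `tool` helper):

```typescript
// Minimal illustration of closure-scoped tools: the tenant ID is fixed when
// the tools are created, so nothing the model sends can change it.
type ToyTool = { execute: (args: Record<string, unknown>) => string };

function createScopedTools(workspaceId: string): Record<string, ToyTool> {
  return {
    echoScope: {
      // Ignores any workspaceId in args; always uses the captured one.
      execute: () => `scoped to ${workspaceId}`,
    },
  };
}

const tools = createScopedTools("ws_123");
console.log(tools.echoScope.execute({ workspaceId: "ws_456" }));
// → "scoped to ws_123"
```

The same property is why the real tools above never accept a `workspaceId` parameter in their zod schemas.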

Chat API Route with Streaming

// app/api/assistant/chat/route.ts
import { NextRequest } from "next/server";
import { streamText } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { auth } from "@/auth";
import { prisma } from "@/lib/prisma";
import { createWorkspaceTools } from "@/lib/ai/tools";
import { getConversationHistory, saveMessages } from "@/lib/ai/history";
import { checkUsageLimit, recordUsage } from "@/lib/ai/usage";
import { z } from "zod";

const ChatSchema = z.object({
  messages: z.array(
    z.object({
      role: z.enum(["user", "assistant"]),
      content: z.string(),
    })
  ),
  conversationId: z.string().optional(),
});

export async function POST(req: NextRequest) {
  const session = await auth();
  if (!session?.user) {
    return new Response("Unauthorized", { status: 401 });
  }

  const body = await req.json();
  const parsed = ChatSchema.safeParse(body);
  if (!parsed.success) {
    return new Response("Invalid request", { status: 400 });
  }

  const { messages, conversationId } = parsed.data;
  const workspaceId = session.user.organizationId;
  const userId = session.user.id;

  // Check usage limits
  const withinLimit = await checkUsageLimit(workspaceId);
  if (!withinLimit) {
    return new Response(
      JSON.stringify({
        error: "Monthly AI usage limit reached. Upgrade your plan for more.",
      }),
      { status: 429, headers: { "Content-Type": "application/json" } }
    );
  }

  // Get or create conversation
  let convId = conversationId;
  if (!convId) {
    const conv = await prisma.aiConversation.create({
      data: { workspaceId, userId },
      select: { id: true },
    });
    convId = conv.id;
  }

  // Load previous messages for context window
  const history = await getConversationHistory(convId, 20);

  // Build workspace context for system prompt
  const workspace = await prisma.workspace.findUnique({
    where: { id: workspaceId },
    select: { name: true, plan: true },
  });

  const systemPrompt = buildSystemPrompt({
    workspaceName: workspace?.name ?? "your workspace",
    plan: workspace?.plan ?? "free",
    userName: session.user.name ?? "the user",
    currentDate: new Date().toISOString().split("T")[0],
  });

  const tools = createWorkspaceTools(workspaceId);

  const result = streamText({
    model: anthropic("claude-sonnet-4-6"),
    system: systemPrompt,
    messages: [
      ...history,
      ...messages,
    ],
    tools,
    maxSteps: 5,  // Allow up to 5 tool call rounds
    temperature: 0.3,
    maxTokens: 2048,
    onFinish: async ({ usage, response }) => {
      // Persist the full conversation turn
      await saveMessages(convId!, [
        ...messages.slice(-1), // last user message
        ...response.messages,  // assistant response + any tool messages
      ]);

      // Record usage for billing/limits
      await recordUsage({
        workspaceId,
        userId,
        model: "claude-sonnet-4-6",
        inputTokens: usage.promptTokens,
        outputTokens: usage.completionTokens,
      });
    },
  });

  // Return SSE stream with conversation ID header
  const response = result.toDataStreamResponse();
  response.headers.set("X-Conversation-Id", convId);
  return response;
}

function buildSystemPrompt(context: {
  workspaceName: string;
  plan: string;
  userName: string;
  currentDate: string;
}): string {
  return `You are an AI assistant embedded in ${context.workspaceName}, a project management SaaS application.

You are helping ${context.userName}. Today's date is ${context.currentDate}.

**Your capabilities:**
- Answer questions about the workspace's projects, tasks, and team members using the tools provided
- Help users understand their data and find information
- Provide actionable suggestions based on workspace context
- Help with planning, prioritization, and workflow optimization

**Guidelines:**
- Always use tools to fetch real data rather than making up information
- Be concise and specific: users are busy professionals
- When showing lists, limit to the most relevant 5–10 items
- If asked about something outside the workspace scope (coding help, general knowledge), answer helpfully but note you don't have specific workspace data for it
- Never expose internal IDs or technical details in responses
- Current plan: ${context.plan}

Respond in plain language, not markdown unless the user specifically requests formatted output.`;
}

Conversation History Management

// lib/ai/history.ts
import { prisma } from "@/lib/prisma";
import type { CoreMessage } from "ai";

export async function getConversationHistory(
  conversationId: string,
  limit = 20
): Promise<CoreMessage[]> {
  const messages = await prisma.aiMessage.findMany({
    where: { conversationId },
    orderBy: { createdAt: "desc" },
    take: limit,
    select: {
      role: true,
      content: true,
      toolCalls: true,
      toolResults: true,
    },
  });

  // Reverse to chronological order
  return messages.reverse().map((m) => {
    if (m.role === "assistant" && m.toolCalls) {
      return {
        role: "assistant" as const,
        content: [
          { type: "text" as const, text: m.content },
          ...(m.toolCalls as any[]).map((tc: any) => ({
            type: "tool-call" as const,
            toolCallId: tc.id,
            toolName: tc.name,
            args: tc.args,
          })),
        ],
      };
    }

    if (m.role === "tool" && m.toolResults) {
      return {
        role: "tool" as const,
        content: (m.toolResults as any[]).map((tr: any) => ({
          type: "tool-result" as const,
          toolCallId: tr.id,
          toolName: tr.name,
          result: tr.result,
        })),
      };
    }

    return {
      role: m.role as "user" | "assistant",
      content: m.content,
    };
  });
}

export async function saveMessages(
  conversationId: string,
  messages: any[]
): Promise<void> {
  const toSave = messages
    .filter((m) => m.role === "user" || m.role === "assistant")
    .map((m) => {
      if (typeof m.content === "string") {
        return {
          conversationId,
          role: m.role,
          content: m.content,
        };
      }

      // Assistant message with potential tool calls
      const textPart = m.content.find((p: any) => p.type === "text");
      const toolCalls = m.content.filter(
        (p: any) => p.type === "tool-call"
      );

      return {
        conversationId,
        role: m.role,
        content: textPart?.text ?? "",
        toolCalls: toolCalls.length > 0 ? toolCalls : undefined,
      };
    });

  if (toSave.length > 0) {
    await prisma.aiMessage.createMany({ data: toSave });

    // Auto-generate conversation title from first user message
    const firstUserMsg = messages.find((m) => m.role === "user");
    if (firstUserMsg) {
      const content =
        typeof firstUserMsg.content === "string"
          ? firstUserMsg.content
          : firstUserMsg.content.find((p: any) => p.type === "text")?.text ?? "";

      await prisma.aiConversation.updateMany({
        where: { id: conversationId, title: null },
        data: {
          title: content.slice(0, 80) + (content.length > 80 ? "…" : ""),
          updatedAt: new Date(),
        },
      });
    }
  }
}
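Loading the last 20 rows works until conversations get long; at some point you want to trim by an approximate token budget rather than message count. A rough sketch (hypothetical helper, using the common ~4 characters per token heuristic):

```typescript
// Trim history to an approximate token budget, keeping the newest messages.
// The chars/4 estimate is a rule of thumb, not a real tokenizer.
type SimpleMessage = { role: string; content: string };

function trimHistory(messages: SimpleMessage[], maxTokens = 4_000): SimpleMessage[] {
  const kept: SimpleMessage[] = [];
  let budget = maxTokens;
  // Walk newest-to-oldest so recent context survives trimming.
  for (let i = messages.length - 1; i >= 0; i--) {
    const approxTokens = Math.ceil(messages[i].content.length / 4);
    if (approxTokens > budget) break;
    budget -= approxTokens;
    kept.unshift(messages[i]); // restore chronological order
  }
  return kept;
}
```

For production accuracy you would count with a real tokenizer, but the estimate is usually close enough to leave as a safety margin.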

Usage Limits and Cost Tracking

// lib/ai/usage.ts
import { prisma } from "@/lib/prisma";

// Pricing per 1M tokens (claude-sonnet-4-6, 2027)
const PRICING = {
  "claude-sonnet-4-6": {
    inputPerMillion: 3.0,   // $3.00 per 1M input tokens
    outputPerMillion: 15.0, // $15.00 per 1M output tokens
  },
} as const;

// Monthly limits by plan (in total tokens)
const PLAN_LIMITS: Record<string, number> = {
  FREE: 50_000,
  STARTER: 500_000,
  PROFESSIONAL: 5_000_000,
  ENTERPRISE: Infinity,
};

export async function checkUsageLimit(workspaceId: string): Promise<boolean> {
  const workspace = await prisma.workspace.findUnique({
    where: { id: workspaceId },
    select: { plan: true },
  });

  // Fall back to the FREE limit for unknown plan values
  const limit = PLAN_LIMITS[workspace?.plan ?? "FREE"] ?? PLAN_LIMITS.FREE;
  if (limit === Infinity) return true;

  const startOfMonth = new Date();
  startOfMonth.setDate(1);
  startOfMonth.setHours(0, 0, 0, 0);

  const usage = await prisma.aiUsage.aggregate({
    where: {
      workspaceId,
      createdAt: { gte: startOfMonth },
    },
    _sum: {
      inputTokens: true,
      outputTokens: true,
    },
  });

  const totalTokens =
    (usage._sum.inputTokens ?? 0) + (usage._sum.outputTokens ?? 0);

  return totalTokens < limit;
}

export async function recordUsage(params: {
  workspaceId: string;
  userId: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
}): Promise<void> {
  const pricing =
    PRICING[params.model as keyof typeof PRICING] ??
    PRICING["claude-sonnet-4-6"];

  const costUsd =
    (params.inputTokens / 1_000_000) * pricing.inputPerMillion +
    (params.outputTokens / 1_000_000) * pricing.outputPerMillion;

  await prisma.aiUsage.create({
    data: {
      workspaceId: params.workspaceId,
      userId: params.userId,
      model: params.model,
      inputTokens: params.inputTokens,
      outputTokens: params.outputTokens,
      costUsd,
    },
  });
}
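To make the pricing concrete, here is the same cost formula applied to a single typical turn (the 1,500/300 token figures are illustrative, not measured):

```typescript
// Same arithmetic as recordUsage above, with this article's pricing figures.
const INPUT_PER_MILLION = 3.0;
const OUTPUT_PER_MILLION = 15.0;

function turnCostUsd(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1_000_000) * INPUT_PER_MILLION +
    (outputTokens / 1_000_000) * OUTPUT_PER_MILLION
  );
}

// ~1,500 input tokens (system prompt + history + message), ~300 output tokens:
console.log(turnCostUsd(1_500, 300).toFixed(4)); // "0.0090"
```

Under a cent per turn, but note that input tokens grow with history length, which is one more reason to trim context.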

React Chat Component

// components/ai/assistant-chat.tsx
"use client";

import { useChat } from "ai/react";
import { useState, useRef, useEffect } from "react";
import { Send, Bot, User, Loader2, Sparkles } from "lucide-react";
import { cn } from "@/lib/utils";

interface AssistantChatProps {
  initialConversationId?: string;
}

export function AssistantChat({ initialConversationId }: AssistantChatProps) {
  const [conversationId, setConversationId] = useState(
    initialConversationId
  );
  const messagesEndRef = useRef<HTMLDivElement>(null);

  const { messages, input, handleInputChange, handleSubmit, isLoading, error, append } =
    useChat({
      api: "/api/assistant/chat",
      body: { conversationId },
      onResponse: (response) => {
        // Capture conversation ID from response header
        const newConvId = response.headers.get("X-Conversation-Id");
        if (newConvId && !conversationId) {
          setConversationId(newConvId);
        }
      },
    });

  // Auto-scroll to latest message
  useEffect(() => {
    messagesEndRef.current?.scrollIntoView({ behavior: "smooth" });
  }, [messages]);

  const suggestedPrompts = [
    "What projects are currently in progress?",
    "Show me overdue tasks",
    "Who are the most active team members?",
    "What's due this week?",
  ];

  return (
    <div className="flex flex-col h-full bg-white rounded-xl border border-gray-200 overflow-hidden">
      {/* Header */}
      <div className="flex items-center gap-2.5 px-4 py-3 border-b border-gray-100 bg-gray-50">
        <div className="w-7 h-7 bg-gradient-to-br from-blue-500 to-purple-600 rounded-lg flex items-center justify-center">
          <Sparkles className="w-4 h-4 text-white" />
        </div>
        <div>
          <p className="text-sm font-semibold text-gray-900">AI Assistant</p>
          <p className="text-xs text-gray-500">Powered by Claude</p>
        </div>
      </div>

      {/* Messages */}
      <div className="flex-1 overflow-y-auto p-4 space-y-4">
        {messages.length === 0 && (
          <div className="space-y-4">
            <div className="text-center py-6">
              <Bot className="w-10 h-10 text-gray-300 mx-auto mb-3" />
              <p className="text-sm text-gray-500">
                Ask me anything about your workspace
              </p>
            </div>
            <div className="grid grid-cols-1 gap-2">
              {suggestedPrompts.map((prompt) => (
                <button
                  key={prompt}
                  // append() sends the prompt as a user message and triggers the request
                  onClick={() => append({ role: "user", content: prompt })}
                  className="text-left px-3 py-2.5 text-sm text-gray-600 border border-gray-200 rounded-lg hover:border-blue-300 hover:bg-blue-50 hover:text-blue-700 transition"
                >
                  {prompt}
                </button>
              ))}
            </div>
          </div>
        )}

        {messages.map((message) => (
          <div
            key={message.id}
            className={cn(
              "flex gap-3",
              message.role === "user" ? "flex-row-reverse" : "flex-row"
            )}
          >
            {/* Avatar */}
            <div
              className={cn(
                "flex-shrink-0 w-7 h-7 rounded-full flex items-center justify-center",
                message.role === "user"
                  ? "bg-blue-600"
                  : "bg-gradient-to-br from-blue-500 to-purple-600"
              )}
            >
              {message.role === "user" ? (
                <User className="w-4 h-4 text-white" />
              ) : (
                <Sparkles className="w-3.5 h-3.5 text-white" />
              )}
            </div>

            {/* Bubble */}
            <div
              className={cn(
                "max-w-[80%] rounded-2xl px-4 py-2.5 text-sm",
                message.role === "user"
                  ? "bg-blue-600 text-white rounded-tr-sm"
                  : "bg-gray-100 text-gray-900 rounded-tl-sm"
              )}
            >
              {message.parts ? (
                message.parts.map((part, i) => {
                  if (part.type === "text") {
                    return <p key={i} className="whitespace-pre-wrap">{part.text}</p>;
                  }
                  if (part.type === "tool-invocation") {
                    return (
                      <div key={i} className="flex items-center gap-1.5 text-xs text-gray-500 my-1">
                        <Loader2 className="w-3 h-3 animate-spin" />
                        {/* Tool names are camelCase (getProjectStats), so split on capitals */}
                        <span>Looking up {part.toolInvocation.toolName.replace(/([A-Z])/g, " $1").toLowerCase()}…</span>
                      </div>
                    );
                  }
                  return null;
                })
              ) : (
                <p className="whitespace-pre-wrap">{message.content}</p>
              )}
            </div>
          </div>
        ))}

        {isLoading && messages[messages.length - 1]?.role === "user" && (
          <div className="flex gap-3">
            <div className="w-7 h-7 rounded-full bg-gradient-to-br from-blue-500 to-purple-600 flex items-center justify-center">
              <Sparkles className="w-3.5 h-3.5 text-white" />
            </div>
            <div className="bg-gray-100 rounded-2xl rounded-tl-sm px-4 py-3">
              <div className="flex gap-1">
                {[0, 1, 2].map((i) => (
                  <div
                    key={i}
                    className="w-1.5 h-1.5 bg-gray-400 rounded-full animate-bounce"
                    style={{ animationDelay: `${i * 150}ms` }}
                  />
                ))}
              </div>
            </div>
          </div>
        )}

        {error && (
          <div className="text-center text-xs text-red-500 py-2">
            {error.message || "Something went wrong. Please try again."}
          </div>
        )}

        <div ref={messagesEndRef} />
      </div>

      {/* Input */}
      <form
        onSubmit={handleSubmit}
        className="flex gap-2 p-3 border-t border-gray-100"
      >
        <input
          value={input}
          onChange={handleInputChange}
          placeholder="Ask about your projects, tasks, team…"
          disabled={isLoading}
          className="flex-1 px-3 py-2 text-sm border border-gray-200 rounded-lg focus:outline-none focus:ring-2 focus:ring-blue-500 disabled:opacity-60"
        />
        <button
          type="submit"
          disabled={isLoading || !input.trim()}
          className="p-2 bg-blue-600 text-white rounded-lg hover:bg-blue-700 disabled:opacity-50 disabled:cursor-not-allowed transition"
        >
          {isLoading ? (
            <Loader2 className="w-4 h-4 animate-spin" />
          ) : (
            <Send className="w-4 h-4" />
          )}
        </button>
      </form>
    </div>
  );
}

Sliding Panel Integration

// components/ai/assistant-panel.tsx
"use client";

import { useState } from "react";
import { Sparkles, X } from "lucide-react";
import { AssistantChat } from "./assistant-chat";

export function AssistantPanel() {
  const [isOpen, setIsOpen] = useState(false);

  return (
    <>
      {/* Trigger button, fixed bottom right */}
      <button
        onClick={() => setIsOpen(true)}
        className="fixed bottom-6 right-6 w-12 h-12 bg-gradient-to-br from-blue-500 to-purple-600 text-white rounded-full shadow-lg hover:shadow-xl hover:scale-105 transition-all flex items-center justify-center z-40"
        aria-label="Open AI Assistant"
      >
        <Sparkles className="w-5 h-5" />
      </button>

      {/* Overlay */}
      {isOpen && (
        <div
          className="fixed inset-0 bg-black/20 z-40"
          onClick={() => setIsOpen(false)}
        />
      )}

      {/* Sliding panel */}
      <div
        className={`fixed right-0 top-0 bottom-0 w-96 bg-white shadow-2xl z-50 transform transition-transform duration-300 ${
          isOpen ? "translate-x-0" : "translate-x-full"
        }`}
      >
        <div className="flex items-center justify-between px-4 py-3 border-b border-gray-100">
          <div className="flex items-center gap-2">
            <Sparkles className="w-4 h-4 text-purple-600" />
            <span className="font-semibold text-sm">AI Assistant</span>
          </div>
          <button
            onClick={() => setIsOpen(false)}
            className="p-1 text-gray-400 hover:text-gray-600 rounded"
          >
            <X className="w-4 h-4" />
          </button>
        </div>
        <div className="h-[calc(100%-53px)]">
          <AssistantChat />
        </div>
      </div>
    </>
  );
}

Cost and Timeline Estimates

Scope                                                 | Team     | Timeline  | Cost Range
Basic chat (no tools, no history)                     | 1 dev    | 1–2 days  | $400–800
Streaming + conversation history                      | 1 dev    | 3–5 days  | $1,000–2,000
Full system (tools + history + usage limits)          | 1–2 devs | 2–3 weeks | $5,000–10,000
Enterprise assistant (RAG, custom prompts, analytics) | 2–3 devs | 4–6 weeks | $12,000–28,000

API costs at scale (claude-sonnet-4-6): ~$3 per 1M input tokens, ~$15 per 1M output tokens. A typical user sending 10 messages/day at ~2,000 tokens per message consumes roughly 600K tokens a month, so 1,000 active users works out to around 600M tokens, on the order of a few thousand dollars per month. That is why the per-plan usage limits above are not optional.
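As a sanity check on the per-user side of that estimate (the 75/25 input/output split is an assumption; resending history each turn means input usually dominates):

```typescript
// Per-active-user monthly cost at 10 messages/day, ~2,000 tokens/message.
const TOKENS_PER_MONTH = 10 * 2_000 * 30; // 600,000 tokens per user

const inputTokens = TOKENS_PER_MONTH * 0.75;  // assumed split
const outputTokens = TOKENS_PER_MONTH * 0.25;
const perUserUsd = (inputTokens / 1e6) * 3 + (outputTokens / 1e6) * 15;

console.log(perUserUsd.toFixed(2)); // "3.60"
```

Multiply by your active-user count to budget: 1,000 users at this usage level lands in the low thousands of dollars per month.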

Working With Viprasol

Building an AI assistant that users actually trust requires more than calling an LLM API: it needs real data access, streaming for responsiveness, guardrails against hallucination, and cost controls that don't surprise you at month-end. Our team has integrated AI assistants into SaaS products where the AI answers questions about real customer data, not made-up examples.

What we deliver:

  • Vercel AI SDK integration with Claude Sonnet 4.6 streaming
  • Tool definitions scoped to your data model and access controls
  • Conversation history with PostgreSQL persistence
  • Per-plan usage limits with cost tracking
  • Sliding panel UI component ready for your design system

Talk to our team about adding AI to your product →

Or explore our AI and ML services.
