How to Build an AI Chatbot in Next.js (2026)
Building an AI chatbot in 2026 is two different problems wearing the same name. There's "ChatGPT clone" (a model wrapper with conversation history) and "real AI chatbot product" (your data, your tools, your deployment, your moat). The first is a weekend. The second is six to twelve weeks.
This guide walks through what a real AI chatbot product has to handle, the architecture decisions that bite you later, and the realistic path from zero to a deployed chatbot you can actually charge for.
Or skip the build entirely: get every kit for $499
If you're shipping more than one product, All Access unlocks every Next.js kit on thefrontkit. The AI UX Kit ships chat UI primitives with streaming, citations, and conversation history. A deployable AI chatbot kit (different from AI UX components) is in development — join the waitlist below. Plus the SaaS Starter Kit, CRM, HR, and 7 more.
What a Real AI Chatbot Product Has to Do
Beyond "ChatGPT for X."
Streaming Chat UI
Token-by-token rendering is the baseline expectation in 2026.
- Server-sent events (SSE) or fetch streaming for token delivery
- Markdown rendering during stream — headings, lists, code blocks render as they arrive
- Code block syntax highlighting with copy button
- Stop generation button so users can interrupt long responses
- Regenerate response if the answer was unsatisfactory
- Edit and resend previous messages to retry the conversation
- Smooth scroll-to-bottom that respects user scroll position
The Vercel AI SDK handles most of this. Don't roll streaming yourself.
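Scroll behavior is usually UI-level logic that you write yourself, so here is a minimal sketch of the "respects user scroll position" bullet. The helper names and the 80px threshold are assumptions for illustration, not part of any SDK.

```typescript
// Hypothetical helpers: decide whether the chat view should follow the stream.
interface ScrollMetrics {
  scrollTop: number;    // pixels scrolled from the top
  clientHeight: number; // visible height of the scroll container
  scrollHeight: number; // total height of the rendered content
}

// True when the user is at (or close to) the bottom of the conversation.
function isNearBottom(m: ScrollMetrics, thresholdPx = 80): boolean {
  const distanceFromBottom = m.scrollHeight - m.scrollTop - m.clientHeight;
  return distanceFromBottom <= thresholdPx;
}

// Measure BEFORE appending new tokens, then apply the result after render.
// If the user scrolled up to read earlier text, their position is left alone.
function nextScrollTop(
  before: ScrollMetrics,
  newScrollHeight: number,
  thresholdPx = 80
): number {
  return isNearBottom(before, thresholdPx)
    ? newScrollHeight - before.clientHeight // pin to the new bottom
    : before.scrollTop;                      // keep the user's position
}
```

In a React component you would read these three values from the scroll container's ref on each streamed update and assign the result back to `scrollTop`.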
Conversation History
Not a single chat — a list of chats:
- Conversation list in a sidebar (like ChatGPT)
- New chat button that resets context
- Rename and delete per conversation
- Folder/category organization for power users
- Search across conversations
- Pin important chats
- Share conversation via public URL (optional)
Storage: a conversations table with messages linked by conversation_id. Index by user and updated-at.
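As a rough sketch of that storage shape, the two row types and the sidebar query might look like the following. Field names are assumptions, and the in-memory filter and sort stand in for the SQL query that an index on (user, updated-at) would serve.

```typescript
// Hypothetical row shapes for the two tables described above.
interface ConversationRow {
  id: string;
  userId: string;
  title: string;
  pinned: boolean;
  updatedAt: number; // epoch milliseconds
}

interface MessageRow {
  id: string;
  conversationId: string; // links the message to ConversationRow.id
  role: "user" | "assistant" | "system";
  content: string;
  createdAt: number; // epoch milliseconds
}

// Sidebar query: one user's conversations, pinned first, then most recent.
function listConversations(rows: ConversationRow[], userId: string): ConversationRow[] {
  return rows
    .filter((c) => c.userId === userId)
    .sort((a, b) => Number(b.pinned) - Number(a.pinned) || b.updatedAt - a.updatedAt);
}

// Load one conversation's messages in chronological order.
function messagesFor(rows: MessageRow[], conversationId: string): MessageRow[] {
  return rows
    .filter((m) => m.conversationId === conversationId)
    .sort((a, b) => a.createdAt - b.createdAt);
}
```

Remember to bump `updatedAt` on the conversation whenever a message is added, otherwise the sidebar order goes stale.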
Multi-Model Support
In 2026, most production chatbots use multiple models:
- OpenAI GPT-5 for reasoning-heavy tasks
- Anthropic Claude for long-context document analysis
- Gemini for multimodal (vision, audio)
- Llama / Mistral self-hosted for privacy-sensitive data
- Specialized fine-tunes for specific tasks
The chatbot UI should expose model selection or route automatically based on the question.
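Automatic routing can start as a plain rule table. The sketch below mirrors the list above; the provider labels are just labels (not real model IDs), and the input fields and the 50,000-token threshold are assumptions you would tune.

```typescript
type TaskKind = "reasoning" | "long-document" | "multimodal" | "private";
type ProviderLabel = "openai" | "anthropic" | "gemini" | "self-hosted";

// Hypothetical signals extracted from the incoming request.
interface RoutingInput {
  hasImageOrAudio: boolean;
  containsSensitiveData: boolean;
  promptTokens: number;
}

// One provider per task kind, following the list above.
const ROUTES: Record<TaskKind, ProviderLabel> = {
  "reasoning": "openai",
  "long-document": "anthropic",
  "multimodal": "gemini",
  "private": "self-hosted",
};

// Privacy wins over everything else, then modality, then context size.
function classifyTask(input: RoutingInput, longDocTokens = 50000): TaskKind {
  if (input.containsSensitiveData) return "private";
  if (input.hasImageOrAudio) return "multimodal";
  if (input.promptTokens >= longDocTokens) return "long-document";
  return "reasoning";
}

function routeModel(input: RoutingInput): ProviderLabel {
  return ROUTES[classifyTask(input)];
}
```

Keeping the rules in one table makes it easy to let a user's manual model selection override the automatic choice.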
RAG (Retrieval-Augmented Generation)
When the chatbot answers questions about your data:
- Document ingestion — PDFs, web pages, docs uploaded by the user
- Chunking strategy (semantic chunking usually retrieves better than fixed-size splitting)
- Embedding via OpenAI text-embedding-3-large or Voyage AI
- Vector storage (Pinecone, Weaviate, Qdrant, or pgvector)
- Retrieval with hybrid search (keyword + semantic)
- Reranking for top results
- Citation rendering showing which sources the answer draws on
- Source previews clickable to the underlying document
Without RAG, your chatbot is generic ChatGPT. With RAG, it knows your customer's data.
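The hybrid search step needs some way to merge the keyword ranking with the semantic ranking. One common option, used here as an assumption rather than something this guide prescribes, is reciprocal rank fusion: each document scores 1 / (k + rank) in every list it appears in, and the scores are summed.

```typescript
// Reciprocal rank fusion over any number of ranked lists of document ids.
// rank is 1-based; k = 60 is a conventional default, not a tuned value.
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const contribution = 1 / (k + index + 1);
      scores.set(id, (scores.get(id) ?? 0) + contribution);
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map((entry) => entry[0]);
}

// Hypothetical results from the two retrievers for one query.
const keywordHits = ["a", "b", "c"];
const semanticHits = ["c", "a", "d"];
const fused = reciprocalRankFusion([keywordHits, semanticHits]);
// fused is ["a", "c", "b", "d"]: documents found by both retrievers rise to the top.
```

The fused list is what you would pass to the reranking step, trimmed to whatever candidate count your reranker handles well.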
Tool Calling
Real AI agents don't just chat — they take actions:
- Function definitions in JSON schema format
- Tool call rendering in the chat (show "Looking up order #1234..." while a tool runs)
- Approval flows for destructive actions (delete, refund, send email)
- Multi-tool conversations where the model picks the right tool sequence
- Error handling when tools fail
The Vercel AI SDK and OpenAI's tool calling API handle the mechanics. The product work is choosing the right tools and UX patterns.
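To make the approval-flow and error-handling bullets concrete, here is a library-free sketch of a tool registry. The `destructive` flag, the result shape, and both example tools are hypothetical; a real app would hand the JSON schema to its SDK of choice and run this gate before executing the call.

```typescript
interface ToolDefinition {
  name: string;
  description: string;
  // JSON-schema-style parameter description.
  parameters: {
    type: "object";
    properties: Record<string, { type: string; description?: string }>;
    required: string[];
  };
  destructive: boolean; // hypothetical flag: requires user approval
  execute: (args: Record<string, unknown>) => string;
}

type ToolResult =
  | { status: "ok"; output: string }
  | { status: "needs-approval"; tool: string }
  | { status: "error"; message: string };

function runTool(
  tools: ToolDefinition[],
  name: string,
  args: Record<string, unknown>,
  approved = false
): ToolResult {
  const tool = tools.find((t) => t.name === name);
  if (!tool) return { status: "error", message: `Unknown tool: ${name}` };

  const missing = tool.parameters.required.filter((key) => !(key in args));
  if (missing.length > 0) {
    return { status: "error", message: `Missing arguments: ${missing.join(", ")}` };
  }

  // Destructive actions pause here until the UI collects an explicit approval.
  if (tool.destructive && !approved) return { status: "needs-approval", tool: name };

  try {
    return { status: "ok", output: tool.execute(args) };
  } catch (err) {
    return { status: "error", message: err instanceof Error ? err.message : String(err) };
  }
}

const orderIdParam = {
  type: "object" as const,
  properties: { orderId: { type: "string", description: "Order number" } },
  required: ["orderId"],
};

const tools: ToolDefinition[] = [
  { name: "lookupOrder", description: "Look up an order", parameters: orderIdParam,
    destructive: false, execute: (args) => `Order ${String(args["orderId"])}: shipped` },
  { name: "refundOrder", description: "Refund an order", parameters: orderIdParam,
    destructive: true, execute: (args) => `Refund issued for order ${String(args["orderId"])}` },
];
```

Calling `runTool(tools, "refundOrder", { orderId: "1234" })` returns a `needs-approval` result; the chat UI renders a confirmation prompt and retries with `approved` set to `true`.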
Citations and Source Attribution
The trust layer:
- Inline citations in the response (clickable [1] [2] references)
- Source list at the bottom of each answer
- Source preview on hover showing the actual content snippet
- Confidence indicators when the model is unsure
- Refusal handling when the data doesn't support an answer
Without citations, users can't trust the chatbot for anything important.
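A small sketch of the inline-citation plumbing, assuming the model is prompted to emit markers like [1] and [2] that match the ids of the retrieved sources. The type and function names are hypothetical.

```typescript
interface CitedSource {
  id: number;
  title: string;
  url: string;
  snippet: string; // text shown in the hover preview
}

// Distinct [n] markers in the answer, in order of first appearance.
function extractCitationIds(answer: string): number[] {
  const ids: number[] = [];
  const pattern = /\[(\d+)\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(answer)) !== null) {
    const id = Number(match[1]);
    if (ids.indexOf(id) === -1) ids.push(id);
  }
  return ids;
}

// Split markers into sources we can render and markers that point at nothing.
function resolveCitations(
  answer: string,
  sources: CitedSource[]
): { cited: CitedSource[]; dangling: number[] } {
  const cited: CitedSource[] = [];
  const dangling: number[] = [];
  for (const id of extractCitationIds(answer)) {
    const source = sources.find((s) => s.id === id);
    if (source) cited.push(source);
    else dangling.push(id);
  }
  return { cited, dangling };
}
```

A non-empty `dangling` list means the model cited a source it was never given, which is worth logging and stripping from the rendered answer rather than showing as a dead link.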
Feedback Capture
For improving the product:
- Thumbs up/down per response
- Optional comment explaining the rating
- Inline issue tags ("inaccurate," "irrelevant," "harmful")
- Aggregation dashboard showing top issues
- Per-prompt feedback rollups so you can identify failing prompt patterns
Feedback is how you turn a chatbot into a better product.
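The per-prompt rollup can be sketched as a simple aggregation over feedback events. The event shape and the `promptId` field are assumptions; in production this would more likely be a SQL GROUP BY feeding the dashboard.

```typescript
type IssueTag = "inaccurate" | "irrelevant" | "harmful";

interface FeedbackEvent {
  promptId: string; // which prompt template produced the response
  rating: "up" | "down";
  tags: IssueTag[];
  comment?: string;
}

interface PromptRollup {
  promptId: string;
  total: number;
  down: number;
  downRate: number;
  tagCounts: Record<IssueTag, number>;
}

// Group feedback by prompt and sort so the worst-performing prompts come first.
function rollupByPrompt(events: FeedbackEvent[]): PromptRollup[] {
  const byPrompt = new Map<string, PromptRollup>();
  for (const e of events) {
    let r = byPrompt.get(e.promptId);
    if (!r) {
      r = {
        promptId: e.promptId, total: 0, down: 0, downRate: 0,
        tagCounts: { inaccurate: 0, irrelevant: 0, harmful: 0 },
      };
      byPrompt.set(e.promptId, r);
    }
    r.total += 1;
    if (e.rating === "down") r.down += 1;
    for (const tag of e.tags) r.tagCounts[tag] += 1;
    r.downRate = r.down / r.total;
  }
  return Array.from(byPrompt.values()).sort((a, b) => b.downRate - a.downRate);
}
```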
Rate Limiting and Cost Control
Critical and often skipped:
- Per-user rate limits (N messages per hour)
- Per-account quotas (M messages per month for free tier)
- Per-model cost tracking with daily/monthly caps
- Hard stop at cost threshold to prevent runaway bills
- Token usage dashboard for admins
A single bug or abusive user can run up a five-figure bill overnight. Build the cost controls before you launch.
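A minimal in-memory sketch of two of these controls, a per-user hourly limit and a global daily hard stop. The class and method names are hypothetical, the fixed-window algorithm is just one simple choice, and a real deployment would keep this state in Redis or Postgres so it survives restarts and works across instances.

```typescript
const HOUR_MS = 3600000;
const DAY_MS = 86400000;

type GuardDecision = "ok" | "rate-limited" | "cost-cap-reached";

class UsageGuard {
  private maxMessagesPerHour: number;
  private dailyCapUsd: number;
  private windows = new Map<string, { windowStart: number; messages: number }>();
  private spendDay = -1;  // index of the day the running total belongs to
  private spentUsd = 0;   // running total for that day

  constructor(maxMessagesPerHour: number, dailyCapUsd: number) {
    this.maxMessagesPerHour = maxMessagesPerHour;
    this.dailyCapUsd = dailyCapUsd;
  }

  // Reset the running spend when the calendar day (UTC) changes.
  private rollDay(now: number): void {
    const day = Math.floor(now / DAY_MS);
    if (day !== this.spendDay) {
      this.spendDay = day;
      this.spentUsd = 0;
    }
  }

  // Call after each model response with its computed cost.
  recordCost(usd: number, now: number): void {
    this.rollDay(now);
    this.spentUsd += usd;
  }

  // Call before each model request; counts the message when it is allowed.
  check(userId: string, now: number): GuardDecision {
    this.rollDay(now);
    if (this.spentUsd >= this.dailyCapUsd) return "cost-cap-reached";

    const w = this.windows.get(userId);
    if (!w || now - w.windowStart >= HOUR_MS) {
      this.windows.set(userId, { windowStart: now, messages: 1 });
      return "ok";
    }
    if (w.messages >= this.maxMessagesPerHour) return "rate-limited";
    w.messages += 1;
    return "ok";
  }
}
```

Timestamps are passed in rather than read from the clock, which keeps the guard easy to unit test.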
Deployment and Auth
The boring critical path:
- Auth (Clerk, Auth.js, Supabase Auth)
- User accounts with conversation history per user
- Subscription tiers (free, paid) with rate limits per tier
- Onboarding flow so first-time users understand what the chatbot does
- Marketing site that drives signups
This is where most "ChatGPT for X" products fail — they have a working chatbot but no auth, no billing, no real product around it.
Tech Stack: AI Chatbot Specific Decisions
Streaming framework: Vercel AI SDK. It's the standard in 2026. Handles streaming, tool calls, structured outputs, and multiple model providers.
Model API: OpenAI SDK + Anthropic SDK. Or AI SDK abstractions if you want one interface.
Vector database: pgvector if Postgres-based (cheap, easy). Pinecone for managed (paid, fast). Qdrant for self-hosted.
Embeddings: OpenAI text-embedding-3-large or Voyage AI voyage-3.
Real-time: Built into the streaming layer; no separate WebSocket needed.
File parsing: Unstructured.io or LlamaParse for PDF, DOCX, etc.
Background jobs: Inngest for async document ingestion.
Database: Postgres + Drizzle. Conversations + messages tables.
Build Path 1: From Scratch (6 to 10 Weeks)
Week 1-2: Foundation. Next.js, auth, basic chat UI with Vercel AI SDK, conversation history.
Week 3-4: Multi-model + tool calling. Wire OpenAI + Anthropic. Build a few tools.
Week 5-6: RAG (if needed). Document ingestion, embedding, vector storage, retrieval, citations.
Week 7-8: Cost controls and rate limiting. Per-user limits, per-account quotas, cost dashboard.
Week 9-10: Polish and deployment. Auth tiers, marketing site, onboarding, Stripe wiring.
For most teams, the plan above stretches to 10 to 14 weeks. RAG alone routinely takes 3-4 weeks when done well, not the two budgeted above.
Build Path 2: Using an AI Chatbot Kit (1 to 2 Weeks)
A production-ready AI chatbot kit ships:
- Streaming chat UI with markdown, code blocks, regenerate
- Conversation history with sidebar
- Multi-model support (OpenAI, Anthropic, Gemini)
- Tool calling patterns
- RAG with pgvector
- Citation rendering
- Feedback capture (thumbs + comments)
- Rate limiting and cost controls
- Stripe-ready subscription tiers
- 25-35 screens, WCAG AA accessible
A deployable AI chatbot kit (separate from the AI UX components kit) is in active development at thefrontkit — join the waitlist on our All Access page.
For chat UI components today (streaming, citations, feedback) without the full deployable product, see the AI UX Kit.
Common Pitfalls
Rolling streaming yourself. Use the Vercel AI SDK. The edge cases (reconnection, partial messages, stream cancellation) are solved.
Single-model lock-in. OpenAI's pricing changes. Anthropic launches a better model. Without an abstraction layer, switching is a refactor. Use a provider-agnostic interface from day one.
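If you are not using an SDK that already abstracts providers, a provider-agnostic interface can be as small as the sketch below. Every name here is hypothetical, and the stub provider stands in for real SDK calls so the example stays self-contained.

```typescript
interface ChatRequest {
  system?: string;
  messages: { role: "user" | "assistant"; content: string }[];
}

// The only surface the rest of the app is allowed to depend on.
interface ChatProvider {
  id: string;
  isAvailable(): boolean;
  complete(request: ChatRequest): Promise<string>;
}

class ProviderRegistry {
  private providers = new Map<string, ChatProvider>();

  register(provider: ChatProvider): void {
    this.providers.set(provider.id, provider);
  }

  // First registered and available provider in preference order.
  resolve(preference: string[]): ChatProvider {
    for (const id of preference) {
      const provider = this.providers.get(id);
      if (provider && provider.isAvailable()) return provider;
    }
    throw new Error(`No available provider among: ${preference.join(", ")}`);
  }
}

// Stub used for illustration; a real adapter would wrap a vendor SDK here.
function stubProvider(id: string, available = true): ChatProvider {
  return {
    id,
    isAvailable: () => available,
    complete: (request) => {
      const last = request.messages[request.messages.length - 1];
      return Promise.resolve(`[${id}] echo: ${last ? last.content : ""}`);
    },
  };
}

const registry = new ProviderRegistry();
registry.register(stubProvider("openai", false)); // simulate an outage
registry.register(stubProvider("anthropic"));
const chosen = registry.resolve(["openai", "anthropic"]); // falls back to "anthropic"
```

Switching vendors then means writing one new adapter and changing the preference list, not touching call sites.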
No cost controls. A single bug or abusive user can spend $10k overnight. Build per-user, per-account, and global daily cost caps before launch.
RAG without reranking. Top-K retrieval alone often misses the right chunk. Add a reranking step (Cohere Rerank, a Voyage AI reranker, or a self-hosted cross-encoder) for production quality.
Citations as decoration. If citations aren't clickable to the source, users can't verify. Build the source-preview UX from day one.
Storing prompts as code constants. Prompts change weekly. Pull them into a config or database so non-developers can iterate.
No evals. You change the prompt and have no way to know if quality went up or down. Build a basic eval suite (10-50 test cases) before the first prompt change.
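A basic eval suite does not need a framework. The sketch below scores recorded model outputs against simple must-include and must-not-include checks; the case format is an assumption, and substring checks are deliberately crude compared with model-graded evals.

```typescript
interface EvalCase {
  name: string;
  mustInclude: string[];    // phrases the answer has to contain
  mustNotInclude: string[]; // phrases that indicate a regression
}

interface EvalReport {
  passed: number;
  failed: string[]; // names of failing cases
  passRate: number;
}

// outputs maps case name -> the model's answer for that case's input.
function scoreOutputs(cases: EvalCase[], outputs: Record<string, string>): EvalReport {
  const failed: string[] = [];
  for (const c of cases) {
    const raw = outputs[c.name];
    if (raw === undefined) {
      failed.push(c.name); // a missing output counts as a failure
      continue;
    }
    const text = raw.toLowerCase();
    const hasAll = c.mustInclude.every((s) => text.indexOf(s.toLowerCase()) !== -1);
    const hasNone = c.mustNotInclude.every((s) => text.indexOf(s.toLowerCase()) === -1);
    if (!(hasAll && hasNone)) failed.push(c.name);
  }
  const passed = cases.length - failed.length;
  return { passed, failed, passRate: cases.length === 0 ? 0 : passed / cases.length };
}
```

Run it once before a prompt change to record a baseline, then again after, and compare the `failed` lists rather than just the pass rate.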
Underbuilding the conversation sidebar. Without conversation history, every chat starts cold. Users hate it. Build the sidebar in v1.
Adjacent Reads
- Best AI Chatbot Templates 2026 — head-to-head comparison
- How to Choose an AI Ops Dashboard Template — for managing a chatbot in production
- AI Chat UI Best Practices — chat UX patterns
- Best AI Chat UI Kits 2026 — UI primitive comparison
FAQ
How is this different from the AI UX Kit you already sell? The AI UX Kit ships chat UI primitives (streaming, citations, feedback) that you assemble into your own product. The forthcoming AI chatbot kit ships a deployable end-to-end product — auth, conversation history, multi-model, RAG, billing, marketing site. Different scopes for different needs.
Should I use OpenAI, Anthropic, or Gemini? For text-only general purpose: OpenAI GPT-5 is the strongest default. For long-context document analysis: Anthropic Claude. For multimodal (images, audio): Gemini. In production, most products use multiple models routed by task type.
Do I need RAG? Only if your chatbot answers questions about your data (customer docs, knowledge base, product information). For pure conversational use cases (writing assistant, brainstorming), skip RAG.
How much does running an AI chatbot cost? At 2026 prices, roughly $0.001-$0.01 per message for typical models and context lengths, and $0.01-$0.05 per message on premium models with long contexts. A free-tier user sending 100 messages/month costs you $0.10-$1. A paid user sending 1000 messages/month on premium models costs $10-$50. Plan your pricing accordingly.
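Per-message cost is token counts multiplied by per-token prices, so it is worth working through once with your own numbers. The prices and token counts below are placeholders chosen for illustration, not any provider's actual rates.

```typescript
// USD per million tokens. Placeholder values, NOT real provider pricing.
interface ModelPrice {
  inputPerMTok: number;
  outputPerMTok: number;
}

function costPerMessage(price: ModelPrice, inputTokens: number, outputTokens: number): number {
  return (inputTokens / 1e6) * price.inputPerMTok + (outputTokens / 1e6) * price.outputPerMTok;
}

function monthlyCost(perMessageUsd: number, messagesPerMonth: number): number {
  return perMessageUsd * messagesPerMonth;
}

// Assumed workload: 2,000 input tokens (history + context) and 400 output tokens.
const examplePrice: ModelPrice = { inputPerMTok: 1.25, outputPerMTok: 10 };
const perMessage = costPerMessage(examplePrice, 2000, 400);
// 2000/1e6 * 1.25 = 0.0025, plus 400/1e6 * 10 = 0.004, so about $0.0065 per message.
const freeTierUser = monthlyCost(perMessage, 100); // about $0.65 per month
```

Input tokens grow with every turn because the conversation history is resent, so long chats and large RAG contexts push the per-message figure up quickly.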
What's the simplest path to v1? Vercel AI SDK + OpenAI + simple chat UI + conversation history, plus a basic per-user rate limit and a hard daily spend cap. That's 1-2 weeks. Add RAG, multi-model, tool calling, and the full cost dashboard in v2.
How many screens does a real AI chatbot product need? The minimum is around 20: chat interface, conversation sidebar, new chat, settings (account, model preferences, API keys for BYOK), billing, admin (cost dashboard, prompt management, eval suite), marketing site (landing, pricing, features, FAQ), and auth (5 screens).
