How to Build an AI Chatbot From Scratch: A Step-by-Step Guide

Tutorials

Building an AI chatbot is one of the best ways to understand how modern AI applications work under the hood. In this tutorial, we will build a fully functional chatbot with streaming responses, conversation memory, and a clean UI — then deploy it to production.

By the end, you will have a chatbot that rivals the basic functionality of ChatGPT's interface, running on your own infrastructure with your own API key.

Architecture Overview

Before writing code, let us map out what we are building:

┌─────────────┐     HTTP/SSE      ┌──────────────┐     API Call     ┌─────────────┐
│  React UI   │ ───────────────▶  │  Node.js API │ ──────────────▶  │  LLM API    │
│  (Frontend) │ ◀───────────────  │  (Backend)   │ ◀──────────────  │  (Claude/   │
│             │   Streamed tokens │              │  Streamed tokens │   OpenAI)   │
└─────────────┘                   └──────────────┘                  └─────────────┘
                                        │
                                        ▼
                                  ┌──────────────┐
                                  │  In-Memory   │
                                  │  Conversation│
                                  │  Store       │
                                  └──────────────┘
The stack: React frontend, Express.js backend, and either the Anthropic or OpenAI API for the language model. We will use Server-Sent Events (SSE) for streaming.
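The SSE wire format itself is simple: each event is a `data:` line carrying a payload, terminated by a blank line. As a sketch, the framing logic looks like this (`sseFrame` is an illustrative name; the server code later in this tutorial writes these frames inline):

```javascript
// Format a payload object as one Server-Sent Events frame: a "data:" line
// holding JSON, followed by a blank line that terminates the event.
function sseFrame(payload) {
  return `data: ${JSON.stringify(payload)}\n\n`;
}

// The kind of frame the chat endpoint will emit for each token:
console.log(sseFrame({ type: 'token', content: 'Hello' }));
// → data: {"type":"token","content":"Hello"}  (plus a trailing blank line)
```

The browser's `EventSource` API understands this format natively, but because we need to POST a request body, we will read the stream manually with `fetch` instead.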

Step 1: Choose Your Model API

You have two primary options for the LLM backend:
  • Anthropic Claude API — Excellent for nuanced, longer-form responses. Claude's system prompts are powerful for shaping chatbot personality. The API uses a messages-based format that maps cleanly to chat interfaces.
  • OpenAI GPT API — The most widely documented option. GPT-4o provides fast, capable responses. The Chat Completions API is straightforward.

For this tutorial, we will use the Anthropic Claude API, but the architecture works identically with OpenAI — you only swap out the API call in one function.

Get your API key: Sign up at console.anthropic.com, create a project, and generate an API key. Store it securely — never commit it to version control.
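The reason the swap is a one-function change: both providers consume the same conceptual input, an ordered list of role-tagged messages. A sketch of that shared shape (illustrative content; one difference to note is that OpenAI carries the system prompt as a message in the array, while Anthropic takes it as a separate `system` parameter):

```javascript
// The messages format both chat APIs understand: roles alternate between
// 'user' and 'assistant', and the newest user message comes last.
const history = [
  { role: 'user', content: 'What is SSE?' },
  { role: 'assistant', content: 'Server-Sent Events: one-way streaming over HTTP.' },
  { role: 'user', content: 'How does it differ from WebSockets?' },
];

// The model is asked to continue the conversation from the final user turn.
console.log(history[history.length - 1].content);
```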

Step 2: Set Up the Backend

Initialize a Node.js project and install dependencies:
mkdir ai-chatbot && cd ai-chatbot
npm init -y
npm install express cors @anthropic-ai/sdk dotenv uuid
Create your environment file:
# .env
ANTHROPIC_API_KEY=sk-ant-your-key-here
PORT=3001
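A missing key only surfaces later as a confusing API error, so it is worth failing fast at startup. A small sketch (`requireEnv` is a hypothetical helper, kept pure so it is easy to test; pass `process.env` when wiring it into the server):

```javascript
// Return the named variable from an environment object, or throw with a
// clear message so the server refuses to start while misconfigured.
function requireEnv(env, name) {
  const value = env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// In server.js: const apiKey = requireEnv(process.env, 'ANTHROPIC_API_KEY');
```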
Now build the Express server. Create server.js:
import express from 'express';
import cors from 'cors';
import Anthropic from '@anthropic-ai/sdk';
import { randomUUID } from 'crypto';
import 'dotenv/config';

const app = express();
app.use(cors());
app.use(express.json());

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

// In-memory conversation store
const conversations = new Map();

const SYSTEM_PROMPT = `You are a helpful, knowledgeable assistant.
You give clear, concise answers and ask clarifying questions
when a request is ambiguous. You format responses with markdown
when it improves readability.`;

app.listen(process.env.PORT || 3001, () => {
  console.log(`Server running on port ${process.env.PORT || 3001}`);
});
This gives us a running server with the Anthropic client initialized and a Map to store conversation histories.

Step 3: Build the Chat Endpoint with Streaming

The key to a responsive chatbot is streaming. Instead of waiting for the entire response to generate (which can take 10-30 seconds for long answers), we stream tokens to the frontend as they are produced. Add this endpoint to server.js:
app.post('/api/chat', async (req, res) => {
  const { message, conversationId } = req.body;

  // Get or create conversation
  const convId = conversationId || randomUUID();
  if (!conversations.has(convId)) {
    conversations.set(convId, []);
  }
  const history = conversations.get(convId);

  // Add user message to history
  history.push({ role: 'user', content: message });

  // Set up SSE headers
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');

  // Send conversation ID first
  res.write(`data: ${JSON.stringify({ type: 'id', conversationId: convId })}\n\n`);

  try {
    let fullResponse = '';

    const stream = anthropic.messages.stream({
      model: 'claude-sonnet-4-20250514',
      max_tokens: 4096,
      system: SYSTEM_PROMPT,
      messages: history,
    });

    stream.on('text', (text) => {
      fullResponse += text;
      res.write(`data: ${JSON.stringify({ type: 'token', content: text })}\n\n`);
    });

    stream.on('finalMessage', () => {
      // Save assistant response to history
      history.push({ role: 'assistant', content: fullResponse });

      res.write(`data: ${JSON.stringify({ type: 'done' })}\n\n`);
      res.end();
    });

    stream.on('error', (error) => {
      console.error('Stream error:', error);
      res.write(`data: ${JSON.stringify({ type: 'error', message: error.message })}\n\n`);
      res.end();
    });
  } catch (error) {
    console.error('API error:', error);
    res.write(`data: ${JSON.stringify({ type: 'error', message: 'Failed to generate response' })}\n\n`);
    res.end();
  }
});
Let us break down what this does:
  • Receives the user message and either retrieves an existing conversation or creates a new one.
  • Sets SSE headers so the browser knows to expect a stream of events.
  • Calls the Anthropic API with streaming enabled. The .stream() method returns an event emitter that fires text events as tokens arrive.
  • Forwards each token to the client as an SSE event.
  • Saves the complete response to conversation history when the stream finishes.
Step 4: Add Conversation Management

Users need to start new conversations and retrieve existing ones. Add these endpoints:
// List conversations (returns IDs and first message preview)
app.get('/api/conversations', (req, res) => {
  const list = [];
  for (const [id, messages] of conversations) {
    if (messages.length > 0) {
      list.push({
        id,
        preview: messages[0].content.substring(0, 80),
        messageCount: messages.length,
        lastUpdated: Date.now(), // placeholder: the in-memory store does not track update times
      });
    }
  }
  res.json(list);
});

// Get full conversation history
app.get('/api/conversations/:id', (req, res) => {
  const history = conversations.get(req.params.id);
  if (!history) {
    return res.status(404).json({ error: 'Conversation not found' });
  }
  res.json({ id: req.params.id, messages: history });
});

// Delete a conversation
app.delete('/api/conversations/:id', (req, res) => {
  conversations.delete(req.params.id);
  res.json({ success: true });
});
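From the frontend, these endpoints are plain JSON calls. As a small sketch, a formatter for the list response that a sidebar could use (`formatConversationList` is a hypothetical helper; the fields match the `/api/conversations` payload above):

```javascript
// Turn the /api/conversations payload into display strings for a sidebar.
function formatConversationList(list) {
  return list.map(
    (c) => `${c.preview} (${c.messageCount} message${c.messageCount === 1 ? '' : 's'})`
  );
}

// Example with the shape the endpoint returns:
const labels = formatConversationList([
  { id: 'abc', preview: 'What is SSE?', messageCount: 4 },
]);
console.log(labels[0]); // → What is SSE? (4 messages)
```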

Step 5: Build the Chat UI

For the frontend, create a React application. We will keep it focused on the chat functionality:
npm create vite@latest client -- --template react
cd client
npm install

Replace src/App.jsx with the chat interface:
import { useState, useRef, useEffect } from 'react';
import './App.css';

function App() {
  const [messages, setMessages] = useState([]);
  const [input, setInput] = useState('');
  const [isStreaming, setIsStreaming] = useState(false);
  const [conversationId, setConversationId] = useState(null);
  const messagesEndRef = useRef(null);

  const scrollToBottom = () => {
    messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' });
  };

  useEffect(() => { scrollToBottom(); }, [messages]);

  const sendMessage = async () => {
    if (!input.trim() || isStreaming) return;

    const userMessage = input.trim();
    setInput('');
    setMessages(prev => [...prev, { role: 'user', content: userMessage }]);
    setIsStreaming(true);

    // Add empty assistant message that we will stream into
    setMessages(prev => [...prev, { role: 'assistant', content: '' }]);

    try {
      const response = await fetch('http://localhost:3001/api/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          message: userMessage,
          conversationId,
        }),
      });

      const reader = response.body.getReader();
      const decoder = new TextDecoder();

      while (true) {
        const { done, value } = await reader.read();
        if (done) break;

        // { stream: true } keeps multi-byte characters intact across chunks
        const chunk = decoder.decode(value, { stream: true });
        const lines = chunk.split('\n').filter(line => line.startsWith('data: '));

        for (const line of lines) {
          const data = JSON.parse(line.slice(6));

          if (data.type === 'id') {
            setConversationId(data.conversationId);
          } else if (data.type === 'token') {
            setMessages(prev => {
              // Replace the last message with a new object so React detects the change
              const updated = [...prev];
              const last = updated[updated.length - 1];
              updated[updated.length - 1] = { ...last, content: last.content + data.content };
              return updated;
            });
          } else if (data.type === 'error') {
            console.error('Stream error:', data.message);
          }
        }
      }
    } catch (error) {
      console.error('Request failed:', error);
      setMessages(prev => {
        const updated = [...prev];
        updated[updated.length - 1] = {
          ...updated[updated.length - 1],
          content: 'Sorry, something went wrong. Please try again.',
        };
        return updated;
      });
    } finally {
      setIsStreaming(false);
    }
  };

  const handleKeyDown = (e) => {
    if (e.key === 'Enter' && !e.shiftKey) {
      e.preventDefault();
      sendMessage();
    }
  };

  return (
    <div className="chat-container">
      <header className="chat-header">
        <h1>AI Chatbot</h1>
        <button onClick={() => { setMessages([]); setConversationId(null); }}>
          New Chat
        </button>
      </header>

      <div className="messages">
        {messages.map((msg, i) => (
          <div key={i} className={`message ${msg.role}`}>
            <div className="message-content">{msg.content}</div>
          </div>
        ))}
        <div ref={messagesEndRef} />
      </div>

      <div className="input-area">
        <textarea
          value={input}
          onChange={(e) => setInput(e.target.value)}
          onKeyDown={handleKeyDown}
          placeholder="Type your message..."
          rows={1}
          disabled={isStreaming}
        />
        <button onClick={sendMessage} disabled={isStreaming || !input.trim()}>
          {isStreaming ? '...' : 'Send'}
        </button>
      </div>
    </div>
  );
}

export default App;
    
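One subtlety the reader loop above glosses over: a network chunk can end in the middle of an SSE frame, which would make `JSON.parse` throw on the partial line. Buffering the leftover text between reads fixes this. Here is a sketch of such a parser that could be substituted into the loop (`parseSseChunk` is a hypothetical name, not a browser API):

```javascript
// Accumulate SSE text across network chunks and extract complete frames.
// Returns parsed JSON events plus the leftover partial frame to carry over
// into the next call.
function parseSseChunk(buffer, chunk) {
  const text = buffer + chunk;
  const frames = text.split('\n\n');
  const remainder = frames.pop(); // last piece may be an incomplete frame
  const events = [];
  for (const frame of frames) {
    for (const line of frame.split('\n')) {
      if (line.startsWith('data: ')) {
        events.push(JSON.parse(line.slice(6)));
      }
    }
  }
  return { events, remainder };
}

// A frame split across two chunks parses cleanly once the second chunk arrives:
let state = parseSseChunk('', 'data: {"type":"token","content":"Hel');
state = parseSseChunk(state.remainder, 'lo"}\n\n');
console.log(state.events[0].content); // → Hello
```

In the component, you would keep the `remainder` string in a local variable across iterations of the `while` loop and feed each decoded chunk through this function instead of the bare `split('\n')`.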

Step 6: Handle Edge Cases

A production chatbot needs to handle several things that tutorials often skip.

Token Limit Management

Conversation histories grow indefinitely, but the API has a context window limit. Add a function to trim old messages when the conversation gets too long:
function trimHistory(messages, maxTokenEstimate = 150000) {
  // Rough estimate: 1 token ≈ 4 characters
  const estimateTokens = (msgs) =>
    msgs.reduce((sum, m) => sum + Math.ceil(m.content.length / 4), 0);

  while (messages.length > 2 && estimateTokens(messages) > maxTokenEstimate) {
    // Remove the oldest user-assistant pair, keeping the first message for context
    messages.splice(1, 2);
  }
  return messages;
}

Call trimHistory(history) before passing messages to the API. This preserves the first message (which often sets context) while removing older exchanges from the middle.
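To see the trimming behavior concretely, here is a quick check with a deliberately tiny token budget (the function is repeated so the snippet is self-contained; a 20-token budget is for illustration only):

```javascript
// Same trimHistory as above: drop the oldest user/assistant pair after the
// first message until the estimated token count fits the budget.
function trimHistory(messages, maxTokenEstimate = 150000) {
  const estimateTokens = (msgs) =>
    msgs.reduce((sum, m) => sum + Math.ceil(m.content.length / 4), 0);
  while (messages.length > 2 && estimateTokens(messages) > maxTokenEstimate) {
    messages.splice(1, 2);
  }
  return messages;
}

const history = [
  { role: 'user', content: 'first message sets the context' }, // always kept
  { role: 'user', content: 'old question' },                   // dropped first
  { role: 'assistant', content: 'old answer' },                // dropped first
  { role: 'user', content: 'recent question' },
  { role: 'assistant', content: 'recent answer' },
];

// ~22 estimated tokens against a budget of 20: one pair gets trimmed.
const trimmed = trimHistory(history, 20);
console.log(trimmed.map((m) => m.content));
// → [ 'first message sets the context', 'recent question', 'recent answer' ]
```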

Rate Limiting

Protect your API key from abuse with basic rate limiting:
import rateLimit from 'express-rate-limit';

const limiter = rateLimit({
  windowMs: 60 * 1000, // 1 minute
  max: 20, // 20 requests per minute per IP
  message: { error: 'Too many requests. Please wait a moment.' },
});

app.use('/api/chat', limiter);

Graceful Error Recovery

When the API returns errors — rate limits, overloaded servers, invalid requests — your chatbot should not just crash. The streaming error handler we built earlier catches API-level errors, but you should also handle network timeouts:
const stream = anthropic.messages.stream({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 4096,
  system: SYSTEM_PROMPT,
  messages: trimHistory(history),
}).on('error', (error) => {
  if (error.status === 429) {
    res.write(`data: ${JSON.stringify({
      type: 'error',
      message: 'Rate limited. Please wait 30 seconds and try again.'
    })}\n\n`);
  } else {
    res.write(`data: ${JSON.stringify({
      type: 'error',
      message: 'An error occurred. Please try again.'
    })}\n\n`);
  }
  res.end();
});
    
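For transient failures such as rate limits and overloaded servers, a retry with exponential backoff often succeeds on the second attempt. A sketch of the idea (`backoffDelayMs` and `withRetry` are hypothetical helpers; the base delay and cap are arbitrary choices, not API requirements):

```javascript
// Exponential backoff with a cap: 1s, 2s, 4s, ... up to 30s.
function backoffDelayMs(attempt, baseMs = 1000, capMs = 30000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Retry an async operation when the error looks transient (429 or 5xx-style).
async function withRetry(fn, maxAttempts = 3) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      const transient = error.status === 429 || error.status >= 500;
      if (!transient || attempt + 1 >= maxAttempts) throw error;
      await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
    }
  }
}
```

In practice you would retry only the creation of the stream, before any tokens have been forwarded; once streaming to the client has started, surfacing the error as an SSE event (as above) is simpler than restarting mid-response.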

Step 7: Add Markdown Rendering

AI responses frequently contain markdown — code blocks, lists, headers, bold text. Rendering raw markdown in the browser looks terrible. Add a markdown renderer to the frontend:
cd client
npm install react-markdown remark-gfm rehype-highlight

Update the message display component:
import ReactMarkdown from 'react-markdown';
import remarkGfm from 'remark-gfm';
import rehypeHighlight from 'rehype-highlight';

// Inside the messages map:
<div className="message-content">
  {msg.role === 'assistant' ? (
    <ReactMarkdown remarkPlugins={[remarkGfm]} rehypePlugins={[rehypeHighlight]}>
      {msg.content}
    </ReactMarkdown>
  ) : (
    msg.content
  )}
</div>

This gives you GitHub-flavored markdown with syntax-highlighted code blocks. The visual improvement is dramatic — responses with code snippets, tables, or structured lists become genuinely readable.

Step 8: Deploy to Production

For deployment, we need to combine the frontend and backend into a single deployable unit.

Build the Frontend

cd client
npm run build

This creates a dist/ folder with static files.

Serve Static Files from Express

Add this to your server.js, after your API routes:
import path from 'path';
import { fileURLToPath } from 'url';

const __dirname = path.dirname(fileURLToPath(import.meta.url));

// Serve the built React app
app.use(express.static(path.join(__dirname, 'client', 'dist')));

// Catch-all: serve index.html for client-side routing
app.get('*', (req, res) => {
  res.sendFile(path.join(__dirname, 'client', 'dist', 'index.html'));
});

Deploy to a Cloud Provider

Railway or Render (simplest): Push your repo to GitHub, connect it to Railway or Render, set the ANTHROPIC_API_KEY environment variable, and deploy. Both platforms detect Node.js automatically and handle the rest.

Docker (most portable):
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --production
COPY . .
RUN cd client && npm ci && npm run build
EXPOSE 3001
CMD ["node", "server.js"]

Build and run: docker build -t chatbot . && docker run -p 3001:3001 --env-file .env chatbot

Production Checklist

Before going live, verify these items:
• Environment variables are set on the hosting platform, not hardcoded
• CORS is restricted to your actual domain instead of allowing all origins
• Rate limiting is configured appropriately for your expected traffic
• HTTPS is enabled (most platforms handle this automatically)
• Error logging is connected to a service like Sentry or LogTail so you catch issues in production
• Conversation cleanup — add a TTL to your conversation store so old conversations are deleted after 24 hours, or switch to Redis for persistent storage with built-in expiration
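The last checklist item can be done with a periodic sweep over the Map. A sketch, assuming you extend each store entry to `{ messages, lastActive }` (the tutorial's store as written keeps only the message array, so the timestamp is an addition):

```javascript
const TTL_MS = 24 * 60 * 60 * 1000; // 24 hours

// Pure check, easy to test: has this conversation been idle past the TTL?
function isExpired(lastActive, now, ttlMs = TTL_MS) {
  return now - lastActive > ttlMs;
}

// Sweep the store every hour, deleting conversations that have gone stale.
// Assumes entries shaped as { messages, lastActive } (an extension of the
// tutorial's store). Returns the interval handle so it can be cleared.
function startCleanup(conversations) {
  return setInterval(() => {
    const now = Date.now();
    for (const [id, convo] of conversations) {
      if (isExpired(convo.lastActive, now)) conversations.delete(id);
    }
  }, 60 * 60 * 1000);
}
```

For multi-server deployments, Redis with a per-key `EXPIRE` does the same job without any sweep code.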

Going Further

This chatbot is functional but intentionally minimal. Here are high-impact improvements worth implementing:
• Persistent storage. Replace the in-memory Map with PostgreSQL or Redis. This lets conversations survive server restarts and enables multi-server deployments.
• Authentication. Add user accounts so conversations are private. A simple JWT-based auth system works well. Libraries like passport.js or lucia-auth handle the heavy lifting.
• File uploads. Claude's API supports image inputs. Add a file upload endpoint that converts images to base64 and includes them in the messages array. This enables vision-based conversations.
• System prompt customization. Let users configure the chatbot's personality. Store system prompts per conversation and let users modify them through a settings panel.
• Streaming markdown. Our current implementation re-renders the full markdown on every token. For smoother performance, look into incremental markdown parsing libraries that only process new content.

The core architecture we built — SSE streaming, conversation state management, and a clean separation between frontend and backend — scales cleanly as you add these features. Each improvement is additive rather than requiring a rewrite, which is the sign of a solid foundation.

Tags: Tutorials, RAG, ChatGPT