System Design: Chat Application

Advanced · 35 min read

Why Chat Is the Ultimate Frontend Challenge

Chat applications look simple — a text box, some bubbles, a list of conversations. But under the hood, they're one of the most complex frontend systems you'll ever build. Real-time delivery, optimistic updates, bidirectional scrolling, offline resilience, encryption awareness, presence tracking — all while keeping the UI buttery smooth on a 5-year-old phone.

Slack has hundreds of engineers working on their frontend. Discord rebuilt theirs from scratch to handle scale. WhatsApp Web manages billions of messages daily with a surprisingly lean architecture. The patterns behind these products are exactly what interviewers at FAANG companies want you to articulate.

We'll use the RADIO framework — Requirements, Architecture, Data Model, Interface, Optimizations — to systematically design a production-grade chat application from scratch.

Mental Model

Think of a chat app as a post office that also delivers in real-time. You have mailboxes (conversations), letters (messages), delivery trucks (WebSockets), a sorting facility (the server), and tracking numbers (message statuses). The post office needs to handle everything from same-day delivery to storing undelivered mail when someone's not home (offline). Every part of the system exists because of a real user need — not because it's technically interesting.

R — Requirements

Before writing a single line of code, you need to know exactly what you're building. In a system design interview, this is where you demonstrate product thinking — not just engineering chops.

Functional Requirements

Core messaging:

  • 1:1 direct messages between two users
  • Group conversations with multiple participants
  • Real-time message delivery (appear instantly, no polling)
  • Message statuses: sent, delivered, read (with timestamps)
  • Typing indicators (show when someone is composing)

Rich interactions:

  • Media sharing: images, files, links with previews
  • Emoji reactions on messages
  • Threaded replies (reply to a specific message without cluttering the main feed)
  • Message editing and deletion
  • Search across conversations and message content

User presence:

  • Online, away, offline status
  • Last seen timestamps

Non-Functional Requirements

These are what separate a toy project from a production system:

  • Delivery latency: under 100 ms. Messages must feel instant; anything above 200 ms feels sluggish.
  • Offline support: queue messages and sync on reconnect. Users lose connection constantly on mobile.
  • Accessibility: WCAG 2.1 AA. Screen reader support, keyboard navigation, focus management.
  • Encryption awareness: E2E encryption capability. Privacy is non-negotiable for modern chat.
  • Performance: 60fps scrolling with 10K+ messages. Long conversation history is the norm.
  • Reliability: zero message loss. Users will never forgive a dropped message.

Quiz
In a system design interview, which non-functional requirement should you prioritize FIRST for a chat application?

A — Architecture

Now that we know what we're building, let's design the component tree and layout structure.

Layout Structure

The classic chat layout is a three-panel design: sidebar, main conversation pane, and an optional detail/thread panel.

┌─────────────────────────────────────────────────────────────┐
│  Toolbar / App Header                                       │
├──────────┬──────────────────────────┬───────────────────────┤
│          │                          │                       │
│  Chat    │   Message Thread         │   Thread Panel        │
│  Sidebar │   (active conversation)  │   (optional, slides   │
│          │                          │    in from right)     │
│  ┌────┐  │   ┌──────────────────┐   │                       │
│  │Conv│  │   │  MessageBubble   │   │   Shows replies to    │
│  │List│  │   │  MessageBubble   │   │   a specific message  │
│  │    │  │   │  MessageBubble   │   │                       │
│  │    │  │   │  TypingIndicator │   │                       │
│  └────┘  │   └──────────────────┘   │                       │
│          │   ┌──────────────────┐   │                       │
│          │   │  MessageInput    │   │                       │
│          │   └──────────────────┘   │                       │
└──────────┴──────────────────────────┴───────────────────────┘

On mobile, this collapses into a single-pane navigation — sidebar and conversation are separate views with slide transitions.

Component Tree

Here's the component hierarchy, organized by responsibility:

ChatApp
├── ChatSidebar
│   ├── SearchBar
│   ├── ConversationList
│   │   └── ConversationItem (repeated)
│   │       ├── Avatar
│   │       ├── ConversationPreview (last message, timestamp)
│   │       └── UnreadBadge
│   └── UserPresenceStatus
├── MessageThread
│   ├── ConversationHeader
│   │   ├── ParticipantAvatars
│   │   └── ConversationActions (pin, mute, search)
│   ├── MessageList (virtualized)
│   │   └── MessageBubble (repeated)
│   │       ├── MessageContent (text, media, links)
│   │       ├── MessageReactions
│   │       ├── MessageStatus (sent/delivered/read)
│   │       └── MessageActions (reply, react, edit, delete)
│   ├── TypingIndicator
│   └── MessageInput
│       ├── TextComposer (contenteditable or textarea)
│       ├── MediaUploadButton
│       ├── EmojiPicker
│       └── SendButton
└── ThreadPanel (optional)
    ├── ThreadHeader
    ├── ParentMessage
    ├── ReplyList (virtualized)
    └── ThreadInput

Server vs Client Components

In a Next.js architecture, think carefully about what needs interactivity:

Server Components (no client JS):

  • ChatApp layout shell
  • ConversationHeader (static info)
  • Initial conversation list (SSR for first paint)

Client Components (need 'use client'):

  • MessageList — real-time updates, scroll handling
  • MessageInput — user interaction, typing events
  • TypingIndicator — WebSocket-driven
  • ConversationList — live unread counts, reordering
  • EmojiPicker — interactive overlay
  • MediaUploadButton — file handling

The SSR sweet spot

Server-render the conversation list and the most recent messages for the active conversation. Then hydrate and hand off to WebSocket for live updates. This gives you fast first paint AND real-time — the best of both worlds.
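One wrinkle in the handoff: a message can show up twice, once in the server-rendered payload and once again via the live socket. A small dedup-on-merge helper avoids duplicates; this is a sketch with a trimmed message shape, not a prescribed API:

```typescript
// Merge SSR-loaded (or gap-filled) messages with live socket arrivals,
// deduplicating by ID. `Msg` is a trimmed stand-in for the full Message
// interface defined later in the data model.
interface Msg {
  id: string
  createdAt: number
}

function mergeMessages(existing: Msg[], incoming: Msg[]): Msg[] {
  const byId = new Map(existing.map(m => [m.id, m] as const))
  for (const m of incoming) byId.set(m.id, m) // the incoming copy wins
  return [...byId.values()].sort((a, b) => a.createdAt - b.createdAt)
}
```

The same helper works for reconnection gap-filling later on: anything fetched over REST can be merged without worrying about overlap with socket-delivered messages.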

Quiz
Which component should NOT be a Client Component in a Next.js chat application?

D — Data Model

The data model is the backbone. Get this wrong and every feature built on top becomes painful.

Core Entities

type MessageStatus = 'sending' | 'sent' | 'delivered' | 'read' | 'failed'

type MessageType = 'text' | 'image' | 'file' | 'system'

type ConversationType = 'direct' | 'group'

interface User {
  id: string
  name: string
  avatarUrl: string
  presence: 'online' | 'away' | 'offline'
  lastSeen: number
}

interface Conversation {
  id: string
  type: ConversationType
  name: string | null
  participants: Participant[]
  lastMessage: Message | null
  unreadCount: number
  isPinned: boolean
  isMuted: boolean
  updatedAt: number
}

interface Participant {
  userId: string
  role: 'owner' | 'admin' | 'member'
  joinedAt: number
  lastReadMessageId: string | null
}

interface Message {
  id: string
  conversationId: string
  senderId: string
  content: string
  type: 'text' | 'image' | 'file' | 'system'
  status: MessageStatus
  parentMessageId: string | null
  reactions: Reaction[]
  editedAt: number | null
  createdAt: number
}

interface Reaction {
  emoji: string
  userIds: string[]
}

Why Normalized State

Here's a trap most junior engineers fall into: they nest messages inside conversations and users inside messages. That creates a nightmare.

// Bad: deeply nested, duplicated user data everywhere
interface BadState {
  conversations: Array<{
    id: string
    messages: Array<{
      id: string
      sender: User  // duplicated across every message
      // ...
    }>
  }>
}

// Good: normalized, flat, no duplication
interface ChatState {
  users: Record<string, User>
  conversations: Record<string, Conversation>
  messages: Record<string, Message>
  messagesByConversation: Record<string, string[]>
  activeConversationId: string | null
}

Normalized state means:

  • No data duplication — a user's avatar URL is stored once, not in every message
  • O(1) lookups — finding a message by ID is instant
  • Easy updates — changing a user's presence updates one place, reflected everywhere
  • Consistent — no stale copies of the same entity
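As a concrete payoff, here is a sketch of a presence update against a trimmed version of this state shape (reducer wiring omitted; names are illustrative):

```typescript
// With normalized state, a presence change touches exactly one record;
// every component reading that user sees the update.
type Presence = 'online' | 'away' | 'offline'

interface User {
  id: string
  name: string
  presence: Presence
}

interface ChatState {
  users: Record<string, User>
}

function setPresence(state: ChatState, userId: string, presence: Presence): ChatState {
  const user = state.users[userId]
  if (!user || user.presence === presence) return state // no-op: return the same reference
  return {
    ...state,
    users: { ...state.users, [userId]: { ...user, presence } },
  }
}
```

Returning the same reference on a no-op matters in React: unchanged references let memoized components skip re-rendering.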

Cursor-Based Pagination

Chat messages are loaded newest-first, which means offset-based pagination breaks when new messages arrive (every offset shifts). Cursor-based pagination solves this.

interface MessagePage {
  messages: Message[]
  cursor: string | null
  hasMore: boolean
}

// Load older messages: "give me 50 messages before this cursor"
// The cursor is the ID (or timestamp) of the oldest loaded message

The cursor is stable — it points to a specific message, not a position. New messages arriving don't shift anything.
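To make the stability argument concrete, here is an in-memory sketch of cursor paging (in production the server runs this as a query; `pageBefore` and the `Msg` shape are assumptions for illustration):

```typescript
// Page backwards through a chronologically sorted (oldest-first) array.
interface Msg { id: string; createdAt: number }

interface MessagePage {
  messages: Msg[]
  cursor: string | null
  hasMore: boolean
}

function pageBefore(all: Msg[], cursor: string | null, limit: number): MessagePage {
  // A null cursor means "start from the newest messages"
  const idx = cursor === null ? all.length : all.findIndex(m => m.id === cursor)
  const end = idx < 0 ? all.length : idx
  const start = Math.max(0, end - limit)
  const messages = all.slice(start, end)
  return {
    messages,
    cursor: messages.length > 0 ? messages[0].id : null, // oldest message loaded
    hasMore: start > 0,
  }
}
```

Because the cursor names a specific message ID, appending new messages to the array never shifts what an existing cursor returns.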

Common Trap

Never use offset-based pagination for chat messages. When new messages arrive between page loads, offsets shift and you'll either skip messages or show duplicates. Cursor-based pagination is the only correct approach for real-time data.

Quiz
Why is normalized state preferred over nested state in a chat application?

I — Interface

The interface layer defines how components communicate with each other and with the server. This is where the real-time magic lives.

WebSocket Events

WebSocket is the right transport for chat. HTTP polling wastes bandwidth, SSE is unidirectional, and WebTransport isn't widely supported yet.

Here's the event contract between client and server:

Server-to-client events (incoming):

// New message in any conversation the user belongs to
{ event: 'message:new', data: Message }

// Message status changed (delivered, read)
{ event: 'message:status', data: { messageId: string, status: MessageStatus, userId: string } }

// Someone started or stopped typing
{ event: 'typing:start', data: { conversationId: string, userId: string } }
{ event: 'typing:stop', data: { conversationId: string, userId: string } }

// User presence changed
{ event: 'presence:update', data: { userId: string, presence: 'online' | 'away' | 'offline' } }

// Reaction added or removed
{ event: 'reaction:update', data: { messageId: string, emoji: string, userId: string, action: 'add' | 'remove' } }

Client-to-server events (outgoing):

// Send a new message
{ event: 'message:send', data: { conversationId: string, content: string, type: MessageType, tempId: string } }

// Mark messages as read
{ event: 'message:read', data: { conversationId: string, lastReadMessageId: string } }

// Typing indicator
{ event: 'typing:start', data: { conversationId: string } }
{ event: 'typing:stop', data: { conversationId: string } }

Notice the tempId in message:send — that's critical for optimistic updates. The client generates a temporary ID, renders the message immediately, then replaces it with the server-assigned ID on confirmation.
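The reconciliation step on ACK can be sketched as a pure state update (names like `reconcileAck` and the trimmed shapes are illustrative, not a fixed API):

```typescript
// Swap the optimistic message keyed by tempId for the server's canonical copy,
// preserving its position in the display order.
interface Msg { id: string; status: string; content: string }

interface Thread {
  messages: Record<string, Msg>
  order: string[] // message IDs in display order
}

function reconcileAck(thread: Thread, tempId: string, serverMsg: Msg): Thread {
  if (!(tempId in thread.messages)) return thread // ACK for an unknown message
  const { [tempId]: _removed, ...rest } = thread.messages
  return {
    messages: { ...rest, [serverMsg.id]: serverMsg },
    order: thread.order.map(id => (id === tempId ? serverMsg.id : id)),
  }
}
```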

REST API for Historical Data

WebSocket is for real-time. REST is for loading history and performing CRUD operations:

GET    /api/conversations                       → ConversationList
GET    /api/conversations/:id                   → Conversation details
GET    /api/conversations/:id/messages?cursor=X → MessagePage (cursor-based)
POST   /api/conversations                       → Create conversation
POST   /api/conversations/:id/messages          → Send message (fallback if WS down)
PUT    /api/messages/:id                        → Edit message
DELETE /api/messages/:id                        → Delete message
GET    /api/search?q=query                      → Search messages
POST   /api/media/upload                        → Upload file/image

Component Interface (Props API)

Your components need clean, well-defined interfaces:

interface MessageListProps {
  conversationId: string
  messages: Message[]
  onLoadMore: () => void
  hasMore: boolean
  isLoading: boolean
}

interface MessageBubbleProps {
  message: Message
  sender: User
  isOwn: boolean
  onReact: (emoji: string) => void
  onReply: () => void
  onEdit: () => void
  onDelete: () => void
}

interface MessageInputProps {
  conversationId: string
  replyTo: Message | null
  onSend: (content: string, type: MessageType) => void
  onTyping: (isTyping: boolean) => void
  onCancelReply: () => void
}

Quiz
Why does the client send a tempId with every message:send WebSocket event?

O — Optimizations

This is where good engineers separate from great ones. The basic chat works. Now make it feel incredible.

1. Optimistic Message Sending

Users should never wait for a server response to see their own message. The moment they hit send:

  1. Generate a tempId (e.g., crypto.randomUUID())
  2. Insert the message into state with status 'sending'
  3. Render it immediately with a subtle sending indicator
  4. Fire the WebSocket event
  5. On server ACK: replace tempId with real ID, update status to 'sent'
  6. On failure: update status to 'failed', show retry button

function sendMessage(content: string) {
  const tempId = crypto.randomUUID()
  const optimisticMessage: Message = {
    id: tempId,
    conversationId: activeConversation.id,
    senderId: currentUser.id,
    content,
    type: 'text',
    status: 'sending',
    parentMessageId: null,
    reactions: [],
    editedAt: null,
    createdAt: Date.now(),
  }

  dispatch({ type: 'MESSAGE_ADD', payload: optimisticMessage })

  socket.emit('message:send', {
    conversationId: activeConversation.id,
    content,
    type: 'text',
    tempId,
  })
}

2. Virtualized Message List with Reverse Scroll

A conversation with 10K messages cannot render 10K DOM nodes. Virtualization renders only the visible messages (plus a small buffer), typically 20-30 elements at a time.

But chat virtualization has a unique twist: reverse scroll. New messages appear at the bottom, and users scroll up to load history. This requires:

  • Reverse scroll direction — new items are appended at the bottom, scroll position anchored
  • Scroll anchoring — when older messages load at the top, the scroll position should not jump. The message the user was looking at must stay in place
  • Dynamic row heights — messages have variable height (text length, media, reactions). You cannot use fixed-height virtualization
  • Sticky date separators — "Today", "Yesterday", date headers that stick as you scroll

// Scroll anchoring: save position before prepending, restore after
function prependMessages(newMessages: Message[]) {
  const scrollContainer = listRef.current
  const previousScrollHeight = scrollContainer.scrollHeight
  const previousScrollTop = scrollContainer.scrollTop

  dispatch({ type: 'MESSAGES_PREPEND', payload: newMessages })

  requestAnimationFrame(() => {
    const newScrollHeight = scrollContainer.scrollHeight
    scrollContainer.scrollTop =
      previousScrollTop + (newScrollHeight - previousScrollHeight)
  })
}

3. Typing Indicator Debounce

Without debounce, every keystroke sends a WebSocket event. With debounce, you send typing:start on the first keystroke, then only typing:stop after the user pauses for ~2 seconds.

import { useCallback, useEffect, useRef } from 'react'

// The native WebSocket API has no emit() method; this hook assumes a
// Socket.IO-style socket, so we type the minimal surface it uses.
interface EmitSocket {
  emit: (event: string, payload: unknown) => void
}

function useTypingIndicator(conversationId: string, socket: EmitSocket) {
  const timeoutRef = useRef<ReturnType<typeof setTimeout> | undefined>(undefined)
  const isTypingRef = useRef(false)

  const handleInput = useCallback(() => {
    if (!isTypingRef.current) {
      isTypingRef.current = true
      socket.emit('typing:start', { conversationId })
    }

    clearTimeout(timeoutRef.current)
    timeoutRef.current = setTimeout(() => {
      isTypingRef.current = false
      socket.emit('typing:stop', { conversationId })
    }, 2000)
  }, [conversationId, socket])

  useEffect(() => {
    return () => {
      clearTimeout(timeoutRef.current)
      if (isTypingRef.current) {
        socket.emit('typing:stop', { conversationId })
      }
    }
  }, [conversationId, socket])

  return handleInput
}

4. Reconnection with Gap Filling

WebSocket connections drop. Wi-Fi switches, laptop lids close, tunnels happen. When the connection restores, you need to fill the gap:

  1. Track the timestamp (or ID) of the last received message
  2. On reconnect, request all messages since that timestamp via REST
  3. Merge them into local state, resolving any conflicts with optimistic messages
  4. Re-subscribe to all active conversation channels

socket.addEventListener('open', () => {
  if (lastReceivedTimestamp) {
    fetch(`/api/sync?since=${lastReceivedTimestamp}`)
      .then(res => res.json())
      .then(({ messages, statusUpdates }) => {
        dispatch({ type: 'MESSAGES_MERGE', payload: messages })
        dispatch({ type: 'STATUSES_MERGE', payload: statusUpdates })
      })
  }
})
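The reconnect itself is usually scheduled with exponential backoff plus jitter, so thousands of clients don't hammer the server in lockstep after a blip. A sketch (the constants are assumptions, not requirements):

```typescript
// Delay before reconnect attempt N: exponential growth, capped, with
// jitter in [cap/2, cap) to avoid a thundering herd of reconnects.
function backoffDelay(attempt: number, baseMs = 500, maxMs = 30_000): number {
  const cap = Math.min(maxMs, baseMs * 2 ** attempt)
  return cap / 2 + Math.random() * (cap / 2)
}
```

The attempt counter resets to zero once a connection successfully opens; the gap-fill fetch above then runs against the new connection.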

5. Unread Count Badges

Unread counts need to be:

  • Updated in real-time (new messages increment, reading a conversation resets)
  • Persisted across sessions (the server is the source of truth, not local state)
  • Displayed on the conversation list AND the browser tab title

// Update document title with total unread count
useEffect(() => {
  const totalUnread = Object.values(conversations)
    .reduce((sum, conv) => sum + conv.unreadCount, 0)
  document.title = totalUnread > 0
    ? `(${totalUnread}) Chat App`
    : 'Chat App'
}, [conversations])

6. Push Notifications via Service Worker

When the tab is in the background or closed, push notifications keep users informed:

// Register the service worker and subscribe to push notifications
async function enablePushNotifications(vapidPublicKey: string) {
  if (!('serviceWorker' in navigator) || !('PushManager' in window)) return

  const registration = await navigator.serviceWorker.register('/sw.js')
  const subscription = await registration.pushManager.subscribe({
    userVisibleOnly: true,
    applicationServerKey: vapidPublicKey,
  })
  await fetch('/api/push/subscribe', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(subscription),
  })
}

The service worker intercepts push events and shows native notifications even when the app is closed.

7. Lazy Loading Media

Images and files in messages should not block the initial render. Load them lazily:

  • Use loading="lazy" on image elements
  • Show a blurred placeholder (blurhash) or aspect-ratio skeleton while loading
  • Preload media that's about to scroll into view (intersection observer with a margin)
  • Cache loaded media URLs in a Map so re-rendering doesn't re-fetch
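The last bullet can be sketched as a tiny promise cache (`getMediaUrl` and the attachment-ID scheme are assumptions for illustration):

```typescript
// Resolve each attachment ID to a URL once; re-renders reuse the cached
// promise instead of re-fetching. Failed fetches are evicted so a retry works.
const mediaCache = new Map<string, Promise<string>>()

function getMediaUrl(
  attachmentId: string,
  fetchUrl: (id: string) => Promise<string>,
): Promise<string> {
  let cached = mediaCache.get(attachmentId)
  if (!cached) {
    cached = fetchUrl(attachmentId)
    mediaCache.set(attachmentId, cached)
    cached.catch(() => mediaCache.delete(attachmentId)) // don't cache failures
  }
  return cached
}
```

Caching the promise (not the resolved value) also deduplicates concurrent requests for the same attachment.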

End-to-end encryption awareness

In a real production chat app, E2E encryption means the server never sees plaintext message content. The frontend handles:

  • Key exchange via the Signal Protocol (or similar) — each conversation has a shared secret derived from participants' public keys
  • Encryption on send — the message is encrypted before it leaves the client
  • Decryption on receive — the message is decrypted after arriving at the recipient's client
  • Key storage — private keys live in IndexedDB or the Web Crypto API's non-extractable key store
  • Device verification — users can verify each other's identity via QR codes or security numbers

The data model changes slightly: the content field on Message stores ciphertext, and there's an additional encrypted: boolean flag. Decryption happens in a dedicated worker thread to avoid blocking the main thread.

For a system design interview, you don't need to implement E2E encryption — but you must demonstrate awareness of where it fits in the architecture and how it affects the data flow.

Accessibility Deep Dive

Chat applications are notoriously bad at accessibility. Let's not be one of them.

Keyboard Navigation

  • Tab moves between sidebar, message list, and input
  • Arrow Up/Down navigates between messages in the list
  • Enter on a message opens the action menu (reply, react, edit)
  • Escape closes any open overlay (emoji picker, thread panel)
  • Ctrl+N / Cmd+N starts a new conversation

Screen Reader Announcements

New messages must be announced via an ARIA live region:

<div aria-live="polite" aria-atomic="false" class="sr-only">
  <!-- Inject new message announcements here -->
  <!-- "Alice: Hey, are you free for a call?" -->
</div>

Use aria-live="polite" so announcements don't interrupt the user. Each message should be announced with the sender's name and content.
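Building the announcement text is worth a helper of its own: long messages should be truncated, and a burst of messages should be summarized rather than read out one by one. A sketch (the truncation length and coalescing threshold are assumptions):

```typescript
// Text to place in the live region via textContent.
function formatAnnouncement(sender: string, content: string, maxLen = 120): string {
  const text = `${sender}: ${content}`
  return text.length > maxLen ? text.slice(0, maxLen - 1) + '…' : text
}

// If several messages land in one announcement window, summarize the
// overflow instead of reading every message aloud.
function coalesceAnnouncements(pending: string[], maxItems = 3): string {
  if (pending.length <= maxItems) return pending.join('. ')
  const shown = pending.slice(-maxItems)
  return `${pending.length - maxItems} earlier messages. ${shown.join('. ')}`
}
```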

Focus Management

  • When switching conversations, focus moves to the message input
  • When opening the thread panel, focus moves to the thread input
  • When closing a modal/overlay, focus returns to the trigger element
  • The message list is a single tab stop with arrow key navigation inside (roving tabindex pattern)
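The roving tabindex pattern keeps exactly one message at tabindex="0" and the rest at -1; arrow keys move which one is active. The index arithmetic is easy to get wrong at the list edges, so here is a pure sketch of it (the DOM wiring is omitted):

```typescript
// Compute the next active index for a key press, clamped to the list bounds.
type NavKey = 'ArrowUp' | 'ArrowDown' | 'Home' | 'End'

function nextIndex(current: number, key: NavKey, count: number): number {
  if (count === 0) return -1 // nothing to focus in an empty list
  switch (key) {
    case 'ArrowUp': return Math.max(0, current - 1)
    case 'ArrowDown': return Math.min(count - 1, current + 1)
    case 'Home': return 0
    case 'End': return count - 1
  }
}
```

The keydown handler then sets tabindex="0" and calls focus() on the element at the returned index, and tabindex="-1" on the previous one.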

Quiz
What ARIA live region politeness level should you use for incoming chat messages?

Putting It All Together

Let's trace the full lifecycle of a message from the sender's perspective:

Execution Trace

  1. User types message
     MessageInput captures keystrokes; the debounced typing indicator fires typing:start via WebSocket. Input is uncontrolled for performance — no re-renders on every keystroke.

  2. User hits Enter/Send
     Generate a tempId via crypto.randomUUID() and create an optimistic Message object with status 'sending'. No network call yet.

  3. Optimistic render
     The message appears in MessageList instantly with a subtle clock icon indicating 'sending'. Auto-scroll to bottom is triggered.

  4. WebSocket emit
     The message:send event is fired with content, conversationId, type, and tempId. The typing indicator timeout fires typing:stop.

  5. Server ACK received
     The server responds with the permanent ID. State is updated: tempId is replaced, status changes to 'sent', and a checkmark icon appears. If the ACK fails after 5s, status becomes 'failed' with a retry button.

  6. Delivery confirmation
     A message:status event arrives: the recipient's client confirmed receipt. Status changes to 'delivered' and a double checkmark appears. Each recipient sends their own delivery confirmation.

  7. Read receipt
     The recipient scrolls to and views the message. A message:status event with 'read' arrives and the double checkmark turns blue. Read receipts can be disabled by users for privacy.

Common Mistakes and Key Patterns

Mistake: Using HTTP polling to check for new messages every second
Why it fails: Polling wastes bandwidth with empty responses and adds at least 1 second of latency. WebSocket delivers messages instantly with minimal overhead. A single WebSocket connection replaces hundreds of HTTP requests per minute.
Instead: Use WebSocket for bidirectional real-time communication

Mistake: Rendering all messages in the DOM without virtualization
Why it fails: A conversation with 5000 messages creates 5000+ DOM nodes, destroying scroll performance and eating memory. Virtualization keeps the DOM to 20-30 nodes regardless of conversation length.
Instead: Virtualize the message list to render only visible items

Mistake: Using offset-based pagination for message history
Why it fails: When new messages arrive between page loads, offsets shift — you will skip messages or show duplicates. A cursor points to a specific message, so it is stable regardless of new arrivals.
Instead: Use cursor-based pagination with the last message ID as cursor

Mistake: Sending a WebSocket event on every keystroke for typing indicators
Why it fails: Without debounce, a user typing a 50-character message fires 50 WebSocket events. With debounce, it fires exactly 2 events (start and stop), reducing server load by 96%.
Instead: Debounce typing indicators: send typing:start on the first keystroke, typing:stop after 2 seconds of inactivity

Mistake: Waiting for server confirmation before showing the sent message
Why it fails: Waiting for a server round-trip adds 50-200ms of perceived latency. Optimistic rendering makes the app feel instant. The status indicator (clock, checkmark, blue checkmark) communicates delivery state without blocking the experience.
Instead: Render the message optimistically with a temporary ID, then reconcile on server ACK
Key Rules

  1. WebSocket for real-time, REST for history — never poll for new messages
  2. Normalize state: flat entity maps with ID lookups, never deeply nested objects
  3. Cursor-based pagination is the only correct approach for real-time message feeds
  4. Optimistic updates with tempId correlation — render first, confirm later
  5. Virtualize the message list — the DOM should never hold more than 30-50 message nodes
  6. Debounce typing indicators: typing:start on first keystroke, typing:stop after inactivity timeout
  7. Reconnection must fill the gap: fetch all missed messages since the last received timestamp
  8. Accessibility is non-negotiable: ARIA live regions for new messages, roving tabindex for message navigation, focus management on view transitions

Interview Tips

When presenting this design in an interview, structure your answer using RADIO — it shows systematic thinking. A few things interviewers specifically look for:

  1. Trade-off awareness — "We could use SSE instead of WebSocket, but SSE is unidirectional, which means we'd need a separate channel for sending. WebSocket gives us bidirectional communication over a single connection."

  2. Failure mode thinking — "What happens when the WebSocket drops? We reconnect with exponential backoff and fetch all missed messages via REST using the last received message ID as a cursor."

  3. Scale awareness — "The conversation list is sorted by updatedAt. When a new message arrives, we move that conversation to the top. With 1000 conversations, this is an O(n) insertion into a sorted array — we could optimize with a priority queue, but the N is small enough that it doesn't matter."

  4. User empathy — "We show a clock icon while sending, a single checkmark for sent, double checkmark for delivered, and blue double checkmark for read. This is not just decoration — it builds trust. Users need to know their message was received."
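The conversation reorder described in point 3 can be sketched as a pure move-to-top on the ordered list of conversation IDs (the name bumpConversation is illustrative):

```typescript
// Move a conversation's ID to the front of the ordering; O(n) in the
// number of conversations, which is fine for the Ns seen in practice.
function bumpConversation(order: string[], id: string): string[] {
  const idx = order.indexOf(id)
  if (idx <= 0) return order // absent, or already at the top: unchanged reference
  return [id, ...order.slice(0, idx), ...order.slice(idx + 1)]
}
```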

Quiz
In a production chat app, what happens when the WebSocket connection drops and reconnects?