System Design: Chat Application
Why Chat Is the Ultimate Frontend Challenge
Chat applications look simple — a text box, some bubbles, a list of conversations. But under the hood, they're one of the most complex frontend systems you'll ever build. Real-time delivery, optimistic updates, bidirectional scrolling, offline resilience, encryption awareness, presence tracking — all while keeping the UI buttery smooth on a 5-year-old phone.
Slack has hundreds of engineers working on their frontend. Discord rebuilt theirs from scratch to handle scale. WhatsApp Web manages billions of messages daily with a surprisingly lean architecture. The patterns behind these products are exactly what interviewers at FAANG companies want you to articulate.
We'll use the RADIO framework — Requirements, Architecture, Data Model, Interface, Optimizations — to systematically design a production-grade chat application from scratch.
Think of a chat app as a post office that also delivers in real-time. You have mailboxes (conversations), letters (messages), delivery trucks (WebSockets), a sorting facility (the server), and tracking numbers (message statuses). The post office needs to handle everything from same-day delivery to storing undelivered mail when someone's not home (offline). Every part of the system exists because of a real user need — not because it's technically interesting.
R — Requirements
Before writing a single line of code, you need to know exactly what you're building. In a system design interview, this is where you demonstrate product thinking — not just engineering chops.
Functional Requirements
Core messaging:
- 1:1 direct messages between two users
- Group conversations with multiple participants
- Real-time message delivery (appear instantly, no polling)
- Message statuses: sent, delivered, read (with timestamps)
- Typing indicators (show when someone is composing)
Rich interactions:
- Media sharing: images, files, links with previews
- Emoji reactions on messages
- Threaded replies (reply to a specific message without cluttering the main feed)
- Message editing and deletion
- Search across conversations and message content
User presence:
- Online, away, offline status
- Last seen timestamps
Non-Functional Requirements
These are what separate a toy project from a production system:
| Requirement | Target | Why |
|---|---|---|
| Delivery latency | Under 100ms | Messages must feel instant — anything above 200ms feels sluggish |
| Offline support | Queue messages, sync on reconnect | Users lose connection constantly on mobile |
| Accessibility | WCAG 2.1 AA | Screen reader support, keyboard navigation, focus management |
| Encryption awareness | E2E encryption capability | Privacy is non-negotiable for modern chat |
| Performance | 60fps scrolling with 10K+ messages | Long conversation history is the norm |
| Reliability | Zero message loss | Users will never forgive a dropped message |
A — Architecture
Now that we know what we're building, let's design the component tree and layout structure.
Layout Structure
The classic chat layout is a three-panel design: sidebar, main conversation pane, and an optional detail/thread panel.
┌─────────────────────────────────────────────────────────────┐
│ Toolbar / App Header │
├──────────┬──────────────────────────┬───────────────────────┤
│ │ │ │
│ Chat │ Message Thread │ Thread Panel │
│ Sidebar │ (active conversation) │ (optional, slides │
│ │ │ in from right) │
│ ┌────┐ │ ┌──────────────────┐ │ │
│ │Conv│ │ │ MessageBubble │ │ Shows replies to │
│ │List│ │ │ MessageBubble │ │ a specific message │
│ │ │ │ │ MessageBubble │ │ │
│ │ │ │ │ TypingIndicator │ │ │
│ └────┘ │ └──────────────────┘ │ │
│ │ ┌──────────────────┐ │ │
│ │ │ MessageInput │ │ │
│ │ └──────────────────┘ │ │
└──────────┴──────────────────────────┴───────────────────────┘
On mobile, this collapses into a single-pane navigation — sidebar and conversation are separate views with slide transitions.
Component Tree
Here's the component hierarchy, organized by responsibility:
ChatApp
├── ChatSidebar
│ ├── SearchBar
│ ├── ConversationList
│ │ └── ConversationItem (repeated)
│ │ ├── Avatar
│ │ ├── ConversationPreview (last message, timestamp)
│ │ └── UnreadBadge
│ └── UserPresenceStatus
├── MessageThread
│ ├── ConversationHeader
│ │ ├── ParticipantAvatars
│ │ └── ConversationActions (pin, mute, search)
│ ├── MessageList (virtualized)
│ │ └── MessageBubble (repeated)
│ │ ├── MessageContent (text, media, links)
│ │ ├── MessageReactions
│ │ ├── MessageStatus (sent/delivered/read)
│ │ └── MessageActions (reply, react, edit, delete)
│ ├── TypingIndicator
│ └── MessageInput
│ ├── TextComposer (contenteditable or textarea)
│ ├── MediaUploadButton
│ ├── EmojiPicker
│ └── SendButton
└── ThreadPanel (optional)
├── ThreadHeader
├── ParentMessage
├── ReplyList (virtualized)
└── ThreadInput
Server vs Client Components
In a Next.js architecture, think carefully about what needs interactivity:
Server Components (no client JS):
- ChatApp layout shell
- ConversationHeader (static info)
- Initial conversation list (SSR for first paint)
Client Components (need 'use client'):
- MessageList — real-time updates, scroll handling
- MessageInput — user interaction, typing events
- TypingIndicator — WebSocket-driven
- ConversationList — live unread counts, reordering
- EmojiPicker — interactive overlay
- MediaUploadButton — file handling
Server-render the conversation list and the most recent messages for the active conversation. Then hydrate and hand off to WebSocket for live updates. This gives you fast first paint AND real-time — the best of both worlds.
D — Data Model
The data model is the backbone. Get this wrong and every feature built on top becomes painful.
Core Entities
type MessageStatus = 'sending' | 'sent' | 'delivered' | 'read' | 'failed'
type ConversationType = 'direct' | 'group'
interface User {
id: string
name: string
avatarUrl: string
presence: 'online' | 'away' | 'offline'
lastSeen: number
}
interface Conversation {
id: string
type: ConversationType
name: string | null
participants: Participant[]
lastMessage: Message | null
unreadCount: number
isPinned: boolean
isMuted: boolean
updatedAt: number
}
interface Participant {
userId: string
role: 'owner' | 'admin' | 'member'
joinedAt: number
lastReadMessageId: string | null
}
interface Message {
id: string
conversationId: string
senderId: string
content: string
type: 'text' | 'image' | 'file' | 'system'
status: MessageStatus
parentMessageId: string | null
reactions: Reaction[]
editedAt: number | null
createdAt: number
}
interface Reaction {
emoji: string
userIds: string[]
}
Why Normalized State
Here's a trap most junior engineers fall into: they nest messages inside conversations and users inside messages. That creates a nightmare.
// Bad: deeply nested, duplicated user data everywhere
interface BadState {
conversations: Array<{
id: string
messages: Array<{
id: string
sender: User // duplicated across every message
// ...
}>
}>
}
// Good: normalized, flat, no duplication
interface ChatState {
users: Record<string, User>
conversations: Record<string, Conversation>
messages: Record<string, Message>
messagesByConversation: Record<string, string[]>
activeConversationId: string | null
}
Normalized state means:
- No data duplication — a user's avatar URL is stored once, not in every message
- O(1) lookups — finding a message by ID is instant
- Easy updates — changing a user's presence updates one place, reflected everywhere
- Consistent — no stale copies of the same entity
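To make the O(1)-update point concrete, here's a minimal sketch (reusing a trimmed User shape from above; setPresence is an illustrative name, not part of any library):

```typescript
interface User {
  id: string
  name: string
  presence: 'online' | 'away' | 'offline'
}

interface ChatState {
  users: Record<string, User>
}

// One write to users[userId]; every component reading that entry
// (message bubbles, sidebar, header) reflects the change automatically.
function setPresence(state: ChatState, userId: string, presence: User['presence']): ChatState {
  return {
    ...state,
    users: { ...state.users, [userId]: { ...state.users[userId], presence } },
  }
}
```

Contrast this with nested state, where the same change would require walking every conversation and every message that embeds the user.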
Cursor-Based Pagination
Chat messages are loaded newest-first, which means offset-based pagination breaks when new messages arrive (every offset shifts). Cursor-based pagination solves this.
interface MessagePage {
messages: Message[]
cursor: string | null
hasMore: boolean
}
// Load older messages: "give me 50 messages before this cursor"
// The cursor is the ID (or timestamp) of the oldest loaded message
The cursor is stable — it points to a specific message, not a position. New messages arriving don't shift anything.
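The stability argument can be demonstrated with pure array logic (pageBefore is an illustrative helper; a real backend would run the equivalent database query):

```typescript
interface Msg {
  id: string
  createdAt: number
}

interface MessagePage {
  messages: Msg[]
  cursor: string | null
  hasMore: boolean
}

// `all` is sorted newest-first. The cursor names a concrete message,
// so prepending new arrivals never shifts the window of older messages.
function pageBefore(all: Msg[], cursor: string | null, limit: number): MessagePage {
  const start = cursor ? all.findIndex(m => m.id === cursor) + 1 : 0
  const messages = all.slice(start, start + limit)
  return {
    messages,
    cursor: messages.length > 0 ? messages[messages.length - 1].id : null,
    hasMore: start + messages.length < all.length,
  }
}
```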
Never use offset-based pagination for chat messages. When new messages arrive between page loads, offsets shift and you'll either skip messages or show duplicates. Cursor-based pagination is the only correct approach for real-time data.
I — Interface
The interface layer defines how components communicate with each other and with the server. This is where the real-time magic lives.
WebSocket Events
WebSocket is the right transport for chat. HTTP polling wastes bandwidth, SSE is unidirectional, and WebTransport isn't widely supported yet.
Here's the event contract between client and server:
Server-to-client events (incoming):
// New message in any conversation the user belongs to
{ event: 'message:new', data: Message }
// Message status changed (delivered, read)
{ event: 'message:status', data: { messageId: string, status: MessageStatus, userId: string } }
// Someone started or stopped typing
{ event: 'typing:start', data: { conversationId: string, userId: string } }
{ event: 'typing:stop', data: { conversationId: string, userId: string } }
// User presence changed
{ event: 'presence:update', data: { userId: string, presence: 'online' | 'away' | 'offline' } }
// Reaction added or removed
{ event: 'reaction:update', data: { messageId: string, emoji: string, userId: string, action: 'add' | 'remove' } }
Client-to-server events (outgoing):
// Send a new message
{ event: 'message:send', data: { conversationId: string, content: string, type: MessageType, tempId: string } }
// Mark messages as read
{ event: 'message:read', data: { conversationId: string, lastReadMessageId: string } }
// Typing indicator
{ event: 'typing:start', data: { conversationId: string } }
{ event: 'typing:stop', data: { conversationId: string } }
Notice the tempId in message:send — that's critical for optimistic updates. The client generates a temporary ID, renders the message immediately, then replaces it with the server-assigned ID on confirmation.
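Reconciliation on ACK might look like this in a normalized store (the order array and function name are assumptions carried over from the state shape above, not a fixed API):

```typescript
interface Msg {
  id: string
  status: string
  content: string
}

interface ThreadState {
  messages: Record<string, Msg>
  order: string[] // message IDs, oldest to newest
}

// Swap the optimistic tempId for the server-assigned ID in both the
// entity map and the ordering array, and mark the message as sent.
function confirmMessage(state: ThreadState, tempId: string, serverId: string): ThreadState {
  const { [tempId]: optimistic, ...rest } = state.messages
  if (!optimistic) return state // ACK for a message we no longer track
  return {
    messages: { ...rest, [serverId]: { ...optimistic, id: serverId, status: 'sent' } },
    order: state.order.map(id => (id === tempId ? serverId : id)),
  }
}
```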
REST API for Historical Data
WebSocket is for real-time. REST is for loading history and performing CRUD operations:
GET /api/conversations → ConversationList
GET /api/conversations/:id → Conversation details
GET /api/conversations/:id/messages?cursor=X → MessagePage (cursor-based)
POST /api/conversations → Create conversation
POST /api/conversations/:id/messages → Send message (fallback if WS down)
PUT /api/messages/:id → Edit message
DELETE /api/messages/:id → Delete message
GET /api/search?q=query → Search messages
POST /api/media/upload → Upload file/image
Component Interface (Props API)
Your components need clean, well-defined interfaces:
interface MessageListProps {
conversationId: string
messages: Message[]
onLoadMore: () => void
hasMore: boolean
isLoading: boolean
}
interface MessageBubbleProps {
message: Message
sender: User
isOwn: boolean
onReact: (emoji: string) => void
onReply: () => void
onEdit: () => void
onDelete: () => void
}
interface MessageInputProps {
conversationId: string
replyTo: Message | null
onSend: (content: string, type: MessageType) => void
onTyping: (isTyping: boolean) => void
onCancelReply: () => void
}
O — Optimizations
This is where good engineers separate from great ones. The basic chat works. Now make it feel incredible.
1. Optimistic Message Sending
Users should never wait for a server response to see their own message. The moment they hit send:
1. Generate a tempId (e.g., crypto.randomUUID())
2. Insert the message into state with status 'sending'
3. Render it immediately with a subtle sending indicator
4. Fire the WebSocket event
5. On server ACK: replace tempId with the real ID, update status to 'sent'
6. On failure: update status to 'failed', show a retry button
function sendMessage(content: string) {
const tempId = crypto.randomUUID()
const optimisticMessage: Message = {
id: tempId,
conversationId: activeConversation.id,
senderId: currentUser.id,
content,
type: 'text',
status: 'sending',
parentMessageId: null,
reactions: [],
editedAt: null,
createdAt: Date.now(),
}
dispatch({ type: 'MESSAGE_ADD', payload: optimisticMessage })
socket.emit('message:send', {
conversationId: activeConversation.id,
content,
type: 'text',
tempId,
})
}
2. Virtualized Message List with Reverse Scroll
A conversation with 10K messages cannot render 10K DOM nodes. Virtualization renders only the visible messages (plus a small buffer), typically 20-30 elements at a time.
But chat virtualization has a unique twist: reverse scroll. New messages appear at the bottom, and users scroll up to load history. This requires:
- Reverse scroll direction — new items are appended at the bottom, scroll position anchored
- Scroll anchoring — when older messages load at the top, the scroll position should not jump. The message the user was looking at must stay in place
- Dynamic row heights — messages have variable height (text length, media, reactions). You cannot use fixed-height virtualization
- Sticky date separators — "Today", "Yesterday", date headers that stick as you scroll
// Scroll anchoring: save position before prepending, restore after
function prependMessages(newMessages: Message[]) {
const scrollContainer = listRef.current
if (!scrollContainer) return // ref not attached yet
const previousScrollHeight = scrollContainer.scrollHeight
const previousScrollTop = scrollContainer.scrollTop
dispatch({ type: 'MESSAGES_PREPEND', payload: newMessages })
requestAnimationFrame(() => {
const newScrollHeight = scrollContainer.scrollHeight
scrollContainer.scrollTop =
previousScrollTop + (newScrollHeight - previousScrollHeight)
})
}
3. Typing Indicator Debounce
Without debounce, every keystroke sends a WebSocket event. With debounce, you send typing:start on the first keystroke, then only typing:stop after the user pauses for ~2 seconds.
import { useCallback, useEffect, useRef } from 'react'

// Note: `socket` is a Socket.IO-style emitter here — the raw WebSocket API has no emit()
function useTypingIndicator(conversationId: string, socket: { emit(event: string, data: unknown): void }) {
const timeoutRef = useRef<ReturnType<typeof setTimeout> | undefined>(undefined)
const isTypingRef = useRef(false)
const handleInput = useCallback(() => {
if (!isTypingRef.current) {
isTypingRef.current = true
socket.emit('typing:start', { conversationId })
}
clearTimeout(timeoutRef.current)
timeoutRef.current = setTimeout(() => {
isTypingRef.current = false
socket.emit('typing:stop', { conversationId })
}, 2000)
}, [conversationId, socket])
useEffect(() => {
return () => {
clearTimeout(timeoutRef.current)
if (isTypingRef.current) {
socket.emit('typing:stop', { conversationId })
}
}
}, [conversationId, socket])
return handleInput
}
4. Reconnection with Gap Filling
WebSocket connections drop. Wi-Fi switches, laptop lids close, tunnels happen. When the connection restores, you need to fill the gap:
- Track the timestamp (or ID) of the last received message
- On reconnect, request all messages since that timestamp via REST
- Merge them into local state, resolving any conflicts with optimistic messages
- Re-subscribe to all active conversation channels
socket.addEventListener('open', () => {
if (lastReceivedTimestamp) {
fetch(`/api/sync?since=${lastReceivedTimestamp}`)
.then(res => res.json())
.then(({ messages, statusUpdates }) => {
dispatch({ type: 'MESSAGES_MERGE', payload: messages })
dispatch({ type: 'STATUSES_MERGE', payload: statusUpdates })
})
}
})
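The reconnect attempts themselves should back off exponentially so a flapping network doesn't hammer the server (constants here are illustrative, not prescriptive):

```typescript
// Full-jitter exponential backoff: the ceiling grows 1s, 2s, 4s, ...
// capped at 30s, and the actual delay is randomized so thousands of
// clients don't reconnect in lockstep after an outage.
function reconnectDelay(attempt: number, baseMs = 1000, capMs = 30000): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt)
  return Math.random() * ceiling
}

// Usage sketch: setTimeout(connect, reconnectDelay(attempt++))
```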
5. Unread Count Badges
Unread counts need to be:
- Updated in real-time (new messages increment, reading a conversation resets)
- Persisted across sessions (the server is the source of truth, not local state)
- Displayed on the conversation list AND the browser tab title
// Update document title with total unread count
useEffect(() => {
const totalUnread = Object.values(conversations)
.reduce((sum, conv) => sum + conv.unreadCount, 0)
document.title = totalUnread > 0
? `(${totalUnread}) Chat App`
: 'Chat App'
}, [conversations])
6. Push Notifications via Service Worker
When the tab is in the background or closed, push notifications keep users informed:
// Register service worker for push notifications
if ('serviceWorker' in navigator && 'PushManager' in window) {
const registration = await navigator.serviceWorker.register('/sw.js')
const subscription = await registration.pushManager.subscribe({
userVisibleOnly: true,
applicationServerKey: vapidPublicKey,
})
await fetch('/api/push/subscribe', {
method: 'POST',
body: JSON.stringify(subscription),
})
}
The service worker intercepts push events and shows native notifications even when the app is closed.
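Inside sw.js, that handler parses the push payload and shows a notification. Keeping the payload-to-notification mapping pure makes it testable (the payload shape here is an assumption for this sketch, not part of the Push API):

```typescript
interface PushPayload {
  sender: string
  preview: string
  conversationId: string
}

// tag = conversationId, so repeated pushes for the same chat replace the
// previous notification instead of stacking up.
function buildNotification(p: PushPayload): { title: string; body: string; tag: string } {
  return { title: p.sender, body: p.preview, tag: p.conversationId }
}

// In the service worker (sketch):
// self.addEventListener('push', (event: any) => {
//   const { title, ...options } = buildNotification(event.data.json())
//   event.waitUntil(self.registration.showNotification(title, options))
// })
```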
7. Lazy Loading Media
Images and files in messages should not block the initial render. Load them lazily:
- Use loading="lazy" on image elements
- Show a blurred placeholder (blurhash) or aspect-ratio skeleton while loading
- Preload media that's about to scroll into view (intersection observer with a margin)
- Cache loaded media URLs in a Map so re-rendering doesn't re-fetch
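The Map-based cache from that last point is only a few lines (names are illustrative; fetchBlob stands in for whatever actually downloads the media and produces an object URL):

```typescript
// Resolved media URLs keyed by message ID. Caching the promise rather
// than the value means two renders of the same message that race each
// other still share a single download.
const mediaCache = new Map<string, Promise<string>>()

function getMediaUrl(
  messageId: string,
  remoteUrl: string,
  fetchBlob: (url: string) => Promise<string>,
): Promise<string> {
  let cached = mediaCache.get(messageId)
  if (!cached) {
    cached = fetchBlob(remoteUrl)
    mediaCache.set(messageId, cached)
  }
  return cached
}
```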
End-to-end encryption awareness
In a real production chat app, E2E encryption means the server never sees plaintext message content. The frontend handles:
- Key exchange via the Signal Protocol (or similar) — each conversation has a shared secret derived from participants' public keys
- Encryption on send — the message is encrypted before it leaves the client
- Decryption on receive — the message is decrypted after arriving at the recipient's client
- Key storage — private keys live in IndexedDB or the Web Crypto API's non-extractable key store
- Device verification — users can verify each other's identity via QR codes or security numbers
The data model changes slightly: the content field on Message stores ciphertext, and there's an additional encrypted: boolean flag. Decryption happens in a dedicated worker thread to avoid blocking the main thread.
For a system design interview, you don't need to implement E2E encryption — but you must demonstrate awareness of where it fits in the architecture and how it affects the data flow.
Accessibility Deep Dive
Chat applications are notoriously bad at accessibility. Let's not be one of them.
Keyboard Navigation
- Tab moves between sidebar, message list, and input
- Arrow Up/Down navigates between messages in the list
- Enter on a message opens the action menu (reply, react, edit)
- Escape closes any open overlay (emoji picker, thread panel)
- Ctrl+N / Cmd+N starts a new conversation
Screen Reader Announcements
New messages must be announced via an ARIA live region:
<div aria-live="polite" aria-atomic="false" class="sr-only">
<!-- Inject new message announcements here -->
<!-- "Alice: Hey, are you free for a call?" -->
</div>
Use aria-live="polite" so announcements don't interrupt the user. Each message should be announced with the sender's name and content.
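A small helper keeps the announcement text consistent (the truncation length is arbitrary; the DOM wiring is sketched in the trailing comment):

```typescript
// "Alice: Hey, are you free for a call?" — truncate long messages so the
// screen reader isn't forced through a wall of text.
function formatAnnouncement(sender: string, content: string, maxLen = 120): string {
  const text = `${sender}: ${content}`
  return text.length > maxLen ? `${text.slice(0, maxLen - 1)}…` : text
}

// On message:new (sketch):
// const region = document.getElementById('sr-announcer')!
// const node = document.createElement('div')
// node.textContent = formatAnnouncement(msg.senderName, msg.content)
// region.appendChild(node)
```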
Focus Management
- When switching conversations, focus moves to the message input
- When opening the thread panel, focus moves to the thread input
- When closing a modal/overlay, focus returns to the trigger element
- The message list is a single tab stop with arrow key navigation inside (roving tabindex pattern)
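The roving part of that pattern reduces to index math; the component then renders tabIndex={i === active ? 0 : -1} and focuses the active item (a sketch, with an assumed function name):

```typescript
// Clamp arrow-key navigation to the list bounds. Exactly one message
// stays tabbable (tabIndex 0); all others get tabIndex -1.
function nextActiveIndex(active: number, key: 'ArrowUp' | 'ArrowDown', count: number): number {
  if (count === 0) return -1
  const delta = key === 'ArrowUp' ? -1 : 1
  return Math.max(0, Math.min(count - 1, active + delta))
}
```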
Putting It All Together
Let's trace the full lifecycle of a message from the sender's perspective:
1. The sender hits send; the client generates a tempId and renders the message immediately with status 'sending'
2. The client emits message:send over the WebSocket, carrying the tempId
3. The server persists the message and ACKs; the client swaps the tempId for the server-assigned ID and moves the status to 'sent'
4. The server fans the message out to the other participants; their delivery receipts arrive as message:status events, moving the status to 'delivered'
5. When a recipient views the conversation, their client emits message:read; the sender's status indicator updates to 'read'
6. If any step fails, the status becomes 'failed' and a retry button appears, so no message is silently lost
Common Mistakes and Key Patterns
| What developers do | What they should do |
|---|---|
| Using HTTP polling to check for new messages every second Polling wastes bandwidth with empty responses and adds at least 1 second latency. WebSocket delivers messages instantly with minimal overhead. A single WebSocket connection replaces hundreds of HTTP requests per minute. | Use WebSocket for bidirectional real-time communication |
| Rendering all messages in the DOM without virtualization A conversation with 5000 messages creates 5000+ DOM nodes, destroying scroll performance and eating memory. Virtualization keeps the DOM to 20-30 nodes regardless of conversation length. | Virtualize the message list to render only visible items |
| Using offset-based pagination for message history When new messages arrive between page loads, offsets shift — you will skip messages or show duplicates. A cursor points to a specific message, so it is stable regardless of new arrivals. | Use cursor-based pagination with the last message ID as cursor |
| Sending a WebSocket event on every keystroke for typing indicators Without debounce, a user typing a 50-character message fires 50 WebSocket events. With debounce, it fires exactly 2 events (start and stop), reducing server load by 96%. | Debounce typing indicators: send typing:start on first keystroke, typing:stop after 2 seconds of inactivity |
| Waiting for server confirmation before showing the sent message Waiting for a server round-trip adds 50-200ms of perceived latency. Optimistic rendering makes the app feel instant. The status indicator (clock, checkmark, blue checkmark) communicates delivery state without blocking the experience. | Render the message optimistically with a temporary ID, then reconcile on server ACK |
1. WebSocket for real-time, REST for history — never poll for new messages
2. Normalize state: flat entity maps with ID lookups, never deeply nested objects
3. Cursor-based pagination is the only correct approach for real-time message feeds
4. Optimistic updates with tempId correlation — render first, confirm later
5. Virtualize the message list — the DOM should never hold more than 30-50 message nodes
6. Debounce typing indicators: typing:start on first keystroke, typing:stop after inactivity timeout
7. Reconnection must fill the gap: fetch all missed messages since the last received timestamp
8. Accessibility is non-negotiable: ARIA live regions for new messages, roving tabindex for message navigation, focus management on view transitions
Interview Tips
When presenting this design in an interview, structure your answer using RADIO — it shows systematic thinking. A few things interviewers specifically look for:
- Trade-off awareness — "We could use SSE instead of WebSocket, but SSE is unidirectional, which means we'd need a separate channel for sending. WebSocket gives us bidirectional communication over a single connection."
- Failure mode thinking — "What happens when the WebSocket drops? We reconnect with exponential backoff and fetch all missed messages via REST using the last received message ID as a cursor."
- Scale awareness — "The conversation list is sorted by updatedAt. When a new message arrives, we move that conversation to the top. With 1000 conversations, this is an O(n) insertion into a sorted array — we could optimize with a priority queue, but the N is small enough that it doesn't matter."
- User empathy — "We show a clock icon while sending, a single checkmark for sent, double checkmark for delivered, and blue double checkmark for read. This is not just decoration — it builds trust. Users need to know their message was received."
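That O(n) move-to-top from the scale-awareness point is one line over a recency-ordered array of conversation IDs (a sketch; the array would live alongside the normalized state):

```typescript
// New message in conversationId: that conversation jumps to the top.
// O(n) filtering is fine at ~1000 conversations, as noted above.
function bumpConversation(order: string[], conversationId: string): string[] {
  return [conversationId, ...order.filter(id => id !== conversationId)]
}
```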