A production-grade frontend system design walkthrough — the answer that covers every edge case WhatsApp, Slack, and Discord had to solve.
"Design a messenger app" is among the most frequently asked frontend system design questions at Meta, Google, and Microsoft. It sounds straightforward — send messages, receive messages, show them in a list. But the reality is that building a production-quality chat frontend is one of the most technically demanding challenges in web development.
Why? Because chat is where real-time delivery, offline resilience, message ordering guarantees, optimistic UI, end-to-end encryption, and reverse infinite scroll all collide at once. I’ve seen senior engineers at Google stumble on the message ordering problem alone — when two users send messages at the exact same millisecond, whose message appears first?
Let’s build this properly using the RADIO framework.
📋 Step 1: Requirements Exploration
Clarifying Questions I’d Ask
| Question | Why It Matters | Assumed Answer |
|---|---|---|
| 1:1 chat only, or group chat too? | Group chat adds participant management, mentions, and message fan-out complexity | Both 1:1 and group chat (up to 256 members) |
| What message types? | Media messages need upload pipelines, previews, and progressive loading | Text, images, files, voice notes, link previews |
| Do we need read receipts? | Sent/delivered/read is a 3-state system that needs real-time updates | Yes — sent, delivered, read (with blue ticks) |
| Typing indicators? | Requires debounced WebSocket events with timeout logic | Yes — "User is typing..." with multi-user support in groups |
| Offline support? | Queue messages locally, sync when back online, handle conflicts | Yes — send offline, sync on reconnect |
| Message search? | Full-text search across conversations changes data storage | Yes — search within a conversation and globally |
| End-to-end encryption? | E2EE means the client handles all encryption/decryption and can’t rely on the server for search | Nice-to-have; discuss architecture implications |
| Message reactions/threads? | Reactions need real-time aggregation; threads add nested conversation complexity | Reactions yes, threaded replies yes |
Functional Requirements
Conversation list: Sorted by most recent message, unread badges, online status indicators
Message thread: Reverse-chronological infinite scroll (newest at bottom), date separators, message grouping by sender
Sending: Text with emoji, image/file upload with progress, voice notes with waveform
Real-time: Instant message delivery, typing indicators, read receipts, online/offline presence
Interactions: Reply to message, reactions (emoji), forward, delete for me/everyone
Search: Search conversations, search messages within a conversation
Notifications: Browser push notifications, unread count in tab title, notification sounds
Non-Functional Requirements
Latency: Message send-to-display < 100ms on the sender’s device (optimistic), < 500ms on receiver’s
Reliability: Zero message loss — every message must eventually be delivered and displayed
Ordering: Messages must appear in causal order within a conversation
Offline: Full read access to cached conversations, queued sends that sync on reconnect
Memory: Handle conversations with 100K+ messages without browser crash
Accessibility: Screen reader support for conversation navigation and message reading
🔥 Real-world war story: WhatsApp Web had a critical ordering bug in 2020 where messages sent from a phone with a slightly skewed clock would appear out of order on the web client. The root cause: they were sorting by the sender’s device timestamp instead of the server’s received timestamp. The fix was a hybrid approach — use the server timestamp for ordering but display the sender’s local timestamp. This is the same lesson that Lamport’s logical clocks teach in distributed systems: never trust wall clocks for ordering events; use a clock that the ordering authority controls.
🏗️ Step 2: Architecture / High-Level Design
Component Architecture
The WebSocket Connection Architecture
This is the heart of any messenger app. The WebSocket connection needs to handle message delivery, typing indicators, presence updates, and read receipts — all while dealing with flaky networks, reconnections, and message deduplication.
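The deduplication piece mentioned above can be isolated into a small, testable unit. A minimal sketch (class and method names are illustrative, not from any real codebase): every incoming frame carries a server-assigned message id, and the client drops ids it has already applied, which makes redelivery after a reconnect safe.

```typescript
// Deduplicates incoming WebSocket messages by server-assigned id.
// Keeps a bounded set so memory stays constant on long-lived connections.
class MessageDeduper {
  private seen = new Set<string>();
  private order: string[] = [];

  constructor(private capacity = 10_000) {}

  /** Returns true the first time an id is observed, false on replays. */
  accept(id: string): boolean {
    if (this.seen.has(id)) return false;
    this.seen.add(id);
    this.order.push(id);
    if (this.order.length > this.capacity) {
      // Evict the oldest id; by then a replay of it is effectively impossible.
      this.seen.delete(this.order.shift()!);
    }
    return true;
  }
}
```

The bounded capacity is the key design choice: an unbounded `Set` would leak memory on a connection that stays open for days.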
Cross-Tab Sync
The Detail: Users often have multiple tabs open. If you don't handle this, each tab will open its own WebSocket, wasting server resources and causing "notification storms" (all tabs pinging at once).
The Solution: Use a SharedWorker or the BroadcastChannel API.
SharedWorker: One "Master" worker owns the WebSocket connection. All open tabs communicate with this single worker to send/receive messages.
BroadcastChannel: If one tab receives a message and updates the local IndexedDB, it broadcasts an event: `channel.postMessage({ type: 'NEW_MESSAGE', id: ... })`. Other tabs listen and update their Redux/Zustand stores without refetching from the network.
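A sketch of that pattern, with the store update written as a pure reducer so the cross-tab logic is testable outside the browser (the event shape and names are illustrative, not a real product protocol):

```typescript
// Cross-tab event shape broadcast between tabs (illustrative).
type CrossTabEvent =
  | { type: "NEW_MESSAGE"; conversationId: string; messageId: string }
  | { type: "READ"; conversationId: string };

interface ConversationState {
  unread: number;
  lastMessageId?: string;
}

// Pure reducer: how any tab folds a broadcast event into its local store.
function applyCrossTabEvent(
  state: ConversationState,
  event: CrossTabEvent
): ConversationState {
  switch (event.type) {
    case "NEW_MESSAGE":
      return { unread: state.unread + 1, lastMessageId: event.messageId };
    case "READ":
      return { ...state, unread: 0 };
  }
}

// Browser wiring (only where BroadcastChannel exists):
// const channel = new BroadcastChannel("chat-sync");
// channel.onmessage = (e) => store.dispatch(e.data as CrossTabEvent);
// channel.postMessage({ type: "NEW_MESSAGE", conversationId: "c1", messageId: "m9" });
```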
🔥 Real-world war story: Discord discovered that during large server events (like game launches), thousands of users would disconnect and reconnect simultaneously — a thundering herd problem. Their WebSocket servers would crash under the reconnection flood. The fix was adding jitter to the reconnect delay (random 0-1s added to the exponential backoff). This simple change spread reconnections over time and reduced server peak load by 60%.
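Discord’s fix as described adds a random 0-1s on top of the exponential delay; a common variant of the same idea is "full jitter" (randomize across the whole backoff window), sketched here as a pure function with an injectable random source so it can be tested deterministically:

```typescript
// Exponential backoff with "full jitter" to avoid thundering-herd reconnects.
// The window doubles per attempt up to a cap, then a uniform random fraction
// of it is taken, spreading simultaneous reconnects over time.
function reconnectDelayMs(
  attempt: number, // 0-based reconnect attempt counter
  baseMs = 500,
  capMs = 30_000,
  random: () => number = Math.random
): number {
  const windowMs = Math.min(capMs, baseMs * 2 ** attempt);
  return random() * windowMs; // uniform in [0, windowMs)
}
```

On reconnect success the attempt counter resets to zero; on every failure it increments.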
📊 Step 3: Data Model
Why Message Ordering is Hard
🔥 Real-world war story: Telegram had a subtle bug where messages in group chats would occasionally appear in different orders for different participants. The root cause: their client was using the message_id for ordering, but message_id was assigned by different servers in a distributed system, and the IDs were not globally monotonic. Their fix was introducing a per-chat pts (points) counter that’s incremented atomically for each new message in a conversation.
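On the client, a pts-style counter reduces ordering to a trivial comparator; the sketch below assumes (as in the story above) that the server assigns the per-conversation counter atomically, and keeps the sender’s wall-clock time strictly for display:

```typescript
// Ordering by a per-conversation, server-incremented counter ("pts") instead
// of timestamps or globally non-monotonic message ids.
interface ChatMessage {
  id: string;
  pts: number;      // per-conversation sequence number, assigned by the server
  sentAtMs: number; // sender's local clock: shown in the UI, never used to sort
}

function byConversationOrder(a: ChatMessage, b: ChatMessage): number {
  return a.pts - b.pts; // never compare wall-clock times for ordering
}
```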
🔌 Step 4: Interface Definition (API Design)
REST API for Initial Data
WebSocket Protocol
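A sketch of what the client side of such a protocol can look like (the frame names and fields are illustrative, not any real product’s wire format): a discriminated union lets TypeScript check that every frame type is handled, and a defensive parser ensures a malformed frame never crashes the socket handler.

```typescript
// Illustrative server-to-client frame envelope for the chat WebSocket.
type ServerFrame =
  | { type: "message_new"; conversationId: string; messageId: string; text: string }
  | { type: "typing_start"; conversationId: string; userId: string }
  | { type: "typing_stop"; conversationId: string; userId: string }
  | { type: "receipt"; conversationId: string; messageIds: string[]; state: "delivered" | "read" };

// Parse defensively: drop malformed frames instead of throwing in onmessage.
function parseServerFrame(raw: string): ServerFrame | null {
  try {
    const frame = JSON.parse(raw);
    return typeof frame?.type === "string" ? (frame as ServerFrame) : null;
  } catch {
    return null;
  }
}
```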
The Optimistic Send Flow
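The core of the flow, sketched as pure state transitions (all names are illustrative): the client appends the message immediately with a client-generated temporary id and "sending" status, then reconciles when the server ack arrives with the real id, or marks it failed for retry.

```typescript
type SendStatus = "sending" | "sent" | "failed";

interface OutgoingMessage {
  clientId: string;  // generated locally, e.g. via crypto.randomUUID()
  serverId?: string; // filled in by the server ack
  text: string;
  status: SendStatus;
}

// Step 1: append instantly so the sender sees the message in < 100ms.
function optimisticAppend(list: OutgoingMessage[], clientId: string, text: string): OutgoingMessage[] {
  return [...list, { clientId, text, status: "sending" }];
}

// Step 2a: the ack carries the clientId back, so we can swap in the server id.
function applyAck(list: OutgoingMessage[], clientId: string, serverId: string): OutgoingMessage[] {
  return list.map((m) => (m.clientId === clientId ? { ...m, serverId, status: "sent" } : m));
}

// Step 2b: on timeout or error, flag the message for a manual/automatic retry.
function applyFailure(list: OutgoingMessage[], clientId: string): OutgoingMessage[] {
  return list.map((m) => (m.clientId === clientId ? { ...m, status: "failed" } : m));
}
```

Keying the reconciliation on `clientId` (not array position) is what keeps the flow correct when several sends are in flight at once.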
Binary over WebSockets
The Detail: Instead of sending plain-text JSON, which is verbose and computationally expensive to parse, use a binary format like Protocol Buffers (Protobuf).
The Benefit: On mobile devices, parsing large JSON strings causes CPU spikes that drain battery and block the main thread. Binary payloads are significantly smaller and faster to deserialize.
Implementation: Define a `.proto` file that is shared between the frontend and backend. The frontend uses a library like `protobufjs` to encode/decode messages.
Staff Tip: Mention that while binary is faster, it makes debugging harder because you can't read the network traffic in the "Network" tab without a decoder. Suggest a "Development Mode" toggle that falls back to JSON for easier debugging.
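To make the idea concrete without pulling in `protobufjs`, here is a hand-rolled binary frame using `DataView` (the layout is invented for illustration; a real system would generate this from the shared `.proto` schema). It shows why binary wins: fixed-width header fields need no string parsing at all.

```typescript
// Frame layout (illustrative):
// [1 byte opcode][4 bytes conversation number][2 bytes text length][utf-8 text]
function encodeTextFrame(opcode: number, conversation: number, text: string): ArrayBuffer {
  const body = new TextEncoder().encode(text);
  const buf = new ArrayBuffer(7 + body.byteLength);
  const view = new DataView(buf);
  view.setUint8(0, opcode);
  view.setUint32(1, conversation);
  view.setUint16(5, body.byteLength);
  new Uint8Array(buf, 7).set(body);
  return buf;
}

function decodeTextFrame(buf: ArrayBuffer): { opcode: number; conversation: number; text: string } {
  const view = new DataView(buf);
  const len = view.getUint16(5);
  return {
    opcode: view.getUint8(0),
    conversation: view.getUint32(1),
    text: new TextDecoder().decode(new Uint8Array(buf, 7, len)),
  };
}
```

On the wire, the socket would be configured with `socket.binaryType = "arraybuffer"` so received frames arrive as `ArrayBuffer` rather than `Blob`.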
🔥 Real-world war story: Slack’s message delivery had a race condition: if a user sent two messages rapidly (within 50ms), the optimistic UI would show them in order, but the server would occasionally acknowledge them in reverse order, causing the messages to "swap" positions after 1-2 seconds. The fix was assigning client-side sequence numbers and having the server respect client ordering within a single sender’s message burst.
⚡ Step 5: Optimizations
1. Typing Indicators with Debounce + Timeout
2. Message Virtualization (Reverse Infinite Scroll)
The key techniques: render the thread in a `flex-col-reverse` container so the newest message sits at the bottom and scrolling feels natural, virtualize the rows so only the visible slice mounts, show a spinner at the top while older messages load, and surface a scroll-to-bottom button (with `scrollToBottom("smooth")`) whenever the user has scrolled away from the bottom.
3. Offline Support with IndexedDB
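The outbox structure matters as much as the storage: one FIFO queue per conversation, so draining one conversation’s backlog can never interleave with another’s. A minimal in-memory sketch (names illustrative; in the browser each queue would be mirrored to an IndexedDB object store so it survives a tab reload):

```typescript
interface QueuedSend {
  clientId: string;
  text: string;
}

// Offline outbox: independent FIFO queues keyed by conversation.
class Outbox {
  private queues = new Map<string, QueuedSend[]>();

  enqueue(conversationId: string, item: QueuedSend): void {
    const q = this.queues.get(conversationId) ?? [];
    q.push(item);
    this.queues.set(conversationId, q);
  }

  /** Drain one conversation independently, preserving send order. */
  drain(conversationId: string, send: (item: QueuedSend) => void): void {
    for (const item of this.queues.get(conversationId) ?? []) send(item);
    this.queues.delete(conversationId);
  }
}
```

On reconnect, each conversation’s queue drains on its own, so a failure in one conversation never blocks the others.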
E2EE & Client-Side Search
The Detail: If the app requires End-to-End Encryption (E2EE), the server only sees encrypted "blobs." This breaks traditional server-side search.
The Architecture Shift: You must move the search engine to the client. As messages are decrypted and stored in IndexedDB, you should simultaneously index them.
Implementation: Use a library like `lunr.js` or `FlexSearch` to build an inverted index. When a user searches, the query runs against the local IndexedDB index rather than making an API call.
The Conflict: This creates a memory vs. functionality trade-off. You can only search messages that have been synced to the local device.
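To show what those libraries do under the hood, here is a toy inverted index (a sketch for illustration only; FlexSearch adds stemming, ranking, and persistence on top of the same idea):

```typescript
// Minimal inverted index: token -> set of message ids containing it.
class InvertedIndex {
  private postings = new Map<string, Set<string>>();

  // Called as each decrypted message is written to IndexedDB.
  add(messageId: string, text: string): void {
    for (const token of text.toLowerCase().split(/\W+/).filter(Boolean)) {
      if (!this.postings.has(token)) this.postings.set(token, new Set());
      this.postings.get(token)!.add(messageId);
    }
  }

  /** Returns ids of messages containing every query token (AND semantics). */
  search(query: string): string[] {
    const tokens = query.toLowerCase().split(/\W+/).filter(Boolean);
    if (tokens.length === 0) return [];
    let result: Set<string> | undefined;
    for (const token of tokens) {
      const ids = this.postings.get(token) ?? new Set<string>();
      result = result ? new Set([...result].filter((id) => ids.has(id))) : new Set(ids);
    }
    return [...(result ?? [])];
  }
}
```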
🔥 Real-world war story: WhatsApp Web’s offline mode had a critical sync bug: if a user sent messages to two different conversations while offline, and both conversations had a pending "typing_start" event, the sync would interleave messages from both conversations in the outbox, causing messages to appear in wrong chats. The fix was per-conversation outbox queues that sync independently, not a single global outbox.
4. Read Receipt Batching
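The idea: instead of one WebSocket frame per message read, collect the ids of messages that have scrolled into view and send them as one batch per flush window. A sketch (names illustrative): in the component, ids arrive from an IntersectionObserver callback and `flush()` runs on a roughly one-second timer.

```typescript
// Batches "message seen" ids into a single read-receipt frame per flush.
class ReadReceiptBatcher {
  private pending = new Set<string>();

  constructor(private sendBatch: (messageIds: string[]) => void) {}

  /** Called when a message element intersects the viewport. */
  markSeen(messageId: string): void {
    this.pending.add(messageId); // Set dedupes repeated intersections
  }

  /** Called on a timer; sends at most one frame per window. */
  flush(): void {
    if (this.pending.size === 0) return; // nothing seen: send no frame at all
    this.sendBatch([...this.pending]);
    this.pending.clear();
  }
}
```

For a fast scroll through 50 unread messages, this sends one frame with 50 ids instead of 50 frames.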
5. Link Preview Generation
6. Notification System
📊 Performance Budget
| Metric | Target | How We Achieve It |
|---|---|---|
| Message send latency | < 100ms perceived | Optimistic UI — message appears instantly, server confirms async |
| Message receive latency | < 500ms | WebSocket push, no polling |
| Conversation switch time | < 200ms | IndexedDB cache for instant display, network fetch for fresh data |
| Memory with 50 open conversations | < 200MB | Virtualized message lists, evict old messages from memory |
| Reconnection time | < 3s | Exponential backoff with jitter, immediate sync on reconnect |
| Offline message queue | Unlimited | IndexedDB outbox with per-conversation ordering |
| Typing indicator latency | < 200ms | Immediate WebSocket send on first keystroke, debounced stop |
🧠 Summary: What Makes This a 5/5 Answer
| Rubric | What We Covered |
|---|---|
| Requirements | Scoped 1:1 + group chat with real-time delivery, typing indicators, read receipts, offline support, and specific latency targets |
| Architecture | Full component tree with WebSocket connection manager (heartbeat, reconnect with jitter, message buffer, deduplication) |
| Data Model | Normalized store with message ordering (Lamport-style sequence numbers), presence tracking, typing state, offline queue |
| API Design | REST for initial load + WebSocket protocol for real-time, complete optimistic send flow with 8 steps, read receipt batching |
| Optimizations | Typing indicators (debounce + timeout), reverse infinite scroll, IndexedDB offline storage with per-conversation outbox, read receipt batching with IntersectionObserver, link preview generation, notification system (sound + browser + tab title + favicon badge) |
| Real-world depth | 5 production war stories from WhatsApp (clock skew ordering), Discord (thundering herd), Telegram (distributed message IDs), Slack (message swap race condition), WhatsApp (interleaved outbox sync) |
The key differentiator: most candidates design a basic "send/receive messages" system. A 5/5 answer tackles the three killer problems: message ordering in distributed systems, offline-first with conflict-free sync, and the WebSocket connection lifecycle (heartbeat, reconnection, message deduplication). These are the exact problems that Slack, Discord, and WhatsApp frontend teams have dedicated engineers working on full-time.
Next up in this series: Design an API Progress Bar — where we will explore how to build a YouTube/GitHub-style top-of-page loading bar that communicates request progress, handles parallel requests, and creates the illusion of speed even when the server is slow.