A production-grade frontend system design walkthrough — the answer that covers every edge case WhatsApp, Slack, and Discord had to solve.
"Design a messenger app" is among the most frequently asked frontend system design questions at Meta, Google, and Microsoft. It sounds straightforward — send messages, receive messages, show them in a list. But the reality is that building a production-quality chat frontend is one of the most technically demanding challenges in web development.
Why? Because chat is where real-time delivery, offline resilience, message ordering guarantees, optimistic UI, end-to-end encryption, and reverse infinite scroll all collide at once. I’ve seen senior engineers at Google stumble on the message ordering problem alone — when two users send messages at the exact same millisecond, whose message appears first?
Let’s build this properly using the RADIO framework.
📋 Step 1: Requirements Exploration
Clarifying Questions I’d Ask
| Question | Why It Matters | Assumed Answer |
|---|---|---|
| 1:1 chat only, or group chat too? | Group chat adds participant management, mentions, and message fan-out complexity | Both 1:1 and group chat (up to 256 members) |
| What message types? | Media messages need upload pipelines, previews, and progressive loading | Text, images, files, voice notes, link previews |
| Do we need read receipts? | Sent/delivered/read is a 3-state system that needs real-time updates | Yes — sent, delivered, read (with blue ticks) |
| Typing indicators? | Requires debounced WebSocket events with timeout logic | Yes — "User is typing..." with multi-user support in groups |
| Offline support? | Queue messages locally, sync when back online, handle conflicts | Yes — send offline, sync on reconnect |
| Message search? | Full-text search across conversations changes data storage | Yes — search within a conversation and globally |
| End-to-end encryption? | E2EE means the client handles all encryption/decryption and can’t rely on the server for search | Nice-to-have; discuss architecture implications |
| Message reactions/threads? | Reactions need real-time aggregation; threads add nested conversation complexity | Reactions yes, threaded replies yes |
Functional Requirements
Conversation list: Sorted by most recent message, unread badges, online status indicators
Message thread: Reverse-chronological infinite scroll (newest at bottom), date separators, message grouping by sender
Sending: Text with emoji, image/file upload with progress, voice notes with waveform
Real-time: Instant message delivery, typing indicators, read receipts, online/offline presence
Interactions: Reply to message, reactions (emoji), forward, delete for me/everyone
Search: Search conversations, search messages within a conversation
Notifications: Browser push notifications, unread count in tab title, notification sounds
Non-Functional Requirements
Latency: Message send-to-display < 100ms on the sender’s device (optimistic), < 500ms on receiver’s
Reliability: Zero message loss — every message must eventually be delivered and displayed
Ordering: Messages must appear in causal order within a conversation
Offline: Full read access to cached conversations, queued sends that sync on reconnect
Memory: Handle conversations with 100K+ messages without browser crash
Accessibility: Screen reader support for conversation navigation and message reading
🔥 Real-world war story: WhatsApp Web had a critical ordering bug in 2020 where messages sent from a phone with a slightly skewed clock would appear out of order on the web client. The root cause: they were sorting by the sender’s device timestamp instead of the server’s received timestamp. The fix was a hybrid approach — use the server timestamp for ordering but display the sender’s local timestamp. This is the same lesson that Lamport’s logical clocks teach in distributed systems: never trust wall clocks for ordering events; use a clock that the ordering authority controls.
🏗️ Step 2: Architecture / High-Level Design
Component Architecture
The WebSocket Connection Architecture
This is the heart of any messenger app. The WebSocket connection needs to handle message delivery, typing indicators, presence updates, and read receipts — all while dealing with flaky networks, reconnections, and message deduplication.
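The deduplication piece mentioned above can be isolated into a small, testable unit. A minimal sketch (class and method names are illustrative, not from any real codebase): every incoming frame carries a server-assigned message id, and the client drops ids it has already applied, which makes redelivery after a reconnect safe.

```typescript
// Deduplicates incoming WebSocket messages by server-assigned id.
// Keeps a bounded set so memory stays constant on long-lived connections.
class MessageDeduper {
  private seen = new Set<string>();
  private order: string[] = [];

  constructor(private capacity = 10_000) {}

  /** Returns true the first time an id is observed, false on replays. */
  accept(id: string): boolean {
    if (this.seen.has(id)) return false;
    this.seen.add(id);
    this.order.push(id);
    if (this.order.length > this.capacity) {
      // Evict the oldest id; by then a replay of it is effectively impossible.
      this.seen.delete(this.order.shift()!);
    }
    return true;
  }
}
```

The bounded capacity is the key design choice: an unbounded `Set` would leak memory on a connection that stays open for days.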
Cross-Tab Sync
The Detail: Users often have multiple tabs open. If you don't handle this, each tab will open its own WebSocket, wasting server resources and causing "notification storms" (all tabs pinging at once).
The Solution: Use a SharedWorker or the BroadcastChannel API.
SharedWorker: One "Master" worker owns the WebSocket connection. All open tabs communicate with this single worker to send/receive messages.
BroadcastChannel: If one tab receives a message and updates the local IndexedDB, it broadcasts an event: `channel.postMessage({ type: 'NEW_MESSAGE', id: ... })`. Other tabs listen and update their Redux/Zustand stores without refetching from the network.
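A sketch of that pattern, with the store update written as a pure reducer so the cross-tab logic is testable outside the browser (the event shape and names are illustrative, not a real product protocol):

```typescript
// Cross-tab event shape broadcast between tabs (illustrative).
type CrossTabEvent =
  | { type: "NEW_MESSAGE"; conversationId: string; messageId: string }
  | { type: "READ"; conversationId: string };

interface ConversationState {
  unread: number;
  lastMessageId?: string;
}

// Pure reducer: how any tab folds a broadcast event into its local store.
function applyCrossTabEvent(
  state: ConversationState,
  event: CrossTabEvent
): ConversationState {
  switch (event.type) {
    case "NEW_MESSAGE":
      return { unread: state.unread + 1, lastMessageId: event.messageId };
    case "READ":
      return { ...state, unread: 0 };
  }
}

// Browser wiring (only where BroadcastChannel exists):
// const channel = new BroadcastChannel("chat-sync");
// channel.onmessage = (e) => store.dispatch(e.data as CrossTabEvent);
// channel.postMessage({ type: "NEW_MESSAGE", conversationId: "c1", messageId: "m9" });
```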
🔥 Real-world war story: Discord discovered that during large server events (like game launches), thousands of users would disconnect and reconnect simultaneously — a thundering herd problem. Their WebSocket servers would crash under the reconnection flood. The fix was adding jitter to the reconnect delay (random 0-1s added to the exponential backoff). This simple change spread reconnections over time and reduced server peak load by 60%.
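Discord’s fix as described adds a random 0-1s on top of the exponential delay; a common variant of the same idea is "full jitter" (randomize across the whole backoff window), sketched here as a pure function with an injectable random source so it can be tested deterministically:

```typescript
// Exponential backoff with "full jitter" to avoid thundering-herd reconnects.
// The window doubles per attempt up to a cap, then a uniform random fraction
// of it is taken, spreading simultaneous reconnects over time.
function reconnectDelayMs(
  attempt: number, // 0-based reconnect attempt counter
  baseMs = 500,
  capMs = 30_000,
  random: () => number = Math.random
): number {
  const windowMs = Math.min(capMs, baseMs * 2 ** attempt);
  return random() * windowMs; // uniform in [0, windowMs)
}
```

On reconnect success the attempt counter resets to zero; on every failure it increments.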
📊 Step 3: Data Model
Why Message Ordering is Hard
🔥 Real-world war story: Telegram had a subtle bug where messages in group chats would occasionally appear in different orders for different participants. The root cause: their client was using the message_id for ordering, but message_id was assigned by different servers in a distributed system, and the IDs were not globally monotonic. Their fix was introducing a per-chat pts (points) counter that’s incremented atomically for each new message in a conversation.
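On the client, a pts-style counter reduces ordering to a trivial comparator; the sketch below assumes (as in the story above) that the server assigns the per-conversation counter atomically, and keeps the sender’s wall-clock time strictly for display:

```typescript
// Ordering by a per-conversation, server-incremented counter ("pts") instead
// of timestamps or globally non-monotonic message ids.
interface ChatMessage {
  id: string;
  pts: number;      // per-conversation sequence number, assigned by the server
  sentAtMs: number; // sender's local clock: shown in the UI, never used to sort
}

function byConversationOrder(a: ChatMessage, b: ChatMessage): number {
  return a.pts - b.pts; // never compare wall-clock times for ordering
}
```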
🔌 Step 4: Interface Definition (API Design)
REST API for Initial Data
WebSocket Protocol
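A sketch of what the client side of such a protocol can look like (the frame names and fields are illustrative, not any real product’s wire format): a discriminated union lets TypeScript check that every frame type is handled, and a defensive parser ensures a malformed frame never crashes the socket handler.

```typescript
// Illustrative server-to-client frame envelope for the chat WebSocket.
type ServerFrame =
  | { type: "message_new"; conversationId: string; messageId: string; text: string }
  | { type: "typing_start"; conversationId: string; userId: string }
  | { type: "typing_stop"; conversationId: string; userId: string }
  | { type: "receipt"; conversationId: string; messageIds: string[]; state: "delivered" | "read" };

// Parse defensively: drop malformed frames instead of throwing in onmessage.
function parseServerFrame(raw: string): ServerFrame | null {
  try {
    const frame = JSON.parse(raw);
    return typeof frame?.type === "string" ? (frame as ServerFrame) : null;
  } catch {
    return null;
  }
}
```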
The Optimistic Send Flow
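The core of the flow, sketched as pure state transitions (all names are illustrative): the client appends the message immediately with a client-generated temporary id and "sending" status, then reconciles when the server ack arrives with the real id, or marks it failed for retry.

```typescript
type SendStatus = "sending" | "sent" | "failed";

interface OutgoingMessage {
  clientId: string;  // generated locally, e.g. via crypto.randomUUID()
  serverId?: string; // filled in by the server ack
  text: string;
  status: SendStatus;
}

// Step 1: append instantly so the sender sees the message in < 100ms.
function optimisticAppend(list: OutgoingMessage[], clientId: string, text: string): OutgoingMessage[] {
  return [...list, { clientId, text, status: "sending" }];
}

// Step 2a: the ack carries the clientId back, so we can swap in the server id.
function applyAck(list: OutgoingMessage[], clientId: string, serverId: string): OutgoingMessage[] {
  return list.map((m) => (m.clientId === clientId ? { ...m, serverId, status: "sent" } : m));
}

// Step 2b: on timeout or error, flag the message for a manual/automatic retry.
function applyFailure(list: OutgoingMessage[], clientId: string): OutgoingMessage[] {
  return list.map((m) => (m.clientId === clientId ? { ...m, status: "failed" } : m));
}
```

Keying the reconciliation on `clientId` (not array position) is what keeps the flow correct when several sends are in flight at once.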
Binary over WebSockets
The Detail: Instead of sending plain-text JSON, which is verbose and computationally expensive to parse, use a binary format like Protocol Buffers (Protobuf).
The Benefit: On mobile devices, parsing large JSON strings causes CPU spikes that drain battery and block the main thread. Binary payloads are significantly smaller and faster to deserialize.
Implementation: Define a `.proto` file that is shared between the frontend and backend. The frontend uses a library like `protobufjs` to encode/decode messages.
Staff Tip: Mention that while binary is faster, it makes debugging harder because you can't read the network traffic in the "Network" tab without a decoder. Suggest a "Development Mode" toggle that falls back to JSON for easier debugging.
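To make the idea concrete without pulling in `protobufjs`, here is a hand-rolled binary frame using `DataView` (the layout is invented for illustration; a real system would generate this from the shared `.proto` schema). It shows why binary wins: fixed-width header fields need no string parsing at all.

```typescript
// Frame layout (illustrative):
// [1 byte opcode][4 bytes conversation number][2 bytes text length][utf-8 text]
function encodeTextFrame(opcode: number, conversation: number, text: string): ArrayBuffer {
  const body = new TextEncoder().encode(text);
  const buf = new ArrayBuffer(7 + body.byteLength);
  const view = new DataView(buf);
  view.setUint8(0, opcode);
  view.setUint32(1, conversation);
  view.setUint16(5, body.byteLength);
  new Uint8Array(buf, 7).set(body);
  return buf;
}

function decodeTextFrame(buf: ArrayBuffer): { opcode: number; conversation: number; text: string } {
  const view = new DataView(buf);
  const len = view.getUint16(5);
  return {
    opcode: view.getUint8(0),
    conversation: view.getUint32(1),
    text: new TextDecoder().decode(new Uint8Array(buf, 7, len)),
  };
}
```

On the wire, the socket would be configured with `socket.binaryType = "arraybuffer"` so received frames arrive as `ArrayBuffer` rather than `Blob`.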
🔥 Real-world war story: Slack’s message delivery had a race condition: if a user sent two messages rapidly (within 50ms), the optimistic UI would show them in order, but the server would occasionally acknowledge them in reverse order, causing the messages to "swap" positions after 1-2 seconds. The fix was assigning client-side sequence numbers and having the server respect client ordering within a single sender’s message burst.
⚡ Step 5: Optimizations
1. Typing Indicators with Debounce + Timeout
2. Message Virtualization (Reverse Infinite Scroll)
The key techniques: render the thread in a `flex-col-reverse` container so the newest message sits at the bottom and scrolling feels natural, virtualize the rows so only the visible slice mounts, show a spinner at the top while older messages load, and surface a scroll-to-bottom button (with `scrollToBottom("smooth")`) whenever the user has scrolled away from the bottom.
3. Offline Support with IndexedDB
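The outbox structure matters as much as the storage: one FIFO queue per conversation, so draining one conversation’s backlog can never interleave with another’s. A minimal in-memory sketch (names illustrative; in the browser each queue would be mirrored to an IndexedDB object store so it survives a tab reload):

```typescript
interface QueuedSend {
  clientId: string;
  text: string;
}

// Offline outbox: independent FIFO queues keyed by conversation.
class Outbox {
  private queues = new Map<string, QueuedSend[]>();

  enqueue(conversationId: string, item: QueuedSend): void {
    const q = this.queues.get(conversationId) ?? [];
    q.push(item);
    this.queues.set(conversationId, q);
  }

  /** Drain one conversation independently, preserving send order. */
  drain(conversationId: string, send: (item: QueuedSend) => void): void {
    for (const item of this.queues.get(conversationId) ?? []) send(item);
    this.queues.delete(conversationId);
  }
}
```

On reconnect, each conversation’s queue drains on its own, so a failure in one conversation never blocks the others.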
E2EE & Client-Side Search
The Detail: If the app requires End-to-End Encryption (E2EE), the server only sees encrypted "blobs." This breaks traditional server-side search.
The Architecture Shift: You must move the search engine to the client. As messages are decrypted and stored in IndexedDB, you should simultaneously index them.
Implementation: Use a library like `lunr.js` or `FlexSearch` to build an inverted index. When a user searches, the query runs against the local IndexedDB index rather than making an API call.
The Conflict: This creates a memory vs. functionality trade-off. You can only search messages that have been synced to the local device.
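To show what those libraries do under the hood, here is a toy inverted index (a sketch for illustration only; FlexSearch adds stemming, ranking, and persistence on top of the same idea):

```typescript
// Minimal inverted index: token -> set of message ids containing it.
class InvertedIndex {
  private postings = new Map<string, Set<string>>();

  // Called as each decrypted message is written to IndexedDB.
  add(messageId: string, text: string): void {
    for (const token of text.toLowerCase().split(/\W+/).filter(Boolean)) {
      if (!this.postings.has(token)) this.postings.set(token, new Set());
      this.postings.get(token)!.add(messageId);
    }
  }

  /** Returns ids of messages containing every query token (AND semantics). */
  search(query: string): string[] {
    const tokens = query.toLowerCase().split(/\W+/).filter(Boolean);
    if (tokens.length === 0) return [];
    let result: Set<string> | undefined;
    for (const token of tokens) {
      const ids = this.postings.get(token) ?? new Set<string>();
      result = result ? new Set([...result].filter((id) => ids.has(id))) : new Set(ids);
    }
    return [...(result ?? [])];
  }
}
```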
🔥 Real-world war story: WhatsApp Web’s offline mode had a critical sync bug: if a user sent messages to two different conversations while offline, and both conversations had a pending "typing_start" event, the sync would interleave messages from both conversations in the outbox, causing messages to appear in wrong chats. The fix was per-conversation outbox queues that sync independently, not a single global outbox.
4. Read Receipt Batching
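The idea: instead of one WebSocket frame per message read, collect the ids of messages that have scrolled into view and send them as one batch per flush window. A sketch (names illustrative): in the component, ids arrive from an IntersectionObserver callback and `flush()` runs on a roughly one-second timer.

```typescript
// Batches "message seen" ids into a single read-receipt frame per flush.
class ReadReceiptBatcher {
  private pending = new Set<string>();

  constructor(private sendBatch: (messageIds: string[]) => void) {}

  /** Called when a message element intersects the viewport. */
  markSeen(messageId: string): void {
    this.pending.add(messageId); // Set dedupes repeated intersections
  }

  /** Called on a timer; sends at most one frame per window. */
  flush(): void {
    if (this.pending.size === 0) return; // nothing seen: send no frame at all
    this.sendBatch([...this.pending]);
    this.pending.clear();
  }
}
```

For a fast scroll through 50 unread messages, this sends one frame with 50 ids instead of 50 frames.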
5. Link Preview Generation
6. Notification System
📊 Performance Budget
| Metric | Target | How We Achieve It |
|---|---|---|
| Message send latency | < 100ms perceived | Optimistic UI — message appears instantly, server confirms async |
| Message receive latency | < 500ms | WebSocket push, no polling |
| Conversation switch time | < 200ms | IndexedDB cache for instant display, network fetch for fresh data |
| Memory with 50 open conversations | < 200MB | Virtualized message lists, evict old messages from memory |
| Reconnection time | < 3s | Exponential backoff with jitter, immediate sync on reconnect |
| Offline message queue | Unlimited | IndexedDB outbox with per-conversation ordering |
| Typing indicator latency | < 200ms | Immediate WebSocket send on first keystroke, debounced stop |
🧠 Summary: What Makes This a 5/5 Answer
| Rubric | What We Covered |
|---|---|
| Requirements | Scoped 1:1 + group chat with real-time delivery, typing indicators, read receipts, offline support, and specific latency targets |
| Architecture | Full component tree with WebSocket connection manager (heartbeat, reconnect with jitter, message buffer, deduplication) |
| Data Model | Normalized store with message ordering (Lamport-style sequence numbers), presence tracking, typing state, offline queue |
| API Design | REST for initial load + WebSocket protocol for real-time, complete optimistic send flow with 8 steps, read receipt batching |
| Optimizations | Typing indicators (debounce + timeout), reverse infinite scroll, IndexedDB offline storage with per-conversation outbox, read receipt batching with IntersectionObserver, link preview generation, notification system (sound + browser + tab title + favicon badge) |
| Real-world depth | 5 production war stories from WhatsApp (clock skew ordering), Discord (thundering herd), Telegram (distributed message IDs), Slack (message swap race condition), WhatsApp (interleaved outbox sync) |
The key differentiator: most candidates design a basic "send/receive messages" system. A 5/5 answer tackles the three killer problems: message ordering in distributed systems, offline-first with conflict-free sync, and the WebSocket connection lifecycle (heartbeat, reconnection, message deduplication). These are the exact problems that Slack, Discord, and WhatsApp frontend teams have dedicated engineers working on full-time.
Next up in this series: Design an API Progress Bar — where we will explore how to build a YouTube/GitHub-style top-of-page loading bar that communicates request progress, handles parallel requests, and creates the illusion of speed even when the server is slow.