- Channel/Interviewer: Gaurav Sen
- Duration: 00:25:12
- Original Video: https://www.youtube.com/watch?v=vvhC64hQZMk
This document summarizes the key content of a system design mock interview. I highly recommend watching the full video if you can.
- Problem Prompt (One-liner): Design a WhatsApp-style chat app supporting 1:1 messaging, groups, media messages, and read/delivered receipts, with online/last-seen indicators.
- Primary Scope
- In scope: 1:1 chat; group messaging (fan-out to members); message persistence/retry; sent/delivered/read receipts; “online/last seen”; gateways; session routing; decoupled microservices; basic load-balancing and retries; idempotency.
- Out of scope: Deep auth details; full image/video pipeline (referenced to a separate video); global search; heavy analytics; group delivery receipts per member (called “expensive” and often omitted).
- Non-Functional Priorities: Real-time delivery, horizontal scalability, high availability (no single points of failure), graceful degradation under load, cost awareness (minimize memory on gateways).
- Key Constraints & Numbers (as stated): Group size capped (example: ≈200) to avoid unbounded fan-out; “temporary” vs “permanent” chat retention discussed conceptually (no hard numbers).
- High-Level Architecture (Text)
- Mobile client maintains a WebSocket with a Gateway.
- Gateways are “dumb” TCP endpoints—offload logic to services.
- Session Service tracks user↔connection (box) mappings and routes messages.
- Group Service maps group→members; queried by Session for fan-out.
- Parser/Unparser Service converts external payloads (e.g., JSON over HTTP/WebSocket) to internal RPC objects (e.g., Thrift).
- [Personal note: Consider gRPC/Protobuf over Thrift in 2025 for broader ecosystem support and first-class HTTP/2.]
- Message Store/Queue persists messages and handles retries/idempotency.
- Last-Seen Service updates and serves online/last-seen timestamps.
- Load balancer + service discovery/heartbeats across all tiers.
- Top Trade-offs
- WebSockets vs HTTP long-polling (real-time vs simplicity).
- Fan-out at send time vs pull/batching (latency vs cost/pressure).
- “Dumb” gateways vs richer edges (memory/scale vs features).
- Transient (device-only) storage vs durable server store (privacy/cost vs reliability/compliance).
- Strict receipts (per member) vs partial/omitted receipts (accuracy vs performance).
- Biggest Risks/Failure Modes
- Gateway memory pressure from many TCP connections.
- Hot groups causing bursty fan-out and backpressure.
- Coupling session state into gateways → duplication & failures.
- Retry storms without idempotency keys.
- Over-prioritizing non-essential signals (e.g., last-seen) during global spikes.
- 5-Min Review Flashcards
- Q: Why WebSockets for chat? A: True server→client pushes with low latency; avoids polling.
- Q: What does the Session service store? A: User→gateway/connection mapping to route messages.
- Q: Where is group membership stored? A: Group Service (decoupled from Session).
- Q: How do you reduce gateway memory/CPU? A: Keep them “dumb”; offload parsing/auth; externalize session state.
- Q: How to avoid single points of failure? A: Replicate services, use service discovery/heartbeats and load balancers.
- Q: How are receipts implemented? A: ACKs from client propagate back; store/persist for retries.
- Q: What to de-prioritize during New Year spikes? A: Last-seen, delivery/read receipts—focus on message send/ACK.
- Q: How to tame fan-out for big groups? A: Cap group size; consider batching/pull for extremes.
- Q: Why idempotency here? A: Retries are common; idempotent sends prevent duplicates.
- Q: Consistent hashing usage? A: Route group lookups to the right shard to reduce duplication.
- Q: Parser/Unparser value? A: Converts external payloads to compact internal RPC objects at a separate tier.
- Q: Why not HTTP long-polling? A: It’s not real-time enough for chat at scale.
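- [Personal note: If WebSockets are blocked or flaky on some networks, fall back to SSE or HTTP long-polling as a compatibility layer. HTTP/2 server push is not a substitute here: it pushes cacheable resources, not application messages.]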
- Domain/Industry: messaging, social-media, collaboration
- Product Pattern: realtime-chat, notification, queue, rate-limit
- System Concerns: high-availability, low-latency, throttling, backpressure, privacy
- Infra/Tech (mentioned): microservices, websocket, thrift, rest
- [Personal note: Prefer gRPC/Protobuf over Thrift in modern stacks due to ecosystem/tooling and HTTP/2 support.]
- Original Prompt: Build a WhatsApp-like chat with 1:1 and group messaging, image/video sharing, read/delivered receipts, and online/last-seen; finish within interview time by prioritizing.
- Use Cases
- Send/receive 1:1 messages with sent/delivered/read receipts.
- Create/join groups; sender’s message fans out to all group members.
- See “online” or “last seen” for contacts.
- Handle network/server failures with retries and durability where required.
- Out of Scope (stated/implicit)
- Per-member group delivery/read receipts (too expensive at scale).
- Full media pipeline details (covered elsewhere).
- Deep auth flows and email/SMS notifications (generic services).
- APIs: Not stated in video.
- Functional
- 1:1 chat with sent/delivered/read receipts.
- Group chat with capped membership (~200).
- Online/last-seen status.
- Message persistence and retry until delivered.
- Non-Functional
- Real-time message delivery (WebSockets).
- Scale horizontally; avoid single points of failure.
- Minimize gateway memory/logic; decouple services.
- Graceful degradation: deprioritize non-essential features during spikes.
- Durable server-side storage when compliance/official comms required; otherwise device-only storage acceptable for privacy/cost.
- [Personal note: For modern privacy expectations, pair E2E encryption with server-side encrypted, short-retention storage if durability is required—verify for your product/regulatory needs.]
- Client applies exponential backoff to reconnects and retries; server uses capped retries with dead-letter queues (DLQs).
- Idempotency keys on message sends to tolerate retries.
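
The backoff and idempotency-key requirements above can be sketched together. This is a minimal illustration, not something shown in the video: `backoff_delays`, `send_with_retry`, the "full jitter" strategy, and the default base/cap values are all assumptions chosen for the example.

```python
import random
import time
import uuid


def backoff_delays(base=0.5, cap=30.0, attempts=6, rng=None):
    """Yield jittered delays: uniform in [0, min(cap, base * 2**attempt)]."""
    rng = rng or random.Random()
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        yield rng.uniform(0.0, ceiling)


def send_with_retry(send, payload, delays, sleep=time.sleep):
    """Call send(key, payload) until it returns True or delays run out.

    The idempotency key is generated once per logical message and reused
    on every attempt, so the server can drop duplicates.
    `send` is a hypothetical transport callback supplied by the caller.
    """
    key = str(uuid.uuid4())
    for delay in delays:
        if send(key, payload):
            return key
        sleep(delay)  # wait before the next attempt
    return None  # caller surfaces the failure to the user
```

The jitter spreads reconnect attempts out so that many clients recovering from the same outage do not retry in lockstep.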
Not stated in video—skipping numerical estimation.
- Clients: Mobile apps with persistent WebSocket to Gateways.
- Gateways: Terminate TCP/WebSocket; minimal logic; forward raw payloads to Parser/Unparser; ask Session for routing.
- Parser/Unparser: Converts external JSON to internal RPC objects (e.g., Thrift), then forwards.
- [Personal note: Consider gRPC for streaming RPCs, built-in deadlines, and observability with modern tooling.]
- Session Service: Central router; stores user→connection (box) mappings; queries Group Service for fan-out targets.
- Group Service: Stores group→members; sharded via consistent hashing on groupId to reduce duplication.
- Message Queue/Store: Persists messages, handles retries and idempotency; notifies senders of “sent” immediately and “delivered” on ACK.
- [Personal note: Managed pub/sub (e.g., cloud-native) can simplify ops vs self-hosting; verify latency/SLA needs.]
- Last-Seen Service: Updates timestamp on user-initiated activities; answers online vs last-seen queries.
- Load Balancing & Discovery: LBs at edges; heartbeats/service discovery among services.
- Role & Responsibilities: Maintain TCP/WebSocket; accept frames; forward unparsed payloads; avoid heavy parsing/auth logic to conserve memory.
- Scaling & Partitioning: Horizontally scale; stateless beyond connection tables; fronted by LB.
- Failure Handling: If overloaded, shed non-essential work (last-seen/receipt updates).
- Bottlenecks: Connection memory; CPU for parsing (hence offload).
- Role: Source of truth for user→box mapping; routes 1:1 and group sends to the correct gateways.
- Data Model: `userId -> connection(boxId)`; transient; frequently updated; heavily cached.
- Consistency: Eventually consistent is acceptable; frequent heartbeats keep freshness.
- Failure Handling: Replicated instances; no SPOF.
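
A minimal in-memory sketch of the user→box mapping described above, assuming heartbeats refresh an entry and stale entries expire after a TTL. The class name `SessionRegistry`, the 30-second TTL, and the injectable `clock` are illustrative assumptions; a real deployment would keep this in a replicated cache rather than in one process.

```python
import time


class SessionRegistry:
    """Maps user_id -> (box_id, last_heartbeat); entries expire after a TTL."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self._ttl = ttl_seconds      # assumed value, not from the video
        self._clock = clock          # injectable so tests can control time
        self._entries = {}

    def heartbeat(self, user_id, box_id):
        """Record (or refresh) which gateway box holds the user's connection."""
        self._entries[user_id] = (box_id, self._clock())

    def lookup(self, user_id):
        """Return the user's box_id, or None if unknown or stale."""
        entry = self._entries.get(user_id)
        if entry is None:
            return None
        box_id, seen_at = entry
        if self._clock() - seen_at > self._ttl:
            del self._entries[user_id]  # drop the stale mapping
            return None
        return box_id
```

Because a missed heartbeat only makes a user look offline for a while, this fits the eventually consistent model noted above.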
- Role: Maintain `groupId -> [memberUserIds]`.
- Scaling: Consistent hashing on `groupId` to route to shards; cache top lists locally.
- Fan-out: Session queries Group to list targets, then pushes via their mapped gateways.
- Limits: Cap group sizes to avoid excessive fan-out (≈200 discussed).
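
One way to route `groupId` lookups to shards is a hash ring with virtual nodes. The sketch below is an assumption-laden illustration: `HashRing`, the shard names, the SHA-256 hash, and the 64 virtual nodes per shard are not from the video.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring mapping a group id to one shard name."""

    def __init__(self, nodes, vnodes=64):
        # Each shard is placed on the ring `vnodes` times to even out load.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(value):
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, group_id):
        """Return the first shard clockwise from the group's hash."""
        idx = bisect.bisect_right(self._keys, self._hash(group_id))
        return self._ring[idx % len(self._ring)][1]
```

When a shard is added, only the groups that now fall just before one of its ring positions move to it; every other group keeps its previous shard, which is what keeps local caches warm during scaling.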
- Role: Convert JSON/HTTP/WebSocket payloads into compact internal RPC objects (e.g., Thrift); centralizes payload evolution.
- Notes: Keeps gateways lean; allows schema-validated internal interfaces.
- [Personal note: Favor Protobuf schemas with gRPC for stronger tooling and streaming; Thrift remains viable if already invested.]
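
To make the Parser/Unparser role concrete, here is a small sketch that validates an external JSON frame and converts it to a typed internal object. A plain dataclass stands in for a Thrift or Protobuf struct, and the field names (`messageId`, `from`, `to`, `body`) are hypothetical since the video does not define a wire format.

```python
import json
from dataclasses import dataclass

REQUIRED_FIELDS = ("messageId", "from", "to", "body")  # assumed wire fields


@dataclass(frozen=True)
class ChatMessage:
    """Internal representation; a stand-in for a Thrift/Protobuf struct."""
    message_id: str
    sender_id: str
    recipient_id: str
    body: str


def parse_frame(raw: str) -> ChatMessage:
    """Convert an external JSON frame into the internal object."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("frame must be a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return ChatMessage(
        message_id=str(data["messageId"]),
        sender_id=str(data["from"]),
        recipient_id=str(data["to"]),
        body=str(data["body"]),
    )


def unparse(msg: ChatMessage) -> str:
    """Convert the internal object back to the external JSON form."""
    return json.dumps({
        "messageId": msg.message_id,
        "from": msg.sender_id,
        "to": msg.recipient_id,
        "body": msg.body,
    })
```

Keeping this conversion in its own tier means gateways can forward raw bytes without ever loading a JSON parser.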
- Flow: Sender→Gateway→Session→ReceiverGateway→Receiver; receiver ACKs; ACK propagates to mark “delivered”; “read” fires when chat opened.
- Idempotency: MessageId or client token to dedupe on retries.
- Queues/Retries: Message queues guarantee eventual send; configurable retry counts and delays.
- [Personal note: Ensure DLQs and bounded retries to avoid retry storms under partial outages.]
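
The dedupe-on-retry and receipt flow above can be sketched as a tiny store keyed by message id. `MessageStore` and the `Receipt` enum are hypothetical names; the sketch also assumes receipts only move forward (sent → delivered → read), so a late "delivered" ACK never downgrades a message already marked "read".

```python
from enum import IntEnum


class Receipt(IntEnum):
    SENT = 1
    DELIVERED = 2
    READ = 3


class MessageStore:
    """In-memory stand-in for the durable message store."""

    def __init__(self):
        self._messages = {}  # message_id -> [payload, Receipt]

    def accept(self, message_id, payload):
        """Store a message once; return False for a retried duplicate."""
        if message_id in self._messages:
            return False
        self._messages[message_id] = [payload, Receipt.SENT]
        return True

    def ack(self, message_id, receipt):
        """Advance the receipt state; out-of-order ACKs never move it back."""
        entry = self._messages[message_id]
        entry[1] = max(entry[1], receipt)
        return entry[1]
```

A sender that retries after a timeout calls `accept` again with the same id, gets `False`, and the recipient sees the message only once.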
- Role: Track latest user activity timestamps from user-initiated requests; serve “online” (within a small threshold) or a “last seen” timestamp.
- Client Flags: Distinguish user activity vs app/system requests so background polls don’t flip presence.
- Thresholds: e.g., ≤10–15s → “online”; else show timestamp.
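
A minimal sketch of the presence rules above, assuming a 15-second online threshold (the upper end of the range mentioned) and a `user_initiated` flag sent by the client. The class and method names are illustrative, not from the video.

```python
import time

ONLINE_THRESHOLD_SECONDS = 15.0  # assumed; the notes give 10-15s as an example


class LastSeen:
    """Tracks the latest user-initiated activity per user."""

    def __init__(self, clock=time.time):
        self._clock = clock  # injectable so tests can control time
        self._last = {}      # user_id -> timestamp of last user activity

    def record(self, user_id, user_initiated):
        """Update the timestamp only for user activity, not background polls."""
        if user_initiated:
            self._last[user_id] = self._clock()

    def status(self, user_id):
        """Return ("online" | "last_seen" | "unknown", timestamp or None)."""
        ts = self._last.get(user_id)
        if ts is None:
            return ("unknown", None)
        if self._clock() - ts <= ONLINE_THRESHOLD_SECONDS:
            return ("online", ts)
        return ("last_seen", ts)
```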
| Topic | Option A | Option B | Video’s Leaning | Rationale (from video) |
|---|---|---|---|---|
| Server push | WebSockets | HTTP long-polling | WebSockets | Real-time and bidirectional vs periodic polling. |
| Gateway logic | “Dumb” gateways | Feature-rich edge | “Dumb” | Conserve memory/CPU per connection. |
| Group delivery receipts | Per-member | Omit/partial | Omit/partial | Too expensive for large groups. |
| Group delivery model | Immediate fan-out | Pull/batching | Immediate (with cap) | Chat wants low latency; cap group size to bound cost. |
| Payload conversion | Parser/Unparser (Thrift) | Parse at gateways | Parser/Unparser | Centralize compute; keep edges light. |
| Storage model | Device-only (temporary) | Server-durable | Depends on needs | Privacy/cost vs compliance/reliability trade-off. |
- Replication/No SPOF: Multiple instances for Session, Group, Last-Seen, Parser, Queues.
- Backpressure & Throttling: Rate-limit/queue; shed non-essentials (last-seen, receipts) during global spikes.
- Load Shedding & Degradation: Prioritize “send + ACK” path over secondary signals.
- Retries: Server-side retry with idempotency keys; surface failure after N attempts.
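
The shedding policy above can be expressed as a simple admission check. The priority ordering follows the summary (messages first, then receipts, then last-seen), while the load thresholds of 0.7 and 0.9 and the names `Priority` and `admit` are assumptions made for this sketch.

```python
from enum import IntEnum


class Priority(IntEnum):
    MESSAGE = 0    # core send + ACK path, never shed here
    RECEIPT = 1    # delivered/read receipts
    LAST_SEEN = 2  # presence updates, shed first


def admit(priority, load):
    """Decide whether to process a request at the given load (0.0 to 1.0).

    Thresholds are illustrative: above 0.7 drop last-seen updates,
    above 0.9 also drop receipts and keep only message traffic.
    """
    if load >= 0.9:
        return priority == Priority.MESSAGE
    if load >= 0.7:
        return priority <= Priority.RECEIPT
    return True
```

During a spike such as New Year's Eve, a gateway or router would call `admit` before doing any work and silently skip the rejected signals.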
- Auth: Mentioned as separate/simple service; details not covered.
- Privacy: “Temporary” chats possible (device-only) but consider compliance needs for durability.
- Service discovery/heartbeats noted; detailed metrics/tracing not covered.
- Not stated in video.
- Not stated in video.
- Use WebSockets for real-time bidirectional chat; avoid HTTP long-polling latency.
- Keep gateways “dumb”; move logic to services to conserve memory/CPU.
- Centralize session routing (user→box) in a Session service.
- Decouple group membership into a Group service; shard via consistent hashing.
- Employ message queues and idempotency for reliable delivery with retries.
- Prioritize core delivery paths; deprioritize last-seen/receipt signals during global spikes.
- Consider temporary vs durable storage based on privacy vs compliance.
- Use a Parser/Unparser tier to standardize internal RPC objects.
- [Personal note: Modern stacks frequently standardize on gRPC/Protobuf; choose what your team/infra supports best.]
- Cap group sizes to keep fan-out bounded and predictable.
- WebSocket: Persistent, bidirectional client-server channel over TCP.
- Session Service: Router storing user→connection mappings.
- Group Service: Service holding group→member lists.
- Consistent Hashing: Sharding method minimizing remaps on node changes.
- Idempotency: Property where repeating an operation has the same effect as performing it once.
- Load Shedding: Dropping/deprioritizing work to protect core functions.
- Parser/Unparser: Translator between external payloads and internal RPC objects.
- ACK (Acknowledgement): Receiver’s confirmation used for receipts.
- Review: WebSockets vs SSE vs HTTP/2 push; retries/backoff; idempotency keys; consistent hashing; fan-out patterns; service discovery/heartbeats; graceful degradation strategies.
- [Personal note: Require TLS 1.2 at minimum and prefer TLS 1.3 today; TLS 1.0 and 1.1 are deprecated.]
- Source Video: https://www.youtube.com/watch?v=vvhC64hQZMk
- Channel: Gaurav Sen
- Note: This document is a summary of the linked mock interview.
I’m Ali Sol, a PHP Developer. Learn more:
- Website: alisol.ir
- LinkedIn: linkedin.com/in/alisolphp