Design Process Steps
Follow a repeatable framework from high-level design to deep dives within the time limit.
Why a Structured Design Process Wins Interviews
Imagine you're sitting across from a senior engineer at a company you've wanted to work at for years. They slide a whiteboard marker across the table and say: "Design YouTube." Two words. No further explanation. What do you do next? If your instinct is to immediately start drawing boxes (a load balancer here, a database there, maybe toss in some microservices for good measure), you're about to make the most common and costly mistake in system design interviews. Let's talk about why winging it almost always fails, and what separates candidates who get offers from those who don't.
System design interviews are uniquely uncomfortable because they're deliberately open-ended. Unlike algorithm problems, there is no single correct answer, no hidden test case that will tell you whether you passed or failed. This ambiguity is the point. Companies use system design interviews to evaluate how you think, not just what you know. And yet, the overwhelming majority of candidates treat these interviews like a knowledge dump: a race to show everything they've ever heard about distributed systems before time runs out.
The Hidden Rubric Interviewers Are Actually Using
Here's the uncomfortable truth: interviewers are rarely impressed by raw technical knowledge alone. They're watching something much more specific. They want to see whether you behave like a senior engineer who has actually shipped large-scale systems, or like someone who has memorized blog posts about them.
What does a senior engineer actually do when faced with an ambiguous problem? They don't guess. They don't assume. They lead. They ask clarifying questions, establish scope, reason about constraints quantitatively, propose solutions with explicit trade-offs, and communicate their thinking every step of the way. This is the behavior pattern interviewers are scoring you on, whether or not they've written it down on an explicit rubric.
Most interviewers are evaluating candidates across four dimensions:
| Dimension | What They're Looking For |
|---|---|
| Problem Framing | Can you scope ambiguous problems before proposing solutions? |
| Technical Depth | Do you understand the systems you propose? |
| Communication | Can you explain your thinking clearly under pressure? |
| Trade-off Reasoning | Do you acknowledge constraints and justify decisions? |
Notice that "correct final architecture" doesn't appear anywhere on that list as a standalone category. That's because two candidates can propose entirely different architectures and both pass, as long as both can justify their choices coherently. Conversely, a candidate who proposes a textbook-perfect architecture but stumbles through it without explaining their reasoning will fail.
💡 Real-World Example: A senior engineer at a major tech company once described their interview process this way: "I once passed a candidate who designed a pretty mediocre system. I failed one who designed a much better one. The difference was that the first person walked me through every decision with clear reasoning. The second person just drew boxes and said 'and then this scales.' I didn't know if they understood why it would scale."
Why Improvisation Fails Under Pressure
Let's get specific about what happens when candidates improvise. Picture a candidate asked to design a URL shortening service like bit.ly. Here's how an unstructured response typically unfolds:
Candidate (improvising):
"Okay, so we'll have... a web server. And it needs a database.
Probably MySQL. Or maybe NoSQL. Let's say Cassandra.
And then we need to generate short URLs, so we can use...
hm, base62 encoding. And then there's a cache, Redis probably.
Oh wait, what about the load balancer? We should have that too.
And for scale... sharding? Yeah, sharding."
This response has real knowledge in it. But it's a disaster in interview terms. Why? Because the candidate never established what scale means for this system. Are we handling 100 requests per second or 100,000? Is read latency more important than write latency? Are custom short URLs a feature? What about analytics? What about link expiration?
Without answers to these questions, every architectural choice is a guess. And interviewers can tell the difference between an engineer reasoning from constraints and one who's pattern-matching to "things I've heard about." The improvised approach produces poor signal: the interviewer finishes the conversation not knowing whether the candidate could actually build this system or just describe one.
⚠️ Common Mistake: Jumping to architecture before clarifying requirements. Candidates who start drawing immediately signal that they are reactive rather than analytical. The interviewer is not waiting for you to pick up the marker; they're watching whether you know not to.
❌ Wrong thinking: "I need to show I know about distributed systems quickly."
✅ Correct thinking: "I need to understand the problem deeply before I propose any solution."
The anxiety that drives improvisation is completely understandable. System design interviews feel like a test of how much you know, so candidates try to show everything as fast as possible. But this is a fundamental misread of what's being evaluated. The interviewer has an hour with you. They don't want a firehose. They want a conversation that feels like the ones that happen in real engineering meetings.
🤔 Did you know? Research on expert problem-solving consistently shows that experts spend significantly more time understanding a problem before attempting to solve it than novices do. Novices jump to solutions. Experts front-load comprehension. System design interviews are specifically designed to test which mode you naturally operate in.
A Repeatable Framework Changes Everything
Here's what changes when you approach system design interviews with a repeatable structured process: everything.
First, anxiety drops. When you have a clear sequence of steps to follow, you're never wondering "what should I do next?" You always know. This is the same reason checklists are used in surgery, aviation, and nuclear power plants: not because the practitioners are incapable of remembering, but because structured processes under pressure outperform improvised ones every time.
Second, your communication becomes naturally clearer. When you're following a framework, you can narrate each step explicitly: "I'm going to start by clarifying the requirements, then do some rough capacity estimation, and then we'll get into the architecture." This kind of narration is exactly what interviewers want to hear. It signals meta-awareness: the ability to manage a problem-solving process, not just execute within one.
Third, you stop wasting the interview's most valuable resource: time. An unstructured 45-minute design session might spend 35 minutes on architecture and never discuss monitoring, failure modes, or bottlenecks. A structured session allocates time intentionally, ensuring you cover the dimensions interviewers care most about.
🎯 Key Principle: A repeatable design process is not a crutch; it's a professional discipline. Senior engineers use structured frameworks in real design sessions precisely because they work. Demonstrating this in an interview is demonstrating genuine seniority.
Let's see what the structured approach looks like in contrast to improvisation, using the same URL shortener example:
Candidate (structured):
"Before I start designing, I want to make sure I understand the problem.
A few clarifying questions:
1. What's the expected scale? Reads per second? Writes?
2. Are custom aliases required, or only system-generated short URLs?
3. Do we need analytics (click counts, geographic data)?
4. What's the expected link lifetime? Do URLs expire?
5. Is this global or single-region?"
Interviewer: "Good questions. Let's say 100M URLs created per day,
10:1 read-to-write ratio, no custom aliases for now,
basic click analytics, no expiration, and global."
Candidate: "Perfect. So roughly 1,200 writes/second and
12,000 reads/second globally. Given the read-heavy nature and
global distribution, I'm going to prioritize read latency and
consider a CDN layer and aggressive caching. Let me sketch
the high-level components and then we can go deep on
the parts you're most interested in."
This response takes perhaps 90 seconds longer than jumping straight to the architecture. But it produces a completely different interview dynamic. The candidate now looks like they're leading the session. The interviewer is a collaborator, not an evaluator waiting for mistakes.
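The candidate's quick mental math can be sanity-checked in a few lines (a rough sketch; the aggressive rounding is deliberate):

```python
# Sanity-checking the structured candidate's numbers.
SECONDS_PER_DAY = 86_400

urls_per_day = 100_000_000            # 100M URLs created per day
write_rps = urls_per_day / SECONDS_PER_DAY
read_rps = write_rps * 10             # 10:1 read-to-write ratio

print(f"Writes/sec: ~{write_rps:,.0f}")   # ~1,157, round to ~1,200
print(f"Reads/sec:  ~{read_rps:,.0f}")    # ~11,574, round to ~12,000
```

Rounding 1,157 up to 1,200 is exactly the kind of approximation interviewers expect; precision past one significant figure adds nothing.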
Senior Engineers Lead Through Ambiguity, and So Should You
One of the most important signals a system design interview is designed to measure is comfort with ambiguity. Junior engineers need requirements handed to them. Senior engineers create requirements from ambiguity. This is not a minor distinction; it's one of the core differences between IC levels at most companies.
When you receive a vague prompt like "design a chat system" or "design Uber," you're being given a deliberately underspecified problem. What happens next tells the interviewer everything about your engineering maturity:
ASCII Flow: Two Response Paths
Prompt: "Design a notification system"
                  |
         +--------+--------+
         |                 |
      JUNIOR            SENIOR
     RESPONSE          RESPONSE
         |                 |
         v                 v
  "Okay, I'll       "A few questions first:
   build a           push only, or also
   push              email/SMS?
   notification      What SLA for delivery?
   system"           Real-time, or is batch
         |           okay? Do we need
         v           ordering guarantees?"
  Guesses at               |
  scope,                   v
  misses key        Establishes scope,
  requirements      designs the RIGHT
                    system confidently
The senior response doesn't just produce a better design; it signals that the person asking these questions has shipped notification systems and knows what bites you in production if you don't nail it up front. That's the association interviewers are looking to make.
💡 Mental Model: Think of a system design interview like being hired to build a house. A junior contractor shows up with lumber and starts framing walls. A senior contractor sits down with the client first and asks: How many people will live here? What's your budget? Do you need a home office? Any accessibility requirements? The lumber matters, but not before you know what you're building.
The Cost of Unstructured Responses: Three Failure Modes
To make this concrete, let's examine the three most common ways an unstructured response fails, not just from the candidate's perspective but from the interviewer's.
Failure Mode 1: Scope Creep and Time Collapse
Without scoping upfront, candidates often design something far larger than the interview allows. They start with a "simple" design, realize mid-session they haven't handled failures, then rush through five more components in the last 10 minutes. The interviewer sees a shallow, frantic finish: exactly the opposite of what signals seniority.
Failure Mode 2: Mismatched Problem Solving
Without clarifying requirements, candidates sometimes solve the wrong problem entirely. A candidate asked to design a "logging system" might design a centralized observability platform when the interviewer had in mind a simple audit trail for compliance. Twenty minutes of brilliant architecture in the wrong direction isn't recoverable.
Failure Mode 3: Poor Signal Quality
This is the subtlest failure, but the most consequential. When a candidate improvises, the interviewer can't tell the difference between genuine understanding and lucky guesses. They can't probe effectively because the candidate hasn't established a clear reasoning foundation to probe against. The result is an inconclusive interview, and when in doubt, interviewers vote no.
# This is a useful analogy: consider two functions that produce the
# same output but with very different levels of legibility.

# Function 1: Improvised (works, but opaque)
def get_short_url(long_url):
    import hashlib, base64
    return base64.urlsafe_b64encode(
        hashlib.md5(long_url.encode()).digest()
    )[:7].decode()

# Function 2: Structured (works AND communicates intent)
def generate_short_url(long_url: str, length: int = 7) -> str:
    """
    Generate a short URL identifier.

    Args:
        long_url: The original URL to shorten
        length: Number of characters in the short code (default 7).
                7 URL-safe base64 chars give ~4.4 trillion unique values.

    Returns:
        A URL-safe string of the specified length
    """
    import hashlib
    import base64

    # MD5 produces a 128-bit hash; we take the first `length` base64 chars.
    # Trade-off: collision risk exists but is acceptable at our scale.
    url_hash = hashlib.md5(long_url.encode()).digest()
    encoded = base64.urlsafe_b64encode(url_hash)
    return encoded[:length].decode()

# The output of both functions is identical, but Function 2 signals that
# the engineer knows *why* each decision was made: exactly what
# interviewers look for.
print(generate_short_url("https://example.com/very/long/path"))
This code analogy captures something important: in system design interviews, how you express your reasoning is as important as the reasoning itself. Two candidates can arrive at the same architecture, but the one who communicates the why behind each decision will consistently outperform the one who doesn't.
Preview: The End-to-End Design Process
The rest of this lesson builds out a complete, repeatable system design process that you can apply to any prompt you're given. Here's the framework you'll learn:
+----------------------------------------------------------+
|             THE 5-STEP SYSTEM DESIGN PROCESS             |
+----------+-----------------------------------------------+
| STEP 1   | Clarify Requirements & Scope                  |
|          | Define functional + non-functional needs      |
+----------+-----------------------------------------------+
| STEP 2   | Capacity Estimation                           |
|          | Size the system quantitatively                |
+----------+-----------------------------------------------+
| STEP 3   | API Contract Design                           |
|          | Define system boundaries and interfaces       |
+----------+-----------------------------------------------+
| STEP 4   | High-Level Architecture                       |
|          | Core components and data flow                 |
+----------+-----------------------------------------------+
| STEP 5   | Deep Dives & Trade-offs                       |
|          | Bottlenecks, scaling, failure modes           |
+----------+-----------------------------------------------+
Each step serves a distinct purpose. Each step also generates output that the next step depends on. This sequencing is intentional, and understanding why the steps are in this order is as important as knowing what they contain.
🧠 Mnemonic: C-E-A-H-D: Clarify, Estimate, API, High-level, Deep dive. Or remember it as "Can Every Architect Handle Depth?", a question that your structured process will answer with a resounding yes.
You'll notice something important about this framework: it deliberately delays the part candidates are most eager to get to (the architecture) until after three preparatory steps. This sequencing mirrors how experienced engineers actually approach design problems in real settings. The impatience to skip to the architecture is understandable, but it's exactly what the framework is designed to override.
# A simple way to think about the framework as a dependency chain:
design_process = {
    "step_1_requirements": {
        "inputs": ["interview_prompt"],
        "outputs": ["functional_reqs", "non_functional_reqs", "scope_boundaries"],
        "unlocks": "step_2"
    },
    "step_2_estimation": {
        "inputs": ["functional_reqs", "non_functional_reqs"],
        "outputs": ["qps_estimate", "storage_estimate", "bandwidth_estimate"],
        "unlocks": "step_3"
    },
    "step_3_api_design": {
        "inputs": ["functional_reqs", "scope_boundaries"],
        "outputs": ["api_endpoints", "data_contracts", "system_boundaries"],
        "unlocks": "step_4"
    },
    "step_4_architecture": {
        "inputs": ["all_previous_outputs"],
        "outputs": ["component_diagram", "data_flow", "storage_choices"],
        "unlocks": "step_5"
    },
    "step_5_deep_dives": {
        "inputs": ["component_diagram", "qps_estimate"],
        "outputs": ["scaling_strategy", "failure_handling", "trade_off_analysis"],
        "unlocks": "offer"
    }
}

# Each step's outputs become the next step's inputs.
# Skipping a step doesn't save time; it creates hidden debt
# that collapses your reasoning later.
This dependency structure is why improvisation fails so consistently. When you skip requirements clarification and jump to architecture, you're building step 4's output without step 1's inputs. You're drawing boxes in a vacuum. And when the interviewer asks you to justify why you chose a relational database over a document store, you have no requirements to point to. You're guessing, and both of you know it.
Quick Reference Card: Why Structure Beats Improvisation
| Dimension | ❌ Improvised Approach | ✅ Structured Approach |
|---|---|---|
| Scope | Assumed, often wrong | Explicitly confirmed |
| Time Use | Unbalanced, rushed finish | Intentionally allocated |
| Communication | Reactive, fragmented | Proactive, narrated |
| Decisions | Unjustified, pattern-matched | Reasoned from constraints |
| Signal Quality | Ambiguous to interviewer | Clear, assessable |
| Candidate Anxiety | High (constant guessing) | Lower (clear next step) |
Setting Up for What Comes Next
The sections that follow will take you through each step of this framework in detail. You'll see concrete examples of what good clarifying questions look like and how interviewers respond to them. You'll learn how to do back-of-the-envelope math quickly and confidently. You'll practice defining API contracts before you've drawn a single box on the whiteboard.
But all of that only works if you internalize the foundation being built here: system design interviews are process evaluations, not knowledge tests. The engineer who wins isn't the one who knows the most; it's the one who demonstrates the clearest thinking under ambiguous conditions.
Every time you sit down to practice, your goal isn't to design the perfect system. Your goal is to practice leading an open-ended problem through a structured process until it becomes muscle memory. The moment that process becomes automatic, your interview performance will stop depending on which topic you're asked about β because the process works on all of them.
💡 Pro Tip: The next time you practice a design problem, time-box each step explicitly. Give yourself 5 minutes for requirements clarification before you're allowed to touch the architecture. This artificial constraint will feel uncomfortable at first. That discomfort is the training.
Let's build your process, one step at a time.
The Five-Step Design Process Framework
A system design interview is not a test of memorization; it is a test of structured thinking under pressure. The difference between a candidate who impresses and one who struggles often comes down to whether they have a repeatable process they can apply confidently to any problem. This section introduces the five-step framework that transforms an open-ended, ambiguous question like "Design Twitter" into a coherent, professional engineering narrative.
Think of this framework as a scaffolding system. Each step builds upon the last, and each one serves a specific purpose in shaping your answer from vague idea to concrete architecture. Skipping steps, or executing them out of order, is precisely how strong engineers give weak interviews.
+------------------------------------------------------------------+
|                  THE FIVE-STEP DESIGN PROCESS                    |
|                                                                  |
|   1. CLARIFY REQUIREMENTS  --->  What are we building?           |
|            |                                                     |
|            v                                                     |
|   2. ESTIMATE SCALE        --->  How big will it get?            |
|            |                                                     |
|            v                                                     |
|   3. DEFINE THE API        --->  What does it expose?            |
|            |                                                     |
|            v                                                     |
|   4. HIGH-LEVEL ARCH.      --->  What are the moving parts?      |
|            |                                                     |
|            v                                                     |
|   5. IDENTIFY BOTTLENECKS  --->  Where will it break?            |
+------------------------------------------------------------------+
Each step is a gate. You do not move to the next step until the current one is solid. This discipline is what signals senior engineering thinking to your interviewer, because senior engineers know that decisions made early constrain all decisions that follow.
Step 1: Clarify Requirements
The very first thing you do after hearing the problem is resist the urge to start drawing boxes. Instead, you ask questions. Not random questions β targeted questions that separate functional requirements from non-functional requirements and establish the boundaries of what you are being asked to design.
Functional requirements describe what the system does: the specific behaviors and features it must support. For a URL shortener, a functional requirement might be "users can submit a long URL and receive a short one" or "users can be redirected when they visit a short URL."
Non-functional requirements describe how the system performs: its quality attributes and operational constraints. These include availability targets (e.g., 99.99% uptime), latency expectations (e.g., redirect must complete in under 100ms), consistency guarantees (e.g., eventual vs. strong consistency), and durability requirements (e.g., links must never expire unless explicitly deleted).
🎯 Key Principle: Non-functional requirements are often more architecturally consequential than functional ones. A system that needs 99.999% availability looks fundamentally different from one that tolerates 99.9%, even if both do the same thing functionally.
Constraints are a third category worth separating explicitly. Constraints are the hard limits placed on your design by the business or operational context: budget, existing infrastructure, team size, regulatory compliance (GDPR, HIPAA), or the time you have in the interview to explore a problem.
Here is a quick example of how requirements questioning plays out in practice:
Interviewer: "Design a notification system."
You: "Great. Before I start, I want to clarify scope. Are we supporting push notifications, email, SMS, or all three? Should notifications be delivered in real time or can we tolerate some delay? Do we need guaranteed delivery, or is best-effort acceptable? How many users are we serving?"
In under 60 seconds, you have transformed a vague prompt into a scoped engineering problem. That is the entire purpose of Step 1.
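If it helps to see the three categories side by side, here is a hypothetical scoping sheet for the notification-system prompt above (all specific values are illustrative assumptions, not canonical answers):

```python
# Hypothetical scoping notes for the notification-system prompt.
# The category split (functional / non-functional / constraints) mirrors Step 1.
requirements = {
    "functional": [
        "send push notifications",
        "send email and SMS",            # confirmed during clarification
    ],
    "non_functional": {
        "delivery": "at-least-once, near-real-time",
        "availability": "99.9%",
        "scale": "50M users",            # assumed for illustration
    },
    "constraints": [
        "45-minute interview: depth over breadth",
        "GDPR applies to user contact data",
    ],
}

for category, items in requirements.items():
    print(f"{category}: {items}")
```

Writing the categories down as you ask keeps you from silently assuming scope, which is the failure mode Step 1 exists to prevent.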
⚠️ Common Mistake: Assuming the scope instead of asking. Many candidates hear "Design Instagram" and immediately begin designing photo upload pipelines, feeds, and recommendation systems, when the interviewer only wanted to explore the photo storage layer. Always confirm scope.
Step 2: Estimate Scale
Once you know what you are building and what constraints apply, you need to understand the size of the problem. Back-of-the-envelope estimation is the skill of producing approximate but directionally correct numbers using simple arithmetic and a handful of memorized constants.
Estimation serves two purposes. First, it grounds your architectural choices in reality; there is no point proposing a single-machine SQLite database for a system handling 10 million writes per day. Second, it demonstrates quantitative reasoning, which is a core competency of senior engineers.
The four dimensions you estimate in almost every system design problem are:
- Users: How many monthly active users (MAU) and daily active users (DAU)?
- Traffic: How many requests per second (RPS) at peak load?
- Storage: How many bytes do you need to store, and how fast does that grow?
- Bandwidth: How much data flows in and out per second?
Here is how this looks for a URL shortener with 100 million DAU and an assumption that each user creates one link per month and clicks ten links per day:
Write volume:  100M users × 1 URL/month / 30 days = ~3.3M new URLs/day
               3.3M / 86,400 sec = ~38 writes/second

Read volume:   100M users × 10 clicks = 1B redirects/day
               1B / 86,400 sec = ~11,574 reads/second
               -> round to ~12,000 RPS

Storage:       Each URL record ≈ 500 bytes
               3.3M URLs/day × 500 B = ~1.65 GB/day
               5-year retention = 1.65 × 365 × 5 ≈ 3 TB

Bandwidth:     12,000 reads/sec × 500 B = 6 MB/s outbound
These numbers immediately tell you something important: this system is read-heavy (reads outnumber writes by roughly 300:1). That single insight will shape every major architectural decision that follows: you will lean toward caching, read replicas, and CDN offloading rather than write-optimized storage engines.
💡 Pro Tip: You do not need exact numbers. You need order-of-magnitude accuracy. The difference between 10,000 RPS and 12,000 RPS is irrelevant. The difference between 100 RPS and 100,000 RPS is everything. Always round aggressively and state your assumptions aloud.
Below is a small Python snippet that captures the kind of mental model you can apply rapidly during estimation. You would not write code in the interview, but thinking through estimation programmatically helps you internalize the structure:
# Back-of-envelope estimation helper.
# These are the kinds of calculations you run mentally in an interview.

# Constants worth memorizing
SECONDS_PER_DAY = 86_400
KB = 1_024
MB = 1_024 * KB
GB = 1_024 * MB

def estimate_system(dau, reads_per_user_per_day, writes_per_user_per_month, record_size_bytes):
    """Produce a quick scale estimate for a read/write system."""
    # Traffic
    daily_reads = dau * reads_per_user_per_day
    read_rps = daily_reads / SECONDS_PER_DAY
    daily_writes = (dau * writes_per_user_per_month) / 30
    write_rps = daily_writes / SECONDS_PER_DAY

    # Storage (5-year projection)
    daily_storage_bytes = daily_writes * record_size_bytes
    five_year_storage_gb = (daily_storage_bytes * 365 * 5) / GB

    # Bandwidth
    outbound_mbps = (read_rps * record_size_bytes) / MB

    print(f"Read RPS:           {read_rps:,.0f}")
    print(f"Write RPS:          {write_rps:,.0f}")
    print(f"5-Year Storage:     {five_year_storage_gb:,.1f} GB")
    print(f"Outbound Bandwidth: {outbound_mbps:,.2f} MB/s")
    print(f"Read/Write Ratio:   {read_rps / write_rps:.0f}:1")

# URL shortener example
estimate_system(
    dau=100_000_000,
    reads_per_user_per_day=10,
    writes_per_user_per_month=1,
    record_size_bytes=500
)
Running this produces output consistent with the manual calculation above and makes the read-heavy nature explicit. That read/write ratio line is the single most important diagnostic output.
🤔 Did you know? Google engineers famously use a set of "numbers every engineer should know": latency figures for L1 cache reads, SSD random reads, network round trips, and disk seeks. Knowing that an SSD random read takes ~100 microseconds but a disk seek takes ~10 milliseconds gives you the intuition to choose storage technologies wisely without running benchmarks in the interview.
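As a rough sketch, those figures can be kept as order-of-magnitude constants (values approximate, from the widely circulated latency list; treat them as intuition aids, not benchmarks):

```python
# Approximate latency constants (nanoseconds); order-of-magnitude only.
LATENCY_NS = {
    "L1 cache reference":            0.5,
    "main memory reference":         100,
    "SSD random read":               100_000,      # ~100 microseconds
    "round trip in same datacenter": 500_000,      # ~0.5 ms
    "disk seek":                     10_000_000,   # ~10 ms
    "round trip US <-> Europe":      150_000_000,  # ~150 ms
}

ssd = LATENCY_NS["SSD random read"]
disk = LATENCY_NS["disk seek"]
print(f"A disk seek is ~{disk / ssd:.0f}x slower than an SSD random read")
```

That 100x gap between SSD and disk is the kind of ratio that should come to mind instantly when you are choosing a storage tier.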
Step 3: Define the API Contract
With requirements scoped and scale estimated, you are ready to define the API contract: the formal boundary between your system and the outside world. This step is frequently skipped by candidates, and that is a serious mistake. Defining the API before designing the internals forces clarity about what the system must actually do, expressed in unambiguous terms.
An API contract specifies three things for each operation:
- Inputs: what the caller provides (parameters, types, authentication context)
- Outputs: what the system returns (response shape, status codes, error formats)
- System boundaries: what this system is responsible for versus what it delegates to other services
For a URL shortener, the API contract might look like this:
POST /shorten
Input: { long_url: string, custom_alias?: string, expiry_days?: int }
Output: { short_code: string, short_url: string, expires_at?: ISO8601 }
Errors: 400 (invalid URL), 409 (alias already taken), 401 (unauthenticated)
GET /{short_code}
Input: short_code (path param), User-Agent (header)
Output: HTTP 301/302 redirect to long_url
Errors: 404 (code not found), 410 (expired)
GET /analytics/{short_code}
Input: short_code, date_range
Output: { clicks: int, unique_visitors: int, top_referrers: [] }
Errors: 404, 403 (not owner)
Notice what defining this contract reveals immediately: the GET /{short_code} endpoint needs to be extremely fast (it is on the critical path of every redirect), while GET /analytics can tolerate more latency since it is not user-blocking. That asymmetry directly influences your architecture: you will cache redirect lookups aggressively while analytics queries run against a slower, cheaper store.
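To make the asymmetry concrete, here is a minimal sketch of the cache-aside pattern on the redirect path, using a plain dict to stand in for Redis (names and data are illustrative):

```python
# A sketch of cache-aside on the redirect critical path.
# A plain dict stands in for Redis; `db` stands in for the SQL store.
cache = {}
db = {"abc123": "https://example.com/very/long/path"}

def resolve(short_code):
    # Critical path: check the cache first
    if short_code in cache:
        return cache[short_code]
    # Cache miss: fall back to the slower durable store, then populate
    long_url = db.get(short_code)
    if long_url is not None:
        cache[short_code] = long_url
    return long_url

print(resolve("abc123"))  # miss: hits the DB, populates the cache
print(resolve("abc123"))  # hit: served from the cache
```

The analytics path gets no such treatment; it can afford to query the slower store directly because no user is waiting on it.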
💡 Real-World Example: At companies like Stripe and Twilio, the API contract is often designed before implementation begins, as a living document that both client and server teams agree on. This "API-first" approach is exactly what you are mimicking in Step 3. It signals that you think like a platform engineer, not just an implementer.
Here is a more formal representation of the same contract using a Python dataclass pattern, which makes input/output types explicit:
from dataclasses import dataclass
from typing import Optional
from datetime import datetime

# --- Request models ---

@dataclass
class ShortenRequest:
    long_url: str                    # Must be a valid URL
    custom_alias: Optional[str]      # If None, system generates a code
    expiry_days: Optional[int]       # If None, link never expires
    user_id: str                     # Extracted from auth token, not user-supplied

@dataclass
class AnalyticsRequest:
    short_code: str
    start_date: datetime
    end_date: datetime
    requesting_user_id: str          # For ownership validation

# --- Response models ---

@dataclass
class ShortenResponse:
    short_code: str                  # e.g., "abc123"
    short_url: str                   # e.g., "https://sho.rt/abc123"
    expires_at: Optional[datetime]   # None if permanent

@dataclass
class AnalyticsResponse:
    short_code: str
    total_clicks: int
    unique_visitors: int
    top_referrers: list[str]         # Top 5 referring domains
    breakdown_by_day: list[dict]     # [{"date": "2024-01-01", "clicks": 42}]
This level of specificity takes only a few minutes to sketch in an interview, but it communicates volumes. It shows that you think about error cases, authentication context, and the difference between what callers supply and what the system derives internally.
⚠️ Common Mistake: Conflating the API layer with the internal architecture. Your API contract says what the system does from the outside. It says nothing about how the data flows internally. Candidates who jump from API to database schema without acknowledging this distinction often paint themselves into design corners.
🧠 Mnemonic: I.O.B. stands for Inputs, Outputs, Boundaries. For every API endpoint you define, answer those three letters before moving on.
Step 4: High-Level Architecture Overview
With requirements, scale numbers, and an API contract in place, you have earned the right to start drawing boxes. High-level architecture is the bird's-eye view of your system: the major components and how they connect, without yet diving into the internals of any one component.
At this stage, you are sketching a diagram that shows:
- Clients: who calls the system (web browsers, mobile apps, other services)
- Entry points: load balancers, API gateways, CDN edges
- Core services: the primary application logic components
- Data stores: what kind of storage each component uses and why
- Async infrastructure: message queues, event streams, background workers
+----------+      +--------------+      +-----------------+
|  Client  |----->| API Gateway  |----->|   App Service   |
| (Browser)|      | (Rate Limit) |      |   (Shortener)   |
+----------+      +--------------+      +--------+--------+
                                                 |
                   +------------------+----------+-------+
                   |                  |                  |
             +-----v-----+     +------v-------+    +-----v------+
             |   Cache   |     |   Primary    |    |  Message   |
             |  (Redis)  |     |   DB (SQL)   |    |   Queue    |
             +-----------+     +--------------+    +-----+------+
                                                         |
                                                  +------v------+
                                                  |  Analytics  |
                                                  |   Worker    |
                                                  +-------------+
The purpose of this diagram is not to be perfect; it is to establish a shared mental model with your interviewer before you start drilling into any one component. Think of it as the map you show before leading a hike. You would not start hiking without showing the trail first.
💡 Mental Model: Your high-level architecture is a promise to your interviewer. You are saying: "Here are all the moving parts I believe this system needs. I will now justify each one and explore the hard parts." Every box in your diagram should appear for a reason; if you cannot explain why a component exists, remove it.
π― Key Principle: At the high-level stage, technology choices are less important than structural choices. Whether you use PostgreSQL or MySQL matters far less than whether you need a relational database at all versus a document store or a wide-column store. Make structural decisions first; defer technology selections until you have deeper justification.
Because high-level architecture and bottleneck resolution each deserve deep treatment, they are explored fully in the child lessons that follow this one. For now, understand their role in the sequence: Step 4 gives you a canvas, and Step 5 identifies where that canvas needs reinforcement.
Step 5 β Identify and Resolve Bottlenecks
No first-draft architecture is correct. Step 5 is the iterative refinement pass where you stress-test your own design. You ask: Where will this system fail? Under what load does a component become the limiting factor? What happens when a service goes down?
Bottleneck identification is a structured scan across three failure dimensions:
- π Throughput bottlenecks β components that cannot handle the RPS you estimated in Step 2
- π§ Latency bottlenecks β components that add unacceptable delay to the critical path
- π― Single points of failure (SPOF) β components whose failure would bring down the entire system
For each bottleneck you identify, you propose a resolution: horizontal scaling, caching layers, database sharding, read replicas, circuit breakers, async processing, or geographic distribution. The resolution then becomes part of your refined architecture, which may surface new bottlenecks β hence the iterative nature.
π‘ Pro Tip: When you identify a bottleneck in an interview, name it explicitly before proposing the fix. Say: "I see a potential bottleneck here β the database is the single point of contention for all reads and writes. I would resolve this by introducing a read replica tier and a Redis cache in front of it." This narration proves you are reasoning, not just reciting patterns.
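Of the resolutions listed above, the circuit breaker is the least visual and the most often hand-waved. Here is a minimal sketch of the core state machine, assuming a simple trip-after-N-consecutive-failures policy; the thresholds are illustrative, not standard values:

```python
import time

class CircuitBreaker:
    """Trip after N consecutive failures, fail fast during a cooldown,
    then allow a trial call through (half-open) once the cooldown ends."""

    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None   # None = circuit closed (healthy)

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None   # cooldown over: half-open, try once
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0           # any success closes the circuit
        return result
```

Production resilience libraries add probation windows, metrics, and per-endpoint tuning, but this open/closed/half-open state machine is the pattern interviewers expect you to be able to describe.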
Like Step 4, bottleneck resolution is covered in dedicated depth in a later lesson. The key takeaway here is positional: Step 5 always comes after you have a complete picture, not before. Candidates who start optimizing before they have sketched the full system waste time solving problems that may not exist.
How the Five Steps Form a Cohesive Narrative
The power of this framework is not in any individual step β it is in how the steps chain together to produce a design narrative that feels inevitable and professional.
REQUIREMENTS → SCALE → API → ARCHITECTURE → BOTTLENECKS

Step           Asks             Answers
------------   --------------   --------------------------------------------------
Requirements   What?            "Are we building the right thing?"
Scale          How big?         "Do we need sharding?" "Do we need caching?"
API            What boundary?   "What exact contract must the system honor?"
Architecture   What parts?      "How do the parts fit together?"
Bottlenecks    What breaks?     "Is this design actually resilient? Can it scale?"
Notice how each step answers a distinct question and how the answer to each question constrains the next. Requirements constrain scale estimates (you cannot estimate traffic without knowing what operations users perform). Scale estimates constrain the API design (a 99th-percentile latency requirement influences how you structure synchronous vs. asynchronous endpoints). The API contract constrains the architecture (the data shapes in your responses influence your storage model). And the architecture constrains your bottleneck analysis (you can only identify bottlenecks in things that exist).
β Wrong thinking: "I'll figure out the requirements as I go β let me just start with the database schema."
β Correct thinking: "I need to know what I'm building, at what scale, with what contract, before I make any storage decisions."
π Quick Reference Card:
| π’ Step | π― Purpose | β±οΈ Interview Time | π€ Output |
|---|---|---|---|
| π§ 1. Clarify Requirements | Define what to build | 3β5 min | Functional + non-functional list |
| π 2. Estimate Scale | Size the system | 3β5 min | RPS, storage, bandwidth numbers |
| π§ 3. Define API | Set system boundary | 3β5 min | Endpoint definitions |
| π― 4. High-Level Arch | Sketch the solution | 5β10 min | Component diagram |
| π 5. Bottlenecks | Stress-test and refine | 5β10 min | Hardened architecture |
Applied consistently, this five-step process transforms a 45-minute interview into a structured engineering conversation. The interviewer is not watching you produce a perfect answer β they are watching you think. A repeatable process makes your thinking visible, and visible thinking is what wins system design interviews.
Clarifying Requirements and Scoping the Problem
Before you draw a single box, write a single service name, or mention a single database, you need to do something that separates senior engineers from junior ones in a system design interview: stop and ask questions. This phase β clarifying requirements and scoping the problem β is where interviews are quietly won or lost. A candidate who dives straight into architecture is essentially building a house without measuring the lot. A candidate who methodically uncovers what the system actually needs to do, and how well it needs to do it, signals that they understand engineering is fundamentally a problem of constraints.
This section gives you a battle-tested framework for doing exactly that.
Functional vs. Non-Functional Requirements: The Foundational Distinction
The first mental model you need is the distinction between two very different types of requirements. These are not interchangeable, and conflating them is one of the most common early mistakes candidates make.
Functional requirements describe what the system does β the specific behaviors and features users can observe and interact with. If you were designing Twitter, functional requirements might be: users can post tweets, users can follow other users, users can see a timeline of tweets from people they follow.
Non-functional requirements describe how well the system does it β the quality attributes that constrain your design choices. For the same Twitter system: the timeline must load in under 200ms, the system must handle 100 million daily active users, tweet delivery must be eventually consistent (not strongly consistent).
π― Key Principle: Functional requirements shape what you build. Non-functional requirements shape how you build it. Your architecture decisions β which databases you choose, whether you use synchronous or asynchronous communication, how you partition data β are almost entirely driven by non-functional requirements.
Think of it this way:
FUNCTIONAL NON-FUNCTIONAL
βββββββββββββββββ ββββββββββββββββββββββββββ
Users can upload photos Uploads must complete < 2s
Users can search posts Search latency < 100ms p99
Users can send DMs Messages delivered reliably
Admin can ban accounts System available 99.99% uptime
A helpful mnemonic for remembering the non-functional categories:
π§ Mnemonic: SCALPS β Scalability, Consistency, Availability, Latency, Performance, Security. Whenever you are probing for non-functional requirements, run through SCALPS mentally to make sure you haven't missed a dimension.
The Questions You Should Always Ask
Not every question is equally valuable in an interview context. You have limited time β usually 45 to 60 minutes total β so your questions need to be surgical. Below are the four most high-leverage question categories, each with example phrasings you can use verbatim.
Read/Write Ratio
The read/write ratio tells you whether your system is read-heavy, write-heavy, or balanced. This single number influences almost every major architectural decision: whether to use a relational or document database, whether to invest heavily in caching, whether to replicate data aggressively, and where to place your bottlenecks.
"Is this system predominantly read-heavy, write-heavy, or roughly balanced? For example, are we expecting 10 reads for every write, or something more like 100:1?"
A URL shortener like bit.ly is heavily read-skewed (people click links far more than they create them). A logging pipeline is heavily write-skewed. A collaborative document editor is closer to balanced. Each profile implies a different caching strategy, replication model, and database selection.
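Translating a stated ratio into concrete numbers takes seconds. A small helper makes the move explicit; the 10:1 classification cutoffs are an assumption for illustration, not an industry standard:

```python
def traffic_profile(write_qps, read_to_write_ratio):
    """Turn a write rate plus a read:write ratio into both rates and a label."""
    read_qps = write_qps * read_to_write_ratio
    if read_to_write_ratio > 10:
        profile = "read-heavy"       # e.g. a URL shortener
    elif read_to_write_ratio < 0.1:
        profile = "write-heavy"      # e.g. a logging pipeline
    else:
        profile = "balanced"         # e.g. a collaborative editor
    return read_qps, profile

print(traffic_profile(1_000, 100))  # (100000, 'read-heavy')
```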
Expected Scale
Scale encompasses both the volume of data and the volume of traffic. You need both dimensions because a system with 10 million users but only 1 KB of data per user is a completely different design problem from a system with 1 million users who each generate 10 GB of video.
"What's our expected number of daily active users? And for the core write operation β say, posting a message β what's our peak requests-per-second target? Are we expecting that to grow significantly in the next two years?"
π‘ Pro Tip: Interviewers often don't have a precise answer prepared. That's fine β push for an order of magnitude. "Millions or tens of millions?" is a perfectly reasonable follow-up. The goal is to get within one order of magnitude so your estimation work in the next phase is meaningful.
Consistency vs. Availability Trade-offs
This question probes the CAP theorem implications for your system. In a distributed system, when a network partition occurs, you must choose between maintaining consistency (every read sees the most recent write) or availability (every request gets a response, even if stale). Most real systems live on a spectrum between strong consistency and eventual consistency.
"If two users are looking at the same piece of data, is it acceptable for them to temporarily see different values? In other words, can we tolerate eventual consistency, or does this domain require strong consistency?"
For a bank balance, the answer is almost always strong consistency β showing a stale balance is dangerous. For a "likes count" on a social post, eventual consistency is perfectly fine. Getting this answer early prevents you from over-engineering consistency where it isn't needed, or dangerously under-engineering it where it is.
Geography
Geographic distribution affects latency, data sovereignty, and disaster recovery strategy. A system designed for users in a single country has radically different infrastructure requirements than one designed for a global audience.
"Are we serving a global user base, or are users concentrated in a specific region? Do we have data residency requirements β for example, EU user data must stay in Europe?"
This single question can introduce or eliminate multi-region replication, CDN requirements, and compliance architecture from your design.
Scoping Techniques: MVPs and the Parking Lot
Once you have a picture of what the system should do and how well it should do it, you face a second challenge: the problem is almost certainly too large to design end-to-end in 45 minutes. Scoping is how you manage that reality professionally.
MVP feature prioritization means identifying the minimum set of features that constitute the core system β the features without which the product is meaningless β and explicitly agreeing with your interviewer that you will design for those first. Everything else is a nice-to-have that you may revisit if time allows.
For a ride-sharing app like Uber, the MVP might be:
- Rider can request a ride
- Nearby driver receives the request and accepts
- Rider can see driver ETA
- Ride completes and rider is charged
Notably absent from the MVP: surge pricing, promotions, driver ratings, trip history, support chat, and carpooling. Those are real features β but they are not the core system.
The technique for managing the excluded features is the parking lot β an explicit, visible list of features you have acknowledged but chosen not to design in this session. Naming your parking lot does something powerful: it proves to the interviewer that you are aware the full system is larger than what you're designing. You're not ignorant of the complexity; you're managing it deliberately.
+----------------------------------------------------+
|                   SCOPE BOUNDARY                   |
|                                                    |
|  IN SCOPE (MVP)            PARKING LOT             |
|  --------------            -----------             |
|  • Request a ride          • Surge pricing         |
|  • Driver matching         • Driver ratings        |
|  • Real-time ETA           • Trip history API      |
|  • Payment processing      • Promotional codes     |
|                            • Carpooling / pooling  |
+----------------------------------------------------+
π‘ Real-World Example: In a Google interview for designing Google Maps, a strong candidate explicitly said: "I'm going to scope this to turn-by-turn navigation and ETA calculation. I'm parking real-time traffic updates, public transit routing, and offline maps β I'll come back to those if we have time." This took 30 seconds and immediately established credibility.
Capturing Requirements as a Structured Checklist
During the actual interview, you should be writing your requirements down as you discover them. Don't trust your working memory when you're nervous and time-pressured. A structured requirements checklist serves as your contract with the interviewer β both of you can see what you agreed to design, and you can refer back to it if the conversation drifts.
Here is a pseudoschema-style checklist format you can adapt and write on a whiteboard or shared document in real time:
## ============================================
## SYSTEM DESIGN REQUIREMENTS CHECKLIST
## System: URL Shortener (e.g., bit.ly)
## ============================================

FUNCTIONAL_REQUIREMENTS = [
    "POST /shorten → accepts long URL, returns short code",
    "GET /{code} → redirects to original long URL",
    "(Parking Lot) Analytics dashboard per short link",
    "(Parking Lot) Custom alias support",
    "(Parking Lot) Link expiration",
]

NON_FUNCTIONAL_REQUIREMENTS = {
    "scale": {
        "dau": "100M daily active users",
        "write_qps": "~1,000 new URLs shortened per second",
        "read_qps": "~100,000 redirects per second",  # 100:1 read:write
        "data_retention": "URLs stored indefinitely",
    },
    "latency": {
        "redirect_p99": "< 10ms",   # Users expect instant redirect
        "shorten_p99": "< 500ms",
    },
    "availability": "99.99% uptime (Four Nines)",
    "consistency": "Eventual consistency acceptable",
    "geography": "Global, CDN-friendly",
    "security": "No NSFW/malicious URL filtering in scope",
}
This code block is not meant to run β it's a structured notation for capturing requirements clearly during the interview. Using dictionary and list syntax makes the structure explicit and scannable. The interviewer can read it at a glance and correct any misunderstandings before you've spent 20 minutes designing the wrong system.
Here's a more general template you can memorize and adapt to any problem:
## ============================================
## GENERIC REQUIREMENTS CAPTURE TEMPLATE
## ============================================

class SystemDesignRequirements:
    # --- FUNCTIONAL (What it does) ---
    core_features = []   # Must-have for MVP
    parking_lot = []     # Acknowledged, out of scope today

    # --- NON-FUNCTIONAL (How well it does it) ---
    scale = {
        "users": None,        # DAU or MAU?
        "read_qps": None,     # Reads per second
        "write_qps": None,    # Writes per second
        "storage_gb": None,   # Total data volume estimate
    }
    latency = {}          # Per-operation SLA targets
    availability = None   # e.g., 99.9%, 99.99%
    consistency = None    # Strong / eventual / session
    geography = None      # Single region / multi-region / global
    compliance = []       # GDPR, HIPAA, SOC2, etc.
Note how the template includes compliance as a field. In real interviews, regulatory requirements are often forgotten entirely. A healthcare system subject to HIPAA has fundamentally different storage and access-logging requirements. Asking about compliance once β even if the answer is "none for this exercise" β demonstrates the kind of holistic thinking interviewers reward.
The Most Dangerous Mistake: Designing for Assumptions
Everything above is preparation against one catastrophic failure mode: designing for unstated assumptions instead of confirmed requirements.
Here is how this failure unfolds in practice. The interviewer says: "Design a notification system." The candidate thinks: This is like the push notification system at my last job. I'll design for mobile push at scale. They spend 20 minutes designing a sophisticated APNs/FCM pipeline. Then the interviewer says: "Actually, this is for email and SMS notifications for a banking app β mobile push isn't in scope." Twenty minutes. Gone.
β οΈ Common Mistake: Every detail you assume is a gamble you take with your limited interview time. Interviewers deliberately leave problem statements vague to test whether you'll ask clarifying questions or charge ahead blindly.
Here are three specific assumption traps to avoid:
β Wrong thinking: "It says 'messaging system,' so it's probably like WhatsApp β I'll design for real-time chat." β Correct thinking: "Before I start, can you tell me what kind of messages we're delivering? Real-time chat, async notifications, transactional emails, or something else?"
β Wrong thinking: "They didn't mention scale, so I'll just design for millions of users to be safe." β Correct thinking: "What's the expected user scale? I want to make sure I'm solving the right problem β a 10,000 user internal tool and a 100 million user consumer app require very different architectures."
β Wrong thinking: "Consistency is probably important, so I'll use a strongly consistent distributed transaction system." β Correct thinking: "Does this domain require strong consistency, or is eventual consistency acceptable? I want to confirm before I choose a storage model."
π€ Did you know? A study of technical interview feedback at major tech companies consistently lists "made too many assumptions" as one of the top reasons candidates fail system design rounds β not because their architecture was wrong, but because they were solving the wrong problem confidently.
Putting It All Together: A Requirements Discovery Dialogue
Let's walk through what a well-executed requirements phase sounds like as a continuous conversation. The interviewer has said: "Design Instagram."
Candidate: "Great. Before I start drawing anything, I'd like to spend a few minutes clarifying requirements. Is that okay?"
Interviewer: "Go ahead."
Candidate: "For core functionality β should I focus on the main user-facing features: posting photos, following users, and viewing a home feed? Or are there other features that are must-haves for this exercise?"
Interviewer: "Those three, plus photo upload, are the core."
Candidate: "Got it. I'll park stories, reels, DMs, explore page, and advertising β I can revisit those later. On scale: are we talking a startup-scale Instagram or production-scale, like a billion monthly active users?"
Interviewer: "Let's say 500 million DAU."
Candidate: "And is this a read-heavy system? I'd expect photo views and feed loads vastly outnumber new posts."
Interviewer: "Roughly 100:1 reads to writes, yes."
Candidate: "For the feed, is it acceptable for a user to see posts that are a few seconds stale, or do we need real-time consistency?"
Interviewer: "Eventual consistency is fine for the feed."
Candidate: "Global deployment? Are we serving users in Asia, Europe, and the Americas?"
Interviewer: "Yes, global."
Candidate: "Perfect. Let me write that down..."
In under two minutes, the candidate has confirmed: MVP scope, parking lot items, scale (500M DAU), read/write ratio (100:1), consistency model (eventual), and geography (global). That is a complete non-functional requirements profile. Everything that follows β estimation, API design, architecture β builds on solid, confirmed ground.
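Written down in the checklist notation from earlier, the confirmed profile fits in a few lines:

```python
# Everything below was confirmed explicitly by the interviewer in the dialogue.
INSTAGRAM_REQUIREMENTS = {
    "mvp":              ["post photos", "follow users", "home feed", "photo upload"],
    "parking_lot":      ["stories", "reels", "DMs", "explore page", "advertising"],
    "scale":            "500M DAU",
    "read_write_ratio": "100:1",
    "consistency":      "eventual (feed)",
    "geography":        "global",
}
```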
π Quick Reference Card: Requirements Clarification Checklist
| π§ Category | π― Question to Ask | π Why It Matters |
|---|---|---|
| π§ Core Features | What are the must-have behaviors? | Defines what you're designing |
| π ΏοΈ Scope | What's explicitly out of scope? | Prevents scope creep |
| π Scale | DAU, peak QPS, data volume? | Drives capacity and architecture |
| βοΈ Read/Write | What's the read-to-write ratio? | Cache strategy, DB selection |
| π Consistency | Strong or eventual? | Replication model, DB type |
| β Availability | What's the uptime SLA? | Redundancy and failover design |
| π Geography | Single region or global? | CDN, multi-region replication |
| π Compliance | GDPR, HIPAA, SOC2? | Storage, access logging, encryption |
The discipline of clarifying requirements before designing is ultimately a form of professional respect β for the interviewer's time, for the complexity of the problem, and for the users who will depend on the system you design. Every minute you spend confirming requirements in this phase saves ten minutes of redesigning architecture that solves the wrong problem. When you move into the next phase β estimation and API design β you will do so with a solid, confirmed foundation. That confidence is visible to interviewers, and it is exactly what senior-level engineering judgment looks like in practice.
Back-of-the-Envelope Estimation and API Contract Design
You have clarified the requirements and agreed on scope with your interviewer. Now what? Many candidates make the mistake of jumping straight to drawing boxes on the whiteboard β databases here, caches there, load balancers everywhere. The result looks impressive for about thirty seconds, until the interviewer asks: "How many requests per second are you designing for?" Silence. The architecture was never grounded in reality.
The two steps covered in this section β back-of-the-envelope estimation and API contract design β are the bridge between requirements and architecture. Estimation tells you how big the system needs to be. API design tells you what the system does from the outside. Together they give your subsequent architectural decisions a foundation that is both quantitative and contractual.
The Estimation Toolkit: Building Blocks Every Engineer Must Know
Good estimation is not guessing. It is structured reasoning with a small set of memorized building blocks that you combine quickly. There are three categories worth internalizing before any interview.
Powers of Two
Powers of two are the universal language of computer storage and throughput. You do not need a calculator if you have internalized the key reference points:
Value             Power of 2   Approximate size
--------------    ----------   ----------------
1,000             2^10         1 Kilobyte (KB)
1,000,000         2^20         1 Megabyte (MB)
1,000,000,000     2^30         1 Gigabyte (GB)
1 trillion        2^40         1 Terabyte (TB)
1 quadrillion     2^50         1 Petabyte (PB)
The practical trick: when multiplying or dividing by thousands during estimation, just shift between these named units. A system storing 1 KB per user record with 500 million users stores 500 million KB = 500 GB of data, no calculator required.
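The unit-shift trick is worth checking once in code so the mental move feels trustworthy:

```python
# Decimal units, the convention used for capacity estimation.
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12

users = 500_000_000
record_size = 1 * KB
total_bytes = users * record_size

print(total_bytes // GB)  # 500  -> "million" shifts KB up two units, to GB
```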
Latency Numbers Every Engineer Should Know
Latency numbers are the second pillar. These figures, popularized by Jeff Dean at Google, give you order-of-magnitude intuition for what operations cost:
Operation                            Approximate Latency
----------------------------------   -------------------
L1 cache reference                   ~0.5 ns
L2 cache reference                   ~7 ns
Main memory (RAM) reference          ~100 ns
SSD random read                      ~150 µs (150,000 ns)
Network round-trip in same DC        ~500 µs
HDD seek                             ~10 ms (10,000,000 ns)
Network round-trip coast-to-coast    ~150 ms
The ratios matter more than the exact numbers. RAM is roughly a thousand times faster than an SSD random read, and an SSD random read beats an HDD seek by well over an order of magnitude. A cross-datacenter network call costs as much as thousands of RAM accesses. These ratios directly inform technology choices: if you need sub-millisecond reads, you need an in-memory store, not a disk-based database.
π§ Mnemonic: "Memory is micro, disk is milli, cross-continent is centi." RAM operations finish in microseconds, disk in milliseconds, and cross-continental round trips in hundreds of milliseconds.
Traffic Math: Seconds in a Day
The third building block is a simple fact: there are approximately 86,400 seconds in a day, which engineers routinely round to 10^5 (100,000) seconds per day for easy mental arithmetic. This one number unlocks traffic calculations:
- 1 million requests/day Γ· 100,000 seconds β 10 QPS (queries per second)
- 100 million requests/day Γ· 100,000 seconds β 1,000 QPS
- 1 billion requests/day Γ· 100,000 seconds β 10,000 QPS
π― Key Principle: Peak traffic is typically 2β3Γ the average. If you estimate 1,000 average QPS, design for 2,000β3,000 peak QPS. Systems that only handle average load fail on Monday mornings and during marketing campaigns.
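Those two rules, divide by 10^5 seconds and then multiply by a peak factor, compress into a helper you can replay mentally:

```python
SECONDS_PER_DAY = 100_000  # ~86,400, rounded for mental arithmetic

def design_qps(requests_per_day, peak_factor=2):
    """Return (average QPS, the peak QPS you should actually design for)."""
    average = requests_per_day / SECONDS_PER_DAY
    return average, average * peak_factor

print(design_qps(100_000_000))  # (1000.0, 2000.0)
```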
Estimation Walkthrough: A URL Shortener Service
Let us apply these building blocks to a concrete example. Suppose you are designing a URL shortener service similar to bit.ly. Your requirements phase established: 100 million new URLs shortened per day, a 100:1 read-to-write ratio, and URLs stored for five years.
Step 1 β Estimate QPS
Write QPS (URL creation):
100 million writes/day Γ· 100,000 seconds/day = 1,000 writes/second
Read QPS (URL redirects), using the 100:1 ratio:
1,000 writes/second Γ 100 = 100,000 reads/second
Peak QPS (applying a 2Γ spike factor):
Peak writes: ~2,000/second
Peak reads: ~200,000/second
This immediately tells you something important: this is a read-heavy system by two orders of magnitude. Your architecture must optimize for reads, not writes. A caching layer becomes almost mandatory.
Step 2 β Estimate Storage
Assume each shortened URL record stores:
- Original long URL: ~500 bytes average
- Short URL key: ~7 bytes
- Metadata (timestamps, user ID): ~100 bytes
- Total per record: ~600 bytes
Daily new records: 100 million
Per-record size: 600 bytes
Daily storage: 100M Γ 600B = 60 GB/day
Five-year storage:
60 GB/day Γ 365 days Γ 5 years = ~109 TB
Rounded: roughly 110 TB of raw storage over five years. This number tells you that a single database server is not enough. You will need sharding or a distributed storage solution. It also tells you the data fits comfortably in cloud object storage if needed.
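That storage total converts directly into a node count once you assume a per-node budget. The 2 TB/node figure below is purely illustrative, not part of the estimate itself:

```python
import math

total_tb = 110        # five-year estimate from above
tb_per_node = 2       # assumed comfortable working set per shard (illustrative)

shards = math.ceil(total_tb / tb_per_node)
print(shards)  # 55 nodes, before any replication factor
```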
Step 3 β Estimate Bandwidth
Inbound bandwidth (URL creation requests):
1,000 writes/second Γ 600 bytes/request β 600 KB/s inbound
Outbound bandwidth (redirect responses): A redirect response is small β just an HTTP 301/302 with a Location header, roughly 500 bytes:
100,000 reads/second Γ 500 bytes = 50 MB/s outbound
This is a modest bandwidth number. A single 1 Gbps network link can handle it. But with 200,000 peak reads/second, you are looking at 100 MB/s peak β well within modern infrastructure but worth noting for CDN and caching strategy.
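The whole walkthrough fits in a dozen lines of arithmetic, which is a good way to sanity-check your whiteboard numbers while practicing:

```python
SECONDS_PER_DAY = 100_000             # rounded from 86,400

writes_per_day = 100_000_000          # 100M new URLs/day (from requirements)
read_write_ratio = 100
record_bytes = 600                    # 500 URL + 7 key + ~100 metadata
redirect_bytes = 500                  # small 301 response with Location header

write_qps = writes_per_day / SECONDS_PER_DAY          # writes per second
read_qps = write_qps * read_write_ratio               # redirects per second
daily_gb = writes_per_day * record_bytes / 10**9      # GB of new data per day
five_year_tb = daily_gb * 365 * 5 / 1_000             # total TB over 5 years
outbound_mb_s = read_qps * redirect_bytes / 10**6     # MB/s outbound

print(write_qps, read_qps, daily_gb, five_year_tb, outbound_mb_s)
# 1000.0 100000.0 60.0 109.5 50.0
```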
Translating Estimates into Design Constraints
Now comes the payoff. Your estimates are not just numbers β they are design constraints that directly drive technology choices:
Estimate                     →  Design Constraint
--------------------------      -----------------------------------------------
100,000 reads/second         →  Need caching layer (Redis/Memcached)
110 TB over 5 years          →  Need distributed DB or sharding strategy
1,000 writes/second          →  Single write primary is feasible initially
600 bytes/record             →  NoSQL key-value store is ideal (simple lookups)
2-orders read/write ratio    →  Read replicas are essential
β οΈ Common Mistake: Presenting estimates without connecting them to decisions. An interviewer watching you calculate 100,000 reads/second wants to hear you say: "This read volume is why I'm going to propose a caching layer in front of the database." Numbers without conclusions are just arithmetic.
π‘ Pro Tip: Write your estimates in a visible corner of the whiteboard and refer back to them when justifying architectural choices. This demonstrates that your design is driven by data, not intuition β a hallmark of senior engineering thinking.
Defining the API Contract
With your estimates on the board, you know the system's scale. Now you define what the system does β its API contract. This step happens before you touch internal architecture, and that ordering is intentional.
π― Key Principle: The API is the system's public promise. It defines what clients can depend on. Everything inside the system β databases, caches, queues β is an implementation detail that can change. The API, once published, cannot break backward compatibility without coordinating with every client.
REST vs RPC: Choosing Your Style
REST (Representational State Transfer) organizes the API around resources and uses HTTP verbs (GET, POST, PUT, DELETE) to express operations. It is the dominant style for public APIs and web services.
RPC (Remote Procedure Call), including gRPC and Thrift, organizes the API around actions rather than resources. It is common in internal microservice communication where performance and strong typing matter more than convention.
For a system design interview, REST is almost always the clearer choice to communicate intent. Define two to four endpoints that cover the core use cases from your requirements.
URL Shortener API Contract
Here is a complete REST API contract for the URL shortener:
## Create a shortened URL
POST /api/v1/urls
Content-Type: application/json
Authorization: Bearer <token>

Request body:
{
    "original_url": "https://www.example.com/very/long/path?with=params",
    "custom_alias": "my-link",    // optional
    "expires_at": "2026-01-01"    // optional, ISO 8601
}

Response 201 Created:
{
    "short_url": "https://short.ly/abc123",
    "short_code": "abc123",
    "original_url": "https://www.example.com/very/long/path?with=params",
    "created_at": "2024-01-15T10:30:00Z",
    "expires_at": "2026-01-01T00:00:00Z"
}

## Redirect a short URL (the core user-facing operation)
GET /{short_code}

Response 301 Moved Permanently:
Location: https://www.example.com/very/long/path?with=params

## Retrieve URL metadata (for analytics dashboard)
GET /api/v1/urls/{short_code}
Authorization: Bearer <token>

Response 200 OK:
{
    "short_code": "abc123",
    "original_url": "https://www.example.com/very/long/path?with=params",
    "click_count": 4821,
    "created_at": "2024-01-15T10:30:00Z",
    "expires_at": "2026-01-01T00:00:00Z"
}

## Delete a short URL
DELETE /api/v1/urls/{short_code}
Authorization: Bearer <token>

Response 204 No Content
Notice what this API definition accomplishes: it forces clarity on data types, authentication requirements, and edge cases (custom aliases, expiration) that were mentioned in requirements but could otherwise drift into the architecture phase as vague assumptions.
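To make the contract concrete, here is a minimal sketch of the redirect hot path, written as a plain function rather than a framework handler, with an in-memory dict standing in for the real store. All names are illustrative:

```python
# Hypothetical in-memory table standing in for the database/cache tier.
URLS = {"abc123": "https://www.example.com/very/long/path?with=params"}

def resolve(short_code):
    """GET /{short_code}: return (status, headers) per the contract above."""
    original = URLS.get(short_code)
    if original is None:
        return 404, {}
    return 301, {"Location": original}

print(resolve("abc123"))
# (301, {'Location': 'https://www.example.com/very/long/path?with=params'})
```

Note that the 301-vs-302 choice is a real design decision: a permanent 301 lets browsers and CDNs cache the redirect, which suits the read-heavy profile but hides repeat clicks from analytics, while a 302 keeps every click visible to the server.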
An RPC Alternative with gRPC
For an internal service β imagine this URL shortener is one microservice among many β you might define the contract using Protocol Buffers (the schema language for gRPC):
// url_shortener.proto
syntax = "proto3";

package urlshortener.v1;

// The URL Shortener service definition
service UrlShortenerService {
  // Create a new shortened URL
  rpc CreateShortUrl(CreateShortUrlRequest) returns (CreateShortUrlResponse);

  // Resolve a short code to the original URL (hot path, must be fast)
  rpc ResolveUrl(ResolveUrlRequest) returns (ResolveUrlResponse);

  // Get metadata and analytics for a URL
  rpc GetUrlMetadata(GetUrlMetadataRequest) returns (UrlMetadata);

  // Delete a shortened URL
  rpc DeleteUrl(DeleteUrlRequest) returns (DeleteUrlResponse);
}

message CreateShortUrlRequest {
  string original_url = 1;     // required
  string custom_alias = 2;     // optional, empty string = auto-generate
  int64 expires_at_unix = 3;   // optional, 0 = no expiration
  string user_id = 4;          // required for authentication
}

message CreateShortUrlResponse {
  string short_code = 1;
  string short_url = 2;        // full URL with domain
  int64 created_at_unix = 3;
}

message ResolveUrlRequest {
  string short_code = 1;
}

message ResolveUrlResponse {
  string original_url = 1;
  bool is_expired = 2;         // client handles expired gracefully
}
The gRPC definition makes the strong typing explicit and generates client/server code automatically. For interview purposes, either format works β what matters is that you define the contract before the architecture.
π‘ Real-World Example: Twitter's internal services use Thrift (a similar RPC framework). When Twitter's timeline service needs to call the tweet storage service, it does so through a strictly versioned Thrift contract. Changing that contract requires coordination across teams β exactly the discipline that API-first thinking enforces.
How the API Contract Prevents Scope Creep
Here is where the API definition step earns its place in the process beyond just being "good engineering practice."
In a system design interview, scope creep is one of the most dangerous failure modes. It happens when the conversation drifts: the interviewer mentions analytics, you start designing a real-time dashboard, and suddenly twenty minutes have passed and you have not touched the core URL shortening architecture.
A defined API contract is your scope anchor. Once you and the interviewer agree on three or four endpoints, you have implicitly agreed on the system boundary. When a new idea emerges, you can evaluate it explicitly: "That would require a new endpoint; should we add it to scope or note it as a future extension?" This keeps the conversation structured and signals that you understand product engineering, not just systems.
Without API contract          With API contract
--------------------          -----------------
Requirements                  Requirements
     |                             |
     v                             v
Architecture <-- drifting     API Contract (agreed)
     |           scope             |
     v                             v
Implementation                Architecture (bounded)
                                   |
                                   v
                              Implementation
⚠️ Common Mistake: Defining too many endpoints. You do not need to cover every conceivable operation. Define the core paths that fulfill your functional requirements. A URL shortener needs URL creation, redirection, and deletion, not seventeen analytics variants. Save depth for architecture.
Putting It Together: The Estimation-to-API Flow
These two steps work as a unified bridge. Here is how they connect in practice during an interview:
REQUIREMENTS PHASE
(previous step)
|
v
+-------------------------------+
| BACK-OF-THE-ENVELOPE |
| ESTIMATION |
| |
| 1. Calculate write QPS |
| 2. Calculate read QPS |
| 3. Estimate storage (5 yr) |
| 4. Estimate bandwidth |
| 5. Derive design constraints |
+-------------------------------+
|
| "Now I know the scale. Let me define
| what the system does externally."
v
+-------------------------------+
| API CONTRACT DESIGN |
| |
| 1. Choose REST or RPC |
| 2. Define 3-5 core endpoints |
| 3. Specify request/response |
| 4. Note auth requirements |
| 5. Agree with interviewer |
+-------------------------------+
|
| "Great. Now I'll design the
| internals to meet this contract
| at the estimated scale."
v
ARCHITECTURE PHASE
(next step)
When you walk into the architecture phase with both a quantitative scale target and a contractual boundary, every architectural decision you make can be justified in two ways: it handles the estimated load, and it supports the agreed-upon API. This dual justification is what separates candidates who receive senior engineer feedback from those who receive "good but unclear reasoning" feedback.
💡 Pro Tip: Before moving from API design to architecture, do a ten-second verbal summary: "So we're designing for 100,000 reads per second, 1,000 writes per second, roughly 110 TB over five years, with these four endpoints as the external contract. Let me now walk through the internal architecture." This resets both you and the interviewer and signals confident, organized thinking.
📋 Quick Reference Card: Estimation and API Design Checklist
| Step | What to Do | Watch Out For |
|---|---|---|
| 🔢 Daily volume | State write volume in requests/day | Don't skip the source assumption |
| ⚡ QPS | Divide by 100,000, then apply the read ratio | Forgetting the peak factor (2-3×) |
| 💾 Storage | Per-record size × volume × retention | Forgetting the replication factor (×3) |
| 🌐 Bandwidth | QPS × average request/response size | Modeling only the dominant direction is usually enough |
| 📌 Constraints | Map each number to a technology implication | Numbers without conclusions waste time |
| 🔌 API style | Pick REST for external, RPC for internal | Don't define more than 5 endpoints |
| 🔒 Auth | State the auth mechanism per endpoint | Easy to forget; looks incomplete |
| ✅ Agreement | Verbally confirm API scope with the interviewer | Unconfirmed scope leads to drift |
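The arithmetic behind this checklist is small enough to script. Here is a minimal sketch with illustrative assumptions of mine (1M writes/day, a 100:1 read ratio, 500-byte records); the function names are hypothetical, not from any library:

```python
# Back-of-the-envelope helpers mirroring the checklist above.
# Inputs (1M writes/day, 100:1 ratio, 500-byte records) are illustrative.

SECONDS_PER_DAY = 100_000  # ~86,400, rounded for mental math

def write_qps(daily_writes: int) -> float:
    return daily_writes / SECONDS_PER_DAY

def read_qps(w_qps: float, read_write_ratio: int) -> float:
    return w_qps * read_write_ratio

def peak_qps(avg_qps: float, peak_factor: float = 2.5) -> float:
    # Checklist: don't forget the 2-3x peak factor
    return avg_qps * peak_factor

def storage_bytes(w_qps: float, record_bytes: int, years: int,
                  replication: int = 3) -> float:
    # Checklist: per-record size x volume x retention, x3 for replication
    return w_qps * 86_400 * 365 * years * record_bytes * replication

w = write_qps(1_000_000)   # 10 writes/sec
r = read_qps(w, 100)       # 1,000 reads/sec
print(f"avg write QPS: {w:.0f}, peak read QPS: {peak_qps(r):.0f}")
print(f"5-year storage: {storage_bytes(w, 500, 5) / 1e12:.1f} TB")
```

Running through it aloud in an interview is the point; the code is just the same mental math made explicit.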
Estimation and API design are not bureaucratic checkboxes; they are the craft of thinking like a senior engineer before touching a design tool. Estimation grounds your decisions in mathematics. API design grounds your scope in a contract. Together they give your architecture a story that the interviewer can follow from first principles to final component. The next section will examine the most common process mistakes that derail candidates even when they understand these concepts individually.
Common Process Mistakes and How to Avoid Them
Even candidates who have studied system design deeply can stumble, not because they lack knowledge, but because they fall into predictable process-level traps. These mistakes are frustrating precisely because they are invisible to the candidate while they are happening. You are busy solving what feels like the right problem, narrating what feels like insightful thinking, and building what feels like an elegant system. But the interviewer, watching the process unfold, is seeing something very different.
This section is about developing the self-awareness to catch these errors in real time and correct them gracefully. We will cover the five most common process mistakes with concrete before-and-after examples, so you can recognize the pattern whether it shows up in a mock interview, a live session, or a system design practice problem at 11pm.
Mistake 1: Jumping Straight to Solutions ⚠️
The premature solutioning trap is the single most common process error in system design interviews. The interviewer says "Design Twitter" and within thirty seconds the candidate is drawing boxes labeled "Load Balancer → App Servers → Cassandra" on the whiteboard. It feels productive. It looks decisive. It is almost always wrong.
The problem is that "Design Twitter" is not a problem statement; it is a topic. The actual problem lives inside a cloud of ambiguity: Are we designing the tweet feed, the notification system, the search index, or all of it? Are we optimizing for reads or writes? Is this a greenfield design or a migration? How many users? Which users: global or regional?
❌ Wrong thinking: "I know what Twitter is. I'll just start designing and ask questions if something comes up."
✅ Correct thinking: "The word 'Twitter' contains dozens of possible design problems. My first job is to narrow the problem down to something I can actually solve in 45 minutes."
Before (the trap)
Imagine a candidate who immediately starts designing a global-scale, multi-region, eventually-consistent tweet delivery system, complete with Kafka partitions, a fan-out-on-write architecture, and CDN edge caching. The interviewer lets them go for eight minutes and then asks: "What's the expected number of writes per second?" The candidate pauses. They never asked. The interviewer probes further: "Is this for a startup MVP or Twitter at scale?" Another pause. The candidate has been solving a problem they invented.
After (the fix)
A disciplined candidate spends the first three to five minutes asking targeted clarifying questions:
- "Are we designing the core tweet posting and home timeline retrieval, or also search and notifications?"
- "What's the expected scale: millions of users or hundreds of millions?"
- "Is read latency the primary concern, or is write throughput?"
- "Do we need to handle media attachments, or is text-only sufficient for this scope?"
After gathering answers, they restate the scope out loud: "So I'll focus on posting tweets and retrieving a personalized home timeline for roughly 100 million daily active users, optimizing for read latency, with text content only. Does that sound right?"
This restatement does three things simultaneously: it confirms shared understanding, it signals structured thinking, and it gives the interviewer one last chance to redirect before you commit.
🎯 Key Principle: Requirements are not a formality you rush through. They are the foundation everything else rests on. A five-minute investment in requirements saves thirty minutes of designing the wrong system.
Mistake 2: Over-Engineering Before Validating the High-Level Approach ⚠️
Over-engineering means diving into implementation details (specific database schemas, cache eviction policies, exact replication factors) before you and the interviewer have agreed that your high-level architecture makes sense. It is a seductive mistake because the details feel concrete and demonstrable. But details built on an unvalidated foundation are details you may have to throw away.
Think of it as the difference between sketching a building's floor plan versus specifying the exact R-value of the wall insulation. The insulation spec is real work. It just shouldn't happen before the floor plan is approved.
DETAIL-FIRST PATH (risky)

[Prompt] --> [Schema Design] --> [Index Optimization] --> [Replication Config]
                                                                   |
                                                            "Wait, why are
                                                             we using SQL?"

HIGH-LEVEL-FIRST PATH (correct)

[Prompt] --> [Block Diagram] --> [Interviewer Validates] --> [Deep Dive]
                                            |
                                    "Yes, this approach
                                     makes sense. Go deeper."
The Two-Phase Design Discipline
Force yourself to work in two explicit phases. In Phase 1 (breadth), your goal is a simple block diagram that shows the major components and their relationships, nothing more. Resist the urge to label exact technologies. "Storage Layer" is fine. "PostgreSQL with a B-tree index on user_id" is premature.
In Phase 2 (depth), once the interviewer has acknowledged the shape of your architecture, you pick the two or three most interesting or challenging components and go deep. This is where specific technology choices, schema decisions, and performance tradeoffs belong.
💡 Pro Tip: After sketching your high-level diagram, explicitly pause and say: "Before I go deeper, does this overall structure make sense to you? Are there any components you'd like me to focus on first?" This one sentence prevents you from over-engineering the wrong component.
⚠️ Common Mistake: Candidates who spend the first twenty minutes designing a flawless database schema often run out of time before discussing the system's most interesting architectural challenges, which are usually exactly what the interviewer most wanted to evaluate.
Mistake 3: Staying Silent and Losing the Interviewer ⚠️
System design interviews are fundamentally collaborative. The interviewer is not watching you perform; they are trying to simulate working with you. When a candidate goes silent for two or three minutes while thinking, they break that simulation. The interviewer loses the thread of your reasoning. They cannot tell if you are thinking brilliantly or are completely stuck.
Narrating your thought process is not optional; it is a core evaluation criterion. Senior engineers think out loud. They say things like:
"I'm weighing two options here. A relational database gives us strong consistency but may bottleneck at this write volume. A wide-column store like Cassandra handles the write throughput but complicates our query patterns. Let me think through which tradeoff is worse for this use case..."
"I'm going to make an assumption here and flag it: I'll assume the fan-out service processes asynchronously. We can revisit that if the read latency requirement changes."
"I notice I'm about to get into cache invalidation details and we're fifteen minutes in. Let me finish the high-level picture first and come back to this."
Each of these narrations does double duty: it keeps the interviewer engaged AND it forces you to be more rigorous about your own thinking.
SILENCE PATTERN (problematic)
You: [drawing boxes silently for 3 minutes]
Interviewer: "What are you thinking?"
You: "Oh, just designing the storage layer."
Interviewer: [cannot evaluate reasoning quality]
NARRATION PATTERN (effective)
You: "I'm starting with the storage layer. The main tension here is
write throughput versus query flexibility. Given we said 50k
writes/second, I'm leaning away from a traditional relational
approach. Let me sketch two options..."
Interviewer: [observing structured tradeoff analysis in real time]
🧠 Mnemonic: TAR (Think Aloud, Reason, then Resolve). Never resolve without the first two.
💡 Mental Model: Treat the interview like pair programming. You are the driver. A driver who goes silent for five minutes while their navigator sits next to them is uncomfortable for everyone.
Mistake 4: Ignoring Non-Functional Requirements Until Too Late ⚠️
Non-functional requirements (NFRs), including availability, latency, consistency, durability, and throughput, are not decorations you add at the end of a design. They are constraints that determine which architectural choices are even valid. Ignoring them until the last five minutes is like designing a bridge and then asking "Oh, how much weight should it hold?" after the blueprint is finished.
The most common NFRs that candidates neglect:
| NFR | Why It Gets Ignored | Why It Matters |
|---|---|---|
| 🟢 Availability (99.9% vs 99.99%) | Seems like an ops concern | Determines redundancy strategy and cost |
| ⚡ Read/Write Latency (p99) | Hard to quantify early | Drives caching, CDN, and database choices |
| 📦 Durability | Often assumed | Determines replication factor and write-ahead logging |
| 🔄 Consistency model | Complex to discuss | Shapes everything from DB choice to API design |
| 📈 Scalability ceiling | Feels premature | Informs whether sharding is needed from day one |
The NFR Integration Pattern
The fix is to treat NFRs as first-class requirements during your clarification phase β not as afterthoughts. After establishing functional scope, explicitly ask:
- "What's the acceptable latency for a timeline read: under 100ms, or is 500ms acceptable?"
- "If the system is unavailable for 30 seconds, is that a business-critical incident or acceptable degradation?"
- "Do we prioritize consistency (everyone sees the same feed) or is eventual consistency acceptable for this use case?"
Then, and this is crucial, reference those NFRs when you make architectural decisions. Don't just say "I'll use Redis for caching." Say "Because we committed to sub-100ms read latency, I'll add a Redis cache in front of the database here." This shows the interviewer that your choices are driven by requirements, not technology preferences.
Here is a practical example using a rate limiter design. Notice how the NFR shapes the implementation decision:
# Rate limiter shaped by the NFR: "must handle 100k req/sec with < 5ms overhead"
# The NFR drives us away from a distributed lock approach (too slow)
# toward a sliding window implemented in Redis with a Lua script (atomic, fast)
import time

import redis

RATE_LIMIT_SCRIPT = """
-- Lua script executes atomically in Redis (no race conditions)
-- KEYS[1] = rate_limit_key
-- ARGV[1] = limit, ARGV[2] = window in ms, ARGV[3] = current timestamp in ms
local key = KEYS[1]
local limit = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local now = tonumber(ARGV[3])

-- Remove entries outside the sliding window
redis.call('ZREMRANGEBYSCORE', key, 0, now - window)

-- Count current entries in the window
local count = redis.call('ZCARD', key)
if count < limit then
    -- Under limit: record this request and allow it
    redis.call('ZADD', key, now, now .. math.random())
    redis.call('PEXPIRE', key, window)  -- window is in ms, so PEXPIRE, not EXPIRE
    return 1  -- allowed
else
    return 0  -- rejected
end
"""

class SlidingWindowRateLimiter:
    def __init__(self, redis_client: redis.Redis, limit: int, window_seconds: int):
        self.redis = redis_client
        self.limit = limit
        self.window = window_seconds
        # Pre-register the script so each check is a single round-trip
        self.script = self.redis.register_script(RATE_LIMIT_SCRIPT)

    def is_allowed(self, user_id: str) -> bool:
        key = f"rate_limit:{user_id}"
        now = int(time.time() * 1000)  # millisecond precision
        result = self.script(keys=[key], args=[self.limit, self.window * 1000, now])
        return result == 1
In an interview, you would say: "Our NFR was sub-5ms overhead at 100k requests per second. A distributed lock with a central counter would create a write bottleneck. Instead, I'm using a Redis sorted set with a Lua script: the atomicity of Lua means no race conditions, and Redis handles this throughput easily within our latency budget." Notice how the NFR justified the choice. This is the pattern interviewers reward.
Mistake 5: Treating the Process as Linear Instead of Iterative ⚠️
The five-step design framework introduced earlier in this lesson is sequential by default but iterative by necessity. Candidates who treat it as a strict one-way pipeline get into trouble the moment they discover, ten minutes into architecture design, that one of their earlier assumptions was wrong. They either ignore the contradiction (building a fragile design) or restart entirely (wasting time and signaling poor adaptability).
LINEAR MENTAL MODEL (fragile)

Requirements --> Estimates --> API --> Architecture --> Deep Dive
     |               |          |           |               |
  [done]          [done]     [done]      [done]          [done]

ITERATIVE MENTAL MODEL (robust)

Requirements <--> Estimates <--> API <--> Architecture <--> Deep Dive
     ^________________^___________^____________^________________^
                  Graceful back-references allowed
The skill is graceful backtracking: returning to an earlier step without losing momentum or confidence. The key is to make the revision explicit and frame it positively.
Before (clumsy backtracking)
"Oh wait, I forgot to ask: how many users do we have? Because I think my whole architecture might be wrong."
This signals that the earlier work was wasted and creates anxiety. The interviewer now wonders what else was forgotten.
After (graceful backtracking)
"As I'm thinking through the fan-out logic, I realize our earlier estimate of 50 million daily active users changes things significantly here. Let me flag a revision: if 1% of users are 'celebrities' with 10 million followers each, our fan-out-on-write approach creates a hot-write problem. I want to revise this component to use a hybrid model: fan-out-on-write for regular users, fan-out-on-read for high-follower accounts. Does that change align with what you had in mind?"
This version demonstrates:
- 🧠 Awareness that new information requires revisiting earlier decisions
- 🔍 Understanding of the actual architectural implication
- 🔧 Ability to revise precisely, not wholesale
- 🎯 Continued collaboration with the interviewer
Here is how a hybrid fan-out service might look in pseudocode, showing the iterative revision in action:
# Revised fan-out service: the result of iterative backtracking
# Original design: always fan-out-on-write
# Revised design: hybrid based on a follower count threshold

FAN_OUT_THRESHOLD = 10_000  # users with more followers use read-time fan-out

class HybridFanOutService:
    def __init__(self, timeline_cache, follower_service, message_queue):
        self.timeline_cache = timeline_cache
        self.follower_service = follower_service
        self.message_queue = message_queue

    def publish_tweet(self, author_id: str, tweet: dict) -> None:
        follower_count = self.follower_service.get_follower_count(author_id)
        if follower_count <= FAN_OUT_THRESHOLD:
            # Regular user: write tweet to each follower's timeline cache immediately
            # Acceptable because fan-out is bounded (max 10k writes)
            self._fan_out_on_write(author_id, tweet)
        else:
            # Celebrity user: store tweet only in the author's timeline
            # Followers pull it at read time, avoiding a thundering herd on write
            self._store_for_read_time_fan_out(author_id, tweet)

    def _fan_out_on_write(self, author_id: str, tweet: dict) -> None:
        follower_ids = self.follower_service.get_all_followers(author_id)
        for follower_id in follower_ids:
            # Push the tweet ID into each follower's pre-computed timeline cache
            self.timeline_cache.prepend(f"timeline:{follower_id}", tweet['id'])

    def _store_for_read_time_fan_out(self, author_id: str, tweet: dict) -> None:
        # Only store in the author's own tweet index;
        # timeline reads will merge this in at query time
        self.timeline_cache.prepend(f"tweets:{author_id}", tweet['id'])
        # Notify downstream systems that a high-follower tweet was posted
        self.message_queue.publish('celebrity_tweet', {'author_id': author_id, 'tweet_id': tweet['id']})
Pointing to this kind of code in an interview, you would say: "This is the revision I mentioned: the publish_tweet method now routes based on follower count. Regular users get synchronous write-time fan-out, which keeps reads simple. High-follower authors use read-time fan-out to prevent the write amplification problem we'd hit if someone like a celebrity tweeted to 50 million followers simultaneously."
💡 Real-World Example: Twitter's own engineering blog documented this exact evolution: they started with pure fan-out-on-write and had to revise to a hybrid model when celebrity accounts created write storms. Your iterative revision in the interview mirrors how real systems evolve.
Putting It All Together: The Self-Monitoring Checklist
The most effective way to avoid these mistakes is to build a self-monitoring habit: a brief internal checkpoint you run every ten minutes during the interview. Think of it as a background process that keeps your foreground work on track.
SELF-MONITORING LOOP (run every ~10 minutes)
+-----------------------------------------------+
| 1. Did I clarify before I drew?               |
|    -> If no: pause and clarify now            |
|                                               |
| 2. Am I in detail before high-level is done?  |
|    -> If yes: zoom out and finish the sketch  |
|                                               |
| 3. Have I spoken in the last 60 seconds?      |
|    -> If no: narrate my current thinking      |
|                                               |
| 4. Are my decisions tied to NFRs?             |
|    -> If no: reference the relevant NFR now   |
|                                               |
| 5. Did new info change earlier assumptions?   |
|    -> If yes: backtrack gracefully            |
+-----------------------------------------------+
📋 Quick Reference Card: The Five Process Mistakes
| | Mistake | Signal You're In It | Quick Fix |
|---|---|---|---|
| 🎯 | Premature solutioning | Drawing boxes before asking questions | Stop, restate scope, ask 3 clarifying questions |
| 🔧 | Over-engineering early | Explaining schema before the block diagram is done | Zoom out, finish high-level, then dive |
| 🔇 | Staying silent | Interviewer asks "What are you thinking?" | Narrate the tradeoff you are currently evaluating |
| ⚡ | Ignoring NFRs | Choices lack justification | Reference an NFR when stating each architectural decision |
| 🔁 | Linear process thinking | Ignoring contradictions rather than revising | Explicitly backtrack, framing the revision as a positive insight |
🤔 Did you know? Research on expert problem-solving consistently shows that the difference between novices and experts is not knowledge but metacognition: the ability to monitor and regulate your own thinking process in real time. The self-monitoring loop above is a structured form of engineering metacognition.
These five mistakes share a common root: they all happen when a candidate prioritizes appearing to make progress over actually making progress. The paradox of system design interviews is that the meta-skills (structuring your approach, narrating your thinking, validating before building) are more visible to the interviewer than the object-level architectural knowledge you are demonstrating. A candidate who designs a slightly suboptimal cache invalidation strategy while executing the process flawlessly will typically score higher than a candidate who knows every cache eviction policy by name but designs without asking a single clarifying question.
Process is not the scaffolding around your knowledge. It is the thing being evaluated.
Key Takeaways and Interview Process Cheat Sheet
You started this lesson not knowing why some candidates walk out of system design interviews feeling like they nailed it while others, equally talented engineers, leave feeling like they were improvising the whole time. Now you know the answer: process. A repeatable, structured process is what separates the candidate who designs systems confidently from the one who draws boxes and hopes for the best. This final section locks in everything you've learned, gives you a ready-to-use interview toolkit, and points the way forward into the deeper architectural territory ahead.
The Five-Step Framework: A Complete Recap
Every system design interview, regardless of the specific system being designed, can be navigated with the same five sequential steps. Think of these steps not as rigid checkboxes but as a cognitive scaffold: a mental structure that keeps you organized, communicative, and focused even under interview pressure.
Here is the complete framework in one place:
+---------------------------------------------------------+
|               THE FIVE-STEP DESIGN PROCESS              |
+---+-------------------------+---------------------------+
| # | Step                    | Core Question Answered    |
+---+-------------------------+---------------------------+
| 1 | Clarify Requirements    | What are we building?     |
| 2 | Estimate Scale          | How big is this system?   |
| 3 | Define API Contract     | How do components talk?   |
| 4 | High-Level Architecture | What are the major parts? |
| 5 | Identify Bottlenecks    | Where will it break?      |
+---+-------------------------+---------------------------+
Each step answers a fundamentally different question. Step 1 prevents you from solving the wrong problem. Step 2 grounds your design in mathematical reality. Step 3 forces precision about interfaces before you commit to implementation. Step 4 communicates the big picture. Step 5 demonstrates senior engineering instincts by proactively finding weaknesses.
🎯 Key Principle: The steps are sequential because each one provides the input for the next. Skipping steps doesn't save time; it creates confusion that costs far more time to unravel later.
The Interview Process Cheat Sheet
Print this, save it to your phone, or write it on a sticky note on your monitor. Use it before every practice session until the sequence becomes automatic.
📋 Quick Reference Card: System Design Interview Playbook
| ⏱️ Time | 📋 Step | 🎯 Goal | ✅ Done When... |
|---|---|---|---|
| 0-5 min | Clarify Requirements | Pin down functional + non-functional scope | You can restate the problem back to the interviewer in one sentence |
| 5-10 min | Estimate Scale | Size users, QPS, storage, bandwidth | You have concrete numbers driving your architecture decisions |
| 10-15 min | API Contract | Define key endpoints or interfaces | A junior engineer could implement the client from your spec |
| 15-30 min | High-Level Architecture | Draw the major components and data flows | The interviewer can trace a request from client to response |
| 30-45 min | Bottleneck Analysis | Identify and resolve weaknesses | You've addressed the top 2-3 failure points with concrete solutions |
🧠 Mnemonic: C-E-A-A-B (Clarify, Estimate, API, Architecture, Bottlenecks). Or remember the sentence: "Can Every Architect Always Build?"
💡 Pro Tip: The time allocations above are for a 45-minute interview. For a 60-minute interview, add roughly 5 minutes to the architecture and bottleneck phases. Never shrink the requirements or estimation phases; they protect everything that follows.
A Working Example: The Framework Applied to URL Shortener
To make the cheat sheet concrete, here's a compressed end-to-end walkthrough of designing a URL shortener using the exact framework. Notice how each step produces an artifact that feeds the next.
Step 1: Clarify (5 min). Functional: shorten URLs, redirect on access, optional custom aliases. Non-functional: 100M daily active users, <100ms redirect latency, 99.99% uptime. Out of scope: analytics dashboard, user authentication.
Step 2: Estimate (5 min):
Write QPS: 100M new URLs/day ÷ 86,400 sec ≈ 1,200 writes/sec
Read QPS: assume a 100:1 read/write ratio → 120,000 reads/sec
Storage: 1,200 writes/sec × 86,400 × 365 × 5 years × 500 bytes ≈ 95 TB
Bandwidth: 120,000 reads/sec × 500 bytes ≈ 60 MB/s outbound
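These Step 2 numbers are easy to double-check in code. A quick sanity check, using the same assumptions as the walkthrough (100M new URLs/day, 100:1 read ratio, 500-byte records, 5-year retention):

```python
# Reproducing the Step 2 estimates with the walkthrough's assumptions.

DAY = 86_400  # seconds

daily_writes = 100_000_000
write_qps = daily_writes / DAY                    # ~1,157, rounded up to 1,200
read_qps = 1_200 * 100                            # 120,000 reads/sec
storage_tb = 1_200 * DAY * 365 * 5 * 500 / 1e12  # five years of 500-byte records
bandwidth_mb_s = 120_000 * 500 / 1e6              # outbound MB/s

print(f"write QPS ~{write_qps:,.0f}, read QPS {read_qps:,}")
print(f"storage ~{storage_tb:.0f} TB, bandwidth {bandwidth_mb_s:.0f} MB/s")
```

Note that the storage figure excludes replication; multiplying by a replication factor of 3 would roughly triple it.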
Step 3: API Contract (5 min):
# POST /api/v1/shorten
# Creates a short URL alias for a long URL
# Request body:
{
  "long_url": "https://example.com/some/very/long/path",
  "custom_alias": "mylink",   # optional
  "expiry_days": 30           # optional, null = permanent
}

# Response (201 Created):
{
  "short_url": "https://short.ly/abc123",
  "alias": "abc123",
  "created_at": "2024-01-15T10:30:00Z",
  "expires_at": "2024-02-14T10:30:00Z"
}

# GET /{alias}
# Redirects to the original long URL
# Response: HTTP 301 (permanent) or 302 (temporary) redirect
# Location: https://example.com/some/very/long/path
Notice how the API contract makes a concrete decision (301 vs 302 redirect) that directly impacts caching behavior at the architecture level. This is the kind of precision that impresses interviewers.
Step 4: Architecture (15 min):
Client
  |
  v
[CDN / Edge Cache]  <-- caches 301 redirects for hot URLs
  |
  v
[Load Balancer]
  |
  +--> [Write Service] --> [ID Generator] --> [Primary DB]
  |          |                                    |
  |          +----------------------------------> [Cache (Redis)]
  |
  +--> [Redirect Service] --> [Cache (Redis)] --> [Read Replica DB]
Step 5: Bottlenecks (10 min):
- Hot URLs: the top 1% of URLs get 50% of traffic → solve with a Redis cache + CDN edge caching
- ID collision: two servers generate the same alias → solve with pre-generated ID ranges per server (ticket server pattern)
- DB write throughput: 1,200 writes/sec is manageable now, but sharding by alias hash prepares for 10x growth
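The ticket-server fix for the ID-collision bottleneck can be sketched in a few lines. This is an illustrative in-memory sketch with hypothetical class names; a real ticket server would persist its counter durably:

```python
import string

# Sketch of the "pre-generated ID ranges" fix for the collision bottleneck:
# a central ticket server hands each app server a disjoint block of integer
# IDs, and servers encode them as base62 aliases. Names are illustrative.

BASE62 = string.digits + string.ascii_lowercase + string.ascii_uppercase

def to_base62(n: int) -> str:
    if n == 0:
        return BASE62[0]
    out = []
    while n:
        n, rem = divmod(n, 62)
        out.append(BASE62[rem])
    return "".join(reversed(out))

class TicketServer:
    """Hands out non-overlapping ID ranges (would be durable in practice)."""
    def __init__(self, block_size: int = 1000):
        self.next_id = 0
        self.block_size = block_size

    def allocate_block(self) -> range:
        block = range(self.next_id, self.next_id + self.block_size)
        self.next_id += self.block_size
        return block

class WriteServer:
    """Consumes its own block, so no two servers can collide."""
    def __init__(self, ticket_server: TicketServer):
        self.ticket_server = ticket_server
        self.ids = iter(ticket_server.allocate_block())

    def next_alias(self) -> str:
        try:
            return to_base62(next(self.ids))
        except StopIteration:  # block exhausted: fetch a fresh one
            self.ids = iter(self.ticket_server.allocate_block())
            return to_base62(next(self.ids))

tickets = TicketServer()
a, b = WriteServer(tickets), WriteServer(tickets)
print(a.next_alias(), b.next_alias())  # different blocks, so no collision
```

In an interview, sketching only the allocation idea verbally is enough; the point is that disjoint ranges remove the need for any collision check on the write path.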
How Process Mastery Frees Up Cognitive Bandwidth
Here is one of the most important, and least discussed, benefits of having a memorized process: it offloads navigation to autopilot so your brain can focus on the interesting technical problems.
Think about what happens when you first learn to drive. You consciously think about every action: check mirrors, signal, check blind spot, turn the wheel a precise amount. It's exhausting. After years of driving, those actions become automatic, and you can hold a conversation, plan your route, and notice a child stepping off a curb, all simultaneously.
System design interviews work the same way. When you internalize the five steps, you stop burning mental energy on "what should I talk about next?" and start spending it on questions like:
- 🧠 "Should I use a relational or document database for this access pattern?"
- 🧠 "What consistency model do I need between these services?"
- 🧠 "Is eventual consistency acceptable here, or does this require strong consistency?"
Those are the questions that differentiate a good design from a great one. A candidate who is improvising their structure cannot think about those questions deeply β they're too busy worrying about whether they've forgotten something important.
❌ Wrong thinking: "I'll figure out the structure as I go and spend more time on the technical details."
✅ Correct thinking: "A locked-in process is what creates the space for technical depth."
💡 Mental Model: Think of the five-step framework as the operating system of your interview. Once the OS is running reliably in the background, you can open demanding applications (database selection, consistency tradeoffs, caching strategies) without crashing.
A Practical Code Template: Your Personal Design Notes Structure
Many candidates find it helpful to have a structured template they fill in during the interview. Here is a minimal working template you can adapt for any system:
## System Design: [System Name]
### 1. Requirements
**Functional:**
- [ ] Core feature 1
- [ ] Core feature 2
**Non-Functional:**
- [ ] Scale: _____ DAU
- [ ] Latency: p99 < _____ ms
- [ ] Availability: _____ % uptime
**Out of Scope:**
- Feature X (explicitly confirmed with interviewer)
### 2. Estimates
- Write QPS: _____ / sec
- Read QPS: _____ / sec (___:1 ratio)
- Storage: _____ GB/year
- Bandwidth: _____ MB/s
### 3. API Contract
- POST /resource → creates, returns {id, ...}
- GET /resource/:id → returns {data}
### 4. High-Level Architecture
[Sketch: Client → LB → Services → DB]
### 5. Bottlenecks
1. Bottleneck: _____ → Solution: _____
2. Bottleneck: _____ → Solution: _____
3. Bottleneck: _____ → Solution: _____
This isn't just a note-taking tool; it's a communication device. Narrate as you fill it in. The interviewer sees a structured thinker building toward a solution, not a candidate frantically drawing random components.
π‘ Pro Tip: In a virtual whiteboard interview (Miro, Excalidraw, etc.), create five labeled zones before you start talking. Just labeling the zones signals immediately that you have a process β and it gives the interviewer confidence before you've said a single technical word.
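To see how Step 2 of the template gets filled in, here is a back-of-envelope sketch in Python. Every input number (100M DAU, 1 write and 50 reads per user per day, ~1 KB per record) is an assumed value for a hypothetical service, not a figure from any real system:

```python
# Back-of-envelope estimation for a hypothetical service.
# All inputs below are assumed numbers for illustration only.

SECONDS_PER_DAY = 86_400

dau = 100_000_000          # assumed daily active users
writes_per_user = 1        # assumed writes per user per day
reads_per_user = 50        # assumed reads per user per day
record_size_bytes = 1_000  # assumed average record size (~1 KB)

# Spread the daily totals evenly across the day to get average QPS.
write_qps = dau * writes_per_user / SECONDS_PER_DAY
read_qps = dau * reads_per_user / SECONDS_PER_DAY
read_write_ratio = read_qps / write_qps

# Storage grows with writes; assume one year of retained records.
storage_gb_per_year = dau * writes_per_user * record_size_bytes * 365 / 1e9

print(f"Write QPS: ~{write_qps:,.0f}/sec")
print(f"Read QPS:  ~{read_qps:,.0f}/sec ({read_write_ratio:.0f}:1 ratio)")
print(f"Storage:   ~{storage_gb_per_year:,.0f} GB/year")
```

With these assumptions the script reports roughly 1,157 writes/sec, 57,870 reads/sec (a 50:1 ratio), and about 36,500 GB per year. In the interview you would do the same arithmetic on the whiteboard and round aggressively; the point is the order of magnitude, not the decimals.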
Bridge to Upcoming Lessons: What Comes Next
This lesson established the process layer: the *how* of approaching any system design interview. Everything that follows in this course is the content layer: the *what* you plug into that process.
Here is how the upcoming lessons map directly onto the framework you've just learned:
| 🗺️ Upcoming Lesson | 🔗 Framework Step It Serves | 💡 What You'll Be Able to Do |
|---|---|---|
| 🏗️ High-Level Architecture Patterns | Step 4: Architecture | Choose between monolith, microservices, and event-driven architectures with justification |
| 🔍 Bottleneck Deep-Dive: Databases | Step 5: Bottlenecks | Identify read/write hotspots and select sharding, replication, or caching strategies |
| ⚡ Caching Strategies | Step 5: Bottlenecks | Apply cache-aside, write-through, and write-behind patterns to the right scenarios |
| 📡 API Design Patterns | Step 3: API Contract | Design REST, GraphQL, and gRPC interfaces with real tradeoff analysis |
| 📈 Scaling Strategies | Steps 2 + 5 | Connect your estimation numbers to concrete horizontal/vertical scaling decisions |
🎯 Key Principle: The process is your skeleton; the upcoming lessons are your muscle and organs. Neither works without the other. A candidate who knows every caching strategy but has no process will still fumble. A candidate with perfect process but thin technical knowledge will plateau. You need both, and now you have the foundation.
The Daily Practice Drill
Knowledge without repetition doesn't become skill. Here is the single most effective practice habit for internalizing the five-step framework:
Every day, pick one system you already use and run it through all five steps in 30 minutes.
The system should be familiar enough that you don't get stuck on domain knowledge; the goal is to practice the process, not research a new domain. Good candidate systems:
- 📱 Instagram's photo feed
- 💬 WhatsApp message delivery
- 🔍 Google autocomplete
- 🎵 Spotify's play queue
- 📦 Amazon's order status tracker
Here is a sample timer for your daily drill:
⏱️ 0:00–5:00: Requirements (set a 5-min timer, stop when it rings)
⏱️ 5:00–10:00: Estimates (calculate on paper, show your work)
⏱️ 10:00–15:00: API contract (write 2–3 endpoints with request/response)
⏱️ 15:00–25:00: Architecture diagram (whiteboard or notebook)
⏱️ 25:00–30:00: Bottleneck identification (name 3, solve 2)
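If you'd rather not juggle five separate timers, the schedule above can be scripted. The sketch below is an illustration, not an official tool from this course; the phase names and durations simply mirror the drill schedule:

```python
import time

# Drill phases as (name, duration in minutes), mirroring the
# 30-minute daily schedule from the lesson.
PHASES = [
    ("Requirements", 5),
    ("Estimates", 5),
    ("API contract", 5),
    ("Architecture diagram", 10),
    ("Bottleneck identification", 5),
]

def run_drill(minutes_to_seconds=60):
    """Announce each phase, then block until that phase's time is up."""
    elapsed = 0
    for name, minutes in PHASES:
        print(f"[{elapsed:02d}:00] Start: {name} ({minutes} min)")
        time.sleep(minutes * minutes_to_seconds)
        elapsed += minutes
    print(f"[{elapsed:02d}:00] Drill complete.")
```

Call `run_drill()` for a real 30-minute session; passing `minutes_to_seconds=0` gives an instant dry run of the announcements so you can check the schedule.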
After two weeks of daily drills, the sequence will be as automatic as typing, and your interview anxiety will drop significantly because you'll know exactly what you're going to do from the first second.
⚠️ Critical Point: Don't skip the timer. The time pressure is the point. In a real interview, you will feel compelled to keep talking about requirements forever, or to jump straight to architecture because it feels more impressive. The timer trains you to move deliberately.
💡 Real-World Example: Competitive chess players don't just study openings; they practice them under time pressure until the moves are tactile memory. The time pressure in their training is exactly what makes their thinking automatic during a match. Treat your daily system design drill the same way.
What You Now Know That You Didn't Before
Let's make the learning gain explicit. Before this lesson, a typical developer asked to "design Twitter" might:
- Jump immediately to drawing boxes
- Spend 20 minutes on database schema before knowing the scale
- Never define what "Twitter" means in this context (timeline? DMs? trending topics?)
- Arrive at bottlenecks only if time remains, which it usually doesn't
After this lesson, you know:
- 🧠 Improvising signals junior thinking. Interviewers evaluate process explicitly, not just technical output.
- 🎯 The five steps are sequential for a reason. Each step's output is the next step's input.
- 📋 Requirements protect you from scope creep. Explicit out-of-scope items are as valuable as in-scope ones.
- 🔢 Estimation grounds your architecture. Decisions without numbers are opinions; decisions with numbers are engineering.
- 📡 API contracts force interface precision. Writing endpoints before components prevents premature implementation details.
- 🔥 Bottleneck analysis is where you demonstrate seniority. It shows you build for failure, not just for the happy path.
⚠️ Final Critical Point to Remember: The framework does not guarantee the right answer; no framework does. What it guarantees is that you will work systematically toward a well-reasoned answer while demonstrating exactly the communication and problem-decomposition skills that senior engineering roles require. Interviewers are not expecting you to produce a production-grade architecture in 45 minutes. They are watching how you think.
Summary Table: Core Concepts at a Glance
📋 Quick Reference Card: Everything You Learned
| 📘 Concept | 💬 One-Line Definition | ⚠️ If You Skip It... |
|---|---|---|
| 🎯 Structured Process | A repeatable five-step sequence for any system design problem | You improvise, panic, and miss critical design dimensions |
| 📋 Requirements Clarification | Pinning down what's in scope before designing anything | You build the wrong system confidently |
| 🔢 Back-of-Envelope Estimation | Quantifying scale to drive architecture choices | Your architecture is untethered from reality |
| 📡 API Contract | Defining interfaces before implementing components | Your components are tightly coupled and ambiguous |
| 🏗️ High-Level Architecture | The major components and how data flows between them | The interviewer can't follow your design narrative |
| 🔥 Bottleneck Analysis | Proactively identifying and resolving failure points | You present a naive design with no depth |
| 🧠 Cognitive Bandwidth | Mental capacity freed up by automating process navigation | Technical depth crowds out structured thinking |
Three Practical Next Steps
1. Run the daily 30-minute drill starting today. Choose a familiar app. Set five timers. Go through all five steps without skipping. The first session will feel awkward; that discomfort is learning.
2. Watch one recorded system design interview (mock interviews by Google engineers, ex-FAANG engineers on YouTube, and similar channels) and evaluate the candidate using the five-step checklist. Notice which steps they skip, where they lose structure, and how the interviewer reacts.
3. Continue to the next lesson on High-Level Architecture Patterns with this question in mind: "For which types of requirements and scale estimates does each architecture pattern become the right choice?" That framing, connecting patterns back to the process, is how you build an integrated mental model rather than isolated knowledge.
🤔 Did you know? Research on expert performance consistently shows that structured problem-solving frameworks improve performance under stress more than raw knowledge does. Surgeons, pilots, and software engineers all benefit from checklists not because they don't know what to do, but because checklists prevent the cognitive shortcuts that emerge under pressure from causing errors. Your five-step framework is exactly this kind of professional-grade checklist.
You now have the process. Everything ahead is about filling that process with the technical depth it deserves.