Design Process Steps
Follow a repeatable framework from high-level design to deep dives within the time limit.
Why a Structured Design Process Wins Interviews
Imagine you're sitting across from a senior engineer at a company you've wanted to work at for years. They slide a whiteboard marker across the table and say: "Design YouTube." Two words. No further explanation. What do you do next? If your instinct is to immediately start drawing boxes (a load balancer here, a database there, maybe toss in some microservices for good measure), you're about to make the most common and costly mistake in system design interviews. Let's talk about why winging it almost always fails, and what separates candidates who get offers from those who don't.
System design interviews are uniquely uncomfortable because they're deliberately open-ended. Unlike algorithm problems, there is no single correct answer, no hidden test case that will tell you whether you passed or failed. This ambiguity is the point. Companies use system design interviews to evaluate how you think, not just what you know. And yet, the overwhelming majority of candidates treat these interviews like a knowledge dump: a race to show everything they've ever heard about distributed systems before time runs out.
The Hidden Rubric Interviewers Are Actually Using
Here's the uncomfortable truth: interviewers are rarely impressed by raw technical knowledge alone. They're watching something much more specific. They want to see whether you behave like a senior engineer who has actually shipped large-scale systems, or like someone who has memorized blog posts about them.
What does a senior engineer actually do when faced with an ambiguous problem? They don't guess. They don't assume. They lead. They ask clarifying questions, establish scope, reason about constraints quantitatively, propose solutions with explicit trade-offs, and communicate their thinking every step of the way. This is the behavior pattern interviewers are scoring you on, whether or not they've written it down on an explicit rubric.
Most interviewers are evaluating candidates across four dimensions:
| Dimension | What They're Looking For |
|---|---|
| Problem Framing | Can you scope ambiguous problems before proposing solutions? |
| Technical Depth | Do you understand the systems you propose? |
| Communication | Can you explain your thinking clearly under pressure? |
| Trade-off Reasoning | Do you acknowledge constraints and justify decisions? |
Notice that "correct final architecture" doesn't appear anywhere on that list as a standalone category. That's because two candidates can propose entirely different architectures and both pass, as long as both can justify their choices coherently. Conversely, a candidate who proposes a textbook-perfect architecture but stumbles through it without explaining their reasoning will fail.
💡 Real-World Example: A senior engineer at a major tech company once described their interview process this way: "I once passed a candidate who designed a pretty mediocre system. I failed one who designed a much better one. The difference was that the first person walked me through every decision with clear reasoning. The second person just drew boxes and said 'and then this scales.' I didn't know if they understood why it would scale."
Why Improvisation Fails Under Pressure
Let's get specific about what happens when candidates improvise. Picture a candidate asked to design a URL shortening service like bit.ly. Here's how an unstructured response typically unfolds:
Candidate (improvising):
"Okay, so we'll have... a web server. And it needs a database.
Probably MySQL. Or maybe NoSQL. Let's say Cassandra.
And then we need to generate short URLs, so we can use...
hm, base62 encoding. And then there's a cache, Redis probably.
Oh wait, what about the load balancer? We should have that too.
And for scale... sharding? Yeah, sharding."
This response has real knowledge in it. But it's a disaster in interview terms. Why? Because the candidate never established what scale means for this system. Are we handling 100 requests per second or 100,000? Is read latency more important than write latency? Are custom short URLs a feature? What about analytics? What about link expiration?
Without answers to these questions, every architectural choice is a guess. And interviewers can tell the difference between an engineer reasoning from constraints and one who's pattern-matching to "things I've heard about." The improvised approach produces poor signal: the interviewer finishes the conversation not knowing whether the candidate could actually build this system or just describe one.
⚠️ Common Mistake: Jumping to architecture before clarifying requirements. Candidates who start drawing immediately signal that they are reactive rather than analytical. The interviewer is not waiting for you to pick up the marker; they're watching whether you know not to.
❌ Wrong thinking: "I need to show I know about distributed systems quickly."
✅ Correct thinking: "I need to understand the problem deeply before I propose any solution."
The anxiety that drives improvisation is completely understandable. System design interviews feel like a test of how much you know, so candidates try to show everything as fast as possible. But this is a fundamental misread of what's being evaluated. The interviewer has an hour with you. They don't want a firehose. They want a conversation that feels like the ones that happen in real engineering meetings.
🤔 Did you know? Research on expert problem-solving consistently shows that experts spend significantly more time understanding a problem before attempting to solve it than novices do. Novices jump to solutions. Experts front-load comprehension. System design interviews are specifically designed to test which mode you naturally operate in.
A Repeatable Framework Changes Everything
Here's what changes when you approach system design interviews with a repeatable structured process: everything.
First, anxiety drops. When you have a clear sequence of steps to follow, you're never wondering "what should I do next?" You always know. This is the same reason checklists are used in surgery, aviation, and nuclear power plants: not because the practitioners are incapable of remembering, but because structured processes under pressure outperform improvised ones every time.
Second, your communication becomes naturally clearer. When you're following a framework, you can narrate each step explicitly: "I'm going to start by clarifying the requirements, then do some rough capacity estimation, and then we'll get into the architecture." This kind of narration is exactly what interviewers want to hear. It signals meta-awareness: the ability to manage a problem-solving process, not just execute within one.
Third, you stop wasting the interview's most valuable resource: time. An unstructured 45-minute design session might spend 35 minutes on architecture and never discuss monitoring, failure modes, or bottlenecks. A structured session allocates time intentionally, ensuring you cover the dimensions interviewers care most about.
🎯 Key Principle: A repeatable design process is not a crutch; it's a professional discipline. Senior engineers use structured frameworks in real design sessions precisely because they work. Demonstrating this in an interview is demonstrating genuine seniority.
Let's see what the structured approach looks like in contrast to improvisation, using the same URL shortener example:
Candidate (structured):
"Before I start designing, I want to make sure I understand the problem.
A few clarifying questions:
1. What's the expected scale? Reads per second? Writes?
2. Are custom aliases required, or only system-generated short URLs?
3. Do we need analytics (click counts, geographic data)?
4. What's the expected link lifetime? Do URLs expire?
5. Is this global or single-region?"
Interviewer: "Good questions. Let's say 100M URLs created per day,
10:1 read-to-write ratio, no custom aliases for now,
basic click analytics, no expiration, and global."
Candidate: "Perfect. So roughly 1,200 writes/second and
12,000 reads/second globally. Given the read-heavy nature and
global distribution, I'm going to prioritize read latency and
consider a CDN layer and aggressive caching. Let me sketch
the high-level components and then we can go deep on
the parts you're most interested in."
This response takes perhaps 90 seconds longer than jumping straight to the architecture. But it produces a completely different interview dynamic. The candidate now looks like they're leading the session. The interviewer is a collaborator, not an evaluator waiting for mistakes.
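The candidate's quick mental math can be sanity-checked in a few lines (a rough sketch; the aggressive rounding is deliberate):

```python
# Sanity-checking the structured candidate's numbers.
SECONDS_PER_DAY = 86_400

urls_per_day = 100_000_000            # 100M URLs created per day
write_rps = urls_per_day / SECONDS_PER_DAY
read_rps = write_rps * 10             # 10:1 read-to-write ratio

print(f"Writes/sec: ~{write_rps:,.0f}")   # ~1,157, round to ~1,200
print(f"Reads/sec:  ~{read_rps:,.0f}")    # ~11,574, round to ~12,000
```

Rounding 1,157 up to 1,200 is exactly the kind of approximation interviewers expect; precision past one significant figure adds nothing.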
Senior Engineers Lead Through Ambiguity, and So Should You
One of the most important signals a system design interview is designed to measure is comfort with ambiguity. Junior engineers need requirements handed to them. Senior engineers create requirements from ambiguity. This is not a minor distinction; it's one of the core differences between IC levels at most companies.
When you receive a vague prompt like "design a chat system" or "design Uber," you're being given a deliberately underspecified problem. What happens next tells the interviewer everything about your engineering maturity:
ASCII Flow: Two Response Paths
Prompt: "Design a notification system"
                  |
         +--------+--------+
         |                 |
      JUNIOR            SENIOR
     RESPONSE          RESPONSE
         |                 |
         v                 v
  "Okay, I'll       "A few questions first:
   build a           push only, or also
   push              email/SMS?
   notification      What SLA for delivery?
   system"           Real-time, or is batch
         |           okay? Do we need
         v           ordering guarantees?"
  Guesses at               |
  scope,                   v
  misses key        Establishes scope,
  requirements      designs the RIGHT
                    system confidently
The senior response doesn't just produce a better design; it signals that the person asking these questions has shipped notification systems and knows what bites you in production if you don't nail it up front. That's the association interviewers are looking to make.
💡 Mental Model: Think of a system design interview like being hired to build a house. A junior contractor shows up with lumber and starts framing walls. A senior contractor sits down with the client first and asks: How many people will live here? What's your budget? Do you need a home office? Any accessibility requirements? The lumber matters, but not before you know what you're building.
The Cost of Unstructured Responses: Three Failure Modes
To make this concrete, let's examine the three most common ways an unstructured response fails, not just from the candidate's perspective but from the interviewer's.
Failure Mode 1: Scope Creep and Time Collapse
Without scoping upfront, candidates often design something far larger than the interview allows. They start with a "simple" design, realize mid-session they haven't handled failures, then rush through five more components in the last 10 minutes. The interviewer sees a shallow, frantic finish: exactly the opposite of what signals seniority.
Failure Mode 2: Mismatched Problem Solving
Without clarifying requirements, candidates sometimes solve the wrong problem entirely. A candidate asked to design a "logging system" might design a centralized observability platform when the interviewer had in mind a simple audit trail for compliance. Twenty minutes of brilliant architecture in the wrong direction isn't recoverable.
Failure Mode 3: Poor Signal Quality
This is the subtlest failure, but the most consequential. When a candidate improvises, the interviewer can't tell the difference between genuine understanding and lucky guesses. They can't probe effectively because the candidate hasn't established a clear reasoning foundation to probe against. The result is an inconclusive interview, and when in doubt, interviewers vote no.
# This is a useful analogy: consider two functions that produce the
# same output but with very different levels of legibility.

# Function 1: Improvised (works, but opaque)
def get_short_url(long_url):
    import hashlib, base64
    return base64.urlsafe_b64encode(
        hashlib.md5(long_url.encode()).digest()
    )[:7].decode()

# Function 2: Structured (works AND communicates intent)
def generate_short_url(long_url: str, length: int = 7) -> str:
    """
    Generate a short URL identifier.

    Args:
        long_url: The original URL to shorten
        length: Number of characters in the short code (default 7).
                7 URL-safe base64 chars give ~4.4 trillion unique values.

    Returns:
        A URL-safe string of the specified length
    """
    import hashlib
    import base64

    # MD5 produces a 128-bit hash; we take the first `length` base64 chars.
    # Trade-off: collision risk exists but is acceptable at our scale.
    url_hash = hashlib.md5(long_url.encode()).digest()
    encoded = base64.urlsafe_b64encode(url_hash)
    return encoded[:length].decode()

# The output of both functions is identical, but Function 2 signals that
# the engineer knows *why* each decision was made: exactly what
# interviewers look for.
print(generate_short_url("https://example.com/very/long/path"))
This code analogy captures something important: in system design interviews, how you express your reasoning is as important as the reasoning itself. Two candidates can arrive at the same architecture, but the one who communicates the why behind each decision will consistently outperform the one who doesn't.
Preview: The End-to-End Design Process
The rest of this lesson builds out a complete, repeatable system design process that you can apply to any prompt you're given. Here's the framework you'll learn:
+----------------------------------------------------------+
|             THE 5-STEP SYSTEM DESIGN PROCESS             |
+----------+-----------------------------------------------+
| STEP 1   | Clarify Requirements & Scope                  |
|          | Define functional + non-functional needs      |
+----------+-----------------------------------------------+
| STEP 2   | Capacity Estimation                           |
|          | Size the system quantitatively                |
+----------+-----------------------------------------------+
| STEP 3   | API Contract Design                           |
|          | Define system boundaries and interfaces       |
+----------+-----------------------------------------------+
| STEP 4   | High-Level Architecture                       |
|          | Core components and data flow                 |
+----------+-----------------------------------------------+
| STEP 5   | Deep Dives & Trade-offs                       |
|          | Bottlenecks, scaling, failure modes           |
+----------+-----------------------------------------------+
Each step serves a distinct purpose. Each step also generates output that the next step depends on. This sequencing is intentional, and understanding why the steps are in this order is as important as knowing what they contain.
🧠 Mnemonic: C-E-A-H-D: Clarify, Estimate, API, High-level, Deep dive. Or remember it as "Can Every Architect Handle Depth?", a question that your structured process will answer with a resounding yes.
You'll notice something important about this framework: it deliberately delays the part candidates are most eager to get to (the architecture) until after three preparatory steps. This sequencing mirrors how experienced engineers actually approach design problems in real settings. The impatience to skip to the architecture is understandable, but it's exactly what the framework is designed to override.
# A simple way to think about the framework as a dependency chain:
design_process = {
    "step_1_requirements": {
        "inputs": ["interview_prompt"],
        "outputs": ["functional_reqs", "non_functional_reqs", "scope_boundaries"],
        "unlocks": "step_2"
    },
    "step_2_estimation": {
        "inputs": ["functional_reqs", "non_functional_reqs"],
        "outputs": ["qps_estimate", "storage_estimate", "bandwidth_estimate"],
        "unlocks": "step_3"
    },
    "step_3_api_design": {
        "inputs": ["functional_reqs", "scope_boundaries"],
        "outputs": ["api_endpoints", "data_contracts", "system_boundaries"],
        "unlocks": "step_4"
    },
    "step_4_architecture": {
        "inputs": ["all_previous_outputs"],
        "outputs": ["component_diagram", "data_flow", "storage_choices"],
        "unlocks": "step_5"
    },
    "step_5_deep_dives": {
        "inputs": ["component_diagram", "qps_estimate"],
        "outputs": ["scaling_strategy", "failure_handling", "trade_off_analysis"],
        "unlocks": "offer"
    }
}

# Each step's outputs become the next step's inputs.
# Skipping a step doesn't save time; it creates hidden debt
# that collapses your reasoning later.
This dependency structure is why improvisation fails so consistently. When you skip requirements clarification and jump to architecture, you're building step 4's output without step 1's inputs. You're drawing boxes in a vacuum. And when the interviewer asks you to justify why you chose a relational database over a document store, you have no requirements to point to. You're guessing, and both of you know it.
Quick Reference Card: Why Structure Beats Improvisation
| Dimension | ❌ Improvised Approach | ✅ Structured Approach |
|---|---|---|
| Scope | Assumed, often wrong | Explicitly confirmed |
| Time Use | Unbalanced, rushed finish | Intentionally allocated |
| Communication | Reactive, fragmented | Proactive, narrated |
| Decisions | Unjustified, pattern-matched | Reasoned from constraints |
| Signal Quality | Ambiguous to interviewer | Clear, assessable |
| Candidate Anxiety | High (constant guessing) | Lower (clear next step) |
Setting Up for What Comes Next
The sections that follow will take you through each step of this framework in detail. You'll see concrete examples of what good clarifying questions look like and how interviewers respond to them. You'll learn how to do back-of-the-envelope math quickly and confidently. You'll practice defining API contracts before you've drawn a single box on the whiteboard.
But all of that only works if you internalize the foundation being built here: system design interviews are process evaluations, not knowledge tests. The engineer who wins isn't the one who knows the most; it's the one who demonstrates the clearest thinking under ambiguous conditions.
Every time you sit down to practice, your goal isn't to design the perfect system. Your goal is to practice leading an open-ended problem through a structured process until it becomes muscle memory. The moment that process becomes automatic, your interview performance will stop depending on which topic you're asked about β because the process works on all of them.
💡 Pro Tip: The next time you practice a design problem, time-box each step explicitly. Give yourself 5 minutes for requirements clarification before you're allowed to touch the architecture. This artificial constraint will feel uncomfortable at first. That discomfort is the training.
Let's build your process, one step at a time.
The Five-Step Design Process Framework
A system design interview is not a test of memorization; it is a test of structured thinking under pressure. The difference between a candidate who impresses and one who struggles often comes down to whether they have a repeatable process they can apply confidently to any problem. This section introduces the five-step framework that transforms an open-ended, ambiguous question like "Design Twitter" into a coherent, professional engineering narrative.
Think of this framework as a scaffolding system. Each step builds upon the last, and each one serves a specific purpose in shaping your answer from vague idea to concrete architecture. Skipping steps, or executing them out of order, is precisely how strong engineers give weak interviews.
+------------------------------------------------------------------+
|                  THE FIVE-STEP DESIGN PROCESS                    |
|                                                                  |
|   1. CLARIFY REQUIREMENTS  --->  What are we building?           |
|            |                                                     |
|            v                                                     |
|   2. ESTIMATE SCALE        --->  How big will it get?            |
|            |                                                     |
|            v                                                     |
|   3. DEFINE THE API        --->  What does it expose?            |
|            |                                                     |
|            v                                                     |
|   4. HIGH-LEVEL ARCH.      --->  What are the moving parts?      |
|            |                                                     |
|            v                                                     |
|   5. IDENTIFY BOTTLENECKS  --->  Where will it break?            |
+------------------------------------------------------------------+
Each step is a gate. You do not move to the next step until the current one is solid. This discipline is what signals senior engineering thinking to your interviewer, because senior engineers know that decisions made early constrain all decisions that follow.
Step 1: Clarify Requirements
The very first thing you do after hearing the problem is resist the urge to start drawing boxes. Instead, you ask questions. Not random questions β targeted questions that separate functional requirements from non-functional requirements and establish the boundaries of what you are being asked to design.
Functional requirements describe what the system does: the specific behaviors and features it must support. For a URL shortener, a functional requirement might be "users can submit a long URL and receive a short one" or "users can be redirected when they visit a short URL."
Non-functional requirements describe how the system performs: its quality attributes and operational constraints. These include availability targets (e.g., 99.99% uptime), latency expectations (e.g., redirect must complete in under 100ms), consistency guarantees (e.g., eventual vs. strong consistency), and durability requirements (e.g., links must never expire unless explicitly deleted).
🎯 Key Principle: Non-functional requirements are often more architecturally consequential than functional ones. A system that needs 99.999% availability looks fundamentally different from one that tolerates 99.9%, even if both do the same thing functionally.
Constraints are a third category worth separating explicitly. Constraints are the hard limits placed on your design by the business or operational context: budget, existing infrastructure, team size, regulatory compliance (GDPR, HIPAA), or the time you have in the interview to explore a problem.
Here is a quick example of how requirements questioning plays out in practice:
Interviewer: "Design a notification system."
You: "Great. Before I start, I want to clarify scope. Are we supporting push notifications, email, SMS, or all three? Should notifications be delivered in real time or can we tolerate some delay? Do we need guaranteed delivery, or is best-effort acceptable? How many users are we serving?"
In under 60 seconds, you have transformed a vague prompt into a scoped engineering problem. That is the entire purpose of Step 1.
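If it helps to see the three categories side by side, here is a hypothetical scoping sheet for the notification-system prompt above (all specific values are illustrative assumptions, not canonical answers):

```python
# Hypothetical scoping notes for the notification-system prompt.
# The category split (functional / non-functional / constraints) mirrors Step 1.
requirements = {
    "functional": [
        "send push notifications",
        "send email and SMS",            # confirmed during clarification
    ],
    "non_functional": {
        "delivery": "at-least-once, near-real-time",
        "availability": "99.9%",
        "scale": "50M users",            # assumed for illustration
    },
    "constraints": [
        "45-minute interview: depth over breadth",
        "GDPR applies to user contact data",
    ],
}

for category, items in requirements.items():
    print(f"{category}: {items}")
```

Writing the categories down as you ask keeps you from silently assuming scope, which is the failure mode Step 1 exists to prevent.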
⚠️ Common Mistake: Assuming the scope instead of asking. Many candidates hear "Design Instagram" and immediately begin designing photo upload pipelines, feeds, and recommendation systems, when the interviewer only wanted to explore the photo storage layer. Always confirm scope.
Step 2: Estimate Scale
Once you know what you are building and what constraints apply, you need to understand the size of the problem. Back-of-the-envelope estimation is the skill of producing approximate but directionally correct numbers using simple arithmetic and a handful of memorized constants.
Estimation serves two purposes. First, it grounds your architectural choices in reality; there is no point proposing a single-machine SQLite database for a system handling 10 million writes per day. Second, it demonstrates quantitative reasoning, which is a core competency of senior engineers.
The four dimensions you estimate in almost every system design problem are:
- Users: How many monthly active users (MAU) and daily active users (DAU)?
- Traffic: How many requests per second (RPS) at peak load?
- Storage: How many bytes do you need to store, and how fast does that grow?
- Bandwidth: How much data flows in and out per second?
Here is how this looks for a URL shortener with 100 million DAU and an assumption that each user creates one link per month and clicks ten links per day:
Write volume:  100M users × 1 URL/month / 30 days = ~3.3M new URLs/day
               3.3M / 86,400 sec = ~38 writes/second

Read volume:   100M users × 10 clicks = 1B redirects/day
               1B / 86,400 sec = ~11,574 reads/second
               -> round to ~12,000 RPS

Storage:       Each URL record ≈ 500 bytes
               3.3M URLs/day × 500 B = ~1.65 GB/day
               5-year retention = 1.65 × 365 × 5 ≈ 3 TB

Bandwidth:     12,000 reads/sec × 500 B = 6 MB/s outbound
These numbers immediately tell you something important: this system is read-heavy (reads outnumber writes by roughly 300:1). That single insight will shape every major architectural decision that follows: you will lean toward caching, read replicas, and CDN offloading rather than write-optimized storage engines.
💡 Pro Tip: You do not need exact numbers. You need order-of-magnitude accuracy. The difference between 10,000 RPS and 12,000 RPS is irrelevant. The difference between 100 RPS and 100,000 RPS is everything. Always round aggressively and state your assumptions aloud.
Below is a small Python snippet that captures the kind of mental model you can apply rapidly during estimation. You would not write code in the interview, but thinking through estimation programmatically helps you internalize the structure:
# Back-of-envelope estimation helper.
# These are the kinds of calculations you run mentally in an interview.

# Constants worth memorizing
SECONDS_PER_DAY = 86_400
KB = 1_024
MB = 1_024 * KB
GB = 1_024 * MB

def estimate_system(dau, reads_per_user_per_day, writes_per_user_per_month, record_size_bytes):
    """Produce a quick scale estimate for a read/write system."""
    # Traffic
    daily_reads = dau * reads_per_user_per_day
    read_rps = daily_reads / SECONDS_PER_DAY
    daily_writes = (dau * writes_per_user_per_month) / 30
    write_rps = daily_writes / SECONDS_PER_DAY

    # Storage (5-year projection)
    daily_storage_bytes = daily_writes * record_size_bytes
    five_year_storage_gb = (daily_storage_bytes * 365 * 5) / GB

    # Bandwidth
    outbound_mbps = (read_rps * record_size_bytes) / MB

    print(f"Read RPS:           {read_rps:,.0f}")
    print(f"Write RPS:          {write_rps:,.0f}")
    print(f"5-Year Storage:     {five_year_storage_gb:,.1f} GB")
    print(f"Outbound Bandwidth: {outbound_mbps:,.2f} MB/s")
    print(f"Read/Write Ratio:   {read_rps / write_rps:.0f}:1")

# URL shortener example
estimate_system(
    dau=100_000_000,
    reads_per_user_per_day=10,
    writes_per_user_per_month=1,
    record_size_bytes=500
)
Running this produces output consistent with the manual calculation above and makes the read-heavy nature explicit. That read/write ratio line is the single most important diagnostic output.
🤔 Did you know? Google engineers famously use a set of "numbers every engineer should know": latency figures for L1 cache reads, SSD random reads, network round trips, and disk seeks. Knowing that an SSD random read takes ~100 microseconds but a disk seek takes ~10 milliseconds gives you the intuition to choose storage technologies wisely without running benchmarks in the interview.
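As a rough sketch, those figures can be kept as order-of-magnitude constants (values approximate, from the widely circulated latency list; treat them as intuition aids, not benchmarks):

```python
# Approximate latency constants (nanoseconds); order-of-magnitude only.
LATENCY_NS = {
    "L1 cache reference":            0.5,
    "main memory reference":         100,
    "SSD random read":               100_000,      # ~100 microseconds
    "round trip in same datacenter": 500_000,      # ~0.5 ms
    "disk seek":                     10_000_000,   # ~10 ms
    "round trip US <-> Europe":      150_000_000,  # ~150 ms
}

ssd = LATENCY_NS["SSD random read"]
disk = LATENCY_NS["disk seek"]
print(f"A disk seek is ~{disk / ssd:.0f}x slower than an SSD random read")
```

That 100x gap between SSD and disk is the kind of ratio that should come to mind instantly when you are choosing a storage tier.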
Step 3: Define the API Contract
With requirements scoped and scale estimated, you are ready to define the API contract: the formal boundary between your system and the outside world. This step is frequently skipped by candidates, and that is a serious mistake. Defining the API before designing the internals forces clarity about what the system must actually do, expressed in unambiguous terms.
An API contract specifies three things for each operation:
- Inputs: what the caller provides (parameters, types, authentication context)
- Outputs: what the system returns (response shape, status codes, error formats)
- System boundaries: what this system is responsible for versus what it delegates to other services
For a URL shortener, the API contract might look like this:
POST /shorten
Input: { long_url: string, custom_alias?: string, expiry_days?: int }
Output: { short_code: string, short_url: string, expires_at?: ISO8601 }
Errors: 400 (invalid URL), 409 (alias already taken), 401 (unauthenticated)
GET /{short_code}
Input: short_code (path param), User-Agent (header)
Output: HTTP 301/302 redirect to long_url
Errors: 404 (code not found), 410 (expired)
GET /analytics/{short_code}
Input: short_code, date_range
Output: { clicks: int, unique_visitors: int, top_referrers: [] }
Errors: 404, 403 (not owner)
Notice what defining this contract reveals immediately: the GET /{short_code} endpoint needs to be extremely fast (it is on the critical path of every redirect), while GET /analytics can tolerate more latency since it is not user-blocking. That asymmetry directly influences your architecture: you will cache redirect lookups aggressively while analytics queries run against a slower, cheaper store.
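To make the asymmetry concrete, here is a minimal sketch of the cache-aside pattern on the redirect path, using a plain dict to stand in for Redis (names and data are illustrative):

```python
# A sketch of cache-aside on the redirect critical path.
# A plain dict stands in for Redis; `db` stands in for the SQL store.
cache = {}
db = {"abc123": "https://example.com/very/long/path"}

def resolve(short_code):
    # Critical path: check the cache first
    if short_code in cache:
        return cache[short_code]
    # Cache miss: fall back to the slower durable store, then populate
    long_url = db.get(short_code)
    if long_url is not None:
        cache[short_code] = long_url
    return long_url

print(resolve("abc123"))  # miss: hits the DB, populates the cache
print(resolve("abc123"))  # hit: served from the cache
```

The analytics path gets no such treatment; it can afford to query the slower store directly because no user is waiting on it.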
💡 Real-World Example: At companies like Stripe and Twilio, the API contract is often designed before implementation begins, as a living document that both client and server teams agree on. This "API-first" approach is exactly what you are mimicking in Step 3. It signals that you think like a platform engineer, not just an implementer.
Here is a more formal representation of the same contract using a Python dataclass pattern, which makes input/output types explicit:
from dataclasses import dataclass
from typing import Optional
from datetime import datetime

# --- Request models ---

@dataclass
class ShortenRequest:
    long_url: str                    # Must be a valid URL
    custom_alias: Optional[str]      # If None, system generates a code
    expiry_days: Optional[int]       # If None, link never expires
    user_id: str                     # Extracted from auth token, not user-supplied

@dataclass
class AnalyticsRequest:
    short_code: str
    start_date: datetime
    end_date: datetime
    requesting_user_id: str          # For ownership validation

# --- Response models ---

@dataclass
class ShortenResponse:
    short_code: str                  # e.g., "abc123"
    short_url: str                   # e.g., "https://sho.rt/abc123"
    expires_at: Optional[datetime]   # None if permanent

@dataclass
class AnalyticsResponse:
    short_code: str
    total_clicks: int
    unique_visitors: int
    top_referrers: list[str]         # Top 5 referring domains
    breakdown_by_day: list[dict]     # [{"date": "2024-01-01", "clicks": 42}]
This level of specificity takes only a few minutes to sketch in an interview, but it communicates volumes. It shows that you think about error cases, authentication context, and the difference between what callers supply and what the system derives internally.
⚠️ Common Mistake: Conflating the API layer with the internal architecture. Your API contract says what the system does from the outside. It says nothing about how the data flows internally. Candidates who jump from API to database schema without acknowledging this distinction often paint themselves into design corners.
🧠 Mnemonic: I.O.B. stands for Inputs, Outputs, Boundaries. For every API endpoint you define, answer those three letters before moving on.
Step 4: High-Level Architecture Overview
With requirements, scale numbers, and an API contract in place, you have earned the right to start drawing boxes. High-level architecture is the bird's-eye view of your system: the major components and how they connect, without yet diving into the internals of any one component.
At this stage, you are sketching a diagram that shows:
- Clients: who calls the system (web browsers, mobile apps, other services)
- Entry points: load balancers, API gateways, CDN edges
- Core services: the primary application logic components
- Data stores: what kind of storage each component uses and why
- Async infrastructure: message queues, event streams, background workers
+----------+      +--------------+      +-----------------+
|  Client  |----->| API Gateway  |----->|   App Service   |
| (Browser)|      | (Rate Limit) |      |   (Shortener)   |
+----------+      +--------------+      +--------+--------+
                                                 |
                   +------------------+----------+-------+
                   |                  |                  |
             +-----v-----+     +------v-------+    +-----v------+
             |   Cache   |     |   Primary    |    |  Message   |
             |  (Redis)  |     |   DB (SQL)   |    |   Queue    |
             +-----------+     +--------------+    +-----+------+
                                                         |
                                                  +------v------+
                                                  |  Analytics  |
                                                  |   Worker    |
                                                  +-------------+
The purpose of this diagram is not to be perfect; it is to establish a shared mental model with your interviewer before you start drilling into any one component. Think of it as the map you show before leading a hike. You would not start hiking without showing the trail first.
💡 Mental Model: Your high-level architecture is a promise to your interviewer. You are saying: "Here are all the moving parts I believe this system needs. I will now justify each one and explore the hard parts." Every box in your diagram should appear for a reason; if you cannot explain why a component exists, remove it.
π― Key Principle: At the high-level stage, technology choices are less important than structural choices. Whether you use PostgreSQL or MySQL matters far less than whether you need a relational database at all versus a document store or a wide-column store. Make structural decisions first; defer technology selections until you have deeper justification.
Because high-level architecture and bottleneck resolution each deserve deep treatment, they are explored fully in the child lessons that follow this one. For now, understand their role in the sequence: Step 4 gives you a canvas, and Step 5 identifies where that canvas needs reinforcement.
Step 5 β Identify and Resolve Bottlenecks
No first-draft architecture is correct. Step 5 is the iterative refinement pass where you stress-test your own design. You ask: Where will this system fail? Under what load does a component become the limiting factor? What happens when a service goes down?
Bottleneck identification is a structured scan across three failure dimensions:
- π Throughput bottlenecks β components that cannot handle the RPS you estimated in Step 2
- π§ Latency bottlenecks β components that add unacceptable delay to the critical path
- π― Single points of failure (SPOF) β components whose failure would bring down the entire system
For each bottleneck you identify, you propose a resolution: horizontal scaling, caching layers, database sharding, read replicas, circuit breakers, async processing, or geographic distribution. The resolution then becomes part of your refined architecture, which may surface new bottlenecks β hence the iterative nature.
π‘ Pro Tip: When you identify a bottleneck in an interview, name it explicitly before proposing the fix. Say: "I see a potential bottleneck here β the database is the single point of contention for all reads and writes. I would resolve this by introducing a read replica tier and a Redis cache in front of it." This narration proves you are reasoning, not just reciting patterns.
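Of the resolutions listed above, the circuit breaker is the least visual and the most often hand-waved. Here is a minimal sketch of the core state machine, assuming a simple trip-after-N-consecutive-failures policy; the thresholds are illustrative, not standard values:

```python
import time

class CircuitBreaker:
    """Trip after N consecutive failures, fail fast during a cooldown,
    then allow a trial call through (half-open) once the cooldown ends."""

    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None   # None = circuit closed (healthy)

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None   # cooldown over: half-open, try once
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0           # any success closes the circuit
        return result
```

Production resilience libraries add probation windows, metrics, and per-endpoint tuning, but this open/closed/half-open state machine is the pattern interviewers expect you to be able to describe.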
Like Step 4, bottleneck resolution is covered in dedicated depth in a later lesson. The key takeaway here is positional: Step 5 always comes after you have a complete picture, not before. Candidates who start optimizing before they have sketched the full system waste time solving problems that may not exist.
How the Five Steps Form a Cohesive Narrative
The power of this framework is not in any individual step β it is in how the steps chain together to produce a design narrative that feels inevitable and professional.
REQUIREMENTS → SCALE → API → ARCHITECTURE → BOTTLENECKS

Step           Asks             Answers
------------   --------------   --------------------------------------------------
Requirements   What?            "Are we building the right thing?"
Scale          How big?         "Do we need sharding?" "Do we need caching?"
API            What boundary?   "What exact contract must the system honor?"
Architecture   What parts?      "How do the parts fit together?"
Bottlenecks    What breaks?     "Is this design actually resilient? Can it scale?"
Notice how each step answers a distinct question and how the answer to each question constrains the next. Requirements constrain scale estimates (you cannot estimate traffic without knowing what operations users perform). Scale estimates constrain the API design (a 99th-percentile latency requirement influences how you structure synchronous vs. asynchronous endpoints). The API contract constrains the architecture (the data shapes in your responses influence your storage model). And the architecture constrains your bottleneck analysis (you can only identify bottlenecks in things that exist).
β Wrong thinking: "I'll figure out the requirements as I go β let me just start with the database schema."
β Correct thinking: "I need to know what I'm building, at what scale, with what contract, before I make any storage decisions."
π Quick Reference Card:
| π’ Step | π― Purpose | β±οΈ Interview Time | π€ Output |
|---|---|---|---|
| π§ 1. Clarify Requirements | Define what to build | 3β5 min | Functional + non-functional list |
| π 2. Estimate Scale | Size the system | 3β5 min | RPS, storage, bandwidth numbers |
| π§ 3. Define API | Set system boundary | 3β5 min | Endpoint definitions |
| π― 4. High-Level Arch | Sketch the solution | 5β10 min | Component diagram |
| π 5. Bottlenecks | Stress-test and refine | 5β10 min | Hardened architecture |
Applied consistently, this five-step process transforms a 45-minute interview into a structured engineering conversation. The interviewer is not watching you produce a perfect answer β they are watching you think. A repeatable process makes your thinking visible, and visible thinking is what wins system design interviews.
Clarifying Requirements and Scoping the Problem
Before you draw a single box, write a single service name, or mention a single database, you need to do something that separates senior engineers from junior ones in a system design interview: stop and ask questions. This phase β clarifying requirements and scoping the problem β is where interviews are quietly won or lost. A candidate who dives straight into architecture is essentially building a house without measuring the lot. A candidate who methodically uncovers what the system actually needs to do, and how well it needs to do it, signals that they understand engineering is fundamentally a problem of constraints.
This section gives you a battle-tested framework for doing exactly that.
Functional vs. Non-Functional Requirements: The Foundational Distinction
The first mental model you need is the distinction between two very different types of requirements. These are not interchangeable, and conflating them is one of the most common early mistakes candidates make.
Functional requirements describe what the system does β the specific behaviors and features users can observe and interact with. If you were designing Twitter, functional requirements might be: users can post tweets, users can follow other users, users can see a timeline of tweets from people they follow.
Non-functional requirements describe how well the system does it β the quality attributes that constrain your design choices. For the same Twitter system: the timeline must load in under 200ms, the system must handle 100 million daily active users, tweet delivery must be eventually consistent (not strongly consistent).
π― Key Principle: Functional requirements shape what you build. Non-functional requirements shape how you build it. Your architecture decisions β which databases you choose, whether you use synchronous or asynchronous communication, how you partition data β are almost entirely driven by non-functional requirements.
Think of it this way:
FUNCTIONAL NON-FUNCTIONAL
βββββββββββββββββ ββββββββββββββββββββββββββ
Users can upload photos Uploads must complete < 2s
Users can search posts Search latency < 100ms p99
Users can send DMs Messages delivered reliably
Admin can ban accounts System available 99.99% uptime
A helpful mnemonic for remembering the non-functional categories:
π§ Mnemonic: SCALPS β Scalability, Consistency, Availability, Latency, Performance, Security. Whenever you are probing for non-functional requirements, run through SCALPS mentally to make sure you haven't missed a dimension.
The Questions You Should Always Ask
Not every question is equally valuable in an interview context. You have limited time β usually 45 to 60 minutes total β so your questions need to be surgical. Below are the four most high-leverage question categories, each with example phrasings you can use verbatim.
Read/Write Ratio
The read/write ratio tells you whether your system is read-heavy, write-heavy, or balanced. This single number influences almost every major architectural decision: whether to use a relational or document database, whether to invest heavily in caching, whether to replicate data aggressively, and where to place your bottlenecks.
"Is this system predominantly read-heavy, write-heavy, or roughly balanced? For example, are we expecting 10 reads for every write, or something more like 100:1?"
A URL shortener like bit.ly is heavily read-skewed (people click links far more than they create them). A logging pipeline is heavily write-skewed. A collaborative document editor is closer to balanced. Each profile implies a different caching strategy, replication model, and database selection.
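Translating a stated ratio into concrete numbers takes seconds. A small helper makes the move explicit; the 10:1 classification cutoffs are an assumption for illustration, not an industry standard:

```python
def traffic_profile(write_qps, read_to_write_ratio):
    """Turn a write rate plus a read:write ratio into both rates and a label."""
    read_qps = write_qps * read_to_write_ratio
    if read_to_write_ratio > 10:
        profile = "read-heavy"       # e.g. a URL shortener
    elif read_to_write_ratio < 0.1:
        profile = "write-heavy"      # e.g. a logging pipeline
    else:
        profile = "balanced"         # e.g. a collaborative editor
    return read_qps, profile

print(traffic_profile(1_000, 100))  # (100000, 'read-heavy')
```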
Expected Scale
Scale encompasses both the volume of data and the volume of traffic. You need both dimensions because a system with 10 million users but only 1 KB of data per user is a completely different design problem from a system with 1 million users who each generate 10 GB of video.
"What's our expected number of daily active users? And for the core write operation β say, posting a message β what's our peak requests-per-second target? Are we expecting that to grow significantly in the next two years?"
π‘ Pro Tip: Interviewers often don't have a precise answer prepared. That's fine β push for an order of magnitude. "Millions or tens of millions?" is a perfectly reasonable follow-up. The goal is to get within one order of magnitude so your estimation work in the next phase is meaningful.
Consistency vs. Availability Trade-offs
This question probes the CAP theorem implications for your system. In a distributed system, when a network partition occurs, you must choose between maintaining consistency (every read sees the most recent write) or availability (every request gets a response, even if stale). Most real systems live on a spectrum between strong consistency and eventual consistency.
"If two users are looking at the same piece of data, is it acceptable for them to temporarily see different values? In other words, can we tolerate eventual consistency, or does this domain require strong consistency?"
For a bank balance, the answer is almost always strong consistency β showing a stale balance is dangerous. For a "likes count" on a social post, eventual consistency is perfectly fine. Getting this answer early prevents you from over-engineering consistency where it isn't needed, or dangerously under-engineering it where it is.
Geography
Geographic distribution affects latency, data sovereignty, and disaster recovery strategy. A system designed for users in a single country has radically different infrastructure requirements than one designed for a global audience.
"Are we serving a global user base, or are users concentrated in a specific region? Do we have data residency requirements β for example, EU user data must stay in Europe?"
This single question can introduce or eliminate multi-region replication, CDN requirements, and compliance architecture from your design.
Scoping Techniques: MVPs and the Parking Lot
Once you have a picture of what the system should do and how well it should do it, you face a second challenge: the problem is almost certainly too large to design end-to-end in 45 minutes. Scoping is how you manage that reality professionally.
MVP feature prioritization means identifying the minimum set of features that constitute the core system β the features without which the product is meaningless β and explicitly agreeing with your interviewer that you will design for those first. Everything else is a nice-to-have that you may revisit if time allows.
For a ride-sharing app like Uber, the MVP might be:
- Rider can request a ride
- Nearby driver receives the request and accepts
- Rider can see driver ETA
- Ride completes and rider is charged
Notably absent from the MVP: surge pricing, promotions, driver ratings, trip history, support chat, and carpooling. Those are real features β but they are not the core system.
The technique for managing the excluded features is the parking lot β an explicit, visible list of features you have acknowledged but chosen not to design in this session. Naming your parking lot does something powerful: it proves to the interviewer that you are aware the full system is larger than what you're designing. You're not ignorant of the complexity; you're managing it deliberately.
+----------------------------------------------------+
|                   SCOPE BOUNDARY                   |
|                                                    |
|  IN SCOPE (MVP)            PARKING LOT             |
|  --------------            -----------             |
|  • Request a ride          • Surge pricing         |
|  • Driver matching         • Driver ratings        |
|  • Real-time ETA           • Trip history API      |
|  • Payment processing      • Promotional codes     |
|                            • Carpooling / pooling  |
+----------------------------------------------------+
π‘ Real-World Example: In a Google interview for designing Google Maps, a strong candidate explicitly said: "I'm going to scope this to turn-by-turn navigation and ETA calculation. I'm parking real-time traffic updates, public transit routing, and offline maps β I'll come back to those if we have time." This took 30 seconds and immediately established credibility.
Capturing Requirements as a Structured Checklist
During the actual interview, you should be writing your requirements down as you discover them. Don't trust your working memory when you're nervous and time-pressured. A structured requirements checklist serves as your contract with the interviewer β both of you can see what you agreed to design, and you can refer back to it if the conversation drifts.
Here is a pseudoschema-style checklist format you can adapt and write on a whiteboard or shared document in real time:
## ============================================
## SYSTEM DESIGN REQUIREMENTS CHECKLIST
## System: URL Shortener (e.g., bit.ly)
## ============================================

FUNCTIONAL_REQUIREMENTS = [
    "POST /shorten → accepts long URL, returns short code",
    "GET /{code} → redirects to original long URL",
    "(Parking Lot) Analytics dashboard per short link",
    "(Parking Lot) Custom alias support",
    "(Parking Lot) Link expiration",
]

NON_FUNCTIONAL_REQUIREMENTS = {
    "scale": {
        "dau": "100M daily active users",
        "write_qps": "~1,000 new URLs shortened per second",
        "read_qps": "~100,000 redirects per second",  # 100:1 read:write
        "data_retention": "URLs stored indefinitely",
    },
    "latency": {
        "redirect_p99": "< 10ms",   # Users expect instant redirect
        "shorten_p99": "< 500ms",
    },
    "availability": "99.99% uptime (Four Nines)",
    "consistency": "Eventual consistency acceptable",
    "geography": "Global, CDN-friendly",
    "security": "No NSFW/malicious URL filtering in scope",
}
This code block is not meant to run β it's a structured notation for capturing requirements clearly during the interview. Using dictionary and list syntax makes the structure explicit and scannable. The interviewer can read it at a glance and correct any misunderstandings before you've spent 20 minutes designing the wrong system.
Here's a more general template you can memorize and adapt to any problem:
## ============================================
## GENERIC REQUIREMENTS CAPTURE TEMPLATE
## ============================================

class SystemDesignRequirements:
    # --- FUNCTIONAL (What it does) ---
    core_features = []   # Must-have for MVP
    parking_lot = []     # Acknowledged, out of scope today

    # --- NON-FUNCTIONAL (How well it does it) ---
    scale = {
        "users": None,        # DAU or MAU?
        "read_qps": None,     # Reads per second
        "write_qps": None,    # Writes per second
        "storage_gb": None,   # Total data volume estimate
    }
    latency = {}          # Per-operation SLA targets
    availability = None   # e.g., 99.9%, 99.99%
    consistency = None    # Strong / eventual / session
    geography = None      # Single region / multi-region / global
    compliance = []       # GDPR, HIPAA, SOC2, etc.
Note how the template includes compliance as a field. In real interviews, regulatory requirements are often forgotten entirely. A healthcare system subject to HIPAA has fundamentally different storage and access-logging requirements. Asking about compliance once β even if the answer is "none for this exercise" β demonstrates the kind of holistic thinking interviewers reward.
The Most Dangerous Mistake: Designing for Assumptions
Everything above is preparation against one catastrophic failure mode: designing for unstated assumptions instead of confirmed requirements.
Here is how this failure unfolds in practice. The interviewer says: "Design a notification system." The candidate thinks: This is like the push notification system at my last job. I'll design for mobile push at scale. They spend 20 minutes designing a sophisticated APNs/FCM pipeline. Then the interviewer says: "Actually, this is for email and SMS notifications for a banking app β mobile push isn't in scope." Twenty minutes. Gone.
β οΈ Common Mistake: Every detail you assume is a gamble you take with your limited interview time. Interviewers deliberately leave problem statements vague to test whether you'll ask clarifying questions or charge ahead blindly.
Here are three specific assumption traps to avoid:
β Wrong thinking: "It says 'messaging system,' so it's probably like WhatsApp β I'll design for real-time chat." β Correct thinking: "Before I start, can you tell me what kind of messages we're delivering? Real-time chat, async notifications, transactional emails, or something else?"
β Wrong thinking: "They didn't mention scale, so I'll just design for millions of users to be safe." β Correct thinking: "What's the expected user scale? I want to make sure I'm solving the right problem β a 10,000 user internal tool and a 100 million user consumer app require very different architectures."
β Wrong thinking: "Consistency is probably important, so I'll use a strongly consistent distributed transaction system." β Correct thinking: "Does this domain require strong consistency, or is eventual consistency acceptable? I want to confirm before I choose a storage model."
π€ Did you know? A study of technical interview feedback at major tech companies consistently lists "made too many assumptions" as one of the top reasons candidates fail system design rounds β not because their architecture was wrong, but because they were solving the wrong problem confidently.
Putting It All Together: A Requirements Discovery Dialogue
Let's walk through what a well-executed requirements phase sounds like as a continuous conversation. The interviewer has said: "Design Instagram."
Candidate: "Great. Before I start drawing anything, I'd like to spend a few minutes clarifying requirements. Is that okay?"
Interviewer: "Go ahead."
Candidate: "For core functionality β should I focus on the main user-facing features: posting photos, following users, and viewing a home feed? Or are there other features that are must-haves for this exercise?"
Interviewer: "Those three, plus photo upload, are the core."
Candidate: "Got it. I'll park stories, reels, DMs, explore page, and advertising β I can revisit those later. On scale: are we talking a startup-scale Instagram or production-scale, like a billion monthly active users?"
Interviewer: "Let's say 500 million DAU."
Candidate: "And is this a read-heavy system? I'd expect photo views and feed loads vastly outnumber new posts."
Interviewer: "Roughly 100:1 reads to writes, yes."
Candidate: "For the feed, is it acceptable for a user to see posts that are a few seconds stale, or do we need real-time consistency?"
Interviewer: "Eventual consistency is fine for the feed."
Candidate: "Global deployment? Are we serving users in Asia, Europe, and the Americas?"
Interviewer: "Yes, global."
Candidate: "Perfect. Let me write that down..."
In under two minutes, the candidate has confirmed: MVP scope, parking lot items, scale (500M DAU), read/write ratio (100:1), consistency model (eventual), and geography (global). That is a complete non-functional requirements profile. Everything that follows β estimation, API design, architecture β builds on solid, confirmed ground.
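Written down in the checklist notation from earlier, the confirmed profile fits in a few lines:

```python
# Everything below was confirmed explicitly by the interviewer in the dialogue.
INSTAGRAM_REQUIREMENTS = {
    "mvp":              ["post photos", "follow users", "home feed", "photo upload"],
    "parking_lot":      ["stories", "reels", "DMs", "explore page", "advertising"],
    "scale":            "500M DAU",
    "read_write_ratio": "100:1",
    "consistency":      "eventual (feed)",
    "geography":        "global",
}
```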
π Quick Reference Card: Requirements Clarification Checklist
| π§ Category | π― Question to Ask | π Why It Matters |
|---|---|---|
| π§ Core Features | What are the must-have behaviors? | Defines what you're designing |
| π ΏοΈ Scope | What's explicitly out of scope? | Prevents scope creep |
| π Scale | DAU, peak QPS, data volume? | Drives capacity and architecture |
| βοΈ Read/Write | What's the read-to-write ratio? | Cache strategy, DB selection |
| π Consistency | Strong or eventual? | Replication model, DB type |
| β Availability | What's the uptime SLA? | Redundancy and failover design |
| π Geography | Single region or global? | CDN, multi-region replication |
| π Compliance | GDPR, HIPAA, SOC2? | Storage, access logging, encryption |
The discipline of clarifying requirements before designing is ultimately a form of professional respect β for the interviewer's time, for the complexity of the problem, and for the users who will depend on the system you design. Every minute you spend confirming requirements in this phase saves ten minutes of redesigning architecture that solves the wrong problem. When you move into the next phase β estimation and API design β you will do so with a solid, confirmed foundation. That confidence is visible to interviewers, and it is exactly what senior-level engineering judgment looks like in practice.
Back-of-the-Envelope Estimation and API Contract Design
You have clarified the requirements and agreed on scope with your interviewer. Now what? Many candidates make the mistake of jumping straight to drawing boxes on the whiteboard β databases here, caches there, load balancers everywhere. The result looks impressive for about thirty seconds, until the interviewer asks: "How many requests per second are you designing for?" Silence. The architecture was never grounded in reality.
The two steps covered in this section β back-of-the-envelope estimation and API contract design β are the bridge between requirements and architecture. Estimation tells you how big the system needs to be. API design tells you what the system does from the outside. Together they give your subsequent architectural decisions a foundation that is both quantitative and contractual.
The Estimation Toolkit: Building Blocks Every Engineer Must Know
Good estimation is not guessing. It is structured reasoning with a small set of memorized building blocks that you combine quickly. There are three categories worth internalizing before any interview.
Powers of Two
Powers of two are the universal language of computer storage and throughput. You do not need a calculator if you have internalized the key reference points:
Value             Power of 2   Approximate size
--------------    ----------   ----------------
1,000             2^10         1 Kilobyte (KB)
1,000,000         2^20         1 Megabyte (MB)
1,000,000,000     2^30         1 Gigabyte (GB)
1 trillion        2^40         1 Terabyte (TB)
1 quadrillion     2^50         1 Petabyte (PB)
The practical trick: when multiplying or dividing by thousands during estimation, just shift between these named units. A system storing 1 KB per user record with 500 million users stores 500 million KB = 500 GB of data, no calculator required.
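The unit-shift trick is worth checking once in code so the mental move feels trustworthy:

```python
# Decimal units, the convention used for capacity estimation.
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12

users = 500_000_000
record_size = 1 * KB
total_bytes = users * record_size

print(total_bytes // GB)  # 500  -> "million" shifts KB up two units, to GB
```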
Latency Numbers Every Engineer Should Know
Latency numbers are the second pillar. These figures, popularized by Jeff Dean at Google, give you order-of-magnitude intuition for what operations cost:
Operation                            Approximate Latency
----------------------------------   -------------------
L1 cache reference                   ~0.5 ns
L2 cache reference                   ~7 ns
Main memory (RAM) reference          ~100 ns
SSD random read                      ~150 µs (150,000 ns)
Network round-trip in same DC        ~500 µs
HDD seek                             ~10 ms (10,000,000 ns)
Network round-trip coast-to-coast    ~150 ms
The ratios matter more than the exact numbers. RAM is roughly a thousand times faster than an SSD random read, and an SSD random read beats an HDD seek by well over an order of magnitude. A cross-datacenter network call costs as much as thousands of RAM accesses. These ratios directly inform technology choices: if you need sub-millisecond reads, you need an in-memory store, not a disk-based database.
π§ Mnemonic: "Memory is micro, disk is milli, cross-continent is centi." RAM operations finish in microseconds, disk in milliseconds, and cross-continental round trips in hundreds of milliseconds.
Traffic Math: Seconds in a Day
The third building block is a simple fact: there are approximately 86,400 seconds in a day, which engineers routinely round to 10^5 (100,000) seconds per day for easy mental arithmetic. This one number unlocks traffic calculations:
- 1 million requests/day Γ· 100,000 seconds β 10 QPS (queries per second)
- 100 million requests/day Γ· 100,000 seconds β 1,000 QPS
- 1 billion requests/day Γ· 100,000 seconds β 10,000 QPS
π― Key Principle: Peak traffic is typically 2β3Γ the average. If you estimate 1,000 average QPS, design for 2,000β3,000 peak QPS. Systems that only handle average load fail on Monday mornings and during marketing campaigns.
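Those two rules, divide by 10^5 seconds and then multiply by a peak factor, compress into a helper you can replay mentally:

```python
SECONDS_PER_DAY = 100_000  # ~86,400, rounded for mental arithmetic

def design_qps(requests_per_day, peak_factor=2):
    """Return (average QPS, the peak QPS you should actually design for)."""
    average = requests_per_day / SECONDS_PER_DAY
    return average, average * peak_factor

print(design_qps(100_000_000))  # (1000.0, 2000.0)
```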
Estimation Walkthrough: A URL Shortener Service
Let us apply these building blocks to a concrete example. Suppose you are designing a URL shortener service similar to bit.ly. Your requirements phase established: 100 million new URLs shortened per day, a 100:1 read-to-write ratio, and URLs stored for five years.
Step 1 β Estimate QPS
Write QPS (URL creation):
100 million writes/day Γ· 100,000 seconds/day = 1,000 writes/second
Read QPS (URL redirects), using the 100:1 ratio:
1,000 writes/second Γ 100 = 100,000 reads/second
Peak QPS (applying a 2Γ spike factor):
Peak writes: ~2,000/second
Peak reads: ~200,000/second
This immediately tells you something important: this is a read-heavy system by two orders of magnitude. Your architecture must optimize for reads, not writes. A caching layer becomes almost mandatory.
Step 2 β Estimate Storage
Assume each shortened URL record stores:
- Original long URL: ~500 bytes average
- Short URL key: ~7 bytes
- Metadata (timestamps, user ID): ~100 bytes
- Total per record: ~600 bytes
Daily new records: 100 million
Per-record size: 600 bytes
Daily storage: 100M Γ 600B = 60 GB/day
Five-year storage:
60 GB/day Γ 365 days Γ 5 years = ~109 TB
Rounded: roughly 110 TB of raw storage over five years. This number tells you that a single database server is not enough. You will need sharding or a distributed storage solution. It also tells you the data fits comfortably in cloud object storage if needed.
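That storage total converts directly into a node count once you assume a per-node budget. The 2 TB/node figure below is purely illustrative, not part of the estimate itself:

```python
import math

total_tb = 110        # five-year estimate from above
tb_per_node = 2       # assumed comfortable working set per shard (illustrative)

shards = math.ceil(total_tb / tb_per_node)
print(shards)  # 55 nodes, before any replication factor
```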
Step 3 β Estimate Bandwidth
Inbound bandwidth (URL creation requests):
1,000 writes/second Γ 600 bytes/request β 600 KB/s inbound
Outbound bandwidth (redirect responses): A redirect response is small β just an HTTP 301/302 with a Location header, roughly 500 bytes:
100,000 reads/second Γ 500 bytes = 50 MB/s outbound
This is a modest bandwidth number. A single 1 Gbps network link can handle it. But with 200,000 peak reads/second, you are looking at 100 MB/s peak β well within modern infrastructure but worth noting for CDN and caching strategy.
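The whole walkthrough fits in a dozen lines of arithmetic, which is a good way to sanity-check your whiteboard numbers while practicing:

```python
SECONDS_PER_DAY = 100_000             # rounded from 86,400

writes_per_day = 100_000_000          # 100M new URLs/day (from requirements)
read_write_ratio = 100
record_bytes = 600                    # 500 URL + 7 key + ~100 metadata
redirect_bytes = 500                  # small 301 response with Location header

write_qps = writes_per_day / SECONDS_PER_DAY          # writes per second
read_qps = write_qps * read_write_ratio               # redirects per second
daily_gb = writes_per_day * record_bytes / 10**9      # GB of new data per day
five_year_tb = daily_gb * 365 * 5 / 1_000             # total TB over 5 years
outbound_mb_s = read_qps * redirect_bytes / 10**6     # MB/s outbound

print(write_qps, read_qps, daily_gb, five_year_tb, outbound_mb_s)
# 1000.0 100000.0 60.0 109.5 50.0
```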
Translating Estimates into Design Constraints
Now comes the payoff. Your estimates are not just numbers β they are design constraints that directly drive technology choices:
Estimate                     →  Design Constraint
--------------------------      -----------------------------------------------
100,000 reads/second         →  Need caching layer (Redis/Memcached)
110 TB over 5 years          →  Need distributed DB or sharding strategy
1,000 writes/second          →  Single write primary is feasible initially
600 bytes/record             →  NoSQL key-value store is ideal (simple lookups)
2-orders read/write ratio    →  Read replicas are essential
β οΈ Common Mistake: Presenting estimates without connecting them to decisions. An interviewer watching you calculate 100,000 reads/second wants to hear you say: "This read volume is why I'm going to propose a caching layer in front of the database." Numbers without conclusions are just arithmetic.
π‘ Pro Tip: Write your estimates in a visible corner of the whiteboard and refer back to them when justifying architectural choices. This demonstrates that your design is driven by data, not intuition β a hallmark of senior engineering thinking.
Defining the API Contract
With your estimates on the board, you know the system's scale. Now you define what the system does β its API contract. This step happens before you touch internal architecture, and that ordering is intentional.
π― Key Principle: The API is the system's public promise. It defines what clients can depend on. Everything inside the system β databases, caches, queues β is an implementation detail that can change. The API, once published, cannot break backward compatibility without coordinating with every client.
REST vs RPC: Choosing Your Style
REST (Representational State Transfer) organizes the API around resources and uses HTTP verbs (GET, POST, PUT, DELETE) to express operations. It is the dominant style for public APIs and web services.
RPC (Remote Procedure Call), including gRPC and Thrift, organizes the API around actions rather than resources. It is common in internal microservice communication where performance and strong typing matter more than convention.
For a system design interview, REST is almost always the clearer choice to communicate intent. Define two to four endpoints that cover the core use cases from your requirements.
URL Shortener API Contract
Here is a complete REST API contract for the URL shortener:
## Create a shortened URL
POST /api/v1/urls
Content-Type: application/json
Authorization: Bearer <token>

Request body:
{
    "original_url": "https://www.example.com/very/long/path?with=params",
    "custom_alias": "my-link",    // optional
    "expires_at": "2026-01-01"    // optional, ISO 8601
}

Response 201 Created:
{
    "short_url": "https://short.ly/abc123",
    "short_code": "abc123",
    "original_url": "https://www.example.com/very/long/path?with=params",
    "created_at": "2024-01-15T10:30:00Z",
    "expires_at": "2026-01-01T00:00:00Z"
}

## Redirect a short URL (the core user-facing operation)
GET /{short_code}

Response 301 Moved Permanently:
Location: https://www.example.com/very/long/path?with=params

## Retrieve URL metadata (for analytics dashboard)
GET /api/v1/urls/{short_code}
Authorization: Bearer <token>

Response 200 OK:
{
    "short_code": "abc123",
    "original_url": "https://www.example.com/very/long/path?with=params",
    "click_count": 4821,
    "created_at": "2024-01-15T10:30:00Z",
    "expires_at": "2026-01-01T00:00:00Z"
}

## Delete a short URL
DELETE /api/v1/urls/{short_code}
Authorization: Bearer <token>

Response 204 No Content
Notice what this API definition accomplishes: it forces clarity on data types, authentication requirements, and edge cases (custom aliases, expiration) that were mentioned in requirements but could otherwise drift into the architecture phase as vague assumptions.
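To make the contract concrete, here is a minimal sketch of the redirect hot path, written as a plain function rather than a framework handler, with an in-memory dict standing in for the real store. All names are illustrative:

```python
# Hypothetical in-memory table standing in for the database/cache tier.
URLS = {"abc123": "https://www.example.com/very/long/path?with=params"}

def resolve(short_code):
    """GET /{short_code}: return (status, headers) per the contract above."""
    original = URLS.get(short_code)
    if original is None:
        return 404, {}
    return 301, {"Location": original}

print(resolve("abc123"))
# (301, {'Location': 'https://www.example.com/very/long/path?with=params'})
```

Note that the 301-vs-302 choice is a real design decision: a permanent 301 lets browsers and CDNs cache the redirect, which suits the read-heavy profile but hides repeat clicks from analytics, while a 302 keeps every click visible to the server.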
An RPC Alternative with gRPC
For an internal service β imagine this URL shortener is one microservice among many β you might define the contract using Protocol Buffers (the schema language for gRPC):
// url_shortener.proto
syntax = "proto3";

package urlshortener.v1;

// The URL Shortener service definition
service UrlShortenerService {
  // Create a new shortened URL
  rpc CreateShortUrl(CreateShortUrlRequest) returns (CreateShortUrlResponse);

  // Resolve a short code to the original URL (hot path, must be fast)
  rpc ResolveUrl(ResolveUrlRequest) returns (ResolveUrlResponse);

  // Get metadata and analytics for a URL
  rpc GetUrlMetadata(GetUrlMetadataRequest) returns (UrlMetadata);

  // Delete a shortened URL
  rpc DeleteUrl(DeleteUrlRequest) returns (DeleteUrlResponse);
}

message CreateShortUrlRequest {
  string original_url = 1;     // required
  string custom_alias = 2;     // optional, empty string = auto-generate
  int64 expires_at_unix = 3;   // optional, 0 = no expiration
  string user_id = 4;          // required for authentication
}

message CreateShortUrlResponse {
  string short_code = 1;
  string short_url = 2;        // full URL with domain
  int64 created_at_unix = 3;
}

message ResolveUrlRequest {
  string short_code = 1;
}

message ResolveUrlResponse {
  string original_url = 1;
  bool is_expired = 2;         // client handles expired gracefully
}
The gRPC definition makes the strong typing explicit and generates client/server code automatically. For interview purposes, either format works β what matters is that you define the contract before the architecture.
π‘ Real-World Example: Twitter's internal services use Thrift (a similar RPC framework). When Twitter's timeline service needs to call the tweet storage service, it does so through a strictly versioned Thrift contract. Changing that contract requires coordination across teams β exactly the discipline that API-first thinking enforces.
How the API Contract Prevents Scope Creep
Here is where the API definition step earns its place in the process beyond just being "good engineering practice."
In a system design interview, scope creep is one of the most dangerous failure modes. It happens when the conversation drifts: the interviewer mentions analytics, you start designing a real-time dashboard, and suddenly twenty minutes have passed and you have not touched the core URL shortening architecture.
A defined API contract is your scope anchor. Once you and the interviewer agree on three or four endpoints, you have implicitly agreed on the system boundary. When a new idea emerges, you can evaluate it explicitly: "That would require a new endpoint; should we add it to scope or note it as a future extension?" This keeps the conversation structured and signals that you understand product engineering, not just systems.
Without API contract          With API contract
--------------------          -----------------
Requirements                  Requirements
     |                             |
     v                             v
Architecture <-- drifting     API Contract (agreed)
     |           scope             |
     v                             v
Implementation                Architecture (bounded)
                                   |
                                   v
                              Implementation
⚠️ Common Mistake: Defining too many endpoints. You do not need to cover every conceivable operation. Define the core paths that fulfill your functional requirements. A URL shortener needs URL creation, redirection, and deletion, not seventeen analytics variants. Save depth for architecture.
Putting It Together: The Estimation-to-API Flow
These two steps work as a unified bridge. Here is how they connect in practice during an interview:
REQUIREMENTS PHASE
(previous step)
|
v
+-------------------------------+
| BACK-OF-THE-ENVELOPE |
| ESTIMATION |
| |
| 1. Calculate write QPS |
| 2. Calculate read QPS |
| 3. Estimate storage (5 yr) |
| 4. Estimate bandwidth |
| 5. Derive design constraints |
+-------------------------------+
|
| "Now I know the scale. Let me define
| what the system does externally."
v
+-------------------------------+
| API CONTRACT DESIGN |
| |
| 1. Choose REST or RPC |
| 2. Define 3-5 core endpoints |
| 3. Specify request/response |
| 4. Note auth requirements |
| 5. Agree with interviewer |
+-------------------------------+
|
| "Great. Now I'll design the
| internals to meet this contract
| at the estimated scale."
v
ARCHITECTURE PHASE
(next step)
When you walk into the architecture phase with both a quantitative scale target and a contractual boundary, every architectural decision you make can be justified in two ways: it handles the estimated load, and it supports the agreed-upon API. This dual justification is what separates candidates who receive senior engineer feedback from those who receive "good but unclear reasoning" feedback.
💡 Pro Tip: Before moving from API design to architecture, do a ten-second verbal summary: "So we're designing for 100,000 reads per second, 1,000 writes per second, roughly 110 TB over five years, with these four endpoints as the external contract. Let me now walk through the internal architecture." This resets both you and the interviewer and signals confident, organized thinking.
📋 Quick Reference Card: Estimation and API Design Checklist
| Step | What to Do | Watch Out For |
|---|---|---|
| 🔢 Daily volume | State write volume in requests/day | Don't skip the source assumption |
| ⚡ QPS | Divide by 100,000, then apply the read ratio | Forgetting the peak factor (2-3×) |
| 💾 Storage | Per-record size × volume × retention | Forgetting the replication factor (×3) |
| 🌐 Bandwidth | QPS × average request/response size | Modeling only the dominant direction is usually enough |
| 📌 Constraints | Map each number to a technology implication | Numbers without conclusions waste time |
| 🔌 API style | Pick REST for external, RPC for internal | Don't define more than 5 endpoints |
| 🔒 Auth | State the auth mechanism per endpoint | Easy to forget; looks incomplete |
| ✅ Agreement | Verbally confirm API scope with the interviewer | Unconfirmed scope leads to drift |
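The arithmetic behind this checklist is small enough to script. Here is a minimal sketch with illustrative assumptions of mine (1M writes/day, a 100:1 read ratio, 500-byte records); the function names are hypothetical, not from any library:

```python
# Back-of-the-envelope helpers mirroring the checklist above.
# Inputs (1M writes/day, 100:1 ratio, 500-byte records) are illustrative.

SECONDS_PER_DAY = 100_000  # ~86,400, rounded for mental math

def write_qps(daily_writes: int) -> float:
    return daily_writes / SECONDS_PER_DAY

def read_qps(w_qps: float, read_write_ratio: int) -> float:
    return w_qps * read_write_ratio

def peak_qps(avg_qps: float, peak_factor: float = 2.5) -> float:
    # Checklist: don't forget the 2-3x peak factor
    return avg_qps * peak_factor

def storage_bytes(w_qps: float, record_bytes: int, years: int,
                  replication: int = 3) -> float:
    # Checklist: per-record size x volume x retention, x3 for replication
    return w_qps * 86_400 * 365 * years * record_bytes * replication

w = write_qps(1_000_000)   # 10 writes/sec
r = read_qps(w, 100)       # 1,000 reads/sec
print(f"avg write QPS: {w:.0f}, peak read QPS: {peak_qps(r):.0f}")
print(f"5-year storage: {storage_bytes(w, 500, 5) / 1e12:.1f} TB")
```

Running through it aloud in an interview is the point; the code is just the same mental math made explicit.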
Estimation and API design are not bureaucratic checkboxes; they are the craft of thinking like a senior engineer before touching a design tool. Estimation grounds your decisions in mathematics. API design grounds your scope in a contract. Together they give your architecture a story that the interviewer can follow from first principles to final component. The next section will examine the most common process mistakes that derail candidates even when they understand these concepts individually.
Common Process Mistakes and How to Avoid Them
Even candidates who have studied system design deeply can stumble, not because they lack knowledge, but because they fall into predictable process-level traps. These mistakes are frustrating precisely because they are invisible to the candidate while they are happening. You are busy solving what feels like the right problem, narrating what feels like insightful thinking, and building what feels like an elegant system. But the interviewer, watching the process unfold, is seeing something very different.
This section is about developing the self-awareness to catch these errors in real time and correct them gracefully. We will cover the five most common process mistakes with concrete before-and-after examples, so you can recognize the pattern whether it shows up in a mock interview, a live session, or a system design practice problem at 11pm.
Mistake 1: Jumping Straight to Solutions ⚠️
The premature solutioning trap is the single most common process error in system design interviews. The interviewer says "Design Twitter" and within thirty seconds the candidate is drawing boxes labeled "Load Balancer → App Servers → Cassandra" on the whiteboard. It feels productive. It looks decisive. It is almost always wrong.
The problem is that "Design Twitter" is not a problem statement; it is a topic. The actual problem lives inside a cloud of ambiguity: Are we designing the tweet feed, the notification system, the search index, or all of it? Are we optimizing for reads or writes? Is this a greenfield design or a migration? How many users? Which users: global or regional?
❌ Wrong thinking: "I know what Twitter is. I'll just start designing and ask questions if something comes up."
✅ Correct thinking: "The word 'Twitter' contains dozens of possible design problems. My first job is to narrow the problem down to something I can actually solve in 45 minutes."
Before (the trap)
Imagine a candidate who immediately starts designing a global-scale, multi-region, eventually-consistent tweet delivery system, complete with Kafka partitions, a fan-out-on-write architecture, and CDN edge caching. The interviewer lets them go for eight minutes and then asks: "What's the expected number of writes per second?" The candidate pauses. They never asked. The interviewer probes further: "Is this for a startup MVP or Twitter at scale?" Another pause. The candidate has been solving a problem they invented.
After (the fix)
A disciplined candidate spends the first three to five minutes asking targeted clarifying questions:
- "Are we designing the core tweet posting and home timeline retrieval, or also search and notifications?"
- "What's the expected scale: millions of users or hundreds of millions?"
- "Is read latency the primary concern, or is write throughput?"
- "Do we need to handle media attachments, or is text-only sufficient for this scope?"
After gathering answers, they restate the scope out loud: "So I'll focus on posting tweets and retrieving a personalized home timeline for roughly 100 million daily active users, optimizing for read latency, with text content only. Does that sound right?"
This restatement does three things simultaneously: it confirms shared understanding, it signals structured thinking, and it gives the interviewer one last chance to redirect before you commit.
🎯 Key Principle: Requirements are not a formality you rush through. They are the foundation everything else rests on. A five-minute investment in requirements saves thirty minutes of designing the wrong system.
Mistake 2: Over-Engineering Before Validating the High-Level Approach ⚠️
Over-engineering means diving into implementation details (specific database schemas, cache eviction policies, exact replication factors) before you and the interviewer have agreed that your high-level architecture makes sense. It is a seductive mistake because the details feel concrete and demonstrable. But details built on an unvalidated foundation are details you may have to throw away.
Think of it as the difference between sketching a building's floor plan versus specifying the exact R-value of the wall insulation. The insulation spec is real work. It just shouldn't happen before the floor plan is approved.
DETAIL-FIRST PATH (risky)

[Prompt] --> [Schema Design] --> [Index Optimization] --> [Replication Config]
                                                                   |
                                                            "Wait, why are
                                                             we using SQL?"

HIGH-LEVEL-FIRST PATH (correct)

[Prompt] --> [Block Diagram] --> [Interviewer Validates] --> [Deep Dive]
                                            |
                                    "Yes, this approach
                                     makes sense. Go deeper."
The Two-Phase Design Discipline
Force yourself to work in two explicit phases. In Phase 1 (breadth), your goal is a simple block diagram that shows the major components and their relationships, nothing more. Resist the urge to label exact technologies. "Storage Layer" is fine. "PostgreSQL with a B-tree index on user_id" is premature.
In Phase 2 (depth), once the interviewer has acknowledged the shape of your architecture, you pick the two or three most interesting or challenging components and go deep. This is where specific technology choices, schema decisions, and performance tradeoffs belong.
💡 Pro Tip: After sketching your high-level diagram, explicitly pause and say: "Before I go deeper, does this overall structure make sense to you? Are there any components you'd like me to focus on first?" This one sentence prevents you from over-engineering the wrong component.
⚠️ Common Mistake: Candidates who spend the first twenty minutes designing a flawless database schema often run out of time before discussing the system's most interesting architectural challenges, which are usually exactly what the interviewer most wanted to evaluate.
Mistake 3: Staying Silent and Losing the Interviewer ⚠️
System design interviews are fundamentally collaborative. The interviewer is not watching you perform; they are trying to simulate working with you. When a candidate goes silent for two or three minutes while thinking, they break that simulation. The interviewer loses the thread of your reasoning. They cannot tell if you are thinking brilliantly or are completely stuck.
Narrating your thought process is not optional; it is a core evaluation criterion. Senior engineers think out loud. They say things like:
"I'm weighing two options here. A relational database gives us strong consistency but may bottleneck at this write volume. A wide-column store like Cassandra handles the write throughput but complicates our query patterns. Let me think through which tradeoff is worse for this use case..."
"I'm going to make an assumption here and flag it: I'll assume the fan-out service processes asynchronously. We can revisit that if the read latency requirement changes."
"I notice I'm about to get into cache invalidation details and we're fifteen minutes in. Let me finish the high-level picture first and come back to this."
Each of these narrations does double duty: it keeps the interviewer engaged AND it forces you to be more rigorous about your own thinking.
SILENCE PATTERN (problematic)
You: [drawing boxes silently for 3 minutes]
Interviewer: "What are you thinking?"
You: "Oh, just designing the storage layer."
Interviewer: [cannot evaluate reasoning quality]
NARRATION PATTERN (effective)
You: "I'm starting with the storage layer. The main tension here is
write throughput versus query flexibility. Given we said 50k
writes/second, I'm leaning away from a traditional relational
approach. Let me sketch two options..."
Interviewer: [observing structured tradeoff analysis in real time]
🧠 Mnemonic: TAR (Think Aloud, Reason, then Resolve). Never resolve without the first two.
💡 Mental Model: Treat the interview like pair programming. You are the driver. A driver who goes silent for five minutes while their navigator sits next to them is uncomfortable for everyone.
Mistake 4: Ignoring Non-Functional Requirements Until Too Late ⚠️
Non-functional requirements (NFRs), including availability, latency, consistency, durability, and throughput, are not decorations you add at the end of a design. They are constraints that determine which architectural choices are even valid. Ignoring them until the last five minutes is like designing a bridge and then asking "Oh, how much weight should it hold?" after the blueprint is finished.
The most common NFRs that candidates neglect:
| NFR | Why It Gets Ignored | Why It Matters |
|---|---|---|
| 🟢 Availability (99.9% vs 99.99%) | Seems like an ops concern | Determines redundancy strategy and cost |
| ⚡ Read/Write Latency (p99) | Hard to quantify early | Drives caching, CDN, and database choices |
| 📦 Durability | Often assumed | Determines replication factor and write-ahead logging |
| 🔄 Consistency model | Complex to discuss | Shapes everything from DB choice to API design |
| 📈 Scalability ceiling | Feels premature | Informs whether sharding is needed from day one |
The NFR Integration Pattern
The fix is to treat NFRs as first-class requirements during your clarification phase β not as afterthoughts. After establishing functional scope, explicitly ask:
- "What's the acceptable latency for a timeline read: under 100ms, or is 500ms acceptable?"
- "If the system is unavailable for 30 seconds, is that a business-critical incident or acceptable degradation?"
- "Do we prioritize consistency (everyone sees the same feed) or is eventual consistency acceptable for this use case?"
Then, and this is crucial, reference those NFRs when you make architectural decisions. Don't just say "I'll use Redis for caching." Say "Because we committed to sub-100ms read latency, I'll add a Redis cache in front of the database here." This shows the interviewer that your choices are driven by requirements, not technology preferences.
Here is a practical example using a rate limiter design. Notice how the NFR shapes the implementation decision:
# Rate limiter shaped by the NFR: "must handle 100k req/sec with < 5ms overhead"
# The NFR drives us away from a distributed lock approach (too slow)
# toward a sliding window implemented in Redis with a Lua script (atomic, fast)
import time

import redis

RATE_LIMIT_SCRIPT = """
-- Lua script executes atomically in Redis (no race conditions)
-- KEYS[1] = rate_limit_key
-- ARGV[1] = limit, ARGV[2] = window in ms, ARGV[3] = current timestamp in ms
local key = KEYS[1]
local limit = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local now = tonumber(ARGV[3])

-- Remove entries outside the sliding window
redis.call('ZREMRANGEBYSCORE', key, 0, now - window)

-- Count current entries in the window
local count = redis.call('ZCARD', key)
if count < limit then
    -- Under limit: record this request and allow it
    redis.call('ZADD', key, now, now .. math.random())
    redis.call('PEXPIRE', key, window)  -- window is in ms, so PEXPIRE, not EXPIRE
    return 1  -- allowed
else
    return 0  -- rejected
end
"""

class SlidingWindowRateLimiter:
    def __init__(self, redis_client: redis.Redis, limit: int, window_seconds: int):
        self.redis = redis_client
        self.limit = limit
        self.window = window_seconds
        # Pre-register the script so each check is a single round-trip
        self.script = self.redis.register_script(RATE_LIMIT_SCRIPT)

    def is_allowed(self, user_id: str) -> bool:
        key = f"rate_limit:{user_id}"
        now = int(time.time() * 1000)  # millisecond precision
        result = self.script(keys=[key], args=[self.limit, self.window * 1000, now])
        return result == 1
In an interview, you would say: "Our NFR was sub-5ms overhead at 100k requests per second. A distributed lock with a central counter would create a write bottleneck. Instead, I'm using a Redis sorted set with a Lua script: the atomicity of Lua means no race conditions, and Redis handles this throughput easily within our latency budget." Notice how the NFR justified the choice. This is the pattern interviewers reward.
Mistake 5: Treating the Process as Linear Instead of Iterative ⚠️
The five-step design framework introduced earlier in this lesson is sequential by default but iterative by necessity. Candidates who treat it as a strict one-way pipeline get into trouble the moment they discover, ten minutes into architecture design, that one of their earlier assumptions was wrong. They either ignore the contradiction (building a fragile design) or restart entirely (wasting time and signaling poor adaptability).
LINEAR MENTAL MODEL (fragile)

Requirements --> Estimates --> API --> Architecture --> Deep Dive
     |               |          |           |               |
  [done]          [done]     [done]      [done]          [done]

ITERATIVE MENTAL MODEL (robust)

Requirements <--> Estimates <--> API <--> Architecture <--> Deep Dive
     ^________________^___________^____________^________________^
                  Graceful back-references allowed
The skill is graceful backtracking: returning to an earlier step without losing momentum or confidence. The key is to make the revision explicit and frame it positively.
Before (clumsy backtracking)
"Oh wait, I forgot to ask: how many users do we have? Because I think my whole architecture might be wrong."
This signals that the earlier work was wasted and creates anxiety. The interviewer now wonders what else was forgotten.
After (graceful backtracking)
"As I'm thinking through the fan-out logic, I realize our earlier estimate of 50 million daily active users changes things significantly here. Let me flag a revision: if 1% of users are 'celebrities' with 10 million followers each, our fan-out-on-write approach creates a hot-write problem. I want to revise this component to use a hybrid model: fan-out-on-write for regular users, fan-out-on-read for high-follower accounts. Does that change align with what you had in mind?"
This version demonstrates:
- 🧠 Awareness that new information requires revisiting earlier decisions
- 🔍 Understanding of the actual architectural implication
- 🔧 Ability to revise precisely, not wholesale
- 🎯 Continued collaboration with the interviewer
Here is how a hybrid fan-out service might look in pseudocode, showing the iterative revision in action:
# Revised fan-out service: the result of iterative backtracking
# Original design: always fan-out-on-write
# Revised design: hybrid based on a follower count threshold

FAN_OUT_THRESHOLD = 10_000  # users with more followers use read-time fan-out

class HybridFanOutService:
    def __init__(self, timeline_cache, follower_service, message_queue):
        self.timeline_cache = timeline_cache
        self.follower_service = follower_service
        self.message_queue = message_queue

    def publish_tweet(self, author_id: str, tweet: dict) -> None:
        follower_count = self.follower_service.get_follower_count(author_id)
        if follower_count <= FAN_OUT_THRESHOLD:
            # Regular user: write tweet to each follower's timeline cache immediately
            # Acceptable because fan-out is bounded (max 10k writes)
            self._fan_out_on_write(author_id, tweet)
        else:
            # Celebrity user: store tweet only in the author's timeline
            # Followers pull it at read time, avoiding a thundering herd on write
            self._store_for_read_time_fan_out(author_id, tweet)

    def _fan_out_on_write(self, author_id: str, tweet: dict) -> None:
        follower_ids = self.follower_service.get_all_followers(author_id)
        for follower_id in follower_ids:
            # Push the tweet ID into each follower's pre-computed timeline cache
            self.timeline_cache.prepend(f"timeline:{follower_id}", tweet['id'])

    def _store_for_read_time_fan_out(self, author_id: str, tweet: dict) -> None:
        # Only store in the author's own tweet index;
        # timeline reads will merge this in at query time
        self.timeline_cache.prepend(f"tweets:{author_id}", tweet['id'])
        # Notify downstream systems that a high-follower tweet was posted
        self.message_queue.publish('celebrity_tweet', {'author_id': author_id, 'tweet_id': tweet['id']})
Pointing to this kind of code in an interview, you would say: "This is the revision I mentioned: the publish_tweet method now routes based on follower count. Regular users get synchronous write-time fan-out, which keeps reads simple. High-follower authors use read-time fan-out to prevent the write amplification problem we'd hit if someone like a celebrity tweeted to 50 million followers simultaneously."
💡 Real-World Example: Twitter's own engineering blog documented this exact evolution: they started with pure fan-out-on-write and had to revise to a hybrid model when celebrity accounts created write storms. Your iterative revision in the interview mirrors how real systems evolve.
Putting It All Together: The Self-Monitoring Checklist
The most effective way to avoid these mistakes is to build a self-monitoring habit: a brief internal checkpoint you run every ten minutes during the interview. Think of it as a background process that keeps your foreground work on track.
SELF-MONITORING LOOP (run every ~10 minutes)
+-----------------------------------------------+
| 1. Did I clarify before I drew?               |
|    -> If no: pause and clarify now            |
|                                               |
| 2. Am I in detail before high-level is done?  |
|    -> If yes: zoom out and finish the sketch  |
|                                               |
| 3. Have I spoken in the last 60 seconds?      |
|    -> If no: narrate my current thinking      |
|                                               |
| 4. Are my decisions tied to NFRs?             |
|    -> If no: reference the relevant NFR now   |
|                                               |
| 5. Did new info change earlier assumptions?   |
|    -> If yes: backtrack gracefully            |
+-----------------------------------------------+
📋 Quick Reference Card: The Five Process Mistakes
| | Mistake | Signal You're In It | Quick Fix |
|---|---|---|---|
| 🎯 | Premature solutioning | Drawing boxes before asking questions | Stop, restate scope, ask 3 clarifying questions |
| 🔧 | Over-engineering early | Explaining schema before the block diagram is done | Zoom out, finish high-level, then dive |
| 🔇 | Staying silent | Interviewer asks "What are you thinking?" | Narrate the tradeoff you are currently evaluating |
| ⚡ | Ignoring NFRs | Choices lack justification | Reference an NFR when stating each architectural decision |
| 🔁 | Linear process thinking | Ignoring contradictions rather than revising | Explicitly backtrack, framing the revision as a positive insight |
🤔 Did you know? Research on expert problem-solving consistently shows that the difference between novices and experts is not knowledge but metacognition: the ability to monitor and regulate your own thinking process in real time. The self-monitoring loop above is a structured form of engineering metacognition.
These five mistakes share a common root: they all happen when a candidate prioritizes appearing to make progress over actually making progress. The paradox of system design interviews is that the meta-skills (structuring your approach, narrating your thinking, validating before building) are more visible to the interviewer than the object-level architectural knowledge you are demonstrating. A candidate who designs a slightly suboptimal cache invalidation strategy while executing the process flawlessly will typically score higher than a candidate who knows every cache eviction policy by name but designs without asking a single clarifying question.
Process is not the scaffolding around your knowledge. It is the thing being evaluated.
Key Takeaways and Interview Process Cheat Sheet
You started this lesson not knowing why some candidates walk out of system design interviews feeling like they nailed it while others, equally talented engineers, leave feeling like they were improvising the whole time. Now you know the answer: process. A repeatable, structured process is what separates the candidate who designs systems confidently from the one who draws boxes and hopes for the best. This final section locks in everything you've learned, gives you a ready-to-use interview toolkit, and points the way forward into the deeper architectural territory ahead.
The Five-Step Framework: A Complete Recap
Every system design interview, regardless of the specific system being designed, can be navigated with the same five sequential steps. Think of these steps not as rigid checkboxes but as a cognitive scaffold: a mental structure that keeps you organized, communicative, and focused even under interview pressure.
Here is the complete framework in one place:
+---------------------------------------------------------+
|               THE FIVE-STEP DESIGN PROCESS              |
+---+-------------------------+---------------------------+
| # | Step                    | Core Question Answered    |
+---+-------------------------+---------------------------+
| 1 | Clarify Requirements    | What are we building?     |
| 2 | Estimate Scale          | How big is this system?   |
| 3 | Define API Contract     | How do components talk?   |
| 4 | High-Level Architecture | What are the major parts? |
| 5 | Identify Bottlenecks    | Where will it break?      |
+---+-------------------------+---------------------------+
Each step answers a fundamentally different question. Step 1 prevents you from solving the wrong problem. Step 2 grounds your design in mathematical reality. Step 3 forces precision about interfaces before you commit to implementation. Step 4 communicates the big picture. Step 5 demonstrates senior engineering instincts by proactively finding weaknesses.
🎯 Key Principle: The steps are sequential because each one provides the input for the next. Skipping steps doesn't save time; it creates confusion that costs far more time to unravel later.
The Interview Process Cheat Sheet
Print this, save it to your phone, or write it on a sticky note on your monitor. Use it before every practice session until the sequence becomes automatic.
📋 Quick Reference Card: System Design Interview Playbook
| ⏱️ Time | 📋 Step | 🎯 Goal | ✅ Done When... |
|---|---|---|---|
| 0-5 min | Clarify Requirements | Pin down functional + non-functional scope | You can restate the problem back to the interviewer in one sentence |
| 5-10 min | Estimate Scale | Size users, QPS, storage, bandwidth | You have concrete numbers driving your architecture decisions |
| 10-15 min | API Contract | Define key endpoints or interfaces | A junior engineer could implement the client from your spec |
| 15-30 min | High-Level Architecture | Draw the major components and data flows | The interviewer can trace a request from client to response |
| 30-45 min | Bottleneck Analysis | Identify and resolve weaknesses | You've addressed the top 2-3 failure points with concrete solutions |
🧠 Mnemonic: C-E-A-A-B (Clarify, Estimate, API, Architecture, Bottlenecks). Or remember the sentence: "Can Every Architect Always Build?"
💡 Pro Tip: The time allocations above are for a 45-minute interview. For a 60-minute interview, add roughly 5 minutes to the architecture and bottleneck phases. Never shrink the requirements or estimation phases; they protect everything that follows.
A Working Example: The Framework Applied to URL Shortener
To make the cheat sheet concrete, here's a compressed end-to-end walkthrough of designing a URL shortener using the exact framework. Notice how each step produces an artifact that feeds the next.
Step 1: Clarify (5 min). Functional: shorten URLs, redirect on access, optional custom aliases. Non-functional: 100M daily active users, <100ms redirect latency, 99.99% uptime. Out of scope: analytics dashboard, user authentication.
Step 2: Estimate (5 min):
Write QPS: 100M new URLs/day ÷ 86,400 sec ≈ 1,200 writes/sec
Read QPS: assume a 100:1 read/write ratio → 120,000 reads/sec
Storage: 1,200 writes/sec × 86,400 × 365 × 5 years × 500 bytes ≈ 95 TB
Bandwidth: 120,000 reads/sec × 500 bytes ≈ 60 MB/s outbound
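These Step 2 numbers are easy to double-check in code. A quick sanity check, using the same assumptions as the walkthrough (100M new URLs/day, 100:1 read ratio, 500-byte records, 5-year retention):

```python
# Reproducing the Step 2 estimates with the walkthrough's assumptions.

DAY = 86_400  # seconds

daily_writes = 100_000_000
write_qps = daily_writes / DAY                    # ~1,157, rounded up to 1,200
read_qps = 1_200 * 100                            # 120,000 reads/sec
storage_tb = 1_200 * DAY * 365 * 5 * 500 / 1e12  # five years of 500-byte records
bandwidth_mb_s = 120_000 * 500 / 1e6              # outbound MB/s

print(f"write QPS ~{write_qps:,.0f}, read QPS {read_qps:,}")
print(f"storage ~{storage_tb:.0f} TB, bandwidth {bandwidth_mb_s:.0f} MB/s")
```

Note that the storage figure excludes replication; multiplying by a replication factor of 3 would roughly triple it.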
Step 3: API Contract (5 min):
# POST /api/v1/shorten
# Creates a short URL alias for a long URL
# Request body:
{
  "long_url": "https://example.com/some/very/long/path",
  "custom_alias": "mylink",   # optional
  "expiry_days": 30           # optional, null = permanent
}

# Response (201 Created):
{
  "short_url": "https://short.ly/abc123",
  "alias": "abc123",
  "created_at": "2024-01-15T10:30:00Z",
  "expires_at": "2024-02-14T10:30:00Z"
}

# GET /{alias}
# Redirects to the original long URL
# Response: HTTP 301 (permanent) or 302 (temporary) redirect
# Location: https://example.com/some/very/long/path
Notice how the API contract makes a concrete decision (301 vs 302 redirect) that directly impacts caching behavior at the architecture level. This is the kind of precision that impresses interviewers.
Step 4: Architecture (15 min):
Client
  |
  v
[CDN / Edge Cache]  <-- caches 301 redirects for hot URLs
  |
  v
[Load Balancer]
  |
  +--> [Write Service] --> [ID Generator] --> [Primary DB]
  |          |                                    |
  |          +----------------------------------> [Cache (Redis)]
  |
  +--> [Redirect Service] --> [Cache (Redis)] --> [Read Replica DB]
Step 5: Bottlenecks (10 min):
- Hot URLs: the top 1% of URLs get 50% of traffic → solve with a Redis cache + CDN edge caching
- ID collision: two servers generate the same alias → solve with pre-generated ID ranges per server (ticket server pattern)
- DB write throughput: 1,200 writes/sec is manageable now, but sharding by alias hash prepares for 10x growth
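The ticket-server fix for the ID-collision bottleneck can be sketched in a few lines. This is an illustrative in-memory sketch with hypothetical class names; a real ticket server would persist its counter durably:

```python
import string

# Sketch of the "pre-generated ID ranges" fix for the collision bottleneck:
# a central ticket server hands each app server a disjoint block of integer
# IDs, and servers encode them as base62 aliases. Names are illustrative.

BASE62 = string.digits + string.ascii_lowercase + string.ascii_uppercase

def to_base62(n: int) -> str:
    if n == 0:
        return BASE62[0]
    out = []
    while n:
        n, rem = divmod(n, 62)
        out.append(BASE62[rem])
    return "".join(reversed(out))

class TicketServer:
    """Hands out non-overlapping ID ranges (would be durable in practice)."""
    def __init__(self, block_size: int = 1000):
        self.next_id = 0
        self.block_size = block_size

    def allocate_block(self) -> range:
        block = range(self.next_id, self.next_id + self.block_size)
        self.next_id += self.block_size
        return block

class WriteServer:
    """Consumes its own block, so no two servers can collide."""
    def __init__(self, ticket_server: TicketServer):
        self.ticket_server = ticket_server
        self.ids = iter(ticket_server.allocate_block())

    def next_alias(self) -> str:
        try:
            return to_base62(next(self.ids))
        except StopIteration:  # block exhausted: fetch a fresh one
            self.ids = iter(self.ticket_server.allocate_block())
            return to_base62(next(self.ids))

tickets = TicketServer()
a, b = WriteServer(tickets), WriteServer(tickets)
print(a.next_alias(), b.next_alias())  # different blocks, so no collision
```

In an interview, sketching only the allocation idea verbally is enough; the point is that disjoint ranges remove the need for any collision check on the write path.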
How Process Mastery Frees Up Cognitive Bandwidth
Here is one of the most important, and least discussed, benefits of having a memorized process: it offloads navigation to autopilot so your brain can focus on the interesting technical problems.
Think about what happens when you first learn to drive. You consciously think about every action: check mirrors, signal, check blind spot, turn the wheel a precise amount. It's exhausting. After years of driving, those actions become automatic, and you can hold a conversation, plan your route, and notice a child stepping off a curb, all simultaneously.
System design interviews work the same way. When you internalize the five steps, you stop burning mental energy on "what should I talk about next?" and start spending it on questions like:
- 🧠 "Should I use a relational or document database for this access pattern?"
- 🧠 "What consistency model do I need between these services?"
- 🧠 "Is eventual consistency acceptable here, or does this require strong consistency?"
Those are the questions that differentiate a good design from a great one. A candidate who is improvising their structure cannot think about those questions deeply β they're too busy worrying about whether they've forgotten something important.
❌ Wrong thinking: "I'll figure out the structure as I go and spend more time on the technical details."
✅ Correct thinking: "A locked-in process is what creates the space for technical depth."
💡 Mental Model: Think of the five-step framework as the operating system of your interview. Once the OS is running reliably in the background, you can open demanding applications (database selection, consistency tradeoffs, caching strategies) without crashing.
A Practical Code Template: Your Personal Design Notes Structure
Many candidates find it helpful to have a structured template they fill in during the interview. Here is a minimal working template you can adapt for any system:
## System Design: [System Name]
### 1. Requirements
**Functional:**
- [ ] Core feature 1
- [ ] Core feature 2
**Non-Functional:**
- [ ] Scale: _____ DAU
- [ ] Latency: p99 < _____ ms
- [ ] Availability: _____ % uptime
**Out of Scope:**
- Feature X (explicitly confirmed with interviewer)
### 2. Estimates
- Write QPS: _____ / sec
- Read QPS: _____ / sec (___:1 ratio)
- Storage: _____ GB/year
- Bandwidth: _____ MB/s
### 3. API Contract
- POST /resource → creates, returns {id, ...}
- GET /resource/:id → returns {data}
### 4. High-Level Architecture
[Sketch: Client → LB → Services → DB]
### 5. Bottlenecks
1. Bottleneck: _____ → Solution: _____
2. Bottleneck: _____ → Solution: _____
3. Bottleneck: _____ → Solution: _____
This isn't just a note-taking tool; it's a communication device. Narrate as you fill it in. The interviewer sees a structured thinker building toward a solution, not a candidate frantically drawing random components.
π‘ Pro Tip: In a virtual whiteboard interview (Miro, Excalidraw, etc.), create five labeled zones before you start talking. Just labeling the zones signals immediately that you have a process β and it gives the interviewer confidence before you've said a single technical word.
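To see how Step 2 of the template gets filled in, here is a back-of-envelope sketch in Python. Every input number (100M DAU, 1 write and 50 reads per user per day, ~1 KB per record) is an assumed value for a hypothetical service, not a figure from any real system:

```python
# Back-of-envelope estimation for a hypothetical service.
# All inputs below are assumed numbers for illustration only.

SECONDS_PER_DAY = 86_400

dau = 100_000_000          # assumed daily active users
writes_per_user = 1        # assumed writes per user per day
reads_per_user = 50        # assumed reads per user per day
record_size_bytes = 1_000  # assumed average record size (~1 KB)

# Spread the daily totals evenly across the day to get average QPS.
write_qps = dau * writes_per_user / SECONDS_PER_DAY
read_qps = dau * reads_per_user / SECONDS_PER_DAY
read_write_ratio = read_qps / write_qps

# Storage grows with writes; assume one year of retained records.
storage_gb_per_year = dau * writes_per_user * record_size_bytes * 365 / 1e9

print(f"Write QPS: ~{write_qps:,.0f}/sec")
print(f"Read QPS:  ~{read_qps:,.0f}/sec ({read_write_ratio:.0f}:1 ratio)")
print(f"Storage:   ~{storage_gb_per_year:,.0f} GB/year")
```

With these assumptions the script reports roughly 1,157 writes/sec, 57,870 reads/sec (a 50:1 ratio), and about 36,500 GB per year. In the interview you would do the same arithmetic on the whiteboard and round aggressively; the point is the order of magnitude, not the decimals.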
Bridge to Upcoming Lessons: What Comes Next
This lesson established the process layer: the *how* of approaching any system design interview. Everything that follows in this course is the content layer: the *what* you plug into that process.
Here is how the upcoming lessons map directly onto the framework you've just learned:
| 🗺️ Upcoming Lesson | 🔗 Framework Step It Serves | 💡 What You'll Be Able to Do |
|---|---|---|
| 🏗️ High-Level Architecture Patterns | Step 4: Architecture | Choose between monolith, microservices, and event-driven architectures with justification |
| 🔍 Bottleneck Deep-Dive: Databases | Step 5: Bottlenecks | Identify read/write hotspots and select sharding, replication, or caching strategies |
| ⚡ Caching Strategies | Step 5: Bottlenecks | Apply cache-aside, write-through, and write-behind patterns to the right scenarios |
| 📡 API Design Patterns | Step 3: API Contract | Design REST, GraphQL, and gRPC interfaces with real tradeoff analysis |
| 📈 Scaling Strategies | Steps 2 + 5 | Connect your estimation numbers to concrete horizontal/vertical scaling decisions |
🎯 Key Principle: The process is your skeleton; the upcoming lessons are your muscle and organs. Neither works without the other. A candidate who knows every caching strategy but has no process will still fumble. A candidate with perfect process but thin technical knowledge will plateau. You need both, and now you have the foundation.
The Daily Practice Drill
Knowledge without repetition doesn't become skill. Here is the single most effective practice habit for internalizing the five-step framework:
Every day, pick one system you already use and run it through all five steps in 30 minutes.
The system should be familiar enough that you don't get stuck on domain knowledge; the goal is to practice the process, not research a new domain. Good candidate systems:
- 📱 Instagram's photo feed
- 💬 WhatsApp message delivery
- 🔍 Google autocomplete
- 🎵 Spotify's play queue
- 📦 Amazon's order status tracker
Here is a sample timer for your daily drill:
⏱️ 0:00–5:00: Requirements (set a 5-min timer, stop when it rings)
⏱️ 5:00–10:00: Estimates (calculate on paper, show your work)
⏱️ 10:00–15:00: API contract (write 2–3 endpoints with request/response)
⏱️ 15:00–25:00: Architecture diagram (whiteboard or notebook)
⏱️ 25:00–30:00: Bottleneck identification (name 3, solve 2)
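If you'd rather not juggle five separate timers, the schedule above can be scripted. The sketch below is an illustration, not an official tool from this course; the phase names and durations simply mirror the drill schedule:

```python
import time

# Drill phases as (name, duration in minutes), mirroring the
# 30-minute daily schedule from the lesson.
PHASES = [
    ("Requirements", 5),
    ("Estimates", 5),
    ("API contract", 5),
    ("Architecture diagram", 10),
    ("Bottleneck identification", 5),
]

def run_drill(minutes_to_seconds=60):
    """Announce each phase, then block until that phase's time is up."""
    elapsed = 0
    for name, minutes in PHASES:
        print(f"[{elapsed:02d}:00] Start: {name} ({minutes} min)")
        time.sleep(minutes * minutes_to_seconds)
        elapsed += minutes
    print(f"[{elapsed:02d}:00] Drill complete.")
```

Call `run_drill()` for a real 30-minute session; passing `minutes_to_seconds=0` gives an instant dry run of the announcements so you can check the schedule.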
After two weeks of daily drills, the sequence will be as automatic as typing, and your interview anxiety will drop significantly because you'll know exactly what you're going to do from the first second.
⚠️ Critical Point: Don't skip the timer. The time pressure is the point. In a real interview, you will feel compelled to keep talking about requirements forever, or to jump straight to architecture because it feels more impressive. The timer trains you to move deliberately.
💡 Real-World Example: Competitive chess players don't just study openings; they practice them under time pressure until the moves are tactile memory. The time pressure in their training is exactly what makes their thinking automatic during a match. Treat your daily system design drill the same way.
What You Now Know That You Didn't Before
Let's make the learning gain explicit. Before this lesson, a typical developer asked to "design Twitter" might:
- Jump immediately to drawing boxes
- Spend 20 minutes on database schema before knowing the scale
- Never define what "Twitter" means in this context (timeline? DMs? trending topics?)
- Arrive at bottlenecks only if time remains, which it usually doesn't
After this lesson, you know:
- 🧠 Improvising signals junior thinking. Interviewers evaluate process explicitly, not just technical output.
- 🎯 The five steps are sequential for a reason. Each step's output is the next step's input.
- 📋 Requirements protect you from scope creep. Explicit out-of-scope items are as valuable as in-scope ones.
- 🔢 Estimation grounds your architecture. Decisions without numbers are opinions; decisions with numbers are engineering.
- 📡 API contracts force interface precision. Writing endpoints before components prevents premature implementation details.
- 🔥 Bottleneck analysis is where you demonstrate seniority. It shows you build for failure, not just for the happy path.
⚠️ Final Critical Point to Remember: The framework does not guarantee the right answer; no framework does. What it guarantees is that you will work systematically toward a well-reasoned answer while demonstrating exactly the communication and problem-decomposition skills that senior engineering roles require. Interviewers are not expecting you to produce a production-grade architecture in 45 minutes. They are watching how you think.
Summary Table: Core Concepts at a Glance
📋 Quick Reference Card: Everything You Learned
| 📘 Concept | 💬 One-Line Definition | ⚠️ If You Skip It... |
|---|---|---|
| 🎯 Structured Process | A repeatable five-step sequence for any system design problem | You improvise, panic, and miss critical design dimensions |
| 📋 Requirements Clarification | Pinning down what's in scope before designing anything | You build the wrong system confidently |
| 🔢 Back-of-Envelope Estimation | Quantifying scale to drive architecture choices | Your architecture is untethered from reality |
| 📡 API Contract | Defining interfaces before implementing components | Your components are tightly coupled and ambiguous |
| 🏗️ High-Level Architecture | The major components and how data flows between them | The interviewer can't follow your design narrative |
| 🔥 Bottleneck Analysis | Proactively identifying and resolving failure points | You present a naive design with no depth |
| 🧠 Cognitive Bandwidth | Mental capacity freed up by automating process navigation | Technical depth crowds out structured thinking |
Three Practical Next Steps
1. Run the daily 30-minute drill starting today. Choose a familiar app. Set five timers. Go through all five steps without skipping. The first session will feel awkward; that discomfort is learning.
2. Watch one recorded system design interview (mock interviews by Google engineers, ex-FAANG engineers on YouTube, and similar channels) and evaluate the candidate using the five-step checklist. Notice which steps they skip, where they lose structure, and how the interviewer reacts.
3. Continue to the next lesson on High-Level Architecture Patterns with this question in mind: "For which types of requirements and scale estimates does each architecture pattern become the right choice?" That framing, connecting patterns back to the process, is how you build an integrated mental model rather than isolated knowledge.
🤔 Did you know? Research on expert performance consistently shows that structured problem-solving frameworks improve performance under stress more than raw knowledge does. Surgeons, pilots, and software engineers all benefit from checklists not because they don't know what to do, but because checklists prevent the cognitive shortcuts that emerge under pressure from causing errors. Your five-step framework is exactly this kind of professional-grade checklist.
You now have the process. Everything ahead is about filling that process with the technical depth it deserves.