
Compute & Runtime Architecture

Master EC2, containers, serverless computing, and event-driven architectures for scalable applications

AWS Compute & Runtime Architecture

Master AWS compute services with free flashcards and proven study techniques. This lesson covers EC2 instance types, Lambda serverless architecture, container orchestration with ECS and EKS, and compute selection strategies: essential concepts for building scalable cloud applications and passing AWS certification exams.

Welcome to AWS Compute Services 💻

Compute is the foundation of cloud infrastructure. AWS offers multiple compute services, each optimized for different workload patterns. Understanding when to use virtual machines, containers, or serverless functions is critical for building cost-effective, performant applications.

In this lesson, you'll learn:

  • EC2 fundamentals and instance family selection
  • Lambda serverless architecture patterns
  • Container services (ECS, EKS, Fargate)
  • Compute optimization strategies for different workloads

💡 Pro tip: Think of compute services as different types of restaurants: EC2 is like owning a full kitchen (maximum control), containers are like meal prep kits (balanced control and convenience), and Lambda is like ordering delivery (pay only for what you consume).


Core Concepts: The AWS Compute Spectrum

🖥️ EC2: Elastic Compute Cloud

EC2 provides resizable virtual machines in the cloud. You have complete control over the operating system, networking, and storage configuration.

Instance Families (named with pattern: Family + Generation + Size)

| Family | Optimized For | Use Cases | Example Type |
|---|---|---|---|
| T (T2, T3, T3a) | Burstable performance | Web servers, dev environments | t3.medium |
| M (M5, M6i) | General purpose (balanced) | Application servers, databases | m5.large |
| C (C5, C6g) | Compute-optimized | Batch processing, gaming servers | c5.xlarge |
| R (R5, R6g) | Memory-optimized | In-memory caches, real-time analytics | r5.2xlarge |
| I (I3, I3en) | Storage-optimized | NoSQL databases, data warehouses | i3.xlarge |
| P (P3, P4d) | GPU instances | Machine learning, rendering | p3.2xlarge |

🧠 Memory Aid: "TMC RI P" = Teeny burstable, Medium balanced, Compute heavy, RAM heavy, IOPS heavy, Parallel processing
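
The naming pattern above (Family + Generation + Size) can be taken apart mechanically. Here is a small sketch of a parser for instance type names; the field names in the returned dict are my own, not AWS terminology:

```python
import re

def parse_instance_type(name):
    """Split an EC2 instance type name (e.g. 'r5.2xlarge') into its parts:
    family letter(s), generation number, attribute suffix, and size."""
    family_part, size = name.split(".")
    match = re.fullmatch(r"([a-z]+)(\d+)([a-z]*)", family_part)
    if match is None:
        raise ValueError(f"unrecognized instance type: {name}")
    return {
        "family": match.group(1),      # t, m, c, r, i, p, ...
        "generation": int(match.group(2)),
        "attributes": match.group(3),  # e.g. 'g' = Graviton, 'a' = AMD
        "size": size,
    }

print(parse_instance_type("c5.xlarge"))
print(parse_instance_type("r6g.2xlarge"))
```

Note how the suffix letters after the generation number (the `g` in `r6g`, the `a` in `t3a`) encode processor variants.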

EC2 Pricing Models:

From most to least expensive:

  💰 On-Demand
     Pay per hour/second, no commitment
     $$$$ (most expensive)
        ↓
  🎟️ Reserved Instances (1-3 years)
     Up to 75% discount, steady-state workloads
     $$$
        ↓
  🏷️ Savings Plans
     Flexible across instance families, commit to $/hour usage
     $$
        ↓
  🎯 Spot Instances
     Up to 90% discount, can be interrupted
     $ (cheapest)

⚠️ Common Mistake: Using On-Demand pricing for predictable, long-running workloads. Reserved Instances or Savings Plans can save 40-75% for stable production workloads.
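
The discount math is simple enough to sanity-check in a few lines. This sketch uses an assumed illustrative On-Demand rate of $0.10/hour (not a current AWS price) and the discount ceilings mentioned above:

```python
HOURS_PER_MONTH = 730  # average hours in a month

def monthly_cost(on_demand_hourly, discount=0.0, hours=HOURS_PER_MONTH):
    """Monthly cost given an On-Demand hourly rate and a fractional discount."""
    return on_demand_hourly * hours * (1 - discount)

on_demand = 0.10  # assumed $/hour, for illustration only
print(f"On-Demand:          ${monthly_cost(on_demand):.2f}/month")
print(f"Reserved (75% off): ${monthly_cost(on_demand, 0.75):.2f}/month")
print(f"Spot (90% off):     ${monthly_cost(on_demand, 0.90):.2f}/month")
```

For a steady 24/7 workload, the gap compounds month after month, which is why the pricing model matters more than small instance-size tweaks.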

⚡ Lambda: Serverless Compute

AWS Lambda runs code without provisioning servers. You pay only for compute time consumed, billed in 1 ms increments.

Lambda Architecture Pattern:

  📡 Event Source
  (API Gateway, S3, DynamoDB, etc.)
         ↓
  🔔 Event Trigger
         ↓
  ⚙️ Lambda Service
     Cold start? Yes → initialize new environment (1-5 seconds)
                 No  → reuse warm environment (milliseconds)
         ↓
  Execute code (max 15 minutes)
         ↓
  📤 Return Response
         ↓
  📊 CloudWatch Logs

Lambda Limits (Critical for Architecture Decisions):

| Resource | Default Limit | Can Increase? |
|---|---|---|
| Execution timeout | 15 minutes max | No |
| Memory | 128 MB - 10,240 MB | No |
| Deployment package | 50 MB (zipped), 250 MB (unzipped) | No |
| Concurrent executions | 1,000 per region | Yes (request increase) |
| Environment variables | 4 KB total | No |
| /tmp storage | 512 MB - 10,240 MB | No |

💡 When NOT to use Lambda:

  • Long-running processes (> 15 minutes)
  • Stateful applications requiring persistent connections
  • High-performance computing needing low-latency responses
  • Workloads with constant, predictable traffic (EC2 may be cheaper)
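
The last bullet (constant traffic favoring EC2) can be made concrete with Lambda's published pricing of roughly $0.0000166667 per GB-second plus $0.20 per million requests (x86, at the time of writing; treat the exact rates as an assumption):

```python
GB_SECOND_RATE = 0.0000166667    # Lambda duration price, $/GB-second (assumed)
REQUEST_RATE = 0.20 / 1_000_000  # $ per invocation

def lambda_monthly_cost(invocations, avg_ms, memory_mb):
    """Approximate monthly Lambda bill, ignoring the free tier."""
    gb_seconds = invocations * (avg_ms / 1000) * (memory_mb / 1024)
    return gb_seconds * GB_SECOND_RATE + invocations * REQUEST_RATE

# Spiky API: 1M requests/month, 100 ms at 512 MB -> about a dollar
print(f"${lambda_monthly_cost(1_000_000, 100, 512):.2f}/month")

# Constant heavy load: 100M requests/month -> over $100, where a small
# always-on instance would likely be cheaper
print(f"${lambda_monthly_cost(100_000_000, 100, 512):.2f}/month")
```

The crossover point depends on memory size and duration, but the shape is always the same: Lambda wins at low or bursty volume, fixed instances win at sustained high volume.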

🐳 Container Services: ECS, EKS, and Fargate

AWS offers three primary container orchestration services:

Container Service Comparison:

"I need containers"
        ↓
Need Kubernetes (multi-cloud, existing K8s expertise)?
  • Yes → EKS
  • No, simpler AWS-native orchestration → ECS

Then, for either ECS or EKS, choose who manages the servers (the launch type):
  • AWS manages them → Fargate (serverless)
  • I manage them → EC2 (self-managed)

| Service | What It Is | Best For | Control Level |
|---|---|---|---|
| ECS | AWS-native container orchestration | AWS-centric architectures, simpler setup | High |
| EKS | Managed Kubernetes | Multi-cloud, Kubernetes expertise, complex workloads | Highest |
| Fargate | Serverless compute for containers | No server management, pay-per-use | Medium |

ECS Task Definition Example (Simplified):

{
  "family": "web-app",
  "containerDefinitions": [
    {
      "name": "nginx",
      "image": "nginx:latest",
      "memory": 512,
      "cpu": 256,
      "portMappings": [
        {
          "containerPort": 80,
          "protocol": "tcp"
        }
      ],
      "environment": [
        {"name": "ENV", "value": "production"}
      ]
    }
  ],
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "256",
  "memory": "512"
}

🔧 Try this: The requiresCompatibilities field determines whether the task runs on Fargate (serverless) or EC2 (self-managed). Changing this one field shifts your entire infrastructure model.
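
One gotcha with Fargate is that the task-level cpu and memory values must be one of a fixed set of combinations. The table below reflects the documented pairings at the time of writing (treat it as an assumption to verify against current docs); a small validator makes the constraint explicit:

```python
# Valid Fargate task-level CPU (units) -> memory (MB) combinations,
# as documented at the time of writing (verify against current AWS docs).
FARGATE_COMBOS = {
    256:  [512, 1024, 2048],
    512:  list(range(1024, 4097, 1024)),
    1024: list(range(2048, 8193, 1024)),
    2048: list(range(4096, 16385, 1024)),
    4096: list(range(8192, 30721, 1024)),
}

def is_valid_fargate_size(cpu, memory):
    """Check whether a task-level cpu/memory pair is schedulable on Fargate."""
    return memory in FARGATE_COMBOS.get(cpu, [])

print(is_valid_fargate_size(256, 512))   # the example task definition above
print(is_valid_fargate_size(256, 4096))  # too much memory for 0.25 vCPU
```

Registering a task definition with an invalid pairing fails at registration time, so checking early saves a deploy cycle.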

🎯 Fargate: Serverless Containers

Fargate is a launch type for ECS and EKS that removes server management entirely.

Traditional ECS on EC2 vs. Fargate:

ECS on EC2 (you manage servers):

  You provision: EC2 instances
        ↓
  You install:   ECS agent
        ↓
  You manage:    OS patches, scaling
        ↓
  ECS schedules: containers on your instances

Fargate (AWS manages servers):

  You define:    task definition (CPU, memory)
        ↓
  AWS provides:  compute capacity automatically
        ↓
  You pay:       per vCPU/GB per second

  No servers, no patches, no capacity planning.

Fargate Pricing Model:

  • vCPU: $0.04048 per vCPU per hour
  • Memory: $0.004445 per GB per hour
  • Billed per second, 1-minute minimum

💰 Cost Comparison Example: Running a 0.25 vCPU, 0.5 GB task for 30 days:

  • Fargate: ~$9/month
  • t3.micro EC2 (assuming 50% utilization): ~$7.50/month
  • But EC2 carries management overhead: patching, monitoring, capacity planning
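
The ~$9 figure falls straight out of the per-resource rates listed above; a quick sketch of the arithmetic:

```python
VCPU_HOURLY = 0.04048  # Fargate vCPU price from above, $/vCPU-hour
GB_HOURLY = 0.004445   # Fargate memory price from above, $/GB-hour
HOURS = 730            # roughly 30 days

def fargate_monthly(vcpu, gb, hours=HOURS):
    """Monthly cost of one always-on Fargate task of the given size."""
    return (vcpu * VCPU_HOURLY + gb * GB_HOURLY) * hours

print(f"0.25 vCPU / 0.5 GB: ${fargate_monthly(0.25, 0.5):.2f}/month")
```

The same function also shows why over-provisioning memory is expensive: doubling GB adds cost linearly for every hour the task runs.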

🏗️ Compute Selection Framework

Choosing the right compute service depends on multiple factors:

| Factor | EC2 | Lambda | ECS/EKS | Fargate |
|---|---|---|---|---|
| Control | Full OS access | Code only | Container + orchestration | Container only |
| Scaling | Manual/Auto Scaling Groups | Automatic | Service-based | Automatic |
| Management | You patch/maintain | AWS manages all | You manage instances | AWS manages compute |
| Pricing | Per hour/second | Per ms of execution | Per instance hour | Per vCPU/GB second |
| Cold Start | Minutes (boot) | 100ms-5s | Seconds (container) | Seconds (container) |
| State | Persistent | Ephemeral | Flexible | Ephemeral |

Decision Matrix:

  🌐 Web application (steady traffic)
     → EC2 with Auto Scaling
     → ECS on EC2 (containerized)

  📊 API with variable traffic
     → Lambda (event-driven)
     → ECS with Fargate (containerized)

  🎮 Real-time multiplayer game server
     → EC2 (persistent connections)
     → ECS on EC2 (for multi-region)

  📸 Image processing (triggered by uploads)
     → Lambda (< 15 min per image)
     → EC2 Spot (batch, cost-sensitive)

  🤖 Machine learning training
     → EC2 P3/P4 instances (GPU)
     → EKS with GPU nodes (distributed)

  🗄️ Database server
     → EC2 (use RDS instead if possible)
     → Never Lambda (stateful, long connections)

  🔄 Microservices architecture
     → ECS Fargate (managed scaling)
     → EKS (if Kubernetes required)
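
The mapping above can be encoded as a first-pass decision helper. This is a toy sketch of the lesson's heuristics, not a rulebook; real selections weigh latency, cost, and team skills too, and the function name is my own:

```python
def suggest_compute(event_driven, max_runtime_min, needs_kubernetes,
                    containerized, persistent_state):
    """Toy encoding of the workload-to-compute mapping above."""
    if persistent_state:
        return "EC2"           # databases, game servers, long connections
    if needs_kubernetes:
        return "EKS"
    if event_driven and max_runtime_min <= 15 and not containerized:
        return "Lambda"        # image processing, queue consumers
    if containerized:
        return "ECS on Fargate"
    return "EC2 with Auto Scaling"

print(suggest_compute(True, 1, False, False, False))    # image resizing
print(suggest_compute(False, 60, False, True, False))   # microservice
```

Ordering matters: state rules out serverless first, then the Kubernetes requirement overrides everything else.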

Real-World Examples

Example 1: E-commerce Website Architecture 🛒

Scenario: You're building an online store with:

  • Web frontend (React SPA)
  • REST API backend (Node.js)
  • Order processing (batch jobs)
  • Product image resizing

Optimal Compute Strategy:

## CloudFormation/Architecture Snippet (Conceptual)
Components:
  StaticWebsite:
    Service: S3 + CloudFront
    Reason: No compute needed for static assets
  
  APIBackend:
    Service: ECS Fargate
    Configuration:
      - CPU: 0.5 vCPU
      - Memory: 1 GB
      - Auto Scaling: 2-10 tasks based on CPU
    Reason: Containerized, variable traffic, no server management
  
  OrderProcessing:
    Service: Lambda
    Trigger: SQS queue
    Configuration:
      - Memory: 512 MB
      - Timeout: 5 minutes
      - Reserved Concurrency: 10
    Reason: Event-driven, short processing time, pay-per-execution
  
  ImageResizing:
    Service: Lambda
    Trigger: S3 upload event
    Configuration:
      - Memory: 2048 MB (more memory = more CPU)
      - Timeout: 1 minute
      - Concurrent executions: 100
    Reason: Event-driven, parallel processing, scales automatically

Architecture Flow:

  👤 User Browser
        ↓
  📦 CloudFront (CDN)
        ├──→ S3 (static files)
        └──→ API Gateway
               ↓
        ⚙️ ECS Fargate Cluster (API containers)
               ├──→ RDS (product data)
               └──→ SQS Queue (orders)
                      ↓
                ⚡ Lambda Function (process orders)
                      ↓
                📧 SNS (notifications)

  Separate flow:
  📷 Image Upload → S3 → ⚡ Lambda (resize) → S3

Why This Mix?

  • Fargate for API: Handles HTTP requests efficiently, scales based on traffic
  • Lambda for processing: Decouples order processing, scales per message
  • Lambda for images: Processes in parallel, only pays when images uploaded

💰 Cost Estimation (10,000 API requests/day, 100 orders/day, 50 image uploads/day):

  • Fargate: ~$30/month (2 tasks running 24/7)
  • Lambda (orders): ~$0.20/month (≈3,000 short invocations; request charges are negligible, duration dominates)
  • Lambda (images): ~$0.50/month (higher memory, more CPU time)
  • Total compute: ~$31/month

Example 2: Video Processing Pipeline 🎬

Scenario: Users upload videos that need transcoding to multiple resolutions.

Challenge: Video transcoding can take 10-60 minutes per video, exceeding Lambda's 15-minute limit.

Solution Architecture:

## Step Functions workflow definition (conceptual)
import json

def video_processing_pipeline():
    """
    Step Functions state machine for video processing
    """
    workflow = {
        "StartAt": "ValidateVideo",
        "States": {
            "ValidateVideo": {
                "Type": "Task",
                "Resource": "arn:aws:lambda:...:function:ValidateVideo",
                "Next": "IsValidVideo",
                "TimeoutSeconds": 60
            },
            "IsValidVideo": {
                "Type": "Choice",
                "Choices": [
                    {
                        "Variable": "$.valid",
                        "BooleanEquals": True,
                        "Next": "StartECSTask"
                    }
                ],
                "Default": "VideoRejected"
            },
            "StartECSTask": {
                "Type": "Task",
                "Resource": "arn:aws:states:::ecs:runTask.sync",
                "Parameters": {
                    "LaunchType": "FARGATE",
                    "Cluster": "video-processing-cluster",
                    "TaskDefinition": "ffmpeg-transcoder"
                },
                "Next": "NotifyComplete",
                "TimeoutSeconds": 3600
            },
            "NotifyComplete": {
                "Type": "Task",
                "Resource": "arn:aws:lambda:...:function:SendNotification",
                "End": True
            },
            "VideoRejected": {
                "Type": "Fail",
                "Cause": "Invalid video format"
            }
        }
    }
    return workflow

Component Breakdown:

| Step | Service | Why |
|---|---|---|
| 1. Upload detection | S3 Event → Lambda | Trigger workflow on new uploads |
| 2. Video validation | Lambda (30s) | Quick check of format/codec |
| 3. Transcoding | ECS Fargate Task | No time limit, can run 1+ hours |
| 4. Notification | Lambda (5s) | Quick email/SMS via SNS |
| Orchestration | Step Functions | Coordinates multi-step workflow |

ECS Task for Transcoding:

## Dockerfile for video transcoding container
FROM jrottenberg/ffmpeg:4.4-alpine

## Install AWS CLI and Python
RUN apk add --no-cache python3 py3-pip && \
    pip3 install boto3

## Copy transcoding script
COPY transcode.py /app/

ENTRYPOINT ["python3", "/app/transcode.py"]

## transcode.py - runs in ECS container
import boto3
import subprocess
import os

s3 = boto3.client('s3')

def transcode_video(input_bucket, input_key, output_bucket):
    # Download from S3
    local_input = '/tmp/input.mp4'
    s3.download_file(input_bucket, input_key, local_input)
    
    # Transcode to multiple resolutions (one ffmpeg scale filter per target)
    scales = {'1080p': '1920:1080', '720p': '1280:720', '480p': '854:480'}
    
    for res, scale in scales.items():
        output_file = f'/tmp/output_{res}.mp4'
        
        # Use ffmpeg to transcode
        cmd = ['ffmpeg', '-i', local_input, '-vf', f'scale={scale}',
               '-c:v', 'libx264', '-crf', '23', output_file]
        subprocess.run(cmd, check=True)
        
        # Upload to S3
        output_key = f'processed/{res}/{input_key}'
        s3.upload_file(output_file, output_bucket, output_key)
        
        os.remove(output_file)
    
    os.remove(local_input)
    return {'status': 'success', 'resolutions': list(scales)}

if __name__ == '__main__':
    # Get parameters from environment (set by ECS task)
    input_bucket = os.environ['INPUT_BUCKET']
    input_key = os.environ['INPUT_KEY']
    output_bucket = os.environ['OUTPUT_BUCKET']
    
    result = transcode_video(input_bucket, input_key, output_bucket)
    print(result)

Why This Works:

  • ⏱️ No time limits on ECS Fargate tasks
  • 💰 Pay only when transcoding (task runs then stops)
  • 📈 Scales automatically based on queue depth
  • 🔄 Step Functions coordinates the workflow

Example 3: Auto Scaling Web Application 📊

Scenario: News website with traffic spikes during breaking news.

Requirements:

  • Handle 1,000 req/sec during normal hours
  • Scale to 50,000 req/sec during news events
  • Minimize cost during low traffic

Solution: EC2 Auto Scaling with Mixed Instance Types

## Auto Scaling Group configuration (conceptual, not literal CloudFormation)
AutoScalingGroup:
  MixedInstancesPolicy:
    LaunchTemplate:
      Overrides:                    # candidate instance types
        - InstanceType: t3.medium   # Burstable for baseline
        - InstanceType: m5.large    # General purpose for peaks
        - InstanceType: c5.large    # Compute-optimized for high load
    InstancesDistribution:
      OnDemandBaseCapacity: 2
      OnDemandPercentageAboveBaseCapacity: 20
      SpotAllocationStrategy: capacity-optimized

  Scaling:
    MinSize: 2
    MaxSize: 50
    DesiredCapacity: 2

    TargetTrackingPolicies:
      - MetricType: ALBRequestCountPerTarget
        TargetValue: 1000  # requests per instance

      - MetricType: CPUUtilization
        TargetValue: 70

Cost Optimization Strategy:

 Low traffic (2-5 instances):
   2× On-Demand t3.medium (baseline)      💰 $$$
   0-3× Spot instances (opportunistic)    💰 $

 Medium traffic (6-20 instances):
   2× On-Demand t3.medium (baseline)      💰 $$$
   1× On-Demand m5.large (stable)         💰 $$$
   3-17× Spot (bulk capacity)             💰 $

 High traffic (21-50 instances):
   2× On-Demand t3.medium                 💰 $$$
   4× On-Demand m5/c5 (20% of extra)      💰 $$$
   15-44× Spot (80% of scaling)           💰 $

 Spot instances = up to 90% cost savings.
 Diversified pools reduce interruption risk.

Scaling Behavior:

## Capacity calculation behind the CloudWatch Alarm → Auto Scaling policy

def calculate_required_capacity(current_rps, avg_rps_per_instance=1000):
    """
    Determine number of instances needed
    """
    required_instances = current_rps / avg_rps_per_instance
    
    # Add 20% buffer for headroom
    buffered_instances = int(required_instances * 1.2)
    
    return max(2, buffered_instances)  # Minimum 2 instances

## Example scaling events
events = [
    {"time": "09:00", "rps": 800, "instances": 2},
    {"time": "12:00", "rps": 5000, "instances": 6},
    {"time": "14:30", "rps": 45000, "instances": 54},  # Breaking news!
    {"time": "16:00", "rps": 8000, "instances": 10},
    {"time": "22:00", "rps": 1200, "instances": 2}
]

for event in events:
    required = calculate_required_capacity(event['rps'])
    print(f"{event['time']}: {event['rps']} req/s → {required} instances")

💡 Key Insight: Using 80% Spot instances can reduce overall compute costs by 60-70%, while the 20% On-Demand base ensures availability if Spot capacity is interrupted.
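
That blended-savings claim is easy to check. The sketch below assumes a $0.0416/hour On-Demand rate for illustration and an 80% Spot discount (actual Spot prices fluctuate):

```python
def blended_hourly(total_instances, on_demand_rate, spot_discount,
                   on_demand_fraction=0.20):
    """Hourly fleet cost: a fixed On-Demand share plus discounted Spot."""
    on_demand = round(total_instances * on_demand_fraction)
    spot = total_instances - on_demand
    return (on_demand * on_demand_rate
            + spot * on_demand_rate * (1 - spot_discount))

full = 50 * 0.0416  # 50 instances, all On-Demand (assumed rate)
mixed = blended_hourly(50, 0.0416, spot_discount=0.80)
print(f"fleet savings: {1 - mixed / full:.0%}")  # → fleet savings: 64%
```

With a 70% Spot discount instead, the blended savings drops to the mid-50s, which is why the lesson quotes a 60-70% range rather than a single number.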

Example 4: Machine Learning Inference Pipeline 🤖

Scenario: Image classification service that receives photos and returns detected objects.

Requirements:

  • Inference time: 200-500ms per image
  • Variable traffic: 10-10,000 images/hour
  • Need GPU for fast inference

Solution Comparison:

| Approach | Service | Cost/Performance | Complexity |
|---|---|---|---|
| Option A: Lambda | Lambda + SageMaker Endpoint | $$ / Good for variable load | Low |
| Option B: ECS | ECS Fargate + CPU inference | $$$ / Slower inference | Medium |
| Option C: EC2 | EC2 P3 instances + Auto Scaling | $$$$ / Fastest, expensive idle | High |
| Option D: Hybrid | Lambda → SageMaker Serverless | $ / Best for sporadic use | Low |

Recommended: Hybrid Approach with SageMaker Serverless

## Lambda function for inference
import json
import boto3
import base64

sagemaker_runtime = boto3.client('sagemaker-runtime')

def lambda_handler(event, context):
    """
    Receive image, invoke SageMaker endpoint, return predictions
    """
    # Extract base64-encoded image from API Gateway
    image_data = base64.b64decode(event['body'])
    
    # Invoke SageMaker Serverless endpoint
    response = sagemaker_runtime.invoke_endpoint(
        EndpointName='image-classifier-serverless',
        ContentType='application/x-image',
        Body=image_data
    )
    
    # Parse predictions
    predictions = json.loads(response['Body'].read().decode())
    
    # Return top 3 predictions
    top_predictions = sorted(
        predictions, 
        key=lambda x: x['score'], 
        reverse=True
    )[:3]
    
    return {
        'statusCode': 200,
        'body': json.dumps({
            'predictions': top_predictions
        })
    }

SageMaker Serverless Inference Configuration:

## SageMaker endpoint configuration
import sagemaker
from sagemaker.serverless import ServerlessInferenceConfig

serverless_config = ServerlessInferenceConfig(
    memory_size_in_mb=4096,  # 1024-6144 MB, in 1 GB increments
    max_concurrency=10       # Max concurrent requests
)

predictor = model.deploy(
    serverless_inference_config=serverless_config,
    endpoint_name='image-classifier-serverless'
)

Why This Architecture?

  • 💰 Cost: Pay only for inference time (per second), not idle GPU time
  • ⚡ Performance: SageMaker handles model loading, scales automatically
  • 🔄 Simplicity: Lambda + managed endpoint, no infrastructure
  • 📊 Scaling: Handles variable traffic without pre-provisioning

Cost Comparison (1,000 inferences/day):

  • Lambda + SageMaker Serverless: ~$15/month
  • EC2 P3 instance (24/7): ~$2,200/month
  • Savings: ~99%

Common Mistakes to Avoid ⚠️

Mistake 1: Using Lambda for Long-Running Tasks

โŒ Wrong:

## Lambda function that processes large videos
def lambda_handler(event, context):
    video_url = event['video_url']
    download_video(video_url)  # 5 minutes
    transcode_video()          # 20 minutes ← FAILS! Max 15 min
    upload_result()
    return {'status': 'success'}

✅ Correct:

## Lambda triggers ECS Fargate task for long processing
import boto3

ecs = boto3.client('ecs')

def lambda_handler(event, context):
    video_url = event['video_url']
    
    # Start ECS task (no time limit)
    response = ecs.run_task(
        cluster='video-processing',
        taskDefinition='video-transcoder',
        launchType='FARGATE',
        overrides={
            'containerOverrides': [{
                'name': 'transcoder',
                'environment': [
                    {'name': 'VIDEO_URL', 'value': video_url}
                ]
            }]
        }
    )
    
    return {'status': 'task_started', 'taskArn': response['tasks'][0]['taskArn']}

Mistake 2: Ignoring Cold Starts

โŒ Problem: Lambda cold starts causing 2-5 second delays for latency-sensitive APIs.

✅ Solutions:

## Option 1: Provisioned Concurrency (keeps functions warm)
import boto3

lambda_client = boto3.client('lambda')

## Reserve 5 always-warm execution environments on a published alias
lambda_client.put_provisioned_concurrency_config(
    FunctionName='latency-sensitive-api',
    ProvisionedConcurrentExecutions=5,
    Qualifier='prod'  # must be a function version or alias, not $LATEST
)

## Option 2: Use Lambda SnapStart (Java only)
## Snapshots initialized function state for instant startup

## Option 3: For <100ms latency requirement, use:
## - ALB → ECS Fargate (persistent containers)
## - API Gateway → EC2 instances (always running)

Mistake 3: Wrong EC2 Instance Family

โŒ Wrong: Using compute-optimized (C5) instances for memory-intensive caching

Instance TypevCPURAMCost/hourRAM per $
c5.2xlarge816 GB$0.3447 GB/$
r5.2xlarge864 GB$0.50128 GB/$

โœ… Correct: For Redis/Memcached, use R5 (memory-optimized) for 2.7ร— more RAM per dollar.
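
The 2.7× figure comes directly from the table above; a two-line check of the GB-per-dollar ratio:

```python
def gb_per_dollar(ram_gb, hourly_rate):
    """How much RAM one dollar of hourly spend buys on an instance type."""
    return ram_gb / hourly_rate

c5 = gb_per_dollar(16, 0.34)  # c5.2xlarge, prices from the table above
r5 = gb_per_dollar(64, 0.50)  # r5.2xlarge
print(f"R5 advantage: {r5 / c5:.1f}x")  # → R5 advantage: 2.7x
```

The same ratio check works for any workload-specific resource: compute per dollar for batch jobs, IOPS per dollar for storage-heavy databases.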

Mistake 4: Not Using Spot Instances for Fault-Tolerant Workloads

โŒ Wrong: Running batch jobs on On-Demand instances

## Batch processing job - NOT cost-optimized
BatchJob:
  InstanceType: m5.4xlarge
  PricingModel: OnDemand
  Cost: $0.768/hour
  Annual: $6,730 (if running 24/7)

✅ Correct: Use Spot instances for up to a 90% discount

## AWS Batch job definition with Spot
import boto3

batch = boto3.client('batch')

compute_environment = {
    'type': 'MANAGED',
    'computeResources': {
        'type': 'SPOT',
        'allocationStrategy': 'SPOT_CAPACITY_OPTIMIZED',
        'instanceTypes': ['m5.4xlarge', 'm5a.4xlarge', 'm5n.4xlarge'],
        'minvCpus': 0,
        'maxvCpus': 256,
        'spotIamFleetRole': 'arn:aws:iam::...:role/SpotFleetRole',
        'bidPercentage': 100  # Pay up to 100% of On-Demand price
    }
}

## Cost: ~$0.08/hour (spot price varies)
## Annual savings: ~$6,000 (89% reduction)

Mistake 5: Over-Provisioning Fargate Memory

โŒ Wrong: Setting memory to 4 GB when container uses 512 MB

{
  "family": "api-server",
  "cpu": "1024",
  "memory": "4096",
  "containerDefinitions": [{
    "name": "app",
    "memory": 512
  }]
}

Cost: 1 vCPU ($0.04048/hr) + 4 GB ($0.01778/hr) = $0.05826/hr ≈ $1.40/day

✅ Correct: Right-size memory to actual usage

{
  "family": "api-server",
  "cpu": "1024",
  "memory": "1024",
  "containerDefinitions": [{
    "name": "app",
    "memory": 512
  }]
}

Cost: 1 vCPU ($0.04048/hr) + 1 GB ($0.00445/hr) = $0.04493/hr ≈ $1.08/day (about 23% savings)

💡 Monitoring Tip: Use Container Insights to track actual memory usage and right-size accordingly.


Key Takeaways 🎯

📋 Quick Reference Card: AWS Compute Services

| Service | Use When | Avoid When | Pricing |
|---|---|---|---|
| EC2 | Full OS control needed; persistent state; specialized software | Variable traffic; event-driven; want zero ops | $/hour |
| Lambda | Event-driven; < 15 min execution; variable load | Long-running tasks; persistent state; low latency critical | $/ms |
| ECS | Containerized apps; AWS-native stack; microservices | Need Kubernetes; multi-cloud required | Fargate: $/vCPU-sec; EC2: $/hour |
| EKS | Kubernetes required; multi-cloud; complex orchestration | Simple workloads; small teams; avoiding K8s complexity | $0.10/hr + compute |
| Fargate | Containers without servers; variable load; ops simplicity | Need EC2 features; cost-sensitive 24/7 workloads | $/vCPU-sec + $/GB-sec |

🧠 Memory Devices:

  • ELEEFF = EC2, Lambda, ECS, EKS, Fargate, Flexibility
  • "The Compute Spectrum": Full Control (EC2) → Container Control (ECS/EKS) → No Infrastructure Control (Lambda)

💰 Cost Optimization Rules:

  1. Reserved Instances for predictable workloads (40-75% savings)
  2. Spot Instances for fault-tolerant jobs (70-90% savings)
  3. Right-size instance types (match CPU/memory to workload)
  4. Use Fargate for variable container workloads
  5. Use Lambda for event-driven, short-duration tasks

⚡ Performance Guidelines:

  • Latency < 100ms: EC2 or persistent containers (avoid Lambda cold starts)
  • Throughput-focused: Use compute-optimized (C5/C6) instances
  • Memory-intensive: Use memory-optimized (R5/R6) instances
  • GPU workloads: P3/P4 instances or SageMaker

🔒 Security Best Practices:

  • Always use IAM roles for EC2/ECS (never hardcode credentials)
  • Enable VPC Flow Logs for network monitoring
  • Use Security Groups as stateful firewalls
  • Encrypt data at rest (EBS volumes) and in transit (TLS)
  • Patch EC2 instances regularly (use Systems Manager Patch Manager)

📚 Further Study

  1. AWS Official Documentation: EC2 Instance Types - Comprehensive guide to all instance families
  2. AWS Compute Blog: AWS Compute Blog - Real-world architecture patterns and best practices
  3. AWS Well-Architected Framework: Compute Best Practices - Official architectural guidance

🎓 You've completed AWS Compute & Runtime Architecture! You now understand how to choose between EC2, Lambda, ECS, EKS, and Fargate based on workload characteristics, cost requirements, and operational complexity. Practice architecting solutions for different scenarios to solidify your knowledge.

💡 Next Steps: Explore AWS networking (VPC, Load Balancers) to understand how compute services communicate, or dive into storage services (S3, EBS, EFS) to complete your infrastructure knowledge.