Compute & Runtime Architecture
Master EC2, containers, serverless computing, and event-driven architectures for scalable applications
AWS Compute & Runtime Architecture
Master AWS compute services with free flashcards and proven study techniques. This lesson covers EC2 instance types, Lambda serverless architecture, container orchestration with ECS and EKS, and compute selection strategies, all essential concepts for building scalable cloud applications and passing AWS certification exams.
Welcome to AWS Compute Services 💻
Compute is the foundation of cloud infrastructure. AWS offers multiple compute services, each optimized for different workload patterns. Understanding when to use virtual machines, containers, or serverless functions is critical for building cost-effective, performant applications.
In this lesson, you'll learn:
- EC2 fundamentals and instance family selection
- Lambda serverless architecture patterns
- Container services (ECS, EKS, Fargate)
- Compute optimization strategies for different workloads
💡 Pro tip: Think of compute services as different types of restaurants: EC2 is like owning a full kitchen (maximum control), containers are like meal prep kits (balanced control and convenience), and Lambda is like ordering delivery (pay only for what you consume).
Core Concepts: The AWS Compute Spectrum
🖥️ EC2: Elastic Compute Cloud
EC2 provides resizable virtual machines in the cloud. You have complete control over the operating system, networking, and storage configuration.
Instance Families (named with pattern: Family + Generation + Size)
| Family | Optimized For | Use Cases | Example Type |
|---|---|---|---|
| T (T2, T3, T3a) | Burstable performance | Web servers, dev environments | t3.medium |
| M (M5, M6i) | General purpose (balanced) | Application servers, databases | m5.large |
| C (C5, C6g) | Compute-optimized | Batch processing, gaming servers | c5.xlarge |
| R (R5, R6g) | Memory-optimized | In-memory caches, real-time analytics | r5.2xlarge |
| I (I3, I3en) | Storage-optimized | NoSQL databases, data warehouses | i3.xlarge |
| P (P3, P4d) | GPU instances | Machine learning, rendering | p3.2xlarge |
🧠 Memory Aid: "TMC RI P" = Teeny burstable, Medium balanced, Compute heavy, RAM heavy, IOPS heavy, Parallel processing
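If you want to check these specs from code, the EC2 API exposes them. A minimal boto3 sketch (assumes AWS credentials and a default region are already configured):
## Inspect instance specs with boto3
import boto3

ec2 = boto3.client('ec2')

# Look up vCPU and memory specs for the example types in the table above
response = ec2.describe_instance_types(
    InstanceTypes=['t3.medium', 'm5.large', 'c5.xlarge', 'r5.2xlarge']
)
for itype in response['InstanceTypes']:
    vcpus = itype['VCpuInfo']['DefaultVCpus']
    mem_gib = itype['MemoryInfo']['SizeInMiB'] / 1024
    print(f"{itype['InstanceType']}: {vcpus} vCPU, {mem_gib:.0f} GiB RAM")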
EC2 Pricing Models:
EC2 PRICING COMPARISON
- 💰 On-Demand: pay per hour/second, no commitment. $$$$ (most expensive)
- 📅 Reserved Instances (1-3 year term): up to 75% discount, best for steady-state workloads. $$$
- 🏷️ Savings Plans: flexible across instance families, commit to a $/hour usage level. $$
- 🎯 Spot Instances: up to 90% discount, can be interrupted. $ (cheapest)
⚠️ Common Mistake: Using On-Demand pricing for predictable, long-running workloads. Reserved Instances or Savings Plans can save 40-75% for stable production workloads.
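To make the gap concrete, here's a back-of-the-envelope sketch. The On-Demand rate is an illustrative assumption, and the discounts use the ranges quoted above:
## Rough monthly cost comparison (rates are illustrative assumptions)
HOURS_PER_MONTH = 730

on_demand_rate = 0.096  # assumed $/hour for an m5.large-class instance

on_demand = on_demand_rate * HOURS_PER_MONTH
reserved = on_demand * (1 - 0.60)   # mid-range Reserved Instance discount
spot = on_demand * (1 - 0.90)       # "up to 90%" Spot discount

print(f"On-Demand: ${on_demand:.2f}/mo, Reserved: ${reserved:.2f}/mo, Spot: ${spot:.2f}/mo")
# On-Demand: $70.08/mo, Reserved: $28.03/mo, Spot: $7.01/mo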
⚡ Lambda: Serverless Compute
AWS Lambda runs code without provisioning servers. You pay only for the compute time you consume, billed in 1 ms increments.
Lambda Architecture Pattern:
LAMBDA EXECUTION FLOW

📡 Event Source
   (API Gateway, S3, DynamoDB, etc.)
        |
        v
🔔 Event Trigger
        |
        v
⚙️ Lambda Service
   +--------------------+
   | Cold Start?        |
   |  +-- Yes -> Init   |  (1-5 seconds)
   |  +-- No ---> Reuse |  (milliseconds)
   +---------+----------+
             |
             v
   +--------------------+
   |    Execute Code    |
   |    (max 15 min)    |
   +---------+----------+
             |
             v
📤 Return Response
        |
        v
📊 CloudWatch Logs
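The "Init" vs "Reuse" branch maps directly onto how your code is structured: anything at module scope runs once per cold start, while the handler runs on every invocation. A minimal sketch illustrating the difference (the one-second threshold is a rough heuristic, not an official API):
## Minimal handler sketch: module scope runs once per cold start
import time

INIT_TIME = time.time()  # executes only during container initialization

def lambda_handler(event, context):
    # A freshly initialized container means this request paid the cold-start cost
    is_cold = (time.time() - INIT_TIME) < 1.0
    return {
        'cold_start': is_cold,
        'remaining_ms': context.get_remaining_time_in_millis()
    }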
Lambda Limits (Critical for Architecture Decisions):
| Resource | Default Limit | Can Increase? |
|---|---|---|
| Execution timeout | 15 minutes max | No |
| Memory | 128 MB - 10,240 MB | No |
| Deployment package | 50 MB (zipped), 250 MB (unzipped) | No |
| Concurrent executions | 1,000 per region | Yes (request increase) |
| Environment variables | 4 KB total | No |
| /tmp storage | 512 MB - 10,240 MB | No |
💡 When NOT to use Lambda:
- Long-running processes (> 15 minutes)
- Stateful applications requiring persistent connections
- High-performance computing needing low-latency responses
- Workloads with constant, predictable traffic (EC2 may be cheaper)
🐳 Container Services: ECS, EKS, and Fargate
AWS offers three primary container orchestration services:
Container Service Comparison:
CONTAINER SERVICE DECISION TREE

              "I need containers"
                      |
           "Do I need Kubernetes?"
             /                  \
           Yes                   No
            |                     |
        +-------+            +-------+
        |  EKS  |            |  ECS  |  (AWS-native)
        +-------+            +-------+
            |                     |
     "Launch type?"     "Who manages the servers?"
       /        \            /             \
   Fargate      EC2        AWS              Me
 (serverless) (self-        |                |
              managed)   Fargate            EC2
                       (serverless)   (self-managed)
| Service | What It Is | Best For | Control Level |
|---|---|---|---|
| ECS | AWS-native container orchestration | AWS-centric architectures, simpler setup | High |
| EKS | Managed Kubernetes | Multi-cloud, Kubernetes expertise, complex workloads | Highest |
| Fargate | Serverless compute for containers | No server management, pay-per-use | Medium |
ECS Task Definition Example (Simplified):
{
"family": "web-app",
"containerDefinitions": [
{
"name": "nginx",
"image": "nginx:latest",
"memory": 512,
"cpu": 256,
"portMappings": [
{
"containerPort": 80,
"protocol": "tcp"
}
],
"environment": [
{"name": "ENV", "value": "production"}
]
}
],
"requiresCompatibilities": ["FARGATE"],
"networkMode": "awsvpc",
"cpu": "256",
"memory": "512"
}
🔧 Try this: The requiresCompatibilities field determines whether the task runs on Fargate (serverless) or EC2 (self-managed). Changing this one field shifts your entire infrastructure model.
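Registering a task definition like the one above is a single API call from boto3. A hedged sketch (the execution role ARN is a placeholder you would replace with your own):
## Register the task definition above via boto3 (role ARN is a placeholder)
import boto3

ecs = boto3.client('ecs')
ecs.register_task_definition(
    family='web-app',
    networkMode='awsvpc',
    requiresCompatibilities=['FARGATE'],
    cpu='256',
    memory='512',
    executionRoleArn='arn:aws:iam::123456789012:role/ecsTaskExecutionRole',
    containerDefinitions=[{
        'name': 'nginx',
        'image': 'nginx:latest',
        'portMappings': [{'containerPort': 80, 'protocol': 'tcp'}],
        'environment': [{'name': 'ENV', 'value': 'production'}]
    }]
)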
🎯 Fargate: Serverless Containers
Fargate is a launch type for ECS and EKS that removes server management entirely.
Traditional ECS on EC2 vs. Fargate:
ECS on EC2 (you manage servers):
- You provision: EC2 instances
- You install: the ECS agent
- You manage: OS patches, scaling
- ECS schedules: containers onto your instances

Fargate (AWS manages servers):
- You define: the task definition (CPU, memory)
- AWS provides: compute capacity automatically
- You pay: per vCPU and GB, per second
- No servers, no patches, no capacity planning
Fargate Pricing Model:
- vCPU: $0.04048 per vCPU per hour
- Memory: $0.004445 per GB per hour
- Billed per second, 1-minute minimum
💰 Cost Comparison Example: Running a 0.25 vCPU, 0.5 GB task for 30 days:
- Fargate: ~$9/month (worked out in the sketch below)
- t3.micro EC2 (assuming 50% utilization): ~$7.50/month
- But EC2 adds management overhead: patching, monitoring
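The ~$9 Fargate figure follows directly from the rates quoted above; a quick arithmetic sketch:
## Verify the ~$9/month Fargate estimate from the quoted rates
VCPU_RATE = 0.04048   # $ per vCPU-hour
MEM_RATE = 0.004445   # $ per GB-hour
HOURS = 24 * 30       # 30 days

cost = 0.25 * VCPU_RATE * HOURS + 0.5 * MEM_RATE * HOURS
print(f"${cost:.2f}/month")  # ≈ $8.89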
🏗️ Compute Selection Framework
Choosing the right compute service depends on multiple factors:
| Factor | EC2 | Lambda | ECS/EKS | Fargate |
|---|---|---|---|---|
| Control | Full OS access | Code only | Container + orchestration | Container only |
| Scaling | Manual/Auto Scaling Groups | Automatic | Service-based | Automatic |
| Management | You patch/maintain | AWS manages all | You manage instances | AWS manages compute |
| Pricing | Per hour/second | Per 1 ms of execution | Per instance hour | Per vCPU/GB second |
| Cold Start | Minutes (boot) | 100ms-5s | Seconds (container) | Seconds (container) |
| State | Persistent | Ephemeral | Flexible | Ephemeral |
Decision Matrix:
Workload → compute mapping:
- 🌐 Web application (steady traffic): EC2 with Auto Scaling, or ECS on EC2 (containerized)
- 🔌 API with variable traffic: Lambda (event-driven), or ECS with Fargate (containerized)
- 🎮 Real-time multiplayer game server: EC2 (persistent connections), or ECS on EC2 (for multi-region)
- 📸 Image processing (triggered by uploads): Lambda (< 15 min per image), or EC2 Spot (batch, cost-sensitive)
- 🤖 Machine learning training: EC2 P3/P4 instances (GPU), or EKS with GPU nodes (distributed)
- 🗄️ Database server: EC2 (use RDS instead if possible); never Lambda (stateful, long connections)
- 🔗 Microservices architecture: ECS Fargate (managed scaling), or EKS (if Kubernetes required)
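For flashcard-style review, the same matrix can be encoded as a simple lookup. A toy sketch (categories and phrasing are simplified from the list above, not an official taxonomy):
## Toy lookup encoding the workload-to-compute matrix (study aid only)
WORKLOAD_TO_COMPUTE = {
    'web app, steady traffic': ['EC2 + Auto Scaling', 'ECS on EC2'],
    'API, variable traffic': ['Lambda', 'ECS with Fargate'],
    'real-time game server': ['EC2', 'ECS on EC2'],
    'triggered image processing': ['Lambda (< 15 min)', 'EC2 Spot'],
    'ML training': ['EC2 P3/P4', 'EKS with GPU nodes'],
    'database server': ['EC2 (prefer RDS)'],  # never Lambda: stateful
    'microservices': ['ECS Fargate', 'EKS if Kubernetes is required'],
}

print(WORKLOAD_TO_COMPUTE['API, variable traffic'])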
Real-World Examples
Example 1: E-commerce Website Architecture 🛒
Scenario: You're building an online store with:
- Web frontend (React SPA)
- REST API backend (Node.js)
- Order processing (batch jobs)
- Product image resizing
Optimal Compute Strategy:
## CloudFormation/Architecture Snippet (Conceptual)
Components:
StaticWebsite:
Service: S3 + CloudFront
Reason: No compute needed for static assets
APIBackend:
Service: ECS Fargate
Configuration:
- CPU: 0.5 vCPU
- Memory: 1 GB
- Auto Scaling: 2-10 tasks based on CPU
Reason: Containerized, variable traffic, no server management
OrderProcessing:
Service: Lambda
Trigger: SQS queue
Configuration:
- Memory: 512 MB
- Timeout: 5 minutes
- Reserved Concurrency: 10
Reason: Event-driven, short processing time, pay-per-execution
ImageResizing:
Service: Lambda
Trigger: S3 upload event
Configuration:
- Memory: 2048 MB (more memory = more CPU)
- Timeout: 1 minute
- Concurrent executions: 100
Reason: Event-driven, parallel processing, scales automatically
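Wiring the SQS-triggered order processor is one API call. A hedged sketch (the queue ARN and function name are illustrative, matching the conceptual snippet above):
## Connect the orders queue to the order-processing Lambda (names illustrative)
import boto3

lambda_client = boto3.client('lambda')
lambda_client.create_event_source_mapping(
    EventSourceArn='arn:aws:sqs:us-east-1:123456789012:orders-queue',
    FunctionName='process-orders',
    BatchSize=10  # up to 10 messages per invocation
)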
Architecture Flow:
E-COMMERCE COMPUTE ARCHITECTURE

👤 User Browser
      |
      v
📦 CloudFront (CDN)
      |
      +--- S3 (static files)
      |
      +--- API Gateway
             |
             v
      ⚙️ ECS Fargate Cluster
         (API containers)
             |
             +--- RDS (product data)
             |
             +--- SQS Queue (orders)
                    |
                    v
             ⚡ Lambda Function
                (process orders)
                    |
                    v
             📧 SNS (notifications)

Separate flow:
📷 Image upload → S3 → ⚡ Lambda (resize) → S3
Why This Mix?
- Fargate for API: Handles HTTP requests efficiently, scales based on traffic
- Lambda for processing: Decouples order processing, scales per message
- Lambda for images: Processes in parallel, only pays when images uploaded
💰 Cost Estimation (10,000 API requests/day, 100 orders/day, 50 image uploads/day):
- Fargate: ~$30/month (2 tasks running 24/7)
- Lambda (orders): ~$0.20/month (3,000 invocations; the per-request fee is a fraction of a cent, so duration charges at 512 MB dominate)
- Lambda (images): ~$0.50/month (higher memory, more CPU time)
- Total compute: ~$31/month
Example 2: Video Processing Pipeline 🎬
Scenario: Users upload videos that need transcoding to multiple resolutions.
Challenge: Video transcoding can take 10-60 minutes per video, exceeding Lambda's 15-minute limit.
Solution Architecture:
## Step Functions workflow definition (conceptual)
import json
def video_processing_pipeline():
"""
Step Functions state machine for video processing
"""
workflow = {
"StartAt": "ValidateVideo",
"States": {
"ValidateVideo": {
"Type": "Task",
"Resource": "arn:aws:lambda:...:function:ValidateVideo",
"Next": "IsValidVideo",
"TimeoutSeconds": 60
},
"IsValidVideo": {
"Type": "Choice",
"Choices": [
{
"Variable": "$.valid",
"BooleanEquals": True,
"Next": "StartECSTask"
}
],
"Default": "VideoRejected"
},
"StartECSTask": {
"Type": "Task",
"Resource": "arn:aws:states:::ecs:runTask.sync",
"Parameters": {
"LaunchType": "FARGATE",
"Cluster": "video-processing-cluster",
"TaskDefinition": "ffmpeg-transcoder"
},
"Next": "NotifyComplete",
"TimeoutSeconds": 3600
},
"NotifyComplete": {
"Type": "Task",
"Resource": "arn:aws:lambda:...:function:SendNotification",
"End": True
},
"VideoRejected": {
"Type": "Fail",
"Cause": "Invalid video format"
}
}
}
return workflow
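Deploying this definition is a single call to the Step Functions API. A sketch (the role ARN is a placeholder; the role needs permission to invoke the Lambda functions and run the ECS task referenced in the definition):
## Create the state machine from the workflow above (role ARN is a placeholder)
import json
import boto3

sfn = boto3.client('stepfunctions')
sfn.create_state_machine(
    name='video-processing-pipeline',
    definition=json.dumps(video_processing_pipeline()),
    roleArn='arn:aws:iam::123456789012:role/StepFunctionsVideoRole'
)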
Component Breakdown:
| Step | Service | Why |
|---|---|---|
| 1. Upload detection | S3 Event → Lambda | Trigger workflow on new uploads |
| 2. Video validation | Lambda (30s) | Quick check of format/codec |
| 3. Transcoding | ECS Fargate Task | No time limit, can run 1+ hours |
| 4. Notification | Lambda (5s) | Quick email/SMS via SNS |
| Orchestration | Step Functions | Coordinates multi-step workflow |
ECS Task for Transcoding:
## Dockerfile for video transcoding container
FROM jrottenberg/ffmpeg:4.4-alpine
## Install AWS CLI and Python
RUN apk add --no-cache python3 py3-pip && \
pip3 install boto3
## Copy transcoding script
COPY transcode.py /app/
ENTRYPOINT ["python3", "/app/transcode.py"]
## transcode.py - runs in the ECS container
import boto3
import subprocess
import os

s3 = boto3.client('s3')

# Target resolutions mapped to their ffmpeg scale filters
SCALE_FILTERS = {
    '1080p': 'scale=1920:1080',
    '720p': 'scale=1280:720',
    '480p': 'scale=854:480',
}

def transcode_video(input_bucket, input_key, output_bucket):
    # Download the source video from S3
    local_input = '/tmp/input.mp4'
    s3.download_file(input_bucket, input_key, local_input)
    # Transcode to each resolution, then upload the result
    for res, scale in SCALE_FILTERS.items():
        output_file = f'/tmp/output_{res}.mp4'
        cmd = ['ffmpeg', '-i', local_input, '-vf', scale,
               '-c:v', 'libx264', '-crf', '23', output_file]
        subprocess.run(cmd, check=True)
        output_key = f'processed/{res}/{input_key}'
        s3.upload_file(output_file, output_bucket, output_key)
        os.remove(output_file)
    os.remove(local_input)
    return {'status': 'success', 'resolutions': list(SCALE_FILTERS)}
if __name__ == '__main__':
# Get parameters from environment (set by ECS task)
input_bucket = os.environ['INPUT_BUCKET']
input_key = os.environ['INPUT_KEY']
output_bucket = os.environ['OUTPUT_BUCKET']
result = transcode_video(input_bucket, input_key, output_bucket)
print(result)
Why This Works:
- ⏱️ No time limits on ECS Fargate tasks
- 💰 Pay only while transcoding (the task runs, then stops)
- 📈 Scales automatically based on queue depth
- 🔄 Step Functions coordinates the workflow
Example 3: Auto Scaling Web Application 📰
Scenario: News website with traffic spikes during breaking news.
Requirements:
- Handle 1,000 req/sec during normal hours
- Scale to 50,000 req/sec during news events
- Minimize cost during low traffic
Solution: EC2 Auto Scaling with Mixed Instance Types
## Auto Scaling Group Configuration
AutoScalingGroup:
LaunchTemplate:
InstanceType:
- t3.medium # Burstable for baseline
- m5.large # General purpose for peaks
- c5.large # Compute-optimized for high load
MixedInstancesPolicy:
InstancesDistribution:
OnDemandBaseCapacity: 2
OnDemandPercentageAboveBaseCapacity: 20
SpotInstancePools: 3
SpotAllocationStrategy: capacity-optimized
Scaling:
MinSize: 2
MaxSize: 50
DesiredCapacity: 2
TargetTrackingPolicies:
- MetricType: ALBRequestCountPerTarget
TargetValue: 1000 # requests per instance
- MetricType: CPUUtilization
TargetValue: 70
Cost Optimization Strategy:
AUTO SCALING INSTANCE MIX

Low traffic (2-5 instances):
- 2× On-Demand t3.medium (baseline) 💰 $$$
- 0-3× Spot instances (opportunistic) 💰 $

Medium traffic (6-20 instances):
- 2× On-Demand t3.medium (baseline) 💰 $$$
- 1× On-Demand m5.large (stable) 💰 $$$
- 3-17× Spot (bulk capacity) 💰 $

High traffic (21-50 instances):
- 2× On-Demand t3.medium 💰 $$$
- 4× On-Demand m5/c5 (20% of extra capacity) 💰 $$$
- 15-44× Spot (80% of scaling) 💰 $

Spot instances offer up to 90% savings; diversified pools reduce interruption risk.
Scaling Behavior:
## CloudWatch Alarm → Auto Scaling scaling policy
import math

def calculate_required_capacity(current_rps, avg_rps_per_instance=1000):
    """
    Determine the number of instances needed for the current request rate.
    """
    required_instances = current_rps / avg_rps_per_instance
    # Add a 20% buffer for headroom, rounding up to whole instances
    buffered_instances = math.ceil(required_instances * 1.2)
    return max(2, buffered_instances)  # Minimum 2 instances
## Example scaling events
events = [
{"time": "09:00", "rps": 800, "instances": 2},
{"time": "12:00", "rps": 5000, "instances": 6},
{"time": "14:30", "rps": 45000, "instances": 54}, # Breaking news!
{"time": "16:00", "rps": 8000, "instances": 10},
{"time": "22:00", "rps": 1200, "instances": 2}
]
for event in events:
    required = calculate_required_capacity(event['rps'])
    print(f"{event['time']}: {event['rps']} req/s → {required} instances")
💡 Key Insight: Using 80% Spot instances can reduce compute costs by 60-70% overall, while 20% On-Demand ensures availability if Spot capacity is interrupted.
Example 4: Machine Learning Inference Pipeline 🤖
Scenario: Image classification service that receives photos and returns detected objects.
Requirements:
- Inference time: 200-500ms per image
- Variable traffic: 10-10,000 images/hour
- Need GPU for fast inference
Solution Comparison:
| Approach | Service | Cost/Performance | Complexity |
|---|---|---|---|
| Option A: Lambda | Lambda + SageMaker Endpoint | $$ / Good for variable load | Low |
| Option B: ECS | ECS Fargate + CPU inference | $$$ / Slower inference | Medium |
| Option C: EC2 | EC2 P3 instances + Auto Scaling | $$$$ / Fastest, expensive idle | High |
| Option D: Hybrid | Lambda → SageMaker Serverless | $ / Best for sporadic use | Low |
Recommended: Hybrid Approach with SageMaker Serverless
## Lambda function for inference
import json
import boto3
import base64
sagemaker_runtime = boto3.client('sagemaker-runtime')
def lambda_handler(event, context):
"""
Receive image, invoke SageMaker endpoint, return predictions
"""
# Extract base64-encoded image from API Gateway
image_data = base64.b64decode(event['body'])
# Invoke SageMaker Serverless endpoint
response = sagemaker_runtime.invoke_endpoint(
EndpointName='image-classifier-serverless',
ContentType='application/x-image',
Body=image_data
)
# Parse predictions
predictions = json.loads(response['Body'].read().decode())
# Return top 3 predictions
top_predictions = sorted(
predictions,
key=lambda x: x['score'],
reverse=True
)[:3]
return {
'statusCode': 200,
'body': json.dumps({
'predictions': top_predictions
})
}
SageMaker Serverless Inference Configuration:
## SageMaker endpoint configuration
import sagemaker
from sagemaker.serverless import ServerlessInferenceConfig

serverless_config = ServerlessInferenceConfig(
    memory_size_in_mb=4096,  # 1024-6144 MB, in 1 GB increments
    max_concurrency=10       # Max concurrent requests per endpoint
)

# `model` is a sagemaker.Model built earlier from your trained artifacts
predictor = model.deploy(
    serverless_inference_config=serverless_config,
    endpoint_name='image-classifier-serverless'
)
Why This Architecture?
- 💰 Cost: Pay only for inference time (per second), not idle GPU time
- ⚡ Performance: SageMaker handles model loading, scales automatically
- 🔧 Simplicity: Lambda + managed endpoint, no infrastructure to run
- 📈 Scaling: Handles variable traffic without pre-provisioning
Cost Comparison (1,000 inferences/day):
- Lambda + SageMaker Serverless: ~$15/month
- EC2 P3 instance (24/7): ~$2,200/month
- Savings: over 99%
Common Mistakes to Avoid ⚠️
Mistake 1: Using Lambda for Long-Running Tasks
❌ Wrong:
## Lambda function that processes large videos
def lambda_handler(event, context):
    video_url = event['video_url']
    download_video(video_url)  # 5 minutes
    transcode_video()          # 20 minutes ← FAILS! Lambda max is 15 min
    upload_result()
    return {'status': 'success'}
✅ Correct:
## Lambda triggers ECS Fargate task for long processing
import boto3
ecs = boto3.client('ecs')
def lambda_handler(event, context):
video_url = event['video_url']
# Start ECS task (no time limit)
response = ecs.run_task(
cluster='video-processing',
taskDefinition='video-transcoder',
launchType='FARGATE',
overrides={
'containerOverrides': [{
'name': 'transcoder',
'environment': [
{'name': 'VIDEO_URL', 'value': video_url}
]
}]
}
)
return {'status': 'task_started', 'taskArn': response['tasks'][0]['taskArn']}
Mistake 2: Ignoring Cold Starts
❌ Problem: Lambda cold starts causing 2-5 second delays for latency-sensitive APIs.
✅ Solutions:
## Option 1: Provisioned Concurrency (keeps functions warm)
import boto3

lambda_client = boto3.client('lambda')

## Reserve 5 always-warm execution environments on a published version
## or alias (provisioned concurrency cannot target $LATEST)
lambda_client.put_provisioned_concurrency_config(
    FunctionName='latency-sensitive-api',
    ProvisionedConcurrentExecutions=5,
    Qualifier='prod'  # alias pointing at a published version
)
## Option 2: Use Lambda SnapStart (Java)
## Snapshots the initialized function state for near-instant startup

## Option 3: For < 100 ms latency requirements, use:
## - API Gateway → ALB → ECS Fargate (persistent containers)
## - API Gateway → EC2 instances (always running)
Mistake 3: Wrong EC2 Instance Family
❌ Wrong: Using compute-optimized (C5) instances for memory-intensive caching
| Instance Type | vCPU | RAM | Cost/hour | GB RAM per $/hour |
|---|---|---|---|---|
| c5.2xlarge | 8 | 16 GB | $0.34 | ~47 |
| r5.2xlarge | 8 | 64 GB | $0.50 | ~128 |
✅ Correct: For Redis/Memcached, use R5 (memory-optimized) for 2.7× more RAM per dollar.
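The last column is just RAM divided by the hourly price, which is worth checking yourself:
## RAM per dollar-hour, computed from the table above
c5 = 16 / 0.34   # ≈ 47 GB per $/hour
r5 = 64 / 0.50   # ≈ 128 GB per $/hour
print(f"r5 advantage: {r5 / c5:.1f}x")  # ≈ 2.7x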
Mistake 4: Not Using Spot Instances for Fault-Tolerant Workloads
❌ Wrong: Running batch jobs on On-Demand instances
## Batch processing job - NOT cost-optimized
BatchJob:
InstanceType: m5.4xlarge
PricingModel: OnDemand
Cost: $0.768/hour
Annual: $6,730 (if running 24/7)
✅ Correct: Use Spot instances for up to a 90% discount
## AWS Batch job definition with Spot
import boto3
batch = boto3.client('batch')
compute_environment = {
'type': 'MANAGED',
'computeResources': {
'type': 'SPOT',
'allocationStrategy': 'SPOT_CAPACITY_OPTIMIZED',
'instanceTypes': ['m5.4xlarge', 'm5a.4xlarge', 'm5n.4xlarge'],
'minvCpus': 0,
'maxvCpus': 256,
'spotIamFleetRole': 'arn:aws:iam::...:role/SpotFleetRole',
'bidPercentage': 100 # Pay up to 100% of On-Demand price
}
}
## Cost: ~$0.08/hour (spot price varies)
## Annual savings: ~$6,000 (89% reduction)
Mistake 5: Over-Provisioning Fargate Memory
❌ Wrong: Setting memory to 4 GB when the container uses 512 MB
{
"family": "api-server",
"cpu": "1024",
"memory": "4096",
"containerDefinitions": [{
"name": "app",
"memory": 512
}]
}
Cost: 1 vCPU ($0.04048/hr) + 4 GB ($0.01778/hr) = $0.05826/hr ≈ $1.40/day
✅ Correct: Right-size memory to actual usage
{
"family": "api-server",
"cpu": "1024",
"memory": "1024",
"containerDefinitions": [{
"name": "app",
"memory": 512
}]
}
Cost: 1 vCPU ($0.04048/hr) + 1 GB ($0.00445/hr) = $0.04493/hr ≈ $1.08/day (23% savings)
💡 Monitoring Tip: Use Container Insights to track actual memory usage and right-size accordingly, as in the sketch below.
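One way to pull actual usage is through the CloudWatch metrics that Container Insights publishes. A hedged sketch (assumes Container Insights is enabled on the cluster; the cluster and service names are placeholders):
## Read peak task memory from Container Insights to right-size Fargate
import datetime
import boto3

cw = boto3.client('cloudwatch')
stats = cw.get_metric_statistics(
    Namespace='ECS/ContainerInsights',
    MetricName='MemoryUtilized',
    Dimensions=[
        {'Name': 'ClusterName', 'Value': 'api-cluster'},
        {'Name': 'ServiceName', 'Value': 'api-server'},
    ],
    StartTime=datetime.datetime.utcnow() - datetime.timedelta(days=7),
    EndTime=datetime.datetime.utcnow(),
    Period=3600,
    Statistics=['Maximum'],
)
peak_mb = max(dp['Maximum'] for dp in stats['Datapoints'])
print(f"Peak memory over 7 days: {peak_mb:.0f} MB")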
Key Takeaways 🎯
📋 Quick Reference Card: AWS Compute Services
| Service | Use When | Avoid When | Pricing |
|---|---|---|---|
| EC2 | Full OS control needed; persistent state; specialized software | Variable traffic; event-driven; want zero ops | $/hour |
| Lambda | Event-driven; < 15 min execution; variable load | Long-running tasks; persistent state; low latency critical | $/ms of execution |
| ECS | Containerized apps; AWS-native stack; microservices | Need Kubernetes; multi-cloud required | Fargate: $/vCPU-sec; EC2: $/hour |
| EKS | Kubernetes required; multi-cloud; complex orchestration | Simple workloads; small teams; avoiding K8s complexity | $0.10/hr per cluster + compute |
| Fargate | Containers without servers; variable load; ops simplicity | Need EC2 features; cost-sensitive 24/7 workloads | $/vCPU-sec + $/GB-sec |
🧠 Memory Devices:
- ELEFF = EC2, Lambda, ECS, EKS, Fargate, Flexibility
- "The Compute Spectrum": Full Control (EC2) → Container Control (ECS/EKS) → No Control (Lambda)
💰 Cost Optimization Rules:
- Reserved Instances for predictable workloads (40-75% savings)
- Spot Instances for fault-tolerant jobs (70-90% savings)
- Right-size instance types (match CPU/memory to workload)
- Use Fargate for variable container workloads
- Use Lambda for event-driven, short-duration tasks
⚡ Performance Guidelines:
- Latency < 100ms: EC2 or persistent containers (avoid Lambda cold starts)
- Throughput-focused: Use compute-optimized (C5/C6) instances
- Memory-intensive: Use memory-optimized (R5/R6) instances
- GPU workloads: P3/P4 instances or SageMaker
🔒 Security Best Practices:
- Always use IAM roles for EC2/ECS (never hardcode credentials)
- Enable VPC Flow Logs for network monitoring
- Use Security Groups as stateful firewalls
- Encrypt data at rest (EBS volumes) and in transit (TLS)
- Patch EC2 instances regularly (use Systems Manager Patch Manager)
📚 Further Study
- AWS Official Documentation: EC2 Instance Types - Comprehensive guide to all instance families
- AWS Compute Blog: AWS Compute Blog - Real-world architecture patterns and best practices
- AWS Well-Architected Framework: Compute Best Practices - Official architectural guidance
🎉 You've completed AWS Compute & Runtime Architecture! You now understand how to choose between EC2, Lambda, ECS, EKS, and Fargate based on workload characteristics, cost requirements, and operational complexity. Practice architecting solutions for different scenarios to solidify your knowledge.
💡 Next Steps: Explore AWS networking (VPC, Load Balancers) to understand how compute services communicate, or dive into storage services (S3, EBS, EFS) to complete your infrastructure knowledge.