Compute & Runtime Architecture
Master EC2, containers, serverless computing, and event-driven architectures for scalable applications
AWS Compute & Runtime Architecture
Master AWS compute services with free flashcards and proven study techniques. This lesson covers EC2 instance types, Lambda serverless architecture, container orchestration with ECS and EKS, and compute selection strategies, all essential concepts for building scalable cloud applications and passing AWS certification exams.
Welcome to AWS Compute Services 💻
Compute is the foundation of cloud infrastructure. AWS offers multiple compute services, each optimized for different workload patterns. Understanding when to use virtual machines, containers, or serverless functions is critical for building cost-effective, performant applications.
In this lesson, you'll learn:
- EC2 fundamentals and instance family selection
- Lambda serverless architecture patterns
- Container services (ECS, EKS, Fargate)
- Compute optimization strategies for different workloads
💡 Pro tip: Think of compute services as different types of restaurants: EC2 is like owning a full kitchen (maximum control), containers are like meal prep kits (balanced control and convenience), and Lambda is like ordering delivery (pay only for what you consume).
Core Concepts: The AWS Compute Spectrum
🖥️ EC2: Elastic Compute Cloud
EC2 provides resizable virtual machines in the cloud. You have complete control over the operating system, networking, and storage configuration.
Instance Families (named with pattern: Family + Generation + Size)
| Family | Optimized For | Use Cases | Example Type |
|---|---|---|---|
| T (T2, T3, T3a) | Burstable performance | Web servers, dev environments | t3.medium |
| M (M5, M6i) | General purpose (balanced) | Application servers, databases | m5.large |
| C (C5, C6g) | Compute-optimized | Batch processing, gaming servers | c5.xlarge |
| R (R5, R6g) | Memory-optimized | In-memory caches, real-time analytics | r5.2xlarge |
| I (I3, I3en) | Storage-optimized | NoSQL databases, data warehouses | i3.xlarge |
| P (P3, P4d) | GPU instances | Machine learning, rendering | p3.2xlarge |
🧠 Memory Aid: "TMC RI P" = Teeny burstable, Medium balanced, Compute heavy, RAM heavy, IOPS heavy, Parallel processing
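If you want to check these specs from code, the EC2 API exposes them. A minimal boto3 sketch (assumes AWS credentials and a default region are already configured):
## Inspect instance specs with boto3
import boto3

ec2 = boto3.client('ec2')

# Look up vCPU and memory specs for the example types in the table above
response = ec2.describe_instance_types(
    InstanceTypes=['t3.medium', 'm5.large', 'c5.xlarge', 'r5.2xlarge']
)
for itype in response['InstanceTypes']:
    vcpus = itype['VCpuInfo']['DefaultVCpus']
    mem_gib = itype['MemoryInfo']['SizeInMiB'] / 1024
    print(f"{itype['InstanceType']}: {vcpus} vCPU, {mem_gib:.0f} GiB RAM")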
EC2 Pricing Models:
EC2 PRICING COMPARISON
- 💰 On-Demand: pay per hour/second, no commitment. $$$$ (most expensive)
- 📅 Reserved Instances (1-3 year term): up to 75% discount, best for steady-state workloads. $$$
- 🏷️ Savings Plans: flexible across instance families, commit to a $/hour usage level. $$
- 🎯 Spot Instances: up to 90% discount, can be interrupted. $ (cheapest)
⚠️ Common Mistake: Using On-Demand pricing for predictable, long-running workloads. Reserved Instances or Savings Plans can save 40-75% for stable production workloads.
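To make the gap concrete, here's a back-of-the-envelope sketch. The On-Demand rate is an illustrative assumption, and the discounts use the ranges quoted above:
## Rough monthly cost comparison (rates are illustrative assumptions)
HOURS_PER_MONTH = 730

on_demand_rate = 0.096  # assumed $/hour for an m5.large-class instance

on_demand = on_demand_rate * HOURS_PER_MONTH
reserved = on_demand * (1 - 0.60)   # mid-range Reserved Instance discount
spot = on_demand * (1 - 0.90)       # "up to 90%" Spot discount

print(f"On-Demand: ${on_demand:.2f}/mo, Reserved: ${reserved:.2f}/mo, Spot: ${spot:.2f}/mo")
# On-Demand: $70.08/mo, Reserved: $28.03/mo, Spot: $7.01/mo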
⚡ Lambda: Serverless Compute
AWS Lambda runs code without provisioning servers. You pay only for the compute time you consume, billed in 1 ms increments.
Lambda Architecture Pattern:
LAMBDA EXECUTION FLOW

📡 Event Source
   (API Gateway, S3, DynamoDB, etc.)
        |
        v
🔔 Event Trigger
        |
        v
⚙️ Lambda Service
   +--------------------+
   | Cold Start?        |
   |  +-- Yes -> Init   |  (1-5 seconds)
   |  +-- No ---> Reuse |  (milliseconds)
   +---------+----------+
             |
             v
   +--------------------+
   |    Execute Code    |
   |    (max 15 min)    |
   +---------+----------+
             |
             v
📤 Return Response
        |
        v
📊 CloudWatch Logs
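The "Init" vs "Reuse" branch maps directly onto how your code is structured: anything at module scope runs once per cold start, while the handler runs on every invocation. A minimal sketch illustrating the difference (the one-second threshold is a rough heuristic, not an official API):
## Minimal handler sketch: module scope runs once per cold start
import time

INIT_TIME = time.time()  # executes only during container initialization

def lambda_handler(event, context):
    # A freshly initialized container means this request paid the cold-start cost
    is_cold = (time.time() - INIT_TIME) < 1.0
    return {
        'cold_start': is_cold,
        'remaining_ms': context.get_remaining_time_in_millis()
    }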
Lambda Limits (Critical for Architecture Decisions):
| Resource | Default Limit | Can Increase? |
|---|---|---|
| Execution timeout | 15 minutes max | No |
| Memory | 128 MB - 10,240 MB | No |
| Deployment package | 50 MB (zipped), 250 MB (unzipped) | No |
| Concurrent executions | 1,000 per region | Yes (request increase) |
| Environment variables | 4 KB total | No |
| /tmp storage | 512 MB - 10,240 MB | No |
💡 When NOT to use Lambda:
- Long-running processes (> 15 minutes)
- Stateful applications requiring persistent connections
- High-performance computing needing low-latency responses
- Workloads with constant, predictable traffic (EC2 may be cheaper)
🐳 Container Services: ECS, EKS, and Fargate
AWS offers three primary container orchestration services:
Container Service Comparison:
CONTAINER SERVICE DECISION TREE

              "I need containers"
                      |
           "Do I need Kubernetes?"
             /                  \
           Yes                   No
            |                     |
        +-------+            +-------+
        |  EKS  |            |  ECS  |  (AWS-native)
        +-------+            +-------+
            |                     |
     "Launch type?"     "Who manages the servers?"
       /        \            /             \
   Fargate      EC2        AWS              Me
 (serverless) (self-        |                |
              managed)   Fargate            EC2
                       (serverless)   (self-managed)
| Service | What It Is | Best For | Control Level |
|---|---|---|---|
| ECS | AWS-native container orchestration | AWS-centric architectures, simpler setup | High |
| EKS | Managed Kubernetes | Multi-cloud, Kubernetes expertise, complex workloads | Highest |
| Fargate | Serverless compute for containers | No server management, pay-per-use | Medium |
ECS Task Definition Example (Simplified):
{
"family": "web-app",
"containerDefinitions": [
{
"name": "nginx",
"image": "nginx:latest",
"memory": 512,
"cpu": 256,
"portMappings": [
{
"containerPort": 80,
"protocol": "tcp"
}
],
"environment": [
{"name": "ENV", "value": "production"}
]
}
],
"requiresCompatibilities": ["FARGATE"],
"networkMode": "awsvpc",
"cpu": "256",
"memory": "512"
}
🔧 Try this: The requiresCompatibilities field determines whether the task runs on Fargate (serverless) or EC2 (self-managed). Changing this one field shifts your entire infrastructure model.
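Registering a task definition like the one above is a single API call from boto3. A hedged sketch (the execution role ARN is a placeholder you would replace with your own):
## Register the task definition above via boto3 (role ARN is a placeholder)
import boto3

ecs = boto3.client('ecs')
ecs.register_task_definition(
    family='web-app',
    networkMode='awsvpc',
    requiresCompatibilities=['FARGATE'],
    cpu='256',
    memory='512',
    executionRoleArn='arn:aws:iam::123456789012:role/ecsTaskExecutionRole',
    containerDefinitions=[{
        'name': 'nginx',
        'image': 'nginx:latest',
        'portMappings': [{'containerPort': 80, 'protocol': 'tcp'}],
        'environment': [{'name': 'ENV', 'value': 'production'}]
    }]
)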
🎯 Fargate: Serverless Containers
Fargate is a launch type for ECS and EKS that removes server management entirely.
Traditional ECS on EC2 vs. Fargate:
ECS on EC2 (you manage servers):
- You provision: EC2 instances
- You install: the ECS agent
- You manage: OS patches, scaling
- ECS schedules: containers onto your instances

Fargate (AWS manages servers):
- You define: the task definition (CPU, memory)
- AWS provides: compute capacity automatically
- You pay: per vCPU and GB, per second
- No servers, no patches, no capacity planning
Fargate Pricing Model:
- vCPU: $0.04048 per vCPU per hour
- Memory: $0.004445 per GB per hour
- Billed per second, 1-minute minimum
💰 Cost Comparison Example: Running a 0.25 vCPU, 0.5 GB task for 30 days:
- Fargate: ~$9/month (worked out in the sketch below)
- t3.micro EC2 (assuming 50% utilization): ~$7.50/month
- But EC2 adds management overhead: patching, monitoring
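The ~$9 Fargate figure follows directly from the rates quoted above; a quick arithmetic sketch:
## Verify the ~$9/month Fargate estimate from the quoted rates
VCPU_RATE = 0.04048   # $ per vCPU-hour
MEM_RATE = 0.004445   # $ per GB-hour
HOURS = 24 * 30       # 30 days

cost = 0.25 * VCPU_RATE * HOURS + 0.5 * MEM_RATE * HOURS
print(f"${cost:.2f}/month")  # ≈ $8.89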
🏗️ Compute Selection Framework
Choosing the right compute service depends on multiple factors:
| Factor | EC2 | Lambda | ECS/EKS | Fargate |
|---|---|---|---|---|
| Control | Full OS access | Code only | Container + orchestration | Container only |
| Scaling | Manual/Auto Scaling Groups | Automatic | Service-based | Automatic |
| Management | You patch/maintain | AWS manages all | You manage instances | AWS manages compute |
| Pricing | Per hour/second | Per 1 ms of execution | Per instance hour | Per vCPU/GB second |
| Cold Start | Minutes (boot) | 100ms-5s | Seconds (container) | Seconds (container) |
| State | Persistent | Ephemeral | Flexible | Ephemeral |
Decision Matrix:
Workload → compute mapping:
- 🌐 Web application (steady traffic): EC2 with Auto Scaling, or ECS on EC2 (containerized)
- 🔌 API with variable traffic: Lambda (event-driven), or ECS with Fargate (containerized)
- 🎮 Real-time multiplayer game server: EC2 (persistent connections), or ECS on EC2 (for multi-region)
- 📸 Image processing (triggered by uploads): Lambda (< 15 min per image), or EC2 Spot (batch, cost-sensitive)
- 🤖 Machine learning training: EC2 P3/P4 instances (GPU), or EKS with GPU nodes (distributed)
- 🗄️ Database server: EC2 (use RDS instead if possible); never Lambda (stateful, long connections)
- 🔗 Microservices architecture: ECS Fargate (managed scaling), or EKS (if Kubernetes required)
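For flashcard-style review, the same matrix can be encoded as a simple lookup. A toy sketch (categories and phrasing are simplified from the list above, not an official taxonomy):
## Toy lookup encoding the workload-to-compute matrix (study aid only)
WORKLOAD_TO_COMPUTE = {
    'web app, steady traffic': ['EC2 + Auto Scaling', 'ECS on EC2'],
    'API, variable traffic': ['Lambda', 'ECS with Fargate'],
    'real-time game server': ['EC2', 'ECS on EC2'],
    'triggered image processing': ['Lambda (< 15 min)', 'EC2 Spot'],
    'ML training': ['EC2 P3/P4', 'EKS with GPU nodes'],
    'database server': ['EC2 (prefer RDS)'],  # never Lambda: stateful
    'microservices': ['ECS Fargate', 'EKS if Kubernetes is required'],
}

print(WORKLOAD_TO_COMPUTE['API, variable traffic'])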
Real-World Examples
Example 1: E-commerce Website Architecture 🛒
Scenario: You're building an online store with:
- Web frontend (React SPA)
- REST API backend (Node.js)
- Order processing (batch jobs)
- Product image resizing
Optimal Compute Strategy:
## CloudFormation/Architecture Snippet (Conceptual)
Components:
StaticWebsite:
Service: S3 + CloudFront
Reason: No compute needed for static assets
APIBackend:
Service: ECS Fargate
Configuration:
- CPU: 0.5 vCPU
- Memory: 1 GB
- Auto Scaling: 2-10 tasks based on CPU
Reason: Containerized, variable traffic, no server management
OrderProcessing:
Service: Lambda
Trigger: SQS queue
Configuration:
- Memory: 512 MB
- Timeout: 5 minutes
- Reserved Concurrency: 10
Reason: Event-driven, short processing time, pay-per-execution
ImageResizing:
Service: Lambda
Trigger: S3 upload event
Configuration:
- Memory: 2048 MB (more memory = more CPU)
- Timeout: 1 minute
- Concurrent executions: 100
Reason: Event-driven, parallel processing, scales automatically
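Wiring the SQS-triggered order processor is one API call. A hedged sketch (the queue ARN and function name are illustrative, matching the conceptual snippet above):
## Connect the orders queue to the order-processing Lambda (names illustrative)
import boto3

lambda_client = boto3.client('lambda')
lambda_client.create_event_source_mapping(
    EventSourceArn='arn:aws:sqs:us-east-1:123456789012:orders-queue',
    FunctionName='process-orders',
    BatchSize=10  # up to 10 messages per invocation
)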
Architecture Flow:
E-COMMERCE COMPUTE ARCHITECTURE

👤 User Browser
      |
      v
📦 CloudFront (CDN)
      |
      +--- S3 (static files)
      |
      +--- API Gateway
             |
             v
      ⚙️ ECS Fargate Cluster
         (API containers)
             |
             +--- RDS (product data)
             |
             +--- SQS Queue (orders)
                    |
                    v
             ⚡ Lambda Function
                (process orders)
                    |
                    v
             📧 SNS (notifications)

Separate flow:
📷 Image upload → S3 → ⚡ Lambda (resize) → S3
Why This Mix?
- Fargate for API: Handles HTTP requests efficiently, scales based on traffic
- Lambda for processing: Decouples order processing, scales per message
- Lambda for images: Processes in parallel, only pays when images uploaded
💰 Cost Estimation (10,000 API requests/day, 100 orders/day, 50 image uploads/day):
- Fargate: ~$30/month (2 tasks running 24/7)
- Lambda (orders): ~$0.20/month (3,000 invocations; the per-request fee is a fraction of a cent, so duration charges at 512 MB dominate)
- Lambda (images): ~$0.50/month (higher memory, more CPU time)
- Total compute: ~$31/month
Example 2: Video Processing Pipeline 🎬
Scenario: Users upload videos that need transcoding to multiple resolutions.
Challenge: Video transcoding can take 10-60 minutes per video, exceeding Lambda's 15-minute limit.
Solution Architecture:
## Step Functions workflow definition (conceptual)
import json
def video_processing_pipeline():
"""
Step Functions state machine for video processing
"""
workflow = {
"StartAt": "ValidateVideo",
"States": {
"ValidateVideo": {
"Type": "Task",
"Resource": "arn:aws:lambda:...:function:ValidateVideo",
"Next": "IsValidVideo",
"TimeoutSeconds": 60
},
"IsValidVideo": {
"Type": "Choice",
"Choices": [
{
"Variable": "$.valid",
"BooleanEquals": True,
"Next": "StartECSTask"
}
],
"Default": "VideoRejected"
},
"StartECSTask": {
"Type": "Task",
"Resource": "arn:aws:states:::ecs:runTask.sync",
"Parameters": {
"LaunchType": "FARGATE",
"Cluster": "video-processing-cluster",
"TaskDefinition": "ffmpeg-transcoder"
},
"Next": "NotifyComplete",
"TimeoutSeconds": 3600
},
"NotifyComplete": {
"Type": "Task",
"Resource": "arn:aws:lambda:...:function:SendNotification",
"End": True
},
"VideoRejected": {
"Type": "Fail",
"Cause": "Invalid video format"
}
}
}
return workflow
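Deploying this definition is a single call to the Step Functions API. A sketch (the role ARN is a placeholder; the role needs permission to invoke the Lambda functions and run the ECS task referenced in the definition):
## Create the state machine from the workflow above (role ARN is a placeholder)
import json
import boto3

sfn = boto3.client('stepfunctions')
sfn.create_state_machine(
    name='video-processing-pipeline',
    definition=json.dumps(video_processing_pipeline()),
    roleArn='arn:aws:iam::123456789012:role/StepFunctionsVideoRole'
)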
Component Breakdown:
| Step | Service | Why |
|---|---|---|
| 1. Upload detection | S3 Event → Lambda | Trigger workflow on new uploads |
| 2. Video validation | Lambda (30s) | Quick check of format/codec |
| 3. Transcoding | ECS Fargate Task | No time limit, can run 1+ hours |
| 4. Notification | Lambda (5s) | Quick email/SMS via SNS |
| Orchestration | Step Functions | Coordinates multi-step workflow |
ECS Task for Transcoding:
## Dockerfile for video transcoding container
FROM jrottenberg/ffmpeg:4.4-alpine
## Install AWS CLI and Python
RUN apk add --no-cache python3 py3-pip && \
pip3 install boto3
## Copy transcoding script
COPY transcode.py /app/
ENTRYPOINT ["python3", "/app/transcode.py"]
## transcode.py - runs in the ECS container
import boto3
import subprocess
import os

s3 = boto3.client('s3')

# Target resolutions mapped to their ffmpeg scale filters
SCALE_FILTERS = {
    '1080p': 'scale=1920:1080',
    '720p': 'scale=1280:720',
    '480p': 'scale=854:480',
}

def transcode_video(input_bucket, input_key, output_bucket):
    # Download the source video from S3
    local_input = '/tmp/input.mp4'
    s3.download_file(input_bucket, input_key, local_input)
    # Transcode to each resolution, then upload the result
    for res, scale in SCALE_FILTERS.items():
        output_file = f'/tmp/output_{res}.mp4'
        cmd = ['ffmpeg', '-i', local_input, '-vf', scale,
               '-c:v', 'libx264', '-crf', '23', output_file]
        subprocess.run(cmd, check=True)
        output_key = f'processed/{res}/{input_key}'
        s3.upload_file(output_file, output_bucket, output_key)
        os.remove(output_file)
    os.remove(local_input)
    return {'status': 'success', 'resolutions': list(SCALE_FILTERS)}
if __name__ == '__main__':
# Get parameters from environment (set by ECS task)
input_bucket = os.environ['INPUT_BUCKET']
input_key = os.environ['INPUT_KEY']
output_bucket = os.environ['OUTPUT_BUCKET']
result = transcode_video(input_bucket, input_key, output_bucket)
print(result)
Why This Works:
- ⏱️ No time limits on ECS Fargate tasks
- 💰 Pay only while transcoding (the task runs, then stops)
- 📈 Scales automatically based on queue depth
- 🔄 Step Functions coordinates the workflow
Example 3: Auto Scaling Web Application 📰
Scenario: News website with traffic spikes during breaking news.
Requirements:
- Handle 1,000 req/sec during normal hours
- Scale to 50,000 req/sec during news events
- Minimize cost during low traffic
Solution: EC2 Auto Scaling with Mixed Instance Types
## Auto Scaling Group Configuration
AutoScalingGroup:
LaunchTemplate:
InstanceType:
- t3.medium # Burstable for baseline
- m5.large # General purpose for peaks
- c5.large # Compute-optimized for high load
MixedInstancesPolicy:
InstancesDistribution:
OnDemandBaseCapacity: 2
OnDemandPercentageAboveBaseCapacity: 20
SpotInstancePools: 3
SpotAllocationStrategy: capacity-optimized
Scaling:
MinSize: 2
MaxSize: 50
DesiredCapacity: 2
TargetTrackingPolicies:
- MetricType: ALBRequestCountPerTarget
TargetValue: 1000 # requests per instance
- MetricType: CPUUtilization
TargetValue: 70
Cost Optimization Strategy:
AUTO SCALING INSTANCE MIX

Low traffic (2-5 instances):
- 2× On-Demand t3.medium (baseline) 💰 $$$
- 0-3× Spot instances (opportunistic) 💰 $

Medium traffic (6-20 instances):
- 2× On-Demand t3.medium (baseline) 💰 $$$
- 1× On-Demand m5.large (stable) 💰 $$$
- 3-17× Spot (bulk capacity) 💰 $

High traffic (21-50 instances):
- 2× On-Demand t3.medium 💰 $$$
- 4× On-Demand m5/c5 (20% of extra capacity) 💰 $$$
- 15-44× Spot (80% of scaling) 💰 $

Spot instances offer up to 90% savings; diversified pools reduce interruption risk.
Scaling Behavior:
## CloudWatch Alarm → Auto Scaling scaling policy
import math

def calculate_required_capacity(current_rps, avg_rps_per_instance=1000):
    """
    Determine the number of instances needed for the current request rate.
    """
    required_instances = current_rps / avg_rps_per_instance
    # Add a 20% buffer for headroom, rounding up to whole instances
    buffered_instances = math.ceil(required_instances * 1.2)
    return max(2, buffered_instances)  # Minimum 2 instances
## Example scaling events
events = [
{"time": "09:00", "rps": 800, "instances": 2},
{"time": "12:00", "rps": 5000, "instances": 6},
{"time": "14:30", "rps": 45000, "instances": 54}, # Breaking news!
{"time": "16:00", "rps": 8000, "instances": 10},
{"time": "22:00", "rps": 1200, "instances": 2}
]
for event in events:
    required = calculate_required_capacity(event['rps'])
    print(f"{event['time']}: {event['rps']} req/s → {required} instances")
💡 Key Insight: Using 80% Spot instances can reduce compute costs by 60-70% overall, while 20% On-Demand ensures availability if Spot capacity is interrupted.
Example 4: Machine Learning Inference Pipeline 🤖
Scenario: Image classification service that receives photos and returns detected objects.
Requirements:
- Inference time: 200-500ms per image
- Variable traffic: 10-10,000 images/hour
- Need GPU for fast inference
Solution Comparison:
| Approach | Service | Cost/Performance | Complexity |
|---|---|---|---|
| Option A: Lambda | Lambda + SageMaker Endpoint | $$ / Good for variable load | Low |
| Option B: ECS | ECS Fargate + CPU inference | $$$ / Slower inference | Medium |
| Option C: EC2 | EC2 P3 instances + Auto Scaling | $$$$ / Fastest, expensive idle | High |
| Option D: Hybrid | Lambda → SageMaker Serverless | $ / Best for sporadic use | Low |
Recommended: Hybrid Approach with SageMaker Serverless
## Lambda function for inference
import json
import boto3
import base64
sagemaker_runtime = boto3.client('sagemaker-runtime')
def lambda_handler(event, context):
"""
Receive image, invoke SageMaker endpoint, return predictions
"""
# Extract base64-encoded image from API Gateway
image_data = base64.b64decode(event['body'])
# Invoke SageMaker Serverless endpoint
response = sagemaker_runtime.invoke_endpoint(
EndpointName='image-classifier-serverless',
ContentType='application/x-image',
Body=image_data
)
# Parse predictions
predictions = json.loads(response['Body'].read().decode())
# Return top 3 predictions
top_predictions = sorted(
predictions,
key=lambda x: x['score'],
reverse=True
)[:3]
return {
'statusCode': 200,
'body': json.dumps({
'predictions': top_predictions
})
}
SageMaker Serverless Inference Configuration:
## SageMaker endpoint configuration
import sagemaker
from sagemaker.serverless import ServerlessInferenceConfig

serverless_config = ServerlessInferenceConfig(
    memory_size_in_mb=4096,  # 1024-6144 MB, in 1 GB increments
    max_concurrency=10       # Max concurrent requests per endpoint
)

# `model` is a sagemaker.Model built earlier from your trained artifacts
predictor = model.deploy(
    serverless_inference_config=serverless_config,
    endpoint_name='image-classifier-serverless'
)
Why This Architecture?
- 💰 Cost: Pay only for inference time (per second), not idle GPU time
- ⚡ Performance: SageMaker handles model loading, scales automatically
- 🔧 Simplicity: Lambda + managed endpoint, no infrastructure to run
- 📈 Scaling: Handles variable traffic without pre-provisioning
Cost Comparison (1,000 inferences/day):
- Lambda + SageMaker Serverless: ~$15/month
- EC2 P3 instance (24/7): ~$2,200/month
- Savings: over 99%
Common Mistakes to Avoid ⚠️
Mistake 1: Using Lambda for Long-Running Tasks
❌ Wrong:
## Lambda function that processes large videos
def lambda_handler(event, context):
    video_url = event['video_url']
    download_video(video_url)  # 5 minutes
    transcode_video()          # 20 minutes ← FAILS! Lambda max is 15 min
    upload_result()
    return {'status': 'success'}
✅ Correct:
## Lambda triggers ECS Fargate task for long processing
import boto3
ecs = boto3.client('ecs')
def lambda_handler(event, context):
video_url = event['video_url']
# Start ECS task (no time limit)
response = ecs.run_task(
cluster='video-processing',
taskDefinition='video-transcoder',
launchType='FARGATE',
overrides={
'containerOverrides': [{
'name': 'transcoder',
'environment': [
{'name': 'VIDEO_URL', 'value': video_url}
]
}]
}
)
return {'status': 'task_started', 'taskArn': response['tasks'][0]['taskArn']}
Mistake 2: Ignoring Cold Starts
❌ Problem: Lambda cold starts causing 2-5 second delays for latency-sensitive APIs.
✅ Solutions:
## Option 1: Provisioned Concurrency (keeps functions warm)
import boto3

lambda_client = boto3.client('lambda')

## Reserve 5 always-warm execution environments on a published version
## or alias (provisioned concurrency cannot target $LATEST)
lambda_client.put_provisioned_concurrency_config(
    FunctionName='latency-sensitive-api',
    ProvisionedConcurrentExecutions=5,
    Qualifier='prod'  # alias pointing at a published version
)
## Option 2: Use Lambda SnapStart (Java)
## Snapshots the initialized function state for near-instant startup

## Option 3: For < 100 ms latency requirements, use:
## - API Gateway → ALB → ECS Fargate (persistent containers)
## - API Gateway → EC2 instances (always running)
Mistake 3: Wrong EC2 Instance Family
❌ Wrong: Using compute-optimized (C5) instances for memory-intensive caching
| Instance Type | vCPU | RAM | Cost/hour | GB RAM per $/hour |
|---|---|---|---|---|
| c5.2xlarge | 8 | 16 GB | $0.34 | ~47 |
| r5.2xlarge | 8 | 64 GB | $0.50 | ~128 |
✅ Correct: For Redis/Memcached, use R5 (memory-optimized) for 2.7× more RAM per dollar.
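The last column is just RAM divided by the hourly price, which is worth checking yourself:
## RAM per dollar-hour, computed from the table above
c5 = 16 / 0.34   # ≈ 47 GB per $/hour
r5 = 64 / 0.50   # ≈ 128 GB per $/hour
print(f"r5 advantage: {r5 / c5:.1f}x")  # ≈ 2.7x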
Mistake 4: Not Using Spot Instances for Fault-Tolerant Workloads
❌ Wrong: Running batch jobs on On-Demand instances
## Batch processing job - NOT cost-optimized
BatchJob:
InstanceType: m5.4xlarge
PricingModel: OnDemand
Cost: $0.768/hour
Annual: $6,730 (if running 24/7)
✅ Correct: Use Spot instances for up to a 90% discount
## AWS Batch job definition with Spot
import boto3
batch = boto3.client('batch')
compute_environment = {
'type': 'MANAGED',
'computeResources': {
'type': 'SPOT',
'allocationStrategy': 'SPOT_CAPACITY_OPTIMIZED',
'instanceTypes': ['m5.4xlarge', 'm5a.4xlarge', 'm5n.4xlarge'],
'minvCpus': 0,
'maxvCpus': 256,
'spotIamFleetRole': 'arn:aws:iam::...:role/SpotFleetRole',
'bidPercentage': 100 # Pay up to 100% of On-Demand price
}
}
## Cost: ~$0.08/hour (spot price varies)
## Annual savings: ~$6,000 (89% reduction)
Mistake 5: Over-Provisioning Fargate Memory
❌ Wrong: Setting memory to 4 GB when the container uses 512 MB
{
"family": "api-server",
"cpu": "1024",
"memory": "4096",
"containerDefinitions": [{
"name": "app",
"memory": 512
}]
}
Cost: 1 vCPU ($0.04048/hr) + 4 GB ($0.01778/hr) = $0.05826/hr ≈ $1.40/day
✅ Correct: Right-size memory to actual usage
{
"family": "api-server",
"cpu": "1024",
"memory": "1024",
"containerDefinitions": [{
"name": "app",
"memory": 512
}]
}
Cost: 1 vCPU ($0.04048/hr) + 1 GB ($0.00445/hr) = $0.04493/hr ≈ $1.08/day (23% savings)
💡 Monitoring Tip: Use Container Insights to track actual memory usage and right-size accordingly, as in the sketch below.
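One way to pull actual usage is through the CloudWatch metrics that Container Insights publishes. A hedged sketch (assumes Container Insights is enabled on the cluster; the cluster and service names are placeholders):
## Read peak task memory from Container Insights to right-size Fargate
import datetime
import boto3

cw = boto3.client('cloudwatch')
stats = cw.get_metric_statistics(
    Namespace='ECS/ContainerInsights',
    MetricName='MemoryUtilized',
    Dimensions=[
        {'Name': 'ClusterName', 'Value': 'api-cluster'},
        {'Name': 'ServiceName', 'Value': 'api-server'},
    ],
    StartTime=datetime.datetime.utcnow() - datetime.timedelta(days=7),
    EndTime=datetime.datetime.utcnow(),
    Period=3600,
    Statistics=['Maximum'],
)
peak_mb = max(dp['Maximum'] for dp in stats['Datapoints'])
print(f"Peak memory over 7 days: {peak_mb:.0f} MB")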
Key Takeaways 🎯
📋 Quick Reference Card: AWS Compute Services
| Service | Use When | Avoid When | Pricing |
|---|---|---|---|
| EC2 | Full OS control needed; persistent state; specialized software | Variable traffic; event-driven; want zero ops | $/hour |
| Lambda | Event-driven; < 15 min execution; variable load | Long-running tasks; persistent state; low latency critical | $/ms of execution |
| ECS | Containerized apps; AWS-native stack; microservices | Need Kubernetes; multi-cloud required | Fargate: $/vCPU-sec; EC2: $/hour |
| EKS | Kubernetes required; multi-cloud; complex orchestration | Simple workloads; small teams; avoiding K8s complexity | $0.10/hr per cluster + compute |
| Fargate | Containers without servers; variable load; ops simplicity | Need EC2 features; cost-sensitive 24/7 workloads | $/vCPU-sec + $/GB-sec |
🧠 Memory Devices:
- ELEFF = EC2, Lambda, ECS, EKS, Fargate, Flexibility
- "The Compute Spectrum": Full Control (EC2) → Container Control (ECS/EKS) → No Control (Lambda)
💰 Cost Optimization Rules:
- Reserved Instances for predictable workloads (40-75% savings)
- Spot Instances for fault-tolerant jobs (70-90% savings)
- Right-size instance types (match CPU/memory to workload)
- Use Fargate for variable container workloads
- Use Lambda for event-driven, short-duration tasks
⚡ Performance Guidelines:
- Latency < 100ms: EC2 or persistent containers (avoid Lambda cold starts)
- Throughput-focused: Use compute-optimized (C5/C6) instances
- Memory-intensive: Use memory-optimized (R5/R6) instances
- GPU workloads: P3/P4 instances or SageMaker
🔒 Security Best Practices:
- Always use IAM roles for EC2/ECS (never hardcode credentials)
- Enable VPC Flow Logs for network monitoring
- Use Security Groups as stateful firewalls
- Encrypt data at rest (EBS volumes) and in transit (TLS)
- Patch EC2 instances regularly (use Systems Manager Patch Manager)
📚 Further Study
- AWS Official Documentation: EC2 Instance Types - Comprehensive guide to all instance families
- AWS Compute Blog: AWS Compute Blog - Real-world architecture patterns and best practices
- AWS Well-Architected Framework: Compute Best Practices - Official architectural guidance
🎉 You've completed AWS Compute & Runtime Architecture! You now understand how to choose between EC2, Lambda, ECS, EKS, and Fargate based on workload characteristics, cost requirements, and operational complexity. Practice architecting solutions for different scenarios to solidify your knowledge.
💡 Next Steps: Explore AWS networking (VPC, Load Balancers) to understand how compute services communicate, or dive into storage services (S3, EBS, EFS) to complete your infrastructure knowledge.