Queues & Pub/Sub
SQS standard vs FIFO, SNS fan-out patterns, and DLQ strategies
AWS Queues and Pub/Sub Messaging
Master event-driven architecture with free flashcards and hands-on AWS examples. This lesson covers Amazon SQS queues, Amazon SNS topics, and message routing patternsβessential concepts for building scalable, decoupled cloud applications.
Welcome to Event-Driven Messaging π
In modern cloud architectures, services need to communicate without being tightly coupled. Instead of Service A calling Service B directly, they exchange messages through intermediaries. Amazon SQS (Simple Queue Service) and Amazon SNS (Simple Notification Service) are AWS's managed messaging services that enable this decoupled communication.
Think of messaging like a postal system: SQS is like a post office box where messages wait until someone retrieves them, while SNS is like a broadcasting station that sends the same message to multiple subscribers simultaneously.
Core Concepts: Queues vs Pub/Sub π¬
Amazon SQS: The Queue Model
Amazon SQS implements a point-to-point messaging pattern. Messages are stored in a queue until a consumer retrieves and processes them.
Producer β [Queue] β Consumer
ββββββββββββ βββββββββββββββββββ ββββββββββββ
β Service β send β SQS Queue β poll β Service β
β A βββββββββββ βββββ βββββ βββββββββββ B β
β (writes) β β β M β β M β β β (reads) β
ββββββββββββ β βββββ βββββ β ββββββββββββ
β Messages wait β
βββββββββββββββββββ
Key characteristics:
- One consumer per message: Each message is processed by exactly one consumer
- Persistence: Messages remain in queue until explicitly deleted
- Pull-based: Consumers poll the queue for new messages
- Order: Standard queues offer best-effort ordering; FIFO queues guarantee order
π‘ When to use SQS: Workload distribution, task queues, buffering between services, decoupling microservices
Amazon SNS: The Pub/Sub Model
Amazon SNS implements a publish-subscribe pattern. Publishers send messages to topics, and all subscribers receive copies.
Publisher β [Topic] β Multiple Subscribers
βββββββββββββββββββ
β SNS Topic β
ββββββββββββ β "OrderEvent" β
βPublisher β publish β β
β Service βββββββββββ ββββββββββ β
ββββββββββββ β βMessage β β
β ββββββββββ β
ββββββ¬ββββ¬βββββ¬ββββ
β β β
ββββββββββββ β ββββββββββββ
β β β
ββββββββββββ ββββββββββββ ββββββββββββ
βSubscriberβ βSubscriberβ βSubscriberβ
β 1 β β 2 β β 3 β
β (Email) β β (Lambda) β β (SQS) β
ββββββββββββ ββββββββββββ ββββββββββββ
Key characteristics:
- Fan-out: One message reaches multiple subscribers
- Push-based: SNS pushes messages to subscribers
- Protocol flexibility: HTTP/S, Email, SMS, Lambda, SQS, Mobile push
- No persistence: Messages are delivered immediately or lost
π‘ When to use SNS: Broadcasting events, triggering multiple workflows, sending notifications, mobile push alerts
The SNS + SQS Fan-Out Pattern π
The most powerful pattern combines both services:
ββββββββββββ βββββββββββββββββββ
βPublisher βββββββββ SNS Topic β
ββββββββββββ β "OrderPlaced" β
ββββββ¬βββ¬βββ¬ββββββββ
β β β
βββββββββββββ β βββββββββββββ
β β β
ββββββββββ ββββββββββ ββββββββββ
βSQS β βSQS β βSQS β
βInvoice β βShip β βEmail β
βQueue β βQueue β βQueue β
ββββββ¬ββββ ββββββ¬ββββ ββββββ¬ββββ
β β β
ββββββββββ ββββββββββ ββββββββββ
βBilling β βFulfil β βNotify β
βService β βService β βService β
ββββββββββ ββββββββββ ββββββββββ
Benefits:
- Each service processes at its own pace
- Guaranteed delivery (SQS persists messages)
- Independent scaling per subscriber
- Failed consumers don't affect others
Message Properties and Configuration π§
SQS Queue Types
| Feature | Standard Queue | FIFO Queue |
|---|---|---|
| Throughput | Unlimited TPS | 300 TPS (3000 with batching) |
| Ordering | Best-effort ordering | Strict ordering guaranteed |
| Delivery | At-least-once (possible duplicates) | Exactly-once processing |
| Use Case | High throughput, order not critical | Banking, order processing |
| Naming | Any name | Must end with .fifo |
Critical Queue Parameters
Visibility Timeout β±οΈ
When a consumer receives a message, it becomes invisible to other consumers for a set duration:
βββββββββββββββββββββββββββββββββββββββββββββββββββ
β Message Lifecycle with Visibility Timeout β
βββββββββββββββββββββββββββββββββββββββββββββββββββ
1. Message in Queue (visible)
[π¬ Message]
β
β Consumer calls ReceiveMessage()
2. Processing (invisible for 30s)
[π Message - hidden from others]
β
βββ Success β DeleteMessage() β β
Done
β
βββ Failure β Timeout expires β π¬ Visible again
- Default: 30 seconds
- Range: 0 seconds to 12 hours
- Best practice: Set to 6x your average processing time
Message Retention Period
- Default: 4 days
- Range: 1 minute to 14 days
- Messages older than retention period are automatically deleted
Receive Message Wait Time (Long Polling)
- Short polling (0s): Returns immediately, may return empty even if messages arrive soon
- Long polling (1-20s): Waits for messages, reduces API calls and cost
- π‘ Always use long polling (set to 20s) for cost optimization
Dead-Letter Queue (DLQ) π
Messages that fail processing repeatedly get moved to a DLQ:
ββββββββββββββββ Failed 3x ββββββββββββββββ
β Main Queue βββββββββββββββββ Dead-Letter β
β β β Queue β
β Process here β β Inspect here β
ββββββββββββββββ ββββββββββββββββ
β β
Normal flow Manual review
- maxReceiveCount: How many failed attempts before moving to DLQ
- Use case: Debug failures, prevent poison messages from blocking queue
SNS Message Filtering π―
Subscribers can filter which messages they receive using filter policies:
{
"eventType": ["order_placed", "order_cancelled"],
"price": [{"numeric": [">", 100]}],
"region": ["us-east-1", "eu-west-1"]
}
Only messages matching the filter are delivered to that subscriber.
Real-World Examples π»
Example 1: Basic SQS Producer and Consumer
Producer (Node.js with AWS SDK):
const { SQSClient, SendMessageCommand } = require('@aws-sdk/client-sqs');
const sqsClient = new SQSClient({ region: 'us-east-1' });
const queueUrl = 'https://sqs.us-east-1.amazonaws.com/123456789/MyQueue';
async function sendOrder(orderId, customerId, amount) {
const message = {
orderId: orderId,
customerId: customerId,
amount: amount,
timestamp: new Date().toISOString()
};
const params = {
QueueUrl: queueUrl,
MessageBody: JSON.stringify(message),
MessageAttributes: {
'OrderType': {
DataType: 'String',
StringValue: 'Standard'
},
'Priority': {
DataType: 'Number',
StringValue: '1'
}
}
};
const result = await sqsClient.send(new SendMessageCommand(params));
console.log('Message sent:', result.MessageId);
return result.MessageId;
}
// Send a message
sendOrder('ORD-12345', 'CUST-789', 99.99);
Consumer (Node.js):
const { SQSClient, ReceiveMessageCommand, DeleteMessageCommand } = require('@aws-sdk/client-sqs');
const sqsClient = new SQSClient({ region: 'us-east-1' });
const queueUrl = 'https://sqs.us-east-1.amazonaws.com/123456789/MyQueue';
async function processMessages() {
const params = {
QueueUrl: queueUrl,
MaxNumberOfMessages: 10,
WaitTimeSeconds: 20, // Long polling
MessageAttributeNames: ['All']
};
while (true) {
const data = await sqsClient.send(new ReceiveMessageCommand(params));
if (data.Messages) {
for (const message of data.Messages) {
try {
const order = JSON.parse(message.Body);
console.log('Processing order:', order.orderId);
// Business logic here
await processOrder(order);
// Delete message after successful processing
await sqsClient.send(new DeleteMessageCommand({
QueueUrl: queueUrl,
ReceiptHandle: message.ReceiptHandle
}));
console.log('Order processed and deleted');
} catch (error) {
console.error('Processing failed:', error);
// Message stays in queue, will be retried
}
}
}
}
}
processMessages();
Key points:
- ReceiptHandle: Unique token needed to delete/modify the message
- Long polling (20s) reduces empty responses
- Batch processing: Retrieve up to 10 messages at once
- Error handling: Failed messages automatically reappear
Example 2: SNS Topic Publishing with Multiple Subscribers
Create topic and subscriptions:
import boto3
import json
sns_client = boto3.client('sns', region_name='us-east-1')
sqs_client = boto3.client('sqs', region_name='us-east-1')
## Create SNS topic
topic_response = sns_client.create_topic(Name='OrderEvents')
topic_arn = topic_response['TopicArn']
## Create three SQS queues for different services
queues = {}
for queue_name in ['InvoiceQueue', 'ShippingQueue', 'EmailQueue']:
queue_response = sqs_client.create_queue(QueueName=queue_name)
queues[queue_name] = queue_response['QueueUrl']
# Get queue ARN
attrs = sqs_client.get_queue_attributes(
QueueUrl=queue_response['QueueUrl'],
AttributeNames=['QueueArn']
)
queue_arn = attrs['Attributes']['QueueArn']
# Subscribe queue to topic
sns_client.subscribe(
TopicArn=topic_arn,
Protocol='sqs',
Endpoint=queue_arn
)
print(f'Subscribed {queue_name} to OrderEvents topic')
## Publish event to topic
def publish_order_event(order_id, customer_id, total):
message = {
'eventType': 'order_placed',
'orderId': order_id,
'customerId': customer_id,
'total': total,
'timestamp': '2024-01-15T10:30:00Z'
}
response = sns_client.publish(
TopicArn=topic_arn,
Message=json.dumps(message),
MessageAttributes={
'eventType': {'DataType': 'String', 'StringValue': 'order_placed'},
'priority': {'DataType': 'Number', 'StringValue': '1'}
},
Subject='New Order Notification'
)
print(f'Published to SNS: {response["MessageId"]}')
return response['MessageId']
publish_order_event('ORD-99999', 'CUST-456', 149.99)
Result: One published message fans out to three queues, each processed independently by its service.
Example 3: FIFO Queue with Message Groups
FIFO queues support message groups for parallel processing while maintaining order within each group:
import boto3
import json
from datetime import datetime
sqs = boto3.client('sqs', region_name='us-east-1')
## Create FIFO queue
queue_response = sqs.create_queue(
QueueName='OrderProcessing.fifo',
Attributes={
'FifoQueue': 'true',
'ContentBasedDeduplication': 'true',
'DeduplicationScope': 'messageGroup',
'FifoThroughputLimit': 'perMessageGroupId'
}
)
queue_url = queue_response['QueueUrl']
## Send messages for different customers (each customer is a message group)
def send_order_update(customer_id, order_id, status):
message = {
'customerId': customer_id,
'orderId': order_id,
'status': status,
'timestamp': datetime.now().isoformat()
}
response = sqs.send_message(
QueueUrl=queue_url,
MessageBody=json.dumps(message),
MessageGroupId=customer_id, # Orders for same customer stay ordered
MessageDeduplicationId=f'{order_id}-{status}' # Prevent duplicates
)
return response['MessageId']
## These messages for customer-123 will be processed in order
send_order_update('customer-123', 'ORD-001', 'placed')
send_order_update('customer-123', 'ORD-001', 'confirmed')
send_order_update('customer-123', 'ORD-001', 'shipped')
## These for customer-456 can process in parallel with customer-123's orders
send_order_update('customer-456', 'ORD-002', 'placed')
send_order_update('customer-456', 'ORD-002', 'confirmed')
How message groups work:
FIFO Queue with Message Groups
βββββββββββββββββββββββββββββββββββββββββββ
β OrderProcessing.fifo β
βββββββββββββββββββββββββββββββββββββββββββ€
β β
β Group: customer-123 β
β βββββ βββββ βββββ β
β β 1 βββ 2 βββ 3 β (strict order) β
β βββββ βββββ βββββ β
β β
β Group: customer-456 β
β βββββ βββββ β
β β 1 βββ 2 β (strict order) β
β βββββ βββββ β
β β
β Group: customer-789 β
β βββββ βββββ βββββ βββββ β
β β 1 βββ 2 βββ 3 βββ 4 β β
β βββββ βββββ βββββ βββββ β
βββββββββββββββββββββββββββββββββββββββββββ
β β β
Consumer Consumer Consumer
(parallel processing across groups)
- Within a group: Strict FIFO order
- Across groups: Parallel processing possible
- ContentBasedDeduplication: Uses SHA-256 hash of message body to detect duplicates
Example 4: Lambda Function Triggered by SQS
import json
import boto3
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('Orders')
def lambda_handler(event, context):
"""
Processes SQS messages in batch.
Event contains up to 10 messages.
"""
successful_messages = []
failed_messages = []
for record in event['Records']:
try:
# Parse message body
message_body = json.loads(record['body'])
order_id = message_body['orderId']
# Process the order
table.put_item(
Item={
'OrderId': order_id,
'CustomerId': message_body['customerId'],
'Amount': message_body['amount'],
'Status': 'processing',
'ProcessedAt': record['attributes']['SentTimestamp']
}
)
print(f'Successfully processed order: {order_id}')
successful_messages.append(record['messageId'])
except Exception as e:
print(f'Error processing message: {str(e)}')
failed_messages.append({
'itemIdentifier': record['messageId']
})
# Return partial batch failure response
# Failed messages will be retried
if failed_messages:
return {
'batchItemFailures': failed_messages
}
return {
'statusCode': 200,
'body': json.dumps(f'Processed {len(successful_messages)} messages')
}
Lambda + SQS integration features:
- Batch processing: Lambda receives up to 10 messages per invocation
- Automatic scaling: Lambda scales consumers based on queue depth
- Partial batch failures: Return failed message IDs to retry only those
- DLQ integration: Failed messages after max retries go to DLQ
Common Mistakes and How to Avoid Them β οΈ
Mistake 1: Not Deleting Messages After Processing
β Wrong:
const data = await sqs.receiveMessage({QueueUrl: queueUrl});
for (const message of data.Messages) {
processOrder(JSON.parse(message.Body));
// Forgot to delete - message will reappear!
}
β Correct:
const data = await sqs.receiveMessage({QueueUrl: queueUrl});
for (const message of data.Messages) {
await processOrder(JSON.parse(message.Body));
await sqs.deleteMessage({
QueueUrl: queueUrl,
ReceiptHandle: message.ReceiptHandle
});
}
Why it matters: Undeleted messages reappear after visibility timeout, causing duplicate processing.
Mistake 2: Visibility Timeout Too Short
β Wrong: Setting visibility timeout to 5 seconds when processing takes 30 seconds
What happens:
0s: Consumer A receives message 5s: Visibility timeout expires 5s: Consumer B receives same message 30s: Consumer A finishes, tries to delete (fails - ReceiptHandle expired) 35s: Consumer B finishes, deletes successfully Result: Message processed twice! π±
β
Correct: Set visibility timeout to at least 6Γ average processing time, or use changeMessageVisibility() to extend it during processing.
Mistake 3: Using Standard Queue When Order Matters
β Wrong:
## Using standard queue for bank transactions
sqs.send_message(QueueUrl=queue, MessageBody='DEPOSIT:$100')
sqs.send_message(QueueUrl=queue, MessageBody='WITHDRAW:$100')
sqs.send_message(QueueUrl=queue, MessageBody='WITHDRAW:$50')
## Consumer might process out of order:
## WITHDRAW:$100 β WITHDRAW:$50 β DEPOSIT:$100 β Overdraft!
β Correct: Use FIFO queue for ordered operations:
queue_url = 'https://sqs.us-east-1.amazonaws.com/123456789/Transactions.fifo'
sqs.send_message(
QueueUrl=queue_url,
MessageBody='DEPOSIT:$100',
MessageGroupId='account-12345'
)
Mistake 4: SNS Without Permission Policy on SQS
β Wrong: Subscribe SQS to SNS without granting permission
Result: Messages are lost silently - SNS can't write to queue!
β Correct:
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {"Service": "sns.amazonaws.com"},
"Action": "sqs:SendMessage",
"Resource": "arn:aws:sqs:us-east-1:123456789:MyQueue",
"Condition": {
"ArnEquals": {
"aws:SourceArn": "arn:aws:sns:us-east-1:123456789:MyTopic"
}
}
}]
}
Attach this policy to your SQS queue.
Mistake 5: Not Using Long Polling
β Wrong:
response = sqs.receive_message(
QueueUrl=queue_url,
WaitTimeSeconds=0 # Short polling - returns immediately
)
Cost impact:
- Short polling: 10,000 API calls/hour = $0.004/hour = $2.88/month
- Long polling: 180 API calls/hour = $0.00007/hour = $0.05/month
β Correct:
response = sqs.receive_message(
QueueUrl=queue_url,
WaitTimeSeconds=20 # Long polling - waits for messages
)
Savings: 98% reduction in API calls and cost! π°
Mistake 6: Forgetting Message Size Limits
Limits:
- SQS message body: 256 KB maximum
- SNS message: 256 KB maximum
β Wrong:
// Trying to send 2 MB image in message body
const imageData = fs.readFileSync('large-image.jpg', 'base64');
sqs.sendMessage({
QueueUrl: queueUrl,
MessageBody: JSON.stringify({image: imageData}) // β Fails!
});
β Correct pattern:
// Upload large payload to S3, send reference in message
const s3Key = `images/${orderId}.jpg`;
await s3.putObject({
Bucket: 'my-bucket',
Key: s3Key,
Body: imageBuffer
});
const message = {
orderId: orderId,
imageLocation: `s3://my-bucket/${s3Key}`,
imageSize: imageBuffer.length
};
await sqs.sendMessage({
QueueUrl: queueUrl,
MessageBody: JSON.stringify(message) // β
Only metadata
});
Extended Client Library: AWS provides SQS Extended Client Library that automatically uses S3 for large messages.
Key Takeaways π―
SQS (Queue) Architecture:
- β Point-to-point messaging (one consumer per message)
- β Messages persist until explicitly deleted
- β Pull-based (consumers poll for messages)
- β Standard queues: unlimited throughput, best-effort ordering
- β FIFO queues: strict ordering, exactly-once processing, 3000 TPS
- β Visibility timeout prevents duplicate processing
- β Dead-letter queues handle poison messages
SNS (Pub/Sub) Architecture:
- β Publish-subscribe pattern (fan-out to multiple subscribers)
- β Push-based delivery
- β Multiple protocols: HTTP, Email, SMS, Lambda, SQS, Mobile
- β Message filtering with filter policies
- β No message persistence (deliver or lose)
- β Ideal for broadcasting events
Best Practices:
- Always delete messages after successful processing
- Use long polling (20 seconds) to reduce costs
- Set visibility timeout to 6Γ processing time
- Implement DLQ to handle failures
- Use SNS+SQS fan-out for durable multi-subscriber patterns
- Store large payloads in S3, send references in messages
- Choose FIFO queues when order matters
- Apply SQS queue policies when subscribing to SNS
- Use message groups in FIFO for parallel processing
- Monitor queue depth with CloudWatch for auto-scaling
Cost optimization:
- Batch operations (send/receive up to 10 messages at once)
- Use long polling to reduce API calls
- Delete messages promptly to avoid reprocessing
- Set appropriate message retention (don't pay for stale messages)
π§ Memory device: "SQS is Storage (messages wait), SNS is Notification (immediate delivery)"
π Further Study
- AWS SQS Developer Guide - Official comprehensive documentation
- AWS SNS Developer Guide - Pub/Sub service deep dive
- Event-Driven Architecture Best Practices - AWS architectural guidance
π Quick Reference Card
| Concept | Key Point |
|---|---|
| SQS Standard | Unlimited throughput, at-least-once delivery, best-effort ordering |
| SQS FIFO | 3000 TPS, exactly-once, strict order, name ends with .fifo |
| Visibility Timeout | Default 30s, range 0-12h, set to 6Γ processing time |
| Message Retention | Default 4 days, max 14 days |
| Long Polling | Set WaitTimeSeconds to 20, reduces costs by 98% |
| Message Size | 256 KB max - use S3 for larger payloads |
| SNS Fan-Out | One publish β many subscribers (SQS, Lambda, HTTP, Email) |
| Message Groups | FIFO feature for parallel processing with per-group ordering |
| DLQ | Holds messages after maxReceiveCount failures |
| Batch Operations | Send/receive up to 10 messages per API call |