Security, Governance & Observability

Last generated Jan 7, 2026 UTC

AWS Security, Governance & Observability

Master AWS security, governance, and observability with free flashcards and spaced repetition practice. This lesson covers IAM policies, CloudWatch monitoring, CloudTrail auditing, AWS Config compliance, and AWS Organizations—essential concepts for the AWS Solutions Architect and Security Specialty certifications.

Welcome 🔐

Security, governance, and observability form the foundation of well-architected AWS infrastructure. Without proper identity management, monitoring, and compliance mechanisms, even the most sophisticated applications become vulnerable to breaches, cost overruns, and regulatory violations.

This lesson takes you through the core AWS services that protect your infrastructure, ensure compliance, and provide visibility into system behavior. You'll learn practical implementation patterns, common misconfigurations to avoid, and how these services work together to create a comprehensive security posture.

Core Concepts 💡

Identity and Access Management (IAM) 🔑

AWS IAM controls who can access your AWS resources and what they can do with them. It's the first line of defense in your security architecture.

Key IAM Components:

Component	Purpose	Best Practice
Users	Individual identities	Enable MFA for all users
Groups	Collection of users	Assign permissions to groups, not users
Roles	Temporary credentials for services/apps	Use roles instead of access keys
Policies	JSON documents defining permissions	Follow least privilege principle

IAM Policy Structure:

Every IAM policy follows this JSON structure:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-bucket/*",
      "Condition": {
        "IpAddress": {
          "aws:SourceIp": "203.0.113.0/24"
        }
      }
    }
  ]
}

Policy Components Breakdown:

Effect: Allow or Deny (explicit deny always wins)
Action: What operations are permitted (e.g., s3:GetObject, ec2:*)
Resource: Which AWS resources the policy applies to (ARN format)
Condition: Optional constraints (IP addresses, time of day, MFA, etc.)

💡 Tip: Use the IAM Policy Simulator to test policies before deploying them to production.

IAM Best Practices:

Root account protection: Never use root for daily tasks; enable MFA
Least privilege: Grant only the permissions needed for a specific task
Role-based access: Use IAM roles for EC2 instances and Lambda functions
Password policies: Enforce strong passwords with rotation requirements
Access key rotation: Rotate credentials regularly (max 90 days)
CloudTrail logging: Monitor all IAM actions
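
The 90-day rotation guideline in the list above can be checked mechanically. The sketch below is a minimal, offline illustration: it assumes key metadata has already been fetched into plain dictionaries, and the `keys_needing_rotation` helper and the sample key IDs are hypothetical.

```python
from datetime import datetime, timedelta, timezone

MAX_KEY_AGE = timedelta(days=90)  # rotation window from the best-practice list

def keys_needing_rotation(keys, now=None):
    """Return IDs of active access keys older than MAX_KEY_AGE.

    `keys` is a list of dicts with 'AccessKeyId', 'Status', and 'CreateDate'
    (an assumed sample shape, modelled on IAM access key metadata).
    """
    now = now or datetime.now(timezone.utc)
    return [
        key["AccessKeyId"]
        for key in keys
        if key["Status"] == "Active" and now - key["CreateDate"] > MAX_KEY_AGE
    ]

# Hypothetical sample data
sample_keys = [
    {"AccessKeyId": "AKIAOLDKEYEXAMPLE", "Status": "Active",
     "CreateDate": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"AccessKeyId": "AKIANEWKEYEXAMPLE", "Status": "Active",
     "CreateDate": datetime(2024, 5, 20, tzinfo=timezone.utc)},
    {"AccessKeyId": "AKIAINACTIVEEXAMPLE", "Status": "Inactive",
     "CreateDate": datetime(2023, 1, 1, tzinfo=timezone.utc)},
]

stale = keys_needing_rotation(
    sample_keys, now=datetime(2024, 6, 1, tzinfo=timezone.utc)
)
print(stale)
```

Inactive keys are skipped here because they cannot be used to sign requests; whether to delete them is a separate cleanup decision.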

IAM PERMISSION EVALUATION FLOW

    ┌─────────────────────────────────┐
    │  Is there an explicit DENY?     │
    └──────────────┬──────────────────┘
                   │
           ┌───────┴───────┐
           │               │
        ┌──┴──┐         ┌──┴──┐
        │ YES │         │ NO  │
        └──┬──┘         └──┬──┘
           │               │
           ▼               ▼
    ┌──────────┐   ┌────────────────┐
    │  ❌ DENY │   │ Is there an    │
    │          │   │ explicit ALLOW?│
    └──────────┘   └────────┬───────┘
                            │
                    ┌───────┴───────┐
                    │               │
                 ┌──┴──┐         ┌──┴──┐
                 │ YES │         │ NO  │
                 └──┬──┘         └──┬──┘
                    │               │
                    ▼               ▼
             ┌──────────┐    ┌──────────┐
             │ ✅ ALLOW │    │  ❌ DENY │
             └──────────┘    │(implicit)│
                             └──────────┘
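
The flow above can be sketched in a few lines of Python. This is a deliberately simplified model, not the real IAM engine: it ignores conditions, SCPs, permission boundaries, and resource-based policies, and it uses shell-style wildcard matching as a stand-in for IAM's pattern rules. The `matches` and `evaluate` helpers are hypothetical names.

```python
from fnmatch import fnmatchcase

def matches(patterns, value):
    """True if value matches any pattern; a bare string counts as one pattern."""
    if isinstance(patterns, str):
        patterns = [patterns]
    return any(fnmatchcase(value, pattern) for pattern in patterns)

def evaluate(statements, action, resource):
    """Explicit Deny first, then explicit Allow, otherwise implicit deny."""
    applicable = [
        s for s in statements
        if matches(s["Action"], action) and matches(s["Resource"], resource)
    ]
    if any(s["Effect"] == "Deny" for s in applicable):
        return "DENY (explicit)"
    if any(s["Effect"] == "Allow" for s in applicable):
        return "ALLOW"
    return "DENY (implicit)"

statements = [
    {"Effect": "Allow", "Action": ["s3:GetObject", "s3:ListBucket"],
     "Resource": ["arn:aws:s3:::dev-*", "arn:aws:s3:::dev-*/*"]},
    {"Effect": "Deny", "Action": "s3:*",
     "Resource": ["arn:aws:s3:::prod-*", "arn:aws:s3:::prod-*/*"]},
]

print(evaluate(statements, "s3:GetObject", "arn:aws:s3:::dev-data/report.csv"))
print(evaluate(statements, "s3:GetObject", "arn:aws:s3:::prod-data/report.csv"))
print(evaluate(statements, "s3:PutObject", "arn:aws:s3:::dev-data/report.csv"))
```

The three calls walk the three exits of the diagram: an explicit allow, an explicit deny, and an implicit deny for an action that no statement mentions.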

Amazon CloudWatch 📊

Amazon CloudWatch is AWS's comprehensive monitoring and observability service. It collects metrics, logs, and events from your AWS resources and applications.

CloudWatch Core Services:

Service	Purpose	Use Case
Metrics	Numerical data points over time	CPU utilization, network traffic
Logs	Text-based log file storage	Application logs, VPC flow logs
Alarms	Automated actions based on thresholds	Auto-scaling triggers, notifications
Dashboards	Visual representation of metrics	Operations center displays
Events/EventBridge	Event-driven automation	Respond to AWS API calls

CloudWatch Metrics Hierarchy:

Namespace (AWS/EC2)
  |
  ├─ Metric Name (CPUUtilization)
  |   |
  |   ├─ Dimension: InstanceId=i-1234567890abcdef0
  |   |   └─ Data Points: [(timestamp, value, unit)]
  |   |
  |   └─ Dimension: InstanceType=t3.micro
  |       └─ Data Points: [(timestamp, value, unit)]

Creating a CloudWatch Alarm:

import boto3

cloudwatch = boto3.client('cloudwatch')

cloudwatch.put_metric_alarm(
    AlarmName='HighCPUAlarm',
    ComparisonOperator='GreaterThanThreshold',
    EvaluationPeriods=2,
    MetricName='CPUUtilization',
    Namespace='AWS/EC2',
    Period=300,
    Statistic='Average',
    Threshold=80.0,
    ActionsEnabled=True,
    AlarmActions=['arn:aws:sns:us-east-1:123456789012:my-topic'],
    Dimensions=[
        {
            'Name': 'InstanceId',
            'Value': 'i-1234567890abcdef0'
        }
    ]
)
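
To make the roles of `Threshold` and `EvaluationPeriods` concrete, here is a rough offline sketch of how an alarm like the one above settles on a state. It is a simplification under stated assumptions: every one of the most recent periods must breach the threshold, and missing-data handling is reduced to "too few datapoints". The `alarm_state` function is a hypothetical helper, not a CloudWatch API.

```python
def alarm_state(datapoints, threshold=80.0, evaluation_periods=2):
    """Return the state of a GreaterThanThreshold alarm.

    `datapoints` holds one aggregated value per period, oldest first.
    """
    recent = datapoints[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    if all(value > threshold for value in recent):
        return "ALARM"
    return "OK"

print(alarm_state([42.0, 85.5, 91.2]))  # last two periods both above 80
print(alarm_state([85.5, 91.2, 60.0]))  # most recent period recovered
print(alarm_state([95.0]))              # only one period reported so far
```

With `Period=300` and `EvaluationPeriods=2`, this means CPU must stay above 80% for two consecutive 5-minute averages before the SNS action fires.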

CloudWatch Logs Insights Query:

CloudWatch Logs Insights uses a purpose-built query language in which commands are chained with pipes:

fields @timestamp, @message
| filter @message like /ERROR/
| stats count(*) as error_count by bin(5m)
| sort error_count desc
| limit 20

💡 Tip: Use CloudWatch Logs Insights to search and analyze logs without exporting them. It's significantly faster than downloading logs locally.
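
For intuition about what the query above computes, the same filter-then-bin logic can be written in plain Python. This is only an illustration on hypothetical in-memory events; `count_errors_per_bin` is an assumed helper name, and Logs Insights itself runs server-side.

```python
import re
from collections import Counter
from datetime import datetime, timezone

def count_errors_per_bin(events, bin_seconds=300):
    """Count messages containing ERROR per time bin (default: 5 minutes)."""
    pattern = re.compile(r"ERROR")
    counts = Counter()
    for timestamp, message in events:
        if pattern.search(message):
            epoch = int(timestamp.timestamp())
            bin_start = epoch - (epoch % bin_seconds)  # floor to the bin boundary
            counts[datetime.fromtimestamp(bin_start, tz=timezone.utc)] += 1
    return counts

# Hypothetical log events: (timestamp, message)
events = [
    (datetime(2024, 1, 15, 10, 31, tzinfo=timezone.utc), "ERROR db timeout"),
    (datetime(2024, 1, 15, 10, 33, tzinfo=timezone.utc), "INFO request ok"),
    (datetime(2024, 1, 15, 10, 34, tzinfo=timezone.utc), "ERROR db timeout"),
    (datetime(2024, 1, 15, 10, 36, tzinfo=timezone.utc), "ERROR upstream 502"),
]

error_counts = count_errors_per_bin(events)
for window, total in sorted(error_counts.items(), reverse=True):
    print(window.isoformat(), total)
```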

CloudWatch Agent for Custom Metrics:

{
  "metrics": {
    "namespace": "MyApp/Performance",
    "metrics_collected": {
      "mem": {
        "measurement": [
          {"name": "mem_used_percent", "rename": "MemoryUtilization", "unit": "Percent"}
        ],
        "metrics_collection_interval": 60
      },
      "disk": {
        "measurement": [
          {"name": "used_percent", "rename": "DiskUtilization", "unit": "Percent"}
        ],
        "metrics_collection_interval": 60
      }
    }
  }
}

AWS CloudTrail 🔍

AWS CloudTrail records AWS API calls and related events for your account. It's your audit log—essential for security analysis, compliance, and troubleshooting.

What CloudTrail Captures:

Who: IAM user/role that made the request
What: API action performed (e.g., RunInstances, PutObject)
When: Timestamp of the request
Where: Source IP address and AWS region
Result: Success or failure with error codes

CloudTrail Event Example:

{
  "eventVersion": "1.08",
  "userIdentity": {
    "type": "IAMUser",
    "principalId": "AIDACKCEVSQ6C2EXAMPLE",
    "arn": "arn:aws:iam::123456789012:user/Alice",
    "accountId": "123456789012",
    "userName": "Alice"
  },
  "eventTime": "2024-01-15T10:30:00Z",
  "eventSource": "s3.amazonaws.com",
  "eventName": "DeleteBucket",
  "awsRegion": "us-east-1",
  "sourceIPAddress": "203.0.113.42",
  "requestParameters": {
    "bucketName": "my-important-bucket"
  },
  "responseElements": null,
  "errorCode": "BucketNotEmpty",
  "errorMessage": "The bucket you tried to delete is not empty"
}
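
The who/what/when/where/result questions map directly onto fields of the event above. The sketch below shows one way to pull them out of a parsed record; `summarize_event` is a hypothetical helper, and the sample is a trimmed copy of the example event.

```python
def summarize_event(event):
    """Map a CloudTrail record onto who / what / when / where / result."""
    identity = event.get("userIdentity", {})
    return {
        "who": identity.get("arn", identity.get("type", "unknown")),
        "what": f'{event["eventSource"]}:{event["eventName"]}',
        "when": event["eventTime"],
        "where": f'{event["sourceIPAddress"]} ({event["awsRegion"]})',
        # errorCode is only present when the call failed
        "result": event.get("errorCode", "Success"),
    }

event = {
    "userIdentity": {"type": "IAMUser",
                     "arn": "arn:aws:iam::123456789012:user/Alice"},
    "eventTime": "2024-01-15T10:30:00Z",
    "eventSource": "s3.amazonaws.com",
    "eventName": "DeleteBucket",
    "awsRegion": "us-east-1",
    "sourceIPAddress": "203.0.113.42",
    "errorCode": "BucketNotEmpty",
}

summary = summarize_event(event)
for question, answer in summary.items():
    print(f"{question}: {answer}")
```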

CloudTrail Event Types:

Event Type	Scope	Use Case
Management Events	Control plane operations	Track resource creation/deletion
Data Events	Resource-level operations	S3 object access, Lambda invocations
Insights Events	Unusual activity detection	Anomaly detection (requires additional cost)

Creating a Multi-Region Trail with Terraform:

resource "aws_cloudtrail" "main" {
  name                          = "organization-trail"
  s3_bucket_name                = aws_s3_bucket.cloudtrail.id
  include_global_service_events = true
  is_multi_region_trail         = true
  enable_log_file_validation    = true
  
  event_selector {
    read_write_type           = "All"
    include_management_events = true

    data_resource {
      type   = "AWS::S3::Object"
      values = ["arn:aws:s3:::sensitive-bucket/"]
    }
  }
}

💡 Tip: Enable log file validation so you can detect whether CloudTrail log files were modified or deleted after delivery, which is critical for compliance.
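
The core idea behind log file validation is a hash comparison: CloudTrail records a hash of each delivered log file in a digest file, and any later change to the log produces a different hash. The sketch below illustrates only that comparison. The entry shape (`hashAlgorithm`, `hashValue`) is modelled on digest file entries but should be treated as an assumption, and real validation also verifies the digest's signature and the chain of digests, which the `aws cloudtrail validate-logs` command handles.

```python
import hashlib

def log_file_is_intact(log_bytes, digest_entry):
    """Compare a log file's SHA-256 hash with the value recorded for it."""
    if digest_entry.get("hashAlgorithm", "SHA-256") != "SHA-256":
        raise ValueError("unsupported hash algorithm")
    return hashlib.sha256(log_bytes).hexdigest() == digest_entry["hashValue"]

# Hypothetical log content and the digest entry recorded at delivery time
original = b'{"Records": [{"eventName": "DeleteBucket"}]}'
entry = {
    "hashAlgorithm": "SHA-256",
    "hashValue": hashlib.sha256(original).hexdigest(),
}

# An attacker rewrites the event name after delivery
tampered = original.replace(b"DeleteBucket", b"ListBuckets")

print(log_file_is_intact(original, entry))
print(log_file_is_intact(tampered, entry))
```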

AWS Config 📋

AWS Config continuously monitors and records AWS resource configurations. It evaluates whether resources comply with desired configurations and tracks changes over time.

AWS Config Workflow:

┌────────────────────────────────────────────────────┐
│           AWS CONFIG WORKFLOW                      │
└────────────────────────────────────────────────────┘

    📦 AWS Resource Created/Modified
           |
           ↓
    🔍 Config Records Configuration Snapshot
           |
           ↓
    📊 Config Stores in S3 Bucket
           |
           ↓
    📏 Config Rules Evaluate Compliance
           |
      ┌────┴────┐
      ↓         ↓
   ✅ Compliant  ❌ Non-Compliant
      |         |
      |         ↓
      |    🔔 SNS Notification
      |         |
      |         ↓
      |    🔧 Auto-Remediation (optional)
      |         |
      └────┬────┘
           ↓
    📈 Compliance Dashboard Updated

AWS Config Rules Examples:

Rule	Purpose	Remediation
encrypted-volumes	Ensure EBS volumes are encrypted	Enable encryption on non-compliant volumes
s3-bucket-public-read-prohibited	Block public S3 buckets	Remove public access permissions
rds-storage-encrypted	Require encrypted RDS instances	Snapshot, recreate with encryption
required-tags	Enforce tagging standards	Apply missing tags automatically
vpc-flow-logs-enabled	Ensure VPC logging	Enable flow logs for non-compliant VPCs

Deploying a Managed Config Rule with Custom Parameters:

import boto3
import json

config = boto3.client('config')

config.put_config_rule(
    ConfigRule={
        'ConfigRuleName': 'require-ec2-instance-tags',
        'Description': 'Ensures EC2 instances have required tags',
        'Source': {
            'Owner': 'AWS',
            'SourceIdentifier': 'REQUIRED_TAGS'
        },
        'InputParameters': json.dumps({
            'tag1Key': 'Environment',
            'tag2Key': 'Owner',
            'tag3Key': 'CostCenter'
        }),
        'Scope': {
            'ComplianceResourceTypes': [
                'AWS::EC2::Instance'
            ]
        },
        'ConfigRuleState': 'ACTIVE'
    }
)
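
Conceptually, a required-tags rule checks each resource for a fixed set of tag keys. The following sketch mimics that check locally so the compliance outcome is easy to reason about; `evaluate_required_tags` and the sample tags are hypothetical, and the managed rule itself runs inside AWS Config.

```python
REQUIRED_TAG_KEYS = ("Environment", "Owner", "CostCenter")

def evaluate_required_tags(resource_tags, required=REQUIRED_TAG_KEYS):
    """Return (compliance, missing_keys) for one resource's tag dictionary."""
    missing = [key for key in required if key not in resource_tags]
    compliance = "COMPLIANT" if not missing else "NON_COMPLIANT"
    return compliance, missing

tagged = {"Environment": "prod", "Owner": "platform-team", "CostCenter": "1234"}
untagged = {"Environment": "dev"}

print(evaluate_required_tags(tagged))
print(evaluate_required_tags(untagged))
```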

Config Aggregator for Multi-Account Visibility:

{
  "ConfigurationAggregatorName": "OrganizationAggregator",
  "OrganizationAggregationSource": {
    "RoleArn": "arn:aws:iam::123456789012:role/ConfigAggregatorRole",
    "AwsRegions": ["us-east-1", "us-west-2", "eu-west-1"],
    "AllAwsRegions": false
  }
}

AWS Organizations 🏢

AWS Organizations enables central governance and management across multiple AWS accounts. It provides consolidated billing, hierarchical account structure, and policy-based access controls.

Organization Hierarchy:

                    🏢 Root
                       |
        ┌──────────────┼──────────────┐
        │              │              │
    📁 Production   📁 Development  📁 Security
        |              |              |
    ┌───┴───┐      ┌───┴───┐         │
    │       │      │       │         │
  💼 App1 💼 App2 💼 Test1 💼 Test2  🔐 Audit

Service Control Policies (SCPs):

SCPs are JSON policies that set maximum permissions for accounts in an OU (Organizational Unit). Even if an IAM policy grants access, an SCP can deny it.

SCP Example - Deny Regions:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Action": "*",
      "Resource": "*",
      "Condition": {
        "StringNotEquals": {
          "aws:RequestedRegion": [
            "us-east-1",
            "us-west-2",
            "eu-west-1"
          ]
        }
      }
    }
  ]
}

SCP Example - Prevent Root User Access:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Action": "*",
      "Resource": "*",
      "Condition": {
        "StringLike": {
          "aws:PrincipalArn": "arn:aws:iam::*:root"
        }
      }
    }
  ]
}

💡 Tip: SCPs affect all users and roles in member accounts, including administrators and the root user, but they do not restrict the management account. Always test in a non-production environment first.
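
One way to picture how SCPs interact with IAM is as a guardrail checked alongside the IAM decision: a request succeeds only if IAM allows it and no SCP denies it. The sketch below models the region-restriction SCP from above in a few lines. It is a simplification, with hypothetical helper names, that ignores every other policy type.

```python
ALLOWED_REGIONS = {"us-east-1", "us-west-2", "eu-west-1"}

def scp_denies(requested_region, allowed_regions=ALLOWED_REGIONS):
    """Mirror the StringNotEquals condition: deny when the region is not listed."""
    return requested_region not in allowed_regions

def effective_decision(iam_allows, requested_region):
    """A request succeeds only if IAM allows it and the SCP does not deny it."""
    if scp_denies(requested_region):
        return "DENY (SCP)"
    return "ALLOW" if iam_allows else "DENY (no IAM allow)"

print(effective_decision(True, "us-east-1"))   # IAM allows, region permitted
print(effective_decision(True, "ap-south-1"))  # IAM allows, but the SCP blocks it
print(effective_decision(False, "us-west-2"))  # SCP is silent, IAM never allowed it
```

The third case is the one that surprises people: an SCP that does not deny an action still grants nothing on its own.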

Organizations Best Practices:

Separate accounts by workload: Production, development, security, logging
Use SCPs at OU level: Apply policies to groups of accounts
Centralize logging: Send CloudTrail and Config data to a security account
Enable CloudTrail at organization level: Automatic for all accounts
Tag-based cost allocation: Track spending by project/team

AWS Security Hub 🛡️

AWS Security Hub aggregates security findings from multiple AWS services and third-party tools into a single dashboard. It provides automated compliance checks against industry standards.

Security Hub Integrations:

┌─────────────────────────────────────────────────────┐
│                  AWS SECURITY HUB                   │
│            (Central Security Dashboard)             │
└──────────────────────────┬──────────────────────────┘
                           │
      ┌─────────────┬──────┴──────┬─────────────┐
      │             │             │             │
      ▼             ▼             ▼             ▼
┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐
│ GuardDuty │ │ Inspector │ │   Macie   │ │  Config   │
│  Threats  │ │   Vulns   │ │   Data    │ │Compliance │
└───────────┘ └───────────┘ └───────────┘ └───────────┘

Security Standards Supported:

CIS AWS Foundations Benchmark: Industry best practices
PCI DSS: Payment card industry standards
AWS Foundational Security Best Practices: AWS-specific recommendations

Example Security Hub Finding:

{
  "SchemaVersion": "2018-10-08",
  "Id": "arn:aws:securityhub:us-east-1:123456789012:finding/...",
  "ProductArn": "arn:aws:securityhub:us-east-1::product/aws/guardduty",
  "GeneratorId": "arn:aws:guardduty:us-east-1:123456789012:detector/...",
  "AwsAccountId": "123456789012",
  "Types": ["TTPs/Initial Access/UnauthorizedAccess:IAMUser-InstanceCredentialExfiltration"],
  "CreatedAt": "2024-01-15T10:30:00.000Z",
  "UpdatedAt": "2024-01-15T10:30:00.000Z",
  "Severity": {
    "Product": 8,
    "Label": "HIGH",
    "Normalized": 70
  },
  "Title": "Unusual API call from EC2 instance",
  "Description": "EC2 instance credentials used from external IP",
  "Remediation": {
    "Recommendation": {
      "Text": "Rotate instance credentials and investigate access"
    }
  }
}
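
Because every finding carries a normalized severity label, triage can start with a simple filter and sort. The sketch below assumes findings have already been retrieved as dictionaries in the format shown above; `triage` is a hypothetical helper, and two of the sample findings are invented for illustration.

```python
SEVERITY_RANK = {"INFORMATIONAL": 0, "LOW": 1, "MEDIUM": 2, "HIGH": 3, "CRITICAL": 4}

def triage(findings, minimum_label="HIGH"):
    """Keep findings at or above minimum_label, most severe first."""
    floor = SEVERITY_RANK[minimum_label]
    urgent = [
        f for f in findings
        if SEVERITY_RANK[f["Severity"]["Label"]] >= floor
    ]
    return sorted(
        urgent,
        key=lambda f: SEVERITY_RANK[f["Severity"]["Label"]],
        reverse=True,
    )

findings = [
    {"Title": "Unusual API call from EC2 instance", "Severity": {"Label": "HIGH"}},
    {"Title": "Hypothetical low-priority finding", "Severity": {"Label": "LOW"}},
    {"Title": "Hypothetical critical finding", "Severity": {"Label": "CRITICAL"}},
]

urgent_findings = triage(findings)
for finding in urgent_findings:
    print(finding["Severity"]["Label"], "-", finding["Title"])
```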

AWS Systems Manager 🔧

AWS Systems Manager provides unified operations management for AWS and on-premises resources. It includes capabilities for patching, configuration management, and operational insights.

Key Systems Manager Features:

Feature	Purpose	Common Use
Session Manager	Browser-based shell access	Connect to EC2 without SSH keys
Patch Manager	Automated OS patching	Schedule monthly security updates
Parameter Store	Secure configuration storage	Store database passwords, API keys
State Manager	Maintain configuration state	Ensure software always installed
Run Command	Execute commands at scale	Install software on 100s of instances

Parameter Store vs. Secrets Manager:

Feature	Parameter Store	Secrets Manager
Cost	Free (standard), $0.05/advanced parameter/month	$0.40/secret/month
Rotation	Manual	Automatic (Lambda-based)
Size Limit	4KB (standard), 8KB (advanced)	64KB
Encryption	Optional (KMS)	Always encrypted (KMS)
Best For	Config values, non-sensitive data	Database credentials, API keys

Retrieving a Secure Parameter:

import boto3

ssm = boto3.client('ssm')

response = ssm.get_parameter(
    Name='/myapp/database/password',
    WithDecryption=True
)

db_password = response['Parameter']['Value']

Run Command Example:

aws ssm send-command \
    --document-name "AWS-RunShellScript" \
    --targets "Key=tag:Environment,Values=Production" \
    --parameters 'commands=["sudo yum update -y","sudo systemctl restart nginx"]' \
    --comment "Apply security patches to production web servers"

AWS Trusted Advisor ✅

AWS Trusted Advisor provides real-time guidance to help you provision resources following AWS best practices. It checks your AWS environment across five categories.

Trusted Advisor Categories:

        TRUSTED ADVISOR CHECK CATEGORIES

┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│ 💰 Cost      │  │ 🔒 Security  │  │ ⚡ Performance│
│ Optimization │  │              │  │              │
├──────────────┤  ├──────────────┤  ├──────────────┤
│•Idle RDS DBs │  │•S3 public    │  │•High util EC2│
│•Underused    │  │•IAM keys old │  │•EBS IOPS     │
│  EBS volumes │  │•Security     │  │•CloudFront   │
│•Reserved Inst│  │  groups open │  │  config      │
└──────────────┘  └──────────────┘  └──────────────┘

┌──────────────┐  ┌──────────────┐
│ 🛡️ Fault     │  │ 📊 Service   │
│ Tolerance    │  │ Limits       │
├──────────────┤  ├──────────────┤
│•EBS snapshots│  │•VPC limits   │
│•Multi-AZ RDS │  │•EC2 on-demand│
│•ELB health   │  │•IAM groups   │
│•Route 53     │  │•S3 buckets   │
└──────────────┘  └──────────────┘

💡 Tip: Business and Enterprise support plans get access to all Trusted Advisor checks. Basic/Developer plans only see 7 core checks.

Examples 📝

Example 1: Implementing Least Privilege with IAM Policies

Scenario: You need to grant a developer read-only access to S3 buckets in the development environment, but not production.

Poorly Designed Policy (Too Permissive):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:*",
      "Resource": "*"
    }
  ]
}

⚠️ Problem: This grants full S3 access to all buckets, including production data and the ability to delete objects.

Well-Designed Policy (Least Privilege):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::dev-*",
        "arn:aws:s3:::dev-*/*"
      ],
      "Condition": {
        "StringEquals": {
          "aws:RequestedRegion": "us-east-1"
        }
      }
    },
    {
      "Effect": "Deny",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::prod-*",
        "arn:aws:s3:::prod-*/*"
      ]
    }
  ]
}

✅ Improvements:

Only read actions (no write/delete)
Limited to buckets starting with "dev-"
Explicit deny for production buckets
Region restriction

Example 2: Setting Up Multi-Account CloudTrail Logging

Scenario: Your organization has 50 AWS accounts. You need centralized audit logging with tamper-proof storage.

Architecture:

┌─────────────────────────────────────────────────┐
│         SECURITY ACCOUNT (123456789012)         │
│                                                 │
│  ┌───────────────────────────────────────┐     │
│  │   CloudTrail S3 Bucket                │     │
│  │   (organization-cloudtrail-logs)      │     │
│  │   • MFA Delete enabled                │     │
│  │   • Object Lock enabled               │     │
│  │   • Encryption: KMS                   │     │
│  └───────────────────────────────────────┘     │
│                    ↑                            │
└────────────────────┼────────────────────────────┘
                     │
      ┌──────────────┼──────────────────┐
      │              │                  │
      ▼              ▼                  ▼
┌────────────┐ ┌────────────┐     ┌────────────┐
│  Account   │ │  Account   │     │  Account   │
│    001     │ │    002     │ ... │    050     │
│            │ │            │     │            │
│ CloudTrail │ │ CloudTrail │     │ CloudTrail │
│ (disabled, │ │ (disabled, │     │ (disabled, │
│ org trail  │ │ org trail  │     │ org trail  │
│ covers it) │ │ covers it) │     │ covers it) │
└────────────┘ └────────────┘     └────────────┘

Implementation Steps:

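Step 1: Create S3 Bucket in Security Account (with Object Lock enabled, which Step 4 requires and which also turns on versioning)

aws s3api create-bucket --bucket organization-cloudtrail-logs --region us-east-1 --object-lock-enabled-for-bucket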

Step 2: Apply Bucket Policy

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AWSCloudTrailAclCheck",
      "Effect": "Allow",
      "Principal": {"Service": "cloudtrail.amazonaws.com"},
      "Action": "s3:GetBucketAcl",
      "Resource": "arn:aws:s3:::organization-cloudtrail-logs"
    },
    {
      "Sid": "AWSCloudTrailWrite",
      "Effect": "Allow",
      "Principal": {"Service": "cloudtrail.amazonaws.com"},
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::organization-cloudtrail-logs/AWSLogs/*",
      "Condition": {
        "StringEquals": {
          "s3:x-amz-acl": "bucket-owner-full-control"
        }
      }
    }
  ]
}

Step 3: Create Organization Trail (run from the Organizations management account or a delegated administrator account)

aws cloudtrail create-trail \
    --name organization-trail \
    --s3-bucket-name organization-cloudtrail-logs \
    --is-multi-region-trail \
    --is-organization-trail \
    --enable-log-file-validation

aws cloudtrail start-logging --name organization-trail

Step 4: Enable S3 Object Lock (Compliance Mode)

aws s3api put-object-lock-configuration \
    --bucket organization-cloudtrail-logs \
    --object-lock-configuration '{"ObjectLockEnabled":"Enabled","Rule":{"DefaultRetention":{"Mode":"COMPLIANCE","Years":7}}}'

✅ Result: All 50 accounts now deliver audit logs to one central bucket, where Object Lock in Compliance mode prevents any user, including root, from deleting or overwriting log objects for 7 years.

Example 3: Automated Compliance Remediation with AWS Config

Scenario: Your security team requires all S3 buckets to have encryption enabled and public access blocked. You want automatic remediation.

Step 1: Deploy Config Rules

# CloudFormation template
Resources:
  S3EncryptionRule:
    Type: AWS::Config::ConfigRule
    Properties:
      ConfigRuleName: s3-bucket-server-side-encryption-enabled
      Source:
        Owner: AWS
        SourceIdentifier: S3_BUCKET_SERVER_SIDE_ENCRYPTION_ENABLED
      Scope:
        ComplianceResourceTypes:
          - AWS::S3::Bucket

  S3PublicAccessRule:
    Type: AWS::Config::ConfigRule
    Properties:
      ConfigRuleName: s3-bucket-public-read-prohibited
      Source:
        Owner: AWS
        SourceIdentifier: S3_BUCKET_PUBLIC_READ_PROHIBITED
      Scope:
        ComplianceResourceTypes:
          - AWS::S3::Bucket

Step 2: Create Remediation Lambda Function

import boto3
import json

s3 = boto3.client('s3')

def lambda_handler(event, context):
    # Config rule events carry the configuration item inside 'invokingEvent',
    # which arrives as a JSON string
    invoking_event = json.loads(event['invokingEvent'])
    bucket_name = invoking_event['configurationItem']['resourceName']
    
    # Enable default encryption
    try:
        s3.put_bucket_encryption(
            Bucket=bucket_name,
            ServerSideEncryptionConfiguration={
                'Rules': [{
                    'ApplyServerSideEncryptionByDefault': {
                        'SSEAlgorithm': 'AES256'
                    },
                    'BucketKeyEnabled': True
                }]
            }
        )
        
        # Block public access
        s3.put_public_access_block(
            Bucket=bucket_name,
            PublicAccessBlockConfiguration={
                'BlockPublicAcls': True,
                'IgnorePublicAcls': True,
                'BlockPublicPolicy': True,
                'RestrictPublicBuckets': True
            }
        )
        
        return {
            'statusCode': 200,
            'body': f'Remediated bucket: {bucket_name}'
        }
    except Exception as e:
        print(f"Error remediating {bucket_name}: {str(e)}")
        raise

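Step 3: Configure Auto-Remediation

The role ARN below is a placeholder; AutomationAssumeRole must name a role that Systems Manager can assume with permission to change bucket encryption.

aws configservice put-remediation-configurations \
    --remediation-configurations '[
      {
        "ConfigRuleName": "s3-bucket-server-side-encryption-enabled",
        "TargetType": "SSM_DOCUMENT",
        "TargetIdentifier": "AWS-EnableS3BucketEncryption",
        "Parameters": {
          "AutomationAssumeRole": {"StaticValue": {"Values": ["arn:aws:iam::123456789012:role/ConfigRemediationRole"]}},
          "BucketName": {"ResourceValue": {"Value": "RESOURCE_ID"}},
          "SSEAlgorithm": {"StaticValue": {"Values": ["AES256"]}}
        },
        "Automatic": true,
        "MaximumAutomaticAttempts": 3,
        "RetryAttemptSeconds": 60
      }
    ]'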

✅ Result: When Config evaluates a bucket as non-compliant with the encryption rule, it starts remediation automatically and makes up to 3 attempts.

Example 4: Cross-Account CloudWatch Dashboards

Scenario: Your operations team needs a single dashboard showing metrics from 10 different AWS accounts.

Solution Using CloudWatch Cross-Account Observability:

Step 1: Set Up Monitoring Account as Sink

# In monitoring account (999999999999)
aws oam create-sink \
    --name central-observability-sink \
    --tags Environment=Production

Step 2: Create Sink Policy

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::111111111111:root",
          "arn:aws:iam::222222222222:root",
          "arn:aws:iam::333333333333:root"
        ]
      },
      "Action": [
        "oam:CreateLink",
        "oam:UpdateLink"
      ],
      "Resource": "*",
      "Condition": {
        "ForAllValues:StringEquals": {
          "oam:ResourceTypes": [
            "AWS::CloudWatch::Metric",
            "AWS::Logs::LogGroup"
          ]
        }
      }
    }
  ]
}

Step 3: Link Source Accounts

# In each source account (111111111111, 222222222222, etc.)
aws oam create-link \
    --label-template '$AccountName' \
    --resource-types 'AWS::CloudWatch::Metric' 'AWS::Logs::LogGroup' \
    --sink-identifier 'arn:aws:oam:us-east-1:999999999999:sink/abc123'

Step 4: Create Unified Dashboard

{
  "widgets": [
    {
      "type": "metric",
      "properties": {
        "metrics": [
          [ { "expression": "SEARCH('{AWS/EC2,InstanceId} MetricName=\"CPUUtilization\"', 'Average', 300)", "id": "e1", "accountId": "111111111111" } ],
          [ { "expression": "SEARCH('{AWS/EC2,InstanceId} MetricName=\"CPUUtilization\"', 'Average', 300)", "id": "e2", "accountId": "222222222222" } ],
          [ { "expression": "SEARCH('{AWS/EC2,InstanceId} MetricName=\"CPUUtilization\"', 'Average', 300)", "id": "e3", "accountId": "333333333333" } ]
        ],
        "region": "us-east-1",
        "title": "Cross-Account EC2 CPU Utilization",
        "period": 300
      }
    }
  ]
}

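✅ Result: A single dashboard in the monitoring account shows metrics from the linked source accounts without manual account switching. Three accounts are shown here; the remaining accounts follow the same pattern.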
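
Writing one metrics entry per account by hand gets tedious at 10 accounts. The sketch below generates the same widget structure from a list of account IDs. It only builds the JSON body locally; `cross_account_cpu_widget` is a hypothetical helper, and uploading the result to CloudWatch is left out.

```python
import json

def cross_account_cpu_widget(account_ids, region="us-east-1", period=300):
    """Build one metric widget with a SEARCH expression per source account."""
    expression = (
        "SEARCH('{AWS/EC2,InstanceId} MetricName=\"CPUUtilization\"', "
        f"'Average', {period})"
    )
    metrics = [
        [{"expression": expression, "id": f"e{index}", "accountId": account_id}]
        for index, account_id in enumerate(account_ids, start=1)
    ]
    return {
        "type": "metric",
        "properties": {
            "metrics": metrics,
            "region": region,
            "title": "Cross-Account EC2 CPU Utilization",
            "period": period,
        },
    }

accounts = ["111111111111", "222222222222", "333333333333"]
dashboard_body = json.dumps(
    {"widgets": [cross_account_cpu_widget(accounts)]}, indent=2
)
print(dashboard_body)
```

Extending the dashboard to more accounts then only requires adding IDs to the `accounts` list.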

Common Mistakes ⚠️

1. Overly Permissive IAM Policies

❌ Mistake:

{
  "Effect": "Allow",
  "Action": "*",
  "Resource": "*"
}

✅ Fix: Always specify exact actions and resources:

{
  "Effect": "Allow",
  "Action": [
    "s3:GetObject",
    "s3:PutObject"
  ],
  "Resource": "arn:aws:s3:::my-app-bucket/*"
}
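
A check like this is easy to automate in code review or CI. The sketch below flags Allow statements that pair a wildcard action with a wildcard resource. It is a minimal heuristic rather than a full policy analyzer, and `overly_permissive` is a hypothetical helper name.

```python
def overly_permissive(statement):
    """Flag Allow statements with a bare or service-wide wildcard on all resources."""
    if statement.get("Effect") != "Allow":
        return False
    actions = statement.get("Action", [])
    resources = statement.get("Resource", [])
    # Policies allow either a single string or a list for both fields
    actions = [actions] if isinstance(actions, str) else actions
    resources = [resources] if isinstance(resources, str) else resources
    wildcard_action = any(a == "*" or a.endswith(":*") for a in actions)
    return wildcard_action and "*" in resources

mistake = {"Effect": "Allow", "Action": "*", "Resource": "*"}
fix = {
    "Effect": "Allow",
    "Action": ["s3:GetObject", "s3:PutObject"],
    "Resource": "arn:aws:s3:::my-app-bucket/*",
}

print(overly_permissive(mistake))
print(overly_permissive(fix))
```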

2. Not Enabling CloudTrail Log File Validation

❌ Mistake: Creating trails without log file validation allows attackers to modify audit logs.

✅ Fix: Always enable validation:

aws cloudtrail create-trail \
    --name my-trail \
    --s3-bucket-name my-bucket \
    --enable-log-file-validation

3. Ignoring CloudWatch Alarm States

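❌ Mistake: Configuring only ALARM actions, so nobody is notified when a metric stops reporting and the alarm drops into the INSUFFICIENT_DATA state:

AlarmActions=['arn:aws:sns:us-east-1:123456789012:my-topic'],
InsufficientDataActions=[]  # No notification when data stops arriving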

✅ Fix: Treat insufficient data as a problem:

AlarmActions=['arn:aws:sns:us-east-1:123456789012:my-topic'],
InsufficientDataActions=['arn:aws:sns:us-east-1:123456789012:my-topic']

4. Using Root Account for Daily Operations

❌ Mistake: Logging in with root account credentials for routine tasks.

✅ Fix:

Create IAM users with appropriate permissions
Enable MFA on root account
Store root credentials in a secure location (password manager)
Use root only for account-level tasks (billing, support plan changes)

5. Not Aggregating Config Data Across Accounts

❌ Mistake: Checking compliance in each account individually—time-consuming and error-prone.

✅ Fix: Use Config Aggregators:

aws configservice put-configuration-aggregator \
    --configuration-aggregator-name OrgAggregator \
    --organization-aggregation-source RoleArn=arn:aws:iam::123456789012:role/ConfigRole,AllAwsRegions=true

6. Storing Secrets in Parameter Store Without Encryption

❌ Mistake:

aws ssm put-parameter \
    --name /myapp/db/password \
    --value "plaintext_password" \
    --type String  # Not encrypted!

✅ Fix: Use SecureString with KMS:

aws ssm put-parameter \
    --name /myapp/db/password \
    --value "secure_password" \
    --type SecureString \
    --key-id alias/aws/ssm

7. Not Setting CloudWatch Log Retention

❌ Mistake: Leaving logs with infinite retention leads to unnecessary costs.

✅ Fix: Set appropriate retention periods:

import boto3

logs = boto3.client('logs')
logs.put_retention_policy(
    logGroupName='/aws/lambda/my-function',
    retentionInDays=30  # or 90, 180, 365 based on requirements
)
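
CloudWatch Logs accepts only a fixed set of retention values, so a requirement such as "keep logs 45 days" has to be rounded up to the next accepted value. The sketch below does that rounding locally. The list of accepted values reflects the `put_retention_policy` documentation as best understood here and should be verified against the current API reference; `nearest_retention` is a hypothetical helper.

```python
import bisect

# Assumed list of accepted retentionInDays values; verify before relying on it
ACCEPTED_RETENTION_DAYS = [
    1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400,
    545, 731, 1096, 1827, 2192, 2557, 2922, 3288, 3653,
]

def nearest_retention(required_days):
    """Round a retention requirement up to the next accepted value."""
    index = bisect.bisect_left(ACCEPTED_RETENTION_DAYS, required_days)
    if index == len(ACCEPTED_RETENTION_DAYS):
        raise ValueError("requirement exceeds the longest fixed retention period")
    return ACCEPTED_RETENTION_DAYS[index]

print(nearest_retention(30))
print(nearest_retention(45))
print(nearest_retention(200))
```

Rounding up rather than down keeps the policy on the safe side of a compliance requirement, at the price of slightly more storage.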

8. Forgetting to Enable GuardDuty in All Regions

❌ Mistake: Enabling GuardDuty only in your primary region—threats can come from anywhere.

✅ Fix: Enable in all active regions:

for region in us-east-1 us-west-2 eu-west-1 ap-southeast-1; do
  aws guardduty create-detector --enable --region "$region"
done

Key Takeaways 🎯

📋 Quick Reference Card

Service	Primary Purpose	Key Feature
IAM	Identity & access control	Policies, roles, MFA
CloudWatch	Monitoring & observability	Metrics, logs, alarms
CloudTrail	Audit logging	API call recording
Config	Configuration compliance	Rules, remediation
Organizations	Multi-account management	SCPs, consolidated billing
Security Hub	Centralized security	Finding aggregation
Systems Manager	Operations management	Parameter Store, patching
Trusted Advisor	Best practice checks	Cost, security recommendations

Security Best Practices:

✅ Enable MFA on all accounts (especially root)
✅ Use IAM roles instead of access keys
✅ Enable CloudTrail in all regions with log validation
✅ Encrypt all data at rest and in transit
✅ Implement least privilege access
✅ Use SCPs to enforce organization-wide restrictions
✅ Set up automated Config remediation
✅ Monitor Security Hub findings daily
✅ Rotate credentials every 90 days
✅ Tag all resources for accountability

Monitoring Essentials:

📊 Create dashboards for business-critical metrics
🔔 Set up alarms with appropriate thresholds
📝 Centralize logs in a dedicated security account
🔍 Use Logs Insights for rapid troubleshooting
📈 Enable detailed monitoring for critical resources

Compliance Fundamentals:

📋 Deploy Config rules for your industry standards
🔄 Enable automatic remediation where possible
📊 Use Config Aggregators for multi-account visibility
🛡️ Subscribe to Security Hub security standards
📅 Schedule regular compliance reviews

🧠 Memory Device - The "COPS" Framework:

CloudTrail = Captures API calls (audit trail)
Organizations = Oversees multiple accounts
Parameter Store = Preserves configuration securely
Security Hub = Summarizes security findings

💡 Pro Tip: Always implement security in layers (defense in depth). No single service provides complete protection—use IAM + CloudTrail + Config + GuardDuty + Security Hub together.

📚 Further Study

AWS Security Best Practices: https://aws.amazon.com/architecture/security-identity-compliance/
AWS Well-Architected Framework - Security Pillar: https://docs.aws.amazon.com/wellarchitected/latest/security-pillar/welcome.html
AWS Config Developer Guide: https://docs.aws.amazon.com/config/latest/developerguide/WhatIsConfig.html

Ready to test your knowledge? Complete the practice questions below to reinforce these AWS security, governance, and observability concepts! 🚀
