Cross-Service Causality

Build end-to-end request tracking that survives service boundaries and technology changes

πŸ’» Cross-service causality is distributed tracing's most powerful capability. This lesson covers causal relationships between services, trace context propagation, and root cause identificationβ€”essential concepts for anyone building or operating microservices architectures in production.

Welcome to Cross-Service Causality πŸ”

Welcome to the heart of distributed tracing! When a user clicks "Buy Now" and the request fails, was it the payment service, inventory check, or shipping calculator that caused the problem? In monolithic applications, you'd examine a single stack trace. In microservices, that single click triggers a cascade of events across dozens of services. Cross-service causality is the ability to understand not just what happened in each service, but why it happened and how events in one service caused specific behaviors in others.

Think of cross-service causality like following a relay race πŸƒβ€β™€οΈβ†’πŸƒβ€β™‚οΈβ†’πŸƒβ†’πŸƒβ€β™€οΈ. When the team loses, you need to know: Did someone drop the baton? Was there a slow handoff? Did one runner fall? You can't just look at individual lap timesβ€”you need to understand the causal chain that connects each runner's performance to the final outcome.

Core Concepts

What Is Causality in Distributed Systems? 🧩

Causality is the relationship between events where one event (the cause) influences or triggers another event (the effect). In distributed systems, causality helps us answer:

  • "Did slow database queries in Service A cause timeouts in Service B?"
  • "Which upstream service failure triggered this circuit breaker?"
  • "Did the cache miss lead to this downstream load spike?"

Cross-service causality specifically tracks how actions in one service propagate effects across service boundaries. The key challenge? Services run on different machines, have independent clocks, and process requests asynchronously. Without proper instrumentation, these causal relationships become invisible.

TRADITIONAL LOGS (No Causality)

[Service A] 10:23:45.123 - Order received
[Service C] 10:23:45.089 - Payment failed  ← Earlier timestamp!
[Service B] 10:23:45.234 - Inventory checked
[Service A] 10:23:45.456 - Order failed

Which happened first? What caused what? 🀷

WITH TRACE CONTEXT (Causality Clear)

[TraceID: abc123] [SpanID: 001] [Service A] Order received
                     β”‚
                     β”œβ”€β”€β†’ [SpanID: 002] [Service B] Inventory checked βœ“
                     β”‚
                     └──→ [SpanID: 003] [Service C] Payment failed ❌
                            (Parent: 001, Duration: 234ms)

Causality: Order (001) β†’ Payment attempt (003) β†’ Failure

The Causality Chain: Parent-Child Relationships πŸ‘ͺ

At the core of cross-service causality is the parent-child span relationship. Each span represents a unit of work (a function call, HTTP request, database query). When Service A calls Service B:

  1. Service A creates a span (the "parent")
  2. Service A propagates trace context to Service B
  3. Service B creates a child span that references the parent
  4. The parent-child link establishes causality

This creates a directed acyclic graph (DAG) where edges represent causal relationships:

CAUSAL GRAPH: E-commerce Order

                    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                    β”‚  HTTP Request   β”‚ (Root Span)
                    β”‚  POST /orders   β”‚
                    β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                             β”‚
            β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
            β”‚                β”‚                β”‚
     β”Œβ”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”
     β”‚ Validate    β”‚  β”‚ Check       β”‚  β”‚ Reserve    β”‚
     β”‚ User        β”‚  β”‚ Inventory   β”‚  β”‚ Payment    β”‚
     β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜
            β”‚                β”‚                β”‚
            β”‚         β”Œβ”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”         β”‚
            β”‚         β”‚ Query DB    β”‚         β”‚
            β”‚         β”‚ Stock Level β”‚         β”‚
            β”‚         β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜         β”‚
            β”‚                β”‚                β”‚
            β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                             β”‚
                      β”Œβ”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”
                      β”‚ Finalize    β”‚
                      β”‚ Order       β”‚
                      β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Each arrow represents a causal dependency

πŸ’‘ Key Insight: Without parent-child links, you have isolated events. With them, you have a causal narrative that explains system behavior.
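
For concreteness, here is a minimal sketch of one parent-child span pair using the OpenTelemetry Python API. The service and span names are illustrative, and a production setup would also register an exporter:

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider

# Minimal in-process setup; production would add an exporter (OTLP, Jaeger, ...)
trace.set_tracer_provider(TracerProvider())
tracer = trace.get_tracer("order-service")

with tracer.start_as_current_span("POST /orders") as parent:        # root span
    with tracer.start_as_current_span("check-inventory") as child:  # child span
        # Both spans share one trace ID; the SDK records the parent's span ID
        # on the child, which is exactly the causal edge parent -> child
        assert (parent.get_span_context().trace_id
                == child.get_span_context().trace_id)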

Happens-Before Relationships ⏰

The happens-before relation (denoted as β†’) formalizes causality:

  • A β†’ B means "A causally precedes B"
  • If A sends a message to B, then A β†’ B
  • If A β†’ B and B β†’ C, then A β†’ C (transitivity)

| Relationship      | Symbol | Example                           |
|-------------------|--------|-----------------------------------|
| Causally precedes | A β†’ B  | Request sent β†’ Response received  |
| Concurrent        | A βˆ₯ B  | Two services independently cache  |
| Causally follows  | A ← B  | Response sent ← Request processed |

Two events are concurrent (A βˆ₯ B) if neither happens-before the other. This is critical because:

❌ Timestamp comparison fails for concurrent events (clock skew).
βœ… Trace context succeeds because it captures actual causal dependencies.
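
In span terms, the happens-before check reduces to an ancestor test on the parent-child graph, independent of any clock reading. A small sketch over a hypothetical span-to-parent map:

def happens_before(parents: dict, a: str, b: str) -> bool:
    """True if span a is an ancestor of span b, i.e. a -> b causally.

    `parents` maps span_id to parent span_id (None for the root).
    Transitivity falls out of walking the parent chain.
    """
    current = parents.get(b)
    while current is not None:
        if current == a:
            return True
        current = parents.get(current)
    return False

# Span 003's parent chain contains 001, so 001 happens-before 003
parents = {"001": None, "002": "001", "003": "001"}
assert happens_before(parents, "001", "003")
assert not happens_before(parents, "002", "003")  # siblings: concurrent, neither precedes the other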

Trace Context Propagation Mechanisms πŸ“‘

How does causality information travel across service boundaries? Through trace context propagation:

1. In-Band Propagation (embedded in the request; a code sketch follows this list):

| Protocol   | Header/Field    | Content                      |
|------------|-----------------|------------------------------|
| HTTP       | traceparent     | Version-TraceID-SpanID-Flags |
| gRPC       | grpc-trace-bin  | Binary trace context         |
| Kafka      | Message headers | TraceID, SpanID pairs        |
| AWS Lambda | X-Amzn-Trace-Id | Root, Parent, Sampled        |

Example HTTP headers:

traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
tracestate: vendor1=value1,vendor2=value2

2. Out-of-Band Propagation (separate metadata channel):

  • Service mesh sidecars (Istio, Linkerd) inject context
  • Message queue metadata fields
  • Shared context stores (Redis with trace IDs as keys)

3. Baggage Items πŸŽ’: Key-value pairs propagated with the trace for contextual information:

  • userId=12345 - who initiated the request
  • experimentId=variantB - which A/B test variant
  • tenantId=acme-corp - multi-tenant isolation

⚠️ Warning: Baggage adds overhead to every request. Keep it minimal!
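
Putting the in-band mechanisms above into code, here is a sketch using the OpenTelemetry Python propagation API: the calling service injects the W3C traceparent header (plus any baggage) into an outgoing carrier, and the receiving service extracts it to parent its own span. Service names are illustrative and the HTTP call itself is a hypothetical placeholder.

from opentelemetry import baggage, trace
from opentelemetry.propagate import extract, inject

tracer = trace.get_tracer("gateway")

# --- Calling side (Service A) ---
with tracer.start_as_current_span("call-auth-service"):
    ctx = baggage.set_baggage("tenantId", "ACME")  # small, bounded baggage
    headers = {}
    inject(headers, context=ctx)  # writes traceparent (and baggage) headers
    # http_client.post("https://auth.internal/validate", headers=headers)  # hypothetical call

# --- Receiving side (Service B) ---
def handle_request(incoming_headers: dict):
    parent_ctx = extract(incoming_headers)  # rebuild the remote context
    with tracer.start_as_current_span("validate-token", context=parent_ctx):
        tenant = baggage.get_baggage("tenantId", parent_ctx)
        ...  # this span is now a child of Service A's span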

Causal Inference Patterns πŸ”¬

Once you have causal links, you can infer root causes:

Pattern 1: Direct Causation. Service A calls Service B and B fails; the failing work in B was triggered directly by A's call.

Pattern 2: Transitive Causation. A β†’ B β†’ C β†’ D and D fails; trace back through C, B, and A to find the root cause.

Pattern 3: Fan-Out Causation. A calls B, C, and D in parallel and C fails; analyze whether B's and D's success or failure depended on C.

Pattern 4: Contextual Causation. A's slow response wasn't caused by A's own code but by an upstream timeout, which the propagated trace context makes visible.

FAN-OUT PATTERN: Parallel Service Calls

           β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
           β”‚  Service A  β”‚ (300ms total)
           β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜
                  β”‚
        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”
        β”‚         β”‚         β”‚
   β”Œβ”€β”€β”€β”€β–Όβ”€β”€β”€β” β”Œβ”€β”€β–Όβ”€β”€β”€β” β”Œβ”€β”€β”€β–Όβ”€β”€β”€β”€β”
   β”‚Service β”‚ β”‚Serviceβ”‚ β”‚Service β”‚
   β”‚   B    β”‚ β”‚   C  β”‚ β”‚   D    β”‚
   β”‚ (50ms) β”‚ β”‚(280ms)β”‚ β”‚ (40ms) β”‚
   β””β”€β”€β”€β”€β”¬β”€β”€β”€β”˜ β””β”€β”€β”¬β”€β”€β”€β”˜ β””β”€β”€β”€β”¬β”€β”€β”€β”€β”˜
        β”‚        β”‚         β”‚
        β””β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                 β”‚
          β”Œβ”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”
          β”‚   Merge     β”‚
          β”‚   Results   β”‚
          β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Causal Conclusion: Service C (280ms) is the bottleneck causing A's latency.

Critical Path Analysis πŸ›€οΈ

The critical path is the longest causal chain through your traceβ€”the sequence of operations that determines total latency. Optimizing non-critical-path services won't improve user experience.

| Span                      | Duration | On Critical Path?  | Impact if Optimized     |
|---------------------------|----------|--------------------|-------------------------|
| Auth validation           | 20ms     | βœ… Yes             | Reduces total latency   |
| Fetch user profile        | 150ms    | βœ… Yes             | High impact             |
| Log analytics event       | 200ms    | ❌ No (async)      | Zero impact on response |
| Calculate recommendations | 300ms    | βœ… Yes             | Highest impact          |
| Send marketing email      | 500ms    | ❌ No (background) | Zero impact on response |

πŸ’‘ Pro Tip: Color-code traces by critical path status. Focus optimization efforts only on critical-path operations.
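
A simplified way to compute the critical path from span data: start at the root and repeatedly follow the child that finishes last, since that is the work the parent was still waiting on. The sketch below uses hypothetical span records (real spans carry the same shape: IDs, parent IDs, timestamps) and assumes synchronous child calls.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]  # None for the root span
    name: str
    start_ms: float
    end_ms: float

def critical_path(spans: list[Span]) -> list[Span]:
    """Follow, at each level, the child that finishes last (simplified heuristic)."""
    children: dict = {}
    for s in spans:
        children.setdefault(s.parent_id, []).append(s)
    path = [children[None][0]]  # the root span
    while children.get(path[-1].span_id):
        path.append(max(children[path[-1].span_id], key=lambda s: s.end_ms))
    return path

spans = [
    Span("001", None,  "POST /orders", 0, 300),
    Span("002", "001", "service B",   10,  60),
    Span("003", "001", "service C",   10, 290),  # the bottleneck
    Span("004", "001", "service D",   10,  50),
]
print([s.name for s in critical_path(spans)])  # ['POST /orders', 'service C']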

Examples

Example 1: HTTP Request Cascade 🌊

Scenario: A mobile app calls API Gateway β†’ Auth Service β†’ User Service β†’ Database

Let's trace the causality:

Client Request:
  trace-id: abc123
  parent-span-id: (none - root)
  span-id: span-001

API Gateway receives request, creates child span:
  trace-id: abc123
  parent-span-id: span-001
  span-id: span-002
  operation: route-request

API Gateway calls Auth Service with headers:
  traceparent: 00-abc123-span-002-01

Auth Service creates child span:
  trace-id: abc123
  parent-span-id: span-002
  span-id: span-003
  operation: validate-token

Auth Service calls User Service:
  traceparent: 00-abc123-span-003-01

User Service creates child span:
  trace-id: abc123
  parent-span-id: span-003
  span-id: span-004
  operation: fetch-user-profile

User Service queries database:
  trace-id: abc123
  parent-span-id: span-004
  span-id: span-005
  operation: db-query
  query: SELECT * FROM users WHERE id=?
  duration: 250ms ⚠️ (SLOW!)

Causal Analysis:

  • Root cause: Database query (span-005) took 250ms
  • Effect: User Service (span-004) blocked waiting for DB
  • Upstream effect: Auth Service (span-003) waited for User Service
  • User impact: API Gateway (span-002) couldn't respond

The causal chain: Slow DB query β†’ Blocked User Service β†’ Delayed Auth β†’ Slow API response

VISUALIZED TIMELINE (β†’ = causation)

0ms   100ms  200ms  300ms  400ms
│─────│─────│─────│─────│─────│
β”‚                              β”‚ span-001 (Client)
β”‚  β”‚                           β”‚ span-002 (Gateway)
β”‚  β”‚  β”‚                        β”‚ span-003 (Auth)
β”‚  β”‚  β”‚   β”‚                    β”‚ span-004 (User Service)
β”‚  β”‚  β”‚   β”‚    │──────────│    β”‚ span-005 (DB) ← ROOT CAUSE
                    250ms!

Critical Path: 001β†’002β†’003β†’004β†’005
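
To make that causal analysis mechanical, you can find the slowest leaf span and walk its parent links back to the root. The sketch below uses hypothetical span records matching this example; only the 250ms database duration comes from the trace above, the other durations are illustrative.

# Hypothetical span records: id -> {parent, name, duration}
spans = {
    "span-001": {"parent": None,       "name": "Client request",      "duration_ms": 400},
    "span-002": {"parent": "span-001", "name": "Gateway route",       "duration_ms": 380},
    "span-003": {"parent": "span-002", "name": "Auth validate-token", "duration_ms": 350},
    "span-004": {"parent": "span-003", "name": "User fetch-profile",  "duration_ms": 300},
    "span-005": {"parent": "span-004", "name": "DB query",            "duration_ms": 250},
}

def causal_chain(spans: dict, span_id: str) -> list:
    """Walk parent links from a span back to the root: the chain of causes."""
    chain = []
    while span_id is not None:
        chain.append(spans[span_id]["name"])
        span_id = spans[span_id]["parent"]
    return list(reversed(chain))

def leaves(spans: dict) -> list:
    """Spans that are nobody's parent, i.e. the innermost operations."""
    parent_ids = {s["parent"] for s in spans.values()}
    return [sid for sid in spans if sid not in parent_ids]

slowest_leaf = max(leaves(spans), key=lambda sid: spans[sid]["duration_ms"])
print(causal_chain(spans, slowest_leaf))
# ['Client request', 'Gateway route', 'Auth validate-token', 'User fetch-profile', 'DB query']
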
Example 2: Async Message Queue Causality πŸ“¨

Distributed systems often use message queues (Kafka, RabbitMQ, SQS) where causality isn't obvious.

Scenario: Order service publishes "OrderCreated" event β†’ Inventory service consumes β†’ Warehouse system updates

Order Service (Producer):
  Creates span-A: "publish-order-event"
  Attaches trace context to message:
    kafka-header: trace-id=xyz789
    kafka-header: parent-span-id=span-A

Message sits in queue for 3 seconds... ⏳

Inventory Service (Consumer):
  Receives message, extracts trace context
  Creates span-B: "process-order-event"
    parent-span-id: span-A  ← Establishes causality!
    trace-id: xyz789

  Calls Warehouse API:
    Creates span-C: "reserve-inventory"
    parent-span-id: span-B

Warehouse System:
  Creates span-D: "update-stock-levels"
  parent-span-id: span-C
  Fails with: "Insufficient inventory" ❌

Key Causality Insights:

  1. Queue latency (3s) is visible in the trace but doesn't break causality
  2. Async boundaries are maintained through message headers
  3. Root cause: Warehouse inventory failure (span-D)
  4. Causal path: Order publish (A) β†’ Inventory processing (B) β†’ Warehouse call (C) β†’ Stock update failure (D)

TIME-BASED VIEW (Queue creates gap)

Order Service:    │─span-A─│
                            ↓ (message)
Queue:                      [3 seconds]
                                      ↓
Inventory Service:                  │─span-B─│
                                             ↓
Warehouse:                                   │─span-C─│─span-D─│
                                                           ❌

CAUSAL VIEW (Gap doesn't matter)
span-A β†’ span-B β†’ span-C β†’ span-D (failure)

πŸ’‘ Best Practice: Always propagate trace context through message metadata, not message body (keeps payload clean).
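
A sketch of that practice with the OpenTelemetry Python propagation API: the producer injects the trace context into a plain dict, which is attached as record headers, and the consumer extracts it before starting its span. The Kafka client calls are hypothetical placeholders; only inject and extract are real API calls.

from opentelemetry import trace
from opentelemetry.propagate import extract, inject

tracer = trace.get_tracer("order-service")

# --- Producer side ---
with tracer.start_as_current_span("publish-order-event"):
    carrier = {}
    inject(carrier)  # e.g. {"traceparent": "00-<trace-id>-<span-id>-01"}
    headers = [(k, v.encode()) for k, v in carrier.items()]
    # producer.send("orders", value=event_bytes, headers=headers)  # hypothetical client call

# --- Consumer side (possibly seconds later) ---
def on_message(value: bytes, headers: list):
    carrier = {k: v.decode() for k, v in headers}
    parent_ctx = extract(carrier)  # queue latency does not break the causal link
    with tracer.start_as_current_span("process-order-event", context=parent_ctx):
        ...  # runs as a child of the producer's span, in the same trace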

Example 3: Circuit Breaker Causality πŸ”Œ

Circuit breakers complicate causality because failures propagate differently.

Scenario: Payment service is failing β†’ Gateway opens circuit breaker β†’ Orders fail without calling Payment service

Initial Failures (Circuit Closed):

Trace 1:
  Order Service (span-1) 
    β†’ Payment Service (span-2) - HTTP 500 error
    β†’ Database timeout (span-3) ← Root cause

Trace 2:
  Order Service (span-4)
    β†’ Payment Service (span-5) - HTTP 500 error
    β†’ Database timeout (span-6) ← Root cause

... 5 more failures ...

Circuit Opens! πŸ”΄

Subsequent Requests (Circuit Open):

Trace 10:
  Order Service (span-20)
    β†’ Circuit Breaker OPEN (span-21) ← No call to Payment!
    β†’ Error: "Service Unavailable" returned immediately
    
  Spans: order.create, circuit.open, fallback.execute
  Tags: circuit.state=open, circuit.reason=failure_threshold

Causal Analysis:

  • Primary cause: Database timeouts in Payment service
  • Secondary cause: Circuit breaker opening (protective mechanism)
  • Tertiary effect: New orders failing fast without calling Payment

Challenge: How do you link Trace 10 (circuit open) to Traces 1-9 (failures that opened circuit)?

Solution: Circuit breaker state changes are events with their own spans:

CAUSAL LINKAGE THROUGH CIRCUIT STATE

Trace 1-9:                  Trace 10-100:
  span: payment.call          span: order.create
  status: error               |
  error: DB timeout           ↓
         ↓                    span: circuit.check
    [Trigger]                 state: OPEN
         ↓                    reason_trace_id: trace-7 ← Reference!
  span: circuit.state_change  opened_at: timestamp
  event: CLOSED β†’ OPEN
  trigger_trace: trace-7
  trigger_span: span-13

Now when investigating Trace 10, you can:

  1. See circuit is open (span-21)
  2. Check circuit state-change event
  3. Follow trigger_trace to original failure (trace-7)
  4. Find root cause in trace-7's span-13 (database timeout)
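
One concrete way to record that linkage is with span links, which OpenTelemetry supports for exactly this kind of cross-trace causality. A sketch (the function and attribute names are illustrative):

from opentelemetry import trace

tracer = trace.get_tracer("order-service")

def record_circuit_open(trigger_span_context, failure_count):
    """Emit a span for the state change, linked back to the failure that tripped it."""
    with tracer.start_as_current_span(
        "circuit.state_change",
        links=[trace.Link(trigger_span_context)],  # points at trace-7 / span-13
    ) as span:
        span.set_attribute("circuit.state", "open")
        span.set_attribute("circuit.reason", "failure_threshold")
        span.set_attribute("circuit.failure_count", failure_count)

When investigating a fast-failing order, that link gives you a direct pointer from the open-circuit span back to the trace that tripped the breaker.
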
Example 4: Multi-Tenant Causality with Baggage 🏒

Baggage items propagate contextual data across the entire trace.

Scenario: SaaS platform where tenant=ACME experiences slow response times

Incoming Request:
  traceparent: 00-def456-...-01
  baggage: tenantId=ACME,region=us-east,tier=premium

API Gateway (span-1):
  Extracts baggage β†’ All child spans inherit it
  
Auth Service (span-2):
  baggage.tenantId = ACME
  Queries: auth_db.tenant_ACME
  
Feature Service (span-3):
  baggage.tier = premium
  Enables: advanced_analytics_feature
  Calls: analytics.compute() (span-4)
  
Analytics Service (span-4):
  baggage.region = us-east  
  Queries: analytics_db.us_east
  Duration: 4.2 seconds! ⚠️
  
Cache Service (parallel, span-5):
  baggage.tenantId = ACME
  Cache key: cache:ACME:profile
  Duration: 50ms βœ…

Causal + Contextual Analysis:

| Observation            | Causal Link     | Baggage Context                        |
|------------------------|-----------------|----------------------------------------|
| Slow analytics query   | span-3 β†’ span-4 | tier=premium enabled expensive feature |
| us-east database       | Region routing  | baggage.region determined DB selection |
| Tenant-specific impact | ACME only       | baggage.tenantId isolates issue        |

Root Cause: Premium tier feature (enabled by baggage.tier=premium) triggered expensive analytics computation. Only affects premium tenants in us-east region.

Without baggage, you'd see "Analytics Service is slow" but not understand:

  • WHY it's slow (premium feature)
  • WHO it affects (ACME tenant)
  • WHERE it's happening (us-east)

BAGGAGE PROPAGATION FLOW

  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
  β”‚ tenantId=ACME, tier=premium, region=us β”‚
  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                      β”‚ (inherited by all)
        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
        β”‚             β”‚             β”‚
   β”Œβ”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”   β”Œβ”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”
   β”‚  Auth   β”‚   β”‚Analyticsβ”‚   β”‚  Cache  β”‚
   β”‚ (span2) β”‚   β”‚ (span4) β”‚   β”‚ (span5) β”‚
   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
        ↓             ↓             ↓
   Uses tenant   Premium      Tenant-specific
   in DB query   feature      cache key
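
A common companion pattern is to copy selected baggage entries onto each service's spans as attributes, so traces can be filtered by tenant, tier, or region without any extra propagation. A sketch with the OpenTelemetry Python API (the keys mirror the example above; the attribute prefix is illustrative):

from opentelemetry import baggage, trace

def annotate_span_from_baggage():
    """Copy selected baggage entries onto the current span as searchable attributes."""
    span = trace.get_current_span()
    for key in ("tenantId", "tier", "region"):
        value = baggage.get_baggage(key)  # reads from the current context
        if value is not None:
            span.set_attribute(f"app.{key}", value)

# Inside any instrumented handler, e.g. the Analytics Service:
# with tracer.start_as_current_span("analytics.compute"):
#     annotate_span_from_baggage()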

Common Mistakes

⚠️ Mistake 1: Breaking the Causality Chain

Problem: Failing to propagate trace context across all integration points.

# ❌ WRONG: Creating a new trace instead of continuing the existing one
def call_downstream_service(data):
    tracer = Tracer()  # New tracer instance!
    with tracer.start_span('downstream_call'):  # New root span!
        response = http.post(url, json=data)
    return response

# βœ… RIGHT: Continuing the existing trace
def call_downstream_service(data):
    with tracer.start_span('downstream_call') as span:
        # Inject the current trace context into the outgoing headers
        response = http.post(url, json=data,
                             headers=tracer.inject_headers())
    return response

Impact: You get isolated spans instead of connected causal graphs.

⚠️ Mistake 2: Confusing Correlation with Causation

Problem: Two events happening near each other in time doesn't mean one caused the other.

Service A completes at: 10:23:45.123
Service B fails at:     10:23:45.125

❌ WRONG: "Service A caused Service B's failure" 
           (only 2ms apart, must be related!)

βœ… RIGHT: Check trace context. Do they share a trace-id?
          Is there a span parent-child relationship?
          If not, they're just coincidentally timed events.

Always verify causality through trace relationships, not timestamps!

⚠️ Mistake 3: Ignoring Async Causality

Problem: Treating async operations as if they break causality.

# ❌ WRONG: Losing context in background tasks
def process_order(order_id):
    with tracer.start_span('process_order'):
        # Do some work
        Thread(target=send_notification, args=(order_id,)).start()
        # Notification span will be orphaned! ❌

# βœ… RIGHT: Passing context to async operations
def process_order(order_id):
    with tracer.start_span('process_order') as span:
        ctx = tracer.extract_context()  # Capture current context
        Thread(target=send_notification,
               args=(order_id, ctx)).start()

def send_notification(order_id, context):
    with tracer.start_span('send_notification', context=context):
        # Now properly linked to parent! βœ…
        email.send(...)
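
For comparison, the same handoff with the OpenTelemetry Python API passes the Context object to the worker thread explicitly (a sketch; assumes a configured tracer, with the notification send left as a placeholder):

from threading import Thread

from opentelemetry import context, trace

tracer = trace.get_tracer("order-service")

def process_order(order_id):
    with tracer.start_as_current_span("process_order"):
        ctx = context.get_current()  # snapshot of the active context
        Thread(target=send_notification, args=(order_id, ctx)).start()

def send_notification(order_id, ctx):
    # Parent the background span on the captured context, not on whatever
    # happens to be current in this thread (which would be nothing)
    with tracer.start_as_current_span("send_notification", context=ctx):
        ...  # send the email / push notification here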

⚠️ Mistake 4: Excessive Baggage

Problem: Treating baggage as general-purpose distributed storage.

# ❌ WRONG: Putting entire objects in baggage
baggage = {
    'user': json.dumps(user_object),  # 2KB!
    'cart': json.dumps(shopping_cart),  # 5KB!
    'preferences': json.dumps(prefs),  # 1KB!
    'history': json.dumps(order_history)  # 10KB!
}
# This gets sent with EVERY service call!

# βœ… RIGHT: Only IDs and small flags
baggage = {
    'userId': '12345',
    'cartId': 'abc789',
    'experimentVariant': 'B',
    'tier': 'premium'
}
# Retrieve full objects from cache/DB when needed

πŸ’‘ Rule of thumb: Keep total baggage under 1KB per trace.

⚠️ Mistake 5: Not Tagging Critical Path Operations

Problem: Making all spans look equally important.

# ❌ WRONG: No indication of importance
with tracer.start_span('db_query'):
    result = db.query(...)

# βœ… RIGHT: Tag critical path operations
with tracer.start_span('db_query') as span:
    span.set_tag('critical_path', True)
    span.set_tag('operation.importance', 'high')
    result = db.query(...)

This enables filtering and prioritization in trace analysis tools.

Key Takeaways

🎯 Cross-service causality transforms isolated logs into a coherent narrative of system behavior. By establishing parent-child relationships between spans and propagating trace context across service boundaries, you can answer "why did this happen?" not just "what happened?"

🎯 Happens-before relationships (β†’) define causality more reliably than timestamps, which suffer from clock skew in distributed systems.

🎯 Trace context propagation requires instrumenting every integration point: HTTP headers, message queue metadata, gRPC context, async task handoffs.

🎯 Critical path analysis identifies which operations actually impact user experience, preventing wasted optimization efforts on non-blocking operations.

🎯 Baggage items provide contextual metadata that travels with the entire trace, enabling tenant-specific, region-specific, or experiment-specific analysis.

🎯 Async and queued operations don't break causality when properly instrumented with trace context in message headers.

🎯 Circuit breakers and fallbacks require special handling to link downstream effects back to upstream root causes.

πŸ“‹ Quick Reference: Cross-Service Causality

| Concept            | Key Point                                      |
|--------------------|------------------------------------------------|
| Causality          | Event A influences event B (A β†’ B)             |
| Parent-Child Spans | Establishes causal links across services       |
| Trace Context      | TraceID + ParentSpanID + SpanID propagated     |
| Critical Path      | Longest causal chain determining latency       |
| Baggage            | Metadata propagated across entire trace (<1KB) |
| Happens-Before     | Causal ordering independent of timestamps      |
| Propagation        | HTTP headers, message metadata, gRPC context   |
| Async Operations   | Extract context, pass to background tasks      |

πŸ”§ Try This: Instrument a simple microservices application (even just two services) with distributed tracing. Deliberately introduce a slow database query in the downstream service and observe how the causality chain reveals the root cause. Tools like Jaeger or Zipkin provide free, local trace visualization.

🧠 Memory Device: PCBH - Parent-Child relationships establish Baggage-carrying Happens-before causality. Think: "Please Carry Baggage Home" to remember the four pillars of cross-service causality.