Identity and Cache Truth in Relay
Master Relay's normalization system and understand how object identity drives cache consistency. This lesson covers global object identification, normalization strategies, and cache consistency patterns: essential concepts for building performant GraphQL applications with Relay.
Welcome to Identity and Cache Truth
When building applications with Relay, understanding how data is stored, identified, and kept consistent is crucial. Relay's approach to caching and data management sets it apart from other GraphQL clients, offering automatic normalization and a sophisticated system for ensuring your cache represents the truth about your application's data state.
Think of Relay's cache as a single source of truth for all your data, like a well-organized library where every book has a unique catalog number. When you request the same book from different sections, you always get the exact same copy, never duplicates. This lesson will teach you how Relay achieves this through its identity system and cache management strategies.
Core Concepts
Global Object Identification
Relay requires that every object in your GraphQL schema that can be refetched has a globally unique identifier. This is the cornerstone of Relay's normalization strategy.
The id Field Convention
Every type that implements the Node interface must have an id field that is:
- Globally unique across your entire schema
- Opaque (clients shouldn't parse or construct IDs)
- Stable (the same object always has the same ID)
interface Node {
id: ID!
}
type User implements Node {
id: ID!
name: String!
email: String!
}
type Post implements Node {
id: ID!
title: String!
author: User!
}
Best Practice: Use base64-encoded strings that include the typename, like "User:123" encoded as "VXNlcjoxMjM=". This makes debugging easier while keeping IDs opaque to clients.
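For instance, a server might implement this convention with a pair of helpers like the sketch below. The function names are illustrative (the graphql-relay npm package ships similar toGlobalId/fromGlobalId utilities):
// A minimal sketch of server-side global ID helpers, following the
// "Type:databaseId" convention described above
function toGlobalId(typename, databaseId) {
  // "User:123" -> "VXNlcjoxMjM="
  return Buffer.from(`${typename}:${databaseId}`, 'utf8').toString('base64');
}

function fromGlobalId(globalId) {
  // "VXNlcjoxMjM=" -> {typename: "User", databaseId: "123"}
  const [typename, databaseId] = Buffer.from(globalId, 'base64')
    .toString('utf8')
    .split(':');
  return {typename, databaseId};
}

console.log(toGlobalId('User', '123'));    // "VXNlcjoxMjM="
console.log(fromGlobalId('VXNlcjoxMjM=')); // {typename: 'User', databaseId: '123'}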
Why Global IDs Matter
When Relay fetches a User with id: "VXNlcjoxMjM=" from one query and the same user from another query, it recognizes they're the same object and merges the data automatically:
| Query 1 Result | Query 2 Result | Merged Cache Entry |
|---|---|---|
| {id: "VXNlcjoxMjM=", name: "Alice"} | {id: "VXNlcjoxMjM=", email: "alice@example.com"} | {id: "VXNlcjoxMjM=", name: "Alice", email: "alice@example.com"} |
Normalization: The Heart of Relay's Cache
Normalization is the process of flattening nested GraphQL responses into a flat lookup table indexed by global IDs. This eliminates data duplication and ensures consistency.
Before Normalization (Denormalized)
{
  "viewer": {
    "name": "Alice",
    "posts": [
      {
        "id": "post1",
        "title": "Hello World",
        "author": {
          "id": "user123",
          "name": "Alice"
        }
      }
    ]
  },
  "post": {
    "id": "post1",
    "title": "Hello World",
    "author": {
      "id": "user123",
      "name": "Alice"
    }
  }
}
Notice how Alice and "Hello World" appear multiple times? If Alice changes her name, we'd need to update it in multiple places.
After Normalization (Relay's Store)
{
  "user123": {
    "__typename": "User",
    "id": "user123",
    "name": "Alice"
  },
  "post1": {
    "__typename": "Post",
    "id": "post1",
    "title": "Hello World",
    "author": {"__ref": "user123"}
  },
  "client:root": {
    "viewer": {"__ref": "user123"},
    "post": {"__ref": "post1"}
  }
}
Now each object exists exactly once. The author field doesn't contain the full user object; it contains a reference ({"__ref": "user123"}) pointing to the normalized record.
Mental Model: Think of normalization like database normalization. Instead of storing the entire customer record with every order, you store the customer once and reference it by ID from each order.
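To make the reference-following concrete, here is a toy lookup over the normalized store shown above. This is an illustration only, not Relay's actual store implementation:
// A plain-object sketch of resolving {"__ref": ...} pointers
const store = {
  user123: {__typename: 'User', id: 'user123', name: 'Alice'},
  post1: {__typename: 'Post', id: 'post1', title: 'Hello World', author: {__ref: 'user123'}},
  'client:root': {viewer: {__ref: 'user123'}, post: {__ref: 'post1'}},
};

function resolve(value) {
  // Follow a reference back to the single normalized record
  return value && value.__ref ? store[value.__ref] : value;
}

const post = resolve(store['client:root'].post);
const author = resolve(post.author);
console.log(author.name); // "Alice" - the same record the viewer field points to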
Cache as Truth: The Single Source Principle
In Relay, the store (cache) is the source of truth for your UI. Components don't hold their own copies of data; they read from and subscribe to the centralized store.
The Data Flow
┌────────────────────────────────────────────────────┐
│                  RELAY DATA FLOW                   │
└────────────────────────────────────────────────────┘
1. Component renders
        │
        ▼
2. Relay reads from Store (cache)
        │
        ▼
3. Is data available?
        │
   ┌────┴────┐
   │         │
  YES        NO
   │         │
   │         ▼
   │   4. Fetch from network
   │         │
   │         ▼
   │   5. Normalize response
   │         │
   │         ▼
   │   6. Update Store
   │         │
   └────┬────┘
        │
        ▼
7. Notify subscribed components
        │
        ▼
8. Components re-render with new data
Benefits of Cache as Truth
- Consistency: When data updates in the store, ALL components using that data automatically see the update
- Efficiency: No duplicate data in memory
- Automatic updates: Update a user's name in one place, and it updates everywhere it's displayed (see the fragment sketch below)
- Optimistic updates: You can immediately update the cache before the server responds, making your UI feel instant
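As a quick illustration of that subscription model, the two (made-up) components below read the same User record through fragments; because both subscribe to that single store entry, an update to it re-renders both:
// Both components read the same normalized User record via fragments.
// When that record changes in the store, Relay re-renders both.
function UserName({userRef}) {
  const user = useFragment(
    graphql`
      fragment UserName_user on User {
        id
        name
      }
    `,
    userRef
  );
  return <span>{user.name}</span>;
}

function UserAvatar({userRef}) {
  const user = useFragment(
    graphql`
      fragment UserAvatar_user on User {
        id
        profilePicture
      }
    `,
    userRef
  );
  return <img src={user.profilePicture} alt="avatar" />;
}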
Cache Policies and Data Freshness
Relay provides several strategies for determining whether cached data is "fresh enough" or needs to be refetched.
Fetch Policies
| Policy | Behavior | Use Case |
|---|---|---|
| store-or-network | Use cache if available, otherwise fetch | Default - balanced approach |
| store-and-network | Use cache immediately, then fetch to update | Show something fast, then refresh |
| network-only | Always fetch from network, ignore cache | Critical real-time data |
| store-only | Only use cache, never fetch | Offline mode, static data |
const data = useLazyLoadQuery(
  graphql`
    query UserProfileQuery($id: ID!) {
      user(id: $id) {
        id
        name
        email
      }
    }
  `,
  {id: userId},
  {fetchPolicy: 'store-and-network'} // Render cache, then update
);
Pro Tip: Use store-and-network for lists and feeds where you want to show cached content immediately but also want fresh data. Use network-only sparingly; it defeats the purpose of caching.
Cache Updates: Mutations and the Store
When you perform a mutation (like creating, updating, or deleting data), Relay needs to update its cache to reflect the changes.
Automatic Updates
If your mutation returns an object with an id, Relay automatically updates that record in the cache:
mutation UpdateUserMutation($input: UpdateUserInput!) {
updateUser(input: $input) {
user {
id # Relay uses this to find the record
name # These fields get updated
email
}
}
}
Relay sees user.id, finds the existing User record in the cache with that ID, and merges the new fields. Every component displaying that user automatically re-renders with the updated data.
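On the client, committing that mutation is all it takes; the sketch below uses illustrative handler and variable names, and no updater function is involved:
const [commit] = useMutation(graphql`
  mutation UpdateUserMutation($input: UpdateUserInput!) {
    updateUser(input: $input) {
      user {
        id
        name
        email
      }
    }
  }
`);

// Relay merges the returned user into the cache by id - no updater needed
function handleRename(userId, newName) {
  commit({
    variables: {input: {id: userId, name: newName}},
  });
}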
Manual Cache Updates with Updater Functions
For more complex scenarios (like adding items to a list), you need an updater function:
const [commitMutation] = useMutation(graphql`
  mutation CreatePostMutation($input: CreatePostInput!) {
    createPost(input: $input) {
      post {
        id
        title
        author {
          id
        }
      }
    }
  }
`);

function createPost(title) {
  commitMutation({
    variables: {input: {title}},
    updater: (store) => {
      // Get the new post from the mutation response
      const newPost = store.getRootField('createPost').getLinkedRecord('post');
      // Get the current user's record (currentUserId is assumed to be in scope)
      const user = store.get(currentUserId);
      if (!user || !newPost) return;
      // Get the existing posts, defaulting to an empty list
      const posts = user.getLinkedRecords('posts') || [];
      // Add the new post to the beginning
      user.setLinkedRecords([newPost, ...posts], 'posts');
    }
  });
}
The updater function receives a store object that lets you imperatively modify the cache.
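The calls you will reach for most often are summarized in this sketch (a non-exhaustive sample of the store proxy API; see the Relay Store API reference linked at the end of this lesson):
// Common operations on the store proxy passed to an updater function
function exampleUpdater(store) {
  const root = store.getRoot();                       // the client:root record
  const payload = store.getRootField('createPost');   // a root field of the mutation response
  const record = store.get('some-global-id');         // look up any record by its global id

  if (record) {
    const title = record.getValue('title');           // read a scalar field
    record.setValue('New title', 'title');            // write a scalar field
    const author = record.getLinkedRecord('author');  // follow a single reference
    const posts = record.getLinkedRecords('posts');   // follow a list of references
    record.setLinkedRecords(posts || [], 'posts');    // replace a list of references
  }

  store.delete('some-global-id');                     // remove a record entirely
}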
Garbage Collection and Cache Retention
Relay doesn't keep everything in memory forever. It uses garbage collection to remove data that's no longer being used.
Reference Counting
Relay tracks how many components are using each piece of data:
┌──────────────────────────────────────────┐
│  USER RECORD: user123                    │
│  Reference count: 2                      │
├──────────────────────────────────────────┤
│  Referenced by:                          │
│   • ProfilePage component                │
│   • HeaderUserMenu component             │
└──────────────────────────────────────────┘
When both components unmount:
        │
        ▼
Reference count → 0
        │
        ▼
After GC runs (by default, once the query falls out of Relay's release buffer of recent queries)
        │
        ▼
Record eligible for deletion
        │
        ▼
Freed from memory
Retention Strategies
// Normal usage - Relay keeps the data referenced while the component
// is mounted, plus a grace period via the GC release buffer
const data = useLazyLoadQuery(
  query,
  variables,
  {fetchPolicy: 'store-or-network'}
);

// Manual retention - prevent GC
// (createOperationDescriptor and getRequest come from 'relay-runtime')
const environment = useRelayEnvironment();
const operation = createOperationDescriptor(getRequest(query), variables);
const disposable = environment.retain(operation);

// Later: allow GC
disposable.dispose();
Best Practice: Let Relay handle GC automatically in most cases. Use manual retention only for data you know you'll need soon (like prefetching for the next page).
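If you do need to tune retention globally, the store accepts GC-related options when you build the Relay environment. A minimal sketch, assuming fetchGraphQL is your own network fetch function and the numbers shown are arbitrary:
import {Environment, Network, RecordSource, Store} from 'relay-runtime';

const store = new Store(new RecordSource(), {
  // How many recent queries stay temporarily retained after their
  // components unmount (Relay's default is 10)
  gcReleaseBufferSize: 10,
  // Optionally mark cached query data as stale after this many milliseconds
  queryCacheExpirationTime: 5 * 60 * 1000,
});

const environment = new Environment({
  network: Network.create(fetchGraphQL), // fetchGraphQL is assumed to exist
  store,
});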
Detailed Examples
Example 1: Identity Collision and Resolution
Scenario: You fetch the same user from two different queries with different fields.
// Query 1: Get basic user info
const data1 = useLazyLoadQuery(
graphql`
query Example1_BasicQuery($id: ID!) {
user(id: $id) {
id
name
}
}
`,
{id: 'user123'}
);
// Later... Query 2: Get user with email
const data2 = useLazyLoadQuery(
graphql`
query Example1_DetailQuery($id: ID!) {
user(id: $id) {
id
email
profilePicture
}
}
`,
{id: 'user123'}
);
What happens in the cache:
| Step | Cache State | Explanation |
|---|---|---|
| 1 | {"user123": {"id": "user123", "name": "Alice"}} | First query stores basic info |
| 2 | {"user123": {"id": "user123", "name": "Alice", "email": "alice@example.com", "profilePicture": "url..."}} | Second query merges new fields |
Relay merges the data because both queries reference the same id. The cache now contains all fields from both queries. If a third component queries just name, Relay serves it from cache without a network request.
Key Insight: This is why global IDs are so powerful. Relay automatically deduplicates and consolidates data across your entire application.
Example 2: Cache Invalidation with Updates
Scenario: A user updates their profile, and you want all components displaying that user to update immediately.
// Component A: Profile page
function ProfilePage({userId}) {
const data = useLazyLoadQuery(
graphql`
query ProfilePageQuery($id: ID!) {
user(id: $id) {
id
name
bio
}
}
`,
{id: userId}
);
return (
<div>
<h1>{data.user.name}</h1>
<p>{data.user.bio}</p>
</div>
);
}
// Component B: Header (different part of UI)
function Header({userId}) {
const data = useLazyLoadQuery(
graphql`
query HeaderQuery($id: ID!) {
user(id: $id) {
id
name
}
}
`,
{id: userId}
);
return <div>Welcome, {data.user.name}!</div>;
}
// Mutation: Update profile
function EditProfileForm({userId}) {
const [commit] = useMutation(graphql`
mutation UpdateProfileMutation($input: UpdateUserInput!) {
updateUser(input: $input) {
user {
id
name
bio
}
}
}
`);
function handleSubmit(newName, newBio) {
commit({
variables: {
input: {id: userId, name: newName, bio: newBio}
}
// No updater needed! Relay handles it automatically
});
}
return <form onSubmit={handleSubmit}>...</form>;
}
What happens:
- User submits form
- Mutation executes and returns the updated user with id: "user123"
- Relay finds the user123 record in the cache
- Updates the name and bio fields
- Both ProfilePage and Header automatically re-render with the new name
- User sees instant updates everywhere
Why this works: Because both components query the same user(id: "user123"), they share the same cache entry. When that entry updates, Relay notifies all subscribers.
Example 3: Optimistic Updates for Instant UI
Scenario: When a user likes a post, you want the UI to update instantly without waiting for the server.
function LikeButton({postId, currentLikeCount, viewerHasLiked}) {
const [commit, isInFlight] = useMutation(graphql`
mutation LikePostMutation($input: LikePostInput!) {
likePost(input: $input) {
post {
id
likeCount
viewerHasLiked
}
}
}
`);
function handleLike() {
commit({
variables: {input: {postId}},
// Optimistic response - applied immediately
optimisticResponse: {
likePost: {
post: {
id: postId,
likeCount: currentLikeCount + 1,
viewerHasLiked: true
}
}
},
// Optional: handle server response different from optimistic
onCompleted: (response) => {
// Server confirmed the like
console.log('Like confirmed');
},
onError: (error) => {
// Server rejected - Relay automatically rolls back optimistic update
console.error('Like failed', error);
}
});
}
return (
<button onClick={handleLike} disabled={isInFlight}>
{viewerHasLiked ? '❤️' : '🤍'} {currentLikeCount}
</button>
);
}
Timeline of events:
┌───────────────────────────────────────────────────────┐
│              OPTIMISTIC UPDATE TIMELINE               │
└───────────────────────────────────────────────────────┘
t=0ms:    User clicks button
              │
              ▼
t=1ms:    Optimistic response applied to cache
              │
              ▼
t=2ms:    UI re-renders (likeCount: 42 → 43)
          (viewerHasLiked: false → true)
              │
              ▼
t=5ms:    Network request sent to server
              │
              ... network latency ...
              │
              ▼
t=150ms:  Server responds with actual data
              │
              ▼
t=151ms:  Cache updated with real response
          (optimistic update replaced)
              │
              ▼
t=152ms:  UI re-renders if data differs
If the server response differs:
// Optimistic: likeCount = 43
// Server says: likeCount = 44 (someone else liked it too)
// Result: UI shows 44, not 43
Relay automatically replaces optimistic data with real server data, ensuring truth.
Common Mistake: Making optimistic responses too complex. Keep them simple and only update fields you're certain about. Let the server response be the final truth.
Example 4: List Operations with Connections
Scenario: Adding a new comment to a post's comment list.
function AddCommentForm({postId}) {
const [commit] = useMutation(graphql`
mutation AddCommentMutation($input: AddCommentInput!) {
addComment(input: $input) {
commentEdge {
node {
id
text
author {
id
name
}
createdAt
}
}
}
}
`);
function handleSubmit(text) {
commit({
variables: {input: {postId, text}},
updater: (store) => {
// Get the post record from cache
const post = store.get(postId);
if (!post) return;
// Get the new comment from mutation response
const commentEdge = store
.getRootField('addComment')
.getLinkedRecord('commentEdge');
// Get existing comments connection
const connection = post.getLinkedRecord('comments');
if (!connection) return;
// Get current edges array
const edges = connection.getLinkedRecords('edges') || [];
// Prepend new comment to the list
connection.setLinkedRecords(
[commentEdge, ...edges],
'edges'
);
// Update total count
const count = connection.getValue('totalCount') || 0;
connection.setValue(count + 1, 'totalCount');
}
});
}
return <form onSubmit={handleSubmit}>...</form>;
}
Why manual update is needed:
Relay can't automatically know where to insert the new comment in the list. You must tell it:
- Which connection to update (post.comments)
- Where in the list to add it (beginning, end, or a specific position)
- How to update metadata (like totalCount)
Pro Tip: For simple list operations, consider Relay's declarative @appendEdge/@prependEdge (or @appendNode/@prependNode) directives on the mutation's response fields to avoid hand-written updaters.
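A minimal sketch of the declarative approach, assuming the comment list's fragment declared @connection(key: "PostComments_comments") (a made-up key) and that ConnectionHandler is imported from 'relay-runtime':
// Mutation using @prependEdge instead of a hand-written updater
const [commit] = useMutation(graphql`
  mutation AddCommentDeclarativeMutation(
    $input: AddCommentInput!
    $connections: [ID!]!
  ) {
    addComment(input: $input) {
      commentEdge @prependEdge(connections: $connections) {
        node {
          id
          text
        }
      }
    }
  }
`);

function handleSubmit(postId, text) {
  // Identify the connection by its parent record and @connection key
  const connectionID = ConnectionHandler.getConnectionID(
    postId,
    'PostComments_comments'
  );
  commit({variables: {input: {postId, text}, connections: [connectionID]}});
}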
Common Mistakes
Mistake 1: Forgetting to Fetch the id Field
// WRONG - no id field
const data = useLazyLoadQuery(
graphql`
query BadQuery($userId: ID!) {
user(id: $userId) {
name
email
}
}
`,
{userId}
);
Problem: Relay can't normalize the user record without an id. The data will be stored under the query root, not as a reusable record.
Fix: Always include id for types implementing Node:
// CORRECT
const data = useLazyLoadQuery(
graphql`
query GoodQuery($userId: ID!) {
user(id: $userId) {
id  # Always include id
name
email
}
}
`,
{userId}
);
Mistake 2: Mutating Cache Data Directly
// WRONG - direct mutation
const data = useLazyLoadQuery(query, variables);
data.user.name = 'New Name'; // This won't update the cache!
Problem: Relay data is read-only. Direct mutations don't trigger updates or re-renders.
Fix: Use mutations or updater functions:
// CORRECT
const [commit] = useMutation(updateUserMutation);
commit({
variables: {input: {id: userId, name: 'New Name'}}
});
Mistake 3: Incorrect Optimistic Response Structure
// WRONG - mismatched structure
commit({
variables: {input: {postId}},
optimisticResponse: {
likeCount: 43 // Doesn't match mutation shape
}
});
Problem: Optimistic response must exactly match the mutation's response shape.
Fix: Mirror the mutation response structure:
// CORRECT
commit({
variables: {input: {postId}},
optimisticResponse: {
likePost: {  // Matches the mutation field
post: {  // Matches the nested structure
id: postId,
likeCount: 43
}
}
}
});
Mistake 4: Not Handling GC for Prefetched Data
// WRONG - prefetch without retention
function prefetchNextPage() {
fetchQuery(environment, nextPageQuery, variables);
// Data will be GC'd before user navigates!
}
Problem: Prefetched data is garbage collected if no component references it.
Fix: Retain the query:
// CORRECT
// (createOperationDescriptor and getRequest come from 'relay-runtime')
function prefetchNextPage() {
  // Warm the cache with the query's data
  fetchQuery(environment, nextPageQuery, variables).subscribe({});
  // Explicitly retain the query so GC doesn't reclaim it before navigation
  const operation = createOperationDescriptor(getRequest(nextPageQuery), variables);
  const retention = environment.retain(operation);
  // Keep for 30 seconds, then allow GC again
  setTimeout(() => retention.dispose(), 30000);
  return retention;
}
Mistake 5: Over-relying on network-only
// WRONG - unnecessary network requests
const data = useLazyLoadQuery(
query,
variables,
{fetchPolicy: 'network-only'} // Ignores perfectly good cache
);
Problem: Defeats caching, causes unnecessary load, slower UI.
Fix: Use appropriate fetch policy:
// CORRECT - use cache intelligently
const data = useLazyLoadQuery(
query,
variables,
{fetchPolicy: 'store-and-network'} // Show cache, then refresh
);
Key Takeaways
- Identity is Everything: Global IDs enable Relay's entire normalization system. Every refetchable object needs a unique id.
- One Record, One Truth: A normalized cache means each object exists exactly once, eliminating duplication and inconsistency.
- Automatic is Better: Relay automatically merges data, updates components, and handles most cache operations; let it do its job.
- Cache as Source of Truth: Your components read from the store, not from local state. The store is the single source of truth.
- Smart Fetching: Choose the right fetch policy for each use case. Default to store-or-network and only deviate with good reason.
- Optimistic Updates for UX: Use optimistic responses to make your UI feel instant, but keep them simple and let server data override.
- Manual Updates When Needed: For list operations and complex cache changes, use updater functions to explicitly modify the store.
- GC is Your Friend: Let Relay clean up unused data automatically. Manually retain only when prefetching or caching for known future use.
Did You Know?
Relay's normalization strategy is inspired by database normalization principles from the 1970s. The same concepts that prevent data anomalies in SQL databases (1NF, 2NF, 3NF) apply to Relay's cache: one source of truth for each entity!
Facebook (now Meta) built Relay to handle their massive scale: millions of objects, thousands of components, all sharing and updating the same data. The identity system makes it possible to have a single "User" record that's referenced by posts, comments, likes, friend lists, and more, all staying perfectly in sync.
Quick Reference Card
| Concept | Key Point |
|---|---|
| Global ID | Unique identifier for every Node type object |
| Normalization | Flattening nested data into ID-indexed lookup table |
| Store | Relay's cache - single source of truth for all data |
| Reference | {"__ref": "id"} pointer to normalized record |
| Fetch Policy | Strategy for cache vs network (store-or-network, etc.) |
| Updater | Function to manually modify cache after mutations |
| Optimistic Update | Instant UI update before server confirms |
| GC | Automatic cleanup of unreferenced cached data |
| Retention | Keeping data in cache even when not actively used |
Cache Update Flow:
Mutation → Server Response → Normalize → Update Store → Notify Subscribers → Re-render
Always Include:
- id field in queries for Node types
- __typename for union/interface types (Relay adds it automatically)
- Proper error handling for mutations
Further Study
- Relay Documentation - Guided Tour: https://relay.dev/docs/guided-tour/ - Official comprehensive guide to Relay concepts
- GraphQL Global Object Identification Specification: https://graphql.org/learn/global-object-identification/ - The spec behind Relay's ID system
- Relay Store API Reference: https://relay.dev/docs/api-reference/store/ - Detailed documentation on cache manipulation and updater functions