Performance & Memory
Optimize code with structs, spans, ref semantics, and low-level constructs
Performance and Memory Optimization in C#
Master C# performance optimization with free flashcards and spaced repetition practice. This lesson covers memory management fundamentals, garbage collection strategies, value vs. reference types, and performance profiling techniques: essential concepts for building high-performance .NET applications.
Welcome to Performance & Memory 💻
Performance optimization in C# isn't just about writing faster code; it's about understanding how the .NET runtime manages memory, how the garbage collector works, and how to make informed architectural decisions. Whether you're building web APIs, desktop applications, or microservices, knowing these principles will help you create applications that scale efficiently and use resources wisely.
Core Concepts 🧠
Understanding the Stack and Heap 📚
C# uses two primary memory regions: the stack and the heap. Understanding their differences is fundamental to writing performant code.
| Feature | Stack 📚 | Heap 🗄️ |
|---|---|---|
| Speed | Very fast (LIFO access) | Slower (requires GC) |
| Size | Small (~1MB default) | Large (limited by RAM) |
| Lifetime | Method scope | Until GC collects |
| Storage | Value types, references | Reference type objects |
| Allocation | Automatic | Explicit (`new` keyword) |
| Deallocation | Automatic (stack pop) | Garbage Collector |
💡 Key Insight: When you declare `int x = 5;`, the value lives on the stack. When you declare `var person = new Person();`, the reference lives on the stack, but the Person object itself lives on the heap.
STACK vs HEAP VISUALIZATION

   STACK                        HEAP
┌────────────┐        ┌────────────────────┐
│ int x = 5  │        │                    │
├────────────┤        │  Person Object     │
│ ref ───────┼───────▶│  {Name="Alice"}    │
├────────────┤        │                    │
│ int y = 10 │        ├────────────────────┤
├────────────┤        │  List Object       │
│ ref ───────┼───────▶│  {Count=50}        │
└────────────┘        └────────────────────┘
  Fast                  Needs GC
  Small                 Large
  Auto-cleaned          Managed
Value Types vs Reference Types ⚖️
This distinction affects both performance and behavior:
Value Types (stored on the stack when local):
- Primitive types: `int`, `double`, `bool`, `char`
- Structs: `DateTime`, `Guid`, custom structs
- Enums
- Tuples: `(int, string)`

Reference Types (stored on the heap):
- Classes
- Strings (special case: immutable)
- Arrays
- Delegates
- Records (reference types by default; `record struct` opts into value semantics)
// Value type behavior - COPY
int a = 10;
int b = a; // b gets a COPY of the value
b = 20; // a is still 10
// Reference type behavior - REFERENCE
var list1 = new List<int> { 1, 2, 3 };
var list2 = list1; // list2 references SAME object
list2.Add(4); // list1 also has 4 now!
🧠 Memory Tip: "VCSR" - Value types Copy, Structures Stack (when local), Reference types Reference the same object.
The Garbage Collector (GC) 🗑️
The .NET garbage collector automatically manages heap memory, but understanding how it works helps you write more efficient code.
Generational Garbage Collection:
GARBAGE COLLECTOR GENERATIONS
┌──────────────────────────────────────────┐
│ Gen 0 (Young objects)                    │
│ 🐣 Most objects die here                 │
│ Collected frequently (~few ms)           │
├──────────────────────────────────────────┤
│ Gen 1 (Medium-lived objects)             │
│ ⏱️ Buffer between Gen 0 and 2            │
│ Collected less frequently                │
├──────────────────────────────────────────┤
│ Gen 2 (Long-lived objects)               │
│ 🏛️ Static data, caches, singletons       │
│ Collected rarely (expensive)             │
└──────────────────────────────────────────┘
Object Promotion Path:
Gen 0 → survives collection → Gen 1
Gen 1 → survives collection → Gen 2
⚠️ Performance Impact: Gen 2 collections are expensive because they scan the entire heap. Avoid creating many long-lived objects unnecessarily.
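The promotion path can be observed at runtime with `GC.GetGeneration`. A minimal sketch; the exact numbers printed can vary with runtime version and GC mode, so treat it as illustrative:

```csharp
using System;

class PromotionDemo
{
    static void Main()
    {
        var survivor = new object();
        Console.WriteLine(GC.GetGeneration(survivor)); // fresh allocations start in Gen 0

        GC.Collect(); // force a collection; survivor is still referenced, so it is promoted
        Console.WriteLine(GC.GetGeneration(survivor)); // typically 1 now

        GC.Collect();
        Console.WriteLine(GC.GetGeneration(survivor)); // typically 2 after surviving twice
    }
}
```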
GC Modes:
| Mode | Use Case | Behavior |
|---|---|---|
| Workstation GC | Desktop apps, client apps | Lower latency, less throughput |
| Server GC | Web servers, high-throughput | Higher throughput, more memory |
| Concurrent (background) GC | Responsive UI apps | Full collections run on a background thread |
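These modes are opt-in per project. A sketch of the MSBuild properties that select them in the `.csproj` (ASP.NET Core project templates typically enable Server GC already):

```xml
<!-- .csproj fragment: choose Server GC with background (concurrent) collections -->
<PropertyGroup>
  <ServerGarbageCollection>true</ServerGarbageCollection>
  <ConcurrentGarbageCollection>true</ConcurrentGarbageCollection>
</PropertyGroup>
```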
Memory Allocation Patterns 📈
Allocation Rate is one of the most critical performance metrics:
// ❌ BAD: High allocation rate
public string BuildMessage()
{
string result = "";
for (int i = 0; i < 1000; i++)
{
result += i.ToString() + ","; // Creates 1000+ string objects!
}
return result;
}
// ✅ GOOD: Low allocation rate
public string BuildMessage()
{
var sb = new StringBuilder(5000); // Pre-allocate capacity
for (int i = 0; i < 1000; i++)
{
sb.Append(i);
sb.Append(',');
}
return sb.ToString();
}
💡 Pro Tip: Use object pooling for frequently allocated/deallocated objects:
// Using ArrayPool to avoid allocations
var pool = ArrayPool<byte>.Shared;
byte[] buffer = pool.Rent(1024); // Rent from pool
try
{
// Use buffer...
}
finally
{
pool.Return(buffer); // Return to pool
}
Structs vs Classes: Performance Trade-offs ⚖️
When to use structs:
- Small data structures (≤16 bytes recommended)
- Immutable data
- Frequently created, short-lived objects
- Need value semantics
When to use classes:
- Large objects (>16 bytes)
- Need inheritance
- Need reference semantics
- Long-lived objects
// ✅ GOOD struct usage - small, immutable
public readonly struct Point
{
public readonly int X;
public readonly int Y;
public Point(int x, int y)
{
X = x;
Y = y;
}
}
// ❌ BAD struct usage - too large, mutable
public struct HugeDataStructure // Don't do this!
{
public double Field1;
public double Field2;
// ... 20 more fields
public double Field20;
}
⚠️ Warning: Passing large structs by value copies all the data. Use `ref` or `in` parameters:
public void ProcessPoint(in Point p) // Pass by reference, no copying
{
Console.WriteLine($"X: {p.X}, Y: {p.Y}");
}
Span&lt;T&gt; and Memory&lt;T&gt; 🚀
Introduced in C# 7.2, Span<T> and Memory<T> enable zero-allocation operations on contiguous memory:
// Traditional approach - allocates substring
string text = "Hello, World!";
string substring = text.Substring(7, 5); // Allocates new string
// Span approach - zero allocations
ReadOnlySpan<char> span = text.AsSpan();
ReadOnlySpan<char> slice = span.Slice(7, 5); // No allocation!
Key Benefits:
- Stack-allocated or wraps existing memory
- No heap allocations for slicing
- Unified API for arrays, strings, stack memory
- Bounds checking with near-zero overhead
// Working with stack memory
Span<int> numbers = stackalloc int[100]; // Stack allocation!
for (int i = 0; i < numbers.Length; i++)
{
numbers[i] = i * i;
}
💡 Performance Insight: Span&lt;T&gt; is a ref struct, meaning it can only live on the stack. This restriction enables the compiler to make aggressive optimizations.
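Spans also make allocation-free parsing practical, since `int.Parse` accepts a `ReadOnlySpan<char>`. A small sketch that sums a comma-separated list without creating a single substring:

```csharp
using System;

class SpanParsing
{
    static void Main()
    {
        ReadOnlySpan<char> line = "42,99,7".AsSpan();
        int sum = 0;
        while (!line.IsEmpty)
        {
            int comma = line.IndexOf(',');
            // Slice out the next field without allocating a substring
            ReadOnlySpan<char> field = comma >= 0 ? line.Slice(0, comma) : line;
            sum += int.Parse(field); // int.Parse(ReadOnlySpan<char>) - no string created
            line = comma >= 0 ? line.Slice(comma + 1) : ReadOnlySpan<char>.Empty;
        }
        Console.WriteLine(sum); // 148
    }
}
```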
Avoiding Boxing 📦
Boxing occurs when a value type is converted to object or an interface type, forcing a heap allocation:
// ❌ Boxing occurs here
int number = 42;
object boxed = number; // Heap allocation!
Console.WriteLine(boxed); // Boxing in WriteLine
// ✅ Avoid boxing with generics
public void Process<T>(T value) where T : struct
{
// No boxing - T stays as value type
}
// ❌ Boxing with collections
var arrayList = new ArrayList(); // Non-generic
arrayList.Add(42); // Boxes int to object
// ✅ No boxing with generics
var intList = new List<int>(); // Generic
intList.Add(42); // No boxing
📦 Did you know? Each boxing operation allocates 12-24 bytes on the heap (depending on platform), plus the value size. In a tight loop, this can create millions of allocations!
LINQ Performance Considerations 📊
LINQ is powerful but can have hidden costs:
// ❌ Multiple enumerations
var data = GetExpensiveData().Where(x => x.IsActive);
int count = data.Count(); // Enumerates once
var first = data.FirstOrDefault(); // Enumerates again!
// ✅ Materialize once
var data = GetExpensiveData()
.Where(x => x.IsActive)
.ToList(); // Single enumeration
int count = data.Count;
var first = data.FirstOrDefault();
// ❌ Allocation-heavy
var result = numbers
.Where(x => x > 0) // Allocates enumerator
.Select(x => x * 2) // Allocates enumerator
.ToList(); // Allocates list
// ✅ More efficient for large collections
var result = new List<int>(numbers.Count);
for (int i = 0; i < numbers.Count; i++)
{
if (numbers[i] > 0)
result.Add(numbers[i] * 2);
}
💡 LINQ Optimization Tips:
- Use `Any()` instead of `Count() > 0`
- Use `for` loops for performance-critical code
- Avoid `ToList()` unless you need materialization
- Consider LINQ2DB or Dapper for database queries (faster than EF Core for reads)
Detailed Examples 💡
Example 1: String Concatenation Performance
Let's compare different string building approaches:
public class StringBenchmark
{
private const int Iterations = 10000;
// ❌ WORST: ~500ms, creates tens of thousands of intermediate strings
public string ConcatenationOperator()
{
string result = "";
for (int i = 0; i < Iterations; i++)
{
result += "item" + i; // Creates new string each iteration
}
return result;
}
// ✅ BETTER: ~15ms, single object
public string StringBuilderApproach()
{
var sb = new StringBuilder(Iterations * 10);
for (int i = 0; i < Iterations; i++)
{
sb.Append("item");
sb.Append(i);
}
return sb.ToString();
}
// ✅ BEST: ~8ms, zero-allocation for building
public string SpanApproach()
{
Span<char> buffer = stackalloc char[100];
var sb = new StringBuilder(Iterations * 10);
for (int i = 0; i < Iterations; i++)
{
sb.Append("item");
if (i.TryFormat(buffer, out int written))
{
sb.Append(buffer.Slice(0, written));
}
}
return sb.ToString();
}
}
📊 Performance Comparison:
| Method | Time | Allocations | Gen 0 | Gen 1 | Gen 2 |
|---|---|---|---|---|---|
| + operator | 500ms | ~400MB | 50000 | 2000 | 10 |
| StringBuilder | 15ms | ~1MB | 100 | 1 | 0 |
| Span + StringBuilder | 8ms | ~0.5MB | 50 | 0 | 0 |
Why such a huge difference? String immutability means s1 + s2 creates a new string object. With 10,000 iterations, that's 10,000 allocations!
Example 2: Collection Pre-allocation
Knowing the size beforehand dramatically improves performance:
public class CollectionBenchmark
{
// ❌ BAD: List resizes multiple times
public List<int> WithoutCapacity(int count)
{
var list = new List<int>(); // Default capacity: 0
for (int i = 0; i < count; i++)
{
list.Add(i); // Resizes at 0→4→8→16→32→64...
}
return list;
// Resizes ~17 times for 100,000 items
}
// ✅ GOOD: Single allocation
public List<int> WithCapacity(int count)
{
var list = new List<int>(count); // Pre-allocate
for (int i = 0; i < count; i++)
{
list.Add(i); // No resizing!
}
return list;
}
// ✅ BEST: When you know the exact size
public int[] UseArray(int count)
{
var array = new int[count]; // Single allocation, no overhead
for (int i = 0; i < count; i++)
{
array[i] = i;
}
return array;
}
}
🔍 Under the Hood: When a List&lt;T&gt; grows, it:
- Allocates a new array (2× current size)
- Copies all existing elements
- Discards the old array (becomes garbage)
LIST GROWTH WITHOUT PRE-ALLOCATION
Add item 0: [0]                    ← Allocate 4
Add item 1: [0,1]
Add item 2: [0,1,2]
Add item 3: [0,1,2,3]
Add item 4: [0,1,2,3,4,_,_,_]      ← Resize! Allocate 8, copy 4
Add item 5: [0,1,2,3,4,5,_,_]
...
Add item 8: [0,1,2,3,4,5,6,7,8...] ← Resize! Allocate 16, copy 8

Each resize = allocation + copy + garbage
Example 3: Avoiding Closures in Loops
Closures can create unexpected allocations:
public class ClosureBenchmark
{
// ❌ BAD: Captures the shared loop variable (every action prints 1000!)
public List<Action> CreateActionsClosure()
{
var actions = new List<Action>();
for (int i = 0; i < 1000; i++)
{
actions.Add(() => Console.WriteLine(i)); // Closure!
// Allocates a delegate per iteration; all share ONE captured variable
}
return actions;
}
// ✅ BETTER: Capture a local copy
public List<Action> CreateActionsLocalCopy()
{
var actions = new List<Action>(1000);
for (int i = 0; i < 1000; i++)
{
int local = i; // Local copy
actions.Add(() => Console.WriteLine(local));
}
return actions;
}
// ✅ BEST: No closure needed
public List<Action<int>> CreateActionsWithParameter()
{
var actions = new List<Action<int>>(1000);
Action<int> action = i => Console.WriteLine(i);
for (int i = 0; i < 1000; i++)
{
actions.Add(action); // Reuse same delegate
}
return actions;
}
}
🐛 Closure Pitfall: The infamous "loop variable capture" bug:
var actions = new List<Action>();
for (int i = 0; i < 3; i++)
{
actions.Add(() => Console.WriteLine(i));
}
foreach (var action in actions)
action(); // Prints: 3, 3, 3 (all reference same variable!)
Example 4: ValueTask for Async Performance
For frequently-called async methods that often complete synchronously:
public class AsyncBenchmark
{
private Dictionary<int, string> _cache = new();
// ❌ Allocates a Task even when cached (synchronous path)
public async Task<string> GetDataTaskAsync(int id)
{
if (_cache.TryGetValue(id, out var cached))
return cached; // Still allocates Task!
var data = await FetchFromDatabaseAsync(id);
_cache[id] = data;
return data;
}
// ✅ Zero allocation for the synchronous path
public ValueTask<string> GetDataValueTaskAsync(int id)
{
if (_cache.TryGetValue(id, out var cached))
return new ValueTask<string>(cached); // No heap allocation!
return new ValueTask<string>(FetchAndCacheAsync(id));
}
private async Task<string> FetchAndCacheAsync(int id)
{
var data = await FetchFromDatabaseAsync(id);
_cache[id] = data;
return data;
}
private Task<string> FetchFromDatabaseAsync(int id)
=> Task.FromResult($"Data_{id}");
}
💡 When to use ValueTask:
- Method has a synchronous fast path (caching, validation)
- Called very frequently (hot path)
- Performance-critical scenarios
⚠️ ValueTask limitations:
- Can only be awaited once
- Cannot be used with `Task.WhenAll` directly
- More complex to use correctly
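When you do need combinators like `Task.WhenAll`, bridge with `AsTask()`, which may allocate and so gives back part of the benefit. A sketch against the `GetDataValueTaskAsync` method shown earlier:

```csharp
// bench is an instance of the AsyncBenchmark class from the example above
var bench = new AsyncBenchmark();

ValueTask<string> v1 = bench.GetDataValueTaskAsync(1);
ValueTask<string> v2 = bench.GetDataValueTaskAsync(2);

// ValueTask<T> has no WhenAll of its own; AsTask() converts to the Task-based API
string[] results = await Task.WhenAll(v1.AsTask(), v2.AsTask());
```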
Common Mistakes ⚠️
1. Forgetting to Dispose IDisposable Resources
// ❌ WRONG: Resource leak
public void ProcessFile(string path)
{
var stream = File.OpenRead(path);
// ... process stream ...
// Stream never disposed - handle leaks!
}
// ✅ RIGHT: Using statement ensures disposal
public void ProcessFile(string path)
{
using var stream = File.OpenRead(path);
// ... process stream ...
} // Automatically disposed here
💡 C# 8+ using declarations (without braces) dispose at the end of the enclosing scope, making code cleaner.
2. Creating Unnecessary Large Object Heap (LOH) Allocations
Objects ≥85,000 bytes go to the Large Object Heap, which:
- Isn't compacted (causes fragmentation)
- Only collected during Gen 2 GC
- Can cause performance issues
// ❌ BAD: Repeated LOH allocations
public void ProcessBatches()
{
for (int i = 0; i < 100; i++)
{
byte[] buffer = new byte[100_000]; // LOH allocation each time!
ProcessBatch(buffer);
}
}
// ✅ GOOD: Reuse the buffer or use ArrayPool
public void ProcessBatches()
{
var pool = ArrayPool<byte>.Shared;
byte[] buffer = pool.Rent(100_000);
try
{
for (int i = 0; i < 100; i++)
{
ProcessBatch(buffer);
}
}
finally
{
pool.Return(buffer);
}
}
3. Using Finalizers Incorrectly
// ❌ WRONG: Finalizer keeps object alive longer
public class BadResource
{
~BadResource() // Finalizer
{
// Cleanup...
}
}
// Objects with finalizers promoted to Gen 1,
// then finalized, THEN collected (slower!)
// ✅ RIGHT: Implement the IDisposable pattern
public class GoodResource : IDisposable
{
private bool _disposed;
public void Dispose()
{
if (_disposed) return;
// Cleanup...
_disposed = true;
GC.SuppressFinalize(this); // Don't run finalizer
}
}
4. String Comparison Performance
// ❌ SLOW: Case-insensitive comparison using ToLower()
if (str1.ToLower() == str2.ToLower()) // Creates 2 new strings!
{
// ...
}
// ✅ FAST: Use a StringComparison parameter
if (string.Equals(str1, str2, StringComparison.OrdinalIgnoreCase))
{
// ... zero allocations!
}
5. LINQ Deferred Execution Gotchas
// ❌ DANGEROUS: Query executed multiple times
var activeUsers = users.Where(u => u.IsActive);
var count = activeUsers.Count(); // Execute query
var first = activeUsers.First(); // Execute again!
var last = activeUsers.Last(); // Execute third time!
// ✅ SAFE: Materialize once
var activeUsers = users.Where(u => u.IsActive).ToList();
var count = activeUsers.Count; // Property, no execution
var first = activeUsers.First();
var last = activeUsers.Last();
6. Ignoring Struct Copying Overhead
// ❌ BAD: Large struct passed by value (copied!)
public struct LargeData
{
public double Data1;
public double Data2;
// ... 20 more fields
}
public void Process(LargeData data) // Copies 160+ bytes!
{
// ...
}
// ✅ GOOD: Pass by reference
public void Process(in LargeData data) // No copy, readonly reference
{
// ...
}
Key Takeaways 🎯
📋 Performance & Memory Quick Reference
| Concept | Best Practice |
|---|---|
| Memory | Stack for value types (fast), Heap for reference types (GC managed) |
| Collections | Pre-allocate capacity when size is known |
| Strings | Use `StringBuilder` for concatenation, `Span<char>` for slicing |
| Value Types | Keep small (≤16 bytes), immutable, pass with `in` if large |
| Boxing | Use generics to avoid boxing value types |
| LINQ | Materialize with ToList() if enumerating multiple times |
| Async | Use `ValueTask<T>` for hot paths that often complete synchronously |
| Pooling | Use ArrayPool for temporary buffers |
| LOH | Avoid repeated allocations ≥85KB, use pooling instead |
| IDisposable | Always dispose resources, use 'using' statement |
🧠 Memory Mnemonic: "SALT"
- Stack for Small values
- Allocate with purpose (pre-size collections)
- Limit boxing (use generics)
- Take advantage of Span&lt;T&gt; and pooling
🔧 Profiling Tools:
- dotMemory (JetBrains) - Memory profiling
- PerfView (Microsoft) - CPU and memory analysis
- BenchmarkDotNet - Micro-benchmarking library
- Visual Studio Profiler - Built-in diagnostics
- dotTrace (JetBrains) - Performance profiling
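A minimal BenchmarkDotNet sketch (assumes the BenchmarkDotNet NuGet package is installed; the method bodies are illustrative):

```csharp
using System.Text;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

[MemoryDiagnoser] // adds Allocated and Gen 0/1/2 columns to the report
public class StringConcatBench
{
    [Benchmark(Baseline = true)]
    public string WithOperator()
    {
        string result = "";
        for (int i = 0; i < 1000; i++) result += i;
        return result;
    }

    [Benchmark]
    public string WithBuilder()
    {
        var sb = new StringBuilder(4000);
        for (int i = 0; i < 1000; i++) sb.Append(i);
        return sb.ToString();
    }
}

public class Program
{
    public static void Main() => BenchmarkRunner.Run<StringConcatBench>();
}
```

Run it in Release mode (`dotnet run -c Release`); BenchmarkDotNet refuses to produce meaningful numbers from Debug builds.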
🤔 Did You Know? The .NET GC can collect 100,000+ objects in under 1ms for Gen 0 collections. The key to performance isn't avoiding GC; it's avoiding Gen 2 collections and keeping allocation rates low.
📚 Further Study
- Microsoft Docs - Memory Management: https://docs.microsoft.com/en-us/dotnet/standard/garbage-collection/
- Performance Best Practices in C#: https://docs.microsoft.com/en-us/dotnet/csharp/advanced-topics/performance/
- High Performance C# by Ben Watson: https://www.writinghighperf.net/
π Practice Challenge: Take an existing C# project and run a memory profiler. Identify the top 3 allocation sources and optimize them using techniques from this lesson. Measure before and after performance!