
Performance & Memory

Optimize code with structs, spans, ref semantics, and low-level constructs

Performance and Memory Optimization in C#

Master C# performance optimization with free flashcards and spaced repetition practice. This lesson covers memory management fundamentals, garbage collection strategies, value vs. reference types, and performance profiling techniques: essential concepts for building high-performance .NET applications.

Welcome to Performance & Memory 💻

Performance optimization in C# isn't just about writing faster code; it's about understanding how the .NET runtime manages memory, how the garbage collector works, and how to make informed architectural decisions. Whether you're building web APIs, desktop applications, or microservices, knowing these principles will help you create applications that scale efficiently and use resources wisely.

Core Concepts 🧠

Understanding the Stack and Heap 📊

C# uses two primary memory regions: the stack and the heap. Understanding their differences is fundamental to writing performant code.

| Feature      | Stack 📚                | Heap 🏔️               |
|--------------|-------------------------|------------------------|
| Speed        | Very fast (LIFO access) | Slower (requires GC)   |
| Size         | Small (~1MB default)    | Large (limited by RAM) |
| Lifetime     | Method scope            | Until GC collects      |
| Storage      | Value types, references | Reference type objects |
| Allocation   | Automatic               | Via new keyword        |
| Deallocation | Automatic (stack pop)   | Garbage Collector      |

💡 Key Insight: When you declare int x = 5;, the value lives on the stack. When you declare var person = new Person();, the reference lives on the stack, but the Person object itself lives on the heap.

STACK vs HEAP VISUALIZATION

    STACK                      HEAP
┌──────────────┐      ┌────────────────────┐
│ int x = 5    │      │                    │
├──────────────┤      │  Person object     │
│ ref ─────────┼─────→│  { Name="Alice" }  │
├──────────────┤      │                    │
│ int y = 10   │      ├────────────────────┤
├──────────────┤      │  List<int> object  │
│ ref ─────────┼─────→│  { Count = 50 }    │
└──────────────┘      └────────────────────┘
   Fast                   Needs GC
   Small                  Large
   Auto-cleaned           Managed

Value Types vs Reference Types ⚖️

This distinction affects both performance and behavior:

Value Types (stored on stack when local):

  • Primitive types: int, double, bool, char
  • Structs: DateTime, Guid, custom structs
  • Enums
  • Tuples: (int, string)

Reference Types (stored on heap):

  • Classes
  • Strings (special case: immutable)
  • Arrays
  • Delegates
  • Records (reference types by default; record struct creates a value type)
// Value type behavior - COPY
int a = 10;
int b = a;  // b gets a COPY of the value
b = 20;     // a is still 10

// Reference type behavior - REFERENCE
var list1 = new List<int> { 1, 2, 3 };
var list2 = list1;  // list2 references SAME object
list2.Add(4);       // list1 also has 4 now!

🧠 Memory Tip: "VCSR" - Value types Copy, Structures Stack (when local), Reference types Reference the same object.

The Garbage Collector (GC) 🗑️

The .NET garbage collector automatically manages heap memory, but understanding how it works helps you write more efficient code.

Generational Garbage Collection:

GARBAGE COLLECTOR GENERATIONS

┌────────────────────────────────────────┐
│  Gen 0 (Young objects)                 │
│  🆕🆕🆕🆕 Most objects die here         │
│  Collected frequently (~few ms)        │
├────────────────────────────────────────┤
│  Gen 1 (Medium-lived objects)          │
│  ⏱️⏱️ Buffer between Gen 0 and 2       │
│  Collected less frequently             │
├────────────────────────────────────────┤
│  Gen 2 (Long-lived objects)            │
│  🏛️🏛️ Static data, caches, singletons │
│  Collected rarely (expensive)          │
└────────────────────────────────────────┘

    Object Promotion Path:
    Gen 0 → survives collection → Gen 1
    Gen 1 → survives collection → Gen 2

⚠️ Performance Impact: Gen 2 collections are expensive because they scan the entire heap. Avoid creating many long-lived objects unnecessarily.
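You can watch promotion happen with GC.GetGeneration. A minimal sketch (exact generation numbers can vary by runtime version and GC mode, and forcing collections like this is for diagnostics only, never production code):

using System;

class GenerationDemo
{
    static void Main()
    {
        var obj = new object();
        Console.WriteLine(GC.GetGeneration(obj));  // typically 0: freshly allocated

        GC.Collect();                              // obj survives (still referenced)...
        Console.WriteLine(GC.GetGeneration(obj));  // ...so it is promoted, typically to 1

        GC.Collect();
        Console.WriteLine(GC.GetGeneration(obj));  // promoted again, typically to 2
    }
}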

GC Modes:

| Mode           | Use Case                     | Behavior                       |
|----------------|------------------------------|--------------------------------|
| Workstation GC | Desktop apps, client apps    | Lower latency, less throughput |
| Server GC      | Web servers, high-throughput | Higher throughput, more memory |
| Concurrent GC  | Responsive UI apps           | GC runs on background thread   |
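The mode is normally chosen through project or runtime configuration rather than code. A sketch using the standard MSBuild properties (ASP.NET Core apps already default to Server GC):

<!-- .csproj fragment: opt into Server GC with background (concurrent) collections -->
<PropertyGroup>
  <ServerGarbageCollection>true</ServerGarbageCollection>
  <ConcurrentGarbageCollection>true</ConcurrentGarbageCollection>
</PropertyGroup>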

Memory Allocation Patterns 📈

Allocation Rate is one of the most critical performance metrics:

// ❌ BAD: High allocation rate
public string BuildMessage()
{
    string result = "";
    for (int i = 0; i < 1000; i++)
    {
        result += i.ToString() + ",";  // Creates 1000+ string objects!
    }
    return result;
}

// ✅ GOOD: Low allocation rate
public string BuildMessage()
{
    var sb = new StringBuilder(5000);  // Pre-allocate capacity
    for (int i = 0; i < 1000; i++)
    {
        sb.Append(i);
        sb.Append(',');
    }
    return sb.ToString();
}

💡 Pro Tip: Use object pooling for frequently allocated/deallocated objects:

// Using ArrayPool to avoid allocations
var pool = ArrayPool<byte>.Shared;
byte[] buffer = pool.Rent(1024);  // Rent from pool
try
{
    // Use buffer...
}
finally
{
    pool.Return(buffer);  // Return to pool
}

Structs vs Classes: Performance Trade-offs ⚖️

When to use structs:

  • Small data structures (≤16 bytes recommended)
  • Immutable data
  • Frequently created, short-lived objects
  • Need value semantics

When to use classes:

  • Large objects (>16 bytes)
  • Need inheritance
  • Need reference semantics
  • Long-lived objects
// ✅ GOOD struct usage - small, immutable
public readonly struct Point
{
    public readonly int X;
    public readonly int Y;
    
    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }
}

// ❌ BAD struct usage - too large, mutable
public struct HugeDataStructure  // Don't do this!
{
    public double Field1;
    public double Field2;
    // ... 20 more fields
    public double Field20;
}

⚠️ Warning: Passing large structs by value copies all the data. Use ref or in parameters:

public void ProcessPoint(in Point p)  // Pass by reference, no copying
{
    Console.WriteLine($"X: {p.X}, Y: {p.Y}");
}

Span<T> and Memory<T> 🚀

Introduced in C# 7.2, Span<T> and Memory<T> enable zero-allocation operations on contiguous memory:

// Traditional approach - allocates substring
string text = "Hello, World!";
string substring = text.Substring(7, 5);  // Allocates new string

// Span approach - zero allocations
ReadOnlySpan<char> span = text.AsSpan();
ReadOnlySpan<char> slice = span.Slice(7, 5);  // No allocation!

Key Benefits:

  • Stack-allocated or wraps existing memory
  • No heap allocations for slicing
  • Unified API for arrays, strings, stack memory
  • Bounds checking with near-zero overhead
// Working with stack memory
Span<int> numbers = stackalloc int[100];  // Stack allocation!
for (int i = 0; i < numbers.Length; i++)
{
    numbers[i] = i * i;
}

💡 Performance Insight: Span<T> is a ref struct, meaning it can only live on the stack. This restriction enables the compiler to make aggressive optimizations.

Avoiding Boxing 📦

Boxing occurs when a value type is converted to object or an interface type, forcing a heap allocation:

// ❌ Boxing occurs here
int number = 42;
object boxed = number;        // Heap allocation!
Console.WriteLine(boxed);     // Boxing in WriteLine

// ✅ Avoid boxing with generics
public void Process<T>(T value) where T : struct
{
    // No boxing - T stays as value type
}

// ❌ Boxing with non-generic collections
var arrayList = new ArrayList();   // Non-generic (System.Collections)
arrayList.Add(42);                 // Boxes int to object

// ✅ No boxing with generics
var genericList = new List<int>(); // Generic
genericList.Add(42);               // No boxing

πŸ” Did you know? Each boxing operation allocates 12-24 bytes on the heap (depending on platform), plus the value size. In a tight loop, this can create millions of allocations!

LINQ Performance Considerations 🔗

LINQ is powerful but can have hidden costs:

// ❌ Multiple enumerations
var data = GetExpensiveData().Where(x => x.IsActive);
int count = data.Count();           // Enumerates once
var first = data.FirstOrDefault();  // Enumerates again!

// ✅ Materialize once
var data = GetExpensiveData()
    .Where(x => x.IsActive)
    .ToList();  // Single enumeration
int count = data.Count;
var first = data.FirstOrDefault();

// ❌ Allocation-heavy
var result = numbers
    .Where(x => x > 0)      // Allocates enumerator
    .Select(x => x * 2)     // Allocates enumerator
    .ToList();              // Allocates list

// ✅ More efficient for large collections
var result = new List<int>(numbers.Count);
for (int i = 0; i < numbers.Count; i++)
{
    if (numbers[i] > 0)
        result.Add(numbers[i] * 2);
}

💡 LINQ Optimization Tips:

  • Use Any() instead of Count() > 0
  • Use for loops for performance-critical code
  • Avoid ToList() unless you need materialization
  • Consider LINQ2DB or Dapper for database queries (faster than EF Core for reads)
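The Any() vs Count() tip in practice: Any() short-circuits at the first match, while Count() must walk the entire sequence. A small sketch (the Order type is a hypothetical stand-in):

using System.Collections.Generic;
using System.Linq;

public record Order(bool IsPending);  // hypothetical stand-in type

public static class OrderChecks
{
    // ❌ Enumerates every order just to compare the total with 0
    public static bool HasPendingSlow(IEnumerable<Order> orders)
        => orders.Count(o => o.IsPending) > 0;

    // ✅ Stops at the first pending order
    public static bool HasPendingFast(IEnumerable<Order> orders)
        => orders.Any(o => o.IsPending);
}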

Detailed Examples 💡

Example 1: String Concatenation Performance

Let's compare different string building approaches:

public class StringBenchmark
{
    private const int Iterations = 10000;
    
    // ❌ WORST: ~500ms, creates tens of thousands of ever-larger strings
    public string ConcatenationOperator()
    {
        string result = "";
        for (int i = 0; i < Iterations; i++)
        {
            result += "item" + i;  // Creates new string each iteration
        }
        return result;
    }
    
    // ✅ BETTER: ~15ms, single object
    public string StringBuilderApproach()
    {
        var sb = new StringBuilder(Iterations * 10);
        for (int i = 0; i < Iterations; i++)
        {
            sb.Append("item");
            sb.Append(i);
        }
        return sb.ToString();
    }
    
    // ✅ BEST: ~8ms, no intermediate string allocations for the numbers
    public string SpanApproach()
    {
        Span<char> buffer = stackalloc char[100];
        var sb = new StringBuilder(Iterations * 10);
        
        for (int i = 0; i < Iterations; i++)
        {
            sb.Append("item");
            if (i.TryFormat(buffer, out int written))
            {
                sb.Append(buffer.Slice(0, written));
            }
        }
        return sb.ToString();
    }
}

📊 Performance Comparison:

| Method               | Time  | Allocations | Gen 0 | Gen 1 | Gen 2 |
|----------------------|-------|-------------|-------|-------|-------|
| + operator           | 500ms | ~400MB      | 50000 | 200   | 10    |
| StringBuilder        | 15ms  | ~1MB        | 100   | 1     | 0     |
| Span + StringBuilder | 8ms   | ~0.5MB      | 50    | 0     | 0     |

Why such a huge difference? String immutability means s1 + s2 creates a new string object. With 10,000 iterations, that's tens of thousands of allocations, each one copying every character accumulated so far!
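Numbers like those above come from a benchmark harness. A sketch using BenchmarkDotNet (a NuGet package; the class and method names here are illustrative, and the [MemoryDiagnoser] attribute is what produces the allocation and Gen 0/1/2 columns):

using System.Text;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

[MemoryDiagnoser]  // adds Allocated and Gen 0/1/2 columns to the results
public class StringBuildBench
{
    [Benchmark(Baseline = true)]
    public string Concat()
    {
        string s = "";
        for (int i = 0; i < 1000; i++) s += i;
        return s;
    }

    [Benchmark]
    public string Builder()
    {
        var sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) sb.Append(i);
        return sb.ToString();
    }
}

// Run in Release mode, outside the debugger:
// BenchmarkRunner.Run<StringBuildBench>();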

Example 2: Collection Pre-allocation

Knowing the size beforehand dramatically improves performance:

public class CollectionBenchmark
{
    // ❌ BAD: List resizes multiple times
    public List<int> WithoutCapacity(int count)
    {
        var list = new List<int>();  // Default capacity: 0
        for (int i = 0; i < count; i++)
        {
            list.Add(i);  // Resizes at 0→4→8→16→32→64...
        }
        return list;
        // Reallocates ~16 times for 100,000 items
    }
    
    // ✅ GOOD: Single allocation
    public List<int> WithCapacity(int count)
    {
        var list = new List<int>(count);  // Pre-allocate
        for (int i = 0; i < count; i++)
        {
            list.Add(i);  // No resizing!
        }
        return list;
    }
    
    // ✅ BEST: When you know exact size
    public int[] UseArray(int count)
    {
        var array = new int[count];  // Single allocation, no overhead
        for (int i = 0; i < count; i++)
        {
            array[i] = i;
        }
        return array;
    }
}

πŸ” Under the Hood: When a List<T> grows, it:

  1. Allocates a new array (2Γ— current size)
  2. Copies all existing elements
  3. Discards the old array (becomes garbage)
LIST GROWTH WITHOUT PRE-ALLOCATION

Add item 0:  [0]                     ← Allocate 4
Add item 1:  [0,1]  
Add item 2:  [0,1,2]  
Add item 3:  [0,1,2,3]  
Add item 4:  [0,1,2,3,4,_,_,_]       ← Resize! Allocate 8, copy 4
Add item 5:  [0,1,2,3,4,5,_,_]  
...          ...
Add item 8:  [0,1,2,3,4,5,6,7,8...]  ← Resize! Allocate 16, copy 8

Each resize = allocation + copy + garbage
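You can observe this doubling directly through List<T>.Capacity. A small sketch (the 4-8-16-32 progression is current implementation behavior, not a documented contract):

using System;
using System.Collections.Generic;

class CapacityDemo
{
    static void Main()
    {
        var list = new List<int>();        // Capacity starts at 0
        int lastCapacity = list.Capacity;

        for (int i = 0; i < 40; i++)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity)  // a resize just happened
            {
                Console.WriteLine($"Count={list.Count,2}  Capacity={list.Capacity}");
                lastCapacity = list.Capacity;
            }
        }
        // Prints capacity jumps at 4, 8, 16, 32, 64
    }
}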

Example 3: Avoiding Closures in Loops

Closures can create unexpected allocations:

public class ClosureBenchmark
{
    // ❌ BAD: Captures the shared loop variable
    public List<Action> CreateActionsClosure()
    {
        var actions = new List<Action>();
        for (int i = 0; i < 1000; i++)
        {
            actions.Add(() => Console.WriteLine(i));  // Closure!
            // Allocates a delegate per iteration; all 1000 actions
            // share ONE captured i, so each would print 1000 when invoked
        }
        return actions;
    }
    
    // ✅ BETTER: Capture local copy
    public List<Action> CreateActionsLocalCopy()
    {
        var actions = new List<Action>(1000);
        for (int i = 0; i < 1000; i++)
        {
            int local = i;  // Local copy
            actions.Add(() => Console.WriteLine(local));
        }
        return actions;
    }
    
    // ✅ BEST: No closure needed
    public List<Action<int>> CreateActionsWithParameter()
    {
        var actions = new List<Action<int>>(1000);
        Action<int> action = i => Console.WriteLine(i);
        for (int i = 0; i < 1000; i++)
        {
            actions.Add(action);  // Reuse same delegate
        }
        return actions;
    }
}

🧠 Closure Pitfall: The infamous "loop variable capture" bug:

var actions = new List<Action>();
for (int i = 0; i < 3; i++)
{
    actions.Add(() => Console.WriteLine(i));
}
foreach (var action in actions)
    action();  // Prints: 3, 3, 3 (all reference same variable!)

Example 4: ValueTask for Async Performance

For frequently-called async methods that often complete synchronously:

public class AsyncBenchmark
{
    private Dictionary<int, string> _cache = new();
    
    // ❌ Allocates Task even when cached (synchronous path)
    public async Task<string> GetDataTaskAsync(int id)
    {
        if (_cache.TryGetValue(id, out var cached))
            return cached;  // Still allocates Task!
        
        var data = await FetchFromDatabaseAsync(id);
        _cache[id] = data;
        return data;
    }
    
    // ✅ Zero allocation for synchronous path
    public ValueTask<string> GetDataValueTaskAsync(int id)
    {
        if (_cache.TryGetValue(id, out var cached))
            return new ValueTask<string>(cached);  // No heap allocation!
        
        return new ValueTask<string>(FetchAndCacheAsync(id));
    }
    
    private async Task<string> FetchAndCacheAsync(int id)
    {
        var data = await FetchFromDatabaseAsync(id);
        _cache[id] = data;
        return data;
    }
    
    private Task<string> FetchFromDatabaseAsync(int id) 
        => Task.FromResult($"Data_{id}");
}

💡 When to use ValueTask:

  • Method has a synchronous fast path (caching, validation)
  • Called very frequently (hot path)
  • Performance-critical scenarios

⚠️ ValueTask limitations:

  • Can only be awaited once
  • Cannot use with Task.WhenAll directly
  • More complex to use correctly
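Working around the Task.WhenAll limitation: convert each ValueTask with AsTask(), which also guarantees each one is consumed exactly once. A sketch reusing GetDataValueTaskAsync from the example above, inside an async method:

// Inside an async method:
ValueTask<string> a = GetDataValueTaskAsync(1);
ValueTask<string> b = GetDataValueTaskAsync(2);

// Task.WhenAll takes Tasks, so bridge with AsTask();
// each ValueTask is awaited exactly once, via its Task wrapper
string[] results = await Task.WhenAll(a.AsTask(), b.AsTask());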

Common Mistakes ⚠️

1. Forgetting to Dispose IDisposable Resources

// ❌ WRONG: Resource leak
public void ProcessFile(string path)
{
    var stream = File.OpenRead(path);
    // ... process stream ...
    // Stream never disposed - handle leaks!
}

// ✅ RIGHT: Using statement ensures disposal
public void ProcessFile(string path)
{
    using var stream = File.OpenRead(path);
    // ... process stream ...
}  // Automatically disposed here

💡 C# 8+ using declarations (without braces) dispose at the end of the enclosing scope, making code cleaner.

2. Creating Unnecessary Large Object Heap (LOH) Allocations

Objects ≥85,000 bytes go to the Large Object Heap, which:

  • Isn't compacted by default (causes fragmentation)
  • Is only collected during Gen 2 GC
  • Can cause performance issues
// ❌ BAD: Repeated LOH allocations
public void ProcessBatches()
{
    for (int i = 0; i < 100; i++)
    {
        byte[] buffer = new byte[100_000];  // LOH allocation each time!
        ProcessBatch(buffer);
    }
}

// ✅ GOOD: Reuse buffer or use ArrayPool
public void ProcessBatches()
{
    var pool = ArrayPool<byte>.Shared;
    byte[] buffer = pool.Rent(100_000);
    try
    {
        for (int i = 0; i < 100; i++)
        {
            ProcessBatch(buffer);
        }
    }
    finally
    {
        pool.Return(buffer);
    }
}

3. Using Finalizers Incorrectly

// ❌ WRONG: Finalizer keeps object alive longer
public class BadResource
{
    ~BadResource()  // Finalizer
    {
        // Cleanup...
    }
}
// Objects with finalizers promoted to Gen 1,
// then finalized, THEN collected (slower!)

// ✅ RIGHT: Implement IDisposable pattern
public class GoodResource : IDisposable
{
    private bool _disposed;
    
    public void Dispose()
    {
        if (_disposed) return;
        // Cleanup...
        _disposed = true;
        GC.SuppressFinalize(this);  // Don't run finalizer
    }
}

4. String Comparison Performance

// ❌ SLOW: Case-insensitive comparison using ToLower()
if (str1.ToLower() == str2.ToLower())  // Creates 2 new strings!
{
    // ...
}

// ✅ FAST: Use StringComparison parameter
if (string.Equals(str1, str2, StringComparison.OrdinalIgnoreCase))
{
    // ... zero allocations!
}

5. LINQ Deferred Execution Gotchas

// ❌ DANGEROUS: Query executed multiple times
var activeUsers = users.Where(u => u.IsActive);
var count = activeUsers.Count();  // Execute query
var first = activeUsers.First();  // Execute again!
var last = activeUsers.Last();    // Execute third time!

// ✅ SAFE: Materialize once
var activeUsers = users.Where(u => u.IsActive).ToList();
var count = activeUsers.Count;  // Property, no execution
var first = activeUsers.First();
var last = activeUsers.Last();

6. Ignoring Struct Copying Overhead

// ❌ BAD: Large struct passed by value (copied!)
public struct LargeData
{
    public double Data1;
    public double Data2;
    // ... 20 more fields
}

public void Process(LargeData data)  // Copies 160+ bytes!
{
    // ...
}

// ✅ GOOD: Pass by reference
public void Process(in LargeData data)  // No copy, readonly reference
{
    // ...
}

Key Takeaways 🎯

📋 Performance & Memory Quick Reference

| Concept     | Best Practice                                                        |
|-------------|----------------------------------------------------------------------|
| Memory      | Stack for value types (fast), heap for reference types (GC managed)  |
| Collections | Pre-allocate capacity when size is known                             |
| Strings     | Use StringBuilder for concatenation, Span<char> for slicing          |
| Value types | Keep small (≤16 bytes), immutable; pass with 'in' if large           |
| Boxing      | Use generics to avoid boxing value types                             |
| LINQ        | Materialize with ToList() if enumerating multiple times              |
| Async       | Use ValueTask for hot paths with fast synchronous completion         |
| Pooling     | Use ArrayPool<T> for temporary buffers                               |
| LOH         | Avoid repeated allocations ≥85KB; use pooling instead                |
| IDisposable | Always dispose resources; use a 'using' statement                    |

🧠 Memory Mnemonic: "SALT"

  • Stack for Small values
  • Allocate with purpose (pre-size collections)
  • Limit boxing (use generics)
  • Take advantage of Span and pooling

πŸ” Profiling Tools:

  • dotMemory (JetBrains) - Memory profiling
  • PerfView (Microsoft) - CPU and memory analysis
  • BenchmarkDotNet - Micro-benchmarking library
  • Visual Studio Profiler - Built-in diagnostics
  • dotTrace (JetBrains) - Performance profiling

🤔 Did You Know? The .NET GC can collect 100,000+ objects in under 1ms for Gen 0 collections. The key to performance isn't avoiding GC; it's avoiding Gen 2 collections and keeping allocation rates low.

📚 Further Study

  1. Microsoft Docs - Memory Management: https://docs.microsoft.com/en-us/dotnet/standard/garbage-collection/
  2. Performance Best Practices in C#: https://docs.microsoft.com/en-us/dotnet/csharp/advanced-topics/performance/
  3. Writing High-Performance .NET Code by Ben Watson: https://www.writinghighperf.net/

🎓 Practice Challenge: Take an existing C# project and run a memory profiler. Identify the top 3 allocation sources and optimize them using techniques from this lesson. Measure before-and-after performance!