Modern Allocation Primitives
Span<T> and Memory<T> for stack-like and heap-safe buffer handling
Modern Allocation Primitives in .NET
Modern .NET memory allocation has evolved far beyond the simple new operator. This lesson covers stack allocation techniques, span-based primitives, and pooling strategies: essential knowledge for building high-performance .NET applications that minimize garbage collection pressure.
Welcome to Advanced Memory Allocation
Welcome to one of the most powerful aspects of modern .NET development! As .NET has matured through versions 5, 6, 7, and 8, Microsoft has introduced sophisticated allocation primitives that allow developers to write code approaching the performance of native languages while maintaining C#'s safety guarantees.
Modern allocation primitives are the low-level building blocks that enable you to control exactly where and how memory is allocated. Instead of always allocating on the managed heap (triggering garbage collection), you can now leverage stack allocation, memory pooling, and zero-copy techniques to dramatically improve performance in hot code paths.
💡 Why does this matter? Every heap allocation creates work for the garbage collector. In high-throughput systems (web servers, game engines, financial trading platforms), millions of allocations per second can cause GC pauses that degrade user experience. Modern primitives help you avoid these allocations entirely.
Core Concepts: The Allocation Hierarchy
The Three Memory Regions
Understanding where memory lives is fundamental to choosing the right allocation primitive:
| Region | Characteristics | When to Use | Performance |
|---|---|---|---|
| Stack | Automatic cleanup, extremely fast, limited size (~1 MB) | Small, short-lived data | ⚡ Fastest |
| Heap (Managed) | GC-managed, effectively unlimited size, allocation overhead | Objects, long-lived data | Slower (GC cost) |
| Unmanaged | Manual management, no GC, interop-friendly | Native interop, large buffers | ⚡ Fast (but risky) |
MEMORY ALLOCATION DECISION TREE
- Need to allocate?
  - Size ≤ 512 B → use stackalloc
  - Size > 512 B → consider the buffer's lifetime:
    - Short (within a single request/method) → use ArrayPool
    - Long or unknown → use the heap (new)
Span<T> and Memory<T>: The Foundation
Span<T>
Key properties of Span<T>:
- Ref struct: Can only live on the stack (cannot be boxed, stored in fields of classes, or used in async methods)
- Zero-allocation: Creating a Span over existing memory doesn't allocate
- Bounds-checked: Safe indexing prevents buffer overruns
- Performance: JIT compiler optimizes Span operations to native performance
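To make these properties concrete, here is a minimal sketch (array contents and sizes are illustrative) showing that a Span<T> is a view over existing memory, not a copy:

int[] numbers = { 1, 2, 3, 4, 5, 6, 7, 8 };

// Slicing produces a new view over the same array; nothing is allocated
Span<int> firstHalf = numbers.AsSpan(0, 4);
firstHalf[0] = 42;                 // Writes through to the underlying array
Console.WriteLine(numbers[0]);     // Prints 42

// The same API works over stack memory
Span<int> scratch = stackalloc int[4];
scratch.Fill(7);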
Memory<T>
- Not a ref struct (can be stored in fields and used in async methods)
- Slightly more overhead than Span<T>
- Can be sliced and passed around without lifetime restrictions
SPAN VS MEMORY COMPARISON

| Span<T> | Memory<T> |
|---|---|
| Stack-only | Heap-friendly |
| Zero overhead | Async-compatible |
| Fastest | Storable in class fields |
| No async/await | Slight overhead |
| No class fields | .Span gets a Span<T> view |
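As a small illustration (the class and member names are made up for this sketch), Memory<T> can be stored in a field, sliced, and converted to a Span<T> only at the moment the bytes are actually touched:

public class MessageBuffer {
    // Memory<T> is an ordinary struct, so unlike Span<T> it can live in a class field
    private readonly Memory<byte> _storage;

    public MessageBuffer(byte[] backingArray) {
        _storage = backingArray;   // byte[] converts implicitly to Memory<byte>
    }

    public void ZeroWindow(int offset, int length) {
        Memory<byte> window = _storage.Slice(offset, length);

        // Drop down to Span<byte> only for the synchronous work
        Span<byte> span = window.Span;
        span.Clear();
    }
}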
Stackalloc: Lightning-Fast Stack Allocation ⚡
stackalloc allocates memory directly on the stack, bypassing the heap and garbage collector entirely. It's one of the most powerful tools for zero-allocation code.
Syntax evolution:
// Old way (unsafe context required)
unsafe {
int* numbers = stackalloc int[10];
}
// Modern way (safe with Span)
Span<int> numbers = stackalloc int[10];
Critical constraints:
- Memory automatically deallocated when method returns
- Limited by stack size (~1MB on Windows, ~8MB on Linux)
- Best for small, temporary buffers (≤ 512 bytes recommended)
- Cannot return stackalloc'd memory from a method
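The last constraint is worth seeing in code. A minimal sketch (method names are illustrative): the compiler rejects returning a stackalloc'd span, so the usual pattern is to let the caller own the buffer and pass it in:

// This will not compile: the stack memory would already be gone in the caller
// Span<int> Broken() {
//     Span<int> local = stackalloc int[8];
//     return local;   // compile-time error: the span refers to stack memory being freed
// }

// Instead, the caller allocates and the callee fills the buffer it was given
static void FillSquares(Span<int> destination) {
    for (int i = 0; i < destination.Length; i++) {
        destination[i] = i * i;
    }
}

Span<int> buffer = stackalloc int[8];
FillSquares(buffer);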
💡 Pro tip: Use stackalloc for temporary buffers in hot loops. A buffer of 128 bytes allocated a million times per second would create 128 MB/sec of garbage without stackalloc; with it, zero garbage!
⚠️ Warning: Stack overflow crashes are unrecoverable! Always validate buffer sizes before using stackalloc, especially with user input.
ArrayPool<T>: Reusable Buffer Management
When you need larger buffers (> 512 bytes) or buffers whose lifetime extends beyond a single method, ArrayPool<T> lets you rent reusable arrays from a pool and return them when you are done, instead of allocating new ones.
How it works:
- Rent an array from the pool (may get larger than requested)
- Use the array
- Return it to the pool (now available for reuse)
Two pool types:
- ArrayPool<T>.Shared: thread-safe global pool (most common)
- ArrayPool<T>.Create(): custom pool with specific size limits (see the sketch below)
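A brief sketch of both pool types (the size limits are illustrative, not recommendations):

using System.Buffers;

// Shared: one process-wide, thread-safe pool
byte[] fromShared = ArrayPool<byte>.Shared.Rent(1000);   // may hand back a larger array, e.g. 1024 bytes
ArrayPool<byte>.Shared.Return(fromShared);

// Create: a private pool with its own limits, handy for isolating a subsystem
ArrayPool<byte> customPool = ArrayPool<byte>.Create(
    maxArrayLength: 16 * 1024,    // largest array size the pool will cache
    maxArraysPerBucket: 8);       // how many arrays of each size to keep

byte[] fromCustom = customPool.Rent(1000);
try {
    // Use fromCustom...
} finally {
    customPool.Return(fromCustom);
}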
ARRAYPOOL LIFECYCLE
1. ArrayPool<byte>.Shared holds buckets of reusable arrays (for example, several 1024-byte and 2048-byte arrays).
2. Rent(1000) hands your code an available array of at least that size (here, a 1024-byte array).
3. Your code uses the array.
4. Return() puts the array back in the pool, where it is available for the next Rent call.
Best practices:
- Always return arrays to the pool (use try-finally)
- Don't hold references after returning
- Clear sensitive data before returning
- The pool may give you a larger array than requested
MemoryPool<T>: Flexible Memory Management
MemoryPool<T> is the Memory<T>-based counterpart to ArrayPool<T>:
- Returns IMemoryOwner<T>, which implements IDisposable
- Better for async scenarios (Memory<T> is async-compatible)
- Integrates with pipelines and stream-based APIs
- Supports custom allocators
using IMemoryOwner<byte> owner = MemoryPool<byte>.Shared.Rent(1024);
Memory<byte> memory = owner.Memory;
// Use memory...
// Automatically returned when disposed
Examples: Real-World Applications
Example 1: High-Performance String Parsing
Problem: Parse thousands of CSV lines per second without creating garbage.
Traditional approach (allocates heavily):
public void ParseCSV(string line) {
string[] parts = line.Split(','); // Allocates string array + strings
int id = int.Parse(parts[0]); // Parses one of the strings Split just allocated
string name = parts[1];
decimal price = decimal.Parse(parts[2]);
// Process...
}
Modern zero-allocation approach:
public void ParseCSV(ReadOnlySpan<char> line) {
// Split without allocation
int firstComma = line.IndexOf(',');
int secondComma = line.Slice(firstComma + 1).IndexOf(',') + firstComma + 1;
// Parse directly from spans (no substring allocation)
ReadOnlySpan<char> idSpan = line.Slice(0, firstComma);
ReadOnlySpan<char> nameSpan = line.Slice(firstComma + 1, secondComma - firstComma - 1);
ReadOnlySpan<char> priceSpan = line.Slice(secondComma + 1);
int id = int.Parse(idSpan);
// Use nameSpan directly or convert only if needed
decimal price = decimal.Parse(priceSpan);
// Process...
}
Performance impact: The modern approach allocates zero bytes per line. Processing 1 million lines saves ~200MB of allocations and eliminates GC pauses.
Example 2: Temporary Buffer with Smart Sizing
Problem: Need a buffer for processing, but size varies. Want to avoid allocation for small cases.
The simple version falls back to a fresh heap array for large inputs:

public void ProcessData(ReadOnlySpan<byte> input) {
    const int StackAllocThreshold = 256;

    // Smart allocation: stack for small inputs, heap for large ones
    Span<byte> buffer = input.Length <= StackAllocThreshold
        ? stackalloc byte[StackAllocThreshold]
        : new byte[input.Length];

    // Process data in buffer...
}

The pooled version avoids even that heap allocation by renting large buffers from ArrayPool:

public void ProcessData(ReadOnlySpan<byte> input) {
    const int StackAllocThreshold = 256;

    byte[]? rented = null;
    try {
        Span<byte> buffer = input.Length <= StackAllocThreshold
            ? stackalloc byte[StackAllocThreshold]
            : (rented = ArrayPool<byte>.Shared.Rent(input.Length));

        // Process data in buffer...
        Transform(input, buffer);
    } finally {
        if (rented != null) {
            ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
Why this pattern works:
- Small inputs (≤ 256 bytes): Zero-allocation stack path
- Large inputs: Pooled allocation reuses memory
- Automatic fallback ensures correctness
Memory device: Think "Stack for snacks, Pool for meals" - quick snacks on the stack, full meals need the pool!
Example 3: UTF-8 String Encoding Without Allocation
Problem: Convert strings to UTF-8 bytes for network transmission efficiently.
public void SendMessage(string message, Socket socket) {
    // Worst-case byte count for the encoded message
    int maxByteCount = Encoding.UTF8.GetMaxByteCount(message.Length);

    // Use the stack for small messages, the pool for large ones
    byte[]? rented = null;
    Span<byte> buffer = maxByteCount <= 512
        ? stackalloc byte[512]
        : (rented = ArrayPool<byte>.Shared.Rent(maxByteCount));
    try {
        // Encode directly into the span (no intermediate byte[] allocation)
        int bytesWritten = Encoding.UTF8.GetBytes(message.AsSpan(), buffer);

        // Send only the bytes actually used
        socket.Send(buffer.Slice(0, bytesWritten));
    } finally {
        if (rented != null) {
            ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
Performance optimization: Traditional Encoding.UTF8.GetBytes(string) allocates a new byte array every time. This approach reuses memory, critical for high-throughput servers.
Example 4: Memory<T> in Async Pipelines
Problem: Process streaming data asynchronously with minimal allocations.
public async Task ProcessStreamAsync(Stream input) {
// Rent reusable buffer for entire pipeline
using IMemoryOwner<byte> owner = MemoryPool<byte>.Shared.Rent(4096);
Memory<byte> buffer = owner.Memory;
int bytesRead;
while ((bytesRead = await input.ReadAsync(buffer)) > 0) {
// Process chunk
Memory<byte> chunk = buffer.Slice(0, bytesRead);
await ProcessChunkAsync(chunk);
// Buffer automatically reused for next iteration
}
// Buffer returned to pool on dispose
}
private async Task ProcessChunkAsync(Memory<byte> data) {
// Memory<T> can be used across await boundaries
await Task.Delay(10); // Simulated async work
// Access the data after await (safe with Memory<T>)
Span<byte> span = data.Span;
// Process span...
}
Why Memory<T>? Unlike Span<T>, Memory<T> is not a ref struct, so it can be held across await boundaries and converted to a Span<T> only when the data is actually processed.
Common Mistakes ⚠️
Mistake 1: Stack Overflow from Large stackalloc
❌ Wrong:
public void Process(int size) {
Span<byte> buffer = stackalloc byte[size]; // size could be huge!
// Stack overflow crash if size > ~1MB
}
✅ Right:
public void Process(int size) {
const int MaxStackSize = 512;
byte[]? rented = null;
Span<byte> buffer = size <= MaxStackSize
? stackalloc byte[MaxStackSize]
: (rented = ArrayPool<byte>.Shared.Rent(size));
try {
// Safe processing
} finally {
if (rented != null) ArrayPool<byte>.Shared.Return(rented);
}
}
Mistake 2: Forgetting to Return Arrays to Pool
❌ Wrong:
public void Process() {
byte[] buffer = ArrayPool<byte>.Shared.Rent(1024);
// Use buffer...
// Forgot to return! Memory leak in pool
}
✅ Right:
public void Process() {
byte[] buffer = ArrayPool<byte>.Shared.Rent(1024);
try {
// Use buffer...
} finally {
ArrayPool<byte>.Shared.Return(buffer, clearArray: true);
}
}
💡 Pro tip: Set clearArray: true when returning buffers that held sensitive data (passwords, keys).
Mistake 3: Using Span<T> in Async Methods
❌ Wrong:
public async Task ProcessAsync(Span<byte> data) { // Won't compile!
await Task.Delay(100);
// Span can't live across await
}
✅ Right:
public async Task ProcessAsync(Memory<byte> data) {
await Task.Delay(100);
Span<byte> span = data.Span; // Get span after await
// Process span...
}
Mistake 4: Assuming Pool Arrays Are Zeroed
❌ Wrong:
byte[] buffer = ArrayPool<byte>.Shared.Rent(100);
// Assumption: all bytes are 0
if (buffer[50] == 0) { // Might be leftover data!
// Dangerous!
}
✅ Right:
byte[] buffer = ArrayPool<byte>.Shared.Rent(100);
buffer.AsSpan(0, 100).Clear(); // Explicitly zero if needed
// Now safe to assume zeros
Did you know? ArrayPool<T> doesn't zero arrays by default, for performance reasons. This means rented arrays may contain data from previous uses!
Mistake 5: Storing Span<T> in Class Fields
❌ Wrong:
public class DataProcessor {
private Span<byte> _buffer; // Won't compile! Span can't be a field
}
✅ Right:
public class DataProcessor {
private Memory<byte> _buffer; // Memory<T> can be stored
public void Process() {
Span<byte> span = _buffer.Span; // Get Span when needed
}
}
Key Takeaways
Quick Reference Card: Modern Allocation Primitives
| Primitive | Best For | Key Constraint | Performance |
|---|---|---|---|
| stackalloc | Small (≤ 512 B), short-lived buffers | Stack-only, size limits | ⚡ Fastest |
| Span<T> | Zero-copy views, sync code | No async, no class fields | ⚡ Zero overhead |
| Memory<T> | Async operations, storable | Slight overhead vs Span<T> | Fast |
| ArrayPool<T> | Medium/large reusable buffers | Must return, not zeroed | Good (reuse) |
| MemoryPool<T> | Async pipelines, IDisposable | Slightly heavier | Good |
Decision Flowchart:
- Size ≤ 512 B? Yes → stackalloc + Span<T>
- Size ≤ 512 B? No → need async? Yes → MemoryPool<T> + Memory<T>
- Size ≤ 512 B? No → need async? No → ArrayPool<T> + Span<T>
⚠️ Golden Rules:
- Never stackalloc with untrusted sizes
- Always return pooled arrays (use try-finally)
- Use Span<T> for sync code, Memory<T> for async
- Clear sensitive data before returning to pools
- Measure before optimizing - these primitives add complexity (see the measurement sketch below)
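On that last rule, here is a minimal sketch of one way to sanity-check whether a change actually removed allocations, using GC.GetAllocatedBytesForCurrentThread (the loop reuses the allocating Split-based parsing from Example 1; for serious measurements a tool such as BenchmarkDotNet with its MemoryDiagnoser is the better choice):

// Bytes allocated on this thread before the code under test runs
long before = GC.GetAllocatedBytesForCurrentThread();

for (int i = 0; i < 1_000; i++) {
    // The allocating approach from Example 1: Split creates an array plus substrings
    string[] parts = "1,Widget,9.99".Split(',');
    GC.KeepAlive(parts);
}

// The difference shows how much garbage the loop produced
long after = GC.GetAllocatedBytesForCurrentThread();
Console.WriteLine($"Allocated: {after - before:N0} bytes");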
Further Study
- Official Microsoft documentation on Span<T>: https://learn.microsoft.com/en-us/dotnet/api/system.span-1 - comprehensive reference covering performance characteristics and usage patterns
- Memory Management in .NET (Microsoft Docs): https://learn.microsoft.com/en-us/dotnet/standard/automatic-memory-management - deep dive into how the GC works and why these primitives help
- High-Performance .NET by Example: https://github.com/adamsitnik/awesome-dot-net-performance - community-curated list of performance resources and real-world examples
Congratulations! You now understand the modern allocation primitives that power high-performance .NET applications. These tools (stackalloc, Span<T>, Memory<T>, ArrayPool<T>, and MemoryPool<T>) let you keep garbage collection pressure low while preserving C#'s safety guarantees.