Stackalloc and Span<T>
Stack-based buffer allocation for performance-critical paths
Master high-performance memory allocation with free flashcards and spaced repetition practice. This lesson covers stackalloc for stack-based allocation, Span<T> for safe memory access, and real-world optimization patterns: essential concepts for building efficient .NET applications that minimize garbage collection pressure.
Welcome
Welcome to one of the most powerful yet underutilized features in modern .NET! If you've ever wondered how to write blazingly fast code that doesn't trigger constant garbage collections, you're in the right place. stackalloc and Span<T> represent a paradigm shift in how we think about memory in managed languages, giving you C-like performance with C#'s safety guarantees.
In this lesson, you'll learn when and how to allocate memory on the stack instead of the heap, how Span<T> provides a unified abstraction over different memory sources, and the critical safety rules that prevent memory corruption. By the end, you'll understand why these features are the secret sauce behind high-performance libraries like ASP.NET Core and System.Text.Json.
Core Concepts
Understanding Stack vs. Heap Allocation
Before diving into stackalloc, let's revisit the fundamental difference between stack and heap memory:
| Characteristic | Stack | Heap |
|---|---|---|
| Lifetime | Scope-bound (method duration) | GC-controlled (can outlive method) |
| Allocation Speed | Extremely fast (pointer bump) | Slower (GC bookkeeping) |
| Deallocation | Automatic (stack unwind) | Reclaimed later by the GC |
| Size Limit | ~1MB by default (OS-dependent) | Limited by available memory |
| Fragmentation | None | Can fragment over time |
The Golden Rule: Stack allocation is perfect for small, short-lived buffers. Heap allocation is necessary for data that outlives the current method or exceeds safe stack size limits.
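The difference can be observed directly. The following is a minimal sketch, assuming a runtime that exposes GC.GetAllocatedBytesForCurrentThread (.NET Core 3.0 or later); the class and member names (AllocationProbe, LastHeapBuffer) are hypothetical. It reports how many managed-heap bytes the current thread allocates while filling a 64-element buffer on the stack versus on the heap.

```csharp
using System;

public static class AllocationProbe
{
    // Keeps the array reachable so the JIT cannot optimize the heap allocation away.
    public static int[] LastHeapBuffer = Array.Empty<int>();

    // Managed-heap bytes allocated by this thread while using a stack buffer.
    public static long BytesForStackBuffer()
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        Span<int> buffer = stackalloc int[64]; // 256 bytes in the current stack frame
        buffer.Fill(1);
        long after = GC.GetAllocatedBytesForCurrentThread();
        return after - before;
    }

    // Managed-heap bytes allocated by this thread while using an array of the same size.
    public static long BytesForHeapBuffer()
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        int[] buffer = new int[64]; // 256 bytes of payload plus the array object header
        Array.Fill(buffer, 1);
        LastHeapBuffer = buffer;
        long after = GC.GetAllocatedBytesForCurrentThread();
        return after - before;
    }
}
```

Calling each method once as a warm-up and then comparing the two results should show the heap variant allocating at least 256 bytes while the stack variant allocates far less, typically nothing.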
What is stackalloc?
stackalloc is a C# keyword that allocates memory directly on the call stack instead of the managed heap. This means:
- Zero GC pressure - No garbage collection involved
- Ultra-fast allocation - Just moving the stack pointer
- Automatic cleanup - Memory freed when method returns
- Cache-friendly - Stack memory is typically hot in CPU cache
But with constraints:
- Must be small - Large allocations risk stack overflow
- Cannot escape scope - Cannot return or store in fields
- Cannot be held across an await in async methods - Stack frames don't persist across await
Basic syntax:
// C# 7.2+: assign to Span<T> to stay in safe code
Span<int> numbers = stackalloc int[100];
// Older unsafe code (not recommended)
unsafe
{
    int* ptr = stackalloc int[100];
}
Tip: Modern C# uses stackalloc with Span<T> to provide bounds checking and eliminate unsafe code!
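To make the bounds-checking point concrete, here is a small sketch (the class and method names are hypothetical). It uses the stackalloc initializer syntax available since C# 7.3 and shows that indexing past the end of a stack-allocated Span<T> throws instead of corrupting memory.

```csharp
using System;

public static class StackallocBasics
{
    // Stack-allocates three ints with an initializer and sums them.
    public static int SumFirstThree()
    {
        Span<int> values = stackalloc int[] { 10, 20, 30 };
        int sum = 0;
        foreach (int v in values)
            sum += v;
        return sum;
    }

    // Returns true when writing one element past the end is rejected at runtime.
    public static bool IndexingPastEndThrows()
    {
        Span<int> values = stackalloc int[4];
        int index = values.Length; // first invalid index
        try
        {
            values[index] = 1;
            return false;
        }
        catch (IndexOutOfRangeException)
        {
            return true;
        }
    }
}
```

With a raw int* pointer the same out-of-range write would silently overwrite neighboring stack memory, which is why the Span<T> form is preferred.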
What is Span<T>?
Span<T> is a ref struct that provides a type-safe, memory-efficient view over contiguous memory regions. Think of it as a "window" that can look at:
- Stack-allocated memory (stackalloc)
- Heap-allocated arrays
- Unmanaged memory
- String internals (via ReadOnlySpan<char>)
Diagram (Span<T> memory abstraction): stack memory (stackalloc), heap arrays (new T[]), unmanaged memory (native alloc), and string slices (AsSpan()) all feed into Span<T> as one unified API.
Key characteristics of Span<T>
- Ref struct - Cannot be boxed, cannot be a field in regular classes, cannot cross async boundaries
- Stack-only - Can only live on the stack, ensuring safety
- Zero-copy slicing - Creating sub-spans is free (just pointer + length)
- Bounds-checked - Prevents buffer overruns at runtime
- Performance - JIT treats it specially for optimal codegen
// Creating spans from different sources
int[] array = new int[100];
Span<int> fromArray = array.AsSpan();
Span<int> fromStack = stackalloc int[50];
Span<int> slice = fromArray.Slice(10, 20); // Zero-copy view of elements 10-29
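Because a slice is only a view, writes through it land in the original storage. The sketch below (hypothetical names) illustrates this with a heap array; the same holds for any other memory source behind a Span<T>.

```csharp
using System;

public static class SliceDemo
{
    // Modifies a 3-element window of the array and returns the array itself.
    public static int[] WriteThroughSlice()
    {
        int[] array = { 0, 1, 2, 3, 4, 5 };
        Span<int> middle = array.AsSpan().Slice(2, 3); // view of elements 2, 3, 4
        middle.Fill(-1);  // array is now { 0, 1, -1, -1, -1, 5 }
        middle[0] = 99;   // index 0 of the slice is index 2 of the array
        return array;     // { 0, 1, 99, -1, -1, 5 }
    }
}
```

No elements are copied at any point: the slice stores only a reference to element 2 and a length of 3.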
Memory Layout: How Span<T> Works
Under the hood, Span<T> is incredibly simple:
public readonly ref struct Span<T>
{
    private readonly ref T _reference; // Pointer to first element
    private readonly int _length;      // Number of elements
    // ... methods ...
}
This means a Span<T> is just 16 bytes on 64-bit systems (8-byte pointer + 4-byte int + padding), regardless of how much memory it references!
Memory layout example: the stack frame holds a Span<int> named data with _reference = 0x7FFE1234 (8 bytes) and _length = 100 (4 bytes), plus the int[100] buffer itself (400 bytes, elements 0 through 99). _reference points at element 0 of that buffer.
Safety Guarantees and Restrictions
The compiler enforces strict rules to prevent memory corruption:
1. Ref structs cannot escape the stack:
// COMPILER ERROR: Cannot be class field
public class MyClass
{
    private Span<int> _data; // Error CS8345
}
// COMPILER ERROR: Cannot be boxed
object boxed = mySpan; // Error
// COMPILER ERROR: Cannot use in async methods
async Task ProcessAsync()
{
    // Error CS4012 before C# 13; C# 13+ instead rejects Span locals used across an await
    Span<int> data = stackalloc int[100];
    await Task.Delay(100);
}
2. Stack-allocated memory cannot outlive its scope:
// DANGEROUS: Would return dangling pointer
Span<int> GetNumbers()
{
    Span<int> numbers = stackalloc int[10];
    return numbers; // Error CS8352: Cannot return
}
// CORRECT: Process within scope
void ProcessNumbers()
{
    Span<int> numbers = stackalloc int[10];
    for (int i = 0; i < numbers.Length; i++)
        numbers[i] = i * 2;
    // Automatically freed when method returns
}
3. Size limits for stack allocation:
// SAFE: Small allocation (400 bytes)
Span<int> small = stackalloc int[100];
// RISKY: Large allocation (40KB)
Span<int> large = stackalloc int[10_000]; // May cause StackOverflowException, which cannot be caught
// BETTER: Use threshold pattern
const int StackThreshold = 512; // bytes
int size = GetRequiredSize();
Span<byte> buffer = size <= StackThreshold
    ? stackalloc byte[size]
    : new byte[size];
Rule of Thumb: Keep stackalloc under 1KB (ideally under 512 bytes) to avoid stack overflow risks.
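The threshold pattern can be combined with a pooled array for the large case, so that neither path creates a fresh temporary array. The following is a sketch under stated assumptions: the names ThresholdDemo, MaxStackChars, and Reverse are hypothetical, the 256-char limit is just one reasonable choice, and reversing by UTF-16 code unit is only correct for text without surrogate pairs.

```csharp
using System;
using System.Buffers;

public static class ThresholdDemo
{
    private const int MaxStackChars = 256; // 512 bytes of char data

    // Reverses a string, using the stack for short inputs and a pooled array otherwise.
    public static string Reverse(string input)
    {
        char[]? rented = null;
        Span<char> buffer = input.Length <= MaxStackChars
            ? stackalloc char[MaxStackChars]
            : (rented = ArrayPool<char>.Shared.Rent(input.Length));
        try
        {
            // Both sources may be larger than needed, so trim to the exact length.
            buffer = buffer.Slice(0, input.Length);
            input.AsSpan().CopyTo(buffer);
            buffer.Reverse();
            return new string(buffer);
        }
        finally
        {
            // Only the pooled path has anything to give back.
            if (rented is not null)
                ArrayPool<char>.Shared.Return(rented);
        }
    }
}
```

Using a constant stackalloc size (rather than input.Length) keeps the stack usage fixed and predictable; the Slice call then narrows the view to what is actually needed.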
ReadOnlySpan<T> for Immutable Views
When you need read-only access, use ReadOnlySpan<T>:
// Prevents accidental modification
ReadOnlySpan<char> text = "Hello, World!".AsSpan();
// COMPILER ERROR
text[0] = 'h'; // Error: Cannot modify ReadOnlySpan
// CORRECT: Read-only operations
bool startsWithH = text[0] == 'H';
ReadOnlySpan<char> hello = text.Slice(0, 5);
This is especially powerful for string processing without allocations:
// Traditional: Creates substring (heap allocation)
string text = "user@example.com";
string domain = text.Substring(text.IndexOf('@') + 1); // Allocates!
// Modern: Zero-allocation slicing
ReadOnlySpan<char> span = text.AsSpan();
int atIndex = span.IndexOf('@');
ReadOnlySpan<char> domainSpan = span.Slice(atIndex + 1); // No allocation!
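Slices can be handed straight to the span-based parsing overloads, so no substring is ever created. As an illustration, this sketch parses text such as "800x600" into two integers; the DimensionParser name and the "x" separator format are assumptions made for the example.

```csharp
using System;
using System.Globalization;

public static class DimensionParser
{
    // Parses "<width>x<height>" without allocating any substrings.
    public static bool TryParse(ReadOnlySpan<char> text, out int width, out int height)
    {
        width = 0;
        height = 0;
        int separator = text.IndexOf('x');
        if (separator < 0)
            return false;

        // Each Slice is a view into the original characters.
        return int.TryParse(text.Slice(0, separator), NumberStyles.Integer,
                            CultureInfo.InvariantCulture, out width)
            && int.TryParse(text.Slice(separator + 1), NumberStyles.Integer,
                            CultureInfo.InvariantCulture, out height);
    }
}
```

A call such as DimensionParser.TryParse("800x600".AsSpan(), out int w, out int h) fills w and h from two views over the same string.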
Examples
Example 1: Fast Buffer Processing Without GC
Let's parse a CSV line without allocating temporary strings:
public static void ParseCsvLine(ReadOnlySpan<char> line)
{
    // Allocate small buffer for comma positions on stack
    Span<int> commaPositions = stackalloc int[10]; // Up to 10 commas (11 fields)
    int fieldCount = 0;
    // Find all comma positions
    for (int i = 0; i < line.Length && fieldCount < 10; i++)
    {
        if (line[i] == ',')
            commaPositions[fieldCount++] = i;
    }
    // Extract fields using zero-copy slicing
    int start = 0;
    for (int i = 0; i < fieldCount; i++)
    {
        ReadOnlySpan<char> field = line.Slice(start, commaPositions[i] - start);
        ProcessField(field);
        start = commaPositions[i] + 1;
    }
    // Process last field (everything after the final recorded comma)
    if (start < line.Length)
    {
        ReadOnlySpan<char> lastField = line.Slice(start);
        ProcessField(lastField);
    }
}
// Must be static so the static ParseCsvLine can call it
static void ProcessField(ReadOnlySpan<char> field)
{
    // Parse without allocating strings
    if (int.TryParse(field, out int value))
        Console.WriteLine($"Integer: {value}"); // The printed string is still allocated
}
// Usage
string csvLine = "42,hello,99,world";
ParseCsvLine(csvLine.AsSpan()); // Parsing itself makes no heap allocations
Why this is fast:
- stackalloc for temporary buffer - no GC
- Slice() creates views without copying
- TryParse has overloads accepting ReadOnlySpan<char>
- Total heap allocations while parsing: zero (only the Console.WriteLine output allocates)
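When the number of fields is not known in advance, the position buffer can be dropped entirely by consuming the line one field at a time. This variant is a sketch, not the lesson's own method: the CsvSum name is hypothetical, and it simply adds up every field that parses as an integer.

```csharp
using System;

public static class CsvSum
{
    // Sums all integer fields in a comma-separated line; non-numeric fields are skipped.
    public static int SumIntegerFields(ReadOnlySpan<char> line)
    {
        int sum = 0;
        while (true)
        {
            int comma = line.IndexOf(',');
            // Current field: up to the next comma, or the rest of the line.
            ReadOnlySpan<char> field = comma >= 0 ? line.Slice(0, comma) : line;
            if (int.TryParse(field, out int value))
                sum += value;
            if (comma < 0)
                break;
            // Advance the view past the comma; nothing is copied.
            line = line.Slice(comma + 1);
        }
        return sum;
    }
}
```

For the line "42,hello,99,world" the two numeric fields add up to 141, and the method handles any number of fields because it never stores their positions.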
Example 2: Conditional Stack/Heap Allocation Pattern
This is a production-ready pattern used in .NET Core libraries:
public static string Base64Encode(ReadOnlySpan<byte> data)
{
    const int MaxStackAlloc = 256;
    // Padded Base64 size: 4 output chars for every 3 input bytes, rounded up
    int bufferSize = ((data.Length + 2) / 3) * 4;
    // Use stack for small data, heap for large
    Span<char> buffer = bufferSize <= MaxStackAlloc
        ? stackalloc char[bufferSize]
        : new char[bufferSize];
    // Convert to Base64 using the buffer
    bool success = Convert.TryToBase64Chars(data, buffer, out int charsWritten);
    if (!success)
        throw new InvalidOperationException("Encoding failed");
    // Return string from the buffer
    return new string(buffer.Slice(0, charsWritten));
}
// Usage
byte[] smallData = new byte[50];
string encoded1 = Base64Encode(smallData); // Uses stackalloc
byte[] largeData = new byte[1000];
string encoded2 = Base64Encode(largeData); // Uses heap allocation
Pattern breakdown:
| Step | Action | Benefit |
|---|---|---|
| 1 | Calculate required buffer size | Avoid over-allocation |
| 2 | Compare against threshold | Safety check for stack size |
| 3 | Conditional allocation | Optimize for common case |
| 4 | Use same Span<T> | Code works for both paths |
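Step 1 deserves a quick sanity check, because a buffer that is even one character too small makes Convert.TryToBase64Chars return false. Padded Base64 emits 4 characters for every 3 input bytes, rounded up, so 50 bytes need 68 characters. The sketch below (hypothetical names) compares that formula with the length the framework actually produces.

```csharp
using System;

public static class Base64Length
{
    // Padded Base64 output length: 4 characters per 3 input bytes, rounded up.
    public static int EncodedLength(int byteCount) => ((byteCount + 2) / 3) * 4;

    // Checks the formula against Convert.ToBase64String for a given input size.
    public static bool MatchesFramework(int byteCount)
    {
        string encoded = Convert.ToBase64String(new byte[byteCount]);
        return encoded.Length == EncodedLength(byteCount);
    }
}
```

Checking a range of sizes this way before hard-coding a size calculation is cheap insurance against off-by-one buffer sizes.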
Example 3: Efficient String Manipulation
Replace parts of a string without intermediate allocations:
public static string ReplaceVowels(string input, char replacement)
{
    // Work on stack for small strings
    Span<char> buffer = input.Length <= 128
        ? stackalloc char[input.Length]
        : new char[input.Length];
    // Copy to mutable buffer
    input.AsSpan().CopyTo(buffer);
    // Modify in place
    for (int i = 0; i < buffer.Length; i++)
    {
        // Invariant lowercasing keeps the check independent of the current culture
        char c = char.ToLowerInvariant(buffer[i]);
        if (c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u')
            buffer[i] = replacement;
    }
    return new string(buffer);
}
// Usage
string result = ReplaceVowels("Hello World", '*');
Console.WriteLine(result); // H*ll* W*rld
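Performance comparison (illustrative figures; measure on your own hardware):
Traditional (StringBuilder): 120ns, 240 bytes allocated
Stackalloc + Span: 35ns, 0 bytes allocated for the temporary buffer (small strings)
Roughly 3.4x faster, with no intermediate garbage (the returned string is still allocated)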
Example 4: Working with Binary Data
Parsing network packets efficiently:
public readonly struct PacketHeader
{
    public byte Version { get; init; }
    public ushort PacketId { get; init; }
    public int DataLength { get; init; }
}
public static PacketHeader ParseHeader(ReadOnlySpan<byte> data)
{
    // Validate minimum size
    if (data.Length < 7)
        throw new ArgumentException("Invalid packet size");
    // Parse fields without allocating objects
    // Note: BitConverter uses the machine's byte order; use BinaryPrimitives for a fixed wire order
    return new PacketHeader
    {
        Version = data[0],
        PacketId = BitConverter.ToUInt16(data.Slice(1, 2)),
        DataLength = BitConverter.ToInt32(data.Slice(3, 4))
    };
}
// Usage with stack-allocated buffer
public static void ProcessPacket(Stream networkStream)
{
    Span<byte> headerBuffer = stackalloc byte[7];
    // Read may return fewer bytes than requested; ReadExactly (.NET 7+) fills the whole buffer
    networkStream.ReadExactly(headerBuffer);
    PacketHeader header = ParseHeader(headerBuffer);
    Console.WriteLine($"Version: {header.Version}");
    Console.WriteLine($"Packet ID: {header.PacketId}");
    Console.WriteLine($"Data Length: {header.DataLength} bytes");
}
Why this pattern works:
- Fixed-size header (7 bytes) - perfect for stackalloc
- ReadOnlySpan<byte> prevents accidental modification
- Slice() extracts fields without copying
- Zero allocations for header parsing
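Network protocols usually fix the byte order of multi-byte fields, so real parsers tend to spell it out rather than rely on the machine's native order. The sketch below assumes a big-endian wire format for the same 7-byte layout (an assumption, since the lesson does not specify one) and uses System.Buffers.Binary.BinaryPrimitives; the WireHeader name is hypothetical.

```csharp
using System;
using System.Buffers.Binary;

public static class WireHeader
{
    public const int Size = 7; // 1 byte version + 2 bytes packet id + 4 bytes data length

    // Writes the header fields in big-endian order into the caller's buffer.
    public static void Write(Span<byte> destination, byte version, ushort packetId, int dataLength)
    {
        if (destination.Length < Size)
            throw new ArgumentException("Destination too small", nameof(destination));
        destination[0] = version;
        BinaryPrimitives.WriteUInt16BigEndian(destination.Slice(1, 2), packetId);
        BinaryPrimitives.WriteInt32BigEndian(destination.Slice(3, 4), dataLength);
    }

    // Reads the header fields back, independent of the CPU's native byte order.
    public static (byte Version, ushort PacketId, int DataLength) Read(ReadOnlySpan<byte> source)
    {
        if (source.Length < Size)
            throw new ArgumentException("Source too small", nameof(source));
        return (source[0],
                BinaryPrimitives.ReadUInt16BigEndian(source.Slice(1, 2)),
                BinaryPrimitives.ReadInt32BigEndian(source.Slice(3, 4)));
    }
}
```

Writing packet id 0x1234 this way always stores the bytes 0x12, 0x34 in that order, so the same buffer parses identically on little-endian and big-endian machines.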
Common Mistakes
Mistake 1: Allocating Too Much on Stack
// DANGER: 40KB on stack - can overflow in deep call chains or on threads with small stacks
Span<byte> hugeBuffer = stackalloc byte[40_000];
Fix: Use heap for large allocations
const int MaxStackBytes = 512;
int size = GetRequiredSize();
Span<byte> buffer = size <= MaxStackBytes
    ? stackalloc byte[size]
    : new byte[size];
Mistake 2: Trying to Store Span in a Field
public class DataProcessor
{
    private Span<int> _buffer; // ERROR: Cannot be a field
}
Fix: Use Memory<T>
public class DataProcessor
{
    private Memory<int> _buffer; // OK: Memory<T> can be stored
    public void Process()
    {
        Span<int> span = _buffer.Span; // Get Span when needed
        // ... work with span ...
    }
}
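A complete version of this fix might look like the following sketch. The RunningTotals class is hypothetical; it stores a Memory<int> over a caller-supplied array and obtains a Span<int> only for the duration of the method that does the work.

```csharp
using System;

public sealed class RunningTotals
{
    // Memory<T> is an ordinary struct, so it can live in a field of a class.
    private readonly Memory<int> _values;

    // An array converts implicitly to Memory<T>; no copy is made.
    public RunningTotals(int[] backing) => _values = backing;

    // Turns the stored values into prefix sums, in place.
    public void Accumulate()
    {
        Span<int> span = _values.Span; // short-lived view, never stored
        for (int i = 1; i < span.Length; i++)
            span[i] += span[i - 1];
    }
}
```

Because the Memory<int> wraps the original array, calling Accumulate on { 1, 2, 3, 4 } leaves that same array holding { 1, 3, 6, 10 }.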
Mistake 3: Using Stackalloc in Async Methods
public async Task ProcessAsync()
{
    Span<byte> buffer = stackalloc byte[100]; // ERROR CS4012 (before C# 13)
    await Task.Delay(100);
}
Fix: Use heap allocation in async context
public async Task ProcessAsync()
{
    byte[] buffer = new byte[100]; // Use array
    await Task.Delay(100);
    // Or use Memory<T> for async-friendly APIs
}
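A common way to keep span-based code in an async workflow is to do the awaiting with Memory<T> and move the span work into a small synchronous helper. The sketch below illustrates that split; the AsyncChecksum class and its methods are hypothetical names.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class AsyncChecksum
{
    // Async part: reads up to 'count' bytes using the Memory<byte>-based overload.
    public static async Task<int> ReadAndSumAsync(Stream stream, int count)
    {
        byte[] buffer = new byte[count];
        int total = 0;
        while (total < count)
        {
            int read = await stream.ReadAsync(buffer.AsMemory(total, count - total));
            if (read == 0)
                break; // end of stream
            total += read;
        }
        // The span exists only as an argument here and never lives across an await.
        return Sum(buffer.AsSpan(0, total));
    }

    // Synchronous part: free to use Span<T>, ReadOnlySpan<T>, or stackalloc.
    private static int Sum(ReadOnlySpan<byte> data)
    {
        int sum = 0;
        foreach (byte b in data)
            sum += b;
        return sum;
    }
}
```

The helper boundary makes the rule easy to follow: anything that awaits deals in arrays or Memory<T>, and anything that touches spans runs to completion synchronously.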
Mistake 4: Returning Stack-Allocated Span
public Span<int> GetBuffer()
{
    Span<int> data = stackalloc int[10];
    return data; // ERROR: Would return dangling reference
}
Fix: Return array-backed Span or use output parameter
public Span<int> GetBuffer()
{
    int[] array = new int[10];
    return array.AsSpan(); // Safe: array lives on heap
}
// Or better: let caller provide buffer
public void FillBuffer(Span<int> buffer)
{
    for (int i = 0; i < buffer.Length; i++)
        buffer[i] = i * 2;
}
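The caller-provides-the-buffer approach is often shaped as a Try method that reports how much it wrote, which lets the caller decide where the memory lives. Here is a sketch with hypothetical names (Squares, TryWrite, SumOfFirst) in which the caller happens to use stackalloc.

```csharp
using System;

public static class Squares
{
    // Writes the first 'count' squares into the caller's buffer.
    // Returns false (and writes nothing) when the buffer is too small.
    public static bool TryWrite(int count, Span<int> destination, out int written)
    {
        written = 0;
        if (count < 0 || destination.Length < count)
            return false;
        for (int i = 0; i < count; i++)
            destination[i] = i * i;
        written = count;
        return true;
    }

    // Caller side: owns a 16-element stack buffer and passes it down.
    public static int SumOfFirst(int count)
    {
        Span<int> buffer = stackalloc int[16];
        if (!TryWrite(count, buffer, out int written))
            return -1; // buffer too small for this request
        int sum = 0;
        foreach (int v in buffer.Slice(0, written))
            sum += v;
        return sum;
    }
}
```

Passing a stack-allocated span down the call chain is always safe; only passing it back up is forbidden, which is exactly what this shape avoids.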
Mistake 5: Forgetting Bounds Checking
While Span<T> has bounds checking, you still need to validate inputs:
public void ProcessData(Span<byte> data)
{
    // Throws if data.Length < 4!
    int value = BitConverter.ToInt32(data);
}
Fix: Always validate size requirements
public void ProcessData(Span<byte> data)
{
    if (data.Length < 4)
        throw new ArgumentException("Buffer too small");
    int value = BitConverter.ToInt32(data.Slice(0, 4));
}
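When a short buffer is an expected condition rather than a programming error, a Try-style read avoids exceptions altogether. The sketch below relies on BinaryPrimitives.TryReadInt32LittleEndian, which returns false when fewer than 4 bytes are available; the LengthPrefix name and the little-endian choice are assumptions for illustration.

```csharp
using System;
using System.Buffers.Binary;

public static class LengthPrefix
{
    // Reads a 4-byte little-endian length; returns false instead of throwing on short input.
    public static bool TryReadLength(ReadOnlySpan<byte> data, out int length)
        => BinaryPrimitives.TryReadInt32LittleEndian(data, out length);
}
```

A caller can then loop, reading more data until TryReadLength succeeds, without wrapping every attempt in a try/catch.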
Performance Characteristics
Here's what you gain by using stackalloc and Span<T> (illustrative figures; actual numbers vary with hardware and runtime version):
| Operation | Traditional (Heap) | Stackalloc + Span | Improvement |
|---|---|---|---|
| Small buffer allocation (128 bytes) | ~25ns + GC pressure | ~2ns, no GC | ~12x faster |
| String substring | ~40ns + allocation | ~5ns, no allocation | ~8x faster |
| Array slicing | Array.Copy required | Span.Slice (free) | Zero-copy |
| CSV parsing (10 fields) | ~500ns + strings | ~80ns, no strings | ~6x faster |
Real-world impact: ASP.NET Core uses these patterns extensively, which contributes to its large performance gains over older frameworks.
When to Use Each Tool
Decision Matrix
| Scenario | Best Choice | Why |
|---|---|---|
| Small temporary buffer (<512 bytes) | stackalloc | Zero GC, ultra-fast |
| Size unknown at compile time | Conditional (threshold pattern) | Safe + optimized |
| Need to store in field/property | Memory<T> | Span cannot be stored |
| Async method | Memory<T> or array | Stackalloc not allowed |
| String manipulation (read-only) | ReadOnlySpan<char> | Zero-copy slicing |
| Binary protocol parsing | stackalloc + Span<byte> | Perfect fit |
| Large data processing (>1KB) | Array + Span<T> view | Stack overflow risk |
Advanced Topics Preview
Memory<T>:
- Memory<T> is the "storable" version of Span<T>
- Can be fields, can cross async boundaries
- .Span property converts to Span<T> when needed
- Slightly more overhead than Span<T> (each .Span access does extra work to resolve the underlying buffer)
MemoryMarshal for Advanced Scenarios:
// Reinterpret cast (unsafe but fast)
Span<byte> bytes = stackalloc byte[8];
Span<long> longs = MemoryMarshal.Cast<byte, long>(bytes);
longs[0] = 42; // Writes 8 bytes
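To see that the cast really is a reinterpretation of the same bytes, the sketch below (hypothetical names) writes through the int view and then inspects the byte view. It sets an int to -1 because that value is all 0xFF bytes regardless of the machine's byte order.

```csharp
using System;
using System.Runtime.InteropServices;

public static class CastDemo
{
    // Views 8 stack bytes as 2 ints, writes through the int view, returns the bytes.
    public static byte[] WriteIntsReadBytes()
    {
        Span<byte> bytes = stackalloc byte[8];
        bytes.Clear(); // do not rely on stackalloc memory being zeroed
        Span<int> ints = MemoryMarshal.Cast<byte, int>(bytes); // Length == 2
        ints[1] = -1;  // sets bytes 4..7 to 0xFF
        return bytes.ToArray();
    }
}
```

The result is { 0, 0, 0, 0, 255, 255, 255, 255 }: both spans address the same 8 bytes, and no data is copied by the cast itself.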
ArrayPool<T>:
int[] rented = ArrayPool<int>.Shared.Rent(1000);
Span<int> span = rented.AsSpan(0, 1000);
// ... use span ...
ArrayPool<int>.Shared.Return(rented);
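In practice the Rent/Return pair is usually wrapped in try/finally so the array goes back to the pool even if the work throws. A minimal sketch, with a hypothetical PooledSum class:

```csharp
using System;
using System.Buffers;

public static class PooledSum
{
    // Fills a pooled buffer with 1..count and returns the sum.
    public static long SumOfRange(int count)
    {
        int[] rented = ArrayPool<int>.Shared.Rent(count);
        try
        {
            // Rent may return a larger array, so work on an exact-length view.
            Span<int> span = rented.AsSpan(0, count);
            for (int i = 0; i < span.Length; i++)
                span[i] = i + 1;

            long sum = 0;
            foreach (int v in span)
                sum += v;
            return sum;
        }
        finally
        {
            // Always hand the array back, even when an exception escapes.
            ArrayPool<int>.Shared.Return(rented);
        }
    }
}
```

Slicing to the requested length matters because rented arrays are often larger than asked for and may contain stale data from a previous user.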
Key Takeaways
- stackalloc allocates memory on the stack - blazingly fast, zero GC pressure, but limited in size and scope
- Span<T> provides a unified, safe API over any contiguous memory - stack, heap, or native
- Ref structs like Span<T> are compiler-enforced to stay on stack, preventing memory corruption
- Zero-copy operations with Slice() make string and array manipulation incredibly efficient
- Threshold pattern (stackalloc for small, heap for large) combines safety with performance
- ReadOnlySpan<T> provides read-only views, which is especially useful for allocation-free string processing
- Caution: Keep stackalloc allocations small (ideally under ~512 bytes) to avoid stack overflow
- Caution: Cannot use in async methods, as fields, or return from methods (for stack-allocated data)
Pro tip: The .NET runtime team uses these patterns throughout the BCL - study System.Text.Json, System.IO.Pipelines, and ASP.NET Core source code for production examples!
Quick Reference Card
Stackalloc & Span Cheat Sheet
Allocation Syntax:
Span<T> buffer = stackalloc T[size]; // Stack
Span<T> buffer = new T[size]; // Heap
Span<T> buffer = size <= 512 ? stackalloc T[size] : new T[size]; // Conditional
Creating Spans:
Span<int> fromArray = array.AsSpan();
Span<int> slice = span.Slice(start, length);
ReadOnlySpan<char> text = "Hello".AsSpan();
Common Operations:
span[index] // Index access
span.Length // Size
span.Slice(start, length) // Sub-view (zero-copy)
span.Clear() // Zero out
source.CopyTo(destination) // Copy data
span.Fill(value) // Fill with value
Limitations:
- Cannot be fields in classes
- Cannot be boxed to object
- Cannot use in async methods (use Memory<T>)
- Cannot return stack-allocated spans
- Keep stackalloc under 512 bytes
Safety Rules:
- Bounds checking always active
- Compiler prevents escaping stack
- Cannot outlive source memory
- ref struct prevents heap allocation
Performance Wins:
- ~10-12x faster allocation for small buffers (illustrative)
- Zero GC pressure
- Cache-friendly memory access
- Zero-copy slicing and viewing
Further Study
Microsoft Docs - Memory and Span Usage Guidelines
https://learn.microsoft.com/en-us/dotnet/standard/memory-and-spans/
Official documentation covering best practices, performance characteristics, and API reference.
Adam Sitnik - Span<T> Deep Dive
https://adamsitnik.com/Span/
Detailed technical explanation of how Span<T> works internally, with benchmarks and real-world examples.
Stephen Toub - Performance Improvements in .NET
https://devblogs.microsoft.com/dotnet/performance-improvements-in-net-8/
Annual blog posts showing how Microsoft uses Span<T> and stackalloc throughout the framework.
You now have the knowledge to write high-performance .NET code that rivals native languages! Practice these patterns in hot paths of your applications, and watch your allocation rates drop to near-zero. Remember: measure first, optimize second - but when you need speed, stackalloc and Span<T> are your secret weapons. Happy optimizing!