What is Hermeticity?
Define hermetic builds and understand the isolation principles that separate them from traditional builds.
This lesson covers the core principles of hermeticity, practical applications in build systems, and common pitfalls that break reproducibility, all essential concepts for modern software engineering and DevOps.
Welcome
Imagine running the same build command twice and getting different outputs each time. Sounds like a nightmare, right? That's exactly what hermetic builds prevent. In software engineering, hermeticity refers to the property of a build system being completely self-contained and reproducible, like a hermetically sealed container that keeps external contaminants out.
The term "hermetic" comes from Hermes Trismegistus, a legendary Hellenistic figure associated with alchemy and the art of creating airtight seals. In modern computing, a hermetic build is one that's sealed off from the unpredictable external environment, ensuring that the same inputs always produce identical outputs.
This lesson will take you deep into the world of hermetic builds, showing you why they matter, how to achieve them, and what mistakes to avoid. Whether you're building microservices, mobile apps, or infrastructure-as-code, understanding hermeticity is crucial for creating reliable, debuggable, and maintainable systems.
Core Concepts
The Fundamental Definition
Hermeticity in build systems means that a build process:
- Depends only on declared inputs - No hidden dependencies on system state, environment variables (unless explicitly declared), or network resources
- Produces identical outputs - Given the same inputs, the build generates byte-for-byte identical artifacts every time
- Is isolated from the host environment - The build doesn't rely on tools, libraries, or configurations installed on the build machine
- Is reproducible across machines and time - You can run the same build today, tomorrow, or on a different continent and get the same result
Key Principle
A hermetic build is a pure function: Build(inputs) → outputs, with no side effects and no reliance on external state.
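To make the pure-function view concrete, here is a minimal Python sketch. The names `build` and `impure_build` are hypothetical, and the "artifact" is just a digest standing in for real compiler output; the point is only the contrast between a function of its declared inputs and one that also reads undeclared state.

```python
import hashlib
import time


def build(inputs: dict[str, bytes]) -> str:
    """Pure 'build': the result depends only on the declared inputs."""
    digest = hashlib.sha256()
    for name in sorted(inputs):  # fixed order, so dict ordering cannot leak in
        data = inputs[name]
        digest.update(name.encode("utf-8") + b"\0")
        digest.update(len(data).to_bytes(8, "big"))  # length prefix avoids ambiguity
        digest.update(data)
    return digest.hexdigest()


def impure_build(inputs: dict[str, bytes]) -> str:
    """Impure 'build': also reads the clock, which is undeclared external state."""
    return f"{build(inputs)}-{time.time_ns()}"
```

Calling `build` with the same mapping always yields the same string, whereas the suffix produced by `impure_build` depends on when it runs.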
Why Hermeticity Matters
Hermetic builds solve critical problems in software development:
1. Reproducibility
- Debug issues by reproducing the exact build that failed
- Roll back to previous versions with confidence
- Verify that security patches don't introduce unintended changes
2. Reliability
- Eliminate "works on my machine" syndrome
- Reduce flaky builds caused by environmental differences
- Catch dependency issues early
3. Scalability
- Enable aggressive caching (same inputs = cache hit)
- Distribute builds across multiple machines safely
- Parallelize build steps without coordination overhead
4. Security
- Audit exactly what goes into your artifacts
- Prevent supply chain attacks from unexpected dependencies
- Verify build integrity through checksums
The Hermetic Build Spectrum
Degrees of hermeticity, from non-hermetic to fully hermetic:
| Command | Characteristics | Typical Result |
|---|---|---|
| "make" | Uses system libraries | Breaks often |
| "docker build" | Fixed OS, but can fetch from network | Mostly works |
| "bazel build" | All deps declared explicitly | Hermetic by design |
| "nix build" | Content-addresses everything | Maximally hermetic |
Most build systems fall somewhere on this spectrum. Achieving perfect hermeticity is challenging, but even moving toward the right side dramatically improves build quality.
The Three Pillars of Hermeticity
| Pillar | Description | Common Violations |
|---|---|---|
| Isolation | Build runs in a controlled sandbox with no access to host system resources | Reading /etc/hosts, using system Python, accessing $HOME |
| Declaration | All dependencies, tools, and inputs are explicitly listed and versioned | Implicit dependencies, "latest" tags, unversioned tools |
| Determinism | Same inputs always produce bit-identical outputs | Timestamps in artifacts, random UUIDs, non-deterministic compression |
Inputs and Outputs: The Contract
A hermetic build system maintains a strict contract between inputs and outputs:
Declared Inputs:
- Source code files (with exact versions/commits)
- Dependencies (pinned versions, checksummed)
- Build tools (specific versions in containers/sandboxes)
- Configuration files (checked into version control)
- Environment variables (explicitly declared)
Forbidden Inputs:
- System-installed libraries or tools
- Network resources fetched during build
- Current date/time (unless explicitly needed and declared)
- Ambient environment variables
- User-specific paths or credentials
- Random number generators (unless seeded deterministically)
Expected Outputs:
- Build artifacts (binaries, archives, images)
- Build metadata (logs, timing info)
- Test results
- Documentation
Determinism Deep Dive
Determinism is often the trickiest aspect of hermeticity. Many build steps introduce non-determinism accidentally:
| Source of Non-determinism | Why It Happens | Solution |
|---|---|---|
| Timestamps | Build embeds current time in artifacts | Use SOURCE_DATE_EPOCH environment variable |
| File ordering | Directory iteration order is filesystem-dependent | Sort files alphabetically before processing |
| Hash randomization | Python, Ruby hash tables use random seeds | Set PYTHONHASHSEED=0 or equivalent |
| Parallel builds | Race conditions in concurrent operations | Ensure proper dependency ordering, atomic writes |
| UUIDs/random IDs | Generating unique identifiers | Use content-based hashing instead |
| Compression | Some algorithms include timestamps or vary by CPU | Use deterministic compression (gzip -n, tar --sort=name) |
Pro Tip: Use tools like diffoscope to compare two supposedly identical builds and find sources of non-determinism.
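Several of these fixes (sorted ordering, fixed timestamps, normalized ownership, no gzip header timestamp) can be combined in one place. The following is a minimal sketch using Python's standard `tarfile` and `gzip` modules; the function name `deterministic_targz` and the in-memory `files` mapping are assumptions made for illustration, not part of any build tool.

```python
import gzip
import io
import tarfile


def deterministic_targz(files: dict[str, bytes], mtime: int = 0) -> bytes:
    """Pack files into a .tar.gz whose bytes depend only on names and contents."""
    buf = io.BytesIO()
    # filename="" and a fixed mtime keep the gzip header free of ambient data.
    with gzip.GzipFile(filename="", mode="wb", fileobj=buf, mtime=mtime) as gz:
        with tarfile.open(fileobj=gz, mode="w", format=tarfile.GNU_FORMAT) as tar:
            for name in sorted(files):  # explicit ordering
                data = files[name]
                info = tarfile.TarInfo(name=name)
                info.size = len(data)
                info.mtime = mtime          # fixed timestamp
                info.uid = info.gid = 0     # normalized ownership
                info.uname = info.gname = ""
                info.mode = 0o644           # fixed permissions
                tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()
```

Two calls with the same names and contents return identical bytes regardless of insertion order, which is the property a tool like diffoscope would otherwise help you chase down.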
Caching and Hermeticity
One of the biggest benefits of hermetic builds is aggressive caching:
[Diagram: hermetic build caching. Source code and dependencies are hashed into a single input hash, which is looked up on the cache server. On a cache hit, the cached output is returned; on a cache miss, the build runs and its result is stored.]
With hermetic builds:
- Content-addressable storage: Cache key = hash of all inputs
- Distributed caching: Share build artifacts across team/CI
- Incremental builds: Only rebuild what changed
- Remote execution: Send build to powerful remote servers
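The "cache key = hash of all inputs" idea can be sketched in a few lines of Python. This is an illustrative in-memory model, not how any particular build system stores its cache; `cache_key`, `BuildCache`, and the `tool_version` parameter are hypothetical names.

```python
import hashlib
from typing import Callable


def cache_key(inputs: dict[str, bytes], tool_version: str) -> str:
    """Hash every declared input, including the tool version, into one key."""
    digest = hashlib.sha256()
    digest.update(tool_version.encode("utf-8"))
    for name in sorted(inputs):
        data = inputs[name]
        digest.update(b"\0" + name.encode("utf-8") + b"\0")
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()


class BuildCache:
    """In-memory content-addressed cache keyed by the input hash."""

    def __init__(self) -> None:
        self._store: dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def build(
        self,
        inputs: dict[str, bytes],
        tool_version: str,
        action: Callable[[dict[str, bytes]], bytes],
    ) -> bytes:
        key = cache_key(inputs, tool_version)
        if key in self._store:       # cache hit: reuse the stored output
            self.hits += 1
            return self._store[key]
        self.misses += 1             # cache miss: run the build and store it
        output = action(inputs)
        self._store[key] = output
        return output
```

This only works safely when the action is hermetic: if the action read anything outside `inputs`, a cache hit could return a stale or wrong artifact.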
Real-World Examples
Let's examine concrete scenarios that illustrate hermeticity in action.
Example 1: The Non-Hermetic Python Build
Scenario: A team has a Python application with this build script:
#!/bin/bash
## build.sh - NON-HERMETIC VERSION
pip install -r requirements.txt
python setup.py build
python -m pytest
tar -czf app.tar.gz dist/
Why it's not hermetic:
- Undefined Python version: Uses whatever `python` is on the PATH (could be 3.8, 3.9, 3.11...)
- Unpinned dependencies: `requirements.txt` contains `flask>=2.0` and `requests`. These will fetch different versions on different days!
- System pip: Uses system-installed pip (version varies)
- Timestamp in tarball: `tar -czf` embeds file timestamps
- Ambient pytest: Uses whatever pytest is installed
Consequences:
- Developer A builds on Monday with Flask 2.0.1
- Developer B builds on Friday with Flask 2.3.0 (just released)
- Different behavior, different bugs, different security profiles
- CI/CD might produce different artifacts than local builds
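One way to catch the unpinned-dependency problem before it causes drift is a small lint step over the requirements file. The sketch below is a simplified check, assuming plain `name==version` lines; the function name `unpinned_requirements` is hypothetical, and the parsing does not cover every form pip accepts (for example URL requirements).

```python
def unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that are not pinned to one exact version."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()      # drop comments and whitespace
        if not line or line.startswith("-"):     # skip blanks and pip options
            continue
        requirement = line.split(";", 1)[0].strip()  # ignore environment markers
        # An exact pin uses "==" without a wildcard such as "==2.*".
        if "==" not in requirement or "*" in requirement:
            unpinned.append(requirement)
    return unpinned
```

Run against the requirements from this example, it flags both `flask>=2.0` and `requests`; a fully pinned lock file yields an empty list.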
Example 2: The Hermetic Python Build
Improved version:
#!/bin/bash
## build.sh - HERMETIC VERSION
## Use exact Python version from container
docker run --rm -v "$(pwd)":/workspace \
  python:3.11.2-slim \
  /bin/bash -c '
    set -euo pipefail  # stop at the first failing step
    cd /workspace
    # Install exact pinned versions
    pip install --no-cache-dir -r requirements-lock.txt
    # Run build
    python setup.py build
    # Test with deterministic settings
    PYTHONHASHSEED=0 python -m pytest
    # Create deterministic tarball
    tar --sort=name --mtime="2023-01-01 00:00:00" \
      --owner=0 --group=0 --numeric-owner \
      -czf app.tar.gz dist/
  '
With requirements-lock.txt:
flask==2.0.1
requests==2.28.1
werkzeug==2.0.1
click==8.0.1
## ... all transitive dependencies pinned
Why it's hermetic:
- Fixed Python: `python:3.11.2-slim` names a specific release (tags can be re-pushed, so pin the image by digest to make it truly immutable)
- Pinned dependencies: Exact versions including transitive deps
- Isolated environment: Docker container provides clean sandbox
- Deterministic tarball: Timestamps, ordering, ownership all fixed
- Deterministic tests: `PYTHONHASHSEED=0` disables hash randomization
Example 3: Bazel - Hermetic by Design
Google's Bazel build system enforces hermeticity through its architecture:
## BUILD.bazel
py_library(
    name = "mylib",
    srcs = ["mylib.py"],
    deps = [
        "@pypi//flask:pkg",  # External dependency
        "@pypi//requests:pkg",
    ],
)
py_binary(
    name = "myapp",
    srcs = ["main.py"],
    deps = [":mylib"],
)
## WORKSPACE - declares external dependencies
## (pip_install is the older rules_python API; newer releases use pip_parse)
load("@rules_python//python:pip.bzl", "pip_install")
pip_install(
    name = "pypi",
    requirements = "//:requirements-lock.txt",  # label of a file in the root package
)
Bazel's hermetic guarantees:
- Sandbox execution: Each build action runs in a filesystem sandbox
- Explicit dependencies: If not declared in `deps`, it's not available
- Content-addressed cache: Outputs cached by hash of inputs
- Toolchain management: Even compilers/interpreters are hermetic inputs
- Remote execution: Can run builds on remote servers transparently
[Diagram: Bazel hermetic architecture. A build request becomes an action; a hash of the action's inputs is computed and looked up in the cache. On a hit, the cached outputs are reused. On a miss, the declared inputs are copied into a sandbox that contains declared inputs only, the build action executes there, and its outputs are copied back to the cache.]
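The "declared inputs only" idea can be illustrated with a toy model in Python. This is not how Bazel implements sandboxing (real sandboxes rely on operating-system isolation, and a temporary directory alone does not stop an action from reading host files); it only shows the copy-in, execute, copy-out shape. The name `run_in_sandbox` and its parameters are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Callable


def run_in_sandbox(
    declared_inputs: dict[str, bytes],
    action: Callable[[Path], None],
    declared_outputs: list[str],
) -> dict[str, bytes]:
    """Run an action in a fresh directory that holds only the declared inputs."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for rel_path, data in declared_inputs.items():  # copy inputs in
            target = root / rel_path
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(data)
        action(root)                                     # execute the action
        # Copy out only the outputs the action promised to produce.
        return {name: (root / name).read_bytes() for name in declared_outputs}
```

An action that tries to read a file it did not declare fails with `FileNotFoundError`, which mirrors the behavior that makes missing `deps` entries visible early.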
Example 4: Docker - Partial Hermeticity
Docker is often mistaken for being hermetic, but it requires discipline:
Non-hermetic Dockerfile:
FROM ubuntu:latest
RUN apt-get update && apt-get install -y python3
COPY requirements.txt .
RUN pip3 install -r requirements.txt
COPY . .
CMD ["python3", "app.py"]
Problems:
- `ubuntu:latest` changes over time
- `apt-get install` fetches latest packages
- `pip3 install` without pinned versions
- Network access during build
More hermetic Dockerfile:
## Pin base image by digest
FROM ubuntu@sha256:abcd1234...
## Install specific versions
RUN apt-get update && \
apt-get install -y \
python3=3.11.2-1 \
python3-pip=22.0.2+dfsg-1 && \
rm -rf /var/lib/apt/lists/*
## Copy and install pinned deps
COPY requirements-lock.txt .
RUN pip3 install --no-cache-dir -r requirements-lock.txt
## Copy source
COPY . .
CMD ["python3", "app.py"]
Improvements:
- Image pinned by SHA256 (immutable)
- Explicit package versions
- Locked dependencies
- apt package lists removed to reduce variability in the image
Did you know? Even with these improvements, Docker builds aren't fully hermetic because they can access the network and depend on external registries. Tools like Nix and Guix go further by content-addressing everything.
Common Mistakes
Let's explore the most frequent ways developers accidentally break hermeticity:
Mistake 1: Using "Latest" Tags
The problem:
FROM node:latest
FROM python:3
These tags change over time:
- `node:latest` might be 18.x today, 20.x tomorrow
- `python:3` could be 3.9, 3.10, 3.11...
The fix:
FROM node:18.16.0-alpine3.17
FROM python:3.11.2-slim-bullseye
## Even better: pin by SHA256
FROM node@sha256:abcd1234...
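A check for floating base images is easy to automate. The sketch below scans Dockerfile text for `FROM` lines that are not pinned by digest; `unpinned_base_images` is a hypothetical helper, and the parsing is deliberately simple (it ignores `ARG` substitution, for instance).

```python
def unpinned_base_images(dockerfile_text: str) -> list[str]:
    """Return base images in FROM lines that are not pinned by sha256 digest."""
    stage_names: set[str] = set()
    unpinned = []
    for raw in dockerfile_text.splitlines():
        parts = raw.strip().split()
        if not parts or parts[0].upper() != "FROM":
            continue
        # Skip flags such as --platform=linux/amd64.
        args = [p for p in parts[1:] if not p.startswith("--")]
        if not args:
            continue
        image = args[0]
        # "scratch" and earlier build stages are not external images.
        is_local = image == "scratch" or image in stage_names
        if len(args) >= 3 and args[1].upper() == "AS":
            stage_names.add(args[2])
        if not is_local and "@sha256:" not in image:
            unpinned.append(image)
    return unpinned
```

Version tags such as `node:18.16.0-alpine3.17` are still reported, because a tag can be re-pointed at a different image while a digest cannot.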
Mistake 2: Fetching Dependencies During Build
The problem:
## In build script
npm install
go get ./...
wget https://example.com/asset.zip
Network access introduces:
- Version drift
- Availability issues (registry down = build fails)
- Security risks (man-in-the-middle attacks)
The fix:
- Vendor dependencies: Check them into your repository
- Use lock files: `package-lock.json`, `go.sum`, `Pipfile.lock`
- Content-addressed storage: Bazel's `http_archive` with `sha256`
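The checksum idea behind content-addressed fetching is simple to express in Python with the standard `hashlib` module. This is a minimal sketch; `verify_sha256` is a hypothetical helper name, and how the bytes are obtained is left out on purpose.

```python
import hashlib


def verify_sha256(data: bytes, expected_hex: str) -> bytes:
    """Return data unchanged if its SHA-256 matches, otherwise raise."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_hex.lower():
        raise ValueError(
            f"checksum mismatch: expected {expected_hex}, got {actual}"
        )
    return data
```

Because the expected digest is checked into the repository alongside the URL, a changed or tampered download fails the build instead of silently altering the artifact.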
Mistake 3: Reading System Environment Variables
The problem:
## In code that runs at build time (e.g. setup.py or a code generator)
import os
config = os.environ.get('DATABASE_URL')
api_key = os.getenv('API_KEY')
When values like these are read during the build, the result depends on the builder's environment!
The fix:
- Declare required env vars explicitly in build config
- Use configuration files checked into version control
- Inject at runtime, not build time (for secrets)
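Declaring environment variables explicitly can be as simple as constructing the build's environment from an allowlist instead of inheriting everything. The sketch below assumes a hypothetical helper named `declared_env`; the variable names in the usage comment are examples only.

```python
from collections.abc import Iterable, Mapping


def declared_env(
    ambient: Mapping[str, str],
    passthrough: Iterable[str],
    fixed: Mapping[str, str],
) -> dict[str, str]:
    """Build an environment from fixed values plus explicitly declared names."""
    env = dict(fixed)
    for name in passthrough:
        if name not in ambient:
            # Fail loudly rather than build with a silently missing input.
            raise KeyError(f"declared variable {name!r} is not set")
        env[name] = ambient[name]
    return env


# Usage sketch (assumed command and variable names):
#   import os, subprocess
#   env = declared_env(os.environ, ["CC"], {"LANG": "C.UTF-8"})
#   subprocess.run(["make"], env=env, check=True)
```

Anything not listed, such as `HOME` or a stray `API_KEY`, never reaches the build process.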
Mistake 4: Timestamps in Artifacts
The problem:
zip -r app.zip dist/
tar -czf release.tar.gz bin/
## Both record file modification times, so the archive changes whenever files are regenerated
The fix:
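## zip: normalize file mtimes first (GNU touch); -X drops extra fields such as UID/GID
find dist/ -exec touch -d "2023-01-01 00:00:00 UTC" {} +
find dist/ -type f | sort | TZ=UTC zip -X app.zip -@
## tar with fixed mtime; gzip -n leaves the name and timestamp out of the gzip header
tar --sort=name \
  --mtime="2023-01-01 00:00:00 UTC" \
  --owner=0 --group=0 --numeric-owner \
  -cf - bin/ | gzip -n > release.tar.gz
## Or derive the mtime from SOURCE_DATE_EPOCH (tar does not read it automatically)
export SOURCE_DATE_EPOCH=1672531200
tar --sort=name --mtime="@${SOURCE_DATE_EPOCH}" \
  --owner=0 --group=0 --numeric-owner \
  -cf - bin/ | gzip -n > release.tar.gz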
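Build scripts that stamp a date into an artifact can follow the same convention. Here is a minimal Python sketch; `build_timestamp` is a hypothetical helper, and refusing to fall back to the wall clock is a design choice made for this example (many tools fall back to the current time instead).

```python
import os
from collections.abc import Mapping
from datetime import datetime, timezone
from typing import Optional


def build_timestamp(environ: Optional[Mapping[str, str]] = None) -> datetime:
    """Return the build date from SOURCE_DATE_EPOCH as a UTC datetime."""
    environ = os.environ if environ is None else environ
    raw = environ.get("SOURCE_DATE_EPOCH")
    if raw is None:
        raise RuntimeError(
            "SOURCE_DATE_EPOCH is not set; refusing to embed the wall-clock time"
        )
    return datetime.fromtimestamp(int(raw), tz=timezone.utc)
```

With `SOURCE_DATE_EPOCH=1672531200`, the function returns midnight UTC on 2023-01-01 on every machine, regardless of the local time zone.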
Mistake 5: Implicit Tool Dependencies
The problem:
build:
gcc -o myapp main.c
strip myapp
This assumes:
- `gcc` is installed
- Specific version/configuration
- `strip` utility is available
The fix:
## Declare exact toolchain
FROM gcc:12.2.0-bullseye AS builder
WORKDIR /build
COPY . .
RUN gcc -o myapp main.c && strip myapp
Mistake 6: File System Ordering
The problem:
import os
files = os.listdir('src/')
for f in files:  # Order is non-deterministic!
    process(f)
Filesystem iteration order varies by OS, filesystem type, and even kernel version.
The fix:
import os
files = sorted(os.listdir('src/'))  # Explicit sort
for f in files:
    process(f)
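The same fix extends to nested directories. The sketch below wraps `os.walk` so that both the descent order and the file order are explicit; `sorted_walk` is a hypothetical helper name.

```python
import os
from collections.abc import Iterator


def sorted_walk(root: str) -> Iterator[str]:
    """Yield file paths under root, relative to root, in a stable order."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # in-place sort controls the order os.walk descends
        for filename in sorted(filenames):
            yield os.path.relpath(os.path.join(dirpath, filename), root)
```

Sorting `dirnames` in place matters: `os.walk` consults that list when deciding which subdirectory to visit next, so sorting only the filenames would leave the directory order filesystem-dependent.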
Key Takeaways
Let's consolidate everything you've learned about hermeticity:
The Core Principles
- Same inputs → Same outputs (always, everywhere)
- Declare everything explicitly (no hidden dependencies)
- Isolate from the host (containers, sandboxes, VMs)
- Make it deterministic (eliminate randomness and timestamps)
- Enable caching (content-addressed storage)
Benefits You'll Gain
| Benefit | Why It Matters |
|---|---|
| Easier debugging | Reproduce any build exactly as it was |
| Faster builds | Aggressive caching, incremental builds |
| Team consistency | Everyone gets identical artifacts |
| Security | Audit supply chain, verify integrity |
| Scalability | Distribute builds, remote execution |
Hermetic Build Checklist
Use this before declaring a build hermetic:
- All dependencies pinned to exact versions
- Build runs in isolated container/sandbox
- No network access during build (or only to declared, checksummed resources)
- No reading of system environment variables (except explicitly declared)
- Timestamps handled deterministically (`SOURCE_DATE_EPOCH`)
- File ordering explicit (sort before processing)
- Compression/archiving uses deterministic flags
- Build tools versioned and declared
- Tested on different machines/environments
- Output artifacts bit-for-bit identical across runs
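The last checklist item can be turned into an automated check: run the build more than once and compare digests. The sketch below is a simplified model in which the build is a Python callable returning artifact bytes; `is_reproducible` is a hypothetical helper, and a real check would also vary the machine, time zone, and build path between runs.

```python
import hashlib
from typing import Callable


def is_reproducible(build: Callable[[], bytes], runs: int = 2) -> bool:
    """Run the build several times and report whether all outputs match."""
    digests = {hashlib.sha256(build()).hexdigest() for _ in range(runs)}
    return len(digests) == 1
```

Identical output on one machine is necessary but not sufficient, which is why the checklist also asks for testing on different machines and environments.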
Tools for Hermetic Builds
| Tool | Approach | Best For |
|---|---|---|
| Bazel | Hermetic by design, sandbox execution | Large monorepos, polyglot projects |
| Nix | Functional package manager, content-addressed | System-level reproducibility |
| Docker | Containerization (requires discipline) | Quick wins, microservices |
| Buck2 | Hermetic, remote execution | Meta's open-source Bazel alternative |
| Pants | Hermetic Python/Go/etc builds | Monorepos with strong Python focus |
Mental Models
Mnemonic: H.E.R.M.E.T.I.C.
- Hash all inputs
- Explicitly declare dependencies
- Reproduce anywhere
- Make it deterministic
- Eliminate network access
- Toolchains under control
- Isolate from host
- Cache aggressively
Analogy: Recipe vs. Meal Kit
- Non-hermetic build = Recipe saying "add some flour, use fresh eggs"
- Hermetic build = Meal kit with exact measured ingredients, tools included, step-by-step instructions
Next Steps
To deepen your understanding:
- Try it yourself: Take an existing project and make its build hermetic
- Compare builds: Use `diffoscope` to find non-determinism
- Read tool docs: Explore Bazel or Nix documentation
- Join the community: reproducible-builds.org has resources and discussions
Quick Reference Card
| Concept | Summary |
|---|---|
| Hermeticity | Same inputs always produce identical outputs |
| Three Pillars | Isolation, Declaration, Determinism |
| Key Enemies | Timestamps, network, file ordering, ambient env vars |
| Best Tools | Bazel (general), Nix (system-level), Docker (with discipline) |
| Golden Rule | If you can't reproduce it, you don't control it |
| Quick Win | Pin all dependency versions, run builds in Docker |
Further Study
Deepen your knowledge with these resources:
Bazel Documentation on Hermetic Builds - https://bazel.build/concepts/hermeticity - Official guide from Google's Bazel team explaining hermetic principles and implementation
Reproducible Builds Project - https://reproducible-builds.org/ - Community effort with tools, guides, and best practices for achieving bit-for-bit reproducible builds
Nix Package Manager Manual - https://nixos.org/manual/nix/stable/ - Deep dive into purely functional package management and system-level hermeticity
Congratulations! You now understand what hermeticity means, why it matters, and how to achieve it in your build systems. The journey to fully hermetic builds takes time, but every step toward hermeticity makes your software more reliable, debuggable, and maintainable.