You are viewing a preview of this lesson. Sign in to start learning
Back to Kafka for .NET Developers 2026

Core Kafka Mental Model

Master the fundamental concepts of Kafka before touching any .NET client. These ideas must feel boring before you move on.

Last generated

Why Kafka Exists: The Problem It Was Built to Solve

Picture a system you've probably built or inherited: an order service that needs to notify a warehouse service when a purchase completes. You wire them together directly — an HTTP call, maybe a database flag, perhaps a traditional message queue. It works. Then the analytics team needs order data too. Then fraud detection. Then the loyalty points service. Each new consumer means touching the producer, negotiating contracts, adding retry logic, and praying that a slow downstream service doesn't cascade into your order flow. What started as one clean integration has become a tightly coupled web — and every new requirement makes it more fragile.

Kafka exists because this problem isn't an edge case; it's the default trajectory of growing systems.

The Coupling Problem with Point-to-Point Messaging

Traditional point-to-point messaging embeds a hidden assumption: the producer must know about its consumers. The coupling shows up in two ways:

  • Structural coupling — The producer's code or configuration must change every time a new consumer appears. Adding fraud detection to an existing order flow means a deployment of the order service, not just the fraud service.
  • Temporal coupling — Producer and consumer must often be available at the same time. If the consumer is down, the producer either blocks, retries indefinitely, or drops the message.
[Point-to-Point: Every new consumer adds a new wire]

Order Service ──────────► Warehouse Service
Order Service ──────────► Analytics Service        (added later)
Order Service ──────────► Fraud Service            (added later)
Order Service ──────────► Loyalty Points Service   (added later)

Result: Producer must be modified or reconfigured for each addition.
        Failure in any consumer can stall or break the producer's flow.

Each arrow is a dependency — a deployment risk, a contract to maintain, and a potential point of failure. A common pattern that grows from this is the "god service": one central service that knows about everything because it's the only place that can coordinate all cross-cutting concerns. That service becomes the hardest thing in your system to change and the most dangerous to deploy.

The Deletion Problem with Traditional Queues

Traditional message queues operate on a destructive read model. A consumer receives a message, processes it, and acknowledges receipt. The broker then deletes the message. This creates serious architectural gaps the moment you ask:

  • "What happened to all orders placed between 2 AM and 4 AM when our analytics service was down?"
  • "Can we replay the last hour of events to debug why fraud detection is producing false positives?"
  • "We need to onboard a new data warehouse — can it read historical events?"

With a traditional queue the answer is almost always: no, not without extra infrastructure.

[Traditional Queue: Messages deleted on consumption]

Producer ──► [Queue] ──► Consumer A reads message → message deleted

Consumer B arrives later:
  "I'd like to read that message."
  Broker: "What message?"  ← gone forever

To add replay, you'd need:
  - A separate database log
  - Custom archival jobs
  - Schema synchronization between log and queue

Kafka's Answer: The Log as the Primary Data Structure

Kafka's foundational insight is to flip the model. Instead of treating a message as a transient envelope to be delivered and discarded, Kafka treats the log as the primary data structure. An event is written to the log once and stays there. Consumers read from the log at their own pace. When a consumer is done reading, the log is unchanged — the event is still there, available to be read again by a different consumer, or by the same consumer starting over.

[Kafka: Producers write to the log; consumers read independently]

Order Service ──► [Topic: orders] ──► Warehouse Service (reads from offset 0)
                       │           ──► Analytics Service (reads from offset 0)
                       │           ──► Fraud Service     (reads from offset 0)
                       │           ──► Loyalty Service   (reads from offset 0)
                       │
                  append-only log
                  events stay until retention expires

Adding a new consumer:
  - Deploy the new service
  - Point it at the topic
  - No changes to Order Service
  - No coordination with other consumers

Kafka decouples producers from consumers in two dimensions simultaneously — space (the producer doesn't know who the consumers are) and time (the consumer doesn't need to be running when the producer writes). The offset — the immutable sequential position of a record in the log — is what makes this possible. Each consumer tracks its own position independently. There is no global "already read" marker that gets destroyed when one consumer reads. Event replay is therefore a first-class operation rather than an afterthought.

What Kafka Is Actually Built For — and What It Isn't

Kafka is designed around specific trade-offs that make it excellent for some problems and actively wrong for others.

Kafka is a good fit for Kafka is not designed for
High-volume event streams Low-latency request/reply (< 1 ms)
Durable, replayable audit logs General-purpose RPC between services
Decoupling producers from multiple consumers Key-based lookups or queries
Ordered processing within a stream Ad-hoc message routing per consumer type
Fan-out to many independent consumers Ephemeral, fire-and-forget notifications

⚠️ Common Mistake: Teams sometimes reach for Kafka as a replacement for direct service-to-service RPC because they've heard it's "scalable." Kafka is not designed for the request/reply pattern — the round-trip latency introduced by broker writes and consumer polling makes it a poor fit for synchronous operations. If Service A needs an immediate answer from Service B, HTTP or gRPC is almost always the right tool. Kafka's strength is in telling the world that something happened, not in asking another service to do something and waiting for the result.

Kafka also makes an explicit bet on throughput over latency. Its batching and sequential I/O model achieves very high throughput, but individual message latency is measured in milliseconds to tens of milliseconds under typical configurations, not microseconds. For most event-driven architectures this is entirely acceptable. For systems where sub-millisecond latency is a hard requirement, different tools exist. This isn't a criticism — it's a description of design intent. Kafka's constraints are what make its guarantees credible.

With that motivation established, the next section grounds it in concrete structure: what brokers are, how topics work as append-only logs, and what a Kafka record actually contains.


The Kafka Data Model: Brokers, Logs, and Events

The concepts here — brokers, topics, offsets, records, retention — are the load-bearing ideas that every configuration decision, every debugging session, and every architectural trade-off will reference.

The Cluster: A Cooperative of Brokers

A Kafka deployment is a cluster made up of one or more brokers. A broker is simply a server process that stores data and serves read and write requests. No single broker holds all the data — each broker is responsible for a subset, divided by a unit called the partition.

┌─────────────────────────────────────────────────────┐
│                   Kafka Cluster                     │
│                                                     │
│  ┌───────────┐  ┌───────────┐  ┌───────────┐        │
│  │  Broker 1 │  │  Broker 2 │  │  Broker 3 │        │
│  │           │  │           │  │           │        │
│  │ Topic A   │  │ Topic A   │  │ Topic B   │        │
│  │ Part 0    │  │ Part 1    │  │ Part 0    │        │
│  │ Topic B   │  │ Topic C   │  │ Topic A   │        │
│  │ Part 1    │  │ Part 0    │  │ Part 2    │        │
│  └───────────┘  └───────────┘  └───────────┘        │
└─────────────────────────────────────────────────────┘

This distribution is the source of Kafka's horizontal scalability. Producers and consumers connect to any broker as an entry point; Kafka routes them to whichever broker holds the data they need. From a client perspective, you address the cluster, not any individual broker.

Topics: The Append-Only Log

The central data structure in Kafka is the topic — a named, append-only log. Every producer write appends a new record to the tail, and the log grows monotonically. It never shrinks in response to reads, and existing records are never modified.

In a queue, reading a message typically removes it. The queue drains as consumers work through it. In Kafka, reading a record leaves it exactly where it is.

Topic: "order-events"

Offset:  0        1        2        3        4
       ┌──────┬──────┬──────┬──────┬──────┐
Log:   │Rec 0 │Rec 1 │Rec 2 │Rec 3 │Rec 4 │ ← tail (new writes go here)
       └──────┴──────┴──────┴──────┴──────┘
         ↑
      (immutable; records are never removed by reads)

This append-only property means two independent services can read the same topic and each get every record, without any coordination. There is no concept of "claiming" a message — each consumer independently tracks its own position.

Offsets: The Immutable Address of Every Record

Every record written to a partition is assigned an offset — an immutable, sequential integer that uniquely identifies its position within that partition. The first record gets offset 0, the second gets offset 1, and so on. Offsets never change and are never reused. If a record at offset 42 is eventually deleted by retention policy, offset 42 is simply gone — the next record is still at whatever offset it was originally assigned.

Offsets matter because they are the mechanism by which consumers track their position in the log. A consumer that has processed up to offset 41 knows exactly where to resume after a restart: it asks for offset 42.

⚠️ Common Mistake: Offsets are scoped to a single partition, not to a topic. A topic with three partitions has three independent offset sequences, each starting at 0. There is no global "topic offset."

Records: The Atom of Kafka Data

The individual unit of data in Kafka is a record. It has four components:

Component Required Notes
Key No Byte array; used for partition routing and compaction
Value No (but almost always present) Byte array; the actual payload
Timestamp Set automatically if omitted Milliseconds since epoch
Headers No Key-value metadata pairs, both as byte arrays

Kafka treats keys and values as raw bytes. Kafka itself enforces no schema. When you write a Message<string, OrderCreated> in .NET, the Confluent client serializes your key and value to bytes before they ever reach a broker. The broker stores those bytes and hands them back to a consumer, which is responsible for deserializing them correctly. Kafka cannot tell you if a producer serialized its value as JSON and a consumer is trying to deserialize it as Protobuf. The contract between producers and consumers lives entirely outside Kafka.

// The raw structure of a Kafka record at the .NET client level.
var message = new Message<string, string>
{
    // Key: used for partition routing.
    // All events for the same order go to the same partition.
    Key = "order-7821",

    // Value: the payload. Kafka stores this as bytes.
    // Nothing prevents a consumer from using a different deserializer.
    Value = "{\"status\": \"placed\", \"total\": 149.99}",

    // Timestamp: optional. If omitted, the client sets it to DateTime.UtcNow.
    Timestamp = new Timestamp(DateTime.UtcNow),

    // Headers: optional metadata for routing, tracing, or versioning signals.
    Headers = new Headers
    {
        { "source-service", System.Text.Encoding.UTF8.GetBytes("order-api") },
        { "schema-version", new byte[] { 0x01 } }
    }
};

The record key is entirely optional — a null key is valid and common. When the key is null, Kafka routes the record to a partition using a round-robin or sticky-partition strategy. The key only becomes meaningful when you need ordering guarantees for a specific entity, for example ensuring all events for a given customer ID land on the same partition in the same sequence.

Retention: Who Decides When Data Disappears?

In a traditional message queue, data disappears when a consumer acknowledges it. In Kafka, retention is controlled by the broker, not by consumers. Two retention policies are available by default:

  • Time-based retention: Records older than a configurable duration are deleted, regardless of whether anyone has read them.
  • Size-based retention: When a partition's log exceeds a configured size, the oldest segments are deleted to make room.

Both can be applied together. Crucially, consumer acknowledgement plays no role in either. A consumer that is offline for two weeks may return to find that the earliest records it hadn't processed are already gone.

Time-based retention (e.g., 7 days):

Offset: 0     1     2     3     4     5     6     7
       ┌─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┐
Log:   │ Day1│ Day1│ Day2│ Day3│ Day5│ Day6│ Day7│ Day8│
       └─────┴─────┴─────┴─────┴─────┴─────┴─────┴─────┘
        ↑─── DELETED (>7 days) ───↑    ↑─── RETAINED ──↑

Consumer position (offset 2) is now behind the retention boundary.
Kafka will report this as an "offset out of range" — the consumer
must reset to the earliest available offset.

⚠️ Common Mistake: A slow or offline consumer whose offset falls behind the retention boundary will encounter an OffsetOutOfRange error on its next poll. This is expected behavior of a time-bounded log. Handling this case gracefully — deciding whether to reset to earliest or latest, and what to do about the gap — must be designed for explicitly.

There is a third retention mode: log compaction. With compaction enabled, Kafka retains the most recent record for each key indefinitely, discarding older records with the same key. This transforms a topic from a time-bounded stream into something closer to a changelog — useful for materializing the current state of entities. Consider an inventory service that publishes a record every time a product's stock level changes: with time-based retention, the full history of stock changes is available for a window; with log compaction, only the latest stock level for each product ID is guaranteed to survive.

The Data Model in One View

A Kafka cluster is a group of broker servers, each holding a slice of the data. Topics are named, append-only logs partitioned across those brokers. Every record on a partition has an immutable offset that consumers use to track their position. Each record is a bytes-in, bytes-out key-value pair with headers and a timestamp — schema is your responsibility. The log persists according to time and size policies, indifferent to whether any consumer has read it.

Producer                Kafka Cluster                     Consumer A    Consumer B
   │                         │                                │              │
   │  write(key, value)      │                                │              │
   │────────────────────────►│                                │              │
   │                    Broker 1                              │              │
   │                   ┌──────────────────────────────┐       │              │
   │                   │ Topic "orders" / Partition 0 │       │              │
   │                   │ [off:0][off:1][off:2][off:3] │       │              │
   │                   └──────────────────────────────┘       │              │
   │                         │                                │              │
   │                         │◄── poll(from offset 2) ────────│              │
   │                         │──── records [2,3] ────────────►│              │
   │                         │                                │              │
   │                         │◄── poll(from offset 0) ─────────────────────► │
   │                         │──── records [0,1,2,3] ──────────────────────► │
   │                         │                                │              │
   │                   (records unchanged; log still has [0,1,2,3])          │

Consumer A and Consumer B both read from the same partition at different offsets, and the log is unchanged by either read. Their positions are completely independent. This is the foundation that the next sections build on.

Term One-Line Definition
Broker A server in the Kafka cluster; stores a subset of partition data
Topic A named, append-only log; divided into one or more partitions
Offset Immutable sequential integer identifying a record's position in a partition
Record Key + value (both bytes) + optional headers + timestamp
Retention Time- or size-based policy governing when records are deleted; consumer-independent
Log compaction Alternative retention mode; keeps latest record per key indefinitely

Producers and Consumers: How Data Flows Through Kafka

Every Kafka system reduces to a single loop: something writes a record, and something else reads it. Understanding that loop precisely — not just loosely — is what separates developers who configure Kafka confidently from those who cargo-cult settings until something breaks in production. This section traces the complete lifecycle of one event, from the moment a producer sends it to the moment a consumer processes it.

The Producer Side: Writing to the Log

A producer is any process that appends records to a Kafka topic. The producer always initiates contact with the broker, never the other way around. A producer's write lands at the tail of a partition's log and is assigned the next sequential offset — permanent and immutable from that point forward.

The most consequential decision a producer makes is its acks setting, which controls how many broker acknowledgements the producer waits for before considering a write successful.

acks=0    Producer fires and forgets. No ack waited for.
          Highest throughput. Data loss on broker crash is silent.

acks=1    Producer waits for the partition leader to write to its local log.
          Good balance for many use cases. Data can be lost if the leader
          crashes before replicas catch up.

acks=all  Producer waits for the leader AND all in-sync replicas (ISR) to
          acknowledge. Strongest durability guarantee. Slightly higher latency.

🎯 Key Principle: acks does not change whether Kafka replicates — replication happens based on your topic's replication factor regardless. acks only controls how long the producer waits before moving on. The durability trade-off is about what the producer knows, not about what the cluster does.

using Confluent.Kafka;

var config = new ProducerConfig
{
    BootstrapServers = "localhost:9092",
    // acks=all: wait for leader + all in-sync replicas.
    // Safest option; small latency cost per produce call.
    Acks = Acks.All
};

using var producer = new ProducerBuilder<string, string>(config).Build();

var record = new Message<string, string>
{
    Key = "order-service",
    Value = "{\"orderId\": 42, \"status\": \"placed\"}"
};

// ProduceAsync returns a DeliveryResult containing the assigned offset.
DeliveryResult<string, string> result = await producer.ProduceAsync("orders", record);

Console.WriteLine($"Record written to partition {result.Partition}, offset {result.Offset}");

With acks=all, the returned DeliveryResult confirms the record has been replicated to all in-sync replicas. With acks=0, the result arrives immediately without a real broker acknowledgement.

⚠️ Common Mistake: Treating acks=1 as "safe enough" without considering replication lag. If the partition leader crashes between writing to its local log and replicating to followers, that record is gone even though the producer received a positive acknowledgement. For financial events, order records, or any data where loss is unacceptable, acks=all paired with a replication factor of at least 3 is the correct starting point.

The Replication Layer: Leaders and Replicas

Before tracing the consumer side, it helps to understand the durability layer that sits between a write and a read. Each partition has exactly one leader broker and zero or more replica brokers. All producer writes go to the leader. All consumer reads also come from the leader. Replicas exist to absorb the leader role if the current leader fails — they are not read replicas in the database sense.

         ┌─────────────────────────────────────────────────────────┐
         │                   Kafka Cluster                         │
         │                                                         │
         │  Broker 1 (Leader for partition 0)                      │
         │  ┌──────────────────────────────────────┐               │
         │  │  Partition 0:  [ 0 ][ 1 ][ 2 ][ 3 ]  │◄── Producer   │
         │  └──────────────────────────────────────┘               │
         │         │ replicate                                     │
         │         ▼                                               │
         │  Broker 2 (Replica for partition 0)                     │
         │  ┌──────────────────────────────────────┐               │
         │  │  Partition 0:  [ 0 ][ 1 ][ 2 ][ 3 ]  │               │
         │  └──────────────────────────────────────┘               │
         │         │ replicate                                     │
         │         ▼                                               │
         │  Broker 3 (Replica for partition 0)                     │
         │  ┌──────────────────────────────────────┐               │
         │  │  Partition 0:  [ 0 ][ 1 ][ 2 ][ 3 ]  │               │
         │  └──────────────────────────────────────┘               │
         └─────────────────────────────────────────────────────────┘
                                    │
                                    ▼
                              Consumer reads from leader (Broker 1)

The in-sync replica set (ISR) is the set of replicas fully caught up with the leader. When acks=all is configured, the leader waits until every broker in the ISR has confirmed the write. If a replica falls behind, it is removed from the ISR temporarily and does not block the acks=all wait. The min.insync.replicas setting enforces a minimum ISR size — if too many replicas fall out, writes can be rejected entirely rather than accepted with reduced durability guarantees.

The Consumer Side: Pulling from the Log

Kafka does not push records to consumers. Consumers pull records from brokers at their own pace.

This pull model has a specific, important consequence: a slow consumer cannot exert backpressure on a producer. If your consumer falls behind, records accumulate on the log (up to retention limits), and the producer continues writing entirely unaffected. This decoupling is deliberate — it is precisely what allows multiple independent consumers to read the same topic at different speeds without interfering with each other.

Producer                   Kafka Broker                 Consumer
   │                            │                           │
   │─── write record ──────────►│                           │
   │◄── ack (based on acks) ────│                           │
   │                            │                           │
   │                            │◄── fetch request ─────────│
   │                            │─── records + offsets ────►│
   │                            │                           │
   │                            │◄── commit offset ─────────│
   │                            │─── ok ───────────────────►│

The consumer asks the broker for records starting from a specific offset, processes whatever comes back, and then explicitly commits its new position. The broker does not track where a particular consumer is unless the consumer tells it.

Offsets and Consumer Position: What Actually Moves

A consumer tracks its position through committed offsets — stored in a special internal Kafka topic called __consumer_offsets — that declares "this consumer group has successfully processed up to this offset in this partition." The consumer writes this record explicitly; the broker does not advance it automatically when the consumer reads.

This distinction has practical consequences:

  • If a consumer reads a record but crashes before committing the offset, it will re-read that record after restarting. This is at-least-once delivery — the default.
  • If a consumer commits the offset before finishing processing, a crash after the commit means that record will not be re-read. This is at-most-once delivery.
  • Exactly-once delivery requires idempotent producers and transactional APIs — not the out-of-the-box behavior.
using Confluent.Kafka;

var config = new ConsumerConfig
{
    BootstrapServers = "localhost:9092",
    GroupId = "order-processor-v1",
    // Start from the earliest available record if no committed offset exists.
    AutoOffsetReset = AutoOffsetReset.Earliest,
    // Disable auto-commit so we control exactly when offsets are committed.
    EnableAutoCommit = false
};

using var consumer = new ConsumerBuilder<string, string>(config).Build();
consumer.Subscribe("orders");

while (true)
{
    ConsumeResult<string, string> result = consumer.Consume(TimeSpan.FromSeconds(5));
    if (result == null) continue;

    Console.WriteLine($"Offset {result.Offset}: {result.Message.Value}");

    // Process the record here...

    // Only commit AFTER successful processing.
    // StoreOffset marks it locally; CommitAsync writes to __consumer_offsets.
    consumer.StoreOffset(result);
    await consumer.CommitAsync();
}

This example disables auto-commit and commits only after processing is complete. If the process crashes between StoreOffset and CommitAsync, the offset is not persisted and the record will be re-consumed after restart — a feature, not a bug, that prevents silent data loss.

⚠️ Common Mistake: Leaving EnableAutoCommit = true (the default) and assuming it means "commit when I'm done." Auto-commit fires on a timer, not on record completion. A record can be auto-committed while it is still being processed in a background thread, turning a partial failure into a silently skipped record. For any processing logic with side effects, explicit commit control is the safer default.

The Full Lifecycle in One View

  1. Producer calls ProduceAsync("orders", record)
     │
  2. Broker assigns offset 42 to the record
     │
  3. Leader writes record to its local log
     │
  4. (If acks=all) Leader waits for ISR replicas to replicate
     │
  5. Broker sends acknowledgement back to producer
     │
     │          ─────── time passes ───────
     │
  6. Consumer polls broker: "Give me records from offset 42 on partition 0"
     │
  7. Broker returns record at offset 42 (and any subsequent records)
     │
  8. Consumer processes the record
     │
  9. Consumer commits offset 43 ("I've processed everything up to 42")
     │
 10. Record remains on the log until retention expires
     (Other consumers can still read offset 42 independently)

Step 10 is worth pausing on. The record is not gone. It sits on the log precisely as long as the retention policy permits, regardless of how many consumers have read it. This is what makes replay, backfill, and independent consumption at different speeds possible. The producer-to-consumer flow is fully asynchronous and decoupled in time — all coordination happens through offsets, a single integer per partition per consumer group.


Practical Mental Model: Reading Kafka Like a .NET Developer

The previous sections described what Kafka is structurally. This section takes a different angle: what does that model feel like when you are the developer holding the keyboard? The goal is to make the abstractions concrete enough that the first time you wire up the Confluent .NET client, nothing surprises you at the conceptual level.

The Git Analogy: Thinking in Offsets

The single most useful analogy for a .NET developer approaching Kafka is git. A Kafka topic partition behaves like a git commit log: events are appended to the tail, each one gets a sequential identifier (an offset, analogous to a commit SHA), and reading past events does not affect what anyone else sees or what comes next.

Kafka Partition (like a git log, oldest → newest)

  Offset:  0      1      2      3      4      5      6
           ┌────┐ ┌────┐ ┌────┐ ┌────┐ ┌────┐ ┌────┐ ┌────┐
           │ e0 │ │ e1 │ │ e2 │ │ e3 │ │ e4 │ │ e5 │ │ e6 │  ← tail
           └────┘ └────┘ └────┘ └────┘ └────┘ └────┘ └────┘
                ↑                        ↑
          Consumer A                Consumer B
          (reading at 1)           (reading at 4)

  Neither consumer affects the other or the log itself.

The analogy has limits worth naming: unlike git, a Kafka partition is append-only — you cannot rewrite history. And retention is time- or size-based, so old offsets eventually become unavailable. But for building day-to-day intuition, the git commit log model holds surprisingly well. When you think "I want to re-read those events from yesterday," you are thinking like a git user doing git log --since=yesterday. Kafka supports exactly that mode of thinking — seek to an older offset and read forward.

A Minimal Producer in the Confluent .NET Client

A producer needs exactly three things to function:

  1. Bootstrap server address — how to find the cluster initially
  2. Key serializer — what to write to the wire for the record key
  3. Value serializer — what to write to the wire for the record value

Every other configuration key has a default you are implicitly accepting. Those defaults are not always wrong, but they represent choices: about batching, about durability (acks), about retry behavior.

using Confluent.Kafka;

// BootstrapServers is the only field required for a valid cluster connection.
// Acks, BatchSize, LingerMs, etc. all have defaults you are accepting implicitly.
var config = new ProducerConfig
{
    BootstrapServers = "localhost:9092"
};

// The type parameters <TKey, TValue> declare your serialization contract.
// StringSerializer is built-in; for custom types you implement ISerializer<T>.
using var producer = new ProducerBuilder<string, string>(config)
    .SetValueSerializer(Serializers.Utf8)
    .SetKeySerializer(Serializers.Utf8)
    .Build();

var result = await producer.ProduceAsync(
    topic: "order-events",
    message: new Message<string, string>
    {
        Key   = "order-123",
        Value = "{\"status\":\"placed\"}"
    });

Console.WriteLine($"Delivered to partition {result.Partition}, offset {result.Offset}");

Notice what is absent: no schema registry, no topic creation, no explicit acknowledgement configuration. The acks default in librdkafka is all, which is the safest setting — but if throughput is your priority, you will likely tune this later. The point is that the producer works with just a server address and serializers. Complexity is opt-in.

🎯 Key Principle: The minimal viable producer has three requirements — bootstrap address, key serializer, value serializer. Every other config is a default you are choosing to accept. Learn what those defaults are before you tune them.

A Minimal Consumer: Group ID Is Not Optional

The consumer has one additional mandatory field that catches many developers off guard: the group ID. This is not a cosmetic label. Kafka uses the group ID as the identity under which your consumer commits its progress. Without it, the broker cannot track where you left off, and the client will throw a configuration exception before it connects.

The second required decision is auto-offset-reset policy — what to do when your consumer starts and there is no previously committed offset for this group. Earliest starts from the oldest available record; Latest starts from records written after this consumer started. Neither is universally correct; the right answer depends on whether missing events during a first deployment is acceptable for your use case.

using Confluent.Kafka;

var config = new ConsumerConfig
{
    BootstrapServers  = "localhost:9092",
    GroupId           = "order-processor-v1",  // required — this is your consumer's identity
    AutoOffsetReset   = AutoOffsetReset.Earliest
    // EnableAutoCommit defaults to true — offsets are committed on a timer.
    // For production you often want EnableAutoCommit = false and manual commits.
};

using var consumer = new ConsumerBuilder<string, string>(config)
    .SetValueDeserializer(Deserializers.Utf8)
    .SetKeyDeserializer(Deserializers.Utf8)
    .Build();

consumer.Subscribe("order-events");

var cts = new CancellationTokenSource();

try
{
    while (!cts.IsCancellationRequested)
    {
        var record = consumer.Consume(cts.Token);

        Console.WriteLine(
            $"Partition {record.Partition}, " +
            $"Offset {record.Offset}: " +
            $"[{record.Message.Key}] {record.Message.Value}");

        // With EnableAutoCommit = false, call consumer.Commit(record) here.
    }
}
catch (OperationCanceledException)
{
    consumer.Close(); // flushes offsets and sends a leave-group request to the broker
}

Calling consumer.Close() rather than just disposing the consumer matters. Close() sends a leave group request to the broker, triggering an immediate partition rebalance among surviving consumers. If you skip it and let the session time out, your partitions stay unassigned until the session timeout expires — which can be tens of seconds depending on configuration.

🧠 Mnemonic: BAGBootstrap servers, Auto-offset-reset, Group ID. If any of the three is missing, the consumer either throws immediately or behaves unpredictably.

Serialization Is Your Contract — Not Kafka's

Kafka is a byte-transport system. When a producer writes a record, Kafka stores whatever bytes the serializer produces. When a consumer reads that record, Kafka hands back those same bytes. Kafka itself has no awareness of the schema, type, or encoding. It applies no validation.

The consequence: if your producer serializes a value as UTF-8 JSON and your consumer tries to deserialize it as a Protobuf message, you will not get a clear deserialization exception at the Kafka layer. The failure mode is silent data corruption, not a loud exception that points at the mismatch.

Producer side                     Kafka               Consumer side
─────────────────                ─────────          ────────────────────────
OrderEvent object                                    ??? (deserialized as...)
    │                                                         ▲
    │  JsonSerializer<OrderEvent>                             │  ProtobufDeserializer<PaymentEvent>
    ▼                                                         │
 {"orderId":"123"}  ──── raw bytes ──→  [0x7B 0x22 0x6F...] ─┘

                                  Kafka sees ONLY raw bytes.
                                  No schema check. No type check.
                                  Mismatch = silent corruption.

⚠️ Common Mistake: Assuming that type parameters on ProducerBuilder<TKey, TValue> and ConsumerBuilder<TKey, TValue> create any kind of runtime contract between producer and consumer. They do not. Those generics are purely compile-time conveniences within a single process. The bytes on the wire carry no type metadata unless your serializer explicitly encodes it — for example, Confluent's Schema Registry with Avro or Protobuf adds a schema ID prefix to each message.

The practical discipline:

  • Treat the serializer/deserializer pair as a shared interface between teams, versioned like a public API.
  • When onboarding a new consumer, confirm the exact serializer the producer uses before writing a single line of consumer code.
  • In production systems, a schema registry enforces this contract at write time.

A common way this surfaces in practice: a producer upgrades from plain string values to JSON-encoded strings (still string type, but now with structure inside). Consumers that previously treated the value as a raw identifier now receive JSON blobs they were not expecting. No Kafka error. No exception. Just unexpected data flowing into downstream logic until someone notices an anomaly in application behavior.


Common Misconceptions That Block Progress

Every mental model you carry into Kafka will either accelerate you or quietly sabotage you. These four misconceptions are the specific wrong beliefs that cause developers to write code that works in a demo and breaks in production.

Misconception 1: Kafka Is a Replacement for a Database

Wrong thinking: "We're already storing events in Kafka, so we can query it like a database when we need to look up a user's latest order."

Correct thinking: Kafka is a log transport — its job is to move events reliably from writers to readers, not to serve as a queryable store of record.

The structural differences matter enormously:

  • No key-based lookups. There is no equivalent of SELECT * FROM orders WHERE order_id = 42. Records are accessible only by partition and offset.
  • No indexes. Kafka maintains only one internal pointer: the offset.
  • Retention is finite by default. A topic configured with seven-day retention will silently discard records after that window. If your consumer falls behind by more than the retention window, those events are gone.

A team that stores payment events in Kafka with the intent of querying them for audits six months later will find the records gone — retention expired weeks ago. The correct architecture has a consumer writing those events to a durable store (a database, a data warehouse, or a compacted topic with an explicit retention strategy) and queries that instead.

Kafka is well-suited for being the pipeline that feeds a query-capable store. Tools like ksqlDB or Kafka Streams let you build materialized views derived from Kafka topics — but those are separate systems consuming from Kafka, not Kafka itself becoming queryable.

🎯 Key Principle: Use Kafka to move events. Use a database to answer questions about events.

Misconception 2: Consuming a Message Deletes It

Wrong thinking: "Once my consumer reads a message, it's gone — just like popping from a queue."

Correct thinking: Consuming a record does not modify the log in any way. The record remains in place until retention expires, and any number of independent consumers can read the same record on their own schedules.

This misconception is a direct import from traditional message queue thinking. When a consumer reads a record at offset 47, the broker does nothing to that record — it stays at offset 47, readable by any other consumer that asks for it. The only thing that changes is the consumer offset — a number that your consumer group stores to track how far it has read. That offset is state about the consumer, not state about the log.

Kafka Log (Topic: orders, Partition: 0)
─────────────────────────────────────────────────────
 Offset:  0          1          2          3
 Value:   order_A    order_B    order_C    order_D
─────────────────────────────────────────────────────
         ↑                     ↑
    Consumer Group X      Consumer Group Y
    (read up to 0)         (read up to 2)

Both groups see the full log. Neither group's reads
affect what the other group sees.

This is what enables replay — one of Kafka's most operationally valuable properties. If a downstream system processes events incorrectly due to a bug, you can reset the consumer group's offset back to an earlier point and re-process the entire window of events without any cooperation from the producer. This is impossible in a traditional queue where consumed messages are destroyed.

🤔 Did you know? Kafka does have a compaction mode (cleanup.policy=compact) where the broker retains only the most recent record per key. Even in this mode, individual consumers do not delete records — the broker's compaction process does, asynchronously, on its own schedule. Consumer reads are always non-destructive.

Misconception 3: Higher Replication Factor Always Means Slower Writes

Wrong thinking: "I set replication.factor=3 and my throughput dropped, so replication is a write performance tax proportional to my replica count."

Correct thinking: The write latency a producer experiences depends almost entirely on the acks setting, not on the replication factor. With acks=1, the producer waits only for the partition leader to acknowledge — replicas catch up asynchronously and the producer has already moved on.

acks Setting Producer Waits For Durability Typical Latency Impact
0 Nothing (fire and forget) None Lowest
1 Leader acknowledgement only Moderate Low
all (or -1) Leader + all in-sync replicas Highest Higher

With acks=1, replication is invisible to producer latency. The leader writes the record to its local log, sends the ack, and separately the followers fetch that record asynchronously. The producer is not in that loop.

⚠️ Common Mistake: With acks=1, if the leader crashes after acknowledging but before a replica has replicated that record, the record is lost. The producer received an ack, but the data is gone. This is not a performance trade-off — it is a correctness trade-off. Choosing acks=all with min.insync.replicas=2 gives a much stronger durability guarantee, at the cost of additional latency proportional to replica sync time.

🎯 Key Principle: Replication factor controls the durability ceiling. The acks setting controls how much of that ceiling the producer actually waits to confirm. These are independent dials.

Misconception 4: Kafka Guarantees Exactly-Once Delivery Out of the Box

Wrong thinking: "Kafka is enterprise messaging infrastructure, so it handles exactly-once delivery automatically — I don't need to think about duplicates."

Correct thinking: Kafka's default delivery guarantee is at-least-once. Exactly-once requires deliberate, explicit configuration of both the producer and the consumer, and even then it applies only within specific boundaries.

This is the most consequential misconception on this list because the failure mode is silent. Duplicates appear only during retries — network hiccups, broker leader elections, consumer rebalances — which are exactly the conditions that are hard to reproduce in a test environment.

Producer Write Lifecycle (default configuration)
─────────────────────────────────────────────────
 1. Producer sends record to broker leader
 2. Broker writes record, sends ack
 3. [Network drops ack in transit]
 4. Producer times out waiting for ack
 5. Producer retries — sends the record AGAIN
 6. Broker writes the record a second time
 Result: two identical records in the log
─────────────────────────────────────────────────

The broker did nothing wrong. The producer did nothing wrong. The protocol is working as designed. With default settings (enable.idempotence=false), there is no mechanism to detect that the retry is a duplicate.

Exactly-once in Kafka is achieved through two complementary mechanisms:

  • Idempotent producer (enable.idempotence=true): The broker assigns each producer a PID and tracks sequence numbers per partition. If a retry arrives with the same sequence number, the broker deduplicates it. This handles the retry scenario above — but only for a single producer session.
  • Transactional API: Wraps multiple writes across partitions or topics in an atomic transaction. A consumer configured with isolation.level=read_committed will only see records from committed transactions.
// Enabling idempotence is the minimum bar for avoiding
// producer-retry duplicates. It also implicitly sets
// acks=all and retries to a high value.
var idempotentConfig = new ProducerConfig
{
    BootstrapServers = "localhost:9092",
    EnableIdempotence = true,
    // TransactionalId enables the full transactional API.
    // Required if you need atomic multi-partition writes.
    TransactionalId = "my-transactional-producer-1"
};

using var producer = new ProducerBuilder<string, string>(idempotentConfig).Build();

producer.InitTransactions(TimeSpan.FromSeconds(10));
producer.BeginTransaction();

try
{
    await producer.ProduceAsync("output-topic",
        new Message<string, string> { Key = "k1", Value = "v1" });

    producer.CommitTransaction();
}
catch (ProduceException<string, string>)
{
    producer.AbortTransaction();
    throw;
}

Even with idempotent producers and transactional writes, exactly-once is bounded by the Kafka system boundary. If your consumer reads from Kafka and writes to an external database, that external write is outside Kafka's transaction scope. True end-to-end exactly-once processing that spans Kafka and external systems requires idempotent consumers — designing your downstream processing so that receiving the same record twice produces the same result as receiving it once, for example using upserts keyed on the record's unique identifier.

🧠 Mnemonic: A-E-XAt-least-once by default → Enable idempotence for single-producer dedup → X (transactional) for atomic multi-write. Each level adds guarantees and configuration complexity.


Summary

Kafka exists because tight producer-consumer coupling and destructive message deletion create an architectural ceiling — a point beyond which your integration layer becomes more work than the features it supports. The log-as-primary-data-structure model addresses both problems simultaneously: producers write once without knowing who will read, and consumers read independently without destroying what they've consumed.

The concepts below are the load-bearing walls. Everything that follows — partition assignment strategies, consumer group rebalancing, exactly-once stream processing, schema evolution — is built on top of them.

Concept What to Remember
Kafka's purpose High-throughput, ordered, durable event streams — not RPC, not a database
The log Append-only; reads never modify it; retention is time- or size-based, not acknowledgement-based
Offsets Immutable, per-partition integers; consumers track their own position independently
Records Raw bytes — Kafka enforces no schema; serialization is entirely your responsibility
acks setting Controls how long the producer waits, not whether replication occurs
Pull model Consumers fetch at their own pace; slow consumers cannot slow producers
Kafka is not a database No key-based lookups, no indexes, finite retention by default
Consuming does not delete Records persist until retention expires; multiple consumer groups share the same log
Replication vs. latency acks=1 decouples producer latency from replica count; acks and replication factor are independent dials
Delivery guarantees At-least-once by default; exactly-once requires EnableIdempotence and, for multi-partition atomicity, transactions

⚠️ The most dangerous misconception is the delivery guarantee one — not because it is hard to understand, but because the failure mode is intermittent. Systems built on incorrect assumptions about delivery semantics appear healthy until they don't. Make your delivery guarantee decision explicitly, document it, and design your consumers accordingly.

The next lessons move into partition assignment strategies, consumer group coordination, and configuration tuning — all of which become intelligible only once this base model feels like common knowledge rather than new information. That is the bar to clear before moving on.