
How Uber uses Apache Kafka in production
Uber's Kafka deployment is one of the most extensively documented in the industry, with the company publishing more than a dozen engineering blog posts and conference talks describing how its Streaming Platform team has evolved the infrastructure over a decade. By 2021, Uber was processing trillions of messages per day across tens of thousands of topics, with throughput that had grown from roughly one million to twelve million messages per second over five years. Apache Kafka sits at the centre of nearly every real-time system Uber operates, from matching riders with drivers to billing advertisers to detecting fraud.
Company overview
Uber operates a global ride-hailing, food delivery, and freight platform across more than 70 countries. At peak, thousands of trips are being coordinated simultaneously, each generating a continuous stream of GPS updates, payment events, and state changes from driver and rider applications. That volume, combined with strict latency requirements for matching and pricing, pushed Uber toward an event-driven architecture early in its growth.
Kafka was adopted in early 2015, beginning with a small cluster in a single region. Within a year, the platform was auditing approximately one trillion messages per day. By 2021, that figure had grown to trillions, and the team had built a suite of internal tools - uReplicator, Chaperone, uForwarder, and uGroup - on top of it.
Uber's Kafka use cases
Kafka underpins real-time operations across Uber's core products and its internal data platform. The use cases below span multiple engineering teams and reflect how the system has expanded from transport to advertising and insurance.
Rider-driver matching and surge pricing
GPS location events from rider and driver applications flow into Kafka, where stream processing jobs analyse supply and demand in near real time. The same event streams power dynamic pricing, which updates fares every few seconds. UberEats ETAs are also calculated from Kafka-sourced event streams. These pipelines operate under strict targets: API latency below 5ms and 99.99% availability.
Fraud detection
Streaming analysis of transaction and session events runs continuously to identify fraud patterns. Bot detection and rider session analytics feed into similar pipelines.
Change data capture
The DBEvents framework reads MySQL binary logs via an internal tool called StorageTapper and streams change events into Kafka. Cassandra CDC events follow the same path. Downstream, these events land in Uber's Hadoop data lake via Apache Hudi, which supports upsert-capable writes for incremental updates.
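To illustrate the upsert step at the end of that pipeline, the sketch below writes a batch of change events into a Hudi table with Spark's Hudi datasource. The table, path, and field names are placeholders, and Uber's DBEvents pipeline uses its own ingestion tooling, so treat this as the general shape rather than the actual implementation.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class HudiUpsertSketch {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("cdc-hudi-upsert")
                .getOrCreate();

        // Hypothetical staging table holding change events already landed from Kafka.
        Dataset<Row> changes = spark.read().table("staging.rider_profile_changes");

        // Upsert into the lake table: rows with an existing record key are updated,
        // new keys are inserted. Field names here are placeholders.
        changes.write()
                .format("hudi")
                .option("hoodie.table.name", "rider_profile")
                .option("hoodie.datasource.write.operation", "upsert")
                .option("hoodie.datasource.write.recordkey.field", "row_key")
                .option("hoodie.datasource.write.precombine.field", "binlog_position")
                .mode(SaveMode.Append)
                .save("hdfs:///data/lake/rider_profile");
    }
}
```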
Ad event processing
Impressions and clicks from UberEats advertising flow through a Kafka pipeline that handles ad pacing, budget management, and customer billing. This pipeline operates with exactly-once guarantees, described in more detail under Special techniques below.
Async microservice queuing
More than 1,000 internal consumer services use Kafka as an asynchronous queue via the uForwarder Consumer Proxy rather than consuming directly from brokers. The proxy handles offset management and delivery, shielding application teams from Kafka consumer group mechanics.
Dead letter queues and retry pipelines
The Driver Injury Protection insurance product deducts per-mile premiums on each trip. When payment processing fails, events route through tiered retry topics with increasing delays before landing in a dead letter queue. Independent consumer groups maintain retry pipelines for each failure tier.
Data archival and real-time analytics
Kafka events sink to Apache Pinot for real-time OLAP queries and to Apache Hive for warehouse analysis. This combination gives Uber both low-latency query access to recent data and long-term historical analysis from the same event stream.
Kappa architecture backfill
Streaming and batch jobs share a single codebase. Rather than replaying historical data into Kafka for backfill runs, Uber models Apache Hive as a streaming source in Spark Structured Streaming, allowing the same job logic to run against historical data without reloading it into Kafka.
Scale and throughput
Throughput grew twelve-fold over five years. Much of that growth was absorbed by the Consumer Proxy rather than by adding partitions, since the proxy decouples partition count from consumer service concurrency.
Uber's Kafka architecture
Cluster topology
Uber operates two tiers of Kafka clusters across multiple regions. Regional Clusters receive all producer traffic; producers always publish to their local region. Aggregate Clusters hold cross-region replicas and provide a unified global view of each topic. Replication between tiers is handled by uReplicator, Uber's in-house replacement for Kafka MirrorMaker.
In addition to the regional/aggregate split, clusters are segmented by use case: separate clusters exist for logging, database changelogs, and high-reliability messaging. This isolation contains blast radius when one cluster has problems and allows retention and replication policies to be tuned per use case.
A Kafka Cluster Federation layer presents multiple physical clusters as a single logical cluster. A Kafka Proxy handles metadata routing, and a central Metadata Service maintains the registry of which topics live on which physical clusters.
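The federation internals are not public, but the sketch below shows the general idea of metadata routing: resolve a logical topic to the bootstrap servers of the physical cluster that hosts it, then produce as normal. The resolution function stands in for the Metadata Service and is entirely hypothetical.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class FederatedProducerSketch {
    // Hypothetical stand-in for the Metadata Service: maps a logical topic to the
    // bootstrap servers of the physical cluster that actually hosts it.
    static String resolveBootstrapServers(String topic) {
        if (topic.startsWith("logging.")) {
            return "kafka-logging-regional-1:9092";
        }
        return "kafka-core-regional-1:9092";
    }

    public static void main(String[] args) {
        String topic = "logging.api-gateway";

        Properties props = new Properties();
        props.put("bootstrap.servers", resolveBootstrapServers(topic));
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>(topic, "request-id-123", "{\"status\":200}"));
        }
    }
}
```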
Producer architecture
Producers publish through a Local Agent, a persistence layer that buffers messages on the producer side before writing to brokers. This improves durability by ensuring messages survive transient broker unavailability without being dropped at the client. Multi-language producer support covers Java, Go, Python, Node.js, and C++. An internal Kafka REST Proxy provides an HTTP interface for non-JVM producers; through internal optimisation work, its throughput was raised from 7,000 to 45,000 QPS per box.
Consumer architecture
Rather than having each microservice manage its own Kafka consumer group, Uber routes most consumer traffic through uForwarder, a push-based Consumer Proxy. uForwarder fetches messages from Kafka partitions and delivers them to consumer service endpoints via gRPC. This design separates partition count from processing concurrency: a consumer service can scale its processing threads independently of how many Kafka partitions the topic has. Offset commits are managed centrally by the proxy rather than by individual service instances.
uForwarder also handles out-of-order offset commits. Rather than blocking a partition on a slow message, it acknowledges individual messages and commits only contiguous completed ranges. This allows parallel processing within a partition without the risk of losing acknowledgement state.
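A minimal sketch of the contiguous-commit idea: acknowledgements are recorded per offset, and the position handed back to Kafka is the end of the unbroken run of completed offsets. This illustrates the technique only; it is not uForwarder's actual code.

```java
import java.util.TreeSet;

/**
 * Tracks acknowledgements for a single partition and exposes the highest offset
 * that is safe to commit: the end of the contiguous run of completed messages.
 */
public class ContiguousOffsetTracker {
    private final TreeSet<Long> acked = new TreeSet<>();
    private long committedUpTo; // next offset still awaiting acknowledgement

    public ContiguousOffsetTracker(long startOffset) {
        this.committedUpTo = startOffset;
    }

    /** Called when a downstream consumer service acknowledges one message. */
    public synchronized void ack(long offset) {
        acked.add(offset);
    }

    /**
     * Returns the offset to commit to Kafka: everything below it is complete.
     * Offsets acknowledged out of order stay pending until the gap before them closes.
     */
    public synchronized long commitPosition() {
        while (acked.remove(committedUpTo)) {
            committedUpTo++;
        }
        return committedUpTo;
    }
}
```

The value returned by commitPosition would be passed to the Kafka consumer's commit call for that partition; acknowledgements sitting ahead of a gap simply wait until the gap closes.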
Stream processing
Apache Flink is the primary stream processing engine for stateful workloads, including fraud detection and ad event aggregation. Apache Samza runs alongside Flink for some pipelines. Apache Spark Structured Streaming is used for the Kappa architecture backfill jobs described above.
Kafka Connect ecosystem
CDC ingestion uses StorageTapper, Uber's internal MySQL binlog reader, to produce change events into Kafka. Downstream, sinks connect to Apache Hive, HDFS, and Apache Pinot. The Chaperone audit system also consumes every Kafka message as part of the observability pipeline.
Special techniques and engineering innovations
uReplicator: custom cross-cluster replication
Uber replaced Kafka MirrorMaker with uReplicator after MirrorMaker caused weekly production outages. The root cause was consumer group rebalancing: whenever a topic was added or changed, MirrorMaker would pause replication for five to ten minutes while rebalancing completed. uReplicator addresses this with Apache Helix for static partition assignment and a DynamicKafkaConsumer that eliminates rebalance-triggered pauses. New topics can be added to replication at runtime without restarting the cluster. uReplicator also applies header-based filters during replication to prevent cyclic data duplication across federated clusters.
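The cycle-prevention idea can be illustrated with record headers: tag each replicated record with its origin cluster and drop anything that already carries a tag. The header key below is a placeholder; uReplicator's actual header format is internal to Uber.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.header.Header;

public class ReplicationCycleFilter {
    // Hypothetical header key; the real uReplicator header format is not public.
    private static final String ORIGIN_HEADER = "replication.origin-cluster";

    private final String localClusterId;

    public ReplicationCycleFilter(String localClusterId) {
        this.localClusterId = localClusterId;
    }

    /** Returns null when the record has already been replicated and must not loop back. */
    public ProducerRecord<byte[], byte[]> toReplicate(ConsumerRecord<byte[], byte[]> record,
                                                      String targetTopic) {
        Header origin = record.headers().lastHeader(ORIGIN_HEADER);
        if (origin != null) {
            return null; // already a replica; forwarding it again would create a cycle
        }
        ProducerRecord<byte[], byte[]> out =
                new ProducerRecord<>(targetTopic, record.key(), record.value());
        out.headers().add(ORIGIN_HEADER, localClusterId.getBytes());
        return out;
    }
}
```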
Exactly-once ad event processing
Uber's advertising billing pipeline uses a combination of Flink transactional producers, Kafka read_committed consumer isolation, two-minute checkpoint intervals, and per-record UUIDs to achieve end-to-end exactly-once delivery. Deduplication at the sink layer uses Apache Pinot's native upsert capability, with Hive tables keyed on the same UUIDs. This pipeline was built specifically to avoid double-counting billable advertising events.
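A minimal Flink sketch of those configuration pieces, using the open-source Kafka connector API: a read_committed source, a transactional sink with exactly-once delivery, and two-minute checkpoints. Topic names and the pass-through topology are placeholders, not Uber's actual job.

```java
import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.base.DeliveryGuarantee;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ExactlyOnceAdEventsSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Two-minute checkpoints bound how long each Kafka transaction stays open.
        env.enableCheckpointing(Duration.ofMinutes(2).toMillis());

        // Only read events from committed upstream transactions.
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("kafka-regional:9092")
                .setTopics("ad-events-raw")
                .setGroupId("ad-billing")
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .setProperty("isolation.level", "read_committed")
                .build();

        // Transactional sink: records become visible only when the checkpoint commits.
        KafkaSink<String> sink = KafkaSink.<String>builder()
                .setBootstrapServers("kafka-regional:9092")
                .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                        .setTopic("ad-events-billable")
                        .setValueSerializationSchema(new SimpleStringSchema())
                        .build())
                .setDeliveryGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
                .setTransactionalIdPrefix("ad-billing-")
                .build();

        env.fromSource(source, WatermarkStrategy.noWatermarks(), "ad-events")
                .sinkTo(sink);

        env.execute("exactly-once-ad-events-sketch");
    }
}
```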
Kafka tiered storage
Uber implemented Kafka tiered storage using the KIP-405 pluggable storage interface. Recent log segments remain on broker disk for low-latency access. Older segments are offloaded to remote storage (HDFS, S3, GCS, or Azure) transparently, without the consumer needing to know which tier a message is fetched from. This decouples storage retention from broker capacity: longer retention periods no longer require adding brokers or running separate data pipelines to external storage.
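In the open-source KIP-405 implementation that ships with Kafka 3.6+, tiering is enabled per topic once the broker has a remote storage plugin configured. The sketch below creates a topic that retains 90 days of data overall but keeps only one day on local disk; the config keys are upstream Kafka's and may differ from Uber's internal implementation, which predates the upstream feature.

```java
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.NewTopic;

public class TieredTopicSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka-regional:9092");

        try (Admin admin = Admin.create(props)) {
            NewTopic topic = new NewTopic("trip-events", 256, (short) 3)
                    .configs(Map.of(
                            // Keep 90 days in total, but only 1 day on local broker disk;
                            // older segments are served from the remote tier.
                            "remote.storage.enable", "true",
                            "retention.ms", String.valueOf(90L * 24 * 60 * 60 * 1000),
                            "local.retention.ms", String.valueOf(24L * 60 * 60 * 1000)));
            admin.createTopics(Set.of(topic)).all().get();
        }
    }
}
```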
Kappa architecture backfill via Hive as a streaming source
When a streaming job needs to backfill historical data, replaying that data into Kafka adds significant cluster load and can disrupt real-time consumers. Uber avoids this by modelling Apache Hive as a streaming source in Spark Structured Streaming. The same job code handles both real-time Kafka consumption and historical Hive reads, and windowing semantics are preserved across the two modes.
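One way to structure such a shared codebase is to keep the transformation as a function over a Dataset and swap the source per run, as in the sketch below. The Kafka branch uses Spark's standard connector; the Hive-backed streaming source is Uber-internal, so it appears here under a placeholder format name. Schema and topic names are assumptions.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.StructType;
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.from_json;
import static org.apache.spark.sql.functions.window;

public class KappaBackfillSketch {

    /** Job logic shared by the real-time and backfill runs: one-minute counts per city. */
    static Dataset<Row> tripCounts(Dataset<Row> events) {
        return events.groupBy(window(col("event_time"), "1 minute"), col("city_id")).count();
    }

    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder().appName("kappa-backfill-sketch").getOrCreate();
        boolean backfill = args.length > 0 && args[0].equals("backfill");

        StructType schema = new StructType()
                .add("event_time", "timestamp")
                .add("city_id", "string");

        Dataset<Row> events;
        if (backfill) {
            // Placeholder for Uber's internal Hive-as-a-stream source; "hive-streaming"
            // is not a stock Spark format. The table is assumed to match the schema above.
            events = spark.readStream().format("hive-streaming")
                    .option("table", "raw.trip_events").load();
        } else {
            // Standard Spark Kafka source; JSON payloads parsed into the shared schema.
            events = spark.readStream().format("kafka")
                    .option("kafka.bootstrap.servers", "kafka-regional:9092")
                    .option("subscribe", "trip-events")
                    .load()
                    .select(from_json(col("value").cast("string"), schema).as("e"))
                    .selectExpr("e.*");
        }

        tripCounts(events).writeStream()
                .outputMode("complete")
                .format("console")
                .start()
                .awaitTermination();
    }
}
```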
Dead letter queue topology
Uber's dead letter queue implementation uses a multi-tier retry topology: a main consumption topic feeds into retry topics with increasing delay intervals, and messages that exhaust retries land in a dead letter topic. Each tier uses Avro schemas and a leaky bucket pattern for flow control. Multiple independent consumer groups can maintain their own retry pipelines against the same underlying topics.
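The routing decision can be sketched as follows: read the attempt count from a record header, forward the message to the next retry tier, and fall through to the DLQ once the tiers are exhausted. Topic names and the header key are placeholders, and the per-tier delay and Avro schema handling are omitted.

```java
import java.nio.charset.StandardCharsets;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.header.Header;

public class RetryTierRouter {
    private static final String RETRY_HEADER = "retry-count"; // placeholder header key
    private static final String[] RETRY_TOPICS =
            {"payments-retry-1m", "payments-retry-10m", "payments-retry-1h"};
    private static final String DLQ_TOPIC = "payments-dlq";

    private final Producer<byte[], byte[]> producer;

    public RetryTierRouter(Producer<byte[], byte[]> producer) {
        this.producer = producer;
    }

    /** Called when processing a record failed: route it to the next retry tier or the DLQ. */
    public void routeFailure(ConsumerRecord<byte[], byte[]> record) {
        int attempts = readRetryCount(record);
        String target = attempts < RETRY_TOPICS.length ? RETRY_TOPICS[attempts] : DLQ_TOPIC;

        ProducerRecord<byte[], byte[]> out = new ProducerRecord<>(target, record.key(), record.value());
        out.headers().add(RETRY_HEADER,
                String.valueOf(attempts + 1).getBytes(StandardCharsets.UTF_8));
        producer.send(out);
    }

    private int readRetryCount(ConsumerRecord<byte[], byte[]> record) {
        Header h = record.headers().lastHeader(RETRY_HEADER);
        return h == null ? 0 : Integer.parseInt(new String(h.value(), StandardCharsets.UTF_8));
    }
}
```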
uGroup: consumer group visibility via __consumer_offsets decoding
Standard Kafka consumer group monitoring relies on active consumers reporting metrics. This misses consumer groups that are stopped or failing silently. uGroup is a streaming job that decodes Kafka's internal __consumer_offsets topic directly, making all consumer group activity visible regardless of consumer state. It also tracks offset state across regions to support disaster recovery failover.
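The internal topic can be consumed like any other, as the sketch below shows, but its keys and values use Kafka's internal binary schemas, so the decoding step is left as a placeholder rather than guessed at. This is the consumption side only, not uGroup itself.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class ConsumerOffsetsTailSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka-regional:9092");
        props.put("group.id", "offset-monitor");
        props.put("key.deserializer", ByteArrayDeserializer.class.getName());
        props.put("value.deserializer", ByteArrayDeserializer.class.getName());
        // The monitoring consumer should not commit offsets of its own on a hot internal topic.
        props.put("enable.auto.commit", "false");

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("__consumer_offsets"));
            while (true) {
                for (ConsumerRecord<byte[], byte[]> record : consumer.poll(Duration.ofSeconds(1))) {
                    // Keys and values use Kafka's internal schemas (group, topic, partition,
                    // committed offset). Decoding them requires Kafka's internal format
                    // classes and is omitted here.
                    handleRawOffsetRecord(record.key(), record.value());
                }
            }
        }
    }

    static void handleRawOffsetRecord(byte[] key, byte[] value) {
        // Placeholder for decoding and emitting per-group lag metrics.
    }
}
```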
Operating Kafka at scale
End-to-end auditing with Chaperone
Chaperone is Uber's audit system for Kafka pipelines. It consumes every message across all topics and uses ten-minute tumbling windows to compute count, p99 latency, and duplication metrics at four points in the pipeline: the proxy client, the proxy server, regional brokers, and aggregate brokers. It audits more than 20,000 topics and uses write-ahead logging and UUIDs to ensure that audit records themselves are written with exactly-once semantics. When a discrepancy appears, operators can pinpoint which pipeline tier introduced data loss or duplication.
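The counting part of such an audit maps naturally onto a windowed aggregation. The Kafka Streams sketch below counts messages in ten-minute tumbling windows under a single placeholder key; Chaperone itself is a separate internal system that also tracks latency, duplication, and per-tier breakdowns.

```java
import java.time.Duration;
import java.util.Properties;
import java.util.regex.Pattern;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.TimeWindows;

public class AuditCountSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "audit-count-sketch");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka-regional:9092");

        StreamsBuilder builder = new StreamsBuilder();

        // Count messages across all matched topics in ten-minute tumbling windows.
        builder.<byte[], byte[]>stream(Pattern.compile(".*"),
                        Consumed.with(Serdes.ByteArray(), Serdes.ByteArray()))
                .groupBy((key, value) -> "all", // placeholder key; a real audit keys by topic and tier
                        Grouped.with(Serdes.String(), Serdes.ByteArray()))
                .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(10)))
                .count()
                .toStream()
                .foreach((window, count) ->
                        System.out.printf("window=%s count=%d%n", window, count));

        new KafkaStreams(builder.build(), props).start();
    }
}
```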
Consumer group observability with uGroup
uGroup emits lag metrics and stuck-partition alerts for all consumer groups, including those that are not currently running. During a multi-region failover, uGroup provides the offset state mapping needed for consumers to resume from the correct position in the target region.
Schema governance with Schema-Service and Heatpipe
Uber maintains an in-house schema registry called Schema-Service that enforces backward-compatible Avro schema evolution. The Heatpipe library, used by producers, validates messages against the registered schema at ingestion time. This prevents malformed or schema-incompatible data from entering Kafka pipelines.
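Producer-side validation can be approximated by serialising each record against the registered schema before sending, as in the sketch below. The schema and field names are placeholders, and Heatpipe and Schema-Service expose their own internal APIs, so this only shows the underlying Avro mechanism.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;

public class SchemaValidationSketch {
    // Placeholder for a schema fetched from the registry at produce time.
    private static final Schema SCHEMA = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"TripEvent\",\"fields\":["
          + "{\"name\":\"trip_id\",\"type\":\"string\"},"
          + "{\"name\":\"fare_cents\",\"type\":\"long\"}]}");

    /** Serialises the record against the registered schema; fails if it does not conform. */
    static byte[] encode(GenericRecord record) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(SCHEMA).write(record, encoder);
        encoder.flush();
        return out.toByteArray(); // bytes that would be handed to the Kafka producer
    }

    public static void main(String[] args) throws IOException {
        GenericRecord event = new GenericData.Record(SCHEMA);
        event.put("trip_id", "trip-42");
        event.put("fare_cents", 1250L);
        System.out.println("encoded " + encode(event).length + " bytes");
    }
}
```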
Topic ownership enforcement
Uber requires ownership metadata at topic creation. Automated tooling infers ownership where possible. This is part of a broader data culture initiative to ensure every dataset - including Kafka topics - has an identifiable owner who can be contacted during an incident.
Centralised upgrades via Consumer Proxy
Before uForwarder, each of Uber's 1,000+ consumer services maintained its own Kafka client library, often across multiple languages. Upgrading Kafka client versions required coordinating changes across hundreds of services. With uForwarder, the Kafka consumer implementation is centralised in the proxy. Client upgrades happen in the proxy without requiring changes to the services it serves.
Deployment
Uber runs Kafka as a self-managed deployment across its own infrastructure in multiple geographic regions.
Key takeaways for your own Kafka implementation
- Replication tooling matters at scale. MirrorMaker's consumer group rebalancing caused weekly outages for Uber. If you are replicating across clusters or regions, understand how your replication tool handles topic changes and partition rebalancing before it becomes a production problem.
- Partition count and consumer concurrency are separate concerns. Uber's Consumer Proxy approach shows that you do not have to create more partitions to scale consumer throughput. A proxy or multiplexing layer can decouple the two, which is relevant if you have many small consumers or need to avoid partition-count overhead.
- Exactly-once requires a coordinated strategy across producers, consumers, and sinks. Uber's ad billing pipeline combines Flink transactions, Kafka read_committed isolation, and per-record UUIDs with Pinot upserts. No single piece provides the guarantee on its own. If you need exactly-once for a high-stakes pipeline, plan the deduplication strategy at every tier from the start.
- Tiered storage changes the broker-sizing conversation. Decoupling log retention from broker disk means you can extend retention without adding brokers. If your current retention policy is constrained by storage cost, tiered storage is worth evaluating before the next round of broker capacity planning.
- Centralising consumer infrastructure simplifies client upgrades. Coordinating a Kafka client upgrade across hundreds of services in multiple languages is operationally expensive. If you are managing many consumer services, a shared consumer proxy or library layer reduces the coordination overhead significantly.
If you want to explore your own Kafka topics and consumer groups with the kind of visibility Uber has built internally, Kpow offers a free 30-day trial and connects to any Kafka cluster in minutes.