How Shopify uses Apache Kafka in production

Factor House
May 11th, 2026

Shopify processes 1.75 trillion Apache Kafka messages every month across a fleet of self-managed clusters running on Google Kubernetes Engine. At the centre of that infrastructure is a change data capture pipeline that streams mutations from more than 100 MySQL database shards at a sustained 65,000 records per second, peaking at 100,000 records per second on Black Friday. Kafka is how Shopify moves data between its sharded monolith, its real-time processing layer, and its data warehouse, and it has been since 2014. The company was early enough to Kafka that in 2013 it wrote and open-sourced its own Go client library, Sarama, because no suitable one existed.

Company overview

Shopify is a commerce platform that lets merchants of any size build and run online and physical retail operations. As of 2020, the platform had processed over $40 billion in total merchant sales across 175 countries, with a baseline of 2 million requests per minute scaling to 10 million at peak. The engineering organization works primarily in Ruby on Rails and Go.

Shopify adopted Apache Kafka in 2014 as a central event bus for log aggregation and internal event collection. The initial deployment replaced a plain log-file batch system that introduced latency between event generation and availability in downstream dashboards and Hadoop storage. From 2014 to 2016, clusters were managed across data centre regions using Chef. In 2017, Shopify began migrating to Google Cloud Platform, completing the move to Kubernetes-managed Kafka by 2018. The change data capture system, which would become one of the most technically demanding parts of the Kafka estate, was rebuilt on Kafka Connect and Debezium in 2021.

Key milestones:

  • 2013: Shopify open-sources Sarama, a Go client library for Apache Kafka 0.8, built internally to avoid a JVM dependency
  • 2014: Kafka adopted as the internal data bus; Ruby on Rails producer pipeline built using SysV message queues and Go Sarama
  • 2016: Multi-tenant Kafka clusters managed across data centre regions via Chef
  • 2017: Cloud migration begins; flash sales Kafka architecture presented at Kafka Summit San Francisco
  • 2018: Full migration to Kubernetes StatefulSets on GCP; Sam Obeid and Christopher Vollick present the work at Kafka Summit London
  • 2020: Platform processes 1.75 trillion Kafka messages per month; CDC pipeline peaks at 100,000 records/sec during BFCM
  • 2021: Log-based CDC pipeline published; 400TB of CDC data stored in Kafka
  • 2022: HybridSource pattern for Flink and Kafka archival documented; BFCM live map pipeline using Flink, Kafka, and SSE published
  • 2024: BFCM data push reaches 12TB per minute; Kafka partition scaling identified as critical for analytics data freshness

Shopify's Kafka use cases

Event collection and log aggregation

The original use case from 2014. Kafka serves as the company-wide event bus for collecting events and aggregating log data from all internal systems and data centres. The data flows from producers through regional clusters, mirrors to an aggregate cluster, and lands in the data warehouse formatted as Apache Parquet. This pipeline replaced a batch log-file system that could not provide timely data for internal dashboards.

Change data capture

Shopify's core application is a sharded monolith backed by more than 100 independent MySQL database shards. The CDC pipeline streams binary log mutations from every shard into Kafka, presenting downstream consumers with a single compacted topic per logical table rather than requiring them to subscribe to 100+ shard-specific topics.

Before 2021, this was handled by Longboat, a query-based extraction tool that polled tables periodically. The limitation was that hard deletes were invisible: once a row was removed, Longboat could not capture the deletion. The replacement system uses Kafka Connect and Debezium to read MySQL binary logs directly, capturing every insert, update, and delete with a P99 latency of under 10 seconds from database write to Kafka availability.

Real-time buyer signal pipeline (Shopify Inbox)

Shopify Inbox is the merchant-to-customer messaging product. A real-time pipeline combines two Kafka event types: Monorail events (Shopify's internal structured event abstraction) and CDC events. These feed into Apache Beam jobs running on Google Cloud Dataflow that score customer intent signals and inform the Inbox response suggestion system. At non-peak hours, the pipeline processes tens of thousands of cart and checkout events per second.

BFCM live map

During Black Friday and Cyber Monday, Shopify runs a live map showing real-time global sales. Apache Flink reads from Kafka topics, computes per-region aggregations, and publishes results back to Kafka topics. A Golang SSE server subscribes to those output topics and pushes data to web clients. The 2021 BFCM event ingested 323 billion rows of data with end-to-end visualization latency of 21 seconds and 100% uptime throughout the event.
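To make the push layer concrete, here is a minimal Go sketch in the same shape as the live map's SSE server. In production a Kafka consumer on Flink's output topics would feed the updates channel; here the channel stands alone. The frame format follows the Server-Sent Events spec; everything else is illustrative, not Shopify's code.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// sseFrame formats one Server-Sent Events message: each data line is
// prefixed with "data: " and the frame is terminated by a blank line.
func sseFrame(event, data string) string {
	var b strings.Builder
	if event != "" {
		fmt.Fprintf(&b, "event: %s\n", event)
	}
	for _, line := range strings.Split(data, "\n") {
		fmt.Fprintf(&b, "data: %s\n", line)
	}
	b.WriteString("\n")
	return b.String()
}

// liveMapHandler streams aggregates to the browser. In the real pipeline
// the updates channel would be fed by a Kafka consumer.
func liveMapHandler(updates <-chan string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/event-stream")
		w.Header().Set("Cache-Control", "no-cache")
		flusher, ok := w.(http.Flusher)
		if !ok {
			http.Error(w, "streaming unsupported", http.StatusInternalServerError)
			return
		}
		for {
			select {
			case <-r.Context().Done(): // client disconnected
				return
			case u := <-updates:
				fmt.Fprint(w, sseFrame("sales", u))
				flusher.Flush() // push immediately, no buffering
			}
		}
	}
}

func main() {
	fmt.Print(sseFrame("sales", `{"region":"EU","total":1234}`))
}
```

SSE is a good fit here because the data flows strictly server-to-client and reconnection is handled natively by the browser's EventSource API.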

Elasticsearch multi-DC replication

Kafka is used to replicate Elasticsearch index updates across geographic regions, enabling a transition from active-passive to active-active multi-region search.

Microservice messaging

Ruby on Rails and Go services produce events through Monorail, Shopify's internal event abstraction layer. Monorail enforces schemas, supports schema versioning, and defines explicit producer-consumer contracts on top of raw Kafka topics.
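Monorail's internals are not public, but the contract it enforces can be sketched as a versioned event envelope checked against a schema registry. Everything below, including the field names and registry shape, is a hypothetical illustration rather than Monorail's actual format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Envelope is a hypothetical Monorail-style wrapper: every event names the
// schema and version it claims to conform to. (Illustrative only.)
type Envelope struct {
	Schema  string          `json:"schema"`  // e.g. "checkout_started"
	Version int             `json:"version"` // bumped on breaking change
	Payload json.RawMessage `json:"payload"`
}

// Validate rejects events that do not name a registered schema version,
// which is the producer-consumer contract raw Kafka topics lack.
func Validate(e Envelope, registry map[string][]int) error {
	for _, v := range registry[e.Schema] {
		if v == e.Version {
			return nil
		}
	}
	return fmt.Errorf("unknown schema %s v%d", e.Schema, e.Version)
}

func main() {
	registry := map[string][]int{"checkout_started": {1, 2}}
	ev := Envelope{
		Schema:  "checkout_started",
		Version: 2,
		Payload: json.RawMessage(`{"cart_id":7}`),
	}
	fmt.Println(Validate(ev, registry) == nil) // true
}
```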

Scale and throughput

Metric | Value | Source / Context
Monthly Kafka messages | 1.75 trillion | Shopify Engineering Blog, December 2020
Events per day (all clusters) | Billions | Shopify Engineering Blog, November 2018
CDC average throughput | 65,000 records/sec | CDC architecture post, March 2021
CDC peak throughput (BFCM 2020) | 100,000 records/sec | CDC architecture post, March 2021
CDC data stored in Kafka | 400TB+ | CDC architecture post, March 2021
Debezium connectors | ~150 across 12 Kubernetes pods | CDC architecture post, March 2021
Cart/checkout events per second (non-peak) | Tens of thousands | Shopify Inbox pipeline post, December 2021
BFCM 2021 rows ingested | 323 billion | SSE data streaming post, November 2022
BFCM 2024 data push | 12TB per minute | BFCM readiness post, November 2025
Producer pipeline throughput (2014) | Thousands of events/sec; billions per week | Rails producer pipeline post, July 2014

Shopify's Kafka architecture

Cluster topology

Shopify runs multiple regional Kafka clusters, one per GCP region, and a single aggregate cluster. The aggregate cluster receives all data mirrored from the regional clusters via MirrorMaker and serves as the feed for the data warehouse. Clusters use 30-node configurations managed as Kubernetes StatefulSets with Persistent Volumes.

Kafka was previously deployed on VMs managed by Chef. The migration to Kubernetes, completed in 2018, was executed as a three-step process to avoid any downtime: deploy new cloud clusters, mirror the existing on-premises clusters to both aggregate clusters simultaneously, then migrate clients from the data centre to the cloud clusters.

Producer architecture

Rails and Go applications do not connect directly to Kafka brokers. Instead, they write events to a local SysV message queue. A Go-based producer process reads from that queue and produces to Kafka using the Sarama client library. When Shopify began containerising services with Docker, namespace isolation broke the SysV IPC mechanism, so a TCP-to-SysV proxy layer was added on the host to allow containerised services to continue writing to the queue.

This indirection fully decouples application code from Kafka availability: the SysV queue is sized to hold two hours of events, so Kafka cluster maintenance or restarts have zero impact on producers.
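The decoupling can be illustrated with a minimal Go sketch in which a bounded in-process channel stands in for the SysV queue (the real pipeline uses a host-level SysV MQ drained by a Sarama-based producer process; this is not Shopify's code):

```go
package main

import "fmt"

// Buffer stands in for the host-local SysV message queue: application code
// enqueues events here and never holds a connection to Kafka itself.
type Buffer struct {
	ch chan string
}

func NewBuffer(capacity int) *Buffer {
	return &Buffer{ch: make(chan string, capacity)}
}

// Enqueue never blocks on broker availability; it only fails if the local
// buffer itself fills up (Shopify sizes theirs to roughly two hours of events).
func (b *Buffer) Enqueue(event string) bool {
	select {
	case b.ch <- event:
		return true
	default:
		return false // local buffer full: surface backpressure to the app
	}
}

// Drain plays the role of the separate producer process: it reads from the
// local queue and forwards to the real Kafka producer (Sarama, in Shopify's
// pipeline), retrying while the cluster is unavailable.
func (b *Buffer) Drain(produce func(string) error) {
	for ev := range b.ch {
		for produce(ev) != nil {
			// real code would back off here while Kafka is down
		}
	}
}

func main() {
	buf := NewBuffer(1024)
	for i := 0; i < 3; i++ {
		buf.Enqueue(fmt.Sprintf("event-%d", i))
	}
	close(buf.ch)

	var delivered []string
	buf.Drain(func(ev string) error {
		delivered = append(delivered, ev)
		return nil
	})
	fmt.Println(delivered) // [event-0 event-1 event-2]
}
```

The key property is that the application-facing Enqueue path has no failure mode tied to Kafka; broker restarts only affect the drainer.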

CDC pipeline architecture

The CDC pipeline for Shopify's sharded monolith follows a four-stage pattern:

  1. One Debezium MySQL connector per database shard reads from the MySQL binary log.
  2. A RegexRouter transform consolidates events from all shards into intermediate Kafka topics.
  3. A custom Kafka Streams application reads from the intermediate topics and demultiplexes records by table.
  4. Output: one compacted Kafka topic per logical table, partitioned by primary key.

Consumers subscribe to the per-table output topics and see a unified view of each table regardless of how many shards the underlying data spans. Confluent Schema Registry provides data discovery and dependency tracking across CDC topics.
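The routing at the heart of stages 2 to 4 can be sketched in a few lines. The shard-topic naming scheme below is an assumption for illustration (Shopify's actual topic names are not published); the hash partitioner mirrors the FNV-1a hashing Sarama's default partitioner uses.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"regexp"
)

// shardTopic matches hypothetical shard-scoped CDC topic names such as
// "shard_42.products"; the real naming scheme is not published.
var shardTopic = regexp.MustCompile(`^shard_\d+\.(.+)$`)

// tableTopic collapses a shard-specific topic into the per-table output
// topic, mirroring what the RegexRouter + Kafka Streams stages achieve.
func tableTopic(source string) (string, bool) {
	m := shardTopic.FindStringSubmatch(source)
	if m == nil {
		return "", false
	}
	return m[1], true
}

// partitionFor keys the output topic by primary key, so every version of a
// given row lands on the same partition (FNV-1a, as in Sarama's default
// hash partitioner).
func partitionFor(primaryKey string, numPartitions int) int {
	h := fnv.New32a()
	h.Write([]byte(primaryKey))
	return int(h.Sum32() % uint32(numPartitions))
}

func main() {
	for _, src := range []string{"shard_1.products", "shard_97.products", "shard_3.orders"} {
		table, _ := tableTopic(src)
		fmt.Printf("%s -> %s (partition %d for pk=12345)\n",
			src, table, partitionFor("12345", 16))
	}
}
```

Because partitioning is by primary key, consumers see per-row ordering even though the rows originated on different shards.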

Stream processing

A large portion of Shopify's Flink applications use Kafka as both source and sink. The BFCM live map is one example: Flink reads from Kafka, computes aggregations, and publishes results back to Kafka. The Streaming Capabilities team also manages HybridSource pipelines, described in the special techniques section below.

Consumer architecture

Kafka topics have per-topic retention policies. Expired data is archived to Google Cloud Storage rather than deleted, enabling Flink applications to backfill from historical data using the HybridSource connector before switching to live Kafka consumption.

Special techniques and engineering innovations

SysV message queue producer decoupling

Shopify's producer pipeline places a SysV message queue between application code and the Kafka producer process. This pattern predates widespread use of producer buffering at the client level and gives Shopify a host-local buffer that survives Kafka cluster disruptions. The queue holds up to two hours of events, making cluster maintenance transparent to producers.

Compacted per-table CDC topics with cross-shard demultiplexing

Each logical database table is represented as a single compacted Kafka topic partitioned by primary key. Compaction keeps only the most recent record per key, giving consumers a current-state view without replaying full history. The complexity of 100+ underlying MySQL shards is absorbed entirely by the CDC pipeline: a RegexRouter transform consolidates shard-specific events into intermediate topics, and a custom Kafka Streams application repartitions records by table into the final output topics.
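Compaction semantics are easy to demonstrate: replaying a compacted topic and keeping the last value per key yields the current-state view, with nil-valued tombstone records carrying deletes. A self-contained sketch:

```go
package main

import "fmt"

// Record mirrors a Kafka record: a primary-key string and a value, where a
// nil value is a tombstone (compaction eventually drops the key entirely).
type Record struct {
	Key   string
	Value *string
}

// materialize replays a (conceptually compacted) per-table topic into the
// current-state view a consumer would build: latest value per primary key.
func materialize(log []Record) map[string]string {
	state := make(map[string]string)
	for _, r := range log {
		if r.Value == nil {
			delete(state, r.Key) // hard delete captured by log-based CDC
		} else {
			state[r.Key] = *r.Value
		}
	}
	return state
}

func main() {
	v1, v2 := "price=10", "price=12"
	log := []Record{
		{"sku-1", &v1},
		{"sku-2", &v1},
		{"sku-1", &v2}, // update supersedes the earlier value
		{"sku-2", nil}, // delete arrives as a tombstone
	}
	fmt.Println(materialize(log)) // map[sku-1:price=12]
}
```

Note that tombstones are exactly what the old query-based Longboat extractor could not produce, which is why hard deletes were invisible before the Debezium rebuild.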

GCS Serde for large records

CDC events from large database rows can exceed Kafka's default 1MB per-message limit. Shopify implemented a custom Serde that detects oversized payloads, stores them in Google Cloud Storage, and writes a GCS pointer to the Kafka topic. Consumers use the same Serde to transparently retrieve the payload from GCS. Standard Kafka consumers remain compatible without modification.
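This pattern is often called claim-check, and it can be sketched as follows. The pointer encoding, bucket layout, and size threshold here are assumptions, with an in-memory map standing in for GCS; Shopify's actual Serde is not published.

```go
package main

import (
	"fmt"
	"strings"
)

const maxInline = 1 << 20 // Kafka's default ~1MB message ceiling

// BlobStore abstracts GCS for the sketch; an in-memory map stands in.
type BlobStore map[string][]byte

// Serialize applies the claim-check pattern: payloads over the limit are
// written to the external store and replaced on the topic by a pointer.
func Serialize(store BlobStore, key string, payload []byte) []byte {
	if len(payload) <= maxInline {
		return payload // small records pass through untouched
	}
	uri := "gcs://cdc-overflow/" + key // hypothetical bucket layout
	store[uri] = payload
	return []byte("ptr:" + uri)
}

// Deserialize transparently resolves pointer records back to the payload;
// inline records are returned as-is, so plain consumers keep working.
func Deserialize(store BlobStore, msg []byte) []byte {
	if s := string(msg); strings.HasPrefix(s, "ptr:") {
		return store[s[len("ptr:"):]]
	}
	return msg
}

func main() {
	store := BlobStore{}
	big := make([]byte, maxInline+1)
	onTopic := Serialize(store, "products/42", big)
	fmt.Println(len(onTopic) < 100, len(Deserialize(store, onTopic)) == len(big))
}
```

The design choice worth copying is that the externalisation decision lives in the Serde, not the broker config, so one table's oversized blobs never force a cluster-wide message size increase.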

HybridSource for Flink backfill

Kafka topics at Shopify have configured retention limits. When a topic expires data, that data transitions to GCS archives partitioned across thousands of splits. Flink applications that need to backfill historical data use the HybridSource connector to read from the GCS archive first, then seamlessly transition to the live Kafka topic. This removes the need for manual coordination between historical batch jobs and real-time stream processing.
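Conceptually, HybridSource is "drain the bounded source, then switch to the unbounded one". Flink's actual connector is Java and handles split assignment, checkpointing, and the switchover position; this Go sketch illustrates only the hand-off.

```go
package main

import "fmt"

// hybridRead drains the bounded archive first, then switches to the live
// stream, mirroring what Flink's HybridSource does for Kafka backfills
// (GCS archive -> live topic).
func hybridRead(archive []string, live <-chan string, out chan<- string) {
	for _, r := range archive {
		out <- r // bounded historical phase
	}
	for r := range live { // unbounded live phase
		out <- r
	}
	close(out)
}

func main() {
	live := make(chan string, 2)
	live <- "live-1"
	live <- "live-2"
	close(live)

	out := make(chan string, 8)
	hybridRead([]string{"archived-1", "archived-2"}, live, out)
	for r := range out {
		fmt.Println(r)
	}
}
```

The hard part in production, which Flink handles and this sketch does not, is picking the Kafka offset at which the live phase begins so no records are skipped or duplicated at the boundary.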

Kubernetes rack-awareness and rolling restart safety

Kafka broker pods use inter-pod anti-affinity rules to prevent co-located replicas, and Kubernetes zone labels are mapped to Kafka rack configuration to ensure replicas are distributed across availability zones. Readiness probes gate rolling restarts, preventing more than one broker from going offline simultaneously during configuration changes or upgrades. This matters because concurrent broker restarts can trigger terabytes of partition re-replication; the combination of anti-affinity rules and readiness probes is how Shopify avoids it.
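A broker pod template can express the zone-spreading constraint directly. The fragment below is an illustrative sketch of the standard Kubernetes mechanism, not Shopify's actual manifests; the pod label is a placeholder.

```yaml
# Illustrative pod template fragment: require that no two Kafka broker
# pods are scheduled into the same availability zone.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: kafka-broker          # hypothetical pod label
        topologyKey: topology.kubernetes.io/zone
```

The zone a pod lands in is then fed to the broker's `broker.rack` setting, so Kafka's rack-aware replica placement keeps each partition's replicas in distinct zones.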

Open-source contribution to Debezium

During initial CDC deployment, full table snapshots held MySQL read locks for hours at a time, blocking writes on production databases. A Shopify engineer implemented a lock-free snapshot mode for Debezium's MySQL connector and contributed it upstream to the open-source project. This mode has since been available to all Debezium users.
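In current Debezium releases, snapshot locking behaviour is controlled by the MySQL connector's `snapshot.locking.mode` property, where `none` skips table locks during snapshots. A minimal connector config fragment is shown below (connection details omitted; the source does not state whether this exact option descends from Shopify's patch):

```json
{
  "connector.class": "io.debezium.connector.mysql.MySqlConnector",
  "snapshot.mode": "initial",
  "snapshot.locking.mode": "none"
}
```

The trade-off described later in this article applies: with locks disabled, a consistent point-in-time snapshot is not guaranteed if writes continue while the snapshot runs.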

Operating Kafka at scale

Deployment model: Self-managed on Google Kubernetes Engine. Brokers run as Kubernetes StatefulSets with dedicated Persistent Volumes. Kafka pods have resource-monitoring sidecars, node affinity rules to place brokers on dedicated nodes, and inter-pod anti-affinity to spread replicas.

Retention and archival: All Kafka topics have per-topic retention policies. Expired data is archived to Google Cloud Storage to support Flink HybridSource backfills rather than being discarded.

BFCM capacity planning (Game Days): Shopify runs annual chaos-engineering exercises called Game Days starting in spring, simulating 150% of the previous year's BFCM peak load across three GCP regions. Game Days have identified Kafka-specific capacity gaps: the analytics infrastructure required partition count increases to maintain data freshness during traffic spikes, and these increases are now part of the pre-BFCM preparation checklist.

CDC connector management: Approximately 150 Debezium connectors are managed across 12 Kubernetes pods, with one connector per MySQL shard to isolate failure domains. Schema evolution for CDC consumers was an active area of work as of 2021, with the team implementing governance through Confluent Schema Registry for dependency tracking.

Cloud migration protocol: The migration from on-premises VMs to GCP Kubernetes was executed in three phases: deploy cloud clusters, run MirrorMaker to replicate from on-premises to both aggregate clusters simultaneously, then migrate clients. No downtime was required during the transition.

Challenges and how they solved them

JVM dependency for the Kafka client

Kafka's initial client libraries were Java and Scala only. Deploying a JVM instance on all of Shopify's Go-native servers was considered impractical. In 2013, engineer Evan Huus built Sarama, a Go client library for Apache Kafka, and Shopify open-sourced it under the MIT license. Sarama is now maintained by IBM and remains a widely used Go Kafka client.

Docker namespace isolation breaking SysV IPC

When Shopify containerised Ruby on Rails services, Docker's IPC namespace isolation broke the SysV message queue mechanism used to buffer events for Kafka. The producer process ran on the host, but containerised services could no longer write to the host's IPC namespace. The solution was a TCP-to-SysV proxy layer on the host that accepted messages over a TCP socket and wrote them into the SysV queue. Containerised services connect over TCP; the rest of the pipeline is unchanged.

CDC table snapshot locking

Debezium's initial snapshot mode for the MySQL connector held a global read lock for the duration of a full table snapshot. For Shopify's large tables, this meant hours of blocked writes on production databases. A Shopify engineer contributed a lock-free snapshot mode to the upstream Debezium project, reading table data without holding a lock. This resolved the blocking but introduced a different constraint: the lock-free mode cannot guarantee a consistent point-in-time snapshot while binlog events are also being consumed. Tables too large to snapshot within practical timeframes remain a limitation.

Records exceeding the 1MB Kafka message size limit

Certain database rows, particularly those with large text or blob fields, produced CDC events that exceeded Kafka's default message size limit. Rather than raising the broker message size limit globally (which affects all topics), Shopify implemented a custom Serde that externalises oversized payloads to GCS and writes a pointer record to Kafka. Consumers using the same Serde retrieve the payload from GCS transparently.

100+ shard complexity for CDC consumers

Shopify's monolith spans more than 100 MySQL shards. A naive CDC design would require consumers to subscribe to shard-specific topics for every table they care about. Shopify's solution abstracts the sharding entirely: a RegexRouter transform and a custom Kafka Streams demultiplexer consolidate all shard events into unified per-table output topics. Consumers interact only with the per-table topics.

Schema evolution in CDC topics

Breaking changes to internal database schemas propagate to all consumers of the corresponding CDC topics. As of the March 2021 blog post, this was an active challenge with mitigation strategies under development. Confluent Schema Registry is used for dependency tracking.

Kafka partition count insufficient for BFCM analytics throughput

Game Days exercises revealed that analytics Kafka topics needed higher partition counts to sustain data freshness at BFCM scale. Partition count increases are now a standard item in the pre-BFCM preparation checklist.

Full tech stack

Category | Technology | Role
Message broker | Apache Kafka | Central event bus and streaming platform for all internal pipelines
Kafka client | Sarama (originally Shopify/sarama, now IBM/sarama) | Go client library for all Kafka producers; built by Shopify in 2013
Deployment / infra | Kubernetes (GKE, GCP) | StatefulSet-based Kafka broker deployment and lifecycle management
CDC connector | Debezium | MySQL binary log CDC source connector; one instance per database shard
Connectors | Kafka Connect | Orchestrates Debezium connectors and RegexRouter transforms in the CDC pipeline
Schema registry | Confluent Schema Registry | Data discovery and dependency tracking for CDC topics
Stream processing | Kafka Streams | Custom application for CDC cross-shard demultiplexing by table
Stream processing | Apache Flink | Stateful stream processing; reads from and writes to Kafka for real-time analytics and the BFCM live map
Stream / batch processing | Apache Beam / Google Cloud Dataflow | Unified batch and stream processing for the Shopify Inbox buyer signal pipeline
Batch processing | Apache Spark | Batch processing stage in the data platform pipeline
Storage sinks | Google Cloud Storage (GCS) | Archive for expired Kafka topic data; external payload store for oversized CDC records
Storage sinks | Google Cloud Bigtable | Delivery backend for the Reportify query service
Data format | Apache Parquet | On-disk format for data platform processed output
Replication | MirrorMaker | Replicates regional Kafka clusters to the aggregate cluster for data warehouse ingestion
Internal abstraction | Monorail | Internal event layer enforcing schemas and version contracts on top of raw Kafka topics
Streaming server | Go SSE server | Subscribes to Kafka topics and pushes processed data to web clients via Server-Sent Events (BFCM live map)

Key contributors

Name | Title / Team | Contribution
Evan Huus | Engineer, Shopify | Created Sarama, Shopify's open-source Go Kafka client library (2013)
Simon Eskildsen | Engineer, Shopify | Designed the Ruby on Rails Kafka producer pipeline using SysV MQ and Go Sarama (2014)
Sam Obeid | Senior Production Engineer, Shopify | Led Kafka-on-Kubernetes migration; authored the 2018 engineering blog post; speaker at Kafka Summit London 2018
Christopher Vollick | Engineer, Shopify | Speaker at Kafka Summit London 2018, "Kafka in Containers in Docker in Kubernetes in the Cloud"
John Martin | Senior Production Engineer, Streaming Platform | Co-led log-based CDC architecture design and documentation (2021)
Adam Bellemare | Staff Engineer, Data Platform | Co-authored CDC architecture blog post (2021)
Ashay Pathak | Data Scientist, Messaging team | Co-authored Shopify Inbox real-time buyer signal pipeline post (2021)
Selina Li | Data Scientist, Messaging team | Co-authored Shopify Inbox real-time buyer signal pipeline post (2021)
Kevin Lam | Engineer, Streaming Capabilities team | Co-authored Flink optimisation post covering HybridSource and Kafka retention (2022)
Rafael Aguiar | Engineer, Streaming Capabilities team | Co-authored Flink optimisation post (2022)
Bao Nguyen | Senior Staff Data Engineer | Designed and documented the BFCM live map Kafka, Flink, and SSE pipeline (2022)
Arbab Ahmed | Reliability Engineering lead | Co-authored data platform scaling post covering 1.75 trillion monthly Kafka messages (2020)
Bruno Deszczynski | DPE Technical Program Manager | Co-authored data platform scaling post (2020)

Key takeaways for your own Kafka implementation

  • Decouple producers from brokers at the host level. Shopify's SysV message queue layer means application code never holds a direct connection to Kafka. A two-hour local buffer absorbs cluster maintenance windows without producer-side changes. If your producers are tightly coupled to broker availability, consider an intermediary buffer before the producer client.
  • Build a CDC abstraction that hides database sharding from consumers. A one-connector-per-shard approach with a RegexRouter and a Kafka Streams demultiplexer lets you present unified per-table topics to downstream consumers regardless of how many shards exist. This makes the shard count a deployment detail rather than an API contract.
  • Compacted topics are a viable current-state store for CDC data. Shopify uses compacted per-table Kafka topics partitioned by primary key to give consumers the latest record per row without requiring them to replay full history. This is a practical alternative to maintaining a separate read database for certain consumer patterns.
  • Plan for large records before they cause incidents. If your data includes variable-length text or binary fields, the 1MB Kafka message limit will eventually be reached. Shopify's GCS Serde externalises oversized payloads before they reach the broker, keeping the solution transparent to standard consumers and avoiding broker-level configuration changes that affect every topic.
  • Treat BFCM-scale capacity planning as a year-round process. Shopify's Game Days exercises start in spring and run through autumn. The discovery that Kafka partition counts needed to increase for analytics freshness came from load testing, not from production incidents. If your traffic has a predictable peak season, model Kafka throughput requirements explicitly before the peak rather than scaling reactively.

Sources and further reading

If you want visibility into your own Kafka estate, including consumer lag, partition health, and topic throughput, give Kpow a try with a free 30-day trial. You can connect it to any Kafka cluster in minutes and deploy it via Docker, Helm, or JAR.