
How The New York Times uses Apache Kafka in production
The New York Times is one of the largest English-language news publishers in the world, with a digital subscriber base in the millions and content operations spanning text, photography, video, and interactive journalism. Its publishing infrastructure must serve a website, iOS and Android applications, search, and personalisation systems — all simultaneously and with very low latency from the moment a journalist hits publish.
The Times operated multiple legacy content management systems for decades, each built independently with different APIs and schemas. Content from the 1920s was digitised via OCR; content from the late 1990s lived in a CMS that bore no resemblance to the one used a decade later. By 2017, this fragmentation was creating real engineering cost: every downstream service had to integrate with each upstream API individually, schema changes rippled through consumers unevenly, and bootstrapping a new system against the historical content archive meant making per-asset API calls at a scale that created unpredictable load.
The trigger for adopting Kafka was the decision to build a unified publishing pipeline. The pipeline required two properties: unlimited event retention and totally ordered consumption. Kafka satisfies both (retention can be configured to keep events forever, and a single-partition topic yields a strict total order), which made it the only viable platform for treating the content log as a permanent store rather than a transient queue.
The New York Times' Kafka use cases
Content publishing and distribution
The primary use case is the publishing pipeline. When any piece of content is ready for publication, the producing service writes it to the Monolog — a single-partition Kafka topic that acts as the canonical record of everything the Times has ever published. Downstream consumers read from this topic to power their own systems.
The downstream services include:
- The website and native iOS/Android applications, which need published content available with minimal latency after a journalist publishes
- The Elasticsearch cluster powering site search, which requires a denormalised view of content for indexing
- Personalisation systems, which need to reprocess recent content to build recommendations
- Content list services, both manually curated and query-driven
Before the Monolog, each of these systems maintained its own connection to upstream APIs. After the Monolog, they all consume from Kafka, with no knowledge of or dependency on the producing CMS.
Derived views: the Denormalized Log and Skinny Log
Two additional Kafka topics serve specific downstream needs:
The Denormalized Log is a multi-partition topic created by the Denormalizer service (described in the architecture section). It republishes each top-level asset together with all of its resolved dependencies — so Elasticsearch ingestion nodes receive a complete, self-contained payload rather than having to chase references. It is partitioned by top-level asset URI, allowing parallel ingestion.
The Skinny Log carries lightweight processing notifications — signals that an asset has been processed and is available — without the full content payload. Downstream systems use it for cache invalidation and for tracking service level objective (SLO) metrics: end-to-end latency from publish event to downstream availability.
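The SLO measurement over skinny-log notifications amounts to a simple fold over timestamps. A toy model follows; the field names, URIs, and timestamps are invented for illustration and are not the Times' actual message format.

```python
# Hypothetical skinny-log notifications: identifiers and timestamps only,
# never the content payload. All names and values here are invented.
notifications = [
    {"uri": "nyt://article/a1", "published_ms": 1000, "available_ms": 1350},
    {"uri": "nyt://image/i1", "published_ms": 1010, "available_ms": 1200},
    {"uri": "nyt://article/a2", "published_ms": 2000, "available_ms": 2900},
]

def end_to_end_latencies_ms(events):
    """Per-asset latency from publish event to downstream availability."""
    return [e["available_ms"] - e["published_ms"] for e in events]

latencies = end_to_end_latencies_ms(notifications)
print(max(latencies))  # worst-case publish-to-available latency in this window
```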
Scale and throughput
The New York Times' Kafka deployment is not high-frequency by the standards of financial services or ride-sharing platforms. The content corpus is editorial: text, images, and associated metadata. As of 2017, the full archive — every published asset since 1851 — totalled less than 100GB and fit in a single Kafka topic on a single disk.
The small corpus size is a deliberate consequence of the data model: normalisation means each asset is stored once, with references handled via URI rather than duplication. The single-partition Monolog fits comfortably on a single disk, which makes full log replay from offset 0 a practical operational tool rather than a theoretical capability.
The New York Times' Kafka architecture
The Monolog
The Monolog is a single-partition Kafka topic with infinite retention. It holds every published asset in chronological order, with one critical constraint: assets are topologically sorted so that every dependency appears in the log before any asset that references it. An image shared across multiple articles is written once; each article references that image by URI. A consumer reading from offset 0 will always encounter an image before any article that embeds it.
Assets are serialised as Protocol Buffer v3 (proto3) binaries. Every asset is identified by a structured URI: for example, nyt://article/577d0341-9a0a-46df-b454-ea0718026d30. This uniform identifier scheme applies regardless of asset type, making the reference graph consistent and resolvable.
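The structured URI scheme is regular enough to parse generically. A minimal sketch using Python's standard `urlsplit`; the helper name is ours, not the Times':

```python
from urllib.parse import urlsplit

def parse_asset_uri(uri: str) -> tuple[str, str]:
    """Split an nyt:// asset URI into (asset_type, asset_id)."""
    parts = urlsplit(uri)
    if parts.scheme != "nyt":
        raise ValueError(f"not an asset URI: {uri!r}")
    # The host position carries the asset type; the path carries the id.
    return parts.netloc, parts.path.lstrip("/")

print(parse_asset_uri("nyt://article/577d0341-9a0a-46df-b454-ea0718026d30"))
```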
The gateway service
No producer writes directly to the Monolog. All writes pass through a gateway service that validates each asset against the protobuf schema before allowing it into the log. A custom linter enforces forward and backward compatibility on every schema change, ensuring that new field additions or type modifications do not silently break existing consumers. Schema enforcement at the write boundary keeps the log clean from the start rather than requiring downstream consumers to handle malformed messages defensively.
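The effect of write-boundary enforcement can be shown with a toy gateway. The real gateway validates protobuf messages with a custom linter; the dict-based check and field names below are invented stand-ins.

```python
# Invented stand-in for schema validation at the write boundary.
REQUIRED_FIELDS = {"uri", "type", "body"}

monolog = []  # toy single-partition log: an append-only list

def gateway_publish(asset: dict) -> None:
    """Reject malformed assets before they ever reach the log."""
    missing = REQUIRED_FIELDS - asset.keys()
    if missing:
        raise ValueError(f"rejected at gateway; missing fields: {sorted(missing)}")
    monolog.append(asset)  # only schema-valid assets are appended

gateway_publish({"uri": "nyt://article/a1", "type": "article", "body": "..."})
print(len(monolog))  # consumers never see a malformed message
```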
The Denormalizer
The Denormalizer is a Java application built on the Kafka Streams API. It consumes the Monolog and maintains a local state store of the latest version of every asset, along with the reference graph linking assets together.
When a dependency is updated — for example, an image caption is corrected — the Denormalizer identifies every top-level asset that references that image and re-publishes those top-level assets to the Denormalized Log. This means downstream consumers such as Elasticsearch always receive the complete, current bundle of a top-level asset and all its dependencies in a single message, without needing to chase references themselves.
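The republish step is, at its core, a reverse lookup over the reference graph. A toy sketch with invented URIs; the real service keeps this state in a Kafka Streams local store rather than an in-memory dict:

```python
# top-level asset -> assets it directly references (invented data)
references = {
    "nyt://article/a1": {"nyt://image/i1", "nyt://image/i2"},
    "nyt://article/a2": {"nyt://image/i1"},
    "nyt://article/a3": set(),
}

def assets_to_republish(changed_uri: str) -> set[str]:
    """Every top-level asset embedding the changed dependency must be
    re-published to the Denormalized Log with a fresh, complete bundle."""
    return {top for top, deps in references.items() if changed_uri in deps}

print(sorted(assets_to_republish("nyt://image/i1")))
```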
Elasticsearch ingestion
Each Elasticsearch ingestion node runs a Kafka Streams application that reads a partition of the Denormalized Log. It assembles JSON documents from the protobuf payloads and writes them to specific Elasticsearch shards. Because the Denormalized Log is partitioned by top-level asset URI, each ingestion node owns a deterministic subset of the content, enabling parallel ingestion without coordination between nodes.
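Partitioning by top-level asset URI is what makes node ownership deterministic. The sketch below shows stable key-hash partitioning; Kafka's Java client defaults to murmur2 over the key, but md5 stands in here, and the partition count is invented:

```python
import hashlib

NUM_PARTITIONS = 8  # invented; the real partition count is not published

def partition_for(uri: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a URI to a partition deterministically, so the same asset always
    lands with the same ingestion node and no coordination is needed."""
    digest = hashlib.md5(uri.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

uri = "nyt://article/577d0341-9a0a-46df-b454-ea0718026d30"
print(partition_for(uri) == partition_for(uri))  # stable across calls
```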
Producer and consumer architecture
Producers: All content systems write through the gateway service. The gateway performs schema validation and then appends the asset to the single-partition Monolog. Producers are decoupled from all downstream consumers — they have no knowledge of how many consumers exist or what they do with the content.
Consumers: Each downstream system maintains its own consumer group and offset position in the relevant log (Monolog, Denormalized Log, or Skinny Log). Because Kafka retains events indefinitely, a consumer can always replay from offset 0 to rebuild its data store from scratch — which means stateless, immutable service deployments become practical.
GraphQL layer
A GraphQL schema is automatically generated from the protobuf definitions. This means the Kafka data model and the consumer API contract stay in sync without manual schema maintenance. The GraphQL schema exposes the same typed structure as the protobuf messages, giving API consumers a self-documenting interface derived directly from the canonical data model.
Special techniques and engineering innovations
Single-partition topic for total causal ordering
The decision to use a single partition for the Monolog is the most deliberate architectural departure from typical Kafka practice. Multi-partition topics offer parallelism and higher throughput, but they cannot guarantee global ordering across partitions. The Monolog requires total ordering: a referenced asset must always be visible before the asset that references it. With multiple partitions, an article could appear in one partition before its referenced image appears in another, breaking consumers that try to resolve the dependency graph in order. A single partition makes this impossible.
This is a reasonable trade-off given the data volume. Editorial content does not arrive at the rates that would make a single partition a throughput bottleneck.
Topological sort of the dependency graph
Before an asset is written to the Monolog, it is sorted topologically within its dependency graph. The Times described it as ensuring "you always see a referenced asset before the asset doing the referencing." This is not handled by Kafka itself — it is enforced by the producing system before the write. The result is a log where consumers can process messages in sequence without defensive look-ahead or buffering.
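The invariant can be reproduced with a standard topological sort. A sketch using Python's `graphlib`, over an invented dependency graph:

```python
from graphlib import TopologicalSorter

# asset -> assets it references (its dependencies); data is invented
deps = {
    "nyt://article/a1": {"nyt://image/i1", "nyt://image/i2"},
    "nyt://image/i1": set(),
    "nyt://image/i2": set(),
}

# static_order() emits each node only after all of its predecessors,
# so every referenced asset precedes the asset doing the referencing.
order = list(TopologicalSorter(deps).static_order())
print(order.index("nyt://image/i1") < order.index("nyt://article/a1"))
```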
Immutable service deployments via log replay
Because Kafka retains the full content history indefinitely, any downstream data store can be rebuilt from scratch by replaying from offset 0. The Times uses this property to enable immutable, stateless service deployments. Rather than running in-place database migrations when a schema or data change needs to be reflected downstream, a team can destroy its data store and replay the log. This eliminates a class of errors that arise from incremental state mutation over time.
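Rebuild-by-replay reduces to a left fold over the log in which the last write wins. A toy model with invented events:

```python
# (uri, payload) events in log order, oldest first; data is invented
log = [
    ("nyt://image/i1", "caption v1"),
    ("nyt://article/a1", "body v1"),
    ("nyt://image/i1", "caption v2, corrected"),
]

def rebuild(events):
    """Replay from offset 0: the resulting store holds the latest version
    of every asset, with no incremental migration logic involved."""
    store = {}
    for uri, payload in events:
        store[uri] = payload  # last write wins
    return store

print(rebuild(log)["nyt://image/i1"])
```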
Derived logs as first-class pipeline artefacts
The Denormalized Log and Skinny Log are not implementation details or optimisation hacks — they are first-class outputs of the pipeline, each designed to serve a specific downstream need. The Denormalized Log removes the reference-resolution burden from consumers that need complete assets. The Skinny Log gives lightweight notification consumers a low-overhead signal without the full payload. Designing explicit derived logs keeps individual consumer applications simple and reduces coupling between the core Monolog and the specific requirements of each downstream system.
Operating Kafka at scale
Deployment model: Kafka and ZooKeeper run on Google Cloud Platform (GCP) Compute instances. Application services — the gateway, the Denormalizer, and consumer applications — run in containers on Google Kubernetes Engine (GKE).
Service communication: Inter-service communication uses gRPC over Cloud Endpoints. Kafka Consumer API connections are made over SSL.
Schema governance: The custom protobuf linter that runs at write time is the primary schema governance mechanism. It checks forward and backward compatibility on every schema change before allowing it into the gateway, which means breaking changes are caught before they reach the log.
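The kind of rule such a linter enforces can be illustrated with a simplified check. This is not the Times' linter; it models a single rule (fields may be added, never removed or re-typed), with invented schemas:

```python
def is_compatible(old_schema: dict, new_schema: dict) -> bool:
    """old/new map field name -> type name. Adding fields is safe;
    removing or re-typing an existing field is a breaking change."""
    return all(
        field in new_schema and new_schema[field] == ftype
        for field, ftype in old_schema.items()
    )

v1 = {"uri": "string", "body": "string"}
v2 = {"uri": "string", "body": "string", "slug": "string"}  # added field: ok
v3 = {"uri": "string"}  # removed "body": breaking

print(is_compatible(v1, v2), is_compatible(v1, v3))
```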
SLO tracking: The Skinny Log provides the signal for measuring end-to-end publish latency — from the moment an asset is written to the Monolog to the moment downstream systems confirm it is available. This gives the team an observable measure of pipeline health tied to a user-facing outcome.
Operational lesson: Boerge Svingen noted in the Software Engineering Daily interview that self-managing Kafka added meaningful operational burden. He indicated that a managed Kafka service would significantly reduce this overhead — a consideration worth weighing for teams planning similar log-based architectures.
Challenges and how they solved them
Legacy API fragmentation
Problem: Each of the Times' legacy CMS systems had its own API, developed independently with no shared schema. Consumers had to maintain separate integration logic for each upstream system.
Root cause: Decades of independent CMS development without a unifying data contract.
Solution: The Monolog and its gateway enforce a single protobuf schema across all producers. Every piece of content — regardless of which CMS originated it — must conform to the same schema before it enters the log. Consumers only need to understand one format.
Outcome: Downstream services are decoupled from CMS internals entirely. A CMS change does not require downstream teams to update their integration code.
Historical content access at scale
Problem: Services that needed to bootstrap against historical content had to make individual API calls per asset. At 166 years of archives, this created unpredictable and unmanageable load on upstream APIs.
Root cause: API-based access is not designed for bulk historical retrieval. Each call adds latency and load that compounds at archive scale.
Solution: Log replay from Kafka offset 0. Because the Monolog retains all content indefinitely, any consumer can replay from the beginning to build or rebuild its data store without making API calls to upstream systems.
Outcome: A new downstream system can bootstrap against the full content archive by reading the Monolog from offset 0, with no coordination required with upstream teams and no risk of overloading upstream APIs.
In-place state mutation and schema drift
Problem: Services that maintained their own permanent state had to implement migration logic for each schema change, leading to inconsistencies between systems that had migrated at different times.
Root cause: Distributed mutable state creates divergence when schema changes are applied incrementally.
Solution: Log replay enables a different approach: instead of migrating state in place, a service can rebuild its data store from the log after a schema change. The Denormalizer and Elasticsearch ingestion nodes are designed with this assumption — they are stateless with respect to the log.
Outcome: Schema changes become a log replay operation rather than a coordinated migration across multiple live services.
Cloud messaging services as Kafka alternatives
Problem: Self-managing Kafka carries significant operational overhead, so the team evaluated Google Pub/Sub, AWS SNS/SQS, and Amazon Kinesis as potential alternatives.
Root cause: Cloud-native messaging services are appealing for operational simplicity, but the Monolog requires two properties: infinite event retention and globally ordered consumption.
Solution: None of the evaluated cloud messaging services satisfied both requirements. Apache Kafka was the only platform that did, which drove the decision to self-manage it on GCP despite the operational overhead.
Outcome: A deliberate trade-off: greater operational burden in exchange for the architectural properties that make the log-as-source-of-truth model possible.
Key takeaways for your own Kafka implementation
- A single-partition topic is sometimes the right choice. If your use case requires total causal ordering across all events — not just within a partition — a single-partition topic removes the risk of cross-partition ordering violations. This only works if your throughput is low enough that a single partition is not a bottleneck, which is true for content-publishing patterns and similar workloads.
- Enforce schema at the write boundary, not the read boundary. The Times' gateway validates every asset against a protobuf schema before it enters the Monolog. Consumers never need to handle malformed messages because they cannot enter the log. Moving schema enforcement upstream reduces defensive complexity in every consumer you build.
- Design explicit derived logs for specific consumer needs. Rather than burdening individual consumers with reference resolution or full-payload overhead, create purpose-built derived topics — a Denormalized Log for systems that need complete assembled assets, a Skinny Log for lightweight notification consumers. Each derived log keeps consumer applications simpler and reduces coupling to the source topic.
- Infinite retention enables immutable deployments. If Kafka retains the full event history, downstream data stores become fully reproducible from the log. You can destroy and rebuild any materialised view from offset 0, which simplifies schema migrations and removes the need for coordinated in-place migration logic across services.
- Evaluate managed Kafka against your operational capacity. The Times self-managed Kafka on GCP and found the overhead significant. If your team does not have deep Kafka operational experience, a managed service may be worth the cost — especially if the architectural properties you need (infinite retention, ordered consumption) are available in the managed offering you are evaluating.
Sources and further reading
Primary sources
- Boerge Svingen, "Publishing with Apache Kafka at The New York Times," Confluent Blog, September 6, 2017. https://www.confluent.io/blog/publishing-apache-kafka-new-york-times/
- Boerge Svingen, "The Source of Truth: Why the New York Times Stores Every Piece of Content Ever Published in Kafka," Kafka Summit NYC 2017 (slides). https://www.slideshare.net/ConfluentInc/kafka-summit-nyc-2017-the-source-of-truth-why-the-new-york-times-stores-every-piece-of-content-ever-published-in-kafka
- Boerge Svingen, Software Engineering Daily interview, October 30, 2017. https://softwareengineeringdaily.com/2017/10/30/kafka-at-ny-times-with-boerge-svingen/
Try Kpow with your Kafka cluster
If you are running a Kafka publishing pipeline and want visibility into consumer lag, topic throughput, and offset positions across your Monolog and derived topics, Kpow connects to any Kafka cluster and gives you that observability in a single interface. You can try it free for 30 days — deploy via Docker, Helm, or JAR, and connect to your cluster in minutes.