How Afterpay uses Apache Kafka in production

Factor House
May 23rd, 2026

Afterpay runs Apache Kafka as the event stream for its Global Payments Platform, operating it under PCI DSS constraints that required a custom network architecture and bespoke Kafka client libraries. Payment events leave the PCI zone via a two-hop AWS PrivateLink design, serialised with Avro and signed with AWS KMS-backed HMAC signatures before reaching the shared Kafka cluster. A separate challenge emerged after Block's 2022 acquisition: Kafka-archived data from Afterpay's Sydney AWS region was feeding Delta merge operations against Delta tables in US West, generating more than USD 1,500 per day in S3 cross-region egress. The solution, Project Teleport, restructured the pipeline to keep compute co-located with data at each stage and delivered annual savings of approximately USD 540,000.

Company overview

Afterpay is a buy-now-pay-later platform founded in Sydney, Australia in 2014 and acquired by Block Inc. in 2022. It operates in Australia, New Zealand, the United States, the United Kingdom, Canada, and Europe, processing instalment payment agreements for millions of consumers at retail and online merchants. Afterpay's infrastructure runs on AWS, with its primary data engineering operations historically anchored in the Sydney region.

The acquisition by Block brought an integration challenge: Afterpay's Kafka-sourced data needed to flow into Block's consolidated data platform, which operated in US West. Reconciling two AWS regions with different data ownership, egress cost profiles, and pipeline architectures is the data engineering problem that shaped Afterpay's most recent Kafka work.

Key Kafka milestones:

  • Pre-2022: Afterpay runs Kafka for PCI-compliant payment event streaming from its Global Payments Platform.
  • 2022 (acquisition): Block acquires Afterpay. Afterpay's Kafka data in Sydney (APSE2) needs to integrate with Block's US West (USW2) data platform.
  • September 2022: Jing Li publishes "Implementing Kafka in the Payments PCI World," documenting the two-hop PrivateLink architecture, custom Kafka client libraries, and PCI compliance controls deployed across all Afterpay payment environments.
  • November 2024: Bulk migration of Afterpay Kafka topics to Project Teleport commences.
  • March 2025: Project Teleport migration completes. Unni Krishnan publishes the full case study, reporting annual savings of approximately USD 540,000 and zero-downtime migration of approximately 120 Kafka topics.

Afterpay's Kafka use cases

PCI-compliant payment event streaming

Afterpay's primary Kafka use case is streaming payment events out of the PCI DSS zone to downstream consumers. Every payment processed by Afterpay's Global Payments Platform generates events that need to reach risk, analytics, and reporting systems without those systems having direct access to the PCI network. Kafka provides the durable, decoupled transport layer; the custom client libraries and network architecture enforce the compliance boundary.

Risk decisioning

Kafka-sourced payment event data feeds Afterpay's Risk Decisioning systems downstream. These are among the approximately 200 datasets delivered from Afterpay's Kafka pipelines via Project Teleport, providing risk teams with timely event data for fraud and credit decisions.

Business intelligence and financial reporting

Afterpay's business intelligence and financial reporting teams are major consumers of the Delta Lake pipeline fed by Kafka-archived data. Project Teleport consolidates approximately 200 downstream datasets from the Sydney Kafka archive into the Block data platform in US West, making them available to BI and finance tooling.

Machine learning pipelines

Databricks Spark Declarative Pipelines read streaming data from Afterpay's Kafka topics to feed machine learning workloads. The data lake uses a medallion architecture (bronze, silver, gold) on Delta Lake, with Unity Catalog and AWS Glue providing governance and cataloguing.

Scale and throughput

  • Daily pipeline volume: 9 TB of Kafka-archived data processed per day by Afterpay's data team (Unni Krishnan, March 2025).
  • Topics migrated: Approximately 120 Kafka topics migrated under Project Teleport (Unni Krishnan, March 2025).
  • Downstream datasets: Approximately 200 datasets delivered to risk, BI, and financial reporting from Kafka pipelines (Unni Krishnan, March 2025).
  • Payment events published: "Millions of messages" published since the PCI Kafka implementation was deployed to all payments environments (Jing Li, September 2022).
  • Compression ratio: Approximately 50% reduction in data volume from Avro-to-Parquet conversion in Project Teleport Stage 1.
  • Egress saving: Approximately USD 540,000 per year from restructuring the cross-region pipeline (Unni Krishnan, March 2025).

Afterpay's Kafka architecture

Deployment

Afterpay's Kafka infrastructure runs on AWS, with the primary archival cluster anchored in Sydney (APSE2). The payments Kafka cluster is operated by Block's Platform Engineering team. Cross-region data processing targets AWS US West (USW2) via Project Teleport.

PCI zone network design

PCI DSS compliance prohibits mixing cardholder data environments with general infrastructure. For Kafka, this means the network path from the PCI zone to the Kafka broker must be isolated from shared transit networks.

AWS Transit Gateway was evaluated but rejected on two grounds: it lacked support for overlapping VPC CIDR ranges across Block's network topology, and it would have created a direct routing path between the PCI zone and other Block networks. Afterpay's solution uses dedicated AWS PrivateLinks with a two-hop architecture:

  • Hop 1: PCI zone connects to Platform Engineering's network via a dedicated PrivateLink.
  • Hop 2: Platform Engineering's network connects to the Kafka cluster via a second PrivateLink.

Non-PCI teams use AWS Transit Gateway with a single PrivateLink hop. The Payments Platform uses two hops, ensuring the PCI environment has no direct visibility into the shared transit network.

DNS resolution for Kafka brokers across the PrivateLink connections uses Amazon Route 53 private hosted zones. Schema Registry access from the PCI zone is gated through a Squid Proxy that explicitly whitelists Schema Registry DNS names as an egress control.

Custom Kafka client libraries

A standard Kafka client provides no protection against cardholder data leaving the PCI zone in message payloads, no mechanism for asynchronous publication, and no message-level integrity verification. Afterpay's Payments Platform team built custom producer and consumer libraries to address all three (Jing Li, September 2022):

Card detection and obfuscation: The producer library applies RegEx-based cardholder data detection during Avro serialisation. Any field matching a card number pattern is obfuscated before the bytes are published to the Kafka topic. Luhn algorithm validation was planned as a second pass to reduce false positives.
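The detection step can be sketched as follows. This is a minimal illustration, not Afterpay's library: the regular expression, the masking style (all but the last four digits), and the function names are assumptions, and the optional Luhn check stands in for the second pass that the source describes only as planned.

```python
import re

# Hypothetical pattern: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def obfuscate_cards(text: str, require_luhn: bool = False) -> str:
    """Mask card-like digit runs, keeping only the last four digits (assumed style)."""

    def _mask(match: re.Match) -> str:
        digits = re.sub(r"\D", "", match.group())
        if require_luhn and not luhn_valid(digits):
            return match.group()  # likely a false positive; leave untouched
        return "*" * (len(digits) - 4) + digits[-4:]

    return CARD_PATTERN.sub(_mask, text)


def obfuscate_record(record: dict, require_luhn: bool = False) -> dict:
    """Apply the check to every string field of a flat record before serialisation."""
    return {
        name: obfuscate_cards(value, require_luhn) if isinstance(value, str) else value
        for name, value in record.items()
    }
```

With `require_luhn=True`, a 16-digit reference number that fails the checksum passes through unchanged, which is the false-positive reduction the Luhn pass is meant to provide.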

Non-blocking publication: A dedicated thread pool dispatches Kafka publishing asynchronously. The payment processing thread is released immediately after serialisation; Kafka I/O proceeds independently. A Kafka broker outage or elevated publish latency cannot propagate into payment response times.
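A rough sketch of the pattern, assuming a generic blocking `send` callable in place of a real Kafka client. The class name, the `slow_send` stand-in, and the topic name are hypothetical; the source does not describe the library's actual interface.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable


class AsyncPublisher:
    """Hands sends to a dedicated thread pool so callers never wait on broker I/O."""

    def __init__(self, send: Callable[[str, bytes], int], max_workers: int = 4) -> None:
        self._send = send  # stand-in for a real Kafka client's blocking send
        self._pool = ThreadPoolExecutor(
            max_workers=max_workers, thread_name_prefix="kafka-publish"
        )

    def publish(self, topic: str, payload: bytes) -> Future:
        # submit() returns immediately; the send itself runs on a pool thread.
        return self._pool.submit(self._send, topic, payload)

    def close(self) -> None:
        # Wait for in-flight sends before shutting down.
        self._pool.shutdown(wait=True)


def slow_send(topic: str, payload: bytes) -> int:
    """Hypothetical send that simulates 200 ms of broker latency."""
    time.sleep(0.2)
    return len(payload)


def handle_payment(publisher: AsyncPublisher, event: bytes) -> str:
    """Payment path: enqueue the event and respond without waiting for Kafka."""
    publisher.publish("payments.events", event)
    return "APPROVED"
```

`ThreadPoolExecutor` queues work without a bound, so a production library would also need a policy for a prolonged broker outage (a bounded queue, dropping, or spilling to disk). The source does not say which approach Afterpay took.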

Message integrity verification: Producers sign each message using HMAC with an AWS KMS Customer Managed Key. Consumers within the PCI zone verify the signature with the same KMS key before processing. This provides a tamper-evidence guarantee independent of TLS, which is relevant for audit and compliance scenarios where message-level provenance needs to be demonstrable.
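The sign-and-verify flow can be illustrated with Python's standard `hmac` module. This sketch uses a local byte string in place of the KMS-held key; in AWS the MAC would instead be computed and checked by KMS itself (its API offers GenerateMac and VerifyMac operations), so the key material never reaches the client. The header name and message shape are assumptions, as the source does not say where the signature is carried.

```python
import hashlib
import hmac

SIGNATURE_HEADER = "x-signature"  # hypothetical header name


def sign_message(key: bytes, value: bytes) -> dict:
    """Producer side: attach an HMAC-SHA256 signature over the serialised payload."""
    signature = hmac.new(key, value, hashlib.sha256).hexdigest()
    return {"value": value, "headers": {SIGNATURE_HEADER: signature}}


def verify_message(key: bytes, message: dict) -> bool:
    """Consumer side: recompute the MAC and compare in constant time."""
    expected = hmac.new(key, message["value"], hashlib.sha256).hexdigest()
    received = message["headers"].get(SIGNATURE_HEADER, "")
    return hmac.compare_digest(expected, received)
```

A consumer would call `verify_message` before deserialising and reject any message whose payload no longer matches its signature.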

Kafka data archival (Confluent Sink Connectors)

Afterpay archives Kafka topics to S3 using Confluent Sink Connectors, which land hourly records as Avro files in the Sydney APSE2 region. These archived files are the input to the Project Teleport cross-region pipeline.

Project Teleport: cross-region data processing

Before Project Teleport, Afterpay's data team ran Delta merge operations with compute in APSE2 writing to Delta tables in USW2. S3 cross-region egress costs exceeded USD 1,500 per day. The core problem was that every merge operation transferred data between regions at the point of processing.

Project Teleport restructures the pipeline into two stages, each co-located with its data:

Stage 1 (APSE2): DeltaSync Spark jobs use Databricks Auto Loader for incremental Avro file discovery. Files are converted from Avro to compressed Parquet and loaded into streaming interface Delta tables. Compression reduces the volume transferred to USW2 by approximately 50%.

Stage 2 (USW2): Delta Live Tables (DLT) jobs apply transformations and run merge operations locally using the apply_changes API. The apply_changes API provides built-in deduplication, handling the duplicate events that arise from Kafka's at-least-once delivery semantics without requiring downstream logic.

Apache Airflow orchestrates both stages. A checkpoint transfer mechanism ensures historical data is not reprocessed during migration. All approximately 120 topics were migrated between November 2024 and March 2025 with no reported downtime.

Producer architecture

Producers in the PCI pipeline use Afterpay's custom producer library. The card detection, thread pool, and KMS signing layers run within the library; the underlying Kafka client handles batching and delivery. Avro is the serialisation format throughout.

Consumer architecture

Custom consumer libraries verify the KMS-based HMAC signature on each message before processing. The Schema Registry, accessed via Squid Proxy, provides schema resolution for Avro deserialisation.

Special techniques and engineering innovations

Two-hop PrivateLink for PCI network isolation

The two-hop PrivateLink design is notable for its explicit trade-off: it adds a network hop and DNS configuration overhead in exchange for a provable compliance boundary. The PCI zone has no visibility into the shared transit network; it sees only the Platform Engineering PrivateLink endpoint. This makes the network isolation auditable, which matters for PCI DSS assessments.

KMS-signed Kafka messages for message-level integrity

Transport-layer security (TLS) protects messages in transit on each connection but does not provide message-level provenance. Its guarantees end where the connection terminates: if a message is altered at rest on the broker, modified by an intermediary that handles the plaintext, or injected by a party with access to the topic, TLS offers no way to detect it. HMAC signing with a KMS CMK addresses this: a consumer can verify that a message was produced by an entity that held the signing key at the time of production, and that the message has not been modified since. This is a meaningful addition in regulated environments where audit evidence at the message level is valuable.

Deduplication via Delta Live Tables apply_changes

Rather than writing custom deduplication logic for Kafka's at-least-once delivery guarantees, Project Teleport delegates this to Delta Live Tables' apply_changes API. The API applies upsert semantics as records land in the target table, treating retried events as updates rather than inserts. This keeps the pipeline logic simple and the deduplication behaviour well-defined.
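The upsert behaviour can be shown in plain Python. This is a toy model of the semantics, not the Delta Live Tables API: DLT's `apply_changes` takes `keys` and `sequence_by` arguments and runs inside a Databricks pipeline, whereas the function and field names below (`upsert_by_key`, `event_id`, `seq`) are hypothetical.

```python
def upsert_by_key(target: dict, batch: list, key: str, sequence_by: str) -> dict:
    """Keep one row per key, holding the record with the highest sequence value.

    A redelivered event carries the same key and sequence as the row already
    stored, so it is ignored rather than inserted as a duplicate row.
    """
    for record in batch:
        current = target.get(record[key])
        if current is None or record[sequence_by] > current[sequence_by]:
            target[record[key]] = record
    return target


events = [
    {"event_id": "a", "seq": 1, "status": "CREATED"},
    {"event_id": "a", "seq": 1, "status": "CREATED"},   # producer retry
    {"event_id": "b", "seq": 1, "status": "CREATED"},
    {"event_id": "a", "seq": 2, "status": "CAPTURED"},
    {"event_id": "a", "seq": 1, "status": "CREATED"},   # late redelivery
]

table = upsert_by_key({}, events, key="event_id", sequence_by="seq")
```

Five input events collapse to two rows, and the late redelivery of the older event for key `a` does not overwrite the newer state.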

Co-locating compute with data to eliminate cross-region egress

The USD 1,500/day egress problem arose because compute was separated from the data it was processing. Project Teleport's solution is straightforward in principle: run each stage's compute in the same region as its input data. The Avro-to-Parquet conversion in Stage 1 compounds the saving by reducing the volume transferred between regions before Stage 2 begins.
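As a quick consistency check on the reported figures, the daily egress cost can be annualised and compared with the stated saving. Only the two constants come from the source; the helper names are illustrative, and the check says nothing about how Block derived its number.

```python
DAILY_EGRESS_USD = 1_500               # reported as "more than" this per day
REPORTED_ANNUAL_SAVING_USD = 540_000   # reported as approximate


def annualise(daily_usd: float, days: int = 365) -> float:
    """Scale a daily cost to a full year."""
    return daily_usd * days


def implied_daily(annual_usd: float, days: int = 365) -> float:
    """Daily amount implied by an annual figure."""
    return annual_usd / days


annual_egress = annualise(DAILY_EGRESS_USD)                    # 547,500
daily_saving = implied_daily(REPORTED_ANNUAL_SAVING_USD)       # about 1,479
```

The two figures agree to within about 1.5%, which fits a saving that removes most, but not necessarily all, of the previous cross-region transfer cost.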

Operating Kafka at scale

Deployment model: Afterpay's Kafka cluster runs on AWS, operated by Block's Platform Engineering team. Confluent Sink Connectors handle the archival layer from Kafka to S3. Block does not publicly describe using a fully managed Kafka service such as Amazon MSK or Confluent Cloud for the primary cluster.

Schema governance: Avro is used throughout the PCI and archival pipelines. Schema Registry access from the PCI zone is controlled via a Squid Proxy, which enforces explicit DNS-level egress whitelisting. Schema updates must be compatible with existing consumers before being deployed.

Orchestration: Apache Airflow manages cross-stage orchestration in Project Teleport, coordinating the DeltaSync jobs in APSE2 and the DLT jobs in USW2.

Migration strategy: Project Teleport's checkpoint transfer mechanism allows Kafka topic migration without reprocessing historical data. Bulk migration of approximately 120 topics ran over four months (November 2024 to March 2025) with zero reported downtime. Transient streaming interface Delta tables in Stage 1 use sliding window logic to avoid race conditions during migration.

Data lake governance: The Delta Lake medallion architecture (bronze, silver, gold) on S3 provides tiered data quality guarantees. Unity Catalog and AWS Glue handle metadata and governance across the platform.

Challenges and how they solved them

PCI network isolation for Kafka connectivity

Standard Kafka client connectivity over shared transit infrastructure is not viable under PCI DSS without additional isolation controls. AWS Transit Gateway was evaluated and rejected because it lacked support for overlapping VPC CIDR ranges across Block's network topology, and because it would have exposed the PCI zone to other Block networks. Afterpay designed a two-hop PrivateLink architecture: PCI zone to Platform Engineering, then Platform Engineering to the Kafka cluster. DNS is resolved via Amazon Route 53 private hosted zones; Schema Registry egress is explicitly controlled via Squid Proxy. Since rollout, millions of payment events have been published through this architecture.

Cardholder data leaking into Kafka message payloads

A standard Kafka producer serialises all fields without inspecting their content. In a payments context, this creates a risk that card numbers embedded in event payloads would be published to Kafka topics accessible outside the PCI zone. Afterpay's custom producer library applies RegEx-based card number detection during Avro serialisation and obfuscates matching fields before publishing. The check runs inline in the serialisation path, with no external calls or added dependencies. Luhn algorithm validation was planned as a follow-up to reduce false positives.

Kafka I/O blocking payment processing

Synchronous Kafka publication on the main payment processing thread means that any Kafka broker latency, whether from a broker restart, network fluctuation, or backpressure, propagates into payment response times. Afterpay's custom producer library introduces a dedicated thread pool that takes Kafka publishing off the main thread as soon as serialisation completes. The payment flow's latency is therefore decoupled from Kafka's availability. This is a straightforward pattern, but the explicit documentation of it as a PCI-adjacent concern is useful for other teams operating in regulated environments.

USD 1,500/day in cross-region S3 egress from Delta merge operations

After Block acquired Afterpay, data processing consolidated in US West. Afterpay's Kafka archive was in Sydney (APSE2), and the Delta merge jobs ran compute in APSE2 against Delta tables in USW2. Every merge operation transferred data cross-region, generating more than USD 1,500/day in S3 egress costs. Project Teleport split the pipeline at the regional boundary: Stage 1 runs in APSE2 and converts Avro to Parquet locally, reducing transfer volume by ~50%; Stage 2 runs merge operations locally in USW2. The annual saving is approximately USD 540,000.

Duplicate events from Kafka at-least-once delivery

Kafka's at-least-once delivery semantics mean that events may be delivered more than once: a producer retry can write the same event twice, and a consumer rebalance can cause already-processed events to be read again. Without explicit handling, these duplicates propagate into downstream Delta Lake tables as duplicate rows. Project Teleport uses Delta Live Tables' apply_changes API to apply upsert semantics as records land in USW2, treating retried events as updates. The deduplication logic is built into the DLT framework rather than maintained as custom code.

Full tech stack

| Category | Tools | Notes |
| --- | --- | --- |
| Message broker | Apache Kafka | AWS-hosted; operated by Block's Platform Engineering team |
| Serialisation format | Apache Avro | Used throughout PCI and archival pipelines |
| Schema registry | Confluent Schema Registry | Accessed via Squid Proxy from PCI zone |
| Connectors | Confluent Sink Connectors | Archive Kafka topics to S3 as hourly Avro files (APSE2) |
| Custom client libraries | Proprietary (Afterpay Payments Platform) | Card obfuscation, async thread pool publishing, KMS HMAC signing |
| Key management | AWS KMS (Customer Managed Keys) | HMAC signing for message integrity verification |
| Networking | AWS PrivateLink (two-hop), Amazon Route 53 private hosted zones, Squid Proxy | PCI zone Kafka connectivity and Schema Registry egress control |
| Stream processing | Databricks Spark Declarative Pipelines, Delta Live Tables | Kafka-to-Delta Lake ingestion; Project Teleport pipeline |
| File discovery | Databricks Auto Loader | Incremental Avro file ingestion in Project Teleport Stage 1 |
| Orchestration | Apache Airflow | Cross-stage pipeline orchestration in Project Teleport |
| Data lake storage | Delta Lake on S3 | Medallion architecture; Avro-to-Parquet via Project Teleport |
| Compute platform | Databricks | DeltaSync jobs and DLT workloads |
| Metadata / catalogue | Unity Catalog, AWS Glue | Data lake governance post-acquisition |
| Analytics sink | Snowflake | Platform-agnostic BI access on Delta Lake |

Key contributors

  • Jing Li: authored "Implementing Kafka in the Payments PCI World" (September 2022), detailing the PCI Kafka architecture including the two-hop PrivateLink design and custom client libraries.
  • Unni Krishnan: authored "Project Teleport: Cost-Effective and Scalable Kafka Data Processing at Block" (March 2025), detailing the cross-region Kafka data processing pipeline and its cost outcomes.
  • Unnee Udayakumar: Senior Manager of Data Engineering, Cash App ML and Data Science organisation; primary contact named in the Databricks case study on Delta Lake and Kafka streaming integration.

Key takeaways for your own Kafka implementation

  • PCI DSS compliance does not require a separate Kafka cluster inside the PCI zone. Afterpay uses standard Kafka infrastructure outside the PCI zone and enforces the boundary at the network layer (two-hop PrivateLink) and the client layer (card detection in the serialiser, KMS signing, async publication). The compliance work is in the access controls and client libraries, not in the broker topology.
  • Message-level signing with KMS is a viable pattern for tamper-evidence in regulated pipelines. HMAC verification at the consumer provides provenance guarantees that TLS alone does not. The operational requirement is shared KMS key access between producers and authorised consumers, which fits naturally into IAM-based access control on AWS.
  • Keeping compute co-located with data at each pipeline stage cuts cross-region egress costs. Data still has to cross the regional boundary once, so the aim is to make that single transfer as small as possible. The pattern in Project Teleport is reusable: if you are running a multi-stage pipeline across regions, identify the point where data crosses the regional boundary and ensure compression or format conversion happens before that crossing, not after.
  • Delta Live Tables' apply_changes API handles Kafka at-least-once deduplication without custom logic. If you are landing Kafka events into Delta Lake and your producer may retry, apply_changes provides the deduplication semantics you need at the framework level.
  • Cross-region pipeline migration is achievable without downtime using checkpoint transfer. Afterpay migrated approximately 120 Kafka topics across a four-month window with no reported downtime by using checkpoint transfer to avoid historical data reprocessing. If you are rearchitecting a Kafka-to-lakehouse pipeline, the migration approach matters as much as the target architecture.

Sources and further reading

Primary sources

  1. Jing Li, "Implementing Kafka in the Payments PCI World" (September 2022)
  2. Unni Krishnan, "Project Teleport: Cost-Effective and Scalable Kafka Data Processing at Block" (March 2025)
  3. Unnee Udayakumar (attributed), "Consistently providing accurate financial insights" (Databricks case study)

Try Kpow with your Kafka cluster

If you are monitoring a Kafka cluster at any scale, you can try Kpow free for 30 days. It connects to any Kafka cluster in minutes and deploys via Docker, Helm, or JAR.