
How DoorDash uses Apache Kafka in production
DoorDash operates one of the largest food delivery platforms in the United States, coordinating millions of orders, restaurant partnerships, and Dasher assignments every day. Underpinning that coordination is an Apache Kafka deployment that spans five clusters, more than 2,500 topics, and around six billion messages per day on average — with peaks reaching twice that rate.
The company reached that scale through two distinct adoption waves: an initial migration from RabbitMQ in mid-2019 to stop task-processing outages, then a ground-up rebuild of its real-time data infrastructure under a platform called Iguazu that replaced Amazon Kinesis and SQS and brought data warehouse latency down from one day to a few minutes.
Company overview
DoorDash connects consumers, restaurants, and delivery drivers (Dashers) across the United States, Canada, Australia, and Japan. The platform processes a high volume of time-sensitive transactions — order placement, restaurant confirmation, Dasher assignment, and delivery completion — all of which depend on low-latency event propagation between microservices.
DoorDash first adopted Kafka in mid-2019 after repeated RabbitMQ failures began causing order checkout and Dasher dispatch outages under peak load. That initial migration, led by engineer Ashwin Kachhara, addressed the immediate reliability problem. A separate initiative, beginning around 2020 and led by Allen Wang (who had previously built Netflix's Keystone Kafka pipeline), took a broader view: replacing multiple fragmented cloud-native queuing systems with a unified, Kafka-centred streaming platform.
DoorDash's Kafka use cases
Asynchronous task processing
The original use case. Before mid-2019, DoorDash routed order checkout and Dasher assignment work through RabbitMQ and Celery. Repeated broker failures under peak load caused customer-facing outages with direct revenue impact; RabbitMQ offered limited observability and required slow, manual recovery steps.
Ashwin Kachhara and his team replaced RabbitMQ with Kafka for asynchronous task processing. Kafka's durable log, at-least-once delivery guarantees, and horizontal scalability eliminated the outage pattern. The migration was completed without downtime using a gradual traffic cutover approach.
Real-time event pipeline to Snowflake (Iguazu)
The more architecturally significant initiative. Before Iguazu, DoorDash maintained separate pipelines using Amazon SQS and Amazon Kinesis to feed its data warehouse, ML platform, and time-series metrics backend. Each pipeline used different technology and communication patterns, producing data warehouse latency of up to one day and significant operational overhead.
Iguazu replaced those fragmented pipelines with a single Kafka-centred platform. All microservices and mobile clients publish events via an HTTP-based Kafka proxy; Apache Flink jobs consume those topics and fan out to multiple destinations (Snowflake, Redis, Chronosphere) with format transformations applied per sink. End-to-end latency to Snowflake dropped from one day to a few minutes.
Use cases documented within Iguazu include:
- Dasher assignment monitoring: near-real-time warehouse data lets the Dasher assignment team detect algorithm bugs within minutes rather than the next day
- Mobile application health monitoring: checkout page load errors are routed to Chronosphere for operational alerting
- Real-time user sessionization and behaviour analysis
Real-time ML feature generation (Riviera and Fabricator)
Delivery events are consumed from Kafka topics by Flink jobs to produce real-time ML features — for example, recent average restaurant wait times. The Riviera framework (and its 2025 successor, Fabricator) expresses these pipelines as a SQL query plus a YAML configuration file, routing Kafka topic data through Flink into Redis for low-latency reads by inference services. By 2025, the Fabricator platform ran more than 100 pipelines generating 500 unique features at a rate of over 100 billion feature values per day.
Search index replication
The Search Platform team built a continuous indexing pipeline where database changes propagate through Kafka to Flink applications, which then update Elasticsearch. Before this pipeline, a full reindex of the search catalogue took up to one week; the Kafka and Flink approach made it continuous and near-real-time.
Advertising budget pacing
The ads platform publishes a record to a Kafka topic when a campaign hits its daily spend limit. A Flink streaming job consumes that topic and writes an expiration flag to CockroachDB, embedding the budget-capped state into DoorDash's in-house search index at the filtering stage. This change produced a 43% drop in search processor latency and a 45% reduction in discarded candidates compared to the previous approach.
Production testing via Kafka multi-tenancy
In 2024, DoorDash introduced a multi-tenant Kafka architecture to allow test traffic to flow through the same production topics and consumer instances as live traffic. OpenTelemetry context propagation carries a "doortest" tenant identifier on each Kafka record; consumers branch on that metadata without creating separate topics. This approach enables production-fidelity testing without topic proliferation.
DoorDash's Kafka architecture
High-level overview
DoorDash runs its Kafka infrastructure on AWS. The five clusters are operated by the Real-Time Streaming Platform (RTSP) team. Public sources do not describe the specific role of each cluster (whether they are segmented by team, environment, or criticality), though all five are provisioned and governed through the same internal tooling.
The Iguazu architecture has three main layers:
- Ingestion: An HTTP-based Kafka proxy (built on Confluent's open-source Kafka REST Proxy) runs on Kubernetes and accepts events from microservices and mobile clients. The proxy provides a REST interface so internal producers do not need native Kafka client configuration.
- Processing: Apache Flink jobs run as standalone Kubernetes services deployed via Helm charts, consuming Kafka topics and applying per-destination fan-out and format transformations.
- Delivery: Events are written to S3 in Parquet format, then ingested into Snowflake via Snowpipe (triggered by Amazon SQS notifications). ML features go to Redis. Operational metrics go to Chronosphere.
Event ingestion layer: the Iguazu Kafka proxy
DoorDash extended the open-source Confluent Kafka REST Proxy with several additions for the Iguazu use case:
- Multi-cluster routing (a single proxy endpoint routes to the appropriate cluster)
- Asynchronous producing (decoupling publisher latency from Kafka ACK wait)
- Metadata pre-fetching
- HTTP header passthrough for OpenTelemetry context propagation
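As a flavour of the REST-style interface, a producer-side call to such a proxy might look like the following sketch. The endpoint path, payload shape, and header keys here are assumptions about the general pattern (modelled on Confluent's REST Proxy conventions), not DoorDash's actual API:

```python
import json

def build_proxy_request(topic, event, tenant_id=""):
    """Build an HTTP request body for a hypothetical Kafka REST proxy.

    The proxy, not the client, owns native Kafka producer configuration;
    the client only needs to speak HTTP/JSON.
    """
    headers = {"content-type": "application/json"}
    if tenant_id:
        # OpenTelemetry-style context propagated as an HTTP header,
        # which the proxy copies onto the Kafka record headers.
        headers["baggage"] = f"tenant-id={tenant_id}"
    return {
        "method": "POST",
        "path": f"/topics/{topic}",  # hypothetical route
        "headers": headers,
        "body": json.dumps({"records": [{"value": event}]}),
    }

req = build_proxy_request("order_events", {"order_id": "o-123", "status": "placed"})
print(req["path"])  # /topics/order_events
```

Asynchronous producing means the proxy can acknowledge this HTTP call before the Kafka broker ACK arrives, decoupling publisher latency from broker round-trips.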
Schema management
All events are defined in Protocol Buffers. Schemas are stored in a central shared Protobuf Git repository and validated at build time in CI/CD pipelines — invalid schemas are caught before deployment, not at runtime. Confluent Schema Registry handles runtime schema lookup.
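A build-time compatibility gate of this kind can be illustrated with a toy checker. Real Protobuf validation operates on compiled descriptors, but the core rule it enforces, never change the type of an existing field number, reduces to something like this (the schema representation is simplified for illustration):

```python
def backward_compatible(old_fields, new_fields):
    """Return violations: an existing field number must keep its type.

    Fields are {field_number: type_name}. Removing a field is tolerated by
    Protobuf readers (unknown fields are skipped); retyping one is not.
    """
    violations = []
    for number, old_type in old_fields.items():
        new_type = new_fields.get(number)
        if new_type is not None and new_type != old_type:
            violations.append(f"field {number} changed type {old_type} -> {new_type}")
    return violations

old = {1: "string", 2: "int64"}
good = {1: "string", 2: "int64", 3: "bool"}  # added a field: compatible
bad = {1: "string", 2: "string"}             # retyped field 2: incompatible

assert backward_compatible(old, good) == []
assert backward_compatible(old, bad) == ["field 2 changed type int64 -> string"]
```

Running a check like this in CI is what moves schema failures from runtime to build time.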
Where downstream sinks require Avro (for example, certain legacy connectors), DoorDash's serialization library converts Protobuf to Avro transparently using Avro's protobuf library. Producers remain Protobuf-native and are unaware of Avro consumers.
Every event flowing through Iguazu uses a standard envelope containing: creation time, source service, encoding method reference, and a schema registry pointer.
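A minimal sketch of such an envelope follows; the field names are assumed for illustration (the real envelope is a Protobuf message, not a dict):

```python
import time
import uuid

def wrap_event(payload, source_service, schema_id):
    """Wrap a serialized event in a standard envelope.

    Field names are illustrative, mirroring the envelope contents the
    article describes: creation time, source service, encoding method,
    and a schema registry pointer.
    """
    return {
        "event_id": str(uuid.uuid4()),
        "created_at_ms": int(time.time() * 1000),  # creation time
        "source": source_service,                  # producing service
        "encoding": "protobuf",                    # encoding method reference
        "schema_id": schema_id,                    # schema registry pointer
        "payload": payload,
    }

env = wrap_event(b"\x08\x01", "order-service", schema_id=42)
```

Consumers can then resolve `schema_id` against the registry before deserializing `payload`.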
Producer architecture
For high-volume event topics without natural message keys (random partition assignment), DoorDash's default producer configuration produced small, frequent batches and excessive broker CPU load. The RTSP team addressed this with two configuration changes:
- Switching to Kafka's built-in sticky partitioner (batching null-keyed records to the same partition until a batch is ready)
- Increasing linger.ms to 50-100 ms to accumulate larger batches
The result was a 30-40% reduction in Kafka broker CPU utilisation on those topics.
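Why stickiness matters for null-keyed traffic can be shown with a small simulation. This is a deliberately simplified timing model (one linger window = a fixed run of records, all open batches flush at the window boundary), not librdkafka's actual batching logic:

```python
def batches_flushed(partition_for, n_records, n_partitions, linger_window):
    """Count batches when the producer flushes all open batches once per
    linger window (simplified model: a window is `linger_window` records)."""
    total = 0
    for start in range(0, n_records, linger_window):
        window = range(start, min(start + linger_window, n_records))
        # One batch per partition touched within the window.
        total += len({partition_for(i, n_partitions) for i in window})
    return total

# Round-robin spreads each window's records across every partition,
# so every flush emits many tiny batches...
round_robin = lambda i, n: i % n
# ...while a sticky partitioner sends a whole window to one partition.
sticky = lambda i, n: (i // 100) % n  # switch partitions every 100 records

rr = batches_flushed(round_robin, 10_000, 32, linger_window=100)
st = batches_flushed(sticky, 10_000, 32, linger_window=100)
print(rr, st)  # 3200 100
```

Thirty-two times fewer batches means fewer produce requests for brokers to process, which is where the CPU saving comes from.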
For topics requiring delivery guarantees, DoorDash uses replication factor 2 with min.insync.replicas set to 1 — a deliberate choice to favour throughput and cost over maximum durability on non-critical event streams.
Consumer architecture
Consumer groups are managed centrally by the RTSP team. Topic readiness after provisioning is confirmed by polling Prometheus metrics via Chronosphere; the Minions orchestration service waits until topic metrics appear in Chronosphere before marking an onboarding workflow complete.
Stream processing
Apache Flink is the primary stream processing engine. DoorDash exposes two interfaces to internal users:
- Flink DataStream API for custom stateful processing logic
- Flink SQL with YAML configuration (via the Riviera and Fabricator frameworks) for declarative feature pipelines, where engineers write a SQL query and a YAML file rather than Flink code
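To give a flavour of the declarative interface, here is a hypothetical feature pipeline spec in the Riviera/Fabricator style. The keys and SQL below are invented for illustration and do not come from DoorDash's actual framework:

```yaml
# hypothetical-feature.yaml -- illustrative only
source:
  kafka_topic: store_delivery_events
sink:
  type: redis
  feature_name: store_avg_wait_5m
sql: |
  SELECT store_id,
         AVG(wait_seconds) AS avg_wait
  FROM store_delivery_events
  GROUP BY store_id,
           HOP(event_time, INTERVAL '1' MINUTE, INTERVAL '5' MINUTE)
```

The framework turns a file like this into a running Flink job, so the author never touches the DataStream API.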
Data delivery paths
S3 acts as a durable buffer independent of Kafka retention. If Snowflake suffers downtime, failed Snowpipe ingestions can be backfilled from S3 without depending on Kafka replayability.
Per-event isolation is a design principle: each event type has its own dedicated Flink application and its own Snowpipe instance. Failures are contained to a single event type rather than cascading across a shared pipeline.
Special techniques and engineering decisions
Declarative ML feature pipelines
The Riviera framework (2021) and its successor Fabricator (2025) allow data scientists to express Flink stream processing jobs as a SQL query plus a YAML configuration file. The framework generates the Flink job; engineers who would otherwise need to write DataStream API code can instead write a single YAML file. Feature development time was reduced from weeks to hours, and the feature engineering codebase shrank by 70%.
S3 as a durable overflow buffer
Finite Kafka retention creates a risk: if Snowflake is unavailable long enough for Kafka logs to roll off, data is lost. DoorDash addressed this by routing all Iguazu Flink jobs through S3 (Parquet) before Snowpipe ingestion. S3 is not subject to Kafka retention windows, so backfills remain possible regardless of Kafka log state.
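The retention argument can be made concrete with a toy model: once records age out of the Kafka log, only the S3 copy can serve a backfill. This is purely illustrative, not DoorDash code:

```python
RETENTION = 1000  # keep only the most recent 1000 offsets in Kafka

kafka_log = {}  # offset -> record, subject to retention
s3_bucket = {}  # offset -> record, never expires

for offset in range(5000):
    record = f"event-{offset}"
    kafka_log[offset] = record
    s3_bucket[offset] = record      # Flink writes Parquet to S3 as it goes
    expired = offset - RETENTION
    if expired >= 0:
        kafka_log.pop(expired)      # broker enforces finite retention

# Suppose Snowflake was down while offsets 0..2999 were produced.
missing = range(0, 3000)
from_kafka = [o for o in missing if o in kafka_log]
from_s3 = [o for o in missing if o in s3_bucket]
print(len(from_kafka), len(from_s3))  # 0 3000
```

Every missing offset is gone from the Kafka log but fully recoverable from S3, which is exactly the decoupling the design buys.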
OpenTelemetry-based Kafka multi-tenancy
Rather than creating separate topics per test scenario, DoorDash propagates OpenTelemetry context (tenant ID and route information) on every Kafka record header. Production and test traffic share the same topics and consumer group instances; consumers inspect the OTEL header to branch handling. This avoids topic proliferation while providing production-fidelity test conditions.
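The consumer-side branching can be sketched as follows. The header key, the tenant value format (W3C baggage, which OpenTelemetry uses for context propagation), and the handler split are assumptions about the general pattern, not DoorDash's exact implementation:

```python
def extract_tenant(headers):
    """Pull a tenant identifier out of Kafka record headers.

    Kafka headers arrive as (key, bytes) pairs; 'baggage' carries
    comma-separated key=value entries.
    """
    for key, value in headers or []:
        if key == "baggage":
            for item in value.decode().split(","):
                k, _, v = item.strip().partition("=")
                if k == "tenant-id":
                    return v
    return "production"

def handle(record_value, headers):
    tenant = extract_tenant(headers)
    if tenant.startswith("doortest"):
        return ("test-path", record_value)  # e.g. write to a shadow store
    return ("live-path", record_value)

assert handle("order", [("baggage", b"tenant-id=doortest:checkout-1")])[0] == "test-path"
assert handle("order", None)[0] == "live-path"
```

Because the branch happens per record, test and production traffic share topics, partitions, and consumer group instances.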
Protobuf-to-Avro transparent conversion
DoorDash maintains Protobuf as the canonical internal event format. Where downstream sinks require Avro, its serialization library converts on the fly. This allows the company to remain Protobuf-native without modifying producers when adding Avro-dependent sinks.
Operating Kafka at scale
Deployment
All five Kafka clusters run on AWS, managed by the RTSP team. Flink jobs are deployed as standalone Kubernetes services via Helm charts. Infrastructure is managed with Terraform.
Topic governance
DoorDash went through three generations of topic provisioning tooling:
- Terraform / Atlantis (GitOps): Required infrastructure team review for every topic creation request. As the pace of new topics grew to roughly 100 per week, this created a significant support burden.
- Infra Service + Minions (2023): Replaced GitOps with an HTTP API (Infra Service) and a Cadence-backed orchestration service (Minions) that automates end-to-end Iguazu onboarding — creating Kafka topics, launching Flink jobs, creating Snowflake objects, and opening pull requests, with Slack notifications at each step. Configuration options exposed to users are deliberately restricted to capacity-related settings (retention, partition count) to prevent common misconfigurations. Onboarding time dropped by 95%, from multiple days to under one hour, and typically to within 15 minutes.
- Kafka Self-Serve (2024): Extended Infra Service to cover the full resource lifecycle — topics, user accounts, and ACLs — with auto-approval rules for routine, non-sensitive changes such as standard ACL grants or topic resizes within pre-set bounds.
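An auto-approval gate of this kind can be sketched as a rule function. The request shape, operation names, and thresholds below are invented for illustration; DoorDash's actual bounds are not published:

```python
MAX_PARTITIONS = 256     # assumed upper bound for auto-approval
MAX_RETENTION_H = 7 * 24

def auto_approve(request):
    """Approve routine, non-sensitive changes; everything else goes to review."""
    kind = request["kind"]
    if kind == "acl_grant":
        # Standard read/write grants are routine; admin-level grants are not.
        return request["operation"] in {"READ", "WRITE"}
    if kind == "topic_resize":
        # Resizes within pre-set bounds do not need a human in the loop.
        return (request["partitions"] <= MAX_PARTITIONS
                and request["retention_hours"] <= MAX_RETENTION_H)
    return False  # unknown request kinds always go to review

assert auto_approve({"kind": "acl_grant", "operation": "READ"})
assert not auto_approve({"kind": "acl_grant", "operation": "ALTER"})
assert auto_approve({"kind": "topic_resize", "partitions": 64, "retention_hours": 72})
```

The point is that the safe, high-volume cases are encoded as policy, leaving reviewers only the exceptional requests.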
Authentication and access control
DoorDash uses SASL/SCRAM for per-service Kafka authentication and ACLs for authorisation, controlling which producers and consumer groups can access each topic. The 2024 Kafka Self-Serve platform manages user account creation and ACL assignment through the same API-first workflow as topic creation.
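For reference, a SASL/SCRAM client configuration in librdkafka-style naming (as used by confluent-kafka-python) looks like the sketch below. The broker address and credentials are placeholders, not DoorDash values:

```python
# Per-service credentials; in practice the password comes from a secret store.
kafka_client_config = {
    "bootstrap.servers": "kafka-cluster.example.internal:9092",  # placeholder
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "SCRAM-SHA-512",     # SASL/SCRAM authentication
    "sasl.username": "order-service",       # one principal per service
    "sasl.password": "<from-secret-store>", # never hard-coded in practice
}
```

Server-side ACLs then decide what that principal may do, e.g. which topics it can produce to and which consumer groups it can join.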
Monitoring
Chronosphere is DoorDash's operational metrics backend for Kafka. Topic readiness after provisioning is confirmed by polling Prometheus metrics through Chronosphere's API — the Minions orchestration workflow waits on this before marking onboarding complete. Application-level operational events (for example, mobile checkout errors) are also routed through Iguazu Flink jobs to Chronosphere for alerting.
Schema evolution
Protobuf's inherent backward and forward compatibility is the primary schema evolution mechanism. Schemas must pass CI/CD validation before merging; runtime enforcement is handled through Confluent Schema Registry. The Fabricator framework enforces backward- and forward-compatible evolution at the framework level.
Challenges and how DoorDash solved them
RabbitMQ reliability failures at peak load
Problem: RabbitMQ repeatedly went down under heavy order volume. Because order checkout and Dasher assignment ran through RabbitMQ and Celery, broker failures directly caused customer-facing outages. Recovery was slow and manual; observability was poor.
Solution: Replaced RabbitMQ with Kafka for asynchronous task processing using a gradual traffic cutover with no downtime. Kafka's durable log and horizontal scalability eliminated the outage pattern.
Outcome: Task-processing outages from broker failures ceased; the system could scale horizontally to handle peak load.
Source: Ashwin Kachhara, DoorDash Engineering Blog, September 2020.
Fragmented SQS and Kinesis pipelines with one-day warehouse latency
Problem: Separate SQS and Kinesis pipelines for the data warehouse, ML platform, and metrics backend used incompatible technologies and communication paradigms, producing warehouse data latency of up to one day and significant operational overhead from managing disparate systems.
Solution: Consolidated onto a single Kafka + Flink platform (Iguazu). Kafka serves as the unified pub/sub hub; Flink applies per-destination fan-out with data format transformations.
Outcome: Warehouse latency dropped from one day to a few minutes; a single platform team operates the infrastructure.
Source: Allen Wang, DoorDash Engineering Blog, August 2022.
High broker CPU on null-keyed event streams
Problem: For high-volume topics without message keys, the default round-robin partition assignment produced small, frequent batches, generating excessive CPU load on brokers.
Solution: Configured Kafka's sticky partitioner and increased linger.ms to 50-100 ms to accumulate larger batches per partition before flushing.
Outcome: 30-40% reduction in Kafka broker CPU utilisation on affected topics.
Source: Allen Wang, August 2022 (via Shen Zhu engineering blog summary).
Gevent incompatibility with librdkafka in Python
Problem: A Python point-of-sale service used Gevent for async I/O via monkey-patching. Gevent cannot patch librdkafka (a C library), so native Kafka consumer usage blocked Gevent's event loop.
Solution: Ran the Kafka consumer loop inside a dedicated Gevent greenlet. Blocking I/O calls inside the consumer were replaced with Gevent-compatible equivalents.
Outcome: The migrated service outperformed the previous Celery/Gevent worker under heavy I/O.
Source: Jessica Zhao and Boyang Wei, DoorDash Engineering Blog, February 2021.
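The structure of that fix, a blocking poll loop confined to its own execution context, handing messages to the cooperative side through a queue, can be sketched without gevent. Here a thread stands in for the dedicated greenlet so the example stays dependency-free; in DoorDash's actual solution the loop ran in a greenlet with gevent-compatible I/O:

```python
import queue
import threading
import time

def consumer_loop(out_q, stop):
    """Blocking poll loop (stands in for a librdkafka consumer.poll() loop)."""
    offset = 0
    while not stop.is_set():
        time.sleep(0.001)               # stands in for consumer.poll(timeout)
        out_q.put(f"message-{offset}")  # hand off to the cooperative side
        offset += 1

out_q, stop = queue.Queue(), threading.Event()
t = threading.Thread(target=consumer_loop, args=(out_q, stop), daemon=True)
t.start()

received = [out_q.get(timeout=1) for _ in range(3)]
stop.set()
t.join(timeout=1)
print(received)  # ['message-0', 'message-1', 'message-2']
```

The key property is that the blocking calls never run on the event loop itself; only queue hand-offs cross the boundary.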
Snowflake downtime risking data loss within Kafka retention windows
Problem: Kafka's finite log retention creates a data loss risk if Snowflake is unavailable long enough for topic logs to roll off.
Solution: All Iguazu Flink jobs write to S3 in Parquet format before Snowpipe ingestion. S3 serves as a durable buffer independent of Kafka retention; backfills are possible from S3 regardless of Kafka log state.
Source: Allen Wang, InfoQ / QCon Plus, June 2023.
Manual topic provisioning becoming a bottleneck
Problem: GitOps-based topic creation required infrastructure team review for every new topic. At approximately 100 new topics per week, this created a significant support burden and slowed engineering teams waiting for approvals.
Solution: Replaced GitOps with an HTTP API (Infra Service) backed by Minions/Cadence orchestration, with auto-approval for routine requests.
Outcome: Iguazu onboarding time reduced by 95%, from multiple days to under one hour.
Source: DoorDash Engineering Blog, December 2023.
Key takeaways for your own Kafka implementation
- Use S3 as a durable layer between Kafka and your warehouse. Kafka retention is finite; if your downstream sink has an outage, you lose the ability to replay. Routing Flink output through S3 (Parquet) before Snowpipe decouples ingestion from Kafka retention windows and allows backfill regardless of Kafka log state.
- Tune your producer for null-keyed high-volume topics. If you have event streams without natural message keys, round-robin partition assignment produces small, frequent batches and elevated broker CPU. Kafka's sticky partitioner combined with a linger.ms of 50-100 ms is a low-risk configuration change that can meaningfully reduce broker load.
- Restrict what users can configure when self-serving topics. DoorDash's API-first governance deliberately exposes only capacity-related settings (retention, partition count) to internal users, hiding complex broker parameters. This design choice reduces misconfiguration without removing autonomy, and it scales better than a review-based GitOps model as topic volume grows.
- Consider isolating Flink jobs per event type rather than sharing pipelines. Per-event isolation means a failure in one pipeline does not cascade to others. The trade-off is more Kubernetes deployments to manage; DoorDash addressed that with Helm and centralised orchestration through Minions.
- If you run large-scale ML feature pipelines on Kafka, evaluate a declarative DSL layer. Riviera and Fabricator show that abstracting Flink SQL behind a YAML configuration significantly widens the pool of engineers who can create and maintain real-time feature pipelines, at the cost of framework complexity that the platform team must own.
Sources and further reading
Primary sources
- Ashwin Kachhara, "Eliminating Task Processing Outages by Replacing RabbitMQ with Apache Kafka Without Downtime," DoorDash Engineering Blog, September 2020.
- Jessica Zhao and Boyang Wei, "How to Make Kafka Consumer Compatible with Gevent in Python," DoorDash Engineering Blog, February 2021.
- DoorDash Engineering, "Building A Declarative Real-Time Feature Engineering Framework," DoorDash Engineering Blog, March 2021.
- Satish, Danial, and Siddharth, "Building Faster Indexing with Apache Kafka and Elasticsearch," DoorDash Engineering Blog, July 2021.
- Allen Wang, "Building Scalable Real-Time Event Processing with Kafka and Flink," DoorDash Engineering Blog, August 2022.
- Allen Wang, "From Zero to a Hundred Billion: Building Scalable Real-Time Event Processing at DoorDash," QCon San Francisco, October 2022.
- Allen Wang, "Building Scalable Real Time Event Processing with Kafka and Flink," InfoQ / QCon Plus (recorded), June 2023.
- DoorDash Engineering (RTSP team), "API-First Approach to Kafka Topic Creation," DoorDash Engineering Blog, December 2023.
- DoorDash Engineering, "Introducing DoorDash's In-House Search Engine," DoorDash Engineering Blog, February 2024.
- Carlos Herrera, Amit Gud, and Yunji Zhong, "Setting Up Kafka Multi-Tenancy," DoorDash Engineering Blog, March 2024.
- Seed Zeng, Kane Du, and Donovan Bai, "DoorDash Empowers Engineers with Kafka Self-Serve," DoorDash Engineering Blog, August 2024.
- DoorDash Engineering, "Introducing Fabricator: A Declarative Feature Engineering Framework," DoorDash Engineering Blog, April 2025.
If you are running Kafka in production and want visibility into consumer lag, topic health, and cluster performance, Kpow gives you a UI and monitoring layer that works with any Kafka deployment. You can try it with a free 30-day trial and connect it to any cluster in minutes via Docker, Helm, or JAR.