
Developer Knowledge Center
Empowering engineers with everything they need to build, monitor, and scale real-time data pipelines with confidence.
Deploy Kpow on EKS via AWS Marketplace using Helm
Streamline your Kpow deployment on Amazon EKS with our guide, fully integrated with the AWS Marketplace. We use eksctl to automate IAM Roles for Service Accounts (IRSA), providing a secure integration for Kpow's licensing and metering. This allows your instance to handle license validation via AWS License Manager and report usage for hourly subscriptions, enabling a production-ready deployment with minimal configuration.
Overview
This guide provides a comprehensive walkthrough for deploying Kpow, a powerful toolkit for Apache Kafka, onto an Amazon EKS (Elastic Kubernetes Service) cluster. We will cover the entire process from start to finish, including provisioning the necessary AWS infrastructure, deploying a Kafka cluster using the Strimzi operator, and finally, installing Kpow using a subscription from the AWS Marketplace.
The guide demonstrates how to set up both Kpow Annual and Kpow Hourly products, highlighting the specific integration points with AWS services like IAM for service accounts, ECR for container images, and the AWS License Manager for the annual subscription. By the end of this tutorial, you will have a fully functional environment running Kpow on EKS, ready to monitor and manage your Kafka cluster.
The source code and configuration files used in this guide can be found in the features/eks-deployment folder of this GitHub repository.
About Factor House
Factor House is a leader in real-time data tooling, empowering engineers with innovative solutions for Apache Kafka® and Apache Flink®.
Our flagship product, Kpow for Apache Kafka, is the market-leading enterprise solution for Kafka management and monitoring.
Explore our live multi-cluster demo environment or grab a free Community license and dive into streaming tech on your laptop with Factor House Local.

Prerequisites
To follow along with the guide, you need:
- CLI Tools: the AWS CLI, eksctl, kubectl, and Helm (all four are used throughout this guide).
- AWS Infrastructure:
- VPC: A Virtual Private Cloud (VPC) that has both public and private subnets is required.
- IAM Permissions: A user with the necessary IAM permissions to create an EKS cluster with a service account.
- Kpow Subscription:
- A subscription to a Kpow product through the AWS Marketplace is required. After subscribing, you will receive access to the necessary components and deployment instructions.
- The specifics of accessing the container images and Helm chart depend on the chosen Kpow product:
- Kpow Annual product:
- Subscribing to the annual product provides access to the ECR (Elastic Container Registry) image and the corresponding Helm chart.
- Kpow Hourly product:
- For the hourly product, access to the ECR image is provided, and deployment uses the public Factor House Helm repository for installation.
Deploy an EKS cluster
We will use eksctl to provision an Amazon EKS cluster. The configuration for the cluster is defined in the manifests/eks/cluster.eksctl.yaml file within the repository.
Before creating the cluster, you must open this file and replace the placeholder values for <VPC-ID>, <PRIVATE-SUBNET-ID-*>, and <PUBLIC-SUBNET-ID-*> with your actual VPC and subnet IDs.
⚠️ The provided configuration assumes the EKS cluster will be deployed in the us-east-1 region. If you intend to use a different region, you must update the metadata.region field and ensure the availability zone keys under vpc.subnets (e.g., us-east-1a, us-east-1b) match the availability zones of the subnets in your chosen region.
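If you are not sure which IDs to use, the AWS CLI can list them. This is a minimal sketch assuming the us-east-1 region and default credentials; adjust the region and filters for your environment.
# List VPCs with their Name tags
aws ec2 describe-vpcs --region us-east-1 \
  --query "Vpcs[].{ID:VpcId,Name:Tags[?Key=='Name']|[0].Value}" --output table
# List the subnets of the chosen VPC with their availability zones
aws ec2 describe-subnets --region us-east-1 \
  --filters Name=vpc-id,Values=<VPC-ID> \
  --query "Subnets[].{ID:SubnetId,AZ:AvailabilityZone,Public:MapPublicIpOnLaunch}" --output table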
Here is the content of the cluster.eksctl.yaml file:
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: fh-eks-cluster
  region: us-east-1
vpc:
  id: "<VPC-ID>"
  subnets:
    private:
      us-east-1a:
        id: "<PRIVATE-SUBNET-ID-1>"
      us-east-1b:
        id: "<PRIVATE-SUBNET-ID-2>"
    public:
      us-east-1a:
        id: "<PUBLIC-SUBNET-ID-1>"
      us-east-1b:
        id: "<PUBLIC-SUBNET-ID-2>"
iam:
  withOIDC: true
  serviceAccounts:
    - metadata:
        name: kpow-annual
        namespace: factorhouse
      attachPolicyARNs:
        - "arn:aws:iam::aws:policy/service-role/AWSLicenseManagerConsumptionPolicy"
    - metadata:
        name: kpow-hourly
        namespace: factorhouse
      attachPolicyARNs:
        - "arn:aws:iam::aws:policy/AWSMarketplaceMeteringRegisterUsage"
nodeGroups:
  - name: ng-dev
    instanceType: t3.medium
    desiredCapacity: 4
    minSize: 2
    maxSize: 6
    privateNetworking: true

This configuration sets up the following:
- Cluster Metadata: A cluster named fh-eks-cluster in the us-east-1 region.
- VPC: Specifies an existing VPC and its public/private subnets where the cluster resources will be deployed.
- IAM with OIDC: Enables the IAM OIDC provider, which allows Kubernetes service accounts to be associated with IAM roles. This is crucial for granting AWS permissions to your pods.
- Service Accounts:
- kpow-annual: Creates a service account for the Kpow Annual product. It attaches the AWSLicenseManagerConsumptionPolicy, allowing Kpow to validate its license with the AWS License Manager service.
- kpow-hourly: Creates a service account for the Kpow Hourly product. It attaches the AWSMarketplaceMeteringRegisterUsage policy, which is required for reporting usage metrics to the AWS Marketplace.
- Node Group: Defines a managed node group named ng-dev with t3.medium instances. The worker nodes will be placed in the private subnets (privateNetworking: true).
Once you have updated the YAML file with your networking details, run the following command to create the cluster. This process can take 15-20 minutes to complete.
eksctl create cluster -f cluster.eksctl.yaml
Once the cluster is created, eksctl automatically updates your kubeconfig file (usually located at ~/.kube/config) with the new cluster's connection details. This allows you to start interacting with your cluster immediately using kubectl.
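If kubectl cannot reach the new cluster (for example, when working from a different machine), the kubeconfig entry can be recreated with the AWS CLI, using the cluster name and region from the configuration above.
# Regenerate the kubeconfig entry for the cluster
aws eks update-kubeconfig --name fh-eks-cluster --region us-east-1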
kubectl get nodes
# NAME STATUS ROLES AGE VERSION
# ip-192-168-...-21.ec2.internal Ready <none> 2m15s v1.32.9-eks-113cf36
# ...
Launch a Kafka cluster
With the EKS cluster running, we will now launch an Apache Kafka cluster into it. We will use the Strimzi Kafka operator, which simplifies the process of running Kafka on Kubernetes.
Install the Strimzi operator
First, create a dedicated namespace for the Kafka cluster.
kubectl create namespace kafka
Next, download the Strimzi operator installation YAML. The repository already contains the file manifests/kafka/strimzi-cluster-operator-0.45.1.yaml, but the following commands show how it was downloaded and modified for this guide.
## Define the Strimzi version and download URL
STRIMZI_VERSION="0.45.1"
DOWNLOAD_URL=https://github.com/strimzi/strimzi-kafka-operator/releases/download/$STRIMZI_VERSION/strimzi-cluster-operator-$STRIMZI_VERSION.yaml
## Download the operator manifest
curl -L -o manifests/kafka/strimzi-cluster-operator-$STRIMZI_VERSION.yaml ${DOWNLOAD_URL}
## Modify the manifest to install the operator in the 'kafka' namespace
sed -i 's/namespace: .*/namespace: kafka/' manifests/kafka/strimzi-cluster-operator-$STRIMZI_VERSION.yaml
Now, apply the manifest to install the Strimzi operator in your EKS cluster.
kubectl apply -f manifests/kafka/strimzi-cluster-operator-0.45.1.yaml -n kafka
Deploy a Kafka cluster
The configuration for our Kafka cluster is defined in manifests/kafka/kafka-cluster.yaml. It describes a simple, single-node cluster suitable for development, using ephemeral storage, meaning data will be lost if the pods restart.
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: fh-k8s-cluster
spec:
  kafka:
    version: 3.9.1
    replicas: 1
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
    # ... (content truncated for brevity)

Deploy the Kafka cluster with the following command:
kubectl create -f manifests/kafka/kafka-cluster.yaml -n kafka
Verify the deployment
After a few minutes, all the necessary pods and services for Kafka will be running. You can verify this by listing all resources in the kafka namespace.
kubectl get all -n kafka -o name
The output should look similar to this, showing the pods for Strimzi, Kafka, Zookeeper, and the associated services. The most important service for connecting applications is the Kafka bootstrap service.
# pod/fh-k8s-cluster-entity-operator-...
# pod/fh-k8s-cluster-kafka-0
# ...
# service/fh-k8s-cluster-kafka-bootstrap <-- Kafka bootstrap service
# ...
Deploy Kpow
Now that the EKS and Kafka clusters are running, we can deploy Kpow. This guide covers the deployment of both Kpow Annual and Kpow Hourly products. Both deployments will use a common set of configurations for connecting to Kafka and setting up authentication/authorization.
First, ensure you have a namespace for Kpow. The eksctl command we ran earlier already created the service accounts in the factorhouse namespace, so we will use that. If you hadn't created it, you would run kubectl create namespace factorhouse.
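Before continuing, it is worth confirming that the IRSA wiring is in place, since both Kpow deployments depend on it. A quick sketch of two ways to check:
# List the IAM service accounts eksctl created for the cluster
eksctl get iamserviceaccount --cluster fh-eks-cluster --region us-east-1
# Inspect the IAM role annotation on a service account (repeat for kpow-hourly)
kubectl get serviceaccount kpow-annual -n factorhouse \
  -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}'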
Create ConfigMaps
We will use two Kubernetes ConfigMaps to manage Kpow's configuration. This approach separates the core configuration from the Helm deployment values.
- kpow-config-files: This ConfigMap holds file-based configurations, including RBAC policies, JAAS configuration, and user properties for authentication.
- kpow-config: This ConfigMap provides environment variables to the Kpow container, such as the Kafka bootstrap address and settings to enable our authentication provider.
The contents of these files can be found in the repository at manifests/kpow/config-files.yaml and manifests/kpow/config.yaml.
manifests/kpow/config-files.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kpow-config-files
  namespace: factorhouse
data:
  hash-rbac.yml: |
    # RBAC policies defining user roles and permissions
    admin_roles:
      - "kafka-admins"
    # ... (content truncated for brevity)
  hash-jaas.conf: |
    # JAAS login module configuration
    kpow {
      org.eclipse.jetty.jaas.spi.PropertyFileLoginModule required
      file="/etc/kpow/jaas/hash-realm.properties";
    };
    # ... (content truncated for brevity)
  hash-realm.properties: |
    # User credentials (username: password, roles)
    # admin/admin
    admin: CRYPT:adpexzg3FUZAk,server-administrators,content-administrators,kafka-admins
    # user/password
    user: password,kafka-users
    # ... (content truncated for brevity)

manifests/kpow/config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kpow-config
  namespace: factorhouse
data:
  # Environment Configuration
  BOOTSTRAP: "fh-k8s-cluster-kafka-bootstrap.kafka.svc.cluster.local:9092"
  REPLICATION_FACTOR: "1"
  # AuthN + AuthZ
  JAVA_TOOL_OPTIONS: "-Djava.awt.headless=true -Djava.security.auth.login.config=/etc/kpow/jaas/hash-jaas.conf"
  AUTH_PROVIDER_TYPE: "jetty"
  RBAC_CONFIGURATION_FILE: "/etc/kpow/rbac/hash-rbac.yml"

Apply these manifests to create the ConfigMaps in the factorhouse namespace.
kubectl apply -f manifests/kpow/config-files.yaml \
  -f manifests/kpow/config.yaml -n factorhouse
You can verify their creation by running:
kubectl get configmap -n factorhouse
# NAME DATA AGE
# kpow-config 5 ...
# kpow-config-files 3 ...
Deploy Kpow Annual
Download the Helm chart
The Helm chart for Kpow Annual is in a private Amazon ECR repository. First, authenticate your Helm client.
# Enable Helm's experimental support for OCI registries
export HELM_EXPERIMENTAL_OCI=1
# Log in to the AWS Marketplace ECR registry
aws ecr get-login-password \
--region us-east-1 | helm registry login \
--username AWS \
  --password-stdin 709825985650.dkr.ecr.us-east-1.amazonaws.com
Next, pull and extract the chart.
# Create a directory, pull the chart, and extract it
mkdir -p awsmp-chart && cd awsmp-chart
# Pull the latest version of the Helm chart from ECR (add --version <x.x.x> to specify a version)
helm pull oci://709825985650.dkr.ecr.us-east-1.amazonaws.com/factor-house/kpow-aws-annual
tar xf $(pwd)/* && find $(pwd) -maxdepth 1 -type f -delete
cd ..
Launch Kpow Annual
Now, install Kpow using Helm. We will reference the service account kpow-annual that was created during the EKS cluster setup, which has the required IAM policy for license management.
helm install kpow-annual ./awsmp-chart/kpow-aws-annual/ \
-n factorhouse \
--set serviceAccount.create=false \
--set serviceAccount.name=kpow-annual \
  --values ./values/eks-annual.yaml
The Helm values for this deployment are in values/eks-annual.yaml. It mounts the configuration files from our ConfigMaps and sets resource limits.
# values/eks-annual.yaml
env:
  ENVIRONMENT_NAME: "Kafka from Kpow Annual"
envFromConfigMap: "kpow-config"
volumeMounts:
  - name: kpow-config-volumes
    mountPath: /etc/kpow/rbac/hash-rbac.yml
    subPath: hash-rbac.yml
  - name: kpow-config-volumes
    mountPath: /etc/kpow/jaas/hash-jaas.conf
    subPath: hash-jaas.conf
  - name: kpow-config-volumes
    mountPath: /etc/kpow/jaas/hash-realm.properties
    subPath: hash-realm.properties
volumes:
  - name: kpow-config-volumes
    configMap:
      name: "kpow-config-files"
resources:
  limits:
    cpu: 1
    memory: 0.5Gi
  requests:
    cpu: 1
    memory: 0.5Gi

Note: The CPU and memory values are intentionally set low for this guide. For production environments, check the official documentation for recommended capacity.
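If you would rather adjust capacity without editing the values file, the same keys can be overridden on the command line. This is a sketch only, and it assumes the chart reads the top-level resources block exactly as shown in the values file above.
# Example: raise the memory request/limit for an existing release
helm upgrade kpow-annual ./awsmp-chart/kpow-aws-annual/ \
  -n factorhouse \
  --reuse-values \
  --set resources.requests.memory=2Gi \
  --set resources.limits.memory=2Gi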
Verify and access Kpow Annual
Check that the Kpow pod is running successfully.
kubectl get all -l app.kubernetes.io/instance=kpow-annual -n factorhouse
# NAME READY STATUS RESTARTS AGE
# pod/kpow-annual-kpow-aws-annual-c6bc849fb-zw5ww 0/1 Running 0 46s
# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
# service/kpow-annual-kpow-aws-annual ClusterIP 10.100.220.114 <none> 3000/TCP 47s
# ...
To access the UI, forward the service port to your local machine.
kubectl -n factorhouse port-forward service/kpow-annual-kpow-aws-annual 3000:3000
You can now access Kpow by navigating to http://localhost:3000 in your browser.
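If the pod is not ready yet, it is often still validating its license with AWS License Manager; the startup logs usually show the reason. A quick way to follow them, using the instance label from the command above:
# Tail the Kpow Annual pod logs
kubectl logs -f -l app.kubernetes.io/instance=kpow-annual -n factorhouse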

Deploy Kpow Hourly
Configure the Kpow Helm repository
The Helm chart for Kpow Hourly is available in the Factor House Helm repository. First, add the Helm repository.
helm repo add factorhouse https://charts.factorhouse.io
Next, update Helm repositories to ensure you install the latest version of Kpow.
helm repo update
Launch Kpow Hourly
Install Kpow using Helm, referencing the kpow-hourly service account which has the IAM policy for marketplace metering.
helm install kpow-hourly factorhouse/kpow-aws-hourly \
-n factorhouse \
--set serviceAccount.create=false \
--set serviceAccount.name=kpow-hourly \
  --values ./values/eks-hourly.yaml
The Helm values are defined in values/eks-hourly.yaml.
# values/eks-hourly.yaml
env:
  ENVIRONMENT_NAME: "Kafka from Kpow Hourly"
envFromConfigMap: "kpow-config"
volumeMounts:
  # ... (volume configuration is the same as annual)
volumes:
  # ...
resources:
  # ...

Verify and access Kpow Hourly
Check that the Kpow pod is running.
kubectl get all -l app.kubernetes.io/instance=kpow-hourly -n factorhouse
# NAME READY STATUS RESTARTS AGE
# pod/kpow-hourly-kpow-aws-hourly-68869b6cb9-x9prf 0/1 Running 0 83s
# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
# service/kpow-hourly-kpow-aws-hourly ClusterIP 10.100.221.36 <none> 3000/TCP 85s
# ...
To access the UI, forward the service port to a different local port (e.g., 3001) to avoid conflicts.
kubectl -n factorhouse port-forward service/kpow-hourly-kpow-aws-hourly 3001:3000
You can now access Kpow by navigating to http://localhost:3001 in your browser.
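Because the hourly product reports usage through AWS Marketplace metering, the pod must run under the kpow-hourly service account created by eksctl. A quick sanity check (a sketch, using the same instance label as above):
# Confirm the pod is using the kpow-hourly service account
kubectl get pod -l app.kubernetes.io/instance=kpow-hourly -n factorhouse \
  -o jsonpath='{.items[0].spec.serviceAccountName}'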

Delete resources
To avoid ongoing AWS charges, clean up all created resources in reverse order.
Delete Kpow and ConfigMaps
helm uninstall kpow-annual kpow-hourly -n factorhouse
kubectl delete -f manifests/kpow/config-files.yaml \
  -f manifests/kpow/config.yaml -n factorhouse
Delete the Kafka cluster and Strimzi operator
STRIMZI_VERSION="0.45.1"
kubectl delete -f manifests/kafka/kafka-cluster.yaml -n kafka
kubectl delete -f manifests/kafka/strimzi-cluster-operator-$STRIMZI_VERSION.yaml -n kafka
Delete the EKS cluster
This command will remove the cluster and all associated resources.
eksctl delete cluster -f manifests/eks/cluster.eksctl.yaml
Conclusion
In this guide, we have successfully deployed a complete, production-ready environment for monitoring Apache Kafka on AWS. By leveraging eksctl, we provisioned a robust EKS cluster with correctly configured IAM roles for service accounts, a critical step for secure integration with AWS services. We then deployed a Kafka cluster using the Strimzi operator, demonstrating the power of Kubernetes operators in simplifying complex stateful applications.
Finally, we walked through the deployment of both Kpow Annual and Kpow Hourly from the AWS Marketplace. This showcased the flexibility of Kpow's subscription models and their seamless integration with AWS for licensing and metering. You are now equipped with the knowledge to set up, configure, and manage Kpow on EKS, unlocking powerful insights and operational control over your Kafka ecosystem.

Release 94.6: Factor Platform, Ververica Integration, and kJQ Enhancements
The first Factor Platform release candidate is here, a major milestone toward a unified control plane for real-time data streaming technologies. This release also introduces Ververica Platform integration in Flex, plus support for Kafka Clients 4.1 / Confluent 8.0.0 and new kJQ operators for richer stream inspection.
Factor Platform release candidate: Early access to unified streaming control
For organisations operating streaming at scale, the challenge has never been about any one technology. It's about managing complexity across regions, tools, and teams while maintaining governance, performance, and cost control.
We've spent years building tools that bring clarity to Apache Kafka and Apache Flink. Now, we're taking everything we've learned and building something bigger: Factor Platform, a unified control plane for real-time data infrastructure.
Factor Platform delivers complete visibility and federated control across hundreds of clusters, multiple clouds, and distributed teams from a single interface. Engineers gain deep operational insight into jobs, topics, and lineage. Business and compliance teams benefit from native catalogs, FinOps intelligence, and audit-ready transparency.
The first release candidate is live. It's designed for early adopters exploring large-scale, persistent streaming environments, and it's ready to be shaped by the teams who use it.
Interested in early access? Contact sales@factorhouse.io

Unlocking native Flink management with Ververica Platform
Our collaboration with Ververica, the original creators of Apache Flink, enters a new phase with the introduction of the Flex + Ververica Platform integration. This brings Flink’s enterprise management and observability capabilities directly into the Factor House ecosystem.
Flex users can now connect to Ververica Platform (Community or Enterprise v2) and instantly visualize session clusters, job deployments, and runtime performance. The current release provides a snapshot view of Ververica resources at startup, with live synchronization planned for future updates. It's a huge step toward true end-to-end streaming visibility—from data ingestion, to transformation, to delivery.
Configuration is straightforward: point to your Ververica REST API, authenticate via secure token, and your Flink environments appear right alongside your clusters.
This release represents just the beginning of our partnership with Ververica. Together, we’re exploring deeper integrations across the Flink ecosystem, including OpenShift and Amazon Managed Service for Apache Flink, to make enterprise-scale stream processing simpler and more powerful.
Read the full Ververica Platform integration guide →
Advancing Kafka support with Kafka Clients 4.1.0 and Confluent Schema SerDes 8.0.0
We’ve upgraded to Kafka Clients 4.1.0 / Confluent Schema SerDes 8.0.0, aligning Kpow with the latest Kafka ecosystem updates. Teams using custom Protobuf Serdes should review potential compatibility changes.
Data Inspect gets more powerful with kJQ enhancements
Data Inspect in Kpow has been upgraded with improvements to kJQ, our lightweight JSON query language for streaming data. The new release introduces map() and select() functions, expanding the expressive power of kJQ for working with nested and dynamic data. These additions make it possible to iterate over collections, filter elements based on complex conditions, and compose advanced data quality or anomaly detection filters directly in the browser. Users can now extract specific values from arrays, filter deeply nested structures, and chain logic with built-in functions like contains, test, and is-empty.
For example, you can now write queries like:
.value.correctingProperty.names | map(.localeLanguageCode) | contains("pt")
Or filter and validate nested collections:
.value.names | map(select(.languageCode == "pt-Pt")) | is-empty | not
These updates make Data Inspect far more powerful for real-time debugging, validation, and exploratory data analysis. Explore the full range of examples and interactive demos in the kJQ documentation.
See map() and select() in action in the kJQ Playground →
Schema Registry performance improvements
We’ve greatly improved Schema Registry performance for large installations. The observation process now cuts down on the number of REST calls each schema observation makes by an order of magnitude. Kpow now defaults to SCHEMA_REGISTRY_OBSERVATION_VERSION=2, meaning all customers automatically benefit from these performance boosts.
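Because the observation version is controlled by a single environment variable, it can also be pinned explicitly rather than relying on the default. The snippet below is illustrative only; substitute your actual Kpow image and the rest of your configuration.
# Illustrative: pin the observation version explicitly (other required settings omitted)
docker run -e SCHEMA_REGISTRY_OBSERVATION_VERSION=2 -p 3000:3000 <kpow-image>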
Kpow Custom Serdes and Protobuf v4.31.1
This post explains an update in the version of protobuf libraries used by Kpow, and a possible compatibility impact this update may cause to user defined Custom Serdes.
Kpow Custom Serdes and Protobuf v4.31.1
Note: The potential compatibility issues described in this post only impact users who have implemented Custom Serdes that contain generated protobuf classes.
Resolution: If you encounter these compatibility issues, resolve them by re-generating any generated protobuf classes with protoc v31.1.
In the upcoming v94.6 release of Kpow, we're updating all Confluent Serdes dependencies to the latest major version 8.0.1.
In io.confluent/kafka-protobuf-serializer:8.0.1 the protobuf version is advanced from 3.25.5 to 4.31.1, and so the version of protobuf used by Kpow changes.
- Confluent protobuf upgrade PR: https://github.com/confluentinc/schema-registry/pull/3569
- Related Github issue: https://github.com/confluentinc/schema-registry/issues/3047
This is a major upgrade of the underlying protobuf libraries, and there are some breaking changes related to generated code.
Protobuf 3.26.6 introduces a breaking change that fails at runtime (deliberately) if the makeExtensionsImmutable method is called as part of generated protobuf code.
The decision to break at runtime was taken because earlier versions of protobuf were found to be vulnerable to the footmitten CVE.
- Protobuf footmitten CVE and breaking change announcement: https://protobuf.dev/news/2025-01-23/
- Apache protobuf discussion thread: https://lists.apache.org/thread/87osjw051xnx5l5v50dt3t81yfjxygwr
- Comment on a Schema Registry ticket: https://github.com/confluentinc/schema-registry/issues/3360
We found that when we advanced to the 8.0.1 version of the libraries, we encountered issues with some test classes generated by 3.x protobuf libraries.
Compilation issues:
Compiling 14 source files to /home/runner/work/core/core/target/kpow-enterprise/classes
/home/runner/work/core/core/modules/kpow/src-java-dev/factorhouse/serdes/MyRecordOuterClass.java:129: error: cannot find symbol
makeExtensionsImmutable();
^
symbol: method makeExtensionsImmutable()
location: class MyRecord
Runtime issues:
Bad type on operand stack
Exception Details:
Location:
io/confluent/kafka/schemaregistry/protobuf/ProtobufSchema.toMessage(Lcom/google/protobuf/DescriptorProtos$FileDescriptorProto;Lcom/google/protobuf/DescriptorProtos$DescriptorProto;)Lcom/squareup/wire/schema/internal/parser/MessageElement; : invokestatic
Reason:
Type 'com/google/protobuf/DescriptorProtos$MessageOptions' (current frame, stack[1]) is not assignable to 'com/google/protobuf/GeneratedMessage$ExtendableMessage'
Current Frame:
bci:
flags: { }
locals: { 'com/google/protobuf/DescriptorProtos$FileDescriptorProto', 'com/google/protobuf/DescriptorProtos$DescriptorProto', 'java/lang/String', 'com/google/common/collect/ImmutableList$Builder', 'com/google/common/collect/ImmutableList$Builder', 'com/google/common/collect/ImmutableList$Builder', 'com/google/common/collect/ImmutableList$Builder', 'java/util/LinkedHashMap', 'java/util/LinkedHashMap', 'java/util/List', 'com/google/common/collect/ImmutableList$Builder' }
stack: { 'com/google/common/collect/ImmutableList$Builder', 'com/google/protobuf/DescriptorProtos$MessageOptions' }
Bytecode:
0000000: 2bb6 0334 4db2 0072 1303 352c b903 3703
0000010: 00b8 0159 4eb8 0159 3a04 b801 593a 05b8
0000020: 0159 3a06 bb02 8959 b702 8b3a 07bb 0289
If you encounter these compatibility issues, resolve them by re-generating any generated protobuf classes with protoc v31.1.
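A minimal sketch of that regeneration step, assuming protoc v31.1 (or later) is installed and that your .proto files live under src/main/proto; the paths and file name here are illustrative.
# Check the installed compiler version
protoc --version
# Re-generate the Java classes referenced by your Custom Serde
protoc --proto_path=src/main/proto --java_out=src/main/java src/main/proto/my_record.proto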
All Resources

Join the conversation: Factor House launches open Slack for the real-time data community
Factor House has opened a public Slack for anyone working with streaming data, from seasoned engineers to newcomers exploring real-time systems. This space offers faster peer-to-peer support, open discussion across the ecosystem, and a friendly on-ramp for those just getting started.
Come and #say-hi
We've opened the Factor House Community Slack: a public space for anyone working in the streaming data world. Whether you're a seasoned data engineer, exploring streaming technologies for the first time, or just curious about real-time systems, this is your place to connect, learn, and share.
Join the Factor House Community Slack →
Our community isn’t just a chatroom, it’s part of building the collective intelligence that will shape the future of data in motion. We’re connecting the engineers, operators, and curious learners who will define what streaming infrastructure looks like in the AI era.
What changed (and why)
For years, our team, customers, and broader community were all in one busy workspace. It was a good start, but as our products and community grew, so did the noise. Support conversations were mixed with team chatter and community discussions, leaving newcomers unsure of where to begin.
We've split into two focused spaces:
- Factor House Community (public) - Where the magic happens
- Private workspace - For our team and client engagements
To our existing customers: thank you for your patience during this transition. Your feedback helps us build something better, and we're grateful for engineers who push us to improve.
Why join?
- Current users: Get faster answers from real humans who've solved similar problems. Our team is active daily on weekdays (Australian timezone), and we're building a community that helps each other.
- Newcomers: This is your friendly on-ramp to the data streaming world. Ask the "obvious" questions - we love helping engineers grow.
- Ecosystem: We're vendor-neutral and open-source friendly. Discuss any tools, share knowledge, and make announcements. The more diverse perspectives, the better.
Here’s where you’ll find us (and each other):
- #say-hi: introduce yourself to the community
- #getting-started: new to Kpow or Flex? Begin your journey here
- #ask-anything: all questions welcome, big or small
- #product-kpow & #product-flex: features, releases, and best practices for FH tooling
- #house-party: off-topic bants, memes, and pet pics
Our team is in the mix too. You’ll spot us by the Factor House logo in our status.
Community Guidelines
We want this community to reflect the best parts of engineering culture: openness, generosity, and curiosity. It’s not just about solving problems faster, it’s about building a place where people can do their best work together.
This is a friendly, moderated space. We ask everyone to be respectful and inclusive (read our code of conduct). Keep conversations in public channels wherever possible so everyone benefits.
For our customers: use your dedicated support channels or support@factorhouse.io for SLA-bound requests and bug reports. The community Slack is best-effort support.
Ready to Join?
This Slack is the seed of a wider ecosystem. It's a place where engineers share knowledge, swap stories, and push the boundaries of what’s possible with streaming data. It’s the beginning of a developer community that will grow alongside our platform.
This community will become what we make of it. We're hoping for technical discussions, mutual help, and the kind of engineering conversations that make your day better.
Join the Factor House Community Slack →
Come #say-hi and tell us what you're working on. We're genuinely curious about what keeps data engineers busy.

Building a Real-Time Leaderboard with Kafka and Flink
Learn how to build a real-time "Top-K" analytics pipeline from scratch using a modern data stack. This open-source project guides you through using Apache Kafka, Apache Flink, and Streamlit to ingest, process, and visualize live data, turning a continuous stream of events into actionable insights on an interactive dashboard.
Overview
In today's data-driven world, the ability to process and analyze information in real-time is a significant competitive advantage across many industries. Whether it's tracking top-selling products in e-commerce, identifying trending topics on social media, or monitoring high-performing assets in finance, real-time analytics pipelines are essential for gaining immediate insights.
This post explores a complete, open-source project that demonstrates how to build a real-time "Top-K" analytics pipeline. You'll learn how to ingest a continuous stream of data, process it on the fly to compute key performance metrics, and visualize the results on an interactive dashboard.
The core of this project is a robust data pipeline that can be broken down into three key stages:
- Data Generation: A Python script continuously generates a stream of simulated user events, which are then published to an Apache Kafka topic.
- Metrics Processing: Four distinct Apache Flink SQL jobs consume the raw data stream from Kafka. Each job is tailored to calculate a specific real-time leaderboard metric: Top Teams, Top Players, Hot Streakers, and Team MVPs. The results are written to their own dedicated Kafka topics.
- Dashboard Visualization: A Streamlit web application reads the processed metrics from the Flink output topics and presents them on a dynamic, real-time dashboard, offering at-a-glance insights into performance.
💡 The complete project, including all source code and setup instructions, is available on GitHub.
🚀 This project uses Factor House Local to spin up the development environment. See Introduction to Factor House Local to learn more about experimenting with modern data architectures using Kafka, Flink, Spark, Iceberg, and Pinot.

Diving into the Real-Time Metrics
The pipeline continuously computes four different leaderboard-style metrics using Flink SQL. A DDL script initially sets up the necessary source and sink tables. The source table, user_scores, reads directly from a Kafka topic. Each Flink SQL query consumes this stream, performs its calculations, and writes the output to a corresponding sink table (top_teams, top_players, hot_streakers, or team_mvps). These sink tables use the upsert-kafka connector, which ensures that the leaderboards are continuously updated as new data arrives.
- Top Teams: This metric identifies the top 10 entities (grouped as "teams") with the highest cumulative scores, providing a global view of group performance. The underlying Flink SQL query groups the data by team_id, calculates a running sum of scores, and then ranks the teams. To ensure accuracy over long periods, the state for this data has a time-to-live (TTL) of 60 minutes.
- Top Players: Similar to the Top Teams metric, this leaderboard showcases the top 10 individual entities (or "players") with the highest scores. The logic is much the same: the stream is grouped by user_id, a cumulative score is calculated, and the entities are ranked globally. This also has a 60-minute TTL to maintain consistent stats over extended sessions.
- Hot Streakers: This metric is designed to highlight the top 10 entities currently on a "hot streak," meaning their short-term performance is significantly outpacing their historical average. The query for this uses sliding time windows to calculate a short-term average (over 10 seconds) and a long-term average (over 60 seconds). The ratio between these two averages determines the "hotness." Since this metric focuses on recent activity, it uses a shorter state TTL of 5 minutes.
- Team MVPs: This metric first identifies the Most Valuable Player (MVP) for each team—the entity that contributed the largest percentage of the team's total score. It then ranks these MVPs across all teams to find the top 10 overall. This is achieved using Common Table Expressions (CTEs) in SQL to first calculate total scores per entity and per team, and then these are joined to determine each entity's contribution ratio.
Together, these metrics offer a rich, real-time view of system dynamics, highlighting top-performing groups, standout individuals, and rising stars. The final results are streamed to a responsive dashboard that displays the leaderboards in continuously refreshing bar charts, with each chart powered by its own dedicated Kafka topic.
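Because each leaderboard is materialized to its own Kafka topic through the upsert-kafka connector, you can also watch the raw output directly, independently of the dashboard. A rough sketch using the standard console consumer; the container name and broker address depend on how Factor House Local is configured and are illustrative here.
# Watch the continuously updated Top Teams leaderboard (keys are the team ids)
docker exec -it kafka-1 kafka-console-consumer \
  --bootstrap-server localhost:9092 \
  --topic top_teams \
  --property print.key=true \
  --from-beginning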

Conclusion
This project serves as a practical blueprint for building powerful, real-time analytics systems. By combining the high-throughput messaging of Apache Kafka, the stateful stream processing of Apache Flink, and the rapid UI development of Streamlit, you can create sophisticated pipelines that deliver valuable insights with minimal latency.
The "Top-K" pattern is a versatile one, applicable to countless domains beyond the example shown here. The principles of stream ingestion, real-time aggregation, and interactive visualization form a solid foundation for any developer looking to harness the power of live data. We encourage you to clone the repository, run the project yourself, and adapt the architecture to your own unique use cases.
Introduction to Factor House Local
Jumpstart your journey into modern data engineering with Factor House Local. Explore pre-configured Docker environments for Kafka, Flink, Spark, and Iceberg, enhanced with enterprise-grade tools like Kpow and Flex. Our hands-on labs guide you step-by-step, from building your first Kafka client to creating a complete data lakehouse and real-time analytics system. It's the fastest way to learn, prototype, and build sophisticated data platforms.
Factor House Local is a collection of pre-configured Docker Compose environments that demonstrate modern data platform architectures. Each setup is purpose-built around a specific use case and incorporates widely adopted technologies such as Kafka, Flink, Spark, Iceberg, and Pinot. These environments are further enhanced by enterprise-grade tools from Factor House: Kpow, for Kafka management and control, and Flex, for seamless integration with Flink.
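Getting started is intentionally lightweight: clone the repository and bring up the stack you need with Docker Compose. The sketch below assumes Docker and Docker Compose are installed; the repository URL and compose file name are best-guess placeholders, so check the project README for the exact names.
# Clone the project and start the Kafka + Kpow stack (file name is illustrative)
git clone https://github.com/factorhouse/factorhouse-local.git
cd factorhouse-local
docker compose -f compose-kpow.yml up -d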

Data Stack
Kafka Development & Monitoring with Kpow
This stack provides a comprehensive, locally deployable Apache Kafka environment designed for robust development, testing, and operations. It utilizes Confluent Platform components, featuring a high-availability 3-node Kafka cluster, Zookeeper, Schema Registry for data governance, and Kafka Connect for data integration.
The centerpiece of the stack is Kpow (by Factorhouse), an enterprise-grade management and observability toolkit. Kpow offers a powerful web UI that provides deep visibility into brokers, topics, and consumer groups. Key features include real-time monitoring, advanced data inspection using kJQ (allowing complex queries across various data formats like Avro and Protobuf), and management of Schema Registry and Kafka Connect. Kpow also adds critical enterprise features such as Role-Based Access Control (RBAC), data masking/redaction for sensitive information, and audit logging.
Ideal For: Building and testing microservices, managing data integration pipelines, troubleshooting Kafka issues, and enforcing data governance in event-driven architectures.
Unified Analytics Platform (Flex, Flink, Spark, Iceberg & Hive Metastore)
This architecture establishes a modern Data Lakehouse that seamlessly integrates real-time stream processing and large-scale batch analytics. It eliminates data silos by allowing both Apache Flink (for streaming) and Apache Spark (for batch) to operate on the same data.
The foundation is built on Apache Iceberg tables stored in MinIO (S3-compatible storage), providing ACID transactions and schema evolution. A Hive Metastore, backed by PostgreSQL, acts as the unified catalog for both Flink and Spark. The PostgreSQL instance is also configured for Change Data Capture (CDC), enabling real-time synchronization from transactional databases into the lakehouse.
The stack includes Flex (by Factorhouse), an enterprise toolkit for managing and monitoring Apache Flink, offering enhanced security, multi-tenancy, and deep insights into Flink jobs. A Flink SQL Gateway is also included for interactive queries on live data streams.
Ideal For: Unified batch and stream analytics, real-time ETL, CDC pipelines from operational databases, fraud detection, and interactive self-service analytics on a single source of truth.
Apache Pinot Real-Time OLAP Cluster
This stack deploys the core components of Apache Pinot, a distributed OLAP (Online Analytical Processing) datastore specifically engineered for ultra-low-latency analytics at high throughput. Pinot is designed to ingest data from both batch sources (like S3) and streaming sources (like Kafka) and serve analytical queries with millisecond response times.
Ideal For: Powering real-time, interactive dashboards; user-facing analytics embedded within applications (where immediate feedback is crucial); anomaly detection; and rapid A/B testing analysis.
Centralized Observability & Data Lineage
This stack provides a complete solution for understanding both system health and data provenance. It combines Marquez, the reference implementation of the OpenLineage standard, with the industry-standard monitoring suite of Prometheus, Grafana, and Alertmanager.
At its core, OpenLineage enables automated data lineage tracking for Kafka, Flink, and Spark workloads by providing a standardized API for emitting metadata about jobs and datasets. Marquez consumes these events to build a living, interactive map of your data ecosystem. This allows you to trace how datasets are created and consumed, making it invaluable for impact analysis and debugging. The Prometheus stack complements this by collecting time-series metrics from all applications, visualizing them in Grafana dashboards, and using Alertmanager to send proactive notifications about potential system issues.
Ideal For: Tracking data provenance, performing root cause analysis for data quality issues, monitoring the performance of the entire data platform, and providing a unified view of both data lineage and system health.
Factor House Local Labs

The Factor House Local labs are a series of 12 hands-on tutorials designed to guide developers through building real-time data pipelines and analytics systems. The labs use a common dataset of orders from a Kafka topic to demonstrate a complete, end-to-end workflow, from data ingestion to real-time analytics.
The labs are organized around a few key themes:
💧 Lab 1 - Streaming with Confidence:
- Learn to produce and consume Avro data using Schema Registry. This lab helps you ensure data integrity and build robust, schema-aware Kafka streams.
🔗 Lab 2 - Building Data Pipelines with Kafka Connect:
- Discover the power of Kafka Connect! This lab shows you how to stream data from sources to sinks (e.g., databases, files) efficiently, often without writing a single line of code.
🧠 Labs 3, 4, 5 - From Events to Insights:
- Unlock the potential of your event streams! Dive into building real-time analytics applications using powerful stream processing techniques. You'll work on transforming raw data into actionable intelligence.
🏞️ Labs 6, 7, 8, 9, 10 - Streaming to the Data Lake:
- Build modern data lake foundations. These labs guide you through ingesting Kafka data into highly efficient and queryable formats like Parquet and Apache Iceberg, setting the stage for powerful batch and ad-hoc analytics.
💡 Labs 11, 12 - Bringing Real-Time Analytics to Life:
- See your data in motion! You'll construct reactive client applications and dashboards that respond to live data streams, providing immediate insights and visualizations.
Overall, the labs provide a practical, production-inspired journey, showing how to leverage Kafka, Flink, Spark, Iceberg, and Pinot together to build sophisticated, real-time data platforms.
Conclusion
Factor House Local is more than just a collection of Docker containers; it represents a holistic learning and development ecosystem for modern data engineering.
The pre-configured stacks serve as the ready-to-use "what," providing the foundational architecture for today's data platforms. The hands-on labs provide the practical "how," guiding users step-by-step through building real-world data pipelines that solve concrete problems.
By bridging the gap between event streaming (Kafka), large-scale processing (Flink, Spark), modern data storage (Iceberg), and low-latency analytics (Pinot), Factor House Local demystifies the complexity of building integrated data systems. Furthermore, the inclusion of enterprise-grade tools like Kpow and Flex demonstrates how to operate these systems with the observability, control, and security required for production environments.
Whether you are a developer looking to learn new technologies, an architect prototyping a new design, or a team building the foundation for your next data product, Factor House Local provides the ideal starting point to accelerate your journey.
Integrate Kpow with Google Managed Schema Registry
Kpow 94.3 now integrates with Google Cloud's managed Schema Registry, enabling native OAuth authentication. This guide walks through the complete process of configuring authentication and using Kpow to create, manage, and inspect data validated against Avro schemas.
Overview
Google Cloud has enhanced its platform with the launch of a managed Schema Registry for Apache Kafka, a critical service for ensuring data quality and schema evolution in streaming architectures. Kpow 94.3 expands its support for Google Managed Service for Apache Kafka by integrating the managed schema registry. This allows users to manage Kafka clusters, topics, consumer groups, and schemas from a single interface.
Building on our earlier setup guide, this post details how to configure the new schema registry integration and demonstrates how to leverage the Kpow UI for working effectively with Avro schemas.
About Factor House
Factor House is a leader in real-time data tooling, empowering engineers with innovative solutions for Apache Kafka® and Apache Flink®.
Our flagship product, Kpow for Apache Kafka, is the market-leading enterprise solution for Kafka management and monitoring.
Explore our live multi-cluster demo environment or grab a free Community license and dive into streaming tech on your laptop with Factor House Local.

Prerequisites
In this tutorial, we will use the Community Edition of Kpow, where the default user has all the necessary permissions to complete the tasks. For those using the Kpow Enterprise Edition with user authorization enabled, the logged-in user must have the SCHEMA_CREATE permission for Role-Based Access Control or have ALLOW_SCHEMA_CREATE=true set for Simple Access Control. More information can be found in the Kpow User Authorization documentation.
We also assume that a Managed Kafka cluster has already been created, as detailed in the earlier setup guide. This cluster will serve as the foundation for the configurations and operations covered in this tutorial.
Create a Google Managed Schema Registry
We can create a schema registry using the gcloud beta managed-kafka schema-registries create command. A minimal example is shown below; the registry ID and location must match your project, and the exact flags may vary with your gcloud version:
gcloud beta managed-kafka schema-registries create demo_schema_registry \
--project=$PROJECT_ID \
--location=$REGION
Once the command completes, we can verify that the new registry, demo_schema_registry, is visible in the GCP Console under the Kafka services.

Set up a client VM
The default service account used by the client VM is granted the following roles. While these roles provide Kpow with administrative access, user-level permissions can still be controlled using User Authorization - an enterprise-only feature:
- Managed Kafka Admin: Grants full access to manage Kafka topics, configurations, and access controls in GCP’s managed Kafka environment.
- Schema Registry Admin: Allows registering, evolving, and managing schemas and compatibility settings in the Schema Registry.
To connect to the Kafka cluster, Kpow must run on a machine with network access to it. In this setup, we use a Google Cloud Compute Engine VM that must be in the same region, VPC, and subnet as the Kafka cluster. We also attach the http-server tag to allow HTTP traffic, enabling browser access to Kpow’s UI.
We can create the client VM using the following command:
gcloud compute instances create kafka-test-instance \
--scopes=https://www.googleapis.com/auth/cloud-platform \
--tags=http-server \
--subnet=projects/$PROJECT_ID/regions/$REGION/subnetworks/default \
--zone=$REGION-a
Launch a Kpow Instance
Once our client VM is up and running, we'll connect to it using the SSH-in-browser tool provided by Google Cloud. After establishing the connection, the first step is to install Docker Engine, as Kpow will be launched using Docker. Refer to the official installation and post-installation guides for detailed instructions.
Preparing Kpow Configuration
To get Kpow running with a Google Cloud managed Kafka cluster and its schema registry, we prepare a configuration file (gcp-trial.env) that defines all necessary connection and authentication settings, as well as the Kpow license details.
The configuration is divided into three main parts: Kafka cluster connection, schema registry integration, and license activation.
## Kafka Cluster Configuration
ENVIRONMENT_NAME=GCP Kafka Cluster
BOOTSTRAP=bootstrap.<cluster-id>.<gcp-region>.managedkafka.<gcp-project-id>.cloud.goog:9092
SECURITY_PROTOCOL=SASL_SSL
SASL_MECHANISM=OAUTHBEARER
SASL_LOGIN_CALLBACK_HANDLER_CLASS=com.google.cloud.hosted.kafka.auth.GcpLoginCallbackHandler
SASL_JAAS_CONFIG=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required;
## Schema Registry Configuration
SCHEMA_REGISTRY_NAME=GCP Schema Registry
SCHEMA_REGISTRY_URL=https://managedkafka.googleapis.com/v1/projects/<gcp-project-id>/locations/<gcp-region>/schemaRegistries/<registry-id>
SCHEMA_REGISTRY_BEARER_AUTH_CUSTOM_PROVIDER_CLASS=com.google.cloud.hosted.kafka.auth.GcpBearerAuthCredentialProvider
SCHEMA_REGISTRY_BEARER_AUTH_CREDENTIALS_SOURCE=CUSTOM
## Your License Details
LICENSE_ID=<license-id>
LICENSE_CODE=<license-code>
LICENSEE=<licensee>
LICENSE_EXPIRY=<license-expiry>
LICENSE_SIGNATURE=<license-signature>
In the Kafka Cluster Configuration section, the ENVIRONMENT_NAME variable sets a friendly label visible in the Kpow user interface. The BOOTSTRAP variable specifies the Kafka bootstrap server address, incorporating the cluster ID, Google Cloud region, and project ID.
Authentication and secure communication are handled via SASL over SSL using OAuth tokens. The SASL_MECHANISM is set to OAUTHBEARER, enabling OAuth-based authentication. The class GcpLoginCallbackHandler automatically manages OAuth tokens using the VM's service account or a specified credentials file, simplifying token management and securing Kafka connections.
The Schema Registry Configuration section integrates Kpow with Google Cloud's managed Schema Registry service. The SCHEMA_REGISTRY_NAME is a descriptive label for the registry in Kpow. The SCHEMA_REGISTRY_URL points to the REST API endpoint for the schema registry; placeholders must be replaced with the actual project ID, region, and registry ID.
For authentication, Kpow uses Google's GcpBearerAuthCredentialProvider to acquire OAuth2 tokens when accessing the schema registry API. Setting SCHEMA_REGISTRY_BEARER_AUTH_CREDENTIALS_SOURCE to CUSTOM tells Kpow to use this provider, allowing seamless and secure schema fetch and management with Google Cloud's identity controls.
Finally, the License Details section contains essential license parameters required to activate and run Kpow.
Launching Kpow
Once the gcp-trial.env file is ready, we can launch Kpow using Docker. The command below pulls the latest Community Edition image, loads the environment config, and maps port 80 on the host VM to Kpow's UI port 3000. This allows us to access the Kpow UI directly in the browser at http://<vm-external-ip>.
docker run --pull=always -p 80:3000 --name kpow \
--env-file gcp-trial.env -d factorhouse/kpow-ce:latest
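Once the container is running, a quick way to confirm that Kpow started cleanly before opening the UI in the browser is to check its status and logs with standard Docker commands:
docker ps --filter name=kpow
docker logs -f kpow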

Schema Management
With our environment up and running, we can use Kpow to create a new schema subject in the GCP Schema Registry.
- In the Schema menu, click Create subject.
- Since we only have one registry configured, the GCP Schema Registry is selected by default.
- Enter a subject name (e.g., demo-gcp-value), choose AVRO as the type, and provide a schema definition. Click Create.

Once created, the new subject appears in the Schema menu within Kpow. This allows us to easily view, manage, and interact with the schema.

Working with Avro Data
Next, we'll produce and inspect an Avro record that is validated against the schema we just created.
First, create a new topic named demo-gcp from the Kpow UI.

Now, to produce a record to the demo-gcp topic:
- Go to the Data menu, select the topic, and open the Produce tab.
- Select String as the Key Serializer.
- Set the Value Serializer to AVRO.
- Choose GCP Schema Registry as the Schema Registry.
- Select the demo-gcp-value subject.
- Enter key/value data and click Produce.

To see the result, navigate back to the Inspect tab and select the demo-gcp topic. In the deserializer options, choose String as the Key deserializer and AVRO as the Value deserializer, then select GCP Schema Registry. Kpow automatically fetches the correct schema version, deserializes the binary Avro message, and presents the data as easy-to-read JSON.
Tip: Kpow 94.3 introduces automatic deserialization of keys and values. For users unfamiliar with a topic's data format, selecting Auto lets Kpow attempt to infer and deserialize the records automatically as they are consumed.

Conclusion
Integrating Kpow with Google Cloud's Managed Schema Registry consolidates our entire Kafka management workflow into a single, powerful platform. By following this guide, we have seen how to configure Kpow to securely connect to both GCP Managed Kafka and the Schema Registry using native OAuth authentication, completely removing the need for manual token handling.
The result is a seamless, end-to-end experience where we can create and manage schemas, produce and consume schema-validated data, and inspect the records—all from the Kpow UI. This powerful combination streamlines development, enhances data governance, and empowers engineering teams to fully leverage Google Cloud's managed Kafka services.
Improvements to Data Inspect in Kpow 94.3
Kpow's 94.3 release is here, transforming how you work with Kafka. Instantly query topics using plain English with our new AI-powered filtering, automatically decode any message format without manual setup, and leverage powerful new enhancements to our kJQ language. This update makes inspecting Kafka data more intuitive and powerful than ever before.
Overview
Kpow's Data Inspect feature has always been a cornerstone for developers working with Apache Kafka, offering a powerful way to query and understand topic data, as introduced in our earlier guide on how to query a Kafka topic.
The 94.3 release dramatically enhances this experience by introducing a suite of intelligent and user-friendly upgrades. This release focuses on making data inspection more accessible for all users while adding even more power for advanced use cases. The key highlights include AI-powered message filtering, which allows you to query Kafka using plain English; automatic deserialization, which removes the guesswork when dealing with unknown data formats; and significant enhancements to the kJQ language itself, providing more flexible and powerful filtering capabilities.
About Factor House
Factor House is a leader in real-time data tooling, empowering engineers with innovative solutions for Apache Kafka® and Apache Flink®.
Our flagship product, Kpow for Apache Kafka, is the market-leading enterprise solution for Kafka management and monitoring.
Explore our live multi-cluster demo environment or grab a free Community license and dive into streaming tech on your laptop with Factor House Local.

AI-Powered Message Filtering
Kpow now supports the integration of external AI models to enhance its capabilities, most notably through its "bring your own" (BYO) AI model functionality. This allows you to connect Kpow with various AI providers to power features within the platform.
AI Model Configuration
You have the flexibility to configure one or more AI model providers. Within your Kpow user preferences, you can then set a default model for all AI-assisted tasks. Configuration is managed through environment variables and is supported for the following providers:
| Provider | Environment Variable | Description | Default | Example |
|---|---|---|---|---|
| OpenAI | 'OPENAI_API_KEY' | Your OpenAI API key | (required) | 'XXXX' |
| | 'OPENAI_MODEL' | Model ID to use | 'gpt-4o-mini' | 'o3-mini' |
| Anthropic | 'ANTHROPIC_API_KEY' | Your Anthropic API key | (required) | 'XXXX' |
| | 'ANTHROPIC_MODEL' | Model ID to use | 'claude-3-7-sonnet-20250219' | 'claude-opus-4-20250514' |
| Ollama | 'OLLAMA_MODEL' | Model ID to use (must support tools) | - | 'llama3.1:8b' |
| | 'OLLAMA_URL' | URL of the Ollama model server | 'http://localhost:11434' | 'https://prod.ollama.mycorp.io' |
If you need support for a different AI provider, you can contact the Factor House support team.
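For example, assuming OpenAI is the chosen provider, the relevant variables could be added to the same environment configuration used to launch Kpow (the values below are placeholders, not working credentials):
## AI Model Configuration (example: OpenAI)
OPENAI_API_KEY=<your-openai-api-key>
OPENAI_MODEL=gpt-4o-mini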
Enhanced AI Features
The primary AI-driven feature is the kJQ filter generation. This powerful tool enables you to query Kafka topics using natural language. Instead of writing complex kJQ expressions, you can simply describe the data you're looking for in plain English.
Here's how it works:
- Natural Language Processing: The system converts your conversational prompts (e.g., "show me all orders over $100 from the last hour") into precise kJQ filter expressions.
- Schema-Awareness: To improve accuracy, the AI can optionally use the schemas of your Kafka topics to understand field names, data types, and the overall structure of your data.
- Built-in Validation: Every filter generated by the AI is automatically checked against Kpow's kJQ engine to ensure it is syntactically correct before you run it.
This feature is accessible from the Data Inspect view for any topic. After the AI generates a filter, you have the option to execute it immediately, modify it for more specific needs, or save it for later use. For best results, it is recommended to provide specific and actionable descriptions in your natural language queries.
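As a purely hypothetical illustration (the total field name is assumed rather than taken from a real topic), a prompt such as "show me all orders over $100" might be translated into a kJQ filter along the lines of:
.value.total > 100
The generated expression can then be run as-is or refined by hand, exactly like a manually written kJQ filter.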
It is important to be mindful that AI-generated filters are probabilistic and may not always be perfect. Additionally, when using cloud-based AI providers, your data will be processed by them, so for sensitive information, using local models via Ollama or enterprise-grade AI services with strong privacy guarantees is recommended.
For more details, see the AI Models documentation.

Automatic Deserialization
Kpow simplifies data inspection with its "Auto SerDes" feature. In the Data Inspect view, you can select "Auto" as the deserializer, and Kpow will analyze the raw data to determine its format (like JSON, Avro, etc.) and decode it for you. This is especially useful in several scenarios, including:
- When you are exploring unfamiliar topics for the first time.
- While working with topics that may contain mixed or inconsistent data formats.
- When debugging serialization problems across different environments.
- For onboarding new team members who need to get up to speed on topic data quickly.
To make these findings permanent, you can enable the Topic SerDes Observation job by setting INFER_TOPIC_SERDES=true. When active, this job saves the automatically detected deserializer settings and any associated schema IDs, making them visible and persistent in the Kpow UI for future reference.
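For instance, the flag can be added to whatever environment configuration you use to launch Kpow; a minimal sketch:
## Persist automatically inferred SerDes settings and schema IDs
INFER_TOPIC_SERDES=true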

kJQ Language Enhancements
In response to our customers' evolving filtering needs, we've significantly improved the kJQ language to make Kafka record filtering more powerful and flexible. Check out the updated kJQ filters documentation for full details.
Below are some highlights of the improvements:
Chained alternatives
Selects the first non-null email address and checks if it ends with ".com":
.value.primary_email // .value.secondary_email // .value.contact_email | endswith(".com")
String/Array slices
Matches where the first 3 characters of transaction_id equal TXN:
.value.transaction_id[0:3] == "TXN"
For example, { "transaction_id": "TXN12345" } matches, while { "transaction_id": "ORD12345" } does not.
UUID type support
kJQ supports UUID types out of the box, whether they come from the UUID deserializer, AVRO logical types, or Transit/JSON and EDN deserializers that have richer data types.
To compare against literal UUID strings, prefix them with #uuid to coerce into a UUID:
.key == #uuid "fc1ba6a8-6d77-46a0-b9cf-277b6d355fa6"
Conclusion
The 94.3 release marks a significant leap forward for data exploration in Kpow. By integrating AI for natural language queries, automating the complexities of deserialization, and enriching the kJQ language with advanced functions, Kpow now caters to an even broader range of users. These updates streamline workflows for everyone, from new team members who can now inspect topics without prior knowledge of data formats, to seasoned engineers who can craft more sophisticated and precise queries than ever before. This release reaffirms our commitment to simplifying the complexities of Apache Kafka and empowering teams to unlock the full potential of their data with ease and efficiency.

Release 94.4: Auto SerDes improvements
This minor hotfix release from Factor House resolves a bug when using Auto SerDes without Data policies, and adds support for UTF-8 String Auto SerDes inference.
Auto SerDes improvements
94.4 is a small hotfix release following up from last week's 94.3 release.
Kpow's Auto SerDes feature works alongside our data policies feature. Data policies allow you to configure declarative redaction policies against your data. When data policies are configured, any SerDes marked as non-redactable (e.g., String) will be excluded from the list of deserializers Kpow will try to use to infer the topic's data.
94.3 had a bug with this implementation where Auto SerDes detection was failing unless you had configured Kpow with a data policies file. 94.4 fixes this bug.
We have also improved Auto SerDes inference based on customer feedback: Kpow now attempts String inference at the lowest priority and verifies that the inferred data is a valid UTF-8 encoded string.
Events & Webinars
Stay plugged in with the Factor House team and our community.

[MELBOURNE, AUS] Apache Kafka and Apache Flink Meetup, 27 November
Melbourne, we’re making it a double feature. Workshop by day, meetup by night - same location, each with valuable content for data and software engineers, or those working with Data Streaming technologies. Build the backbone your apps deserve, then roll straight into the evening meetup.

[SYDNEY, AUS] Apache Kafka and Apache Flink Meetup, 26 November
Sydney, we’re making it a double feature. Workshop by day, meetup by night - same location, each with valuable content for data and software engineers, or those working with Data Streaming technologies. Build the backbone your apps deserve, then roll straight into the evening meetup.
Join the Factor Community
We’re building more than products, we’re building a community. Whether you're getting started or pushing the limits of what's possible with Kafka and Flink, we invite you to connect, share, and learn with others.