Overview
Apache Kafka is a cornerstone for many real-time data pipelines, but managing its infrastructure can be complex.
Google Cloud Managed Service for Apache Kafka offers a fully managed solution that simplifies deployment and operations. However, effective monitoring and management remain crucial for ensuring the health and performance of Kafka clusters.
This article provides a practical, step-by-step guide to setting up a Google Cloud Managed Service for Apache Kafka cluster and connecting to it from Kpow using the OAUTHBEARER mechanism. We will walk through creating the necessary GCP resources, configuring a client virtual machine, and deploying a Kpow instance with Docker to demonstrate monitoring and managing Kafka brokers and topics.
About Factor House
Factor House is a leader in real-time data tooling, empowering engineers with innovative solutions for Apache Kafka® and Apache Flink®.
Our flagship product, Kpow for Apache Kafka, is the market-leading enterprise solution for Kafka management and monitoring.
Explore our live multi-cluster demo environment or grab a free Community license and dive into streaming tech on your laptop with Factor House Local.
Create a Managed Kafka Cluster
We create the GCP resources using the gcloud CLI. Once the CLI is initialised, enable the Managed Kafka, Compute Engine, and Cloud DNS APIs as prerequisites.
gcloud services enable managedkafka.googleapis.com compute.googleapis.com dns.googleapis.com
To create a Managed Service for Apache Kafka cluster, we use the gcloud managed-kafka clusters create command, specifying the cluster ID, location, number of vCPUs (cpu), RAM (memory), and subnets.
export CLUSTER_ID=<cluster-id>
export PROJECT_ID=<gcp-project-id>
export PROJECT_NUMBER=<gcp-project-number>
export REGION=<gcp-region>

gcloud managed-kafka clusters create $CLUSTER_ID \
  --location=$REGION \
  --cpu=3 \
  --memory=3GiB \
  --subnets=projects/$PROJECT_ID/regions/$REGION/subnetworks/default \
  --async
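Because the command runs with --async, it returns immediately while the cluster continues provisioning in the background, which can take a while. We can poll the cluster state with the describe command below; the state field should report ACTIVE once provisioning completes.

gcloud managed-kafka clusters describe $CLUSTER_ID \
  --location=$REGION \
  --format="value(state)"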
Set up a client VM
To connect to the Kafka cluster, Kpow must run on a machine with network access to it. In this setup, we use a Google Cloud Compute Engine virtual machine (VM). The VM must be located in the same region as the Kafka cluster and deployed within the same VPC and subnet specified during the cluster's configuration. We can create the client VM using the command shown below. We also attach the http-server tag to the VM so that matching firewall rules allow HTTP traffic, enabling browser access to the Kpow instance.
gcloud compute instances create kafka-test-instance \
  --scopes=https://www.googleapis.com/auth/cloud-platform \
  --tags=http-server \
  --subnet=projects/$PROJECT_ID/regions/$REGION/subnetworks/default \
  --zone=$REGION-a
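Note that the http-server tag only takes effect if a firewall rule targets it. The Google Cloud console creates such a rule (default-allow-http) when "Allow HTTP traffic" is ticked; if our project does not already have one, a rule along the following lines can be created - a minimal sketch, assuming the default network and an unrestricted source range, which should be tightened for anything beyond a trial:

gcloud compute firewall-rules create default-allow-http \
  --network=default \
  --allow=tcp:80 \
  --target-tags=http-server \
  --source-ranges=0.0.0.0/0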
We also need to update the permissions of the default service account used by the client VM. To ensure that the Kpow instance running on the VM has full access to Managed Service for Apache Kafka resources, bind the predefined admin role (roles/managedkafka.admin) to the service account. This grants Kpow the necessary administrative privileges. For more fine-grained access control within a Kafka cluster, it is recommended to use Kafka ACLs; the Enterprise Edition of Kpow provides robust support for them - see Kpow's ACL management documentation for more details.
gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member="serviceAccount:$PROJECT_NUMBER-compute@developer.gserviceaccount.com" \
  --role=roles/managedkafka.admin
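To verify that the binding took effect, we can list the members holding the role - for example:

gcloud projects get-iam-policy $PROJECT_ID \
  --flatten="bindings[].members" \
  --filter="bindings.role=roles/managedkafka.admin" \
  --format="value(bindings.members)"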
Launch a Kpow Instance
Once our client VM is up and running, we'll connect to it using the SSH-in-browser tool provided by Google Cloud. After establishing the connection, install Docker Engine, as Kpow will be launched using Docker. Refer to the official installation and post-installation guides for detailed instructions.
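For reference, Docker's convenience script offers one quick path on most Linux distributions; the official guides cover distribution-specific alternatives.

curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
# Post-installation: allow running docker without sudo (log out and back in to apply)
sudo usermod -aG docker $USER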
With Docker ready, we'll then create Kpow's configuration file (e.g., gcp-trial.env). This file defines Kpow's connection settings for the managed Kafka cluster and includes our Kpow license details. To get started, confirm that a valid Kpow license is in place, whether we're using the Community or Enterprise edition.
The main section has the following config variables. ENVIRONMENT_NAME is a display label used within Kpow to identify the Kafka environment, while the BOOTSTRAP value specifies the Kafka bootstrap server address, which Kpow uses to establish a connection. Connection security is managed through SASL over SSL, as indicated by the SECURITY_PROTOCOL value. SASL_MECHANISM is set to OAUTHBEARER, enabling OAuth-based authentication. To facilitate this, SASL_LOGIN_CALLBACK_HANDLER_CLASS is configured to use Google's GcpLoginCallbackHandler, which handles OAuth token management for Kafka authentication. Lastly, SASL_JAAS_CONFIG specifies the JAAS login module used for OAuth-based authentication.

As mentioned, this configuration file also houses our Kpow license details, which are essential to activate and run Kpow.
## Managed Service for Apache Kafka Cluster Configuration
ENVIRONMENT_NAME=GCP Kafka Cluster
BOOTSTRAP=bootstrap.<cluster-id>.<gcp-region>.managedkafka.<gcp-project-id>.cloud.goog:9092
SECURITY_PROTOCOL=SASL_SSL
SASL_MECHANISM=OAUTHBEARER
SASL_LOGIN_CALLBACK_HANDLER_CLASS=com.google.cloud.hosted.kafka.auth.GcpLoginCallbackHandler
SASL_JAAS_CONFIG=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required;

## Your License Details
LICENSE_ID=<license-id>
LICENSE_CODE=<license-code>
LICENSEE=<licensee>
LICENSE_EXPIRY=<license-expiry>
LICENSE_SIGNATURE=<license-signature>
Once the gcp-trial.env file is prepared, we'll launch the Kpow instance using the docker run command below. This command maps port 3000 (Kpow's UI port) to port 80 on the host. As a result, we can access the Kpow UI in the browser simply at http://<vm-external-ip>, with no port number needed.
docker run --pull=always -p 80:3000 --name kpow \
  --env-file gcp-trial.env -d factorhouse/kpow-ce:latest
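Before opening the browser, it can help to confirm the container started cleanly and to look up the VM's external IP - for example, with docker logs on the VM and gcloud from our local machine:

# On the VM: follow Kpow's startup logs
docker logs -f kpow

# Locally: print the VM's external IP
gcloud compute instances describe kafka-test-instance \
  --zone=$REGION-a \
  --format="get(networkInterfaces[0].accessConfigs[0].natIP)"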
Monitor and Manage Resources
With Kpow now running, we can use its user-friendly UI to monitor brokers, create a topic, send a message to it, and then watch that message get consumed.
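If we also want to verify connectivity outside Kpow, the standard Kafka CLI tools can reuse the same OAUTHBEARER settings. Below is a minimal sketch: the client.properties file mirrors the Kpow configuration, the topic name kpow-test is arbitrary, and it assumes the Kafka CLI tools plus Google's managed Kafka auth login handler jar (which provides GcpLoginCallbackHandler) are available on the VM.

# client.properties - mirrors the Kpow connection settings
security.protocol=SASL_SSL
sasl.mechanism=OAUTHBEARER
sasl.login.callback.handler.class=com.google.cloud.hosted.kafka.auth.GcpLoginCallbackHandler
sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required;

# Use the same bootstrap address as in gcp-trial.env
export BOOTSTRAP=bootstrap.<cluster-id>.<gcp-region>.managedkafka.<gcp-project-id>.cloud.goog:9092

# Create a topic, produce a message, then consume it
kafka-topics.sh --bootstrap-server $BOOTSTRAP --command-config client.properties \
  --create --topic kpow-test --partitions 3
kafka-console-producer.sh --bootstrap-server $BOOTSTRAP --producer.config client.properties \
  --topic kpow-test
kafka-console-consumer.sh --bootstrap-server $BOOTSTRAP --consumer.config client.properties \
  --topic kpow-test --from-beginning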
Conclusion
By following the steps outlined in this post, we have successfully established a Google Cloud Managed Service for Apache Kafka cluster and deployed a Kpow instance on a Compute Engine VM. With this setup, we can immediately start exploring and managing Kafka brokers and topics, giving us valuable insights into our Kafka environment and streamlining operations.
Kpow is packed with powerful features, and it also integrates seamlessly with Kafka connectors deployed on Google Cloud Managed Kafka Connect clusters. This opens up a world of possibilities for managing data pipelines with ease. Stay tuned as we continue to roll out more integration examples in the future, enabling us all to unlock even more value from our Kafka and Kpow setups.