CUDs, compression, and more ways to cut costs with Managed Service for Apache Kafka
As your company grows and your streaming demands increase, running and managing Apache Kafka clusters can become expensive. Committed use discounts (CUDs) for Managed Service for Apache Kafka, which let you save up to 40% on compute costs, are now generally available.
Managed Service for Apache Kafka is a fully managed, secure, and scalable Google Cloud service for open-source Apache Kafka. It reduces the cost of manual broker administration by handling cluster provisioning, with automatic broker sizing and rebalancing. It also integrates with Identity and Access Management (IAM), Cloud Monitoring, and Cloud Logging, and it makes all deployments highly available by default.
Here are some tips for maximizing the cost-effectiveness of Managed Service for Apache Kafka using committed use discounts and other cost-saving features.
Use committed use discounts to save on compute
Committed use discounts (CUDs) for Managed Service for Apache Kafka let you commit to using these resources in advance and save up to 40% on vCPU and RAM. These spend-based CUDs apply to steady-state compute usage across all Kafka clusters and projects under a single billing account.
You save 20% off on-demand pricing with a one-year commitment and 40% with a three-year commitment. For example, a workload that costs $1,000 per month of compute at on-demand rates would cost roughly $800 per month under a one-year commitment and roughly $600 under a three-year commitment. Google Cloud recommends purchasing Managed Service for Apache Kafka CUDs when you can commit to a minimum compute spend for at least one year.
Optimize inter-zone transfer costs
Data transfers between Kafka brokers and clients in different zones can incur fees. Private Service Connect (PSC) charges for data transferred across zones, but not for data moving between Kafka brokers and clients within the same zone.
By default, consumers fetch data directly from the leader replica, so reading data across zones incurs inter-zone data transfer costs. Starting with Apache Kafka version 2.4 (KIP-392), you can configure Kafka consumers to read messages from the replica in the same zone as the consumer. This feature, in which a rack denotes a cloud zone, is known as “rack awareness.” It reduces both latency and data transfer costs.
To enable consumer rack awareness, configure each consumer so that its client rack ID matches the cloud zone in which it is deployed. Each broker’s rack ID is set to its cloud zone; to find where the Kafka brokers are deployed, look for “broker.rack=” in the cluster logs. If several replicas share the same rack ID, the selector picks the most caught-up one. If the client rack ID is null or there is no replica in the same zone as the consumer, the replica selector falls back to the leader. You must have enough replicas configured on your topic and deploy at least one consumer in each zone that has a broker.
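As a minimal sketch of the consumer side, the Java snippet below sets client.rack to the zone the consumer runs in. The bootstrap address, group ID, topic name, and zone are placeholders; substitute the values for your own cluster.

import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class RackAwareConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "bootstrap.example.com:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-consumer-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        // KIP-392: match client.rack to this consumer's cloud zone so fetches
        // are served from a same-zone replica when one is available.
        props.put(ConsumerConfig.CLIENT_RACK_CONFIG, "us-central1-a");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("my-topic"));
            // Poll loop elided; fetches now prefer the same-zone replica.
        }
    }
}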
What are Kafka brokers?
Brokers are central to Apache Kafka, which accelerates data processing and exchange. Kafka brokers store data messages, and they manage and deliver those messages to the other system components that need them. A Kafka broker acts as an intermediary between producers and consumers, facilitating the flow of information: it handles all requests to write new data and to read existing data. A group of one or more cooperating Kafka brokers is known as a Kafka cluster, and every broker in the cluster has a unique numeric ID.
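To make the producer-broker-consumer flow concrete, here is a minimal producer sketch in Java; the bootstrap address and topic name are placeholders.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

public class MinimalProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "bootstrap.example.com:9092"); // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The broker accepts this write, appends it to the topic's log,
            // and later serves it to any consumer that reads the topic.
            producer.send(new ProducerRecord<>("my-topic", "key", "value"));
        }
    }
}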
Turn on compression
Compression reduces the size of your messages, which lowers network traffic and latency and uses less storage, all of which can reduce your networking and storage costs. Compression can be enabled at the broker and producer levels. Although topic-level compression is also possible, enabling producer- and broker-level compression is recommended so that it applies to all topics.
Compression is more effective with larger message batch sizes. The available compression types are gzip, LZ4, and Snappy (the zstd compression type is not yet available). If you enable compression for producers, compression is used by default for producer-to-broker communication, broker-to-broker storage, and broker-to-consumer communication.
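As a sketch, the producer setting below enables LZ4 compression; gzip and snappy are the other accepted values of compression.type mentioned above.

import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

public class CompressionSettings {
    // Returns producer properties with compression enabled; merge these
    // with your serializer and bootstrap settings.
    public static Properties compressedProducerProps() {
        Properties props = new Properties();
        // Compress record batches on the producer with LZ4.
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
        return props;
    }
}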
Fine-tune batch size
Experimenting with different batch sizes can improve performance while saving CPU and network resources. Google Cloud’s own benchmarks show that producer throughput can vary by a factor of three depending on batch size, so with a little tuning you can find the ideal batch size for a particular cluster setup. Finding the optimal batch size means trading it off against latency: Google Cloud recommends first establishing the maximum acceptable wait time with the linger.ms producer option, then experimenting with various batch sizes to find the best one for your cluster configuration.
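Here is a sketch of the two producer settings involved; the values shown are illustrative starting points, not recommendations from the benchmarks above.

import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

public class BatchTuning {
    // Returns producer properties for batch tuning; merge these with your
    // serializer and bootstrap settings.
    public static Properties batchTuningProps() {
        Properties props = new Properties();
        // Fix the latency budget first: the longest the producer may wait
        // for a batch to fill before sending it (milliseconds).
        props.put(ProducerConfig.LINGER_MS_CONFIG, 50);
        // Then benchmark several batch sizes (bytes) against that budget.
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 131072); // 128 KB
        return props;
    }
}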
Reduce retention and storage in test environments
If you don’t need to keep your Kafka messages for a long time, you can reduce the amount of data retained, and the associated storage costs, by limiting the retention period. For example, if you do not require a message backlog, you can use log.retention.ms and log.retention.bytes to reduce the retention period and size.
Messages then expire once the retention period elapses or the partition reaches its maximum size. In test environments you can further reduce storage by creating topics with a replication factor of two and setting the minimum number of in-sync replicas to one. However, this comes at the cost of availability and reliability, so it should only be done in test environments and is not advised for production workloads.
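Below is a sketch of a test-only topic created with Kafka’s Admin API, using retention.ms and retention.bytes, the topic-level equivalents of the broker settings named above; the bootstrap address, topic name, and retention values are placeholders.

import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class TestTopicSetup {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "bootstrap.example.com:9092"); // placeholder

        try (Admin admin = Admin.create(props)) {
            // Test-only settings: replication factor 2 and min.insync.replicas 1
            // save storage but reduce availability and reliability.
            NewTopic topic = new NewTopic("test-topic", 3, (short) 2)
                    .configs(Map.of(
                            "retention.ms", "3600000",      // expire messages after 1 hour
                            "retention.bytes", "104857600", // cap each partition at 100 MB
                            "min.insync.replicas", "1"));
            admin.createTopics(List.of(topic)).all().get();
        }
    }
}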
Avoid message conversions
Kafka brokers and clients that run different Kafka versions may not agree on the underlying binary message format. This mismatch forces message conversion, which adds processing overhead and raises the cluster’s CPU and memory usage. To avoid message conversions and reduce costs, ensure that all Kafka clients (consumers and producers) run the same Kafka version as the Kafka brokers. The broker metrics ProduceMessageConversionsPerSec and FetchMessageConversionsPerSec can help you confirm whether conversions are occurring.