Friday, February 7, 2025

How Does GKE Horizontal Pod Autoscaler Evaluate Metrics

GKE delivers a breakthrough in Horizontal Pod Autoscaler performance.

GKE Horizontal Pod Autoscaler

At Google Cloud, we are dedicated to making Google Kubernetes Engine (GKE) the most dependable and fastest Kubernetes platform. Google Cloud is thrilled to present an enhanced version of Horizontal Pod Autoscaler (HPA), the Kubernetes feature that automatically adjusts workload resources to meet demand. Scaling performance has been significantly improved by re-architecting the HPA stack. By enabling the new Performance HPA profile in your environment, you get the following benefits:

  • 2x quicker scaling: Workloads are now able to scale up more than twice as quickly, which improves application performance and response times.
  • Better metrics resolution: A new fast metrics channel with improved resolution enables more granular scaling and faster responses.
  • Reliable performance at high scale: HPA now supports high-scale deployments with dependable performance, so you can confidently run large-scale applications with linear scaling to up to 1,000 HPA objects.

Horizontal Pod autoscaler

By automatically scaling the number of Pods in response to the workload’s CPU or memory usage, or to custom metrics reported from Kubernetes or external metrics from sources outside your cluster, the Horizontal Pod Autoscaler modifies the shape of your Kubernetes workload.
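As a concrete illustration, here is a minimal `autoscaling/v2` HorizontalPodAutoscaler manifest that targets 60% average CPU utilization for a Deployment. The Deployment name `web` and all thresholds are illustrative placeholders, not values from this article:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:          # the workload whose shape the HPA adjusts
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization        # percentage of the Pods' CPU requests
          averageUtilization: 60
```

Applied with `kubectl apply -f`, this keeps the Deployment between 2 and 10 replicas, scaling so that average CPU utilization stays near 60% of each Pod's request.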

When the number of Pods in a GKE cluster changes, the cluster’s node count is adjusted automatically by node auto-provisioning. For this reason, horizontal Pod autoscaling is recommended for all clusters.

Why use horizontal Pod autoscaling

The resource requirements of a Kubernetes cluster may be unknown when you first deploy your workload there, and they may vary based on external dependencies, usage patterns, and other variables. By only purchasing additional capacity when required, horizontal pod autoscaling allows you to keep expenses under control and guarantees that your workload operates consistently under various conditions.

It is not always easy to predict the signs that your workload is underutilized or under-resourced. The Horizontal Pod Autoscaler can automatically scale the number of Pods in your workload based on one or more of the following metrics:

  • Actual resource utilization: when the CPU or memory usage of a given Pod exceeds a set threshold. This can be expressed either as a raw value or as a percentage of the amount the Pod requests for that resource.
  • Custom metrics: based on any measure, like the number of client requests or I/O writes per second, that a Kubernetes object in a cluster reports.

This can be helpful if your application is bottlenecked by the network rather than by CPU or memory.

  • External metrics: derived from a measure provided by a service or application outside of your cluster.

For instance, a workload that consumes many requests from a pipeline such as Pub/Sub may need additional CPU. If you specify an external metric for the queue size, the Horizontal Pod Autoscaler can automatically add Pods when the queue length crosses a threshold and remove them when it falls again.
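The Pub/Sub scenario above can be sketched as an HPA on an external metric. This assumes the Custom Metrics Stackdriver Adapter is installed in the cluster to expose Cloud Monitoring metrics; the Deployment name, subscription ID, and target value are illustrative:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: pubsub-worker-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: pubsub-worker            # illustrative worker Deployment
  minReplicas: 1
  maxReplicas: 20
  metrics:
    - type: External
      external:
        metric:
          # Cloud Monitoring metric for the queue backlog
          name: pubsub.googleapis.com|subscription|num_undelivered_messages
          selector:
            matchLabels:
              resource.labels.subscription_id: my-subscription  # illustrative
        target:
          type: AverageValue
          averageValue: "100"      # aim for ~100 undelivered messages per Pod
```

With this configuration, the HPA adds workers as the subscription backlog grows and removes them as it drains.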

With some restrictions, it is possible to combine a vertical pod autoscaler with a horizontal pod autoscaler.

How horizontal Pod autoscaling works

Each configured GKE Horizontal Pod Autoscaler runs as a control loop, and each workload has its own Horizontal Pod Autoscaler. On a regular interval, each Horizontal Pod Autoscaler compares the workload’s metrics against the target thresholds you configure and automatically adjusts the workload’s shape.

Per-Pod resources

For resources that are allocated per Pod, such as CPU, the controller queries the resource metrics API for every container running in the Pod.

  • If you specify a raw value for CPU or memory, that value is used.
  • If you specify a percentage for CPU or memory, the GKE Horizontal Pod Autoscaler calculates the average utilization value as a percentage of that Pod’s CPU or memory requests.
  • Custom and external metrics are expressed as either raw or average values.
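The raw-versus-percentage distinction maps to the two target types in the `autoscaling/v2` API. The fragment below (values illustrative) shows both inside an HPA’s `spec.metrics`:

```yaml
metrics:
  # Percentage target: scale on average utilization relative to Pod requests.
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  # Raw target: scale on an absolute average memory value per Pod.
  - type: Resource
    resource:
      name: memory
      target:
        type: AverageValue
        averageValue: 500Mi
```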

The controller uses the average or raw value of a reported metric to compute a ratio, which it then uses to autoscale the workload. The Kubernetes project documentation describes the Horizontal Pod Autoscaler algorithm.
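The core of that algorithm from the Kubernetes documentation can be sketched in a few lines of Python (the function and variable names are illustrative, not from the HPA source):

```python
import math

def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float) -> int:
    """Core HPA formula from the Kubernetes docs:
    desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric).
    """
    ratio = current_metric / target_metric
    return math.ceil(current_replicas * ratio)

# Example: 4 Pods averaging 90% CPU against a 60% target scale to 6 Pods.
print(desired_replicas(4, 90, 60))  # 6
```

When the ratio is close to 1 (within a configurable tolerance), the real controller skips scaling entirely, which is part of how it avoids churn.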

Responding to multiple metrics

If you configure a workload to autoscale based on several metrics, the Horizontal Pod Autoscaler evaluates each metric independently and applies the scaling algorithm to determine a new workload scale for each one. The largest scale is selected for the autoscale action.

If one or more of the metrics is unavailable, the Horizontal Pod Autoscaler still scales up according to the largest size calculated, but it does not scale down.
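A multi-metric HPA simply lists several entries under `spec.metrics`; the controller computes a replica count for each and takes the maximum. In this illustrative fragment (the custom metric name and targets are placeholders), the workload scales on whichever of CPU or request rate demands more replicas:

```yaml
metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second   # illustrative custom metric
      target:
        type: AverageValue
        averageValue: "100"
```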

Preventing thrashing

Thrashing occurs when the Horizontal Pod Autoscaler attempts further autoscaling operations before the workload has finished responding to earlier ones. To prevent thrashing, the GKE Horizontal Pod Autoscaler selects the largest recommendation from the last five minutes.
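That five-minute window corresponds to the default scale-down stabilization window in the `autoscaling/v2` API, which you can tune via `spec.behavior`. A sketch (policy values are illustrative):

```yaml
behavior:
  scaleDown:
    stabilizationWindowSeconds: 300   # use the largest recommendation from the last 5 minutes (default)
    policies:
      - type: Percent
        value: 50          # remove at most 50% of current replicas...
        periodSeconds: 60  # ...per 60-second period
```

A longer window makes scale-down more conservative; a shorter one makes the workload shrink faster at the risk of oscillation.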

Restrictions

  • Avoid using the Horizontal Pod Autoscaler on CPU or memory in combination with the Vertical Pod Autoscaler. You can combine the two autoscalers for other metrics.
  • Don’t configure horizontal Pod autoscaling on the ReplicaSet or ReplicationController backing your Deployment. When you perform a rolling update, the Deployment replaces the underlying ReplicaSet, so any autoscaler attached to it is lost. Configure horizontal Pod autoscaling on the Deployment itself instead.
  • Horizontal Pod autoscaling is not available for workloads that cannot be scaled, such as DaemonSets.
  • Horizontal Pod autoscaling does not let you scale down to zero Pods and then scale back up using custom or external metrics.
  • Because horizontal Pod autoscaling exposes metrics as Kubernetes resources, metric names must not contain capital letters or ‘/’ characters. Your metric adapter may be able to rename metrics; the prometheus-adapter, for example, supports this.
  • If any of the metrics that the GKE Horizontal Pod Autoscaler is configured to track is unavailable, it won’t scale down.
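The renaming mentioned above is done in the adapter’s own configuration, not in the HPA. As a hedged sketch of a prometheus-adapter rule (the series name and query are illustrative), this strips a `_total` suffix so the exposed metric name satisfies the naming restriction:

```yaml
rules:
  - seriesQuery: 'http_requests_total'
    resources:
      overrides:
        namespace: {resource: "namespace"}
        pod: {resource: "pod"}
    name:
      matches: "^(.*)_total$"
      as: "${1}_per_second"     # exposed to the HPA as http_requests_per_second
    metricsQuery: 'rate(<<.Series>>{<<.LabelMatchers>>}[2m])'
```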
Thota Nithya
Thota Nithya has been writing cloud computing articles for govindhtech since April 2023. She is a science graduate and a cloud computing enthusiast.