[Serverless Knative] Knative Docs - Supported Autoscaler types

Posted on 2021-12-18. Edited on 2025-06-10. In Cloud Native, Cloud Computing, Serverless, Knative.

Supported Autoscaler types

Knative Serving supports two Autoscaler implementations: the Knative Pod Autoscaler (KPA) and Kubernetes’ Horizontal Pod Autoscaler (HPA). This topic lists the features and limitations of each Autoscaler and explains how to configure them.

Important

If you want to use Kubernetes Horizontal Pod Autoscaler (HPA), you must install it after you install Knative Serving.

For installation instructions, see Install optional Serving extensions: https://knative.dev/docs/install/serving/install-serving-with-yaml/#install-optional-serving-extensions.

Knative Pod Autoscaler (KPA)

Part of the Knative Serving core and enabled by default once Knative Serving is installed.
Supports scale to zero functionality.
Does not support CPU-based autoscaling.

Horizontal Pod Autoscaler (HPA)

Not part of the Knative Serving core; it must be installed separately, after Knative Serving is installed.
Does not support scale to zero functionality.
Supports CPU-based autoscaling.

Configuring the Autoscaler implementation

The Autoscaler implementation (KPA or HPA) is selected with the class setting, either globally through a ConfigMap key or per revision through an annotation.

Global settings key: pod-autoscaler-class
Per-revision annotation key: autoscaling.knative.dev/class
Possible values: "kpa.autoscaling.knative.dev" or "hpa.autoscaling.knative.dev"
Default: "kpa.autoscaling.knative.dev"

Example:

Per Revision

# Service.helloworld-go.yaml

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go
  namespace: default
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/class: "kpa.autoscaling.knative.dev"
    spec:
      containers:
        - image: gcr.io/knative-samples/helloworld-go

Once you’ve created your YAML file (named something like “Service.helloworld-go.yaml”), apply it:

$ kubectl apply -f Service.helloworld-go.yaml
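
Because only HPA supports CPU-based autoscaling, a revision that should scale on CPU has to opt in to the HPA class. The following is a minimal sketch rather than part of the original page. It assumes the HPA extension is already installed, and it relies on the autoscaling.knative.dev/metric and autoscaling.knative.dev/target annotations, which this article does not otherwise cover. The target of "80" and the CPU request of 100m are arbitrary illustrative values.

```yaml
# Service.helloworld-go-hpa.yaml (hypothetical file name)

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go
  namespace: default
spec:
  template:
    metadata:
      annotations:
        # Select the HPA implementation for this revision.
        autoscaling.knative.dev/class: "hpa.autoscaling.knative.dev"
        # Assumption: scale on CPU; the target is treated as a utilization percentage.
        autoscaling.knative.dev/metric: "cpu"
        autoscaling.knative.dev/target: "80"
    spec:
      containers:
        - image: gcr.io/knative-samples/helloworld-go
          resources:
            requests:
              # HPA measures CPU utilization relative to this request.
              cpu: 100m
```

Keep in mind that a revision using the HPA class cannot scale to zero.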

Global (ConfigMap)

# ConfigMap.config-autoscaler.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving
data:
  pod-autoscaler-class: "kpa.autoscaling.knative.dev"

Once you’ve created your YAML file (named something like “ConfigMap.config-autoscaler.yaml”), apply it:

$ kubectl apply -f ConfigMap.config-autoscaler.yaml

Global (Operator)

# KnativeServing.yaml

apiVersion: operator.knative.dev/v1alpha1
kind: KnativeServing
metadata:
  name: knative-serving
spec:
  config:
    autoscaler:
      pod-autoscaler-class: "kpa.autoscaling.knative.dev"

Once you’ve created your YAML file (named something like “KnativeServing.yaml”), apply it:

$ kubectl apply -f KnativeServing.yaml

Global versus per-revision settings

Autoscaling in Knative can be configured using either global or per-revision settings.

If no per-revision autoscaling settings are specified, the global settings are used.
If per-revision settings are specified, they override the corresponding global settings.
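
The precedence rule can be illustrated with a pair of manifests. This is a sketch under one assumption not stated on this page: that the per-revision autoscaling.knative.dev/target annotation is the counterpart of the global container-concurrency-target-default key when the default concurrency metric is in use.

```yaml
# Global default: used by every revision that does not set its own target.
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving
data:
  container-concurrency-target-default: "100"
---
# Per-revision override: revisions created from this template use 70
# instead of the global default of 100 (assumed key pairing, see above).
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go
  namespace: default
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/target: "70"
    spec:
      containers:
        - image: gcr.io/knative-samples/helloworld-go
```

Other Services in the cluster that carry no such annotation would continue to use the global value.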

Global settings

Global settings for autoscaling are configured using the config-autoscaler ConfigMap. If you installed Knative Serving using the Operator, you can set global configuration settings in the spec.config.autoscaler ConfigMap, located in the KnativeServing custom resource (CR).

Example of the default autoscaling ConfigMap

# ConfigMap.config-autoscaler.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving
data:
  container-concurrency-target-default: "100"
  container-concurrency-target-percentage: "0.7"
  enable-scale-to-zero: "true"
  max-scale-up-rate: "1000"
  max-scale-down-rate: "2"
  panic-window-percentage: "10"
  panic-threshold-percentage: "200"
  scale-to-zero-grace-period: "30s"
  scale-to-zero-pod-retention-period: "0s"
  stable-window: "60s"
  target-burst-capacity: "200"
  requests-per-second-target-default: "200"

Once you’ve created your YAML file (named something like “ConfigMap.config-autoscaler.yaml”), apply it:

$ kubectl apply -f ConfigMap.config-autoscaler.yaml
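
For Operator-based installations, the same keys would go under spec.config.autoscaler in the KnativeServing CR instead of being applied to the ConfigMap directly. The sketch below carries over three values from the default ConfigMap above. The apiVersion is copied from the Operator example earlier in this article and is an assumption here: newer Operator releases may expect a different version.

```yaml
# KnativeServing.yaml (illustrative)

# apiVersion copied from the earlier Operator example; check what your Operator release serves.
apiVersion: operator.knative.dev/v1alpha1
kind: KnativeServing
metadata:
  name: knative-serving
spec:
  config:
    autoscaler:
      # Same keys and values as in the config-autoscaler ConfigMap.
      enable-scale-to-zero: "true"
      scale-to-zero-grace-period: "30s"
      stable-window: "60s"
```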

Per-revision settings

Per-revision settings for autoscaling are configured by adding annotations to a revision.

Example:

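# Service.helloworld-go.yaml

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go
  namespace: default
spec:
  template:
    metadata:
      annotations:
        # Per-revision autoscaling target for revisions created from this template
        autoscaling.knative.dev/target: "70"
    spec:
      containers:
        - image: gcr.io/knative-samples/helloworld-go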

Once you’ve created your YAML file (named something like “Service.helloworld-go.yaml”), apply it:

$ kubectl apply -f Service.helloworld-go.yaml

Important

If you are creating revisions by using a service or configuration, you must set the annotations in the revision template so that they are applied to each revision as it is created. Setting annotations in the top-level metadata of a single revision does not propagate the changes to other revisions and does not change the autoscaling configuration for your application.
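
The same placement applies when revisions are created from a Configuration rather than a Service. The manifest below is a hedged sketch: the resource name and target value are reused from the examples above purely for illustration, and the file name is hypothetical.

```yaml
# Configuration.helloworld-go.yaml (hypothetical file name)

apiVersion: serving.knative.dev/v1
kind: Configuration
metadata:
  name: helloworld-go
  namespace: default
spec:
  template:
    metadata:
      annotations:
        # Set autoscaling annotations here, in the revision template,
        # so that each new revision receives them.
        autoscaling.knative.dev/target: "70"
    spec:
      containers:
        - image: gcr.io/knative-samples/helloworld-go
```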