The metric configuration defines which metric type is watched by the Autoscaler.
Setting metrics per revision
For per-revision - https://knative.dev/docs/serving/autoscaling/autoscaler-types/#global-versus-per-revision-settings configuration, the metric type is set using the
autoscaling.knative.dev/metric annotation. The metric types that can be configured per revision depend on the Autoscaler implementation you are using:
The default KPA Autoscaler supports the concurrency and rps metrics.
The HPA Autoscaler supports the cpu and memory metrics, as well as custom metrics.
For more information about KPA and HPA, see the documentation on Supported Autoscaler types - https://knative.dev/docs/serving/autoscaling/autoscaler-types/.
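Per-revision annotation key: autoscaling.knative.dev/metric
Possible values: "concurrency", "rps", "cpu", "memory", or any custom metric name, depending on your Autoscaler type. The
"cpu", "memory", and "custom" metrics are only supported on revisions that use the HPA class.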
Per-revision concurrency configuration
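A minimal sketch of a Service manifest that selects the concurrency metric for a revision. The Service name helloworld-go, the container image, and the target value are illustrative assumptions, not values taken from this page.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go          # hypothetical Service name
  namespace: default
spec:
  template:
    metadata:
      annotations:
        # The KPA class is the default, so no class annotation is set here.
        autoscaling.knative.dev/metric: "concurrency"
        # Illustrative target only; choose a value that fits your workload.
        autoscaling.knative.dev/target: "10"
    spec:
      containers:
        - image: ghcr.io/knative/helloworld-go:latest   # assumed sample image
```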
Per-revision rps configuration
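A minimal sketch of a Service manifest that selects the rps (requests per second) metric for a revision. The Service name, container image, and target value are illustrative assumptions.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go          # hypothetical Service name
  namespace: default
spec:
  template:
    metadata:
      annotations:
        # rps is supported by the default KPA class.
        autoscaling.knative.dev/metric: "rps"
        # Illustrative target only; choose a value that fits your workload.
        autoscaling.knative.dev/target: "150"
    spec:
      containers:
        - image: ghcr.io/knative/helloworld-go:latest   # assumed sample image
```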
Per-revision cpu configuration
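A minimal sketch of a Service manifest that selects the cpu metric. Because cpu is only supported on revisions that use the HPA class, the class annotation is set explicitly. The Service name, container image, and target value are illustrative assumptions; check the Knative target documentation for the unit that the target uses.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go          # hypothetical Service name
  namespace: default
spec:
  template:
    metadata:
      annotations:
        # The cpu metric requires the HPA Autoscaler class.
        autoscaling.knative.dev/class: "hpa.autoscaling.knative.dev"
        autoscaling.knative.dev/metric: "cpu"
        # Illustrative target only.
        autoscaling.knative.dev/target: "100"
    spec:
      containers:
        - image: ghcr.io/knative/helloworld-go:latest   # assumed sample image
```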
Per-revision memory configuration
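A minimal sketch of a Service manifest that selects the memory metric. As with cpu, memory is only supported on revisions that use the HPA class. The Service name, container image, and target value are illustrative assumptions; check the Knative target documentation for the unit that the target uses.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go          # hypothetical Service name
  namespace: default
spec:
  template:
    metadata:
      annotations:
        # The memory metric requires the HPA Autoscaler class.
        autoscaling.knative.dev/class: "hpa.autoscaling.knative.dev"
        autoscaling.knative.dev/metric: "memory"
        # Illustrative target only.
        autoscaling.knative.dev/target: "75"
    spec:
      containers:
        - image: ghcr.io/knative/helloworld-go:latest   # assumed sample image
```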
Per-revision custom metric configuration
You can create an HPA to scale the revision by a metric that you specify. The HPA will be configured to use the average value of your metric over all the Pods of the revision.
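A minimal sketch of a Service manifest that scales on a custom metric through the HPA class. The Service name and container image are illustrative assumptions, and both <metric-name> and <target> are placeholders to replace with your own values. The sketch also assumes that your cluster already exposes the custom metric to the HPA.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go          # hypothetical Service name
  namespace: default
spec:
  template:
    metadata:
      annotations:
        # Custom metrics require the HPA Autoscaler class.
        autoscaling.knative.dev/class: "hpa.autoscaling.knative.dev"
        # Placeholder: the name of your custom metric.
        autoscaling.knative.dev/metric: "<metric-name>"
        # Placeholder: the average per-Pod value the HPA should aim for.
        autoscaling.knative.dev/target: "<target>"
    spec:
      containers:
        - image: ghcr.io/knative/helloworld-go:latest   # assumed sample image
```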
Set the autoscaling.knative.dev/metric annotation to <metric-name>, where <metric-name> is the name of your custom metric.
Configure concurrency targets - https://knative.dev/docs/serving/autoscaling/concurrency/ for applications
Configure requests per second targets - https://knative.dev/docs/serving/autoscaling/rps-target/ for replicas of an application