Autoscaling Kubernetes
The Cortex Helm chart comes bundled with the cluster-autoscaler Helm chart (version 3.1.0), which allows automatic scaling of Kubernetes worker nodes based on resource utilization.
We highly recommend enabling autoscaling for variable workloads and cost efficiency, for example to bring GPU nodes up and down as needed or to handle periods of heavy utilization.
The official documentation for the cluster-autoscaler chart provides more information on how to configure it for supported cloud providers, such as AWS and Azure.
Note: For Azure, it is currently not possible to have an autoscaling node group with a desired count of 0, so you must have an active instance deployed for each node group. Changes to this requirement are currently on the Azure Roadmap, slated for Q2 2020.
Example configuration
To enable the cluster-autoscaler sub-chart included with your Cortex installation, you must set the following property in the Cortex Helm overrides file (or optionally on the helm command line using `--set`):

```yaml
cluster-autoscaler:
  enabled: true
```
Any additional configuration for the sub-chart must be provided according to your Kubernetes deployment and cloud provider; a full list of the sub-chart's configuration parameters is available here.
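As an illustration, the overrides file might pin the cloud provider and region alongside the enable flag. The `cloudProvider` and `awsRegion` keys are parameter names documented by the cluster-autoscaler chart; the region value shown is a placeholder:

```yaml
cluster-autoscaler:
  enabled: true
  # Parameter names below come from the cluster-autoscaler sub-chart;
  # the region value is a placeholder -- use your cluster's actual region.
  cloudProvider: aws
  awsRegion: us-west-2
```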
AWS
The Cortex Helm chart values override provided with Cortex in cortex5/examples/values-cortex-autoscaler-aws.yaml gives an example of how to enable autoscaling for an EKS deployment, auto-discovering which resources to manage in AWS based on tags. You must substitute your own values for your AWS access key, secret key, the region the cluster is deployed in, and the EKS cluster name.
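For orientation, here is a sketch of the kind of values such an override typically sets, based on the cluster-autoscaler chart's documented parameters. All credential, region, and cluster-name values shown are placeholders; consult the bundled example file for the authoritative contents:

```yaml
cluster-autoscaler:
  enabled: true
  # Auto-discover node groups tagged for this EKS cluster
  autoDiscovery:
    clusterName: my-eks-cluster       # placeholder: your EKS cluster name
  awsRegion: us-east-1                # placeholder: region the cluster is deployed in
  awsAccessKeyID: "<ACCESS_KEY>"      # placeholder: your AWS access key
  awsSecretAccessKey: "<SECRET_KEY>"  # placeholder: your AWS secret key
```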