Google Kubernetes Engine AutoScaling

Suppose you have an application running on Google Kubernetes Engine (GKE), and your monitoring data shows that heavy traffic arrives only during the first five days of each month. On all other days, traffic is comparatively low, so you don't need as many application instances (aka replicas). You decide that for the first five days of each month the minimum number of replicas should be 10, while on other days it can be 5. The maximum number of replicas is 50 for the entire month.

So what do you need to do?
1. Ensure the GKE cluster has Vertical Pod Autoscaling enabled. Clusters typically use the Balanced autoscaling profile by default, and both settings can easily be verified on the GKE console.
2. Ensure the node pool is configured with autoscaling enabled. This is also easily verified on the GKE console.
3. Create the HorizontalPodAutoscaler, ServiceAccount, Role, RoleBinding and the two CronJobs. Refer to https://github.com/azneita/gke-autoscaling.git for the code.

kubectl apply -f 01-horizontalpodautoscaler.yaml

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: sample-application-hpa
spec:
  maxReplicas: 50
  minReplicas: 5
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-application
  targetCPUUtilizationPercentage: 80
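
The manifest above uses the autoscaling/v1 API, which only supports a CPU target. For comparison, here is a sketch of what the same autoscaler could look like in autoscaling/v2 (available on clusters running Kubernetes 1.23 or later). It is meant as an equivalent, not as the manifest from the linked repository. Note that CPU utilization is measured against the CPU requests of the Deployment's containers, so those requests need to be set for either version to work.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: sample-application-hpa
spec:
  minReplicas: 5
  maxReplicas: 50
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-application
  metrics:
    # Same target as targetCPUUtilizationPercentage: 80 in autoscaling/v1
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
```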

kubectl apply -f 02-serviceaccount.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: sample-service-account

kubectl apply -f 03-role.yaml

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: modify-hpas
rules:
  - apiGroups: ["autoscaling"]
    resources:
      - horizontalpodautoscalers
    verbs:
      - get
      - list
      - patch
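
The Role above allows patching every HorizontalPodAutoscaler in the namespace. If you would rather limit the service account to the single autoscaler used here, a tighter variant might look like the sketch below. It assumes that kubectl patch only needs get and patch on the named object, which is worth verifying on your own cluster before relying on it.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: modify-hpas
rules:
  - apiGroups: ["autoscaling"]
    resources: ["horizontalpodautoscalers"]
    # Restrict the rule to the one HPA the CronJobs modify
    resourceNames: ["sample-application-hpa"]
    verbs: ["get", "patch"]
```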

kubectl apply -f 04-rolebinding.yaml

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: modify-hpas-to-sa
subjects:
  - kind: ServiceAccount
    name: sample-service-account
roleRef:
  kind: Role
  name: modify-hpas
  apiGroup: rbac.authorization.k8s.io

kubectl apply -f 05-scale-up-cronjob.yaml

apiVersion: batch/v1
kind: CronJob
metadata:
  name: scale-up-cronjob
spec:
  schedule: "0 0 1 * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: sample-service-account
          containers:
          - name: scale-up-cronjob
            image: gcr.io/google.com/cloudsdktool/google-cloud-cli:latest
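            # kubectl must be the executable, with each argument passed separately.
            # An inline patch avoids needing scale-up-patch.yaml inside the container;
            # minReplicas 10 is assumed from the scenario described at the top.
            command: ["kubectl"]
            args: ["patch", "hpa", "sample-application-hpa", "--patch", '{"spec":{"minReplicas":10}}']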
          restartPolicy: "OnFailure"
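
The schedule "0 0 1 * *" reads as minute 0, hour 0, day 1 of every month, so the job fires at midnight on the 1st. Unless told otherwise, Kubernetes interprets that time in the time zone of the control plane, which is usually UTC on managed clusters. If your traffic peak follows local time, the optional timeZone field (stable since Kubernetes 1.27) may help. The fragment below is a sketch; the zone name is a placeholder, not something taken from the original setup.

```yaml
# Fragment of the CronJob spec only; jobTemplate stays as in the manifest above.
spec:
  # minute hour day-of-month month day-of-week
  schedule: "0 0 1 * *"
  # Placeholder IANA time zone; replace with the zone your traffic follows
  timeZone: "America/New_York"
```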

kubectl apply -f 06-scale-down-cronjob.yaml

apiVersion: batch/v1
kind: CronJob
metadata:
  name: scale-down-cronjob
spec:
  schedule: "0 0 6 * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: sample-service-account
          containers:
          - name: scale-down-cronjob
            image: gcr.io/google.com/cloudsdktool/google-cloud-cli:latest
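            # kubectl must be the executable, with each argument passed separately.
            # An inline patch avoids needing scale-down-patch.yaml inside the container;
            # minReplicas 5 is assumed from the scenario described at the top.
            command: ["kubectl"]
            args: ["patch", "hpa", "sample-application-hpa", "--patch", '{"spec":{"minReplicas":5}}']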
          restartPolicy: "OnFailure"
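
Each job only has to change spec.minReplicas on the autoscaler; maxReplicas and the CPU target stay as they are. As a sketch, a patch document for the scale-up case could be as small as the one below. The value 10 comes from the scenario at the top, and the scale-down patch would set 5 instead. Keep in mind that a file passed to kubectl with --patch-file has to exist inside the container, for example by mounting it from a ConfigMap.

```yaml
# Hypothetical contents of scale-up-patch.yaml (not copied from the linked repository)
spec:
  minReplicas: 10
```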

References
https://trstringer.com/kubectl-from-within-pod/
https://medium.com/symbl-ai-engineering-and-data-science/time-based-scaling-for-kubernetes-deployments-9ef7ada93eb7
