So you have an application running on Google Kubernetes Engine (GKE), and your monitoring data shows that heavy traffic arrives only during the first five days of each month. On all other days traffic is comparatively low, so you don’t need as many application instances (aka replicas). You decide that for the first five days of each month the minimum number of replicas should be 10, while on other days it can be 5. The maximum number of replicas is 50 for the entire month.
So what do you need to do?
1. Ensure the GKE cluster has vertical autoscaling enabled. Typically, this comes with the Balanced autoscaling profile and can easily be verified on the GKE console.
2. Ensure the node pool is configured with autoscaling enabled. This is also easily verified on the GKE console.
3. Create the HorizontalPodAutoscaler, ServiceAccount, Role, RoleBinding and CronJobs. Refer to https://github.com/azneita/gke-autoscaling.git for the code.
kubectl apply -f 01-horizontalpodautoscaler.yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: sample-application-hpa
spec:
  maxReplicas: 50
  minReplicas: 5
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-application
  targetCPUUtilizationPercentage: 80
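For intuition about how the HPA moves between these bounds: the Kubernetes documentation describes the core calculation as desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), with the result kept between minReplicas and maxReplicas. The following is a simplified sketch of that arithmetic in shell. The function name `desired_replicas` is made up for illustration, and the real controller also applies a tolerance and stabilization behavior that are ignored here.

```shell
# desired_replicas CURRENT_REPLICAS CURRENT_CPU_PERCENT TARGET_CPU_PERCENT MIN MAX
# Simplified HPA arithmetic: ceil(current * metric / target), clamped to [MIN, MAX].
desired_replicas() {
  # Integer ceiling division: (a + b - 1) / b
  want=$(( ($1 * $2 + $3 - 1) / $3 ))
  if [ "$want" -lt "$4" ]; then want=$4; fi
  if [ "$want" -gt "$5" ]; then want=$5; fi
  echo "$want"
}

# 5 replicas averaging 160% CPU against the 80% target, bounds 5..50
desired_replicas 5 160 80 5 50   # prints 10
```

With the same inputs but a very low load (say 10% CPU), the raw result would be 1, and the clamp brings it back up to the minimum of 5. That minimum is the value the CronJobs later in this post change.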
kubectl apply -f 02-serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: sample-service-account
kubectl apply -f 03-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: modify-hpas
rules:
- apiGroups: ["autoscaling"]
  resources:
  - horizontalpodautoscalers
  verbs:
  - get
  - list
  - patch
kubectl apply -f 04-rolebinding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: modify-hpas-to-sa
subjects:
- kind: ServiceAccount
  name: sample-service-account
  # ServiceAccount subjects need a namespace; "default" is assumed here.
  namespace: default
roleRef:
  kind: Role
  name: modify-hpas
  apiGroup: rbac.authorization.k8s.io
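Before wiring up the CronJobs, it can help to confirm that the binding grants what the jobs need. Below is a minimal sketch using `kubectl auth can-i` with impersonation. It assumes the resources were created in the `default` namespace and that your own user is allowed to impersonate service accounts; the helper name `sa_can` is hypothetical.

```shell
# sa_can VERB: ask the API server whether sample-service-account may perform
# VERB on HorizontalPodAutoscalers. Assumes the "default" namespace.
sa_can() {
  kubectl auth can-i "$1" horizontalpodautoscalers \
    --namespace default \
    --as system:serviceaccount:default:sample-service-account
}

# Example (needs cluster access):
#   sa_can patch    # should answer "yes" once the Role and RoleBinding exist
#   sa_can delete   # should answer "no"; the Role does not grant delete
```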
kubectl apply -f 05-scale-up-cronjob.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: scale-up-cronjob
spec:
  # 00:00 on the 1st of every month
  schedule: "0 0 1 * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: sample-service-account
          containers:
          - name: scale-up-cronjob
            image: gcr.io/google.com/cloudsdktool/google-cloud-cli:latest
            # Each token is a separate argument. The patch is passed inline so
            # that no patch file has to exist inside the container.
            command:
            - kubectl
            - patch
            - hpa
            - sample-application-hpa
            - --patch
            - '{"spec":{"minReplicas":10}}'
          restartPolicy: "OnFailure"
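Rather than waiting for the 1st of the month, the CronJob can be exercised on demand. The sketch below creates a one-off Job from the CronJob's template and then reads back the HPA's minimum. The job name `scale-up-manual-test` and both helper names are made up for this example, and the commands run against whatever namespace your kubectl context points at.

```shell
# Create a one-off Job from the scale-up CronJob's template.
run_scale_up_now() {
  kubectl create job scale-up-manual-test --from=cronjob/scale-up-cronjob
}

# Print the HPA's current minReplicas.
current_min_replicas() {
  kubectl get hpa sample-application-hpa -o jsonpath='{.spec.minReplicas}'
}

# Example (needs cluster access):
#   run_scale_up_now
#   kubectl wait --for=condition=complete job/scale-up-manual-test --timeout=120s
#   current_min_replicas
```

If the Job fails, `kubectl logs job/scale-up-manual-test` is usually the quickest way to see whether the cause is a permissions problem or a malformed command.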
kubectl apply -f 06-scale-down-cronjob.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: scale-down-cronjob
spec:
  # 00:00 on the 6th of every month
  schedule: "0 0 6 * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: sample-service-account
          containers:
          - name: scale-down-cronjob
            image: gcr.io/google.com/cloudsdktool/google-cloud-cli:latest
            # Each token is a separate argument. The patch is passed inline so
            # that no patch file has to exist inside the container.
            command:
            - kubectl
            - patch
            - hpa
            - sample-application-hpa
            - --patch
            - '{"spec":{"minReplicas":5}}'
          restartPolicy: "OnFailure"
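Taken together, the two schedules mean the HPA floor is 10 from midnight on the 1st until midnight on the 6th, and 5 for the rest of the month. Here is a small sketch of that rule, handy as a sanity check against what `kubectl get hpa` reports. The function name `expected_min_replicas` is hypothetical.

```shell
# expected_min_replicas DAY_OF_MONTH
# Prints the minReplicas the two CronJobs should have left in place on that day.
expected_min_replicas() {
  day=${1#0}   # drop a leading zero, e.g. "08" from `date +%d`
  if [ "$day" -ge 1 ] && [ "$day" -le 5 ]; then
    echo 10    # scale-up ran on the 1st, scale-down has not run yet
  else
    echo 5     # scale-down ran on the 6th
  fi
}

expected_min_replicas "$(date +%d)"
```

Note that CronJob schedules are evaluated in the time zone of the cluster's controller manager unless `spec.timeZone` is set, so right around midnight the result for your local date may differ from what the cluster has applied.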
References
https://trstringer.com/kubectl-from-within-pod/
https://medium.com/symbl-ai-engineering-and-data-science/time-based-scaling-for-kubernetes-deployments-9ef7ada93eb7