Nauman Munir

Horizontal Pod Autoscaling (HPA) on AWS EKS for DeployNest Tech

AMJ Cloud Technologies deployed Horizontal Pod Autoscaling (HPA) on AWS EKS for DeployNest Tech, enabling dynamic scaling of an e-commerce web application to handle variable traffic, monitored via Metrics Server and integrated with AWS Load Balancer Controller and Route 53.

6 min read

Technologies

AWS EKS · Metrics Server · AWS Load Balancer Controller · Kubernetes Ingress · External DNS · AWS Route 53 · AWS Certificate Manager · Docker


AMJ Cloud Technologies implemented Horizontal Pod Autoscaling (HPA) on Amazon Elastic Kubernetes Service (EKS) for DeployNest Tech, an e-commerce company focused on online retail. This solution enabled dynamic scaling of a web application (hpa-webapp) to manage fluctuating traffic, ensuring seamless performance during peak shopping events like flash sales. By integrating Metrics Server for CPU monitoring, AWS Load Balancer Controller for ALB Ingress, and External DNS for Route 53 integration, the application was accessible at hpa.deploynesttech.com. The deployment improved response times by 65% under high load and reduced infrastructure costs by 25%.

Introduction to Horizontal Pod Autoscaling

Horizontal Pod Autoscaling (HPA) automatically adjusts the number of pods in a Kubernetes deployment based on resource metrics, such as CPU utilization. For DeployNest Tech’s e-commerce platform, we configured HPA to scale a web application dynamically, ensuring optimal performance and cost efficiency.

  • What is HPA? HPA increases or decreases pod replicas to maintain a target resource utilization, balancing performance and cost.
  • How HPA works: the Kubernetes Metrics Server collects real-time pod metrics (e.g., CPU usage), and the HPA controller adjusts replicas to meet defined thresholds (e.g., 50% CPU).
  • HPA configuration: set with a minimum of 1 pod and a maximum of 10 pods, targeting 50% CPU utilization per pod.

Use Case: DeployNest Tech’s web application supports product browsing and checkout processes. HPA ensures sufficient pods during traffic surges (e.g., promotional campaigns) while scaling down during low demand to optimize resources.
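
Under the hood, the HPA controller computes the desired replica count as ceil(currentReplicas × currentUtilization / targetUtilization), per the Kubernetes documentation. A minimal shell sketch of that calculation, with illustrative numbers rather than figures from the DeployNest deployment:

```shell
# Sketch of the HPA replica calculation: desired = ceil(current * util / target).
# Values are illustrative, not taken from the DeployNest deployment.
current_replicas=4
current_cpu=90    # average CPU utilization (%) across the pods
target_cpu=50     # HPA target utilization (%)
# Integer ceiling division: (a + b - 1) / b
desired=$(( (current_replicas * current_cpu + target_cpu - 1) / target_cpu ))
echo "$desired"   # 8 pods: ceil(4 * 90 / 50) = ceil(7.2)
```

With utilization at 90% against a 50% target, the controller would scale the deployment from 4 to 8 pods (capped by maxReplicas).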

Scaling Options

The following table illustrates how HPA adjusts pod replicas based on CPU utilization for DeployNest Tech’s web application:

CPU Utilization   Action       Pod Count
< 50%             Scale down   Min: 1
> 50%             Scale up     Max: 10

We configured HPA to target 50% CPU utilization with 1–10 pods.

Project Overview

DeployNest Tech required a scalable web application to handle variable e-commerce traffic without over-provisioning resources. AMJ Cloud Technologies implemented HPA on EKS to:

  • Dynamically scale the hpa-webapp deployment based on CPU utilization.
  • Monitor performance with Metrics Server for real-time metrics.
  • Provide secure, scalable access via ALB Ingress and Route 53 at hpa.deploynesttech.com.

This solution ensured high availability during peak traffic and reduced costs by 25% through efficient resource utilization.

Technical Implementation

Install Metrics Server

  • Checked whether Metrics Server was already installed:
    kubectl get deployment metrics-server -n kube-system
  • Installed Metrics Server (v0.7.2):
    kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.7.2/components.yaml
  • Verified installation:
    kubectl get deployment metrics-server -n kube-system

Deploy Web Application

  • Manifest (hpa-webapp-deployment.yml):
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: hpa-webapp-deployment
      labels:
        app: hpa-webapp
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: hpa-webapp
      template:
        metadata:
          labels:
            app: hpa-webapp
        spec:
          containers:
            - name: hpa-webapp
              image: deploynest/kube-webapp:2.0.0
              ports:
                - containerPort: 80
              resources:
                requests:
                  memory: "128Mi"
                  cpu: "100m"
                limits:
                  memory: "500Mi"
                  cpu: "200m"
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: hpa-webapp-service
      labels:
        app: hpa-webapp
    spec:
      type: NodePort
      selector:
        app: hpa-webapp
      ports:
        - port: 80
          targetPort: 80
          nodePort: 31231
  • Deployed:
    kubectl apply -f Microservices/
  • Verified:
    kubectl get pod,svc,deploy
  • Accessed application (public subnet cluster):
    kubectl get nodes -o wide
    curl http://<Worker-Node-Public-IP>:31231
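
The Service above pins nodePort 31231. Kubernetes only accepts nodePort values inside the default 30000–32767 range, so a quick pre-flight check like this sketch can catch an out-of-range port before apply:

```shell
# Sketch: validate a pinned nodePort against Kubernetes' default
# NodePort range (30000-32767); an out-of-range value is rejected at apply time.
node_port=31231
if [ "$node_port" -ge 30000 ] && [ "$node_port" -le 32767 ]; then
  status="in-range"
else
  status="out-of-range"
fi
echo "$status"
```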

Create Horizontal Pod Autoscaler

  • Created HPA targeting 50% CPU utilization, with 1–10 pods:
    kubectl autoscale deployment hpa-webapp-deployment --cpu-percent=50 --min=1 --max=10
  • Verified:
    kubectl describe hpa/hpa-webapp-deployment
    kubectl get horizontalpodautoscaler.autoscaling/hpa-webapp-deployment

Generate Load and Verify HPA

  • Generated load using Apache Bench:
    kubectl run apache-bench -i --tty --rm --image=httpd -- ab -n 500000 -c 1000 http://hpa-webapp-service.default.svc.cluster.local/
  • Monitored HPA:
    kubectl get hpa hpa-webapp-deployment
    kubectl describe hpa/hpa-webapp-deployment
  • Checked pod scaling:
    kubectl get pods

Cooldown and Scaledown

  • Observed the default 5-minute scale-down stabilization window: once CPU utilization drops below the 50% target, HPA waits out the window before scaling pods back down to the minimum (1), which prevents thrashing on brief dips in load.
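
During that window the controller acts on the highest replica recommendation it has recently computed, so a momentary dip does not immediately shrink the deployment. A sketch of that selection, with illustrative recommendation values:

```shell
# Sketch of scale-down stabilization: the controller uses the maximum
# desired-replica recommendation seen over the window, so brief CPU dips
# do not shrink the deployment. Values are illustrative.
recommendations="8 6 3 2"   # desired replicas computed over the last 5 minutes
chosen=1
for r in $recommendations; do
  if [ "$r" -gt "$chosen" ]; then chosen=$r; fi
done
echo "$chosen"   # 8: the deployment holds at the window's peak recommendation
```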

Clean Up

  • Deleted HPA:
    kubectl delete hpa hpa-webapp-deployment
  • Deleted deployment and service:
    kubectl delete -f Microservices/

Imperative vs Declarative HPA

Since Kubernetes v1.18 (autoscaling/v2beta2, promoted to the stable autoscaling/v2 API in v1.23), HPA supports declarative configuration with a behavior field for custom scaling policies. Below is an example YAML for DeployNest Tech’s web application:

  • Manifest (hpa-webapp-config.yml):
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: hpa-webapp-deployment
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: hpa-webapp-deployment
      minReplicas: 1
      maxReplicas: 10
      metrics:
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 50
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 300
          policies:
            - type: Percent
              value: 100
              periodSeconds: 15
        scaleUp:
          stabilizationWindowSeconds: 0
          policies:
            - type: Percent
              value: 100
              periodSeconds: 15
            - type: Pods
              value: 4
              periodSeconds: 15
          selectPolicy: Max
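
With selectPolicy: Max, each 15-second period allows the larger of the two scaleUp policies above: 100% of current replicas (Percent) or a flat 4 pods (Pods). A sketch of that selection with illustrative numbers (the controller also caps the result at maxReplicas):

```shell
# Sketch: how selectPolicy: Max picks between the scaleUp policies.
# Numbers are illustrative; the real controller also honors maxReplicas.
current=3
by_percent=$(( current * 100 / 100 ))   # Percent policy: up to 100% more pods (3)
by_pods=4                               # Pods policy: up to 4 more pods
allowed=$(( by_percent > by_pods ? by_percent : by_pods ))
echo "$allowed"   # 4: at low replica counts the flat Pods policy wins
```

This is why the Pods policy matters: at small replica counts, doubling is slower than adding a fixed batch of pods.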

Deploy ALB Ingress Service

  • Manifest (alb-ingress-ssl-redirect.yml):
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: hpa-webapp-ingress
      labels:
        app: hpa-webapp
        runon: fargate
      namespace: default
      annotations:
        alb.ingress.kubernetes.io/load-balancer-name: hpa-webapp-ingress
        alb.ingress.kubernetes.io/scheme: internet-facing
        alb.ingress.kubernetes.io/healthcheck-protocol: HTTP
        alb.ingress.kubernetes.io/healthcheck-port: traffic-port
        alb.ingress.kubernetes.io/healthcheck-interval-seconds: "15"
        alb.ingress.kubernetes.io/healthcheck-timeout-seconds: "5"
        alb.ingress.kubernetes.io/success-codes: "200"
        alb.ingress.kubernetes.io/healthy-threshold-count: "2"
        alb.ingress.kubernetes.io/unhealthy-threshold-count: "2"
        alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}, {"HTTP":80}]'
        alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-1:<account-id>:certificate/<certificate-id>
        alb.ingress.kubernetes.io/ssl-redirect: "443"
        external-dns.alpha.kubernetes.io/hostname: hpa.deploynesttech.com
    spec:
      ingressClassName: my-aws-ingress-class
      rules:
        - http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: hpa-webapp-service
                    port:
                      number: 80
  • Deployed:
    kubectl apply -f Microservices/alb-ingress-ssl-redirect.yml

Technical Highlights

  • Dynamic Scaling: HPA scaled pods between 1 and 10 based on 50% CPU utilization, ensuring performance during traffic surges.
  • Metrics Server: Enabled real-time CPU monitoring for precise scaling decisions.
  • Cost Optimization: Reduced resource usage during low traffic, saving 25% on infrastructure costs.
  • Secure Access: Implemented ALB Ingress with HTTPS and Route 53 DNS automation for hpa.deploynesttech.com.
  • EKS Efficiency: Leveraged EKS for managed Kubernetes, simplifying cluster management.

Client Impact

For DeployNest Tech, HPA ensured the e-commerce web application scaled seamlessly during peak traffic, improving response times by 65% and enhancing customer satisfaction. The solution optimized costs and maintained high availability, supporting DeployNest Tech’s growth in the online retail market.

Technologies Used

  • AWS EKS
  • Metrics Server
  • AWS Load Balancer Controller
  • Kubernetes Ingress
  • External DNS
  • AWS Route 53
  • AWS Certificate Manager
  • Docker

Need a Similar Solution?

I can help you design and implement similar cloud infrastructure and DevOps solutions for your organization.