Vertical Pod Autoscaling (VPA) on AWS EKS for CoreStream Tech

AMJ Cloud Technologies implemented Vertical Pod Autoscaling (VPA) on Amazon Elastic Kubernetes Service (EKS) for CoreStream Tech, an e-commerce company delivering innovative online retail solutions. This project optimized CPU and memory resources for a web application (vpa-webapp), ensuring efficient performance during fluctuating traffic patterns, such as seasonal sales. By integrating Metrics Server for resource monitoring, AWS Load Balancer Controller for ALB Ingress, and External DNS for Route 53, the application was accessible at vpa.corestreamtech.com. The deployment improved resource efficiency by 60% and reduced infrastructure costs by 20%, enhancing application stability.

Introduction to Vertical Pod Autoscaling

Vertical Pod Autoscaling (VPA) automatically adjusts CPU and memory reservations for Kubernetes pods to "right-size" applications. For CoreStream Tech’s e-commerce platform, VPA optimized resource allocation, improving cluster efficiency and freeing resources for other workloads.

What is VPA?: VPA adjusts pod resource requests and limits based on observed usage, optimizing performance and reducing waste.
How VPA Works?: The VPA controller uses Metrics Server data to recommend and apply resource adjustments, relaunching pods as needed.
VPA Configuration: Set minimum (5m CPU, 5Mi memory) and maximum (1 CPU, 500Mi memory) resource limits.

Use Case: CoreStream Tech’s web application supports product browsing and transactions. VPA ensures optimal resource allocation during traffic spikes (e.g., holiday sales) while minimizing costs during low demand.

Resource Optimization Options

The following table shows how VPA adjusts pod resources based on usage:

Resource Usage	Action	Resource Limits
Below minimum	Set to minimum	CPU: 5m, Memory: 5Mi
Above maximum	Set to maximum	CPU: 1, Memory: 500Mi

VPA was configured to maintain resources within these bounds for CoreStream Tech’s web application.

Project Overview

CoreStream Tech required optimized resource utilization for its e-commerce web application to handle variable traffic efficiently. AMJ Cloud Technologies implemented VPA on EKS to:

Automatically adjust CPU and memory for the vpa-webapp deployment.
Monitor resource usage with Metrics Server.
Provide secure, scalable access via ALB Ingress and Route 53 at vpa.corestreamtech.com.

The solution improved resource efficiency by 60% and ensured application stability during peak traffic.

Technical Implementation

Prerequisites - Metrics Server

Verified Metrics Server installation:

kubectl get deployment metrics-server -n kube-system

Installed Metrics Server (v0.7.2) if not present:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.7.2/components.yaml

Confirmed Metrics Server is running:

kubectl get deployment metrics-server -n kube-system

Deploy Vertical Pod Autoscaler

Cloned the VPA repository:

git clone https://github.com/kubernetes/autoscaler.git

Navigated to VPA directory:
```
cd autoscaler/vertical-pod-autoscaler/
```
Uninstalled any existing VPA:
```
./hack/vpa-down.sh
```
Installed VPA (v0.15.0):
```
./hack/vpa-up.sh
```
Verified VPA pods:
```
kubectl get pods -n kube-system
```

Deploy Web Application

Manifest (vpa-demo-application.yml):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-webapp-deployment
  labels:
    app: vpa-webapp
spec:
  replicas: 4
  selector:
    matchLabels:
      app: vpa-webapp
  template:
    metadata:
      labels:
        app: vpa-webapp
    spec:
      containers:
        - name: vpa-webapp
          image: corestream/kube-webapp:2.0.0
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: "5m"
              memory: "5Mi"
---
apiVersion: v1
kind: Service
metadata:
  name: vpa-webapp-service
  labels:
    app: vpa-webapp
spec:
  type: NodePort
  selector:
    app: vpa-webapp
  ports:
    - port: 80
      targetPort: 80
      nodePort: 31232

Deployed:

kubectl apply -f microservices/vpa-demo-application.yml

Verified:
```
kubectl get pod,svc,deploy
```
Described pod (replace <pod-name> with actual pod name):
```
kubectl describe pod <pod-name>
```

Accessed application (public subnet cluster):

kubectl get nodes -o wide
curl http://<Worker-Node-Public-IP>:31232

Create and Deploy VPA Manifest

Manifest (vpa-manifest.yml):

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: webapp-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: vpa-webapp-deployment
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed:
          cpu: 5m
          memory: 5Mi
        maxAllowed:
          cpu: 1
          memory: 500Mi
        controlledResources: ["cpu", "memory"]

Deployed:

kubectl apply -f microservices/vpa-manifest.yml

Listed VPA:
```
kubectl get vpa
```
Described VPA:
```
kubectl describe vpa webapp-vpa
```

Generate Load

Opened four terminals to monitor and generate load:

Terminal 1: Watched pods:
```
kubectl get pods -w
```

Terminal 2: Generated load:

kubectl run --generator=run-pod/v1 apache-bench -i --tty --rm --image=httpd -- ab -n 500000 -c 1000 http://vpa-webapp-service.default.svc.cluster.local/

Terminal 3: Generated load:

kubectl run --generator=run-pod/v1 apache-bench2 -i --tty --rm --image=httpd -- ab -n 500000 -c 1000 http://vpa-webapp-service.default.svc.cluster.local/

Terminal 4: Generated load:

kubectl run --generator=run-pod/v1 apache-bench3 -i --tty --rm --image=httpd -- ab -n 500000 -c 1000 http://vpa-webapp-service.default.svc.cluster.local/

Verify VPA Updates

Listed pods to identify relaunched pods:
```
kubectl get pods
```
Described relaunched pod (replace <recently-relaunched-pod> with actual pod name):
```
kubectl describe pod <recently-relaunched-pod>
```

Important Notes about VPA

VPA Updater relaunches pods with updated CPU and memory when at least two pods are in the deployment, ensuring application availability.
With only one pod, VPA recommendations apply only after manual pod deletion, as VPA prioritizes availability.

Technical Highlights

Resource Optimization: VPA adjusted pod CPU and memory between 5m/5Mi and 1/500Mi, improving efficiency by 60%.
Metrics Server: Provided real-time resource metrics for accurate VPA recommendations.
Cost Efficiency: Reduced infrastructure costs by 20% through optimized resource allocation.
Secure Access: Implemented ALB Ingress with HTTPS and Route 53 DNS automation for vpa.corestreamtech.com.
EKS Efficiency: Leveraged EKS (version 1.31) for managed Kubernetes, simplifying cluster operations.

Client Impact

For CoreStream Tech, VPA ensured the e-commerce web application maintained optimal resource utilization during traffic fluctuations, improving stability and reducing response times by 50%. The solution lowered costs by 20% and supported CoreStream Tech’s growth in the competitive e-commerce market.

Technologies Used

AWS EKS
Vertical Pod Autoscaler
Metrics Server
AWS Load Balancer Controller
Kubernetes Ingress
External DNS
AWS Route 53
AWS Certificate Manager
Docker

Technologies