Warning
You are viewing v0.15 of the documentation, which is not the latest version.
Getting Started
Sample HTTP application deployment and autoscaling with the KEDA HTTP Add-on
In this tutorial, we will install the KEDA HTTP Add-on and use it to autoscale an HTTP application based on incoming traffic — including scaling to zero when idle.
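By the end, we will have the HTTP Add-on installed, a sample application deployed behind the interceptor, and autoscaling verified, including scale to zero.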
Before we begin, we need:
A Kubernetes cluster (kind, minikube, or a cloud provider)
kubectl configured to access the cluster
Helm 3 installed
KEDA core installed:
helm install keda kedacore/keda --namespace keda --create-namespace
See the KEDA deployment docs for other installation methods.
If we have not already added the KEDA Helm repository, we add it now and update our local chart index:
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
We install the HTTP Add-on into the same keda namespace where KEDA core is running:
helm install http-add-on kedacore/keda-add-ons-http --namespace keda
We verify that all components are running:
kubectl get pods -n keda
We will see pods for the operator, interceptor, and scaler — all with a Running status:
NAME                                  READY   STATUS    RESTARTS   AGE
keda-add-ons-http-interceptor-...     1/1     Running   0          30s
keda-add-ons-http-operator-...        1/1     Running   0          30s
keda-add-ons-http-scaler-...          1/1     Running   0          30s
keda-admission-webhooks-...           1/1     Running   0          2m
keda-operator-...                     1/1     Running   0          2m
keda-operator-metrics-apiserver-...   1/1     Running   0          2m
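Instead of re-running the command until every pod shows Running, we can block until the pods report Ready. This is a minimal sketch: the function name wait_for_keda_pods and the 120-second default timeout are our own choices, not part of the add-on, and it assumes every pod in the namespace is expected to become Ready.

```shell
# Hypothetical helper: block until every pod in a namespace is Ready.
# The defaults (namespace "keda", timeout 120s) are assumptions for this tutorial.
wait_for_keda_pods() {
  local namespace="${1:-keda}"
  local timeout="${2:-120s}"
  # kubectl wait exits non-zero if the condition is not met before the timeout.
  kubectl wait --for=condition=Ready pod --all \
    --namespace "$namespace" --timeout="$timeout"
}

# Usage (requires access to the cluster):
#   wait_for_keda_pods keda 180s
```

A non-zero exit status means at least one pod did not become Ready in time, which is usually the cue to inspect it with kubectl describe.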
We create a namespace and deploy a sample HTTP application using traefik/whoami, a lightweight HTTP server that responds with request metadata.
kubectl create namespace demo
We deploy a Deployment and Service:
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app
  namespace: demo
spec:
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
    spec:
      containers:
        - name: sample-app
          image: traefik/whoami
          args: ["--port=8080"]
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: sample-app
  namespace: demo
spec:
  selector:
    app: sample-app
  ports:
    - port: 80
      targetPort: 8080
EOF
We verify the Deployment was created:
kubectl get deployment -n demo
We will see the Deployment with 1 replica running (the Kubernetes default):
NAME         READY   UP-TO-DATE   AVAILABLE   AGE
sample-app   1/1     1            1           10s
The InterceptorRoute tells the interceptor how to route requests to our sample app and what scaling metric to use.
kubectl apply -f - <<EOF
apiVersion: http.keda.sh/v1beta1
kind: InterceptorRoute
metadata:
  name: sample-app
  namespace: demo
spec:
  target:
    service: sample-app
    port: 80
  rules:
    - hosts:
        - sample-app.example.com
  scalingMetric:
    requestRate:
      targetValue: 5
      window: 1m
      granularity: 1s
EOF
The requestRate metric scales based on requests per second, averaged over the configured window.
A targetValue: 5 means the add-on targets 5 requests per second per replica.
We use a low value here so that scaling is visible during testing.
See Scaling for details on scaling metrics and how to tune them.
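To make the target concrete, here is the arithmetic for a few request rates. This is a sketch under an assumption: that the autoscaler follows the usual average-value rule (desired replicas = request rate divided by target, rounded up, then clamped to the configured minimum and maximum). The function name and the example limits of 0 and 10 are illustrative.

```shell
# Hypothetical helper: estimate replicas for a request rate (requests/second).
# Assumes desired = ceil(rate / target), clamped to [min, max].
desired_replicas() {
  local rate="$1" target="$2" min="${3:-0}" max="${4:-10}"
  # Integer ceiling division.
  local desired=$(( (rate + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired="$min"; fi
  if [ "$desired" -gt "$max" ]; then desired="$max"; fi
  echo "$desired"
}

desired_replicas 0 5     # prints 0
desired_replicas 4 5     # prints 1
desired_replicas 20 5    # prints 4
desired_replicas 80 5    # prints 10 (capped at max)
```

Observed replica counts can differ from this estimate, because the rate is averaged over the configured window and the autoscaler reacts with some delay.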
We verify the InterceptorRoute is ready:
kubectl get interceptorroute -n demo
We will see:
NAME         TARGETSERVICE   READY   AGE
sample-app   sample-app      True    10s
The ScaledObject tells KEDA how to scale our sample-app deployment.
It uses the external-push trigger type, which receives metrics from the HTTP Add-on’s scaler component.
kubectl apply -f - <<EOF
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: sample-app
  namespace: demo
spec:
  scaleTargetRef:
    name: sample-app
  minReplicaCount: 0
  maxReplicaCount: 10
  cooldownPeriod: 30
  triggers:
    - type: external-push
      metadata:
        scalerAddress: keda-add-ons-http-external-scaler.keda:9090
        interceptorRoute: sample-app
EOF
The interceptorRoute value must match the name of the InterceptorRoute we created in the previous step.
See Architecture for details on how these components connect.
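Because a mismatched name is easy to miss, we can check the reference directly. This is a sketch, assuming the trigger we care about is the first one in the ScaledObject; check_route_reference is a hypothetical helper name, and it relies only on kubectl get with a jsonpath output.

```shell
# Hypothetical helper: confirm the ScaledObject's first trigger references an
# InterceptorRoute that exists in the same namespace.
check_route_reference() {
  local scaledobject="$1" namespace="$2"
  local route
  # Read the interceptorRoute value from the first trigger's metadata.
  route="$(kubectl get scaledobject "$scaledobject" -n "$namespace" \
    -o jsonpath='{.spec.triggers[0].metadata.interceptorRoute}')"
  if [ -z "$route" ]; then
    echo "no interceptorRoute set on $scaledobject" >&2
    return 1
  fi
  # kubectl get fails if no InterceptorRoute with that name exists.
  if kubectl get interceptorroute "$route" -n "$namespace" >/dev/null 2>&1; then
    echo "ok: $scaledobject -> $route"
  else
    echo "missing InterceptorRoute: $route" >&2
    return 1
  fi
}

# Usage (requires access to the cluster):
#   check_route_reference sample-app demo
```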
We verify the ScaledObject was created:
kubectl get scaledobject -n demo
We will see:
NAME         SCALETARGETKIND      SCALETARGETNAME   MIN   MAX   TRIGGERS        ...
sample-app   apps/v1.Deployment   sample-app        0     10    external-push   ...
Now we test that autoscaling works. Since there is no traffic, KEDA has scaled the deployment to 0 replicas. We verify this:
kubectl get deployment sample-app -n demo
NAME         READY   UP-TO-DATE   AVAILABLE   AGE
sample-app   0/0     0            0           2m
For testing, we use kubectl port-forward to access the interceptor proxy.
In production, the ingress or gateway must route traffic to the interceptor proxy service (keda-add-ons-http-interceptor-proxy) rather than directly to the application. See Configure Ingress for details.
kubectl port-forward -n keda svc/keda-add-ons-http-interceptor-proxy 8090:8080
In another terminal, we send a request with the matching Host header:
curl -H "Host: sample-app.example.com" localhost:8090
The first request may take a few seconds. This is the cold start: KEDA is scaling the deployment from 0 to 1 replica, and the interceptor holds the request until the pod is ready. We will see a response from the sample app once the pod starts.
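To put a number on the cold start, we can have curl report the total request time. This is a sketch, assuming the port-forward from the previous step is still running; time_request is a hypothetical helper name, and the defaults mirror the host and port used in this tutorial.

```shell
# Hypothetical helper: print the total time in seconds that curl needed for
# one request through the port-forwarded interceptor.
time_request() {
  local host="${1:-sample-app.example.com}" url="${2:-localhost:8090}"
  # -w '%{time_total}\n' prints the elapsed time; the response body is discarded.
  curl -s -o /dev/null -w '%{time_total}\n' -H "Host: $host" "$url"
}

# Usage (requires the port-forward to be active):
#   time_request    # first call while scaled to zero: includes the cold start
#   time_request    # second call right after: the pod is already running
```

Comparing the two values gives a rough measure of how long the interceptor held the first request while the pod started.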
We check replicas again:
kubectl get deployment sample-app -n demo
We will see 1 replica running:
NAME         READY   UP-TO-DATE   AVAILABLE   AGE
sample-app   1/1     1            1           3m
To see scaling beyond 1 replica, we generate a burst of traffic.
The wait=50ms query parameter tells whoami to hold each response for 50 milliseconds. Because the loop sends requests one at a time, this caps the rate at about 20 requests per second, which is enough to trigger scaling with our targetValue of 5:
for i in $(seq 1 300); do curl -s -H "Host: sample-app.example.com" "localhost:8090/?wait=50ms" > /dev/null; done
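The 20 requests per second figure follows from the loop being sequential: each request takes at least the 50 ms wait, so the rate cannot exceed 1000 / 50 = 20 per second, and 300 requests last at least 15 seconds. A sketch of that arithmetic, with illustrative function names (real rates are somewhat lower because of connection and process start-up overhead):

```shell
# Upper bound on requests/second for a sequential loop in which each
# request is held for wait_ms milliseconds.
max_sequential_rate() {
  local wait_ms="$1"
  echo $(( 1000 / wait_ms ))
}

# Lower bound on how long (in seconds) a burst of `count` requests lasts.
min_burst_seconds() {
  local count="$1" wait_ms="$2"
  echo $(( count * wait_ms / 1000 ))
}

max_sequential_rate 50      # prints 20
min_burst_seconds 300 50    # prints 15
```

To generate a longer or heavier burst, we would raise the request count or run several loops in parallel rather than lowering the wait.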
After the burst finishes, we check the deployment:
kubectl get deployment sample-app -n demo
We will see the replica count has increased:
NAME         READY   UP-TO-DATE   AVAILABLE   AGE
sample-app   2/2     2            2           5m
After the burst ends and the cooldown period passes (30 seconds, as configured in our ScaledObject), KEDA scales the deployment back to 0. We can watch this happen:
kubectl get deployment sample-app -n demo -w
We will see replicas decrease to 0:
NAME         READY   UP-TO-DATE   AVAILABLE   AGE
sample-app   2/2     2            2           5m
sample-app   1/1     1            1           6m
sample-app   0/0     0            0           7m
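For scripted checks, watching interactively is less convenient than polling. This is a sketch of a polling loop, assuming the Deployment's spec.replicas field is what KEDA sets to 0; wait_for_scale_to_zero, the 180-second timeout, and the 5-second interval are hypothetical choices.

```shell
# Hypothetical helper: poll until the Deployment's desired replica count is 0.
wait_for_scale_to_zero() {
  local deployment="$1" namespace="$2" timeout="${3:-180}" interval="${4:-5}"
  local elapsed=0 replicas
  while [ "$elapsed" -le "$timeout" ]; do
    replicas="$(kubectl get deployment "$deployment" -n "$namespace" \
      -o jsonpath='{.spec.replicas}')"
    if [ "$replicas" = "0" ]; then
      echo "scaled to zero after ~${elapsed}s"
      return 0
    fi
    sleep "$interval"
    elapsed=$(( elapsed + interval ))
  done
  echo "still at ${replicas} replicas after ${timeout}s" >&2
  return 1
}

# Usage (requires access to the cluster):
#   wait_for_scale_to_zero sample-app demo
```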
To remove the sample application and all its resources:
kubectl delete namespace demo