Kubernetes setup guide
This tutorial shows how to run idliveface-server on a Kubernetes cluster. You will deploy idliveface-server as a load-balanced set of replicas that can scale to the needs of your users.
Objectives
- Deploy idliveface-server to the cluster.
- Expose idliveface-server to the internet.
- Deploy a new version of idliveface-server.
- Manage autoscaling for the deployment.
- Clean up all changes.
Before you begin
You should have a functioning Kubernetes cluster and a kubectl that is configured to talk to it. To check, run kubectl get nodes. If everything is OK, you will see something like this:
$ kubectl get nodes
NAME                                          STATUS   ROLES    AGE   VERSION
ip-172-20-34-235.us-west-1.compute.internal   Ready    master   78d   v1.16.10
ip-172-20-35-13.us-west-1.compute.internal    Ready    node     78d   v1.16.10
...
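When this check runs in a script rather than by eye, it helps to fail fast if any node is not ready. The following is a minimal sketch, assuming a POSIX shell with awk available; all_nodes_ready is a hypothetical helper name, not a kubectl command.

```shell
# Succeeds only if kubectl lists at least one node and every node's
# STATUS column is exactly "Ready". all_nodes_ready is a hypothetical helper.
all_nodes_ready() {
  kubectl get nodes --no-headers |
    awk '$2 != "Ready" { print "not ready: " $1; bad = 1 }
         END { exit (bad || NR == 0) }'
}
```

Call it as all_nodes_ready && echo "cluster is ready". The comparison is deliberately strict: a cordoned node, which kubectl reports as Ready,SchedulingDisabled, is treated as not ready.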
Deploying idliveface-server to the cluster
To get started, we will define a Deployment named idliveface-server-deployment by writing a YAML file. YAML is a human-readable data serialization format that Kubernetes can read and interpret.
Our YAML file will define a Deployment object that launches and manages our application container. You can copy the following file, which we’ll call idliveface-server-deployment.yaml, to replicate this demonstration on your own cluster.
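A minimal sketch of such a manifest, written to disk with a shell heredoc. The Deployment name is taken from the kubectl output shown later in this guide, and the label, replica count, image, and port come from the description that follows. The container name and the apps/v1 API version are assumptions.

```shell
# Write the Deployment manifest to idliveface-server-deployment.yaml.
# The quoted 'EOF' keeps the shell from expanding anything inside the file.
cat > idliveface-server-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: idliveface-server-deployment
  labels:
    app: idliveface-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: idliveface-server
  template:
    metadata:
      labels:
        app: idliveface-server
    spec:
      containers:
        - name: idliveface-server   # container name is an assumption
          image: 367672406076.dkr.ecr.eu-central-1.amazonaws.com/facesdk/idface-server-prod:1.46.0
          ports:
            - containerPort: 8080
EOF
```

The selector and the Pod template labels must match, otherwise the API server rejects the Deployment.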
Let’s take a closer look at what this file defines.
The YAML creates a Kubernetes Deployment object named idliveface-server-deployment, which uses the label app: idliveface-server throughout. The spec for the Deployment asks for a single replica. This replica is spawned from a Pod template that launches a container based on the 367672406076.dkr.ecr.eu-central-1.amazonaws.com/facesdk/idface-server-prod:1.46.0 image. The spec indicates that the container will listen on port 8080.
Once you’ve saved the file, apply it to your cluster:
kubectl apply -f idliveface-server-deployment.yaml
Output:
deployment.apps/idliveface-server-deployment created
You can check the details of the deployed pod by typing:
kubectl get pods
Output:
NAME                                            READY   STATUS    RESTARTS   AGE
idliveface-server-deployment-5565574f89-stl8l   1/1     Running   0          14s
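Rather than polling kubectl get pods by hand, a script can block until the Deployment reports that it is available. The following is a sketch using kubectl wait; the 120-second timeout and the helper name wait_for_server are arbitrary assumptions.

```shell
# Block until the Deployment's Available condition is true, then list its Pods.
# The label selector relies on the app: idliveface-server label on the Pods.
wait_for_server() {
  kubectl wait --for=condition=Available \
    deployment/idliveface-server-deployment --timeout=120s &&
    kubectl get pods -l app=idliveface-server
}
```

If the timeout expires first, kubectl wait exits with a non-zero status and the Pod listing is skipped.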
Exposing the app idliveface-server to the internet
Kubernetes Pods are designed to be ephemeral, spinning up or down based on scaling needs within your cluster. Each Pod has its own IP address, and these IPs can only be reached from inside your cluster. When a Pod fails and Kubernetes replaces it, the new Pod receives a new IP address, so clients would have to track a constantly changing set of Pod IPs.
Kubernetes Services solve this in two ways: they 1) group those Pods behind one static IP address, reachable from any Pod inside the cluster, to which Kubernetes also assigns a DNS hostname, and 2) can expose the Pod group outside the cluster to the internet.
The default Service type in Kubernetes is called ClusterIP, where the Service gets an IP address reachable only from inside the cluster. To expose a Kubernetes Service outside the cluster, you will create a Service of type LoadBalancer. This type of Service asks the cluster's cloud provider to provision an external load balancer for a set of Pods, with an address that is reachable via the internet.
We will now expose the idliveface-server-deployment deployment to the internet using a Service of type LoadBalancer.
Use the kubectl expose command to generate a Kubernetes Service for the idliveface-server-deployment deployment:
kubectl expose deployment idliveface-server-deployment --name=idliveface-server-service --type=LoadBalancer --port 8080 --target-port 8080
Here, the --port flag specifies the port that the load balancer listens on, and the --target-port flag specifies the port that the idliveface-server container is listening on.
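If you prefer to keep the Service in version control next to the Deployment, the same Service can be described declaratively. The following is a sketch of a roughly equivalent manifest. The file name is an assumption, and the selector assumes the Pods carry the app: idliveface-server label described earlier.

```shell
# Write a Service manifest that mirrors the kubectl expose command above.
cat > idliveface-server-service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: idliveface-server-service
spec:
  type: LoadBalancer
  selector:
    app: idliveface-server      # must match the Pod template labels
  ports:
    - protocol: TCP
      port: 8080                # --port: port exposed by the load balancer
      targetPort: 8080          # --target-port: port the container listens on
EOF
```

Apply it with kubectl apply -f idliveface-server-service.yaml instead of running kubectl expose. Running both would attempt to create the same Service twice.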
Run the following command to get the Service details for idliveface-server-service.
kubectl get service
Output:
NAME                        TYPE           CLUSTER-IP       EXTERNAL-IP                                                               PORT(S)          AGE
idliveface-server-service   LoadBalancer   100.68.250.228   ab5aab6fe6d3240b785923bd6d64a69f4-291836869.us-west-1.elb.amazonaws.com   8080:31502/TCP   78d
Now that the idliveface-server pods are exposed to the internet via a Kubernetes Service, you can check the server using curl:
curl ab5aab6fe6d3240b785923bd6d64a69f4-291836869.us-west-1.elb.amazonaws.com:8080/api_version
{
"product":"IDLive Face Server",
"version":"1.46.0",
"defaultPipeline":"iris",
"availablePipelines":["iris"],
"expirationDate":"2024-12-30T23:59:59Z"
}
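The hostname does not have to be copied by hand: it can be read from the Service status. The following is a sketch under the assumption that the load balancer publishes a hostname, as in the AWS output above; some providers fill in an ip field instead. The helper names are hypothetical.

```shell
# Print the external hostname assigned to the Service's load balancer.
# Empty output means the load balancer has not been provisioned yet.
service_hostname() {
  kubectl get service idliveface-server-service \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
}

# Query the server's /api_version endpoint through the load balancer.
check_api_version() {
  host=$(service_hostname)
  [ -n "$host" ] || { echo "no external hostname yet" >&2; return 1; }
  curl --fail --silent --show-error "http://${host}:8080/api_version"
}
```

A new load balancer can take a few minutes to become resolvable, so a failing first call to check_api_version is not necessarily a sign of a broken deployment.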
Modifying the version of idliveface-server
Now that we have a deployment running on our Kubernetes cluster, we can manage and modify it as circumstances dictate. Kubernetes will take care of a lot of the automated management tasks, but there are still times when we want to influence the behavior of our applications.
To demonstrate this, we will update the idliveface-server version associated with our deployment. Because the application is already running within the cluster, editing the deployment YAML file we created earlier does not by itself change anything: the change has to reach the spec stored in the cluster. Here we modify that stored spec directly.
We can edit existing objects with the kubectl edit command. The target for the command is the object type and the object name, separated by a space or a forward slash. For our example, we can edit our deployment’s spec by typing:
kubectl edit deploy idliveface-server-deployment
The deployment spec will open in the system’s default editor. Modify the following line:
Original line: image: 367672406076.dkr.ecr.eu-central-1.amazonaws.com/facesdk/idface-server-prod:1.46.0
Replaced with: image: 367672406076.dkr.ecr.eu-central-1.amazonaws.com/facesdk/idface-server-prod:<version>
Once you save the file, Kubernetes will recognize the difference in the spec and begin to automatically update the Deployment within the cluster.
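The same change can be made without an interactive editor, which is convenient in scripts. The following is a sketch using kubectl set image and kubectl rollout; update_server_image is a hypothetical helper, the 180-second timeout is arbitrary, and the '*' wildcard assumes the Pod template contains only the idliveface-server container.

```shell
# Set a new image tag on every container in the Pod template, wait for the
# rollout to finish, and roll back if it does not complete in time.
# Usage: update_server_image <version>
update_server_image() {
  tag=$1
  repo=367672406076.dkr.ecr.eu-central-1.amazonaws.com/facesdk/idface-server-prod
  kubectl set image deployment/idliveface-server-deployment "*=${repo}:${tag}" ||
    return 1
  if ! kubectl rollout status deployment/idliveface-server-deployment --timeout=180s; then
    # The new Pods never became ready: return to the previous revision.
    kubectl rollout undo deployment/idliveface-server-deployment
    return 1
  fi
}
```

The rollback only runs after the image has actually been changed, so a failed kubectl set image leaves the Deployment untouched.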
Scaling applications
Now that we’ve demonstrated how to update our applications by modifying the Deployment spec, we can discuss how to scale our containerized workload using Kubernetes’ built-in replication primitives.
We can modify the scale of our deployment with the kubectl scale command. To complete our request, we need to specify the number of replicas we desire as well as the Kubernetes object we wish to target (in this case, our idliveface-server-deployment Deployment).
To scale our Deployment from one replica up to two, we can type:
kubectl scale deployment idliveface-server-deployment --replicas=2
Output:
deployment.apps/idliveface-server-deployment scaled
We can check the progress of the scaling operation by asking for the details on our Deployment object:
kubectl get deploy idliveface-server-deployment
Output:
NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
idliveface-server-deployment   2/2     2            2           15m
Here, we can see that 2 out of 2 replicas are ready and operational. The output confirms that each of these replicas is running the most up-to-date version of the spec and that each is capable of serving traffic. The idliveface-server-service Service routes client requests to any Pod in the Ready state, so with two replicas the application stays available if one Pod fails. Next, we will let Kubernetes adjust the number of replicas automatically.
Horizontal Pod Autoscaler
The Horizontal Pod Autoscaler automatically scales the number of Pods in a Deployment based on observed CPU utilization. With custom metrics support, it can also scale on other metrics provided by the application.
Horizontal Pod Autoscaler, like every API resource, is supported in a standard way by kubectl. There is also a dedicated kubectl autoscale command for easy creation of a Horizontal Pod Autoscaler. For instance, the following command creates an autoscaler for the idliveface-server-deployment Deployment with a target CPU utilization of 80% and between 2 and 5 replicas. The autoscaler keeps at least 2 Pods running and adds Pods, up to a maximum of 5, when average CPU utilization across the Pods rises above 80%:
kubectl autoscale deployment idliveface-server-deployment --min=2 --max=5 --cpu-percent=80
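The autoscaler can also be written as a manifest. The following is a sketch of a roughly equivalent object using the autoscaling/v1 API. The file name is an assumption, and the object name follows the kubectl autoscale default of reusing the Deployment name. Note that CPU utilization is measured relative to the CPU requests of the containers, so the Pod template needs a resources.requests.cpu value and the cluster needs a metrics source such as Metrics Server. This guide does not show either, so treat both as prerequisites to verify.

```shell
# Write a HorizontalPodAutoscaler manifest matching the kubectl autoscale flags.
cat > idliveface-server-hpa.yaml <<'EOF'
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: idliveface-server-deployment
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: idliveface-server-deployment
  minReplicas: 2                        # --min
  maxReplicas: 5                        # --max
  targetCPUUtilizationPercentage: 80    # --cpu-percent
EOF
```

After applying it with kubectl apply -f idliveface-server-hpa.yaml, kubectl get hpa shows the current and target utilization. A target reported as unknown usually means that metrics or CPU requests are missing.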
Cleaning up Deployment and Service
We’ve created a deployment, updated it, and scaled it. Since this is not a real production workload, we should remove it from our cluster once we’re done to clean up after ourselves.
To remove the resources we’ve set up, we only need to delete the Deployment and Service objects, plus the Horizontal Pod Autoscaler if you created one. Kubernetes will automatically remove the child resources associated with them, like the Pods and containers that the Deployment manages.
Delete the Deployment by typing:
kubectl delete deploy idliveface-server-deployment
Output:
deployment.apps "idliveface-server-deployment" deleted
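The Service and the autoscaler are separate objects and are not removed together with the Deployment. The following is a sketch of removing them, assuming the Service name passed to kubectl expose and an autoscaler named after the Deployment, which is the kubectl autoscale default. cleanup_server is a hypothetical helper name.

```shell
# Remove the remaining objects created in this guide.
# --ignore-not-found keeps a delete from failing if the object is already gone.
cleanup_server() {
  kubectl delete service idliveface-server-service --ignore-not-found
  kubectl delete hpa idliveface-server-deployment --ignore-not-found
}
```

Deleting the LoadBalancer Service also releases the external load balancer that the cloud provider created for it.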
You can double-check that the resources have been removed by getting the list of these resources in the default namespace:
kubectl get deploy idliveface-server-deployment
kubectl get pods
kubectl get service idliveface-server-service
These commands should indicate that the Deployment, the Service, and all of their associated resources no longer exist.