Kubernetes setup guide

This tutorial shows how to run idliveface-server on a Kubernetes cluster. You will deploy idliveface-server as a load-balanced set of replicas that can scale to the needs of your users.

Objectives

  • Deploy idliveface-server to the cluster.
  • Expose idliveface-server to the internet.
  • Deploy a new version of idliveface-server.
  • Manage autoscaling for the deployment.
  • Clean up all changes.

Before you begin

You should have a functioning Kubernetes cluster with kubectl configured to talk to it. To verify, run kubectl get nodes. If everything is working, you will see output similar to this:

$ kubectl get nodes
NAME                                          STATUS   ROLES    AGE   VERSION
ip-172-20-34-235.us-west-1.compute.internal   Ready    master   78d   v1.16.10
ip-172-20-35-13.us-west-1.compute.internal    Ready    node     78d   v1.16.10
...

Deploying idliveface-server to the cluster

To get started, we will define a deployment named idliveface-server-deployment by writing a YAML file, a human-readable data serialization format that Kubernetes can read and interpret.

Our YAML file will define a Deployment object that launches and manages our application container. You can copy the following file, which we’ll call idliveface-server-deployment.yaml, to replicate this demonstration on your own cluster.
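
Here is a minimal sketch of the manifest, assuming the container is named idliveface-server (every other field follows the spec described below):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: idliveface-server-deployment
  labels:
    app: idliveface-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: idliveface-server
  template:
    metadata:
      labels:
        app: idliveface-server
    spec:
      containers:
        - name: idliveface-server  # container name is illustrative
          image: 367672406076.dkr.ecr.eu-central-1.amazonaws.com/facesdk/idface-server-prod:1.46.0
          ports:
            - containerPort: 8080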

Let’s take a closer look at what this file defines.

The YAML creates a Kubernetes Deployment object named idliveface-server-deployment, which uses the label app: idliveface-server throughout. The spec for the Deployment asks for a single replica, spawned from a Pod template that launches a container based on the 367672406076.dkr.ecr.eu-central-1.amazonaws.com/facesdk/idface-server-prod:1.46.0 image. The spec indicates that the container will listen on port 8080.

Once you’ve saved the file, you can apply it to deploy it to your cluster:

kubectl apply -f idliveface-server-deployment.yaml

Output:

deployment.apps/idliveface-server-deployment created

You can check the details of the deployed pod by typing:

kubectl get pods

Output:

NAME                                            READY   STATUS    RESTARTS   AGE
idliveface-server-deployment-5565574f89-stl8l   1/1     Running   0          14s

Exposing idliveface-server to the internet

Kubernetes Pods are designed to be ephemeral, spinning up or down based on scaling needs within your cluster. Each Pod is assigned its own IP address, and these IPs can only be reached from inside the cluster. When a Pod crashes, Kubernetes automatically replaces it, assigning a new Pod IP each time. As a result, we have to work with a dynamic set of Pod IP addresses.

Kubernetes Services solve this by 1) grouping a set of Pods behind one static IP address, reachable from any Pod inside the cluster, and 2) exposing that group outside the cluster to the internet when needed. Kubernetes also assigns a DNS hostname to that static IP.

The default Service type in Kubernetes is ClusterIP, where the Service gets an IP address reachable only from inside the cluster. To expose a Kubernetes Service outside the cluster, you create a Service of type LoadBalancer. This type of Service provisions an external load balancer for a set of Pods, reachable via the internet.

We will now expose the idliveface-server-deployment deployment to the internet using a Service of type LoadBalancer.

Use the kubectl expose command to generate a Kubernetes Service for the idliveface-server-deployment deployment.

kubectl expose deployment idliveface-server-deployment --name=idliveface-server-service --type=LoadBalancer --port=8080 --target-port=8080

Here, the --port flag specifies the port number configured on the Load Balancer, and the --target-port flag specifies the port number that the idliveface-server container is listening on.
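
For reference, the expose command is equivalent to applying a Service manifest like this sketch (kubectl expose derives the selector from the Deployment’s Pod labels):

apiVersion: v1
kind: Service
metadata:
  name: idliveface-server-service
spec:
  type: LoadBalancer
  selector:
    app: idliveface-server
  ports:
    - port: 8080        # port configured on the load balancer
      targetPort: 8080  # port the container listens on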

Run the following command to get the Service details for idliveface-server-service. It can take a minute or two for the cloud provider to provision the load balancer; until then, the EXTERNAL-IP column shows <pending>.

kubectl get service

Output:

NAME                        TYPE           CLUSTER-IP       EXTERNAL-IP                                                                PORT(S)          AGE
idliveface-server-service   LoadBalancer   100.68.250.228   ab5aab6fe6d3240b785923bd6d64a69f4-291836869.us-west-1.elb.amazonaws.com   8080:31502/TCP   45s

Now that the idliveface-server Pods are exposed to the internet via a Kubernetes Service, you can test the API using curl:

curl ab5aab6fe6d3240b785923bd6d64a69f4-291836869.us-west-1.elb.amazonaws.com:8080/api_version

{
  "product":"IDLive Face Server",
  "version":"1.46.0",
  "defaultPipeline":"iris",
  "availablePipelines":["iris"],
  "expirationDate":"2024-12-30T23:59:59Z"
}

Modifying the version of idliveface-server

Now that we have a deployment running on our Kubernetes cluster, we can manage and modify it as circumstances dictate. Kubernetes will take care of a lot of the automated management tasks, but there are still times when we want to influence the behavior of our applications.

To demonstrate this, we will update the idliveface-server version associated with our deployment. Because the application is already running within the cluster, editing the local YAML file we created earlier won’t change anything until it is re-applied. Instead, we will modify the spec as stored in the actual cluster.

We can edit existing objects with the kubectl edit command. The target for the command is the object type followed by the object name (separated by a space or a forward slash). For our example, we can edit our deployment’s spec by typing:

kubectl edit deploy idliveface-server-deployment

The deployment spec will open in the system’s default editor. Find the container image line and replace the version tag:

Original line:

image: 367672406076.dkr.ecr.eu-central-1.amazonaws.com/facesdk/idface-server-prod:1.46.0

Replaced with:

image: 367672406076.dkr.ecr.eu-central-1.amazonaws.com/facesdk/idface-server-prod:<version>

Once you save the file, Kubernetes will recognize the difference in the spec and begin to automatically update the Deployment within the cluster.
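
Alternatively, kubectl set image makes the same change without opening an editor. This assumes the container in the Pod template is named idliveface-server, as in the manifest sketch above:

kubectl set image deployment/idliveface-server-deployment idliveface-server=367672406076.dkr.ecr.eu-central-1.amazonaws.com/facesdk/idface-server-prod:<version>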

Scaling applications

Now that we’ve demonstrated how to update our applications by modifying the Deployment spec, we can discuss how to scale our containerized workload using Kubernetes’ built-in replication primitives.

We can modify the scale of our deployment with the kubectl scale command. To complete our request, we need to specify the desired number of replicas as well as the Kubernetes object we wish to target (in this case, our deploy/idliveface-server-deployment object).

To scale our Deployment from one replica up to two, we can type:

kubectl scale deployment idliveface-server-deployment --replicas=2

Output:

deployment.apps/idliveface-server-deployment scaled

We can check the progress of the scaling operation by asking for the details on our Deployment object:

kubectl get deploy idliveface-server-deployment

Output:

NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
idliveface-server-deployment   2/2     2            2           15m

Here, we can see that 2 out of 2 replicas are ready and operational. The output confirms that each replica is running the most up-to-date version of the spec and is able to serve traffic. The idliveface-server-service Service routes requests only to Pods that are Ready, so clients continue to be served as long as at least one replica is healthy. The application now demonstrates high availability.
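
You can confirm that the Service balances across both replicas by listing its endpoints (the Pod IPs shown here are illustrative):

kubectl get endpoints idliveface-server-service

Output:

NAME                        ENDPOINTS                         AGE
idliveface-server-service   100.96.1.5:8080,100.96.2.7:8080   20m

With multiple Ready endpoints behind the Service, we will now extend these replicas via autoscaling.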

Horizontal Pod Autoscaler

The Horizontal Pod Autoscaler automatically scales the number of Pods in a deployment based on observed CPU utilization. Other metrics can be used as well, given custom metrics support in the cluster.

The Horizontal Pod Autoscaler, like every API resource, is supported in a standard way by kubectl, and there is a dedicated kubectl autoscale command for creating one easily. For instance, the following command creates an autoscaler for the idliveface-server-deployment deployment that keeps between 2 and 5 replicas, scaling up when average CPU utilization across the Pods exceeds 80%. Note that CPU-based autoscaling requires the metrics-server to be installed in the cluster and a CPU resource request to be set on the container:

kubectl autoscale deployment idliveface-server-deployment --min=2 --max=5 --cpu-percent=80
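
For reference, this command is equivalent to applying a manifest like the following sketch; kubectl autoscale names the HPA after the target deployment:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: idliveface-server-deployment
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: idliveface-server-deployment
  minReplicas: 2
  maxReplicas: 5
  targetCPUUtilizationPercentage: 80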

Cleaning up Deployment and Service

We’ve created a deployment, updated it, and scaled it. Since this is not a real production workload, we should remove it from our cluster once we’re done to clean up after ourselves.

To remove the resources we’ve set up, we need to delete the Deployment and Service objects, along with the autoscaler we created. Kubernetes will automatically remove the child resources associated with them, such as the Pods and containers that the Deployment manages.

Delete the Deployment by typing:

kubectl delete deploy idliveface-server-deployment

Output:

deployment.apps "idliveface-server-deployment" deleted
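
Then delete the Service and the Horizontal Pod Autoscaler as well (recall that kubectl autoscale named the HPA after the deployment):

kubectl delete service idliveface-server-service
kubectl delete hpa idliveface-server-deployment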

You can double-check that the resources have been removed by getting the list of these resources in the default namespace:

kubectl get deploy idliveface-server-deployment
kubectl get pods
kubectl get service idliveface-server-service
kubectl get hpa idliveface-server-deployment

These commands should indicate that the Deployment, the Service, the autoscaler, and all of their associated resources are no longer running.