Create a High-Availability Kubernetes Cluster on AWS with Kops

This article has a bit more of a DevOps flavour than the previous ones, which focused more on Elixir. In this article I show how to easily run a multi-zone Kubernetes cluster on AWS, where we’ll deploy a Phoenix Chat application.

There are many ways to deploy a Kubernetes cluster on AWS (Amazon Web Services). At the moment, AWS offers EKS (Elastic Kubernetes Service), which helps us deploy and manage our Kubernetes clusters. It costs $0.20/hr, which is about $144/month: that’s actually not cheap, especially if we want to run a small cluster. And it’s not just about the cost: I still find EKS too young, and I prefer kops over it.

Kops (Kubernetes Operations) is a free, open-source tool that helps us easily deploy and manage a HA (High Availability) Kubernetes cluster on different cloud providers.

The provider we’ll focus on here is AWS. It’s really well supported by kops, which gives us the ability to easily integrate EC2 resources (volumes, load balancers…) into our Kubernetes cluster.

Once we’ve created an empty high-availability Kubernetes cluster on AWS, we will see how to deploy a simple nginx server connected to an ELB (Elastic Load Balancer) first, and a Phoenix Chat Example app later. We will also see why scaling out the chat app doesn’t work straight out of the box.

High-Availability Cluster

The goal of this article is to create a HA Kubernetes cluster, which means we want multiple Kubernetes masters and workers running across multiple zones.

High Availability Kubernetes cluster – 3 masters, 6 workers across 3 availability zones

In the example above, to make our cluster highly available, we spread the EC2 instances over multiple AZs (Availability Zones): us-east-1a, us-east-1d and us-east-1f.

Each Availability Zone runs on its own physically distinct, independent infrastructure, and is engineered to be highly reliable. Common points of failures like generators and cooling equipment are not shared across Availability Zones.

How isolated are Availability Zones from one another?

To have a HA cluster we need at least three masters (the servers that manage the whole Kubernetes cluster) and two workers, in three different availability zones. In this way, if one master or, even worse, a whole zone goes down, we still have the other two zones with two masters and their workers. If a worker (or a master) node fails, kops spawns a new EC2 instance to replace that node.

The advantage of availability zones is that they are close to each other and the latency between them is really low. This means that the communication between masters, and between containers running on worker nodes, is really fast. At the moment, the round-trip time I see pinging instances in the same zone (us-east-1a) is around 0.1ms, and between us-east-1a and us-east-1d I get almost the same time. This latency also depends on the kind of network your EC2 instances have.

Consider that, while traffic between instances within the same zone is free, traffic between different zones is charged at $0.01/GB. This price may seem low, but if you have a replicated database across multiple zones, with thousands of updates each minute, this traffic could end up being a noticeable part of your cluster cost (at $0.01/GB, 100GB a day of cross-zone traffic adds about $30/month).

AWS account and IAM role

Let’s now set up our AWS account, so we can create our cluster with the kops CLI.

We obviously need an AWS account. If you don’t have one yet, just go here and click on “Create a Free Account”.
If you are not used to AWS billing, be really careful about the resources you use and periodically check the billing page!

Once the account is ready, we need to create and configure our IAM user, creating the access key and the secret access key. If you don’t know how to manage an IAM user, take a look at these two pages: adding a user and access keys.

Once you’ve created the keys, set them up on your computer using the aws-cli. If you’ve never used the aws-cli, take a look at: Installing the AWS CLI and Configuring the AWS CLI. The AWS CLI installation is also briefly explained on the kops install page.
When you have the aws-cli installed, start the configuration and enter your access key and secret access key.

$ aws configure

AWS Access Key ID: YOUR_ACCESS_KEY
AWS Secret Access Key: YOUR_SECRET_ACCESS_KEY
Default region name [None]:
Default output format [None]:

Important: when configuring the IAM user, we need to attach the AdministratorAccess permissions policy. This way the kops command, running on your local computer, will be able to create all the resources it needs.

IAM user permission policy – AdministratorAccess
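If you prefer the command line over the console, a rough sketch of the same setup with the aws-cli could look like this (the user name kops is just an example):

$ aws iam create-user --user-name kops
$ aws iam attach-user-policy --user-name kops \
      --policy-arn arn:aws:iam::aws:policy/AdministratorAccess
$ aws iam create-access-key --user-name kops

The last command prints the access key and secret access key to pass to aws configure.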

To check that the credentials are set up correctly on our system, we can use the aws command to list the IAM users.

$ aws iam list-users
{
    "Users": [
        {
            "Path": "/",
            "UserName": "alvise",
            "UserId": ...,
            "Arn": ...
        }
    ]
}

Install kops and kubectl

kops is the tool we need to create the Kubernetes cluster on AWS. kubectl is the CLI we use to manage the cluster once it’s up and running.
For both Linux and Mac, the kops install page quickly shows how to install both the kops and kubectl tools.

If you have a Mac, my advice is to install both tools using Homebrew. It makes the installation and upgrade of these binaries really easy.
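On a Mac that boils down to something like:

$ brew install kops
$ brew install kubernetes-cli   # the Homebrew formula that provides kubectl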

For Windows users, I didn’t find prebuilt binaries, but it seems to be possible to compile the kops CLI on a Windows machine. Honestly, I would just use Docker on Windows to run both kops and kubectl: there are different Docker images with kops and kubectl on Docker Hub (dockerhub kops images).
To install kubectl natively on Windows using PowerShell, this seems to be an easy solution: Install with Powershell from PSGallery.
I don’t have an easy way to test these tools on Windows at the moment, so if you are a Windows user please leave a comment saying what worked best for you!

Real domain in Route53

UPDATE I’d like to thank Mark O’Connor (comments below), who made me aware that it’s now possible to use kops without a real domain. Instead of using a Route53 domain, we can create a cluster using a subdomain of k8s.local, like chat.k8s.local. A cluster will be created with a load balancer pointing to our masters.

If you are using Kops 1.6.2 or later, then DNS configuration is optional. Instead, a gossip-based cluster can be easily created. The only requirement to trigger this is to have the cluster name end with .k8s.local. If a gossip-based cluster is created then you can skip this section.

https://github.com/kubernetes/kops/blob/master/docs/aws.md#configure-dns
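Just for reference, a gossip-based cluster would be created with the same kops create cluster command we’ll see in detail below, simply using a name that ends with .k8s.local, roughly:

$ kops create cluster \
       --state "s3://state.chat.poeticoding.com" \
       --zones "us-east-1d,us-east-1f" \
       --name chat.k8s.local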

In our example I’ll continue to use a real domain in Route53, since the idea is to have our chat available on chat.poeticoding.com.

Kops needs a real domain and a valid zone set up in AWS Route53. I know, this can be a blocking step, especially if you just want to try kops on AWS. Unfortunately, there doesn’t seem to be a way around this. You can temporarily move a domain you own to Route53, or buy a cheap domain on the Route53 domain registration page.

I personally moved my poeticoding.com domain nameservers to Route53 some time ago. It was super easy: I just had to download the zone file from GoDaddy and import it into Route53, telling GoDaddy to use the Route53 nameservers.

AWS provides a handy documentation for this: Creating Records By Importing a Zone File. Remember, if you have any question or doubt about this process, please leave a comment at the bottom of this article, I’ll do my best to help you!

Now that we have our domain configured correctly in Route53 (poeticoding.com in my case), it should look something like this

poeticoding.com domain on Route53

S3 bucket to store the cluster state

This is the last step before being really ready to create our cluster! We just need to create an S3 bucket which kops will use to save the cluster’s state files. Since we are planning to deploy the Phoenix Chat example, I’ve named my bucket like a subdomain, state.chat.poeticoding.com, but you can call it whatever you want; it doesn’t have to be a domain name.

Creating the S3 bucket to save the Kubernetes cluster state
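If you prefer to create the bucket with the aws-cli instead of the console, a sketch:

$ aws s3api create-bucket --bucket state.chat.poeticoding.com --region us-east-1
$ aws s3api put-bucket-versioning \
      --bucket state.chat.poeticoding.com \
      --versioning-configuration Status=Enabled

Enabling versioning is optional, but it lets us recover previous versions of the cluster state.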

Creating the Kubernetes cluster

2 Availability Zones – 3 masters – 2 workers

Unlike the example at the beginning, where we had 3 masters and 6 workers over 3 availability zones, for the sake of simplicity we are now going to create a much smaller cluster using just two zones. This is ok for our test, but in a production cluster it’s not that great, since we could have issues with consensus/quorum. To have a properly HA cluster we should use at least 3 zones, with one master in each one.

$ kops create cluster \
       --state "s3://state.chat.poeticoding.com" \
       --zones "us-east-1d,us-east-1f"  \
       --master-count 3 \
       --master-size=t2.micro \
       --node-count 2 \
       --node-size=t2.micro \
       --name chat.poeticoding.com \
       --yes

With this single command kops knows everything about the cluster we want to build.

  • --state is the S3 bucket, where kops stores the state files
  • --zones we specify two availability zones in the same region, us-east-1d and us-east-1f
  • --master-count the number of masters must be odd (1,3,5…), so if we want to have a HA cluster we need at least 3 masters. Since for simplicity we’ve chosen to use just two AZs, one of the two zones will have two masters.
  • --master-size this is the type of EC2 Instance for the master servers. For a medium size cluster I usually use C4/C5.large masters, but for this example t2.micro works well. You find t2 instances pricing here.
  • --node-count and --node-size in this example we just need two nodes, which in this case are two t2.micro instances.
  • --name the name of our cluster, which is also a real subdomain that will be created on Route53.

The nodes are the Kubernetes workers, the servers where we run our containers. Usually these servers are much bigger than the masters, since that’s where most of the load is located.

If you run the command without --yes, kops prints the full list of actions it is going to perform on your AWS account: creation of IAM roles, security groups, volumes, EC2 instances, etc.
It’s usually good practice to take a look at what kops is going to do before running the command with the --yes option.

$ kops create cluster ... --yes

Inferred --cloud=aws from zone "us-east-1d"
Running with masters in the same AZs; redundancy will be reduced
Assigned CIDR 172.20.32.0/19 to subnet us-east-1d
Assigned CIDR 172.20.64.0/19 to subnet us-east-1f
Using SSH public key: /Users/alvise/.ssh/id_rsa.pub
...
Tasks: 83 done / 83 total; 0 can run
Pre-creating DNS records
Exporting kubecfg for cluster
kops has set your kubectl context to chat.poeticoding.com

Cluster is starting.  It should be ready in a few minutes.

Just wait a few minutes and the cluster should be up and running. We can use the validate command to check the state of the cluster creation.

$ kops validate cluster \
       --state "s3://state.chat.poeticoding.com" \
       --name chat.poeticoding.com
kops validate cluster – cluster ready
AWS Console – EC2 instances

Kops exports the Kubernetes configuration for us, so the cluster should be accessible right away with kubectl.
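If you ever need to regenerate that configuration (for example on another machine), kops can export it again; something like this should do it:

$ kops export kubecfg \
       --state "s3://state.chat.poeticoding.com" \
       --name chat.poeticoding.com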

$ kubectl get nodes
NAME                            STATUS    ROLES     AGE       VERSION
ip-172-20-33-199.ec2.internal   Ready     master    11m       v1.11.6
ip-172-20-49-249.ec2.internal   Ready     node      10m       v1.11.6
ip-172-20-59-126.ec2.internal   Ready     master    11m       v1.11.6
ip-172-20-71-37.ec2.internal    Ready     master    11m       v1.11.6
ip-172-20-88-143.ec2.internal   Ready     node      10m       v1.11.6

We also see how kops creates a VPC (Virtual Private Cloud) for our cluster and adds new DNS records in our Route53 zone.

AWS Virtual Private Cloud
New DNS records

Kubernetes API and Security Group

The Kubernetes API is exposed on the internet by default. After all, it’s the only way we can easily connect to our cluster (without using VPN connections to our VPC). Honestly, I don’t like the idea of exposing the API to the world, especially after the bug of last December. If we have a static IP in our office or home, we can set a firewall rule to allow only our IP. When we have a dynamic IP, we can open the external access only when we need it. This can be a bit tedious, since every time we want to access our cluster we need to go to the AWS console and temporarily change the firewall rules. We can create a script to change the firewall using the aws-cli.
These firewall rules are handled by the masters’ security group.
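As a rough sketch, such a script could use the aws-cli like this. It assumes the masters’ security group created by kops is named masters.chat.poeticoding.com, that the API listens on port 443, and that the default 0.0.0.0/0 rule has already been removed:

# find the security group of the masters
SG_ID=$(aws ec2 describe-security-groups \
          --filters Name=group-name,Values=masters.chat.poeticoding.com \
          --query "SecurityGroups[0].GroupId" --output text)

# our current public IP
MY_IP=$(curl -s https://checkip.amazonaws.com)

# temporarily allow our IP to reach the Kubernetes API
aws ec2 authorize-security-group-ingress \
    --group-id "$SG_ID" --protocol tcp --port 443 --cidr "$MY_IP/32"

# revoke the rule when we are done
aws ec2 revoke-security-group-ingress \
    --group-id "$SG_ID" --protocol tcp --port 443 --cidr "$MY_IP/32"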

Kubernetes API HTTP port open by default
Restricting the API access

UPDATE Another way to update the security group (thanks again to Mark O’Connor for making me aware of this option) is to use the kops cluster configuration. We can restrict and control access to the API by editing and updating the cluster configuration like so

kops edit cluster \
     --state "s3://state.chat.poeticoding.com"

A vim session is started where we can change some settings like kubernetesApiAccess, which by default is 0.0.0.0/0 (all IPs). To confirm the change we then need to update the cluster.

kops update cluster  \
     --state "s3://state.chat.poeticoding.com" \
     --yes
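For reference, the relevant part of the cluster spec could look something like this (203.0.113.10 is just a placeholder for your own IP):

# edited via "kops edit cluster"
spec:
  kubernetesApiAccess:
  - 203.0.113.10/32   # only this IP can reach the Kubernetes API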

Deploy an Nginx server

It’s now time to use the kubectl command and deploy a simple Nginx server. First of all, let’s check that the command works and the cluster configuration was imported correctly. We can list our nodes with the kubectl get nodes command

$ kubectl get nodes
NAME                            STATUS    ROLES     AGE       VERSION
ip-172-20-33-199.ec2.internal   Ready     master    11m       v1.11.6
ip-172-20-49-249.ec2.internal   Ready     node      10m       v1.11.6
ip-172-20-59-126.ec2.internal   Ready     master    11m       v1.11.6
ip-172-20-71-37.ec2.internal    Ready     master    11m       v1.11.6
ip-172-20-88-143.ec2.internal   Ready     node      10m       v1.11.6

To deploy an Nginx server, we need to create a Kubernetes deployment. By adding multiple replicas of the pod, they will run on different nodes, spreading the load across different workers.

# nginx_deploy.yaml
kind: Deployment
apiVersion: apps/v1
metadata:
  name: nginx
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx

  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.15
        ports:
        - containerPort: 80

This is a really simple deployment. We ask Kubernetes to run one single pod with an nginx container, exposing container port 80.

$ kubectl create -f nginx_deploy.yaml
deployment.apps "nginx" created

$ kubectl get pod
NAME                   READY     STATUS    RESTARTS   AGE
nginx-c9bd9bc4-jqvb5   1/1       Running   0          1m

Perfect, our pod is running. We now need a way to access it. We can use a load balancer.

# nginx_svc.yaml
kind: Service
apiVersion: v1

metadata:
  name: nginx-elb
  namespace: default
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"

spec:
  type: LoadBalancer
  selector:
    app: nginx
  ports:
    - name: http
      port: 80
      targetPort: 80

A great thing about the Kubernetes integration with AWS is that we can manage the cloud resources directly from Kubernetes configuration files. In nginx_svc.yaml we define a LoadBalancer service that redirects its port 80 traffic to port 80 of the Nginx pod.
We can use annotations to set what type of load balancer we want (in this case a Network Load Balancer), SSL certificates etc. You can find the full list of service annotations here.

Network Load Balancer
$ kubectl create -f nginx_svc.yaml
service "nginx-elb" created

$ kubectl describe svc nginx-elb
Name:                     nginx-elb
...
LoadBalancer Ingress:     a41626d3d169811e995970e07eeed2b2-243343502.us-east-1.elb.amazonaws.com
Port:                     http  80/TCP
TargetPort:               80/TCP
NodePort:                 http  31225/TCP
...

Once the load balancer is created, we can see its details using the describe command. All the resources are also visible on the AWS console.

AWS Console – Load Balancer

In the description of the load balancer service, we see the LoadBalancer Ingress property, which is the DNS name we’ll use to connect to our web service. Usually we don’t use it directly; instead we create a CNAME record with a readable domain (like chat.poeticoding.com) which points to the load balancer DNS name.
The load balancer exposes port 80 and redirects this traffic to the Kubernetes node port 31225. The node then redirects the traffic to the nginx container.
To test if it works we just need to use the LoadBalancer Ingress address.
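For example, from the terminal, using the address shown in the describe output above:

$ curl http://a41626d3d169811e995970e07eeed2b2-243343502.us-east-1.elb.amazonaws.com

We should get back the default “Welcome to nginx!” page.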

Nginx index page

Great, it works!
If it doesn’t work for you, try waiting a few minutes to let the load balancer DNS propagate.

Before moving to the next step, let’s remove both nginx pod and load balancer.

$ kubectl delete svc nginx-elb
service "nginx-elb" deleted
$ kubectl delete deploy nginx

Deploy the Phoenix Chat

Since we are going to use a ready-made image, the deployment of our Phoenix chat will be really similar to what we did with Nginx.
I’ve prepared an image you can find on DockerHub, alvises/phoenix-chat-example. You can also find the full code on GitHub: poeticoding/phoenix_chat_example

kind: Deployment
apiVersion: apps/v1
metadata:
  name: chat
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: chat

  template:
    metadata:
      labels:
        app: chat
    spec:
      containers:
      - name: phoenix-chat
        image: alvises/phoenix-chat-example:1_kops_chat
        ports:
        - containerPort: 4000
        env:
        - name: PORT
          value: "4000"
        - name: PHOENIX_CHAT_HOST
          value: "chat.poeticoding.com"

The configuration of this deployment is pretty similar to the previous one. We’ve added two environment variables to configure the app:

  • PORT to set the phoenix app port to 4000
  • PHOENIX_CHAT_HOST to let Phoenix know in which domain the chat is hosted, in this case "chat.poeticoding.com"

The load balancer configuration is also very similar. We use target port 4000 in this case.

kind: Service
apiVersion: v1

metadata:
  name: chat-elb
  namespace: default
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"

spec:
  type: LoadBalancer
  selector:
    app: chat
  ports:
    - name: http
      port: 80
      targetPort: 4000

In a few minutes you’ll see the pod running, and the load balancer up with its DNS name.

$ kubectl get pod
NAME                   READY     STATUS    RESTARTS   AGE
chat-b4d7d4b98-vxckn   1/1       Running   0          3m

$ kubectl get svc
NAME         TYPE           CLUSTER-IP      EXTERNAL-IP        PORT(S)        AGE
chat-elb     LoadBalancer   100.66.10.231   a28419b91169b...   80:31181/TCP   3m

Instead of using the load balancer’s DNS name directly as we did before, let’s manually add a human-readable record in our zone.

This step can also be automated using external-dns, a tool that updates the Route53 records according to the service annotations.
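If you prefer the aws-cli over the console for this step too, a rough sketch (the hosted zone id Z1234567890ABC is hypothetical, and the record value is the load balancer DNS name shown by kubectl get svc):

$ aws route53 change-resource-record-sets \
       --hosted-zone-id Z1234567890ABC \
       --change-batch '{
         "Changes": [{
           "Action": "UPSERT",
           "ResourceRecordSet": {
             "Name": "chat.poeticoding.com",
             "Type": "CNAME",
             "TTL": 300,
             "ResourceRecords": [{"Value": "<load balancer DNS name>"}]
           }
         }]
       }'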

Route53 CNAME

It’s now time to chat! Let’s open two browsers and connect to the Phoenix chat.

Working Phoenix Chat

Each browser opens a WebSocket connection to send and receive messages. With one single container this works pretty well. All the traffic is redirected to just one Phoenix Chat server.

Two Browsers – 1 Container

Multiple Chat replicas

The cluster is not fully utilised: we have just one chat pod/container running on one node. What happens if we try to scale out, adding another replica?

$ kubectl scale --replicas=2 deploy/chat
Two replicas – Out of sync
Two Phoenix Chat Containers
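To check that the two replicas actually landed on different worker nodes, we can ask kubectl to show the node of each pod:

$ kubectl get pod -l app=chat -o wide

The NODE column shows where each replica is running.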

Since the load balancer uses round-robin to distribute the connections between the containers, we see that the first browser connects to the chat container on node 1 and the second browser to the chat container on node 2.

With this simple configuration, the two phoenix servers don’t talk to each other, so they act like two separate servers running different chat rooms. We’ll see in future articles how to deal with these situations, especially on a Kubernetes cluster.

In the Distributed Phoenix Chat using Redis PubSub, we see a way of solving this issue.

Destroy the cluster

It’s now time to destroy our cluster and free the AWS resources. We do it using the kops CLI with the delete cluster subcommand.

$ kops delete cluster \
       --state "s3://state.chat.poeticoding.com" \
       --name chat.poeticoding.com \
       --yes
...
Deleted kubectl config for chat.poeticoding.com
Deleted cluster: "chat.poeticoding.com"
kops delete cluster – EC2 terminated instances

As we did before, we need to confirm the action with the --yes option.
The deletion process can take a few minutes. When the cluster is deleted, we see that the EC2 instances are terminated, and the volumes, load balancer and VPC are also deleted.
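One thing kops does not remove is the S3 state bucket. If you don’t plan to recreate the cluster, you can delete it too:

$ aws s3 rb "s3://state.chat.poeticoding.com" --force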