Our first post introduced Terraform, using Tack to automate the creation of a Kubernetes cluster on AWS. But Tack certainly isn’t the only way you can do this. In this post, we’ll step you through how to use kops to create and manage your AWS Kubernetes cluster. Kops, in its own words, is ‘kubectl for clusters’. If you’re familiar with Kubernetes, you’ll know that kubectl is the command-line tool you use to interact with Kubernetes.

These instructions should work on either macOS or Linux. Windows users might need to do a bit of googling.

Prerequisites

Before you get started you’ll need to install a couple of the Kubernetes CLI tools. You’ll also need to ensure the AWS CLI is set up on your machine.

  • Kops: This is the main control tool for kops and can be downloaded here. Choose a release that matches the version of Kubernetes you want to install, e.g. if you want to install a 1.7.x version of Kubernetes, download the 1.7.0 version of kops.
  • Kubectl: This is the main CLI tool for controlling Kubernetes clusters. You’ll find the binary here. Choose the binary with the same version as the Kubernetes cluster you want to install. You can also download it directly from https://storage.googleapis.com/kubernetes-release/release/v$KUBEVER/bin/$ARCH/amd64/kubectl, where $KUBEVER is your desired version and $ARCH is either darwin or linux. A short install sketch follows this list.
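
As a rough install sketch (assuming a Linux workstation; swap linux for darwin on macOS, and note that the kops-linux-amd64 asset name is our assumption based on the kops releases page, so check it against the release you actually download):

# Fetch the kubectl binary using the URL pattern above (v1.7.2 for a 1.7.x cluster)
curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.7.2/bin/linux/amd64/kubectl

# Make both binaries executable and move them onto your PATH
chmod +x kops-linux-amd64 kubectl
sudo mv kops-linux-amd64 /usr/local/bin/kops
sudo mv kubectl /usr/local/bin/kubectl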

Now that you have your prerequisites sorted you can get started configuring your cluster. The first step is to export the values that kops will use to build your cluster as environment variables:

AWS credentials

First, you’ll need to tell kops which AWS credentials to use. If you don’t export these, kops will fall back to the [default] credentials in your ~/.aws/credentials file. Replace the XXXXXs with your actual key and secret:

export AWS_ACCESS_KEY_ID=XXXXXXXXXXXXXXX
export AWS_SECRET_ACCESS_KEY=XXXXXXXXXXXXXXXXXXXXX

You’ll also need to export the name of your cluster. This will be used in several different configuration points, including DNS:

export NAME=kops-cluster-a.connect.cd

Store for kops state

You’ll need an S3 bucket to keep the current state of your kops cluster in; this bucket is the source of truth for your cluster’s configuration at all times. You can create the bucket with the following command (assuming you have the correct rights in AWS):

aws s3api create-bucket --bucket $NAME-state-store --acl private

You can add any additional policies that suit your requirements, but those options won’t be covered here.
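
One optional extra worth considering is enabling versioning on the state bucket, so you can recover an earlier cluster configuration if the state is ever changed by mistake (a suggestion rather than a requirement):

aws s3api put-bucket-versioning --bucket $NAME-state-store --versioning-configuration Status=Enabled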

Then, you simply need to export the bucket name as an environment variable so kops knows which store to use:

export KOPS_STATE_STORE=s3://$NAME-state-store

Availability Zones for your cluster to live in

This determines which AWS availability zones your cluster will live in. The master nodes in a kops cluster run both the Kubernetes control plane components and the etcd distributed key-value store. As per the kops FAQ, it makes sense to use 3 zones to provide robustness and high availability, and to avoid split-brain in your etcd cluster in the unlikely event of an AWS availability zone failure. In this example cluster, we spread both the masters and the nodes across 3 different availability zones. You can change the zones specified below if you want to run your cluster in a different region.

export ZONES="us-east-1a,us-east-1b,us-east-1c"
export REGION=$(echo $ZONES | awk -F, '{ print $1 }' | sed 's/-/_/g' | sed 's/.$//')
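
Note that the pipeline above deliberately leaves REGION in underscore form, because it’s used as an unquoted jq key in the next step. A quick check of both variables should show:

$ echo $ZONES $REGION
us-east-1a,us-east-1b,us-east-1c us_east_1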

CoreOS image to use

Kops supports many different operating systems (RHEL, CentOS, Ubuntu, Debian, etc.); however, we’re big fans of Container Linux from CoreOS. This command will find the latest stable CoreOS AMI available in your AWS region:

export IMAGE=$(curl -s https://coreos.com/dist/aws/aws-stable.json|sed 's/-/_/g'|jq '.'$REGION'.hvm'|sed 's/_/-/g' | sed 's/\"//g')
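
You can sanity check the result; it should be a plain AMI ID (the exact ID will differ over time as CoreOS publishes new stable releases, so yours will likely not match this one):

$ echo $IMAGE
ami-e2d33d98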

Generate the ssh keypair to use

By default kops will use ~/.ssh/id_rsa.pub as the public key that’s allowed to log in to the nodes within the cluster. This isn’t always ideal, so you can instead generate a new keypair and tell kops to allow login using it:

ssh-keygen -t rsa -f $NAME.key -N ''
export PUBKEY="$NAME.key.pub"

Choose your kubernetes version

Kops will ensure that this is the version of Kubernetes that gets deployed, so it must be a valid release. We’ll deliberately choose an older version, 1.7.2, so that we can demonstrate an upgrade later on:

export KUBEVER="1.7.2"

You can test that your version is valid by downloading the same version of kubectl from the URL below (where ARCH=linux or ARCH=darwin):

https://storage.googleapis.com/kubernetes-release/release/v$KUBEVER/bin/$ARCH/amd64/kubectl, e.g. https://storage.googleapis.com/kubernetes-release/release/v1.7.2/bin/linux/amd64/kubectl
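
Since you installed a matching kubectl in the prerequisites, a quick way to confirm the versions line up is to ask it for its client version:

$ kubectl version --client
Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.2", ...}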

DNS Zones

If you have more than one DNS zone within your AWS account, it’s best to find the correct zone you’d like to use and then add its zone ID to the creation command, e.g.:

--dns-zone=Z266PQZ112373 \
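
If you’re not sure of your zone ID, you can look it up with the AWS CLI, for example:

aws route53 list-hosted-zones --query 'HostedZones[].{Name:Name,Id:Id}' --output table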

Cluster Creation

Now you’re ready to use kops create cluster to create your cluster. There are quite a few parameters passed to the command, so let’s go over a few of them. We’re using flannel for our Container Network Interface (CNI) layer, but you can use any of the network providers that Kubernetes supports. You can change the --node-count if you want more nodes, and you can change the instance types used for your masters and nodes by passing --master-size and --node-size respectively. We’re keeping things small below so we don’t blow out any budgets! We’ve also specified --authorization RBAC so that our cluster has Role-Based Access Control, which is a must for any enterprise-grade cluster. Finally, you should create a bastion jump box so that you can easily and securely get into and out of your cluster; we’ve done that with the --bastion argument.

This command will create your new cluster based on the choices you have made above. If you need to use a custom DNS zone, remember to add the --dns-zone argument and parameter.

kops create cluster --topology private \
--zones $ZONES \
--master-zones $ZONES \
--networking flannel \
--node-count 2 \
--master-size t2.small \
--node-size t2.medium \
--image $IMAGE \
--kubernetes-version $KUBEVER \
--api-loadbalancer-type public \
--admin-access 0.0.0.0/0 \
--authorization RBAC \
--ssh-public-key $PUBKEY \
--cloud aws \
--bastion \
--name ${NAME} \
--yes

Sit back and give kops a few minutes to create your cluster and you should be good to go! There are a number of ways you can test and validate your cluster. First off, let’s get kops to validate the cluster:

kops validate cluster

This will give you a basic overview at the machine level of what your cluster looks like:

$ kops validate cluster
 
Validating cluster kops-cluster-a.connect.cd
 
INSTANCE GROUPS
NAME                ROLE    MACHINETYPE MIN MAX SUBNETS
bastions            Bastion t2.micro    1   1   utility-us-east-1a,utility-us-east-1b,utility-us-east-1c
master-us-east-1a   Master  t2.small    1   1   us-east-1a
master-us-east-1b   Master  t2.small    1   1   us-east-1b
master-us-east-1c   Master  t2.small    1   1   us-east-1c
nodes               Node    t2.medium   2   2   us-east-1a,us-east-1b,us-east-1c
 
NODE STATUS
NAME                            ROLE    READY
ip-172-20-32-241.ec2.internal   master  True
ip-172-20-36-145.ec2.internal   node    True
ip-172-20-83-199.ec2.internal   master  True
ip-172-20-88-2.ec2.internal     node    True
ip-172-20-98-109.ec2.internal   master  True

Or, since kops automatically populates your ~/.kube/config file with a new configuration context for your cluster, you can use kubectl to view it:

$ kubectl get nodes -o wide
 
NAME                            STATUS    AGE       VERSION   EXTERNAL-IP   OS-IMAGE                                       KERNEL-VERSION
ip-172-20-32-241.ec2.internal   Ready     29m       v1.7.2    <none>        Container Linux by CoreOS 1465.8.0 (Ladybug)   4.12.14-coreos
ip-172-20-36-145.ec2.internal   Ready     28m       v1.7.2    <none>        Container Linux by CoreOS 1465.8.0 (Ladybug)   4.12.14-coreos
ip-172-20-83-199.ec2.internal   Ready     29m       v1.7.2    <none>        Container Linux by CoreOS 1465.8.0 (Ladybug)   4.12.14-coreos
ip-172-20-88-2.ec2.internal     Ready     28m       v1.7.2    <none>        Container Linux by CoreOS 1465.8.0 (Ladybug)   4.12.14-coreos
ip-172-20-98-109.ec2.internal   Ready     28m       v1.7.2    <none>        Container Linux by CoreOS 1465.8.0 (Ladybug)   4.12.14-coreos
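
Kops names the new context after your cluster, so if you manage more than one cluster you can check which one kubectl is pointed at, and switch between them, with the standard kubectl config commands:

$ kubectl config current-context
kops-cluster-a.connect.cd
$ kubectl config use-context kops-cluster-a.connect.cd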

You can try logging into your bastion using ssh and the keypair that you created earlier. Since the bastion is running CoreOS, the username is ‘core’:

$ chmod 600 $NAME.key
$ ssh -i $NAME.key core@bastion.$NAME
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'bastion.kops-cluster-a.connect.cd,52.21.84.129' (ECDSA) to the list of known hosts.
Container Linux by CoreOS stable (1465.8.0)
core@ip-172-20-10-163 ~ $

As we chose --topology private when we created our cluster, none of our CoreOS instances have a public IP. When you choose a private topology along with the --bastion option, kops doesn’t assign a public IP to your bastion server either; instead it creates an ELB that passes traffic through to your bastion on port 22, as well as a DNS alias which points to that ELB. This means that your ssh sessions are at the mercy of the ELB’s idle timeout, so you may need to adjust this to suit your needs. You’ll also need to copy the same private key to the bastion, so you can use it to jump onto any of the other CoreOS instances within your cluster.
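
A minimal sketch of that two-hop login, reusing the key and the internal node names from the output above (ssh agent forwarding with -A is an alternative if you’d rather not copy the private key onto the bastion at all):

$ scp -i $NAME.key $NAME.key core@bastion.$NAME:~/
$ ssh -i $NAME.key core@bastion.$NAME
core@ip-172-20-10-163 ~ $ ssh -i kops-cluster-a.connect.cd.key core@ip-172-20-32-241.ec2.internal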

Controlling the cluster

You can edit the values in your cluster at any stage after the initial creation by using the edit cluster command. This will download the current cluster state from the S3 bucket that you’ve previously defined and open it in an editor session (vim by default). Let’s try that out so we can update our Kubernetes version to 1.7.6 and also add idleTimeoutSeconds: 1200 under the bastion section:

$ kops edit cluster $NAME
 
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: kops/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: 2017-10-01T18:45:34Z
  name: kops-cluster-a.connect.cd
spec:
  api:
    loadBalancer:
      type: Public
  authorization:
    rbac: {}
  channel: stable
  cloudProvider: aws
  configBase: s3://kops-cluster-a.connect.cd-state-store/kops-cluster-a.connect.cd
  etcdClusters:
  - etcdMembers:
    - instanceGroup: master-us-east-1a
      name: a
    - instanceGroup: master-us-east-1b
      name: b
    - instanceGroup: master-us-east-1c
      name: c
    name: main
  - etcdMembers:
    - instanceGroup: master-us-east-1a
      name: a
    - instanceGroup: master-us-east-1b
      name: b
    - instanceGroup: master-us-east-1c
      name: c
    name: events
  kubernetesApiAccess:
  - 0.0.0.0/0
  kubernetesVersion: 1.7.6
  masterInternalName: api.internal.kops-cluster-a.connect.cd
  masterPublicName: api.kops-cluster-a.connect.cd
  networkCIDR: 172.20.0.0/16
  networking:
    flannel: {}
  nonMasqueradeCIDR: 100.64.0.0/10
  sshAccess:
  - 0.0.0.0/0
  subnets:
  - cidr: 172.20.32.0/19
    name: us-east-1a
    type: Private
    zone: us-east-1a
  - cidr: 172.20.64.0/19
    name: us-east-1b
    type: Private
    zone: us-east-1b
  - cidr: 172.20.96.0/19
    name: us-east-1c
    type: Private
    zone: us-east-1c
  - cidr: 172.20.0.0/22
    name: utility-us-east-1a
    type: Utility
    zone: us-east-1a
  - cidr: 172.20.4.0/22
    name: utility-us-east-1b
    type: Utility
    zone: us-east-1b
  - cidr: 172.20.8.0/22
    name: utility-us-east-1c
    type: Utility
    zone: us-east-1c
  topology:
    bastion:
      bastionPublicName: bastion.kops-cluster-a.connect.cd
      idleTimeoutSeconds: 1200
    dns:
      type: Public
    masters: private
    nodes: private

Saving the changes and exiting with :wq means you can now update your cluster. Note that if you make a mistake with your syntax, kops will usually show an error when you exit the editor.

Running the update command shows you what’s going to be changed; you can see references to your version change, as well as the IdleTimeout changing from the default 300 seconds to 1200:

$ kops update cluster $NAME
 
.......
+ - ee007f4d30a9f5002a7e4e7ea4ae446b34a174cf@https://storage.googleapis.com/kubernetes-release/release/v1.7.6/bin/linux/amd64/kubelet
- - bad424eee321f4c9b2b800d44de2e1789843da19@https://storage.googleapis.com/kubernetes-release/release/v1.7.2/bin/linux/amd64/kubelet
......
LoadBalancer/bastion.kops-cluster-a.connect.cd
Lifecycle <nil> -> Sync
ConnectionSettings {"IdleTimeout":300} -> {"IdleTimeout":1200}
Must specify --yes to apply changes

You must specify the --yes argument to the update command for the changes to be implemented:

$ kops update cluster $NAME --yes
I1002 08:12:50.709868   34486 executor.go:91] Tasks: 0 done / 119 total; 42 can run
I1002 08:12:52.354857   34486 executor.go:91] Tasks: 42 done / 119 total; 26 can run
I1002 08:12:53.528533   34486 executor.go:91] Tasks: 68 done / 119 total; 34 can run
I1002 08:12:57.824783   34486 executor.go:91] Tasks: 102 done / 119 total; 10 can run
I1002 08:12:58.706678   34486 dnsname.go:110] AliasTarget for "bastion.kops-cluster-a.connect.cd." is "bastion-kops-cluster-a-cq4ep0-326636151.us-east-1.elb.amazonaws.com."
I1002 08:12:58.707416   34486 dnsname.go:110] AliasTarget for "api.kops-cluster-a.connect.cd." is "api-kops-cluster-a-connect-cd-4mnc52-1446777117.us-east-1.elb.amazonaws.com."
I1002 08:12:59.544597   34486 executor.go:91] Tasks: 112 done / 119 total; 7 can run
I1002 08:13:00.568075   34486 executor.go:91] Tasks: 119 done / 119 total; 0 can run
I1002 08:13:00.568141   34486 dns.go:152] Pre-creating DNS records
I1002 08:13:02.447421   34486 update_cluster.go:247] Exporting kubecfg for cluster
Kops has set your kubectl context to kops-cluster-a.connect.cd
 
Cluster changes have been applied to the cloud.
 
Changes may require instances to restart: kops rolling-update cluster

As you’ll see from the output above, you’ll need to run the kops rolling-update cluster command to update your cluster. If you run this without the --yes argument it will show you which components will be updated:

$ kops rolling-update cluster
Using cluster from kubectl context: kops-cluster-a.connect.cd
 
NAME                STATUS      NEEDUPDATE  READY   MIN MAX NODES
bastions            Ready       0           1       1   1   0
master-us-east-1a   NeedsUpdate 1           0       1   1   1
master-us-east-1b   NeedsUpdate 1           0       1   1   1
master-us-east-1c   NeedsUpdate 1           0       1   1   1
nodes               NeedsUpdate 2           0       2   2   2
 
Must specify --yes to rolling-update.

This looks good! Now add the --yes argument to update the cluster. Kops will roll out the update one master or node at a time, ensuring that the cluster remains available during the upgrade, so you won’t have any downtime:

$ kops rolling-update cluster --yes
Using cluster from kubectl context: kops-cluster-a.connect.cd
 
NAME                STATUS      NEEDUPDATE  READY   MIN MAX NODES
bastions            Ready       0           1       1   1   0
master-us-east-1a   NeedsUpdate 1           0       1   1   1
master-us-east-1b   NeedsUpdate 1           0       1   1   1
master-us-east-1c   NeedsUpdate 1           0       1   1   1
nodes               NeedsUpdate 2           0       2   2   2
I1002 08:18:08.382068   35110 instancegroups.go:350] Stopping instance "i-0e1f0494b9a6b8f96", node "ip-172-20-58-2.ec2.internal", in AWS ASG "master-us-east-1a.masters.kops-cluster-a.connect.cd".
I1002 08:23:08.735138   35110 instancegroups.go:350] Stopping instance "i-06ccd8ab7b738c46d", node "ip-172-20-95-238.ec2.internal", in AWS ASG "master-us-east-1b.masters.kops-cluster-a.connect.cd".
I1002 08:28:09.957542   35110 instancegroups.go:350] Stopping instance "i-0188c66cb462b9d5b", node "ip-172-20-121-101.ec2.internal", in AWS ASG "master-us-east-1c.masters.kops-cluster-a.connect.cd".
I1002 08:33:11.424412   35110 instancegroups.go:350] Stopping instance "i-0566513d1cea62aa1", node "ip-172-20-86-219.ec2.internal", in AWS ASG "nodes.kops-cluster-a.connect.cd".
I1002 08:35:12.577605   35110 instancegroups.go:350] Stopping instance "i-0584346a492c3364a", node "ip-172-20-50-34.ec2.internal", in AWS ASG "nodes.kops-cluster-a.connect.cd".
I1002 08:37:13.786625   35110 rollingupdate.go:174] Rolling update completed!

Your new Kubernetes version should have been implemented – great stuff!

$ kubectl get nodes -o wide
NAME                             STATUS    AGE       VERSION   EXTERNAL-IP   OS-IMAGE                                       KERNEL-VERSION
ip-172-20-105-199.ec2.internal   Ready     7m        v1.7.6    <none>        Container Linux by CoreOS 1465.8.0 (Ladybug)   4.12.14-coreos
ip-172-20-123-192.ec2.internal   Ready     3m        v1.7.6    <none>        Container Linux by CoreOS 1465.8.0 (Ladybug)   4.12.14-coreos
ip-172-20-55-106.ec2.internal    Ready     18m       v1.7.6    <none>        Container Linux by CoreOS 1465.8.0 (Ladybug)   4.12.14-coreos
ip-172-20-80-30.ec2.internal     Ready     1m        v1.7.6    <none>        Container Linux by CoreOS 1465.8.0 (Ladybug)   4.12.14-coreos
ip-172-20-94-168.ec2.internal    Ready     12m       v1.7.6    <none>        Container Linux by CoreOS 1465.8.0 (Ladybug)   4.12.14-coreos

Controlling instance groups

Kops also has the concept of instance groups, which are groups of machines that perform the same function. When using kops on AWS, these instance groups map to EC2 Auto Scaling groups. You can view these groups as follows:

$ kops get ig
Using cluster from kubectl context: kops-cluster-a.connect.cd
 
NAME                ROLE    MACHINETYPE MIN MAX SUBNETS
bastions            Bastion t2.micro    1   1   utility-us-east-1a,utility-us-east-1b,utility-us-east-1c
master-us-east-1a   Master  t2.small    1   1   us-east-1a
master-us-east-1b   Master  t2.small    1   1   us-east-1b
master-us-east-1c   Master  t2.small    1   1   us-east-1c
nodes               Node    t2.medium   2   2   us-east-1a,us-east-1b,us-east-1c

You can also change the details of each instance group using the edit command. For example, to add a new node we edit the nodes group and increase its minSize from 2 to 3 (and its maxSize from 2 to 4, as shown below). Once again you’ll be presented with an editor session:

$ kops edit ig nodes
Using cluster from kubectl context: kops-cluster-a.connect.cd
 
apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: 2017-10-01T18:45:36Z
  labels:
    kops.k8s.io/cluster: kops-cluster-a.connect.cd
  name: nodes
spec:
  image: ami-e2d33d98
  machineType: t2.medium
  maxSize: 4
  minSize: 3
  role: Node
  subnets:
  - us-east-1a
  - us-east-1b
  - us-east-1c

You’ll need to update the cluster again using the --yes argument to add your new node:

$ kops update cluster --yes
Using cluster from kubectl context: kops-cluster-a.connect.cd
 
I1002 08:49:28.910484   37911 executor.go:91] Tasks: 0 done / 119 total; 42 can run
I1002 08:49:30.608112   37911 executor.go:91] Tasks: 42 done / 119 total; 26 can run
I1002 08:49:31.992969   37911 executor.go:91] Tasks: 68 done / 119 total; 34 can run
I1002 08:49:34.362580   37911 executor.go:91] Tasks: 102 done / 119 total; 10 can run
I1002 08:49:34.609685   37911 dnsname.go:110] AliasTarget for "api.kops-cluster-a.connect.cd." is "api-kops-cluster-a-connect-cd-4mnc52-1446777117.us-east-1.elb.amazonaws.com."
I1002 08:49:34.849999   37911 dnsname.go:110] AliasTarget for "bastion.kops-cluster-a.connect.cd." is "bastion-kops-cluster-a-cq4ep0-326636151.us-east-1.elb.amazonaws.com."
I1002 08:49:35.804106   37911 executor.go:91] Tasks: 112 done / 119 total; 7 can run
I1002 08:49:36.915647   37911 executor.go:91] Tasks: 119 done / 119 total; 0 can run
I1002 08:49:36.915789   37911 dns.go:152] Pre-creating DNS records
I1002 08:49:38.802101   37911 update_cluster.go:247] Exporting kubecfg for cluster
Kops has set your kubectl context to kops-cluster-a.connect.cd
 
Cluster changes have been applied to the cloud.
 
Changes may require instances to restart: kops rolling-update cluster

Even though kops says you may need a rolling update, you don’t need to perform this step just to resize an instance group. Give it a couple of minutes, then check your cluster again and you’ll see your third node:

$ kops validate cluster
Using cluster from kubectl context: kops-cluster-a.connect.cd
 
Validating cluster kops-cluster-a.connect.cd
 
INSTANCE GROUPS
NAME                ROLE    MACHINETYPE MIN MAX SUBNETS
bastions            Bastion t2.micro    1   1   utility-us-east-1a,utility-us-east-1b,utility-us-east-1c
master-us-east-1a   Master  t2.small    1   1   us-east-1a
master-us-east-1b   Master  t2.small    1   1   us-east-1b
master-us-east-1c   Master  t2.small    1   1   us-east-1c
nodes               Node    t2.medium   3   4   us-east-1a,us-east-1b,us-east-1c
 
NODE STATUS
NAME                            ROLE    READY
ip-172-20-105-199.ec2.internal  master  True
ip-172-20-123-192.ec2.internal  node    True
ip-172-20-33-127.ec2.internal   node    True
ip-172-20-55-106.ec2.internal   master  True
ip-172-20-80-30.ec2.internal    node    True
ip-172-20-94-168.ec2.internal   master  True

Deleting the cluster

Once you’ve finished with your cluster, deleting it is very simple, so take care not to do this by mistake!

$ kops delete cluster $NAME --yes
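
If you’d like to preview exactly what will be removed before committing, run the same command without --yes; kops will list the AWS resources it intends to delete and make no changes:

$ kops delete cluster $NAME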

Other Options

Kops is quite configurable and there are many different options and architectures you can choose from. The command-line help is very useful, e.g. kops create cluster --help gives you a good description of the various options. If you need more documentation on using kops, it can be found here.

More Resources

There are plenty of additional resources and reading available online to help you familiarise yourself with kops. Here’s a small collection of official resources you might find useful:

  1. The official kops repository
  2. The official kops ‘getting started’ guide
  3. AWS Compute Blog article on kops

We hope you’ve found this useful! Stay tuned for our next post…


Brent Newson

Brent is a Senior DevOps Engineer at ClearPoint
