Kubernetes on CoreOS with distributed shared storage (with Ceph) : step by step procedure

Abstract

Now that we have a running CoreOS/etcd v3 cluster, we want to get the best out of it. We have the means, but not yet any great software to manage the containers automatically.

Kubernetes, like Mesos or OpenShift, is the kind of software that can do the job. Backed by Google and benefiting from years of experience (Borg…), it is a great platform, as you will see in this post.

Requirements

  • a decent homelab. Example here
  • a running CoreOS/Etcd v3 cluster
  • access to shared storage for your data; we will use a Ceph distributed storage here (details here)
  • access to the internet (until you have your own container registries and other software repositories)
  • an automated tool to configure your cluster. I will use a small UNIX tool (dsh) in this tutorial.

Credits

This post is inspired by https://www.upcloud.com/support/deploy-kubernetes-coreos/, which was written for Kubernetes 1.3.0. Since 1.6.0, a few of its options have been deprecated, so use the following guide if you want to install 1.6.

The rest of this guide follows the "source" documentation : https://coreos.com/kubernetes/docs/latest/getting-started.html

Architecture

(architecture diagram)

Preparing the CoreOS nodes

Take two minutes to set up dsh in order to automate your deployment.

Use groups :

  1. One machines.list file that lists all your nodes
  2. One group directory (~/.dsh/group/) containing one file per group

    [core@admin .dsh]$ pwd
    /home/core/.dsh
    [core@admin .dsh]$ find
    .
    ./machines.list
    ./group
    ./group/km
    ./group/kw

The machines.list and group files must list the relevant nodes, for example the worker nodes :

[core@admin .dsh]$ cat ./group/kw
c1.int.intra
c2.int.intra
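
Before moving on, a quick optional sanity check: the commands below (assuming passwordless SSH for the "core" user is already in place) should print the hostname of every node in each group.

dsh -g km "hostname"
dsh -g kw "hostname"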

Create your CoreOS nodes

Follow this guide

One change since my previous post : etcd v3 does not seem to be well supported by flannel 0.7.

So I will use etcd v2 and, as you will see, with CoreOS and containers (etcd on CoreOS is containerized), it is very simple. Just change the user_data file and apply the procedure described in my previous post.

Only two changes :

  1. pin the etcd v2 image to the same version (2.3.7) as the client installed on CoreOS (at the time this post is published)
  2. remove one argument (auto-compaction) that is not supported by etcd 2.3.7

The resulting user_data file :

#cloud-config
ssh_authorized_keys:
  - "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDjgUAV27yJT6YRxzDfGiDB4fwZwmx7EWzcZU3LXRWjaaSgvlizQtHwG8OCJYQN0aG29CTQNgJs+EY40/VQyeidVOdVmaClzmVSMruB68msEuvMrz5DA/v1FXVrYJCAyy+3l719DI9eA++nYyDo//LEj5cf7/4Xcs+12o4ADCJzYMNXazQ8f/1d3EPqJhfcuL+spehCYzDGCYyDDGeIsUaZYnWdOY4z4+2wWtd++9WBfCrVE2g8I3k0+U0iVEM1tZXfrvIzj+fw17zs/FNuzAunQRAIagqUcIswc9aPJbN1H9W31N6gX1X5IV/SvqPl39QXV/wgXtc9M+0oEKJQOvOj core"
hostname: "@@VMNAME@@"
users:
  - name: "user"
    passwd: "$6$D6SWuIMCinNlXP4b$QXnT3RVvPwHcR6ElHP1hWRBxq03gA/TbM1iMKRz3NukT7AYGGf3uSGC9WIkBI1s0GlQ9a1wWUvil.e/OBu6ax/"
    groups:
      - "sudo"
write_files:
  - path: /etc/systemd/resolved.conf
    permissions: 0644
    owner: root
    content: |
      # This file is part of systemd.
      #
      # systemd is free software; you can redistribute it and/or modify it
      # under the terms of the GNU Lesser General Public License as published by
      # the Free Software Foundation; either version 2.1 of the License, or
      # (at your option) any later version.
      #
      # Entries in this file show the compile time defaults.
      # You can change settings by editing this file.
      # Defaults can be restored by simply deleting this file.
      #
      # See resolved.conf(5) for details
      [Resolve]
      #DNS=
      #FallbackDNS=
      Domains=int.intra
      #LLMNR=yes
      #DNSSEC=allow-downgrade
      #Cache=yes
      #DNSStubListener=udp
  - path: /etc/systemd/system/etcd-member.service.d/override.conf
    permissions: 0644
    owner: root
    content: |
      [Service]
      Environment="ETCD_IMAGE_TAG=v2.3.7"
      Environment="ETCD_DATA_DIR=/var/lib/etcd"
      Environment="ETCD_SSL_DIR=/etc/ssl/certs"
      Environment="ETCD_OPTS=--name=@@VMNAME@@ --listen-client-urls=http://@@IP@@:2379 --advertise-client-urls=http://@@IP@@:2379 --listen-peer-urls=http://@@IP@@:2380 --initial-advertise-peer-urls=http://@@IP@@:2380 --initial-cluster-token=my-etcd-token --discovery-srv=int.intra"
coreos:
  update:
    reboot-strategy: off
  units:
    - name: systemd-networkd.service
      command: stop
    - name: static.network
      runtime: true
      content: |
        [Match]
        Name=eth0
        [Network]
        Address=@@IP@@/16
        Gateway=10.0.0.1
        DNS=10.0.0.1
    - name: down-interfaces.service
      command: start
      content: |
        [Service]
        Type=oneshot
        ExecStart=/usr/bin/ip link set eth0 down
        ExecStart=/usr/bin/ip addr flush dev eth0
    - name: systemd-networkd.service
      command: restart
    - name: systemd-resolved.service
      command: restart
    - name: etcd-member.service
      command: start

Go..

So we have a running CoreOS cluster.

core@c0 ~ $ export ETCDCTL_ENDPOINTS=http://c0:2379; etcdctl cluster-health 
member 65ac5f05513b1262 is healthy: got healthy result from http://10.0.1.101:2379
member 6c2733a4aad35dac is healthy: got healthy result from http://10.0.1.100:2379
member 84d5ec72bbd714d2 is healthy: got healthy result from http://10.0.1.102:2379

We will do the following :

c0 : Kubernetes master
c1 : Kubernetes node #1
c2 : Kubernetes node #2
admin : admin node, hosting the user “core”, which has access to the cluster

Networking : only one network card per host (private NIC = public NIC = default NIC, private IP = public IP = default IP)

Remember that you have set up dsh on the admin node in order to execute commands on each node (see previous posts).

Create the root CA private key and certificate

On the admin node as user “core” :

mkdir ~/kube-ssl
cd ~/kube-ssl
openssl genrsa -out ca-key.pem 2048
openssl req -x509 -new -nodes -key ca-key.pem -days 10000 -out ca.pem -subj "/CN=kube-ca"
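
If you want to double-check the CA you just generated, a quick (optional) inspection :

openssl x509 -in ca.pem -noout -subject -dates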

Create a DNS CNAME record to match the name “kubernetes” for the master
I use bind on my network.

main zone for my internal DNS network int.intra:

c0                      A       10.0.1.100
kubernetes              CNAME   c0.int.intra.
c1                      A       10.0.1.101   
c2                      A       10.0.1.102
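
Once the zone is reloaded, a quick check from the admin node should return the master behind the alias (assuming dig is installed there) :

dig +short kubernetes.int.intra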

Configure the master node

Create the API server keys

vi openssl.cnf

[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name
[req_distinguished_name]
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectAltName = @alt_names
[alt_names]
DNS.1 = kubernetes
DNS.2 = kubernetes.default
DNS.3 = kubernetes.default.svc
DNS.4 = kubernetes.default.svc.cluster.local
DNS.5 = kubernetes.int.intra
IP.1 = 10.0.1.100
IP.2 = 10.2.0.1
IP.3 = 10.3.0.1

and issue :

openssl genrsa -out apiserver-key.pem 2048
openssl req -new -key apiserver-key.pem -out apiserver.csr -subj "/CN=kube-apiserver" -config openssl.cnf
openssl x509 -req -in apiserver.csr -CA ca.pem -CAkey ca-key.pem -CAcreateserial -out apiserver.pem -days 365 -extensions v3_req -extfile openssl.cnf
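
Optionally, verify that all the SANs listed in openssl.cnf made it into the certificate before deploying it :

openssl x509 -in apiserver.pem -noout -text | grep -A1 "Subject Alternative Name"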

Create the keys for remote management from the admin node

openssl genrsa -out admin-key.pem 2048
openssl req -new -key admin-key.pem -out admin.csr -subj "/CN=kube-admin"
openssl x509 -req -in admin.csr -CA ca.pem -CAkey ca-key.pem -CAcreateserial -out admin.pem -days 365

Apply the TLS assets

Deploy the root CA and API server PEM files on the master nodes

You’re always managing from the “admin” machine, as the user “core”.

dsh -g km "sudo mkdir -p /etc/kubernetes/ssl"
cat ca.pem | dsh -g km -i -c 'sudo tee /etc/kubernetes/ssl/ca.pem'
cat apiserver.pem | dsh -g km -i -c 'sudo tee /etc/kubernetes/ssl/apiserver.pem'
cat apiserver-key.pem | dsh -g km -i -c 'sudo tee /etc/kubernetes/ssl/apiserver-key.pem'

And set the permissions

[core@admin ssl]$ dsh -g km "sudo chmod 600 /etc/kubernetes/ssl/*-key.pem ; sudo chown root:root /etc/kubernetes/ssl/*-key.pem"

Setting up flannel

This step could have been done with cloud-config when you set up CoreOS and etcd.

We will do it “manually” (in quotes, because it is just a few command lines executed automatically by dsh or the shell, once, at setup time), so the cost is negligible.

Create the flannel conf directory on all master nodes :

[core@admin ssl]$ dsh -g km "sudo mkdir /etc/flannel"

And the config file. Note : this is only valid for these three nodes; review these commands if you need more nodes.

[core@admin ssl]$ dsh -g km 'rm -f options.env ; export IP=`ip addr show eth0 | grep -Po "inet \K[\d.]+"`; echo "FLANNELD_IFACE=$IP" > options.env ; echo "FLANNELD_ETCD_ENDPOINTS=http://10.0.1.100:2379,http://10.0.1.101:2379,http://10.0.1.102:2379" >> options.env ; sudo cp options.env /etc/flannel/'

We also need to create the drop-in for systemd, overriding the existing flanneld systemd service located in /usr/lib64/systemd/system.

[core@admin ssl]$ dsh -g km "sudo mkdir -p /etc/systemd/system/flanneld.service.d/"
[core@admin ssl]$ dsh -g km 'sudo rm -f /etc/systemd/system/flanneld.service.d/40-ExecStartPre-symlink.conf; printf "[Service]\nExecStartPre=/usr/bin/ln -sf /etc/flannel/options.env /run/flannel/options.env\n" | sudo tee -a /etc/systemd/system/flanneld.service.d/40-ExecStartPre-symlink.conf'

Set the flanneld configuration in the etcd cluster. For this operation, I use an etcd binary for Linux, available here.

On the master node, run

dsh -m c0 'export ETCDCTL_API=2 ; export ETCDCTL_ENDPOINTS=http://c1:2379; etcdctl set /coreos.com/network/config "{\"Network\":\"10.2.0.0/16\",\"Backend\":{\"Type\":\"vxlan\"}}"'
dsh -m c0 'export ETCDCTL_API=2 ; export ETCDCTL_ENDPOINTS=http://c1:2379; etcdctl get /coreos.com/network/config'

Docker configuration

Create another systemd drop-in on all master nodes :

/etc/systemd/system/docker.service.d/40-flannel.conf :

[Unit]
Requires=flanneld.service
After=flanneld.service
[Service]
EnvironmentFile=/run/flannel/flannel_docker_opts.env

NOTE : at the time of writing, there is an error in the official documentation (coreos.com) : the environment file is actually /run/flannel/flannel_docker_opts.env and NOT /etc/kubernetes/cni/docker_opts_cni.env. Either specify it as I did, or create a link pointing from /etc/kubernetes/cni/docker_opts_cni.env to /run/flannel/flannel_docker_opts.env.

Create the file on the admin node, then push it to all master nodes (only one in my case) :

dsh -g km "sudo mkdir -p /etc/systemd/system/docker.service.d/"
cat docker.conf | dsh -g km -i -c 'sudo tee /etc/systemd/system/docker.service.d/40-flannel.conf'

Create the Docker CNI Options file:

/etc/kubernetes/cni/docker_opts_cni.env

dsh -g km 'sudo mkdir -p /etc/kubernetes/cni/; echo DOCKER_OPT_BIP="" | sudo tee  /etc/kubernetes/cni/docker_opts_cni.env ; echo DOCKER_OPT_IPMASQ="" | sudo tee -a /etc/kubernetes/cni/docker_opts_cni.env' 

Create the flannel CNI config file; we will use flannel for the overlay network :

/etc/kubernetes/cni/net.d/10-flannel.conf

{
    "name": "podnet",
    "type": "flannel",
    "delegate": {
        "isDefaultGateway": true
    }
}

Again, don’t do that by hand on each node; use an automated way.

Here’s one way to do it :

[core@admin ~]$ dsh -g km 'sudo mkdir -p /etc/kubernetes/cni/net.d/ ; printf "{\n    \"name\": \"podnet\",\n    \"type\": \"flannel\",\n    \"delegate\": {\n        \"isDefaultGateway\": true\n    }\n}\n" | sudo tee /etc/kubernetes/cni/net.d/10-flannel.conf'

Create the kubelet Unit

Install the systemd unit service

/etc/systemd/system/kubelet.service

[Service]
Environment=KUBELET_IMAGE_TAG=v1.6.1_coreos.0
Environment="RKT_RUN_ARGS=--uuid-file-save=/var/run/kubelet-pod.uuid \
  --volume var-log,kind=host,source=/var/log \
  --mount volume=var-log,target=/var/log \
  --volume dns,kind=host,source=/etc/resolv.conf \
  --mount volume=dns,target=/etc/resolv.conf \
  --volume modprobe,kind=host,source=/usr/sbin/modprobe \
  --mount volume=modprobe,target=/usr/sbin/modprobe \
  --volume lib-modules,kind=host,source=/lib/modules \
  --mount volume=lib-modules,target=/lib/modules \
  --uuid-file-save=/var/run/kubelet-pod.uuid"
ExecStartPre=/usr/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/usr/bin/mkdir -p /var/log/containers
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/run/kubelet-pod.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
  --api-servers=http://127.0.0.1:8080 \
  --register-schedulable=false \
  --cni-conf-dir=/etc/kubernetes/cni/net.d \
  --container-runtime=docker \
  --allow-privileged=true \
  --pod-manifest-path=/etc/kubernetes/manifests \
  --hostname-override=@@IP@@ \
  --cluster_dns=10.3.0.10 \
  --cluster_domain=cluster.local
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/run/kubelet-pod.uuid
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target

Then push this file (with the real IP substituted) to the master nodes :

cat kubelet.service | dsh -g km -i -c 'export IP=`ip addr show eth0 | grep -Po "inet \K[\d.]+"`; sed s/@@IP@@/${IP}/g | sudo tee /etc/systemd/system/kubelet.service'
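
To make sure the @@IP@@ placeholder was really substituted on each master node, a quick optional check :

dsh -g km "grep hostname-override /etc/systemd/system/kubelet.service"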

Set Up the kube-apiserver Pod

Create the manifests directory for all kubernetes pods.

dsh -g km "sudo mkdir -p /etc/kubernetes/manifests"

NOTE : you MUST add --storage-backend=etcd2 to the apiserver YAML manifest; I lost four hours debugging that. I use etcd2 because flannel is not yet etcd v3 compliant.

NOTE : note the --insecure-bind-address=0.0.0.0 flag. Do not use it in a production environment; disable it once you have secured the API server with proper cryptographic mechanisms.

/etc/kubernetes/manifests/kube-apiserver.yaml

apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: kube-apiserver
    image: quay.io/coreos/hyperkube:v1.6.1_coreos.0
    command:
    - /hyperkube
    - apiserver
    - --bind-address=0.0.0.0
    - --insecure-bind-address=0.0.0.0
    - --insecure-port=8080
    - --storage-media-type=application/json
    - --storage-backend=etcd2
    - --etcd-servers=http://10.0.1.100:2379,http://10.0.1.101:2379,http://10.0.1.102:2379
    - --allow-privileged=true
    - --service-cluster-ip-range=10.3.0.0/24
    - --secure-port=443
    - --advertise-address=10.0.1.100
    - --admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota
    - --tls-cert-file=/etc/kubernetes/ssl/apiserver.pem
    - --tls-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem
    - --client-ca-file=/etc/kubernetes/ssl/ca.pem
    - --service-account-key-file=/etc/kubernetes/ssl/apiserver-key.pem
    - --runtime-config=extensions/v1beta1/networkpolicies=true
    - --anonymous-auth=false
    livenessProbe:
      httpGet:
        host: 127.0.0.1
        port: 8080
        path: /healthz
      initialDelaySeconds: 15
      timeoutSeconds: 15
    ports:
    - containerPort: 443
      hostPort: 443
      name: https
    - containerPort: 8080
      hostPort: 8080
      name: local
    volumeMounts:
    - mountPath: /etc/kubernetes/ssl
      name: ssl-certs-kubernetes
      readOnly: true
    - mountPath: /etc/ssl/certs
      name: ssl-certs-host
      readOnly: true
  volumes:
  - hostPath:
      path: /etc/kubernetes/ssl
    name: ssl-certs-kubernetes
  - hostPath:
      path: /usr/share/ca-certificates
    name: ssl-certs-host

push it :

cat kube-apiserver.yaml | dsh -g km -i -c 'sudo tee /etc/kubernetes/manifests/kube-apiserver.yaml'

Set Up the kube-proxy Pod

/etc/kubernetes/manifests/kube-proxy.yaml

apiVersion: v1
kind: Pod
metadata:
  name: kube-proxy
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: kube-proxy
    image: quay.io/coreos/hyperkube:v1.6.1_coreos.0
    command:
    - /hyperkube
    - proxy
    - --master=http://127.0.0.1:8080
    securityContext:
      privileged: true
    volumeMounts:
    - mountPath: /etc/ssl/certs
      name: ssl-certs-host
      readOnly: true
  volumes:
  - hostPath:
      path: /usr/share/ca-certificates
    name: ssl-certs-host

Again, push it :

cat kube-proxy.yaml | dsh -g km -i -c 'sudo tee /etc/kubernetes/manifests/kube-proxy.yaml'

Set Up the kube-controller-manager Pod

/etc/kubernetes/manifests/kube-controller-manager.yaml

apiVersion: v1
kind: Pod
metadata:
  name: kube-controller-manager
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: kube-controller-manager
    image: quay.io/coreos/hyperkube:v1.6.1_coreos.0
    command:
    - /hyperkube
    - controller-manager
    - --master=http://127.0.0.1:8080
    - --leader-elect=true
    - --service-account-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem
    - --root-ca-file=/etc/kubernetes/ssl/ca.pem
    resources:
      requests:
        cpu: 200m
    livenessProbe:
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10252
      initialDelaySeconds: 15
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /etc/kubernetes/ssl
      name: ssl-certs-kubernetes
      readOnly: true
    - mountPath: /etc/ssl/certs
      name: ssl-certs-host
      readOnly: true
  volumes:
  - hostPath:
      path: /etc/kubernetes/ssl
    name: ssl-certs-kubernetes
  - hostPath:
      path: /usr/share/ca-certificates
    name: ssl-certs-host

Set Up the kube-scheduler Pod

/etc/kubernetes/manifests/kube-scheduler.yaml

apiVersion: v1
kind: Pod
metadata:
  name: kube-scheduler
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: kube-scheduler
    image: quay.io/coreos/hyperkube:v1.6.1_coreos.0
    command:
    - /hyperkube
    - scheduler
    - --master=http://127.0.0.1:8080
    - --leader-elect=true
    resources:
      requests:
        cpu: 100m
    livenessProbe:
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10251
      initialDelaySeconds: 15
      timeoutSeconds: 15

push it :

cat kube-scheduler.yaml | dsh -g km -i -c 'sudo tee /etc/kubernetes/manifests/kube-scheduler.yaml'

Start the master node

sudo systemctl daemon-reload
sudo systemctl start flanneld

This last command can take several minutes the first time, depending on the network speed between your master nodes and the container registry (internet or local, depending on your infrastructure), because the wrapper downloads the container images needed to run flannel.
Be patient…

sudo systemctl enable flanneld

Start kubernetes

sudo systemctl start kubelet
sudo systemctl enable kubelet

Again, the first time, this command runs a wrapper that downloads the Kubernetes container images (approx. 250 MB).

be patient :)

You can watch the progress like this :

c0 ~ # systemctl status kubelet
● kubelet.service
   Loaded: loaded (/etc/systemd/system/kubelet.service; disabled; vendor preset: disabled)
   Active: active (running) since Wed 2017-06-21 09:58:17 UTC; 3min 12s ago
  Process: 2087 ExecStartPre=/usr/bin/rkt rm --uuid-file=/var/run/kubelet-pod.uuid (code=exited, status=254)
  Process: 2083 ExecStartPre=/usr/bin/mkdir -p /var/log/containers (code=exited, status=0/SUCCESS)
  Process: 2076 ExecStartPre=/usr/bin/mkdir -p /etc/kubernetes/manifests (code=exited, status=0/SUCCESS)
 Main PID: 2098 (rkt)
    Tasks: 8 (limit: 32768)
   Memory: 122.3M
      CPU: 5.416s
   CGroup: /system.slice/kubelet.service
           └─2098 /usr/bin/rkt run --uuid-file-save=/var/run/kubelet-pod.uuid --volume var-log,kind=host,source=/var/log --mount volume=var-log,target=/var/log --volume dns,kind=host,source=/etc/resolv.conf --mount volume=dns,targe

Jun 21 10:01:20 c0 kubelet-wrapper[2098]: Downloading ACI:  94.3 MB/237 MB
Jun 21 10:01:21 c0 kubelet-wrapper[2098]: Downloading ACI:  95 MB/237 MB
Jun 21 10:01:22 c0 kubelet-wrapper[2098]: Downloading ACI:  95.7 MB/237 MB
Jun 21 10:01:23 c0 kubelet-wrapper[2098]: Downloading ACI:  96.4 MB/237 MB
Jun 21 10:01:24 c0 kubelet-wrapper[2098]: Downloading ACI:  97.2 MB/237 MB
Jun 21 10:01:25 c0 kubelet-wrapper[2098]: Downloading ACI:  98.3 MB/237 MB
Jun 21 10:01:26 c0 kubelet-wrapper[2098]: Downloading ACI:  99.5 MB/237 MB
Jun 21 10:01:27 c0 kubelet-wrapper[2098]: Downloading ACI:  101 MB/237 MB
Jun 21 10:01:29 c0 kubelet-wrapper[2098]: Downloading ACI:  101 MB/237 MB
Jun 21 10:01:30 c0 kubelet-wrapper[2098]: Downloading ACI:  102 MB/237 MB

You will have to be a bit patient after this download completes as well, while kubelet starts all the containers.

Finally, check that the API server responds to your requests.

Run this on the master node :

curl http://c0:8080/version
{
  "major": "1",
  "minor": "6",
  "gitVersion": "v1.6.1+coreos.0",
  "gitCommit": "9212f77ed8c169a0afa02e58dce87913c6387b3e",
  "gitTreeState": "clean",
  "buildDate": "2017-04-04T00:32:53Z",
  "goVersion": "go1.7.5",
  "compiler": "gc",
  "platform": "linux/amd64"
}
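
The API server also exposes a health endpoint on the insecure port; it should simply answer "ok" (an optional extra check) :

curl http://c0:8080/healthz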

Set up the workers

Create the Kubernetes and SSL directories :

dsh -g kw "sudo mkdir -p /etc/kubernetes/ssl"

TLS assets

Go back to the kube-ssl directory on the admin node

Generate the pem files :

vi worker-openssl.cnf

[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name
[req_distinguished_name]
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectAltName = @alt_names
[alt_names]
IP.1 = $ENV::WORKER_IP

for i in {1..2} ; do export FQDN=c$i.int.intra; export WORKER_IP=10.0.1.10$i; openssl genrsa -out ${FQDN}-worker-key.pem 2048 ; done
for i in {1..2} ; do export FQDN=c$i.int.intra; export WORKER_IP=10.0.1.10$i; openssl req -new -key ${FQDN}-worker-key.pem -out ${FQDN}-worker.csr -subj "/CN=${FQDN}" -config worker-openssl.cnf ; done
for i in {1..2} ; do export FQDN=c$i.int.intra; export WORKER_IP=10.0.1.10$i; openssl x509 -req -in ${FQDN}-worker.csr -CA ca.pem -CAkey ca-key.pem -CAcreateserial -out ${FQDN}-worker.pem -days 365 -extensions v3_req -extfile worker-openssl.cnf ; done
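
Optionally, verify that each worker certificate carries the expected IP SAN before pushing it :

for i in {1..2} ; do openssl x509 -in c$i.int.intra-worker.pem -noout -text | grep -A1 "Subject Alternative Name" ; done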

Copy the asset files to the worker nodes :

cat ca.pem | dsh -g kw -i -c 'sudo tee /etc/kubernetes/ssl/ca.pem'
for i in {1..2} ; do cat ~/kube-ssl/c$i.int.intra-worker.pem | dsh -m c$i -i -c "sudo tee /etc/kubernetes/ssl/c$i.int.intra-worker.pem" ; done
for i in {1..2} ; do cat ~/kube-ssl/c$i.int.intra-worker-key.pem | dsh -m c$i -i -c "sudo tee /etc/kubernetes/ssl/c$i.int.intra-worker-key.pem" ; done

Set permissions

dsh -g kw "sudo chmod 600 /etc/kubernetes/ssl/*-key.pem ; sudo chown root:root /etc/kubernetes/ssl/*-key.pem"

Set links

for i in {1..2} ; do dsh -m c$i "sudo ln -s /etc/kubernetes/ssl/c$i.int.intra-worker.pem /etc/kubernetes/ssl/worker.pem"; done
for i in {1..2} ; do dsh -m c$i "sudo ln -s /etc/kubernetes/ssl/c$i.int.intra-worker-key.pem /etc/kubernetes/ssl/worker-key.pem"; done

Check :

[core@admin kube-ssl]$ dsh -g kw "sudo ls -l /etc/kubernetes/ssl"
total 32
-rw-------. 1 root root 1675 Jun 17 15:40 c1.int.intra-worker-key.pem
-rw-r--r--. 1 root root 1046 Jun 17 15:40 c1.int.intra-worker.pem
-rw-r--r--. 1 root root 1090 Jun 17 15:38 ca.pem
lrwxrwxrwx. 1 root root   47 Jun 17 15:42 worker-key.pem -> /etc/kubernetes/ssl/c1.int.intra-worker-key.pem
lrwxrwxrwx. 1 root root   43 Jun 17 15:42 worker.pem -> /etc/kubernetes/ssl/c1.int.intra-worker.pem
total 32
-rw-------. 1 root root 1679 Jun 17 15:40 c2.int.intra-worker-key.pem
-rw-r--r--. 1 root root 1046 Jun 17 15:40 c2.int.intra-worker.pem
-rw-r--r--. 1 root root 1090 Jun 17 15:38 ca.pem
lrwxrwxrwx. 1 root root   47 Jun 17 15:42 worker-key.pem -> /etc/kubernetes/ssl/c2.int.intra-worker-key.pem
lrwxrwxrwx. 1 root root   43 Jun 17 15:42 worker.pem -> /etc/kubernetes/ssl/c2.int.intra-worker.pem

Networking configuration

/etc/flannel/options.env :

dsh -g kw 'sudo mkdir /etc/flannel ; export IP=`ip addr show eth0 | grep -Po "inet \K[\d.]+"`; echo "FLANNELD_IFACE=$IP" | sudo tee /etc/flannel/options.env ; echo "FLANNELD_ETCD_ENDPOINTS=http://10.0.1.100:2379,http://10.0.1.101:2379,http://10.0.1.102:2379" | sudo tee -a /etc/flannel/options.env'

As for the master, create the systemd drop-in for flanneld :

dsh -g kw "sudo mkdir -p /etc/systemd/system/flanneld.service.d/"
dsh -g kw 'sudo rm -f /etc/systemd/system/flanneld.service.d/40-ExecStartPre-symlink.conf; printf "[Service]\nExecStartPre=/usr/bin/ln -sf /etc/flannel/options.env /run/flannel/options.env\n" | sudo tee -a /etc/systemd/system/flanneld.service.d/40-ExecStartPre-symlink.conf'

Docker configuration

/etc/systemd/system/docker.service.d/40-flannel.conf

dsh -g kw 'sudo mkdir -p /etc/systemd/system/docker.service.d/; printf "[Unit]\nRequires=flanneld.service\nAfter=flanneld.service\n[Service]\nEnvironmentFile=/run/flannel/flannel_docker_opts.env\n" | sudo tee /etc/systemd/system/docker.service.d/40-flannel.conf'

/etc/kubernetes/cni/docker_opts_cni.env

dsh -g kw 'sudo mkdir -p /etc/kubernetes/cni/ ; echo DOCKER_OPT_BIP="" | sudo tee  /etc/kubernetes/cni/docker_opts_cni.env ; echo DOCKER_OPT_IPMASQ="" | sudo tee -a /etc/kubernetes/cni/docker_opts_cni.env'


/etc/kubernetes/cni/net.d/10-flannel.conf

dsh -g kw 'sudo mkdir -p /etc/kubernetes/cni/net.d/ ; printf "{\n    \"name\": \"podnet\",\n    \"type\": \"flannel\",\n    \"delegate\": {\n        \"isDefaultGateway\": true\n    }\n}\n" | sudo tee /etc/kubernetes/cni/net.d/10-flannel.conf'

Create the kubelet Unit

/etc/systemd/system/kubelet.service

Create this file on the admin node and push it with dsh to all worker nodes, replacing @@IP@@ with each node's real IP address.

[Service]
Environment=KUBELET_IMAGE_TAG=v1.6.1_coreos.0
Environment="RKT_RUN_ARGS=--uuid-file-save=/var/run/kubelet-pod.uuid \
  --volume dns,kind=host,source=/etc/resolv.conf \
  --mount volume=dns,target=/etc/resolv.conf \
  --volume var-log,kind=host,source=/var/log \
  --mount volume=var-log,target=/var/log"
ExecStartPre=/usr/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/usr/bin/mkdir -p /var/log/containers
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/run/kubelet-pod.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
  --api-servers=http://c0.int.intra:8080 \
  --cni-conf-dir=/etc/kubernetes/cni/net.d \
  --container-runtime=docker \
  --register-node=true \
  --allow-privileged=true \
  --pod-manifest-path=/etc/kubernetes/manifests \
  --hostname-override=@@IP@@ \
  --cluster_dns=10.3.0.10 \
  --cluster_domain=cluster.local \
  --kubeconfig=/etc/kubernetes/worker-kubeconfig.yaml \
  --tls-cert-file=/etc/kubernetes/ssl/worker.pem \
  --tls-private-key-file=/etc/kubernetes/ssl/worker-key.pem
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/run/kubelet-pod.uuid
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target

I will name this template kubelet.service.w, in the tmp dir of the admin node :

cat kubelet.service.w | dsh -g kw -i -c 'export IP=`ip addr show eth0 | grep -Po "inet \K[\d.]+"`; sed s/@@IP@@/${IP}/g | sudo tee /etc/systemd/system/kubelet.service'

Set Up the kube-proxy Pod

Create a kube-proxy.yaml.w on the admin node :

apiVersion: v1
kind: Pod
metadata:
  name: kube-proxy
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: kube-proxy
    image: quay.io/coreos/hyperkube:v1.6.1_coreos.0
    command:
    - /hyperkube
    - proxy
    - --master=http://c0.int.intra:8080
    - --kubeconfig=/etc/kubernetes/worker-kubeconfig.yaml
    securityContext:
      privileged: true
    volumeMounts:
    - mountPath: /etc/ssl/certs
      name: "ssl-certs"
    - mountPath: /etc/kubernetes/worker-kubeconfig.yaml
      name: "kubeconfig"
      readOnly: true
    - mountPath: /etc/kubernetes/ssl
      name: "etc-kube-ssl"
      readOnly: true
  volumes:
  - name: "ssl-certs"
    hostPath:
      path: "/usr/share/ca-certificates"
  - name: "kubeconfig"
    hostPath:
      path: "/etc/kubernetes/worker-kubeconfig.yaml"
  - name: "etc-kube-ssl"
    hostPath:
      path: "/etc/kubernetes/ssl"

and push it to all worker nodes :

dsh -g kw 'sudo mkdir -p /etc/kubernetes/manifests/'
cat kube-proxy.yaml.w | dsh -g kw -i -c 'sudo tee /etc/kubernetes/manifests/kube-proxy.yaml'

Set Up kubeconfig

/etc/kubernetes/worker-kubeconfig.yaml

Again, a file to create on the admin node :

apiVersion: v1
kind: Config
clusters:
- name: local
  cluster:
    certificate-authority: /etc/kubernetes/ssl/ca.pem
users:
- name: kubelet
  user:
    client-certificate: /etc/kubernetes/ssl/worker.pem
    client-key: /etc/kubernetes/ssl/worker-key.pem
contexts:
- context:
    cluster: local
    user: kubelet
  name: kubelet-context
current-context: kubelet-context

and push :

cat worker-kubeconfig.yaml | dsh -g kw -i -c 'sudo tee /etc/kubernetes/worker-kubeconfig.yaml'

Start the worker services

dsh -g kw "sudo systemctl daemon-reload"

dsh -g kw "sudo systemctl start flanneld"
dsh -g kw "sudo systemctl start kubelet"

dsh -g kw "sudo systemctl enable flanneld"
dsh -g kw "sudo systemctl enable kubelet"

configure kubectl (kubernetes management tool)

On the admin node (or wherever you want, it’s a single binary), install kubectl :

curl -O https://storage.googleapis.com/kubernetes-release/release/v1.6.1/bin/linux/amd64/kubectl
chmod +x kubectl

Or, as the doc suggests, put the binary in a directory included in your PATH environment variable, for example :

mv kubectl /usr/local/bin/kubectl

export MASTER_HOST=kubernetes.int.intra
export CA_CERT=/home/core/kube-ssl/ca.pem
export ADMIN_KEY=/home/core/kube-ssl/admin-key.pem
export ADMIN_CERT=/home/core/kube-ssl/admin.pem

kubectl config set-cluster default-cluster --server=https://${MASTER_HOST} --certificate-authority=${CA_CERT}
kubectl config set-credentials default-admin --certificate-authority=${CA_CERT} --client-key=${ADMIN_KEY} --client-certificate=${ADMIN_CERT}
kubectl config set-context default-system --cluster=default-cluster --user=default-admin
kubectl config use-context default-system
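
At this point kubectl should be able to reach the secure endpoint of the API server; an optional sanity check :

kubectl config view
kubectl cluster-info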

Final checks

[core@admin ~]$ kubectl get nodes
NAME         STATUS                     AGE       VERSION
10.0.1.100   Ready,SchedulingDisabled   1h        v1.6.1+coreos.0
10.0.1.101   Ready                      3m        v1.6.1+coreos.0
10.0.1.102   Ready                      2m        v1.6.1+coreos.0
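
You can also ask the API server about the health of its core components (optional) :

kubectl get componentstatuses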

Got it to work…

addons

DNS service

apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: "KubeDNS"
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 10.3.0.10
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP


---


apiVersion: v1
kind: ReplicationController
metadata:
  name: kube-dns-v20
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    version: v20
    kubernetes.io/cluster-service: "true"
spec:
  replicas: 1
  selector:
    k8s-app: kube-dns
    version: v20
  template:
    metadata:
      labels:
        k8s-app: kube-dns
        version: v20
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
        scheduler.alpha.kubernetes.io/tolerations: '[{"key":"CriticalAddonsOnly", "operator":"Exists"}]'
    spec:
      containers:
      - name: kubedns
        image: gcr.io/google_containers/kubedns-amd64:1.8
        resources:
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
        livenessProbe:
          httpGet:
            path: /healthz-kubedns
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        readinessProbe:
          httpGet:
            path: /readiness
            port: 8081
            scheme: HTTP
          initialDelaySeconds: 3
          timeoutSeconds: 5
        args:
        - --domain=cluster.local.
        - --dns-port=10053
        ports:
        - containerPort: 10053
          name: dns-local
          protocol: UDP
        - containerPort: 10053
          name: dns-tcp-local
          protocol: TCP
      - name: dnsmasq
        image: gcr.io/google_containers/kube-dnsmasq-amd64:1.4
        livenessProbe:
          httpGet:
            path: /healthz-dnsmasq
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        args:
        - --cache-size=1000
        - --no-resolv
        - --server=127.0.0.1#10053
        - --log-facility=-
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
      - name: healthz
        image: gcr.io/google_containers/exechealthz-amd64:1.2
        resources:
          limits:
            memory: 50Mi
          requests:
            cpu: 10m
            memory: 50Mi
        args:
        - --cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1 >/dev/null
        - --url=/healthz-dnsmasq
        - --cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1:10053 >/dev/null
        - --url=/healthz-kubedns
        - --port=8080
        - --quiet
        ports:
        - containerPort: 8080
          protocol: TCP
      dnsPolicy: Default

Import this service into Kubernetes :

[core@admin tmp]$ kubectl create -f dns-addon.yml 
service "kube-dns" created
replicationcontroller "kube-dns-v20" created
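
To confirm that cluster DNS actually answers, one simple test (using the stock busybox image, which ships nslookup) is to run a throwaway pod and resolve the kubernetes service :

kubectl run -i -t busybox --image=busybox --restart=Never -- nslookup kubernetes.default
kubectl delete pod busybox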

Dashboard !

On the admin node, where kubectl is available, create the two following files :

kube-dashboard-rc.yaml

apiVersion: v1
kind: ReplicationController
metadata:
  name: kubernetes-dashboard-v1.6.0
  namespace: kube-system
  labels:
    k8s-app: kubernetes-dashboard
    version: v1.6.0
    kubernetes.io/cluster-service: "true"
spec:
  replicas: 1
  selector:
    k8s-app: kubernetes-dashboard
  template:
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
        version: v1.6.0
        kubernetes.io/cluster-service: "true"
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
        scheduler.alpha.kubernetes.io/tolerations: '[{"key":"CriticalAddonsOnly", "operator":"Exists"}]'
    spec:
      containers:
      - name: kubernetes-dashboard
        image: gcr.io/google_containers/kubernetes-dashboard-amd64:v1.6.0
        resources:
          limits:
            cpu: 100m
            memory: 50Mi
          requests:
            cpu: 100m
            memory: 50Mi
        ports:
        - containerPort: 9090
        livenessProbe:
          httpGet:
            path: /
            port: 9090
          initialDelaySeconds: 30
          timeoutSeconds: 30

kube-dashboard-svc.yaml

apiVersion: v1
kind: Service
metadata:
  name: kubernetes-dashboard
  namespace: kube-system
  labels:
    k8s-app: kubernetes-dashboard
    kubernetes.io/cluster-service: "true"
spec:
  selector:
    k8s-app: kubernetes-dashboard
  ports:
  - port: 80
    targetPort: 9090

Then import the services into the cluster :

kubectl create -f kube-dashboard-rc.yaml
kubectl create -f kube-dashboard-svc.yaml
kubectl get pods --namespace=kube-system

You can access the dashboard like this :

kubectl port-forward kubernetes-dashboard-v1.6.0-SOME-ID 9090 --namespace=kube-system
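
With the port-forward running, the dashboard is reachable on the machine running kubectl; for example, from another terminal :

curl -s http://localhost:9090/ | head

Then point your browser at http://localhost:9090/.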

(dashboard screenshot)

Test your cluster

Flannel

We will test the communication between containers

First, deploy a simple app

kubectl run hello-world --replicas=2 --labels="run=load-balancer-example" --image=gcr.io/google-samples/node-hello:1.0  --port=8080

Be patient, check the progress :

[core@admin tmp]$  kubectl get pods  --all-namespaces
NAMESPACE     NAME                                 READY     STATUS              RESTARTS   AGE
default       hello-world-3272482377-15dng         0/1       ContainerCreating   0          3s
default       hello-world-3272482377-nmkcm         0/1       ContainerCreating   0          3s
default       nuxeo-1718889206-h8wlf               1/1       Running             1          21h
kube-system   kube-apiserver-10.0.1.100            1/1       Running             0          22h
kube-system   kube-controller-manager-10.0.1.100   1/1       Running             0          22h
kube-system   kube-dns-v20-hp25x                   3/3       Running             3          21h
kube-system   kube-proxy-10.0.1.100                1/1       Running             0          22h
kube-system   kube-proxy-10.0.1.101                1/1       Running             1          21h
kube-system   kube-proxy-10.0.1.102                1/1       Running             1          21h
kube-system   kube-scheduler-10.0.1.100            1/1       Running             0          22h
kube-system   kubernetes-dashboard-v1.6.0-46x5q    1/1       Running             1          21h

until the two pods are reported as running.

We will enter a pod on host c1 and a pod on host c2, and from each pod, ping the other one :

On the first host c1 :

Run “docker ps” to find the ID of the hello-world container. For me, it’s e8ae791a00c2.

c1 ~ # docker exec -ti e8ae791a00c2 bash
root@hello-world-3272482377-15dng:/# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
11: eth0@if12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default 
    link/ether 02:42:0a:02:61:03 brd ff:ff:ff:ff:ff:ff
    inet 10.2.97.3/24 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:aff:fe02:6103/64 scope link 
       valid_lft forever preferred_lft forever

On the second host c2 :

c2 ~ # docker exec -ti 54dbf976dbc6 bash
root@hello-world-3272482377-nmkcm:/# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
15: eth0@if16: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default 
    link/ether 02:42:0a:02:38:04 brd ff:ff:ff:ff:ff:ff
    inet 10.2.56.4/24 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:aff:fe02:3804/64 scope link 
       valid_lft forever preferred_lft forever
root@hello-world-3272482377-nmkcm:/# ping 10.2.97.3
PING 10.2.97.3 (10.2.97.3): 56 data bytes
64 bytes from 10.2.97.3: icmp_seq=0 ttl=62 time=1.960 ms
64 bytes from 10.2.97.3: icmp_seq=1 ttl=62 time=0.970 ms
64 bytes from 10.2.97.3: icmp_seq=2 ttl=62 time=0.957 ms

On c1 :

root@hello-world-3272482377-15dng:/# ping 10.2.56.4
PING 10.2.56.4 (10.2.56.4): 56 data bytes
64 bytes from 10.2.56.4: icmp_seq=0 ttl=62 time=0.887 ms
64 bytes from 10.2.56.4: icmp_seq=1 ttl=62 time=0.770 ms

You are ready to host your apps.

Enjoy ;)