This post describes how to set up SSL certificate provisioning for Traefik ingress resources in a k3s cluster on ARM. We'll rely on the Traefik router that comes with k3s and use cert-manager next to it to take care of all the work necessary to obtain certificates from Let's Encrypt.

The instructions in this blog post were tested on k3s running on ARM architecture (Raspbian stretch on Raspberry Pi 3B/B+).

What is Traefik? Traefik is an edge router that helps you make the apps you run on a k3s cluster accessible from outside. It receives requests to the cluster and takes care of routing them to the right services in the cluster.

If you're using k3s, then it's likely that you already have Traefik running in your system, as k3s comes with Traefik by default.

$ kubectl get pod -n kube-system 
NAME                         READY   STATUS      RESTARTS   AGE
helm-install-traefik-sd4d5   0/1     Completed   1          38h
coredns-66f496764-5wwt7      1/1     Running     2          39h
svclb-traefik-s466c          3/3     Running     4          29h
svclb-traefik-jvksv          3/3     Running     3          29h
traefik-d869575c8-tw8dh      1/1     Running     1          29h

Looking at the pods in the kube-system namespace, we see that one Traefik pod is running, as well as one Traefik load balancer pod per node.

We could already use the existing Traefik installation to define plain http ingress resources for our own services. If you want to try it out, deploy the deployment.yaml, service.yaml and ingress.yaml defined below, after removing the tls field from ingress.yaml. Also, make sure that you have a DNS entry pointing to your cluster's IP, or simply make the domain specified in ingress.yaml resolvable locally by editing your /etc/hosts and adding a line like 127.0.1.1    test.whateveryourdomainis.com.
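For a quick local test, the hosts entry mentioned above could be added with a small helper like the following. This is only a sketch: the function name add_hosts_entry is made up for illustration, and the IP and domain you pass in are your own values.

```shell
# Append "<ip><TAB><domain>" to a hosts file unless the domain already appears in it.
# Usage: add_hosts_entry <ip> <domain> [hosts-file]   (default file: /etc/hosts)
add_hosts_entry() {
  ip="$1"
  domain="$2"
  hosts_file="${3:-/etc/hosts}"
  # -F treats the domain as a literal string, -w only matches whole words
  if grep -qwF "$domain" "$hosts_file" 2>/dev/null; then
    echo "$domain is already present in $hosts_file"
  else
    printf '%s\t%s\n' "$ip" "$domain" >> "$hosts_file"
  fi
}
```

Writing to /etc/hosts needs root privileges, so it is probably easiest to try the function against a scratch file first and then run it in a root shell. Afterwards, a plain curl http://test.whateveryourdomainis.com should reach the whoami service through Traefik.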

It is possible to use Traefik's built-in capability to obtain SSL certificates from the Let's Encrypt certificate authority. However, with the current version of k3s, this would require us to install k3s without Traefik and then install a customized version of Traefik that enables this functionality for our own services. If you're interested in going down that road, check out the official documentation and the issues you can run into while trying to achieve this.

The approach we'll be following is to use cert-manager together with the preinstalled Traefik instead. Cert-manager will take care of communicating with Let's Encrypt and tweak our cluster resources in a way that helps us solve the challenges that Let's Encrypt expects us to complete before they issue the certificate.

In order to enable https traffic to the pods running in our k3s cluster, we'll need to install cert-manager in the cluster. With cert-manager up and running, we can then set up an Issuer resource that will interact with Let's Encrypt. Once the issuer is up, we can use it to obtain certificates that we can use in our Traefik ingress resources. We'll use the latest cert-manager release (0.11 at the time of writing) and follow the official documentation for that release in defining the Issuer and Certificate resources.

To install cert-manager in k3s running on ARM, we'll download the official yaml template.

wget https://github.com/jetstack/cert-manager/releases/download/v0.11.0/cert-manager.yaml 

Before applying cert-manager.yaml, however, we'll need to modify the image names in the Deployment resources (cert-manager, cert-manager-cainjector and cert-manager-webhook) to point to the officially provided ARM versions. Edit the file and add "-arm" at the end of the image names in these three Deployments. For example, you need to change quay.io/jetstack/cert-manager-cainjector:v0.11.0 into quay.io/jetstack/cert-manager-cainjector-arm:v0.11.0.
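Instead of editing the file by hand, the renaming could be scripted with sed. The following is a sketch rather than an official procedure: the function name add_arm_suffix is made up, and it assumes that every image reference in the manifest has the form quay.io/jetstack/cert-manager...:tag, so check the result before applying it.

```shell
# Print a copy of a cert-manager manifest with "-arm" appended to every
# quay.io/jetstack/cert-manager* image name (the tag is left untouched).
# Usage: add_arm_suffix cert-manager.yaml > cert-manager-arm.yaml
add_arm_suffix() {
  # 1st expression: insert "-arm" in front of the colon that starts the tag
  # 2nd expression: undo the doubling for names that already ended in "-arm"
  sed -E \
    -e 's#(quay\.io/jetstack/cert-manager[a-z-]*):#\1-arm:#' \
    -e 's#-arm-arm:#-arm:#' \
    "$1"
}
```

Running grep image: cert-manager-arm.yaml on the output is a quick way to confirm that all three Deployments now point to the -arm images.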

Now we can apply the template. In my case, I also had to explicitly create the cert-manager namespace by applying the official namespace template (creating the namespace by hand with kubectl create namespace cert-manager didn't help). Without this, Kubernetes kept failing to create some of the resources, which caused the cert-manager-webhook pod to crash.

kubectl apply -f cert-manager.yaml \
  -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.11/deploy/manifests/01-namespace.yaml

Wait until the cert-manager pods are all up and running.


kubectl get pods --namespace cert-manager

NAME                                       READY   STATUS    RESTARTS   AGE
cert-manager-webhook-7f4d7d865c-dh94q      1/1     Running   3          23h
cert-manager-cainjector-668c57f6bc-cgbxw   1/1     Running   1          23h
cert-manager-5d67f89878-vdl9p              1/1     Running   1          23h
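Rather than polling kubectl get pods by hand, you could let kubectl block until the pods are ready. This is a small sketch; the wrapper name wait_for_cert_manager and the five-minute default timeout are my own choices.

```shell
# Block until every pod in the cert-manager namespace reports Ready,
# or give up once the timeout (default: 300s) has passed.
# Usage: wait_for_cert_manager [timeout]
wait_for_cert_manager() {
  kubectl wait pods --all \
    --namespace cert-manager \
    --for=condition=Ready \
    --timeout="${1:-300s}"
}
```

On a Raspberry Pi, pulling the images can take a while, so a generous timeout such as wait_for_cert_manager 600s may be more realistic.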

We can now create a certificate issuer. Actually, we'll create two: one pointing to Let's Encrypt's staging API and one talking to the production API. We'll first test our setup with the staging issuer and only move to the production issuer once requesting certificates from the staging issuer works. The reason for this is that Let's Encrypt only issues a limited number of certificates per week. Once you go beyond that rate limit, which can easily happen if things don't work right away and you need to debug the moving parts in your setup, the servers will reject further requests for a certain time.

We'll take the issuer templates from the official documentation as is. The only field you'll have to change is the email field, where you'll need to fill in your email address.

$ cat <<EOF > le-staging.yaml
apiVersion: cert-manager.io/v1alpha2
kind: Issuer
metadata:
  name: letsencrypt-staging
  namespace: default
spec:
  acme:
    # The ACME server URL
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    # Email address used for ACME registration
    email: user@example.com  # placeholder, replace with your own email address
    # Name of a secret used to store the ACME account private key
    privateKeySecretRef:
      name: letsencrypt-staging
    # Enable the HTTP-01 challenge provider
    solvers:
    # An empty 'selector' means that this solver matches all domains
    - selector: {}
      http01:
        ingress:
          class: traefik
EOF
    
$ cat <<EOF > le-prod.yaml
apiVersion: cert-manager.io/v1alpha2
kind: Issuer
metadata:
  name: letsencrypt-prod
  namespace: default
spec:
  acme:
    # The ACME server URL
    server: https://acme-v02.api.letsencrypt.org/directory
    # Email address used for ACME registration
    email: user@example.com  # placeholder, replace with your own email address
    # Name of a secret used to store the ACME account private key
    privateKeySecretRef:
      name: letsencrypt-prod
    # Enable the HTTP-01 challenge provider
    solvers:
    # An empty 'selector' means that this solver matches all domains
    - selector: {}
      http01:
        ingress:
          class: traefik 
EOF

$ kubectl apply -f le-staging.yaml -f le-prod.yaml 
issuer.cert-manager.io/letsencrypt-staging created
issuer.cert-manager.io/letsencrypt-prod created

$ kubectl describe issuer
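In the describe output, the interesting part is the Ready condition of each issuer. If you prefer a one-line check, something like the following sketch could be used; the helper name issuer_ready is hypothetical, and the namespace defaults to default as in the templates above.

```shell
# Print the status of an Issuer's Ready condition. It should read "True"
# once the ACME account has been registered with Let's Encrypt.
# Usage: issuer_ready <issuer-name> [namespace]
issuer_ready() {
  kubectl get issuer "$1" \
    --namespace "${2:-default}" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
}
```

For example, issuer_ready letsencrypt-staging should print True before you go on to request a certificate.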

Now let's try requesting a certificate from the Let's Encrypt staging API. Make sure to reference your own domain and/or subdomains in commonName and dnsNames.

$ cat <<EOF > certificate.yaml
apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: test-things-on-top-de-staging
  namespace: default
spec:
  secretName: test-things-on-top-de-tls-staging
  issuerRef:
    name: letsencrypt-staging
  commonName: test.things-on-top-of-other-things.de
  dnsNames:
  - test.things-on-top-of-other-things.de    
EOF
    
$ kubectl apply -f certificate.yaml 
certificate.cert-manager.io/test-things-on-top-de-staging created  

Now let's check if this worked.

If kubectl describe certificate test-things-on-top-de-staging shows an output similar to the following, then this means we have a working setup.

$ kubectl describe certificate test-things-on-top-de-staging
Name:         test-things-on-top-de-staging
Namespace:    default
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"cert-manager.io/v1alpha2","kind":"Certificate","metadata":{"annotations":{},"name":"example-com","namespace":"default"},"sp...
API Version:  cert-manager.io/v1alpha2
Kind:         Certificate
Metadata:
  Creation Timestamp:  2019-10-21T06:37:15Z
  Generation:          1
  Resource Version:    52897
  Self Link:           /apis/cert-manager.io/v1alpha2/namespaces/default/certificates/test-things-on-top-de-staging
  UID:                 7f8fd719-5e52-4897-b154-c3e1fd31579a
Spec:
  Common Name:  test.things-on-top-of-other-things.de
  Dns Names:
    test.things-on-top-of-other-things.de
  Issuer Ref:
    Name:       letsencrypt-staging
  Secret Name:  test-things-on-top-de-tls-staging
Status:
  Conditions:
    Last Transition Time:  2019-10-21T06:48:59Z
    Message:               Certificate is up to date and has not expired
    Reason:                Ready
    Status:                True
    Type:                  Ready
  Not After:               2020-01-19T05:48:57Z
Events:
  Type    Reason        Age   From          Message
  ----    ------        ----  ----          -------
  Normal  GeneratedKey  11m   cert-manager  Generated a new private key
  Normal  Requested     11m   cert-manager  Created new CertificateRequest resource "test-things-on-top-de-staging-1007754357"
  Normal  Issued        18s   cert-manager  Certificate issued successfully
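Instead of rerunning kubectl describe until the Issued event shows up, you could wait on the Ready condition shown in the status above. A sketch, with a made-up wrapper name and a default timeout chosen to match the 15-20 minutes mentioned below:

```shell
# Block until the Certificate's Ready condition is True,
# or give up once the timeout (default: 20m) has passed.
# Usage: wait_for_certificate <certificate-name> [timeout]
wait_for_certificate() {
  kubectl wait "certificate/$1" \
    --for=condition=Ready \
    --timeout="${2:-20m}"
}
```

In this setup the call would be wait_for_certificate test-things-on-top-de-staging. If it times out, the Events section of kubectl describe certificate is the next place to look.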

If we don't see "Certificate issued" after 15-20 minutes, then something is probably wrong in the setup. The first thing to do is to have a look at the cert-manager logs. One issue I ran into was that the Let's Encrypt servers couldn't reach the challenge files that cert-manager created in response to the http01 challenge. The reason for this was that I use a reverse proxy on an AWS t2.micro instance to access my Raspberry Pi cluster, and I hadn't set up nginx to properly forward requests for the /.well-known/acme-challenge path on port 80.

$ kubectl logs cert-manager-5d67f89878-vdl9p -n cert-manager | tail -3
I1021 06:46:50.023489       1 ingress.go:91] cert-manager/controller/challenges/http01/selfCheck/http01/ensureIngress "level"=0 "msg"="found one existing HTTP01 solver ingress" "dnsName"="test.things-on-top-of-other-things.de" "related_resource_kind"="Ingress" "related_resource_name"="cm-acme-http-solver-n5nx6" "related_resource_namespace"="default" "resource_kind"="Challenge" "resource_name"="example-com-1007754357-1239984667-2688243760" "resource_namespace"="default" "type"="http-01"
E1021 06:46:50.606362       1 sync.go:184] cert-manager/controller/challenges "msg"="propagation check failed" "error"="wrong status code '404', expected '200'" "dnsName"="test.things-on-top-of-other-things.de" "resource_kind"="Challenge" "resource_name"="example-com-1007754357-1239984667-2688243760" "resource_namespace"="default" "type"="http-01"
I1021 06:46:50.606662       1 controller.go:135] cert-manager/controller/challenges "level"=0 "msg"="finished processing work item" "key"="default/example-com-1007754357-1239984667-2688243760"
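The 404 in the log above can be reproduced from outside the cluster, which helps to tell a proxy problem from a cert-manager problem. The helpers below are a sketch with made-up names; I'm assuming that the challenge token can be read from the output of kubectl describe challenge.

```shell
# List pending challenges together with the ingresses in the namespace,
# including the temporary solver ingress (cm-acme-http-solver-...).
# Usage: list_challenges [namespace]
list_challenges() {
  kubectl get challenges,ingress --namespace "${1:-default}"
}

# Print the HTTP status code returned for an ACME challenge URL.
# cert-manager's self-check expects a 200 here.
# Usage: check_acme_challenge <domain> <token>
check_acme_challenge() {
  curl --silent --output /dev/null --write-out '%{http_code}\n' \
    "http://$1/.well-known/acme-challenge/$2"
}
```

If check_acme_challenge prints 404 when run from a machine outside your network but 200 when run against the cluster directly, the reverse proxy in front of the cluster is the likely culprit.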

Getting the setup with all its moving parts right may take a while and involve quite some googling around. But it's worth the effort and very rewarding once you finally see "Certificate issued" appear in your certificate events.

We can now deploy a simple pod with a Traefik ingress that makes use of the certificate.

$ cat <<EOF > deployment.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: whoami
  namespace: default
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: whoami
    spec:
      containers:
      - image: containous/whoami
        name: whoami
        ports:
        - containerPort: 80
          name: http 
EOF

$ cat <<EOF > service.yaml
apiVersion: v1
kind: Service
metadata:
  name: whoami
  namespace: default
spec:
  selector:
    app: whoami
  ports:
  - protocol: TCP
    port: 80
    name: http    
EOF

$ cat <<EOF > ingress.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: whoami
  annotations:
    kubernetes.io/ingress.class: traefik
spec:
  tls:
    - secretName: test-things-on-top-de-tls-staging
  rules:
  - host: test.things-on-top-of-other-things.de
    http:
      paths:
        - path: /
          backend:
            serviceName: whoami
            servicePort: 80    
EOF

$ kubectl apply -f deployment.yaml -f service.yaml -f ingress.yaml
deployment.extensions/whoami created
service/whoami created
ingress.extensions/whoami created

In ingress.yaml, make sure that serviceName and servicePort match the metadata.name and spec.ports.port you assigned in service.yaml. And in order for the certificate to get added to your ingress, copy the secretName declared in certificate.yaml into spec.tls.secretName in ingress.yaml.

As we specified the staging issuer in the certificate.yaml, using that certificate in our ingress results in a security warning when we try to call our domain. If we use curl in insecure mode, we can circumvent the verification failure for the time being and get a response.

$ curl https://test.things-on-top-of-other-things.de
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.haxx.se/docs/sslcerts.html

curl performs SSL certificate verification by default, using a "bundle"
 of Certificate Authority (CA) public keys (CA certs). If the default
 bundle file isn't adequate, you can specify an alternate file
 using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
 the bundle, the certificate verification probably failed due to a
 problem with the certificate (it might be expired, or the name might
 not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
 the -k (or --insecure) option.
 
$ curl --insecure https://test.things-on-top-of-other-things.de
Hostname: whoami-7f784bd755-szq8p
IP: 127.0.0.1
IP: ::1
IP: 10.42.0.19
IP: fe80::58de:caff:fe9f:93e6
RemoteAddr: 10.42.1.17:55760
GET / HTTP/1.1
Host: test.things-on-top-of-other-things.de
User-Agent: curl/7.52.1
Accept: */*
Accept-Encoding: gzip
X-Forwarded-For: 10.42.0.22
X-Forwarded-Host: test.things-on-top-of-other-things.de
X-Forwarded-Port: 443

But now let's exchange letsencrypt-staging for letsencrypt-prod in certificate.yaml and reissue the certificate by reapplying it with kubectl apply -f certificate.yaml. Once the new certificate has been issued, you can send a regular request to your domain with curl.

$ curl https://test.things-on-top-of-other-things.de
Hostname: whoami-7f784bd755-szq8p
IP: 127.0.0.1
IP: ::1
IP: 10.42.0.19
IP: fe80::58de:caff:fe9f:93e6
RemoteAddr: 10.42.1.17:55760
GET / HTTP/1.1
Host: test.things-on-top-of-other-things.de
User-Agent: curl/7.52.1
Accept: */*
Accept-Encoding: gzip
X-Forwarded-For: 10.42.0.22
X-Forwarded-Host: test.things-on-top-of-other-things.de
X-Forwarded-Port: 443

You should now also be able to visit the domain with a browser of your choice without running into any security warnings.
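To double-check which certificate Traefik is actually serving, you could inspect it with openssl. This is a sketch with a made-up function name; it assumes openssl is installed on the machine you run it from and that the domain is reachable on port 443.

```shell
# Print issuer and validity dates of the certificate a host serves on port 443.
# Usage: served_cert_info <domain>
served_cert_info() {
  # s_client fetches the certificate chain; closing stdin ends the session
  # right after the handshake. x509 then prints the fields we care about.
  openssl s_client -connect "$1:443" -servername "$1" </dev/null 2>/dev/null \
    | openssl x509 -noout -issuer -dates
}
```

After the switch to the production issuer, the issuer line should name the regular Let's Encrypt CA rather than the staging one. If it still shows the staging CA, the old certificate is probably still stored in the secret referenced by the ingress.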