Bootstrapping Docker Swarm Part 3: Getting SSL Certificates

This is part of a multi-part series on getting Docker Swarm up and running. You might want to start with the original post called Bootstrapping Docker Swarm.

There are two containers involved in creating SSL certificates: acme-dns and certbot. The acme-dns container will let us get wildcard certificates from Let’s Encrypt without having to use one of the supported DNS providers. If you already have a supported DNS provider such as Route53 then you do not need acme-dns. But if you’re hosting your own DNS or using your organization’s DNS servers, acme-dns is the way to go.

Configuring acme-dns

First we’ll install acme-dns. Again on the infra host, create a directory where we will put three files:

  • Dockerfile
  • docker-compose.yml
  • config.cfg

The Dockerfile for this container looks like this:

FROM joohoi/acme-dns:v0.8

# move all of the configuration files into place
COPY config.cfg /config.cfg

EXPOSE 53/tcp 53/udp 5380/tcp
VOLUME ["/var/lib/acme-dns"]
ENTRYPOINT ["./acme-dns", "-c", "/config.cfg"]

The docker-compose.yml file looks like this:

version: "3.2"

services:
  acme-dns:
    build: .
    restart: always
    network_mode: host
    container_name: acme-dns
    image: ${IMAGE:-registry.example.com/infra/acme-dns}:${VERSION?undefined VERSION}
    volumes:
      - type: bind
        source: /srv/acme-dns
        target: /var/lib/acme-dns

And the config file looks like this:

[general]
# DNS interface. Note that systemd-resolved may reserve port 53 on 127.0.0.53
# In this case acme-dns will error out and you will need to define the listening interface
# for example: listen = "127.0.0.1:53"
listen = "0.0.0.0:53"
# protocol, "both", "both4", "both6", "udp", "udp4", "udp6" or "tcp", "tcp4", "tcp6"
protocol = "both"
# domain name to serve the requests off of
domain = "acme.example.com"
# zone name server
nsname = "acme.example.com"
# admin email address, where @ is substituted with .
nsadmin = "admin.example.com"
# predefined records served in addition to the TXT
records = [
    # domain pointing to the public IP of your acme-dns server 
    "acme.example.com. A 1.2.3.4",
    # specify that acme.example.com will resolve any *.acme.example.com records
    "acme.example.com. NS acme.example.com.",
]
# debug messages from CORS etc
debug = false

[database]
engine = "sqlite3"
connection = "/var/lib/acme-dns/acme-dns.db"

[api]
# listen ip eg. 127.0.0.1
ip = "0.0.0.0"
# disable registration endpoint
disable_registration = false
# listen port, eg. 443 for default HTTPS
port = "5380"
# possible values: "letsencrypt", "letsencryptstaging", "cert", "none"
tls = "letsencrypt"
# only used if tls = "letsencrypt"
acme_cache_dir = "/var/lib/acme-dns/api-certs"
# CORS AllowOrigins, wildcards can be used
corsorigins = ["*"]
# use HTTP header to get the client ip
use_header = false
# header name to pull the ip address / list of ip addresses from
header_name = "X-Forwarded-For"

[logconfig]
# logging level: "error", "warning", "info" or "debug"
loglevel = "debug"
# possible values: stdout, TODO file & integrations
logtype = "stdout"
# file path for logfile TODO
# logfile = "./acme-dns.log"
# format, either "json" or "text"
logformat = "text"

Replace “1.2.3.4” with your infra host’s public IP address and “example.com” with your domain name. Replace “admin.example.com” with your email address, written with a “.” in place of the “@”. We’re going to deploy this container like this:

mkdir -p /srv/acme-dns
VERSION=latest IMAGE=acme-dns docker-compose build
VERSION=latest IMAGE=acme-dns docker-compose up -d
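
The VERSION and IMAGE variables in those commands feed the image line in docker-compose.yml. As a rough illustration of how that substitution behaves, here is a small Python sketch. The function name resolve_image is made up for this example and is not part of docker-compose; it only mimics the two substitution forms used in the file.

```python
DEFAULT_IMAGE = "registry.example.com/infra/acme-dns"


def resolve_image(env, default_image=DEFAULT_IMAGE):
    """Mimic ${IMAGE:-default}:${VERSION?undefined VERSION} from docker-compose.yml."""
    # ":-" falls back to the default when IMAGE is unset or empty
    image = env.get("IMAGE") or default_image
    # "?" aborts with the given message when VERSION is unset
    if "VERSION" not in env:
        raise KeyError("undefined VERSION")
    return "{}:{}".format(image, env["VERSION"])


print(resolve_image({"IMAGE": "acme-dns", "VERSION": "latest"}))  # acme-dns:latest
```

So leaving out IMAGE tags the build with the registry name, while leaving out VERSION stops docker-compose with the “undefined VERSION” message.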

We need to configure acme-dns before we can use it. The configuration above and the steps below assume that a DNS A record for acme.example.com points at your infra host’s public IP address. If that is the case then you can run this command:

curl -X POST -H 'Content-Type: application/json' -d '{"allowfrom": ["127.0.0.1/32", "10.0.0.0/24"]}' https://acme.example.com:5380/register

Add any other networks to the allowfrom list that might need to connect to the service. The certbot container will definitely need to connect to the acme-dns container. Remember that this container is using host networking.
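
If you are unsure whether a network is written correctly, you can check the list before sending it. This is only a sketch: registration_body is a name invented here, and it does nothing more than validate each CIDR range with Python’s ipaddress module and build the same JSON body that the curl command sends.

```python
import ipaddress
import json


def registration_body(networks):
    """Build the JSON body for POST /register, rejecting malformed CIDR ranges."""
    allowfrom = []
    for net in networks:
        # ip_network raises ValueError on bad input, including host bits set
        allowfrom.append(str(ipaddress.ip_network(net)))
    return json.dumps({"allowfrom": allowfrom})


print(registration_body(["127.0.0.1/32", "10.0.0.0/24"]))
```

A range such as 10.0.0.5/24 is rejected because it has host bits set; write it as 10.0.0.0/24 instead.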

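The registration command that you ran above will return credentials. Save those credentials: the acme-dns-auth script shown later looks them up by domain name in /etc/letsencrypt/acmedns-registration.json inside the certbot container. Then go back to the config.cfg file and change disable_registration from false to true and redeploy the container.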
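
One way to save the credentials in the shape that the acme-dns-auth script reads (a JSON object keyed by domain name, with username, password and subdomain fields) is sketched below. The helper name store_registration is hypothetical. The host-side path /srv/certbot/config/acmedns-registration.json is an assumption that follows from the bind mount in the certbot docker-compose.yml shown later in this post.

```python
import json
import os


def store_registration(path, domain, registration):
    """Add one domain's acme-dns registration to the JSON credentials file."""
    data = {}
    if os.path.exists(path):
        with open(path) as fh:
            data = json.load(fh)
    # acme-dns-auth strips a leading "*." and looks up the bare domain
    data[domain] = registration
    # create the file with owner-only permissions since it holds a password
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        json.dump(data, fh, indent=2)
    return data
```

You would call it once per domain, passing the JSON object that the register call returned, for example store_registration("/srv/certbot/config/acmedns-registration.json", "example.com", registration). The directory has to exist first; it is created in the certbot section.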

The last step to using acme-dns is to update your DNS records with your DNS server. You should already have an A record that says that acme.example.com points at the public IP address for your infra host. We’re going to add these additional records:

  • acme.example.com. NS acme.example.com.
  • _acme-challenge.example.com CNAME <subdomain>.acme.example.com

The first one there is an NS record on acme.example.com which says that acme.example.com is responsible for resolving DNS queries for anything under acme.example.com. The second is a CNAME record that says that lookups for _acme-challenge.example.com will instead return the results for <subdomain>.acme.example.com. Because everything under acme.example.com is served by acme-dns, this means that acme-dns will respond to queries for <subdomain>.acme.example.com. The value for the subdomain comes from the credentials returned by the registration.

If you happen to have multiple domains that you’re doing this for (example.com, example.org, example.net) you can use this to get wildcard certificates for all of them. You just need to add one record to each zone, like this:

  • _acme-challenge.example.net. CNAME <subdomain>.acme.example.com

In this case we are saying that lookups for _acme-challenge.example.NET (rather than .COM) will be answered by our DNS server running at acme.example.com. That’s it. Make it work with as many domains as you want.
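
If you have more than a couple of domains it may be easier to generate the records than to type them. The sketch below is only an illustration: challenge_cnames is a made-up helper, and the fulldomain value in the usage note is a placeholder for the one in your own registration credentials. The label is spelled _acme-challenge, with a hyphen, because that is the name Let’s Encrypt queries for DNS validation.

```python
def challenge_cnames(domains, fulldomain):
    """Return one zone-file style CNAME line per domain."""
    # normalise to fully qualified names with a single trailing dot
    target = fulldomain.rstrip(".") + "."
    records = []
    for domain in domains:
        name = "_acme-challenge." + domain.rstrip(".") + "."
        records.append("{} CNAME {}".format(name, target))
    return records


for line in challenge_cnames(["example.com", "example.net"], "subdomain.acme.example.com"):
    print(line)
```

Every domain points at the same acme-dns name here, which matches the single registration used in this post.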

Now let’s get certbot configured.

Configuring certbot

The certbot container will handle creating and renewing certificates. Make a new directory on your infra host and we’re going to create seven files:

  • Dockerfile
  • docker-compose.yml
  • entrypoint
  • acme-dns-auth
  • create-certificate
  • renew-certificates
  • update-load-balancers

The Dockerfile for this container looks like this:

FROM debian:buster-slim

ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y --no-install-recommends certbot rsync openssh-client

# move our stuff into place
COPY /entrypoint /
COPY /acme-dns-auth /usr/local/bin/acme-dns-auth
COPY /create-certificate /usr/local/bin/create-certificate
COPY /renew-certificates /usr/local/bin/renew-certificates
COPY /update-load-balancers /usr/local/bin/update-load-balancers
RUN chmod +x /entrypoint /usr/local/bin/acme-dns-auth /usr/local/bin/create-certificate /usr/local/bin/renew-certificates /usr/local/bin/update-load-balancers

VOLUME ["/etc/letsencrypt", "/var/log/letsencrypt"]
ENTRYPOINT ["/entrypoint"]

The docker-compose.yml file looks like this:

version: "3.2"

services:
  certbot:
    build: .
    restart: always
    network_mode: host
    container_name: certbot
    image: ${IMAGE:-registry.example.com/infra/certbot}:${VERSION?undefined VERSION}
    volumes:
      - type: bind
        source: /home/admin/.ssh
        target: /root/.ssh
        read_only: true
      - type: bind
        source: /srv/certbot/config
        target: /etc/letsencrypt
      - type: bind
        source: /srv/certbot/logs
        target: /var/log/letsencrypt

The entrypoint file looks like this:

#!/bin/bash

# try renewing every four hours
trap exit TERM; while :; do /usr/local/bin/renew-certificates; sleep 4h & wait ${!}; done;

The acme-dns-auth file looks like this:

#!/usr/bin/env python3

import json
import os
import requests
import sys


# URL to acme-dns instance
API_URL = "https://acme.example.com:5380"

# path for acme-dns credential storage
CREDENTIALS_PATH = "/etc/letsencrypt/acmedns-registration.json"

# DO NOT EDIT BELOW HERE

DOMAIN = os.environ["CERTBOT_DOMAIN"]
if DOMAIN.startswith("*."):
    DOMAIN = DOMAIN[2:]

VALIDATION_TOKEN = os.environ["CERTBOT_VALIDATION"]


class Client(object):
    """
    Handles the communication with ACME-DNS API
    """

    def __init__(self, api_url):
        self.api_url = api_url

    def update_txt_record(self, account, txt):
        """Updates the TXT challenge record to ACME-DNS subdomain."""
        update = {
            "subdomain": account["subdomain"],
            "txt": txt,
        }
        headers = {
            "X-Api-User": account["username"],
            "X-Api-Key": account["password"],
            "Content-Type": "application/json",
        }
        res = requests.post(
            "{}/update".format(self.api_url),
            headers=headers,
            data=json.dumps(update),
        )

        if res.status_code != 200:
            msg = ("Encountered an error while trying to update TXT record:\n"
                   "------- Request headers:\n{}\n"
                   "------- Request body:\n{}\n"
                   "------- Response HTTP status: {}\n"
                   "------- Response body: {}")
            s_headers = json.dumps(headers, indent=2, sort_keys=True)
            s_update = json.dumps(update, indent=2, sort_keys=True)
            try:
                s_body = json.dumps(res.json(), indent=2, sort_keys=True)
            except ValueError:
                # response body was not JSON, so show it as-is
                s_body = res.text
            print(msg.format(s_headers, s_update, res.status_code, s_body))
            sys.exit(1)


class Credentials(object):
    def __init__(self, storage_path):
        self.storage_path = storage_path
        self._data = self.load()

    def load(self):
        """Reads the storage content from the disk to a dict structure"""
        data = dict()
        file_data = ""

        try:
            with open(self.storage_path, "r") as fh:
                file_data = fh.read()
        except IOError as e:
            print("ERROR: credentials data file cannot be read: {}".format(e))
            sys.exit(1)

        try:
            data = json.loads(file_data)
        except ValueError:
            if len(file_data) > 0:
                # credentials file is corrupted
                print("ERROR: credentials data file is corrupted")
                sys.exit(1)

        return data

    def fetch(self, key):
        """Gets configuration value from storage"""
        try:
            return self._data[key]
        except KeyError:
            return None


if __name__ == "__main__":
    client = Client(API_URL)
    storage = Credentials(CREDENTIALS_PATH)

    # Check if an account already exists in storage
    print("fetching credentials for {}".format(DOMAIN))
    account = storage.fetch(DOMAIN)
    if account is None:
        print("ERROR: no credentials found for {}".format(DOMAIN))
        sys.exit(1)

    # Update the TXT record in acme-dns instance
    client.update_txt_record(account, VALIDATION_TOKEN)

The create-certificate file looks like this:

#!/bin/bash
exec certbot certonly \
    --email=admin@example.com --agree-tos --no-eff-email --manual-public-ip-logging-ok \
    --manual --manual-auth-hook /usr/local/bin/acme-dns-auth \
    --preferred-challenges=dns \
    --post-hook /usr/local/bin/update-load-balancers \
    "$@"

The renew-certificates file looks like this:

#!/bin/bash
exec certbot renew --post-hook /usr/local/bin/update-load-balancers "$@"

The update-load-balancers file looks like this:

#!/bin/bash

# this script should be called as a "post hook" for the certbot client. all
# certificates will be rewritten when any renew happens which is fine. it will
# then attempt to reload haproxy in docker.

# exit immediately on errors
set -e

PATH_TO_LIVE=/etc/letsencrypt/live
PATH_TO_TARGET=/etc/letsencrypt/haproxy
LOAD_BALANCERS="lb01.example.com lb02.example.com"
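
# check for the live directory first: with "set -e" a failing "ls" below would
# abort the script before the error message could be printed
if [ ! -d "$PATH_TO_LIVE" ]; then
    echo "ERROR: could not find live certificates"
    exit 1
fi

RENEWED_LINEAGE=$(ls "$PATH_TO_LIVE")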

# need a place to put our certificates
mkdir -p $PATH_TO_TARGET
chmod 700 $PATH_TO_TARGET

# for each domain create a concatenated pem file
for DOMAIN in $RENEWED_LINEAGE; do
    if [ -d "$PATH_TO_LIVE/$DOMAIN" ]; then
        echo "assembling certificate $DOMAIN for load balancers"
        cat "$PATH_TO_LIVE/$DOMAIN/privkey.pem" "$PATH_TO_LIVE/$DOMAIN/fullchain.pem" > "$PATH_TO_TARGET/$DOMAIN.pem"
        chmod 600 "$PATH_TO_TARGET/$DOMAIN.pem"
    fi
done

# rsync all certificates over to the load balancers
for LOAD_BALANCER in $LOAD_BALANCERS; do
    echo "transferring certificates to $LOAD_BALANCER"
    rsync -rptgocq --delete-after --timeout=300 --rsh="ssh -l admin" --rsync-path="sudo rsync" $PATH_TO_TARGET/ $LOAD_BALANCER:/srv/haproxy/certs/
    CONTAINER=$(ssh admin@$LOAD_BALANCER "docker container ls --filter name=haproxy_frontend --quiet")

    if [ -n "$CONTAINER" ]; then
        ssh admin@$LOAD_BALANCER "docker kill --signal USR2 $CONTAINER 1>/dev/null 2>&1"
        echo "signaled haproxy_frontend to reload certificates on $LOAD_BALANCER"
    else
        echo "ERROR: could not find haproxy_frontend on $LOAD_BALANCER"
    fi
done
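
The assembly step in that script is the part people most often get wrong: each file handed to the load balancer holds the private key followed by the full certificate chain. Purely as an illustration, here is the same step sketched in Python. The function assemble_haproxy_pems is not part of the container; it just mirrors the loop in update-load-balancers.

```python
import os


def assemble_haproxy_pems(live_dir, target_dir):
    """Write <target_dir>/<name>.pem as privkey.pem followed by fullchain.pem."""
    os.makedirs(target_dir, mode=0o700, exist_ok=True)
    written = []
    for name in sorted(os.listdir(live_dir)):
        lineage = os.path.join(live_dir, name)
        if not os.path.isdir(lineage):
            continue  # only directories hold certificates, like the -d test in bash
        parts = []
        for filename in ("privkey.pem", "fullchain.pem"):
            with open(os.path.join(lineage, filename)) as fh:
                parts.append(fh.read())
        out_path = os.path.join(target_dir, name + ".pem")
        # owner-only permissions, matching the chmod 600 in the shell script
        fd = os.open(out_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        with os.fdopen(fd, "w") as fh:
            fh.write("".join(parts))
        written.append(out_path)
    return written
```

Calling assemble_haproxy_pems("/etc/letsencrypt/live", "/etc/letsencrypt/haproxy") would produce the same files as the shell loop, one per certificate directory.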

Before you can continue you will need to make changes to most of these scripts. For example:

  • You must be able to SSH from the infra host to the load balancer hosts as some user without a password, using SSH keys. The docker-compose.yml file assumes that this user is “admin” and mounts /home/admin/.ssh into the container. (Further, whichever user you use must be able to run the docker command and run rsync through sudo on the load balancer hosts.)
  • In the acme-dns-auth file you must change the hostname for connecting to acme-dns.
  • In the create-certificate file you must change the email address to your email address.
  • In the update-load-balancers file you must change the hostnames of the load balancers and you must change the username from admin to whatever you set in the docker-compose.yml file.

Now to deploy the certbot container you can run these commands on the infra host:

mkdir -p /srv/certbot/config
mkdir -p /srv/certbot/logs

VERSION=latest IMAGE=certbot docker-compose build
VERSION=latest IMAGE=certbot docker-compose up -d

Now you probably want to create some certificates. You can do that by running this command on the infra host:

docker exec -it certbot create-certificate -d example.com -d "*.example.com"

Within a short bit you’ll have a certificate waiting for you. The post hook will then try to transfer it to the load balancers, and that update will fail because they are not configured yet. We will configure them later and then you can run this command to update the load balancers:

docker exec -it certbot update-load-balancers

Or if you just want to get a list of certificates that you’ve generated you can run this command:

docker exec -it certbot certbot certificates

Next Steps

In this part we got acme-dns and certbot up and running so that you can generate wildcard certificates for one or more domains. You might not have needed acme-dns but maybe at least you got pointers for setting up certbot by itself! In the next part we will set up the load balancers so that we can give our cluster a front door.

There are still a lot more steps! Read on for the rest of them.