Bootstrapping Docker Swarm

I’m a solo developer on my current team and my responsibilities range from installing operating systems to configuring and optimizing databases to loading data into those databases and writing programs to manipulate and query the data once it is there. Suffice it to say that I have to do it all. So when I was looking for a way to make my software deployments easier, I wanted to avoid big tools like Kubernetes that would have me spending large portions of my time supporting the tooling when I really just wanted to use it.

To that end, I began pursuing Docker Swarm as a halfway point between plain Docker, which is great for single hosts, and Kubernetes, which I don’t have an entire devops team to support. Docker Swarm provides container orchestration: it dynamically assigns containers to one or more hosts and restarts them on other hosts when a host fails. That’s it. That’s what it does. Getting it running is incredibly easy, but getting it working takes some time. So this is going to be a series of posts that cover how to bootstrap Docker Swarm so that you can do something useful with it.
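
For reference, creating the swarm itself really is only a couple of commands once Docker is installed. This is a minimal sketch, assuming the first manager is node01 at 10.0.0.21 (the addresses are introduced later in this post):

# On the first manager, initialize the swarm and advertise its private IP.
docker swarm init --advertise-addr 10.0.0.21

# swarm init prints a join command with a token; run it on each
# additional host to add that host as a worker.
docker swarm join --token <worker-token> 10.0.0.21:2377

# To add more managers, fetch the manager token instead.
docker swarm join-token manager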

What Are We Bootstrapping?

To get started with Docker Swarm I had to bootstrap the cluster. This included, in this order:

  • A central logging location for all containers.
  • SSL certificates to encrypt HTTP communications.
  • Load balanced ingress for HTTP applications.
  • A container registry to store my custom containers.

The chicken-and-egg problem here was that I needed to create custom containers, but before I could create custom containers I needed a container registry, and before I could deploy a container registry I needed custom containers… you see the problem. I actually started with GitHub Packages but quickly ran into several issues:

  • There is a limit of 500MB for all of your containers and you can only download 1GB of packages per month.
  • If the repository is public then you can never delete your containers without also deleting the repository. If your repository is private their accounting system commonly gets confused when you delete unused packages and starts reporting counts for things that don’t exist.
  • The package namespace is unique across your entire account or organization. That is, you cannot have a package with the same name in multiple repositories. It is not clear why this is the case.

So GitHub Packages was out. Docker Hub was also out because I don’t have a budget for that. So we’re going to run our own registry.
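
Deploying the registry ends up being a single swarm service. Here is a rough sketch using the official registry:2 image; the service name, storage path, and placement constraint are my own choices, and a real deployment needs TLS or an insecure-registries entry in /etc/docker/daemon.json on every node:

# Run the registry on the infra host, with image data bind-mounted to disk.
docker service create \
  --name registry \
  --constraint node.hostname==infra \
  --mount type=bind,src=/srv/registry,dst=/var/lib/registry \
  --publish published=5000,target=5000 \
  registry:2

# Tag and push a local image through the new registry.
docker tag myapp:latest 10.0.0.20:5000/myapp:latest
docker push 10.0.0.20:5000/myapp:latest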

What Servers Do We Need?

To begin, this cluster is running in an environment where I have about 60 private IP addresses and 60 public IP addresses that I can use for whatever I want. I’m also running this on what is basically bare metal. You might run your setup differently if you’re operating on cloud hosts or something.

So we actually have six servers. They are:

  • One “infrastructure” server with lots of disk for logs and where I can assign a few containers to do specific cluster maintenance tasks (see the placement sketch after this list). If this host goes down the cluster will continue to function, though we might lose things like logging, and that is no big deal. This host has 16GB RAM, 4 CPUs, one public IP, and one private IP, with the private IP on the primary interface.
  • Two “load balancer” hosts where we will run HAProxy and where all of our HTTP requests will arrive and be sent into the cluster. Each host has a network interface for its private IP and another for its public IPs. These hosts each have 2GB RAM and 2 CPUs.
  • Three “worker node” hosts where most of our containers will run. These worker nodes will also be our Docker Swarm managers. These hosts have only one private IP address each. I set my hosts up with 32GB RAM and 8 CPUs each, but you can size them however you would like. It might make sense to split the Docker Swarm manager functionality off of these hosts, but I did not do that.
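
To pin those maintenance containers to the infrastructure host, Docker Swarm node labels and placement constraints do the job. A quick sketch, where the role label and the log-collector service are hypothetical:

# Label the infrastructure node so services can target it.
docker node update --label-add role=infra infra

# Pin a hypothetical maintenance service to that node.
docker service create \
  --name log-collector \
  --constraint node.labels.role==infra \
  my-log-collector:latest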

Written another way, we have these hosts:

  • lb01 – 10.0.0.18
  • lb02 – 10.0.0.19
  • infra – 10.0.0.20
  • node01 – 10.0.0.21
  • node02 – 10.0.0.22
  • node03 – 10.0.0.23
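
If these names are not in DNS, the simplest way to make them resolvable on every host is an /etc/hosts entry per machine. This is just the list above restated:

10.0.0.18  lb01
10.0.0.19  lb02
10.0.0.20  infra
10.0.0.21  node01
10.0.0.22  node02
10.0.0.23  node03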

Finally, before we begin, our hosts should have no firewalls enabled. Later we will talk about how to configure a firewall to work with Docker.

With this setup in mind we can begin.

Preparing the Hosts

Most of these hosts will have multiple IP addresses on them. Because we do not want asymmetric routes for these addresses, it is very important to configure the networking on these hosts so that traffic from a given IP address leaves over the interface where that address lives. Doing that is simple. On Debian you simply make your /etc/network/interfaces file look something like this:

source /etc/network/interfaces.d/*

auto lo
iface lo inet loopback

# Private interface. The gateway line here provides the single default
# route in the main routing table.
auto eth0
iface eth0 inet static
    address 10.0.0.20/24
    gateway 10.0.0.1
    # Give this interface its own routing table (1) and route anything
    # sourced from its address through that table.
    up ip route add default via 10.0.0.1 dev eth0 table 1
    up ip rule add from 10.0.0.20/32 table 1 priority 500
    down ip rule delete from 10.0.0.20/32 table 1 priority 500
    down ip route delete default via 10.0.0.1 dev eth0 table 1

# Public interface. No gateway line, so the main table keeps exactly one
# default route; table 2 carries replies from the public address.
auto eth1
iface eth1 inet static
    address 1.2.3.4/24
    up ip route add default via 1.2.3.1 dev eth1 table 2
    up ip rule add from 1.2.3.4/32 table 2 priority 600
    down ip rule delete from 1.2.3.4/32 table 2 priority 600
    down ip route delete default via 1.2.3.1 dev eth1 table 2

This will become important later when we set up our load balancers, because it ensures that network traffic goes out over the correct interface. What we are doing here is creating a separate routing table for each interface with a default route out of that interface, then adding a rule that traffic sourced from that interface’s IP address must use that table. Finally, the up and down hooks add and remove these routes and rules dynamically as interfaces come up and go down.
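
Once an interface comes up you can sanity-check the rules and tables with standard iproute2 commands; the addresses here are the examples from the file above:

# Show the policy rules; the priority 500 and 600 entries should appear.
ip rule show

# Show the per-interface default routes.
ip route show table 1
ip route show table 2

# Ask the kernel which route replies sourced from the public IP will take.
ip route get 8.8.8.8 from 1.2.3.4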

Next Steps

This is the introduction to our bootstrap. There are still a lot more steps! Follow on to read them.

Caveats

While I did get this solution to work, and it worked pretty well, it was not my final answer to the problem of container orchestration. Ultimately I ended up moving to Kubernetes because of one specific problem that I could never solve.

That problem: containers running on a host, unless they were running in “host” network mode, could not access services published on that same host.

For example, the load balancers listen on port 80 for incoming HTTP requests. If another container runs on one of the load balancers, connected to an overlay network rather than host networking, say a monitoring agent, and that container tries to connect to the load balancer to reach a service, it will never, ever connect.

Put another way: if HAProxy is running on the host lb01 in host networking mode, listening on port 80 for incoming requests, and a container on an overlay network on lb01 tries to connect to port 80 on lb01, that does not work. Sure, I can connect to the parent host over the gateway IP inside my container, but that is such a hacky workaround that I discounted it as a real solution; the sketch below shows both the failure and the workaround. I scoured docs for days and found no real fix for this problem. So ultimately I was not able to use Docker Swarm because of this limitation.
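
To make the failure concrete, here is roughly how I could reproduce it on lb01. The network name is my own, and the bridge gateway address varies by setup (check docker network inspect docker_gwbridge):

# Create an attachable overlay network and start a test container on lb01.
docker network create --driver overlay --attachable testnet
docker run --rm -it --network testnet alpine sh

# Inside the container: connecting to the host's own published port hangs.
wget -qO- -T 5 http://10.0.0.18:80/    # times out, never connects

# The hacky workaround: reach the host via the bridge gateway IP instead
# (often 172.18.0.1 on docker_gwbridge, but verify on your host).
wget -qO- -T 5 http://172.18.0.1:80/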

But maybe it will work for you! Good luck!