Setting up GitHub Runners on Kubernetes

Published November 7, 2021

If you are using GitHub Actions then your workflows execute on GitHub Runners. If you are working with a public repository then the runners that GitHub provides are probably the right choice: a public repository can end up running untrusted code submitted by random people on the internet, and GitHub's hosted runners do a good job of keeping you from shooting yourself in the foot. The hosted runners are also free for public repositories, within limits.

But if you are using a private repository or you want access to extra resources beyond what is available with the free runners then you will want to run your own self-hosted GitHub Runner. One option is to turn on a server for the express purpose of running your GitHub Runner and nothing else. But if you have a Kubernetes cluster running then why not put it to use?

The documentation for running a self-hosted GitHub Runner within Kubernetes is sparse. Some of the difficulties you will run into include:

The GitHub Runner software forces itself to restart whenever a new version is released despite people asking for this feature to be opt-out. So we will have to find a way to work around this.
If you want to run any containers (e.g. Docker) within your actions then you’ll need to have container tools like Docker installed and working.
If you use the “container” capability included with GitHub Runner, the runner will try to mount a bunch of GitHub Runner components inside the container that is doing the building, so you’ll need to make those accessible to both the GitHub Runner and the Docker daemon that is running the container.

Of course, before we get started you will need to generate a Personal Access Token which will be used to register your runner. You will need one PAT per runner. The token must have the admin:org permissions granted to it. I have not set up any ephemeral runners with auto-scaling so I cannot speak to how that works. We’re only talking about permanent runners.

The GitHub Runner Container

As GitHub does not publish containers for their runner we will have to make our own. We need two files to make our container: Dockerfile and entrypoint. First, the Dockerfile:

`Dockerfile`

FROM debian:bullseye-slim

ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get -q update && apt-get -y upgrade && \
    # install our dependencies
    apt-get install -y --no-install-recommends tini git git-lfs jq make curl ca-certificates gnupg && \
    # install the github runner dependencies
    apt-get install -y --no-install-recommends liblttng-ust0 libkrb5-3 zlib1g libssl1.1 libicu67 && \
    # install docker dependencies
    curl -fsSL https://download.docker.com/linux/debian/gpg | gpg --dearmor -o /etc/apt/trusted.gpg.d/docker-archive-bullseye.gpg && \
    echo -n "deb [arch=amd64] https://download.docker.com/linux/debian bullseye stable" > /etc/apt/sources.list.d/docker.list && \
    apt-get -q update && apt-get install -y --no-install-recommends docker-ce-cli docker-compose pass && \
    # clean up apt files
    apt-get clean && rm -rf /var/lib/apt/lists/*

# copy the entrypoint file
COPY entrypoint /entrypoint
RUN chmod +x /entrypoint

# this will run as root to deal with container issues
ENV RUNNER_ALLOW_RUNASROOT="1"

ENTRYPOINT ["tini", "--", "/entrypoint"]

What we’re doing here is installing some basic tools like git, make, and curl. You can add more basic tools for your runner here, whatever you’d like generally available to your jobs. Next we install libraries that are required for the GitHub Runner software: liblttng-ust0, libkrb5-3, zlib1g, libssl1.1, and libicu67. Then we add the Docker apt repository so that we can install the latest Docker CLI and the docker-compose tool.

After that we will copy over the entrypoint script that sets up and runs the GitHub Runner.

Finally, we’re also going to set the RUNNER_ALLOW_RUNASROOT environment variable. The runner refuses to run as root without it, and running as root is what lets the GitHub Runner software work with Docker correctly in this setup.

It’s worth noting that we use tini here because our runner will start other processes and we’d like something to clean them up automatically so that we don’t end up with zombies.

`entrypoint`

#!/bin/sh

# figure out which version of the installer to download
RUNNER_VERSION=$(curl -s -H "Accept: application/vnd.github.v3+json" -H "Authorization: token ${GITHUB_TOKEN}" https://api.github.com/repos/actions/runner/releases/latest | jq -r .name | sed 's/^v*//')
RUNNER_URL=https://github.com/actions/runner/releases/download/v${RUNNER_VERSION}/actions-runner-linux-x64-${RUNNER_VERSION}.tar.gz

# download the installer
echo "Downloading actions runner version ${RUNNER_VERSION} from '${RUNNER_URL}'"
curl -L -s -o /usr/local/src/installer.tar.gz "${RUNNER_URL}"

# unpack the installer
mkdir -p /opt/runner
tar -x -z -C /opt/runner -f /usr/local/src/installer.tar.gz

# now get a token using our credentials
REGISTRATION_URL="https://github.com/${GITHUB_RUNNER_OWNER}"
TOKEN_URL="https://api.github.com/orgs/${GITHUB_RUNNER_OWNER}/actions/runners/registration-token"

echo "Requesting token from '${TOKEN_URL}'"
PAYLOAD=$(curl -s -X POST -H "Accept: application/vnd.github.v3+json" -H "Authorization: token ${GITHUB_TOKEN}" "${TOKEN_URL}")
export GITHUB_RUNNER_TOKEN=$(echo "$PAYLOAD" | jq .token --raw-output)

# errors happen if you try to run from not in the directory
cd /opt/runner

# register the agent with github
./config.sh \
    --name "${GITHUB_RUNNER_NAME}" \
    --token "${GITHUB_RUNNER_TOKEN}" \
    --url "${REGISTRATION_URL}" \
    --unattended \
    --replace

# register an unregister handler with the shell
remove() {
    ./config.sh remove --unattended --token "${GITHUB_RUNNER_TOKEN}"
    sleep 1
}

# force the handler to run when we exit
trap 'remove; exit 130' INT
trap 'remove; exit 143' TERM

# actually run the runner in the background so the traps above can fire
./bin/runsvc.sh &

wait $!

What our entrypoint script does is this:

Get the latest version number for the runner. Whenever the runner container starts we will download and run the latest version. This is because GitHub will force its runner to restart when a new version is released so we’re just going to download the latest version by default.
Then we’re going to download the runner and unpack it to /opt/runner.
Using our GitHub token we’re going to register the runner with GitHub.
We’ll configure the runner and register a function that will deregister the runner when it exits.
Finally, we’ll start the runner and wait for it to exit.
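The entrypoint above assumes that both GitHub API calls succeed. If one fails, jq prints the literal string null and the script carries on with a broken download URL or registration token. Here is a minimal sketch of two guards that could be added, written as POSIX shell functions. The names runner_url and require_value are hypothetical and are not part of the original script or the runner software.

```shell
#!/bin/sh
# Hypothetical helpers; the function names are illustrative only.

# Build the release download URL from a version string such as "v2.284.0" or "2.284.0".
runner_url() {
    # strip any leading "v" so the tag and the filename are built consistently
    version=$(printf '%s' "$1" | sed 's/^v*//')
    printf 'https://github.com/actions/runner/releases/download/v%s/actions-runner-linux-x64-%s.tar.gz\n' "$version" "$version"
}

# Fail when a value extracted by jq is missing; jq -r prints "null" for absent keys.
require_value() {
    # $1 is a label for the error message, $2 is the value to check
    if [ -z "$2" ] || [ "$2" = "null" ]; then
        echo "error: could not determine $1" >&2
        return 1
    fi
}
```

In the entrypoint this could be called as require_value "runner version" "$RUNNER_VERSION" || exit 1 right after the version lookup, and again for GITHUB_RUNNER_TOKEN after the token request, so that the pod fails fast instead of registering with a bad token.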

That’s all this container does. Now let’s look at how to configure this with Kubernetes.

Kubernetes Configuration

I recommend that you put the runner into its own namespace and that you set up a network policy to limit that namespace such that your runner cannot access other resources within Kubernetes. You wouldn’t give random people in your organization the ability to run arbitrary code on your Kubernetes cluster. But if you do not set up a network policy to constrain your runner, that is what you would be doing.
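As a rough sketch of what such a network policy might look like, the shell function below prints a manifest that denies all inbound traffic to the namespace and allows outbound traffic only to cluster DNS and to non-private addresses. Everything in it is an assumption to adapt: the policy name restrict-runner, the function name runner_network_policy, the private CIDR ranges standing in for your cluster's networks, and DNS pods living in kube-system. Your cluster's network plugin must also enforce network policies for this to have any effect.

```shell
#!/bin/sh
# Print a NetworkPolicy manifest for the runner namespace (illustrative values throughout).
runner_network_policy() {
    # $1 is the namespace the runner lives in; defaults to "builders"
    namespace=${1:-builders}
    cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: restrict-runner
  namespace: ${namespace}
spec:
  # an empty selector applies the policy to every pod in the namespace
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
  # no ingress rules are listed, so all inbound traffic is denied
  egress:
    # allow DNS lookups against the cluster DNS pods (assumed to be in kube-system)
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    # allow the public internet but not private ranges (assumed to cover the cluster)
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 10.0.0.0/8
              - 172.16.0.0/12
              - 192.168.0.0/16
EOF
}
```

The output can be piped straight into the cluster with runner_network_policy builders | kubectl apply -f - once the namespace exists.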

Let’s look at our configuration for Kubernetes:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: github-runner01
  namespace: builders
  labels:
    service: github-runner
    app.kubernetes.io/name: github-runner01
spec:
  replicas: 1
  selector:
    matchLabels:
      service: github-runner
  template:
    metadata:
      labels:
        service: github-runner
        app.kubernetes.io/name: github-runner01
    spec:
      restartPolicy: Always
      containers:
        - name: runner
          # this should be the name of the container that you created above
          image: ghcr.io/your-org/your-runner-container:latest
          imagePullPolicy: Always
          volumeMounts:
            - name: docker-config
              mountPath: /root/.docker
            - name: home
              mountPath: /opt/runner
          resources:
            limits:
              cpu: 250m
              memory: 512Mi
            requests:
              cpu: 250m
              memory: 128Mi
          env:
            - name: DOCKER_HOST
              value: tcp://127.0.0.1:2375
            - name: GITHUB_RUNNER_OWNER
              # change this to the name of your github organization
              value: your-org
            - name: GITHUB_RUNNER_NAME
              # this must be unique in your organization
              value: runner
            - name: GITHUB_TOKEN
              # you will need to create this secret
              valueFrom:
                secretKeyRef:
                  name: github-runner-secret
                  key: GITHUB_TOKEN
        - name: dind
          image: docker.io/library/docker:20.10.10-dind
          imagePullPolicy: IfNotPresent
          args:
            - dockerd
            - --storage-driver=overlay2
            - --host=tcp://127.0.0.1:2375
            - --bip=10.60.0.1/24
            - --tls=false
          securityContext:
            privileged: true
          volumeMounts:
            - name: home
              mountPath: /opt/runner
            - name: cache
              mountPath: /var/lib/docker
          resources:
            limits:
              cpu: 2000m
              memory: 4Gi
            requests:
              cpu: 1000m
              memory: 4Gi
      terminationGracePeriodSeconds: 60
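      volumes:
        - name: cache
          emptyDir: {}
        - name: home
          emptyDir: {}
        # backs the /root/.docker mount in the runner container; swap in a
        # secret volume if you want to pre-provision registry credentials
        - name: docker-config
          emptyDir: {}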

A few notes here about this container configuration. Before you can use this you will need to:

Create a container image, using the steps described above, that contains the actual GitHub Runner software.
Edit the above YAML file to change the image path to that image.
Change the environment variables to point to your actual GitHub organization name.
Create a Kubernetes secret called github-runner-secret with a key of GITHUB_TOKEN and a value containing your GitHub token that has permission to register your runner.
Maybe change the bip option for Docker, depending on your network setup.
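The image and secret steps from the list above can be sketched as two small shell functions. The function names build_runner_image and create_runner_secret are hypothetical wrappers; the docker and kubectl subcommands they call are the standard ones. This assumes you run it from the directory that holds the Dockerfile and entrypoint files, that you are already logged in to your registry, and that the builders namespace exists.

```shell
#!/bin/sh
# Hypothetical wrappers around the docker and kubectl CLIs.

# Build and push the runner image from the directory holding Dockerfile and entrypoint.
build_runner_image() {
    # $1 is the full image reference, e.g. ghcr.io/your-org/your-runner-container:latest
    docker build --tag "$1" . && docker push "$1"
}

# Create the secret that the Deployment reads GITHUB_TOKEN from.
create_runner_secret() {
    # $1 is the namespace, $2 is the personal access token
    kubectl create secret generic github-runner-secret \
        --namespace "$1" \
        --from-literal=GITHUB_TOKEN="$2"
}
```

A typical invocation would be build_runner_image ghcr.io/your-org/your-runner-container:latest followed by create_runner_secret builders "$GITHUB_TOKEN". Keep in mind that a token passed with --from-literal may end up in your shell history.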

This configuration is for a two-container pod where one container is the runner and the other is Docker-in-Docker. The two containers communicate with each other over localhost on port 2375, which is the conventional port for unencrypted Docker TCP connections. Because the Docker cache lives in an emptyDir, it is cleared every time the pod restarts. That may or may not be what you want, but it is much simpler than trying to keep the cache persistent. Between restarts, however, every build reuses the Docker cache left behind by previous builds, so I tend to put a “clean” statement at the end of my builds to forcefully clear any artifacts that my build may have created.
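The post does not show what that “clean” statement looks like, so the following is only one possible form, assuming the build's leftovers are Docker images, containers, and build cache. The function names are hypothetical; docker image rm and docker system prune are standard Docker CLI commands.

```shell
#!/bin/sh
# Hypothetical clean-up helpers to call as the last step of a build.

# Remove a single image that the build produced and has already pushed.
remove_built_image() {
    # $1 is the image reference that was built
    docker image rm --force "$1"
}

# Remove stopped containers, unused networks, unused images, and build cache.
clean_docker_cache() {
    docker system prune --all --force
}
```

Pruning everything trades build speed for isolation, since the next build has to pull its base images again. Removing only the image that was just built is the gentler option when the base layers are safe to share between builds.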

We’re setting up /opt/runner as an emptyDir that is mounted to both the runner container and the Docker container. This is because the GitHub Runner software has a bunch of tools (e.g. nodejs) that are used to run various GitHub Actions and these need to be available to the Docker daemon as well as the runner software. If you look back at our entrypoint file you will remember that we install the runner to /opt/runner. So the installed runner tools are now available to both containers in the pod.

Finally, note that GitHub Runners do not clean up after themselves when an action is run. So if you have an action that installs a library or otherwise changes the runner container, the change will only be cleaned up when the whole pod is restarted. To work around this, I typically only do two types of builds:

I will run builds that only build containers. Building a container does not install anything on the host. It’s just a series of Docker commands. When I finish building the container and pushing the container then I explicitly remove the container.
I will run builds with the container option so that the entire build runs inside another container. This does mean that my build needs to set up its entire world before it can even get to doing anything but that’s generally OK. To see what I mean, look at this example:

jobs:
  test:
    runs-on: self-hosted
    container: python:3.9-slim-bullseye

    steps:
      ...

So with all that, and knowing the caveats, you, too, can now run your own GitHub Runner in your Kubernetes cluster.