"privileged" flag in Kubernetes runner seems to have stopped working since GitLab upgrade

I’m using AWS EKS to run GitLab CI jobs. The GitLab Kubernetes runner runs on the core k8s nodes, and I’m using Karpenter to launch EC2 instances on demand for each CI job and terminate them afterwards.

Since each job runs in its own isolated EC2 instance, I’ve had the “privileged” flag set in the runner configuration to allow docker-in-docker to be used when CI jobs need it. For example, here is a snippet:

runners:
  name: "arm64-runner"
  privileged: true
  tags: "arm64-runner,aarch64-runner"
  config: |
    [[runners]]
      [runners.kubernetes]
        namespace = "gitlab-karpenter-space"
        image = "ubuntu:20.04"

On October 21, we upgraded GitLab from 15.11.13 to 16.3.5. I’ve since discovered that, after that upgrade (and the corresponding runner upgrade to 16.3.3), docker-in-docker no longer works as it used to. We get errors like this:

failed to start daemon: Error initializing network controller: error obtaining controller instance: failed to create NAT chain DOCKER: iptables failed: iptables -t nat -N DOCKER: iptables v1.8.9 (legacy): can't initialize iptables table `nat': Permission denied (you must be root)
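For context, the failing jobs are ordinary docker-in-docker jobs along these lines (job name and image tags here are illustrative, not my exact pipeline):

```yaml
build-image:
  image: docker:24.0
  services:
    - docker:24.0-dind
  variables:
    DOCKER_TLS_CERTDIR: "/certs"
  script:
    - docker build -t myapp:latest .
```

The dind service is what needs the privileged container; without it, the daemon cannot manipulate iptables and fails with the error above.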

The workaround I’ve found is to use Kyverno to patch the CI job pod on the fly so that this clause:

    securityContext:
      privileged: true

is added to the pod’s definition.
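For reference, the Kyverno mutate policy is roughly the following sketch (the policy name is my own; the `build` container name is the one the GitLab Kubernetes executor gives the main job container, and the namespace matches my runner config):

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-privileged-to-ci-pods
spec:
  rules:
    - name: set-privileged
      match:
        any:
          - resources:
              kinds:
                - Pod
              namespaces:
                - gitlab-karpenter-space
      mutate:
        # Strategic-merge patch: match the "build" container by name
        # and add the privileged securityContext to it.
        patchStrategicMerge:
          spec:
            containers:
              - (name): build
                securityContext:
                  privileged: true
```

This works, but mutating every CI pod from outside the runner felt like papering over something, hence the question.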

Is this an expected change in behaviour for the Kubernetes runner? Or have I been misconfiguring my runners and the upgrade changed something that has exposed that misconfiguration?

Many thanks.

Just wanted to add that I’ve since determined that rolling the Kubernetes runner back from v16 to v15 restores the original behaviour of the privileged flag in the runner config.

It doesn’t look like this post is getting much attention, unfortunately, so I’ll need to find another way to determine whether this is a bug or by design.

Turns out it was my configuration. The privileged flag needs to be set within the [runners.kubernetes] section of the runner config, not at the top level of the runners block.
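For anyone hitting the same thing, here is the corrected version of my snippet from above, with the flag moved inside [runners.kubernetes]:

```yaml
runners:
  name: "arm64-runner"
  tags: "arm64-runner,aarch64-runner"
  config: |
    [[runners]]
      [runners.kubernetes]
        namespace = "gitlab-karpenter-space"
        image = "ubuntu:20.04"
        privileged = true
```

With this in place, the Kyverno workaround is no longer needed.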

Looks like v16 tightened up on the processing of the config compared to v15.
