Performance issues when writing to tmpfs during a docker-based gitlab-ci build

I am using the docker executor to run tests of a PHP (Symfony) project. While executing the tests, cache and log files are written to the file system.

Previously, those files were written to /dev/shm/<company>/<projectname>, and everything worked fine. Build jobs completed in ~5 minutes.

A couple of days ago, we reconfigured the test suite to write cache and log files to /run/shm/<company>/<projectname> (instead of /dev/shm/...). After committing this change, the build jobs took forever to finish (> 1 hour instead of ~5 minutes).
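My current suspicion is that /dev/shm inside the container is a real tmpfs mount, while /run/shm (unless something mounts it from the host) is just an ordinary directory on the container’s writable aufs layer, so every write ends up on disk. A quick way to check this from within a job - paths as in my setup, untested sketch:

# Show which filesystem backs each of the two paths inside the job container
df -T /dev/shm
df -T /run/shm   # expect aufs (or an error) if no tmpfs is mounted there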

I then reconfigured /etc/gitlab-runner/config.toml to mount the host’s /run/shm directory as follows:

concurrent = 3
check_interval = 0

[[runners]]
  name = "runner1"
  url = "<snip>"
  token = "<snip>"
  executor = "docker"
  [runners.docker]
    limit = 3
    tls_verify = false
    image = "ubuntu:14.04"
    privileged = false
    disable_cache = false
    volumes = ["/cache", "/run/shm:/run/shm"]
    pull_policy = "if-not-present"
  [runners.cache]

This works fine (in terms of build performance) - however, when running multiple build jobs of the same project at the same time (e.g. in different branches), they now obviously overwrite each other’s cache files :slight_smile:

So, the next step was to reconfigure the runner to mount the host’s /run/shm into the container’s /mnt/run/shm:

[[runners]]
  # ...
  [runners.docker]
    # ...
    volumes = ["/cache", "/run/shm:/mnt/run/shm"]
    # ...
  [runners.cache]

and adjust the .gitlab-ci.yml file to create a pipeline-specific directory inside the mounted tmpfs:

before_script:
  # The runner mounts the host's /run/shm directory at /mnt/run/shm
  # Create a pipeline-specific sub-directory to prevent builds from overwriting each other's cache files
  - mkdir -p /mnt/run/shm/$CI_PIPELINE_ID
  - ln -s /mnt/run/shm/$CI_PIPELINE_ID /run/shm
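Nothing cleans those directories up again, so anyone copying this probably also wants an after_script along these lines (untested sketch), so the host tmpfs does not slowly fill up:

after_script:
  # Remove the pipeline-specific directory from the host tmpfs again
  - rm -rf /mnt/run/shm/$CI_PIPELINE_ID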

This works fine for isolating the builds, but again completely kills the performance.
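I have not figured out why yet; one thing I still want to verify from inside such a job is whether writes through that symlink really end up on the host tmpfs - roughly like this:

# Where does the symlink point, and which filesystem backs it?
readlink -f /run/shm
df -T /run/shm/
# Quick throughput check against the mounted tmpfs
dd if=/dev/zero of=/run/shm/ddtest bs=1M count=50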

I am using a vanilla docker install on a dedicated Ubuntu 14 “runner” VM with the aufs storage driver:

$ docker info
Containers: 40
 Running: 0
 Paused: 0
 Stopped: 40
Images: 154
Server Version: 1.12.3
Storage Driver: aufs
// ....

I tried setting environment = [ "DOCKER_DRIVER=aufs" ] in the gitlab runner config, but this didn’t change anything.

I am relatively new to docker (I only use it for gitlab ci), so I don’t know if I am using the right approach for this. What would be the correct way to get good write performance to /run/shm from inside a docker-based gitlab ci build?

So here is some more info:

When running the build WITHOUT mounting the host’s /run directory into the container, the host reports very high iowait for the journaling block device - here is the output of iotop on the VM on which the gitlab-ci-multi-runner is running:

Total DISK READ :       0.00 B/s | Total DISK WRITE :      31.33 K/s
Actual DISK READ:       0.00 B/s | Actual DISK WRITE:     101.83 K/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
  181 be/3 root        0.00 B/s    0.00 B/s  0.00 % 88.78 % [jbd2/dm-0-8]
29923 be/4 root        0.00 B/s   31.33 K/s  0.00 %  0.09 % php /usr/local/bin/phpunit --configuration acme/app/phpunit.xml.dist

When running the build with mounting the host’s /run directory (as described earlier), iotop remains mostly silent - as expected.

For the sake of completeness, I also tried setting the privileged flag to true. This didn’t have any effect.

FYI, plain docker supports specifying the size of /dev/shm for containers. Unfortunately, right now GitLab CI does not provide any option to override it :(.
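With plain docker that would be something like:

# Resize /dev/shm for a single container and verify the new size
docker run --rm --shm-size=256m ubuntu:14.04 df -h /dev/shm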

There’s an issue in gitlab runner to provide support for it.

Thanks for the hint!

I played around a bit, and I’m not sure if this is related - according to the docs, the default /dev/shm size is 64 MB, and I also see the performance problem with small test suites that write less than 10 MB.
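For reference, the actual usage can be checked after a test run with something like:

# How much the test suite actually writes into the shm-backed cache directory
du -sh /dev/shm/<company>/<projectname>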

I managed to get the build working by explicitly creating a new tmpfs inside the container:

before_script:
  # Create tmpfs in the /run directory to improve build performance.
  - mount -t tmpfs none /run
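A slightly more explicit variant, in case anyone wants to copy this, would be to give the new tmpfs a size limit (the 128m is just an arbitrary example value; as far as I know, mounting a filesystem inside the container requires sufficient privileges, e.g. a privileged runner):

before_script:
  # Same hack, but with an explicit size cap on the new tmpfs
  - mount -t tmpfs -o size=128m tmpfs /run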

However, this is a “dirty hack” and not really a solution to the problem. My impression is that the gitlab ci runner does not handle tmpfs properly when spinning up a new container - maybe I should create a new issue for that?

Hm, so your docker runner is running in privileged mode?

Not sure that slow performance in docker is a GitLab issue. The runner is a thin wrapper around docker. Google for docker issues :wink:

I tried both, with and without privileged mode.

Well, I have slow performance when running a gitlab ci job. I am not interacting with docker directly, so it is primarily a gitlab issue for me. Of course I don’t know whether the issue is caused by docker itself or whether gitlab just doesn’t “wrap” it correctly / completely - I’m happy to hear any suggestions on how I could further isolate the problem :slight_smile:
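One thing I still plan to try is taking GitLab out of the picture entirely and running the same write pattern directly against docker on the runner VM, roughly:

# Compare write throughput to the container tmpfs vs. a path on the writable layer
docker run --rm ubuntu:14.04 bash -c 'time dd if=/dev/zero of=/dev/shm/t bs=1M count=50'
docker run --rm ubuntu:14.04 bash -c 'mkdir -p /run/shm && time dd if=/dev/zero of=/run/shm/t bs=1M count=50'

If the second command is slow as well, the problem is on the docker/aufs side rather than anything GitLab-specific.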