Is it possible to limit maximum concurrency for a certain job type / build step per runner?

We’re using a self-hosted GitLab with self-hosted GitLab CI runners on on-premise VMs. We have a variety of jobs / build steps with greatly differing resource usage characteristics.
We currently have a bit of a problem balancing job concurrency. For most job types (let’s call them build-step-[1234]), our VMs can easily handle a runner with two concurrent jobs, and could probably handle even more. However, there are other job types (let’s call one build-step-5) that are rather long-running and resource-intensive. If two of these are scheduled on the same runner, they compete for resources; each can then take up to three times its typical execution time and run into timeouts. Running one of these jobs alongside a job for any other build step is totally fine.

Now my question is: Is it possible to limit per-runner concurrency for a certain job type / build step?
A few directions I’ve thought about:

  • Use something like resource_group in the .gitlab-ci.yml. However, as far as I understand resource_group, it can only be used to limit a job to one concurrent instance globally. I’m looking for something similar, only per runner.
  • Set up a fleet of VMs and runners with custom configuration for build-step-5 and a separate group for all other build steps. While this is possible, I’d like to have all-purpose runners, especially since we are running only a few VMs. If nothing else works, I guess I’ll have to go for this approach.

Maybe there’s already a way to do so that I’m just missing? 🙂

Example excerpt from a runner configuration toml, in case that helps:

concurrent = 2
check_interval = 0

[session_server]
  session_timeout = 1800

[[runners]]
  name = "gitlab-runner-linux3"
  url = "..."
  token = "..."
  executor = "docker"
  [runners.custom_build_dir]
...
  [runners.docker]
    image = "ubuntu:20.04"
    privileged = false
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/cache"]
    shm_size = 0

One of the teams using our GitLab+runners had some jobs that (they said) should not run in parallel. The solution I went with was setting up two nearly identical runners: one for general usage, and one with “limit = 1” in config.toml that takes only jobs carrying that team’s tag. The daemon still has “concurrent = 16” (the value we have found suitable for our hardware), so I believe a running job from that team just means that the general-purpose runner everybody else uses runs one job fewer. Others could tag their jobs with the team’s tag too, but the lower degree of parallelism would hurt them, and I would be upset with them if I discovered it; they can get their own runner if they need one.

It’s not exactly what you asked for or thought about, but quite close - and I think it can be adapted to your requirements.
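
Adapted to the setup in the question, the idea might look roughly like the sketch below: two [[runners]] entries in the same config.toml on each VM, with “limit = 1” on the entry meant for build-step-5. This is only an illustration under assumptions. The second runner’s name is hypothetical, url and token stay elided as in the question, and it assumes the second entry is registered with a dedicated tag (set at registration or in the GitLab UI) that only the build-step-5 jobs carry, while the general entry does not have that tag.

```toml
# Sketch only, not a tested configuration.
concurrent = 2            # global cap for all [[runners]] entries on this VM
check_interval = 0

[[runners]]
  name = "gitlab-runner-linux3"         # general-purpose entry from the question
  url = "..."
  token = "..."
  executor = "docker"
  [runners.docker]
    image = "ubuntu:20.04"

[[runners]]
  name = "gitlab-runner-linux3-heavy"   # hypothetical second entry on the same VM
  url = "..."
  token = "..."                          # separate registration, assumed to carry the dedicated tag
  executor = "docker"
  limit = 1                              # at most one job at a time through this entry
  [runners.docker]
    image = "ubuntu:20.04"
```

With “concurrent = 2”, a VM could then run two ordinary jobs, or one build-step-5 job plus one ordinary job, but never two build-step-5 jobs at once, provided those jobs can only be picked up through the tagged entry.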
