HA Options for GitLab Runners

Using GitLab (Ultimate licences) at work, connected to on-premises runners. As part of our risk and compliance requirements, the runners have been identified as single points of failure.

We use Podman as the engine for executing our containers. Recently we had runners offline due to a bug that filled the tmpfs filesystem of the Podman user (non-root execution); Red Hat has since resolved the bug. That incident prompted us to look for ways to ensure scheduled jobs do not fail when a single runner is offline.

If we were to implement HA, at a high level I believe we would:

  • Deploy an additional Linux VM
  • Ensure that Podman is installed
  • Register this new VM as a runner with GitLab (a sketch of the registration command is included after the note below)

Note: we do not have a dedicated container platform in place (e.g. Kubernetes or Docker Swarm) at this time, just traditional VM infrastructure.
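For illustration, the registration step might look something like the following (URL, token, image, and tag are placeholders for our own values, and this uses the registration-token flow; newer GitLab versions instead use a runner authentication token created in the UI):

```shell
# On the new VM, after installing gitlab-runner and Podman (rootless).
# Registering with the same tag as the existing runner means any job
# requesting that tag can be picked up by either machine.
sudo gitlab-runner register \
  --non-interactive \
  --url "https://gitlab.example.com" \
  --registration-token "<project-or-group-registration-token>" \
  --executor "docker" \
  --docker-image "registry.access.redhat.com/ubi9/ubi" \
  --description "on-prem-runner-2" \
  --tag-list "onprem"
```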

Questions:

  1. Can GitLab runners operate in an HA model? At a high level I believe they can through load balancing, though many posts suggest it doesn't work that well, e.g. the second runner doesn't do anything until the primary is 100% utilised. Possibly this depends on the method used: load balancing vs round robin?

  2. Are there any better ways? I don't believe we have a huge number of pipelines or jobs, so any sort of autoscaling seems unnecessary; we simply want to add some redundancy to what we currently do.

Do you need HA across the fleet of runners, so GitLab can (almost) always execute jobs, or do you need HA for each individual runner? You seem to be describing the first option, which is also the one I know something about.

If you just set up multiple runners, GitLab will spread the jobs among them (strictly speaking, the runners control the distribution of jobs: each runner polls GitLab for pending work, so for one runner to take all the jobs you would need jobs started at very specific times and the runners configured inappropriately). If one runner goes down in a scenario like that, you will lose the jobs that were executing on it when it crashed, and you will probably have to restart those jobs manually; if you can't live with that, you are probably in the second category mentioned above.
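For reference, the knobs that control how much work a given runner takes live in that host's config.toml. Something like the following on each VM (illustrative values only, and the rootless Podman socket path is an assumption to adjust for your service user):

```toml
# /etc/gitlab-runner/config.toml on each runner VM (illustrative values).
# `concurrent` caps how many jobs this host runs at once; `check_interval`
# controls how often it polls GitLab for pending jobs. A host that polls far
# more often and allows much more concurrency will tend to pick up most of
# the work, which is the "inappropriate configuration" case mentioned above.
concurrent = 2
check_interval = 3

[[runners]]
  name = "on-prem-runner-1"
  url = "https://gitlab.example.com"
  token = "<runner-token>"
  executor = "docker"
  limit = 2   # cap for this runner entry; 0 means unlimited
  [runners.docker]
    image = "registry.access.redhat.com/ubi9/ubi"
    # Point the Docker executor at the rootless Podman socket.
    host = "unix:///run/user/1001/podman/podman.sock"
```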

Without diving into technical speak or the correct terminology, we want to:

  • Load balance pipelines / jobs between runners
  • Ensure pipelines / jobs can continue to run in the event of a runner failure

As you said, I am happy to lose a job that was running; I just want to ensure subsequent jobs don't also fail.
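That said, if restarting a crashed job by hand ever becomes a pain, my understanding is the `retry` keyword in .gitlab-ci.yml can re-queue a job automatically when the failure came from the runner rather than the job itself (sketch only; the job name, script, and tag are placeholders):

```yaml
# .gitlab-ci.yml sketch. Both runners register with the same tag, so a
# pending job runs on whichever one is online; `retry` re-queues a job
# automatically if the runner it was on dies mid-run.
nightly-task:
  tags:
    - onprem
  script:
    - ./run-nightly-task.sh
  retry:
    max: 2
    when: runner_system_failure
```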