Scaling GitLab Docker runners: add more runners or add more horsepower to existing ones?


I’m trying to figure out how best to scale up our existing gitlab-runner/Docker-based build pipelines.

By “best,” I mean whatever is considered best practice.

Is it better to scale horizontally (add more runners) or vertically (add more RAM + cores to existing ones)?

To take this to the extreme, let’s say I have a beefy physical server with 40 cores and 768GB of RAM (which I don’t have lying around, but it makes a good extreme example).

Any ideas on what the best practice is in terms of performance?



As always, the answer is “it depends.” Adding more CPU and RAM keeps everything on the same server, but the bottleneck may then simply move to disk or network. Also bear in mind that with virtualization, more CPUs can actually carry a slight performance hit, because in some cases the underlying hypervisor has to wait until all of the VM’s CPUs are available before scheduling it. Memory is also used as cache, and the more jobs run on one server, the more context switching there can be.
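To put rough numbers on the 40-core/768GB example, here is a minimal back-of-the-envelope sketch in Python. The per-job figures (2 cores and 4GB per build) are purely hypothetical, as are the names `Host`, `JobProfile`, and `max_concurrent_jobs`; plug in whatever your own jobs actually use. It only looks at CPU and RAM, so it says nothing about the disk or network limits mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Host:
    cores: int
    ram_gb: int


@dataclass
class JobProfile:
    # Hypothetical per-job resource needs; measure your own builds.
    cores: int
    ram_gb: int


def max_concurrent_jobs(host: Host, job: JobProfile) -> tuple[int, str]:
    """Return how many jobs fit on the host and which resource runs out first."""
    by_cpu = host.cores // job.cores
    by_ram = host.ram_gb // job.ram_gb
    bottleneck = "cpu" if by_cpu <= by_ram else "ram"
    return min(by_cpu, by_ram), bottleneck


# The extreme server from the question, with an assumed 2-core/4GB job.
big_server = Host(cores=40, ram_gb=768)
typical_job = JobProfile(cores=2, ram_gb=4)
print(max_concurrent_jobs(big_server, typical_job))  # (20, 'cpu')
```

Under those assumptions the box tops out at 20 parallel jobs on CPU while most of the RAM sits idle, which is one way a single huge server can end up lopsided.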

I personally prefer scaling out (more runners) over scaling up (a larger VM), because scaling out involves no downtime, and the extra runners can be brought online for peak usage and then retired when they are no longer needed.

Thanks @mbeierl! There is of course some overhead involved when you run a beefy runner, but I guess the same goes if you have a huge number of runners that also need to be coordinated.

However, your point about downtime and scaling on demand is a very good one! So scaling out is the way to go for me, then. Thanks for the insight!