Unicorn is not a GitLab product; it is a general-purpose HTTP server that is commonly used to serve Ruby on Rails applications. The Wikipedia article is short but explains this a bit.
Unicorn works with a single master (control) process, which runs single-threaded. It spawns worker processes which then do the actual work (handling incoming requests). This is the same pattern as the prefork model in Apache httpd; AFAIK PostgreSQL follows a similar approach.
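To illustrate the prefork pattern described above, here is a minimal sketch in plain Ruby. It is not Unicorn's actual implementation: the method name `run_prefork` is made up for this example, and the workers only report their PID instead of serving requests. It assumes a Unix-like system where `fork` is available.

```ruby
# Minimal prefork sketch (hypothetical, not Unicorn's code):
# one master process forks N workers and waits for all of them.
def run_prefork(worker_count)
  reader, writer = IO.pipe

  pids = Array.new(worker_count) do |index|
    fork do
      reader.close
      # A real worker would loop here, accepting requests from a
      # listening socket inherited from the master.
      writer.puts("worker #{index} pid #{Process.pid}")
      writer.close
      exit!(0) # leave immediately, skipping at_exit handlers
    end
  end

  writer.close                           # master keeps only the read end
  pids.each { |pid| Process.wait(pid) }  # reap every worker
  lines = reader.read.lines.map(&:chomp)
  reader.close
  lines
end

run_prefork(3).each { |line| puts line }
```

The master never does the work itself; it only creates and supervises the workers, which is why a blocked worker does not stop the others.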
Other servers in the Ruby world are Thin (used by Dashing) and Puma. In recent years, I have seen larger Ruby on Rails applications shift from Unicorn to Puma, mostly for performance reasons.
GitLab is moving to Puma as well; you can follow the progress in this epic.
Scaling Unicorn Workers
I haven’t done much here apart from evaluating an arbitrary number of Unicorn workers. The docs are correct in recommending a factor of 1.5 or 2 times the number of CPU cores. Keep in mind, though, that more workers also increase virtual memory allocation on the system.
I wouldn’t immediately set the value to 48 workers, considering that other applications on the same host also need CPU resources (PostgreSQL, Redis, NodeJS).
Start by setting the value to 16, then raise it to 32. Measure whether this actually helps with performance or just increases memory usage and load on the system.
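The rule of thumb above can be written down as a small calculation. This is only a sketch: `suggested_workers` and the `reserved_cores` parameter are hypothetical helpers for this example, not a GitLab or Unicorn API. In an Omnibus installation the resulting number would typically go into `unicorn['worker_processes']` in `/etc/gitlab/gitlab.rb`; please check the docs for your version.

```ruby
require 'etc'

# Hypothetical helper: derive a worker count from the number of CPU cores.
# factor         - the 1.5 to 2 multiplier mentioned above
# reserved_cores - cores left for PostgreSQL, Redis, etc. (an assumption)
def suggested_workers(cores, factor: 1.5, reserved_cores: 0)
  usable = [cores - reserved_cores, 1].max # never drop below one core
  (usable * factor).floor
end

cores = Etc.nprocessors
puts "cores=#{cores}, factor 1.5: #{suggested_workers(cores)} workers"
puts "cores=#{cores}, factor 2.0: #{suggested_workers(cores, factor: 2.0)} workers"
```

With 24 cores and a factor of 2 this yields 48, which is the upper bound and not a starting point; reserving a few cores for the other services gives a more conservative value to begin measuring with.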
Busy Background jobs
9 Threads / 9 Busy and 50 Threads / 50 Busy mean that every available thread is occupied: the job pipeline is stalled and the backlog keeps growing.
The things I would analyze:
- Do these running jobs ever finish?
- How long is the (average) execution time of such a job, and do they always run into a timeout?
- How many jobs are executed per minute, hour and day?
Raising the Unicorn worker count will help with the issue, but if there is, for example, a job which fails with an email send timeout, and you multiply that by 1000 user emails, it will still block.
Therefore, identify blocking jobs first.
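The questions above can be answered with a simple aggregation once you have the raw data, for example parsed from the Sidekiq logs. The sketch below uses a made-up `JobRun` record and made-up worker names; the field names are assumptions for this example and not Sidekiq's data model.

```ruby
# Hypothetical job record; the fields are assumptions for this example.
JobRun = Struct.new(:job_class, :started_at, :duration_s, :timed_out)

# Summarize per job class: count, average duration, timeout ratio and
# throughput per minute. The slowest job classes come first.
def summarize(runs)
  rows = runs.group_by(&:job_class).map do |job_class, group|
    starts = group.map(&:started_at)
    span_min = [(starts.max - starts.min) / 60.0, 1.0].max # at least one minute
    {
      job_class: job_class,
      count: group.size,
      avg_duration_s: (group.sum(&:duration_s) / group.size.to_f).round(2),
      timeout_ratio: (group.count(&:timed_out) / group.size.to_f).round(2),
      per_minute: (group.size / span_min).round(2)
    }
  end
  rows.sort_by { |row| -row[:avg_duration_s] }
end

base = Time.at(0)
runs = [
  JobRun.new('EmailWorker', base, 30.0, true),       # made-up sample data
  JobRun.new('EmailWorker', base + 60, 30.0, true),
  JobRun.new('EmailWorker', base + 120, 10.0, false),
  JobRun.new('IndexWorker', base, 0.5, false),
  JobRun.new('IndexWorker', base + 30, 1.5, false)
]
summarize(runs).each { |row| puts row }
```

A job class with a high average duration and a timeout ratio close to 1 is the first candidate for a blocking job.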
To my knowledge, this is controlled by Omnibus GitLab, and you must not modify the Sidekiq scheduler on your own. GitLab itself defines the queues it needs for the workers; if you search for the queue names, you will for example land here:
From an application developer’s perspective, this works much better than having just a single queue where one blocking job halts everything.
The most interesting queues are not the idle ones, but the ones which have lots of items inside. Speaking of that, monitoring and collecting metrics for a better “over time” visualization will help here.
I’m not sure whether the default Prometheus metrics also cover Unicorn/Sidekiq in depth, but there are ways to integrate that into either the Prometheus service or your own monitoring.
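As a sketch of how to pick out the queues worth looking at, the following plain Ruby ranks queues by their backlog. The helper `busiest_queues`, the threshold and the sample numbers are made up for this example. On a live system the sizes could come from Sidekiq's own API (`Sidekiq::Queue.all` with `name` and `size`, after `require 'sidekiq/api'`), which I have not wired in here.

```ruby
# Hypothetical helper: keep only queues at or above a backlog threshold,
# largest first. queue_sizes maps queue name => number of waiting jobs.
def busiest_queues(queue_sizes, threshold: 100)
  queue_sizes
    .select { |_name, size| size >= threshold }
    .sort_by { |_name, size| -size }
    .map { |name, size| { queue: name, size: size } }
end

# Made-up sample data, not real measurements.
sample = {
  'mailers' => 1200,
  'default' => 0,
  'pipeline_processing' => 340,
  'cronjob' => 5
}

busiest_queues(sample).each { |row| puts "#{row[:queue]}: #{row[:size]}" }
```

Running this periodically and storing the results gives you the “over time” view: a queue that stays at the top across several samples is where the blocking jobs most likely sit.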
Here’s a good blog post on the matter: https://samsaffron.com/archive/2018/02/02/instrumenting-rails-with-prometheus
In terms of EE, I’d also look into scaling in other directions, such as Elasticsearch for the search backend. Maybe the problem is not only related to job processing performance, but is also influenced by other components generating too many jobs with updates that take too long.