Prometheus hogging CPU after GitLab Update from 11.0.2 to 11.3.4

Hi all,

Updating a self-hosted GitLab Omnibus instance from 11.0.2 to 11.3.4 on CentOS 7.4 resulted in the bundled Prometheus using almost 100% of the available CPU time. Naturally, all other GitLab operations are impacted by this and are excessively slow.

So far I have not been able to pinpoint the problem.
I could use some advice on how to properly diagnose and fix this issue.

Thanks for pointing me in the right direction.
Best regards
Björn


My self-hosted installation has a similar problem. There are multiple Prometheus jobs which take as much CPU as they can get (see screenshot). This causes the overall site to respond slowly, and there are many 500 responses from the web interface.

I have Prometheus disabled in Admin Area > Metrics and Profiling > Metrics - Prometheus. I have both stopped and started GitLab and rebooted the machine after changing the setting, to no effect.

This does not appear to be a memory problem - it looks more like runaway Prometheus processes. The pid of the process grabbing the CPU changes every minute or so.

I am running GitLab 13.3.8 (efc3994bdc3) on a current version of Raspbian buster.

Anyone seen anything like this, and any guidance on taming the Prometheus processes?

I was able to fix this by disabling Prometheus completely in gitlab.rb, which is in fact what is recommended for small systems.
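
For anyone looking for the change itself, here is a minimal sketch. It assumes an Omnibus install with the config file at its default location, /etc/gitlab/gitlab.rb; the exact line used above was not quoted, so treat this as one likely form rather than the precise edit. The `prometheus_monitoring['enable']` setting is the Omnibus switch that turns off Prometheus together with its bundled exporters.

```ruby
# /etc/gitlab/gitlab.rb (assumed default path for an Omnibus install)

# Disable the bundled Prometheus server and all of its exporters in one go.
# Assumption: this is the setting meant by "disabling Prometheus completely".
prometheus_monitoring['enable'] = false

# After saving the file, apply the change with:
#   sudo gitlab-ctl reconfigure
```

Once the reconfigure has finished, `sudo gitlab-ctl status` should no longer list the prometheus service or the exporters.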
