Triggering specific job causes freeze of gitlab machine

johsjo · April 20, 2022, 12:36pm

We have a very confusing issue with our GitLab server.
The server is:

running GitLab 14.9.3 with mostly default settings.
running Ubuntu 22.04, kernel 5.13.0.
is virtualized on VMware.
is configured with 4 CPUs and 8 GB RAM.
is not heavily used, ~10 users and a handful of active repos.

Since upgrading from 13.12 we have started getting “freezes” in network connectivity.
Essentially, all network traffic halts completely for around 60s before continuing as normal.
This includes traffic to and from the machine, but also between services on the machine (e.g. between workhorse and gitlab-rails).

This happens when running a few specific CI jobs.
The job that triggers the issue just runs gradle clean.
It uploads/downloads a cache, but as far as I know that is handled by the runner, not the server.
The git repo is only around 1MB in size.

We have checked for memory/disk/cpu issues but can’t find any obvious problems with resource constraints.

The only clues we have are that gitlab-workhorse reports:
badgateway: failed to receive response: context canceled
and
badgateway: failed to receive response: dial unix /var/opt/gitlab/gitlab-rails/sockets/gitlab.socket: connect: no such file or directory
But we are fairly sure this is just a symptom of the network stack being “frozen”.

We think this is not directly GitLab's “fault”, but want to check here whether anyone else has had similar issues.

iwalker · April 20, 2022, 1:16pm

Unrelated to GitLab, but I have had problems with Debian or Ubuntu on VMware when using vmxnet3 as the network card in the VM config: intermittent or poor network connectivity, or even the VM restarting.

I changed to e1000 in the VM config and it has been stable since. That means deleting the existing network card from the VM, adding a new one, choosing e1000, and saving.
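
To check which driver the guest is actually using before and after swapping the card, something like the sketch below might help. It assumes a Linux guest where each interface under /sys/class/net exposes a device/driver symlink; the function name nic_drivers and its sysfs_net parameter are only illustrative, not part of any tool mentioned in this thread.

```python
import os


def nic_drivers(sysfs_net="/sys/class/net"):
    """Map each network interface to its kernel driver name.

    Interfaces without a device/driver symlink (e.g. "lo") map to None.
    """
    drivers = {}
    for iface in sorted(os.listdir(sysfs_net)):
        iface_dir = os.path.join(sysfs_net, iface)
        # Skip plain files such as "bonding_masters" that are not interfaces.
        if not os.path.isdir(iface_dir):
            continue
        driver_link = os.path.join(iface_dir, "device", "driver")
        if os.path.exists(driver_link):
            # The symlink points at the driver directory; its basename is the driver name.
            drivers[iface] = os.path.basename(os.path.realpath(driver_link))
        else:
            drivers[iface] = None
    return drivers


if __name__ == "__main__":
    for iface, driver in nic_drivers().items():
        print(f"{iface}: {driver or '(no driver, virtual interface)'}")
```

On a VM with the paravirtual card, the physical interface would be expected to report vmxnet3, and e1000 after the swap. The interface name itself may also change when the card is replaced, so the network config inside the guest might need updating.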

johsjo · April 21, 2022, 7:40am

Thanks!

I was kind of going in the direction of something with vmware being iffy.
Replacing the NIC worked like a charm.
