Gitlab-ee intermittent slowdown when pushing to projects

Reposting this bug here in case anyone else can help me:


I am hosting my own gitlab-ee deployment and am seeing a fairly strange issue where pushing to (and sometimes pulling from) projects results in an extended delay (20s+). The issue is intermittent (these slowdowns happen perhaps 4-12 times per day) and I can't see any pattern to them. Looking at the metrics, none of the GitLab components seem to be overloaded (CPU, memory, and network all look fine). It is also worth mentioning that the environment was upgraded from 13.1 all the way to 15.1.1 around 2 months ago. While it was running on 13.1 this issue did not appear at all (there were no other changes in architecture).

Looking at the logs, I have narrowed down the issue (I think) to gitlab-workhorse on the Gitaly node. I tried allocating more memory to the Gitaly instance, but that did not resolve it. I have provided an excerpt below where you can see that the PostReceivePack method takes over 20 seconds. I would like to understand what exactly happens when this method is called and what could be causing the slowdown.

Steps to reproduce

Create a new git branch with some code changes and push it to a project. As mentioned above, this issue is intermittent and can't be reliably reproduced.

What is the current bug behavior?

Git push intermittently takes 20s or more

What is the expected correct behavior?

Git push/pull should take a few seconds at most

Relevant logs and/or screenshots

An example push:

git push -u origin branch1 
Enumerating objects: 62, done.
Counting objects: 100% (59/59), done.
Delta compression using up to 8 threads
Compressing objects: 100% (47/47), done.
Writing objects: 100% (48/48), 5.97 KiB | 1.49 MiB/s, done.
Total 48 (delta 37), reused 0 (delta 0), pack-reused 0

[… ca. 25s wait here …]

remote: To create a merge request for branch1, visit:
remote:   <REDACTED>
 * [new branch]        branch1 -> branch1
Branch 'branch1' set up to track remote branch 'branch1' from 'origin'.

Log excerpt:

  "command.count": 1,
  "command.cpu_time_ms": 143,
  "command.inblock": 656,
  "command.majflt": 15,
  "command.maxrss": 232032,
  "command.minflt": 10080,
  "command.oublock": 176,
  "command.real_time_ms": 17806,
  "command.system_time_ms": 53,
  "command.user_time_ms": 89,
  "correlation_id": "01GE20TEZC3T4KQF1FZ56CRSYC",
  "grpc.code": "OK",
  "grpc.meta.auth_version": "v2",
  "grpc.meta.client_name": "gitlab-workhorse",
  "grpc.meta.deadline_type": "none",
  "grpc.meta.method_type": "bidi_stream",
  "grpc.method": "PostReceivePack",
  "grpc.request.fullMethod": "/gitaly.SmartHTTPService/PostReceivePack",
  "grpc.request.glProjectPath": "<REDACTED>",
  "grpc.request.glRepository": "<REDACTED>",
  "grpc.request.payload_bytes": 9785,
  "grpc.request.repoPath": "@hashed/<REDACTED>",
  "grpc.request.repoStorage": "default",
  "grpc.response.payload_bytes": 56,
  "grpc.service": "gitaly.SmartHTTPService",
  "grpc.start_time": "2022-09-28T12:23:23.575",
  "grpc.time_ms": 20094.992,
  "level": "info",
  "msg": "finished streaming call with code OK",
  "peer.address": "<redacted>",
  "pid": 1378,
  "span.kind": "server",
  "system": "grpc",
  "time": "2022-09-28T12:23:43.670Z",
  "user_id": "<REDACTED>",
  "username": "<REDACTED>"

The above call takes about 20 seconds to complete, whereas normally it should take a few milliseconds. I would like to understand exactly what is happening here and what could be causing this slowdown.
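
For anyone wanting to look for the same pattern, here is a minimal sketch of how such slow calls could be pulled out of the logs. It assumes each log entry is a single JSON object per line with the field names shown in the excerpt above; the 5000 ms threshold, the function name `slow_calls`, and the second (fast) sample entry are hypothetical and only there for illustration. It also reports the gap between `command.real_time_ms` and `command.cpu_time_ms`: in the excerpt the spawned command used about 143 ms of CPU over roughly 17.8 s of wall-clock time, which suggests it spent most of that time waiting rather than computing.

```python
import json

SLOW_MS = 5000  # hypothetical threshold for "slow"; tune to taste


def slow_calls(lines, method="PostReceivePack", threshold_ms=SLOW_MS):
    """Yield a summary for each slow gRPC call found in JSON log lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if not isinstance(entry, dict):
            continue
        if entry.get("grpc.method") != method:
            continue
        total_ms = entry.get("grpc.time_ms", 0)
        if total_ms < threshold_ms:
            continue
        real_ms = entry.get("command.real_time_ms", 0)
        cpu_ms = entry.get("command.cpu_time_ms", 0)
        yield {
            "correlation_id": entry.get("correlation_id"),
            "start": entry.get("grpc.start_time"),
            "grpc_time_ms": total_ms,
            "command_real_ms": real_ms,
            "command_cpu_ms": cpu_ms,
            # wall-clock time the spawned command spent off-CPU (waiting)
            "waiting_ms": real_ms - cpu_ms,
        }


# First entry uses values from the excerpt above; the second is a made-up
# fast call that should be filtered out.
SAMPLE = [
    json.dumps({
        "correlation_id": "01GE20TEZC3T4KQF1FZ56CRSYC",
        "grpc.method": "PostReceivePack",
        "grpc.start_time": "2022-09-28T12:23:23.575",
        "grpc.time_ms": 20094.992,
        "command.real_time_ms": 17806,
        "command.cpu_time_ms": 143,
    }),
    json.dumps({
        "correlation_id": "hypothetical-fast-call",
        "grpc.method": "PostReceivePack",
        "grpc.time_ms": 310.5,
        "command.real_time_ms": 280,
        "command.cpu_time_ms": 120,
    }),
]

if __name__ == "__main__":
    for call in slow_calls(SAMPLE):
        print(call)
```

To run it against a real log, pass an open file object instead of `SAMPLE`; the log location depends on the installation (on Omnibus installs it is commonly `/var/log/gitlab/gitaly/current`, but treat that path as an assumption). The `correlation_id` of each slow call can then be searched for in the logs of the other components to see where the time went.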