Git pack-objects process after backup completed

Hi.
We found that after a backup completes there are a lot of processes like this one (output from iotop):
189100 be/4 git 1131.96 M 0.00 B 0.00 % 59.92 % git pack-objects --revs --thin --stdout --progress --delta-base-offset

Via lsof we found that these processes work with repositories like this one:

cat //var/volume/repositories/@hashed/08/c1/08c159cb811c2742199fed6b07c10860a7b8e777515377e20fe0f2d90f230657.git/config
[core]
repositoryformatversion = 0
filemode = true
bare = true
sparseCheckout = true
commitGraph = true
splitIndex = false
[remote "origin"]
url = /var/opt/gitlab/backups/repositories/repo1/bingo.bundle
fetch = +refs/*:refs/*
mirror = true
[gitlab]
fullpath = repo1/bingo

but the path /var/opt/gitlab/backups/repositories is not available; it is the backup folder.
Does anyone know why the repository has such a strange URL?

Thanks.

And one more question:
From time to time we see a lot of git pack-objects processes that are doing something, but we don’t understand what it is.

Is this a leftover from the backup, or did someone download the repositories via HTTPS instead of SSH?

Hi,

Going by the info here:

https://git-scm.com/docs/git-pack-objects

it could be to do with the .git/objects/pack/*.pack and *.idx files. I expect a process is running and creating or updating these files. Depending on how many repositories you have and how large they are, this might take a while.

Under backups, the repositories directory won’t exist once the backup has finished, since it was only there temporarily while it was being archived into the tar file.

Also, on Linux, lsof lists files that a process still holds open, which does not necessarily mean they are being actively accessed at that moment. Linux also caches file data in memory for performance reasons, so repeated access does not always have to go back to the disk.
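
If it helps to see what starts these processes, here is a small sketch (my own, not something from the GitLab docs). It takes ps output and prints every git pack-objects process together with the command line of its parent. The function name pack_objects_with_parent is made up for this example. As far as I know, a parent of git upload-pack means the process is serving a clone or fetch, but treat that as an assumption to verify on your own system.

```shell
# Hypothetical helper: reads "PID PPID COMMAND..." lines on stdin and prints
# each "git pack-objects" process with the command line of its parent.
pack_objects_with_parent() {
    awk '
        {
            # Rebuild the full command line from field 3 onward.
            cmd = $3
            for (i = 4; i <= NF; i++) cmd = cmd " " $i
            cmd_of[$1] = cmd
            parent_of[$1] = $2
        }
        END {
            for (pid in cmd_of) {
                # Match "git pack-objects" with or without a leading path.
                if (cmd_of[pid] ~ "(^|[ /])git pack-objects( |$)") {
                    p = parent_of[pid]
                    parent_cmd = (p in cmd_of) ? cmd_of[p] : "?"
                    printf "%s\t%s\tparent %s: %s\n", pid, cmd_of[pid], p, parent_cmd
                }
            }
        }
    '
}

# On a live Linux system (procps ps), something like:
#   ps -eo pid=,ppid=,args= | pack_objects_with_parent
```

Running it a few times while the load is high should show whether the parents are always the same kind of process.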

Thank you for your answer.
I am more interested in the answer to this question: in what cases, and why, does GitLab run the git pack-objects process? And can any settings affect how often this process is started?

A quick Google search suggests this is normal functionality of GitLab, so it’s expected that these processes will exist and run on your server.

The search results also cover how to reduce memory usage by changing gitlab.rb to lower the number of threads, if you are worried about having too many processes or about them using too much memory on your system. Alternatively, increase the CPU/RAM available on your server to cope with the number of threads created.

As you can see, it’s normal functionality of GitLab for these processes to exist, and since they are running, there is presumably a reason for it. As far as I can see, it is nothing to worry about.

Thank you.
We have enough memory and CPU, but not enough disk performance: this process generates about 10000 - 15000 read IOPS. We plan to buy a faster disk, but we still can’t understand why this problem appeared in version 13.7 of GitLab.
The issue git-pack-objects: excessive cpu and mem usage (#1132, GitLab.org / gitaly) mentions a way to reduce the number of threads that call git-pack-objects, but the link it gives (files/gitlab-config-template/gitlab.rb.template, master, GitLab.org / omnibus-gitlab) does not point to that setting. Could you please tell me what this setting is called?

The only parameter I know for reducing threads is this one:

# sidekiq['concurrency'] = 25

You can try uncommenting that and reducing the value. On a Raspberry Pi they suggest setting it to 9, but the idea applies to any system where you want to control resources. I haven’t found anything else with Google so far, so you can try it and see if it makes a difference.

The only other changes that might be needed are on the Gitaly side, since that is more likely to affect the pack processes, especially considering the issues I posted previously: Gitaly | GitLab

Of course, after changing the setting, run gitlab-ctl reconfigure and then restart the GitLab services with gitlab-ctl restart.
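
As a rough sketch of those steps: the helper below prints any active (uncommented) sidekiq['concurrency'] line, so you can confirm the edit before reconfiguring. The function name show_sidekiq_concurrency is made up for this example, and the default path /etc/gitlab/gitlab.rb assumes an Omnibus install.

```shell
# Hypothetical helper: print uncommented sidekiq['concurrency'] lines from a
# gitlab.rb file. Returns non-zero if the setting is still commented out.
show_sidekiq_concurrency() {
    # Default path assumes an Omnibus install; pass another path as $1.
    grep -E "^[[:space:]]*sidekiq\['concurrency'\]" "${1:-/etc/gitlab/gitlab.rb}"
}

# Typical sequence after editing the file:
#   show_sidekiq_concurrency
#   sudo gitlab-ctl reconfigure
#   sudo gitlab-ctl restart
```

If the helper prints nothing, the line is probably still commented out and the default value is in effect.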

Thank you for the advice. It helped to reduce the load on the system as a whole.