We have a fairly large repo on GitLab (70 GB of .git objects) containing a few projects developed in parallel. A few months ago we started having major disk space issues on our CI workers. After investigating, we found that the .git directory took up most of the space, mostly in the form of packs. git-pack-redundant showed some completely redundant packs and a huge number of redundant objects in the remaining packs. We were able to work around the problem by running git gc more often.
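For anyone who wants to check whether they are seeing the same pack build-up, here is a rough Python sketch of how the number of packs and their total size could be measured on a worker. It only looks at the standard `objects/pack` directory inside a `.git` directory. The function name `pack_stats` is made up for this example and is not part of git.

```python
from pathlib import Path


def pack_stats(git_dir):
    """Return (number of packs, total size in bytes) for a .git directory.

    Hypothetical helper: it only counts *.pack files under objects/pack
    and ignores loose objects and the .idx files.
    """
    pack_dir = Path(git_dir) / "objects" / "pack"
    if not pack_dir.is_dir():
        # Not a git directory, or no packs have been written yet.
        return 0, 0
    packs = list(pack_dir.glob("*.pack"))
    total_bytes = sum(p.stat().st_size for p in packs)
    return len(packs), total_bytes
```

Calling `pack_stats(".git")` before and after a fetch, or before and after git gc, should make it easy to see how quickly the packs accumulate.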
In the meantime, some developers were also complaining about huge downloads during git fetch. I believe it is the same issue, since the packs on the runners come from fetching. I continued investigating with GIT_TRACE_PACKET, and in the few repro cases we have, the object negotiation seems much shorter than usual: the server acknowledges ‘ready’ almost instantly and then sends huge packs containing lots of objects that the client already has. I would like to know whether others are having similar issues or whether the problem is on our end.
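To compare a normal fetch with a bad one, the packet trace can be summarized by counting how many `have` lines the client sent before the server answered `ready`. Below is a minimal Python sketch. It assumes the trace was written to a file (GIT_TRACE_PACKET accepts an absolute file path) and that the relevant lines look like `fetch> have <oid>` from the client and either `fetch< ACK <oid> ready` or a bare `fetch< ready` from the server. The exact format may differ between git versions and protocol versions, and the name `summarize_negotiation` is made up for this example.

```python
import re

# Assumed line shapes in a GIT_TRACE_PACKET trace; adjust if yours differ.
HAVE_RE = re.compile(r"fetch> have [0-9a-f]{40,64}")
READY_RE = re.compile(r"fetch< (?:ACK [0-9a-f]{40,64} ready|ready)\s*$")


def summarize_negotiation(trace_lines):
    """Count client 'have' lines and how many were sent before 'ready'.

    Returns a dict with:
      haves              - total number of 'have' lines in the trace
      haves_before_ready - 'have' lines seen before the first 'ready',
                           or None if the server never said 'ready'
    """
    haves = 0
    haves_before_ready = None
    for line in trace_lines:
        if HAVE_RE.search(line):
            haves += 1
        elif haves_before_ready is None and READY_RE.search(line):
            haves_before_ready = haves
    return {"haves": haves, "haves_before_ready": haves_before_ready}
```

The function takes any iterable of lines, so an open trace file can be passed in directly. A very small `haves_before_ready` on a fetch that then downloads a large pack would be consistent with the short negotiation described above.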
I want to add that the repo did not suddenly increase in size, so that is not the issue.
We recently updated our GitLab server to GitLab Community Edition 13.1.1, but the problem is still there. We can reproduce it with git version 2.25.1.windows.1 and git version 2.27.0.windows.1.