I have a repository with a rich history of changes. To keep things clean, I decided to trim its history by removing old branches and truncating commit history. For testing purposes, I deleted all branches and every commit except the newest one, making it the new HEAD. I also removed all tags, merge requests, and pipelines using the API.
Despite these cleanup efforts, there are still gigabytes of files in Git LFS (Large File Storage). I ran housekeeping from the web UI and through a rake task, and I also ran a job to prune orphaned LFS files. I even checked the repository on the server itself to make sure no references remained:
$ cd /var/opt/gitlab/git-data/repositories/@hashed/aa/bb/xxxx.git/refs
$ ls heads/
# <empty>
$ ls tags/
# <empty>
$ ls merge-requests/
# <empty>
$ ls keep-around/
# <empty>
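One caveat with the directory listing: `ls` only shows loose refs, while refs that Git has packed are stored in the `packed-refs` file. A minimal sketch of a check that covers both kinds, assuming it is pointed at the same bare repository path as above (the helper name `count_refs` is hypothetical):

```shell
# Count every ref in a repository, whether loose or packed.
# Usage: count_refs /path/to/repo.git
count_refs() {
  # for-each-ref reads loose refs and packed-refs alike;
  # tr strips the padding some wc implementations add.
  git --git-dir="$1" for-each-ref --format='%(refname)' | wc -l | tr -d ' '
}
```

After the cleanup described above, a result of 1 would correspond to the single remaining branch; anything higher would point at refs that the `ls` checks did not show.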
I found the following workaround:
1. Export the project.
2. List all files that are still referenced in LFS:
   git lfs ls-files --all -l
3. Open the exported archive.
4. Delete all files from tree/lfs-objects that were not listed in step 2.
5. Compress the archive again.
6. Delete the old repository.
7. Let GitLab delete the orphaned files from LFS:
   sudo gitlab-rake gitlab:cleanup:orphan_lfs_files
8. Create a new project at the same place and import the modified export.
9. Verify the project integrity:
   sudo gitlab-rake gitlab:git:fsck
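The manual deletion of unlisted files from the export is the most error-prone part, so it could be scripted roughly as follows. This is only a sketch under assumptions: it presumes each file under `tree/lfs-objects` is named after its LFS OID, and the function name `prune_lfs_objects`, the keep-list file `keep.txt`, and the archive names are hypothetical. By default it only prints what it would remove.

```shell
# List (or delete) files in an unpacked export's LFS directory whose
# names do not appear in a keep-list of OIDs (one OID per line).
# Usage: prune_lfs_objects KEEP_LIST LFS_DIR [--delete]
prune_lfs_objects() {
  local keep_list=$1 lfs_dir=$2 mode=${3:-}
  local file oid
  for file in "$lfs_dir"/*; do
    [ -f "$file" ] || continue
    oid=$(basename "$file")
    # -x: match the whole line, -F: fixed string, -q: no output
    if ! grep -qxF -- "$oid" "$keep_list"; then
      if [ "$mode" = "--delete" ]; then
        rm -- "$file"
      fi
      echo "$oid"   # report each unreferenced object
    fi
  done
}

# Possible usage (file names are placeholders):
#   git lfs ls-files --all -l | awk '{print $1}' | sort -u > keep.txt
#   mkdir export && tar -xzf export.tar.gz -C export
#   prune_lfs_objects keep.txt export/tree/lfs-objects            # dry run
#   prune_lfs_objects keep.txt export/tree/lfs-objects --delete
#   tar -czf export-pruned.tar.gz -C export .
```

Running the dry run first and comparing its output against the keep-list gives a chance to catch a wrong assumption about the file naming before anything is deleted.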
After following these steps, the orphaned files are finally deleted. However, this process seems error-prone and somewhat risky.
What am I missing that prevents GitLab from automatically garbage collecting the orphaned files?