When I restore gitlab-ce backup data, some warehouses come back empty. The affected warehouses are all larger than 2 GB. Is this a bug in gitlab-ce?
Warehouse?
There is nothing in GitLab called a warehouse, but naturally anything can grow (almost) without limits, if the stuff you put in is big, and depending on what you’re talking about there are reasons it can take up space while looking (in some ways) empty. So you need to use the correct terms and be more specific (i.e. what do you see, and what did you expect).
Thanks for your reply. The project, created on GitLab CE 15.0.7, has been growing over time and is currently 6.5 GB. When I back up the entire gitlab-ce instance, the backup reports success, and when I restore it on the new gitlab-ce instance, the other projects are restored successfully. However, the data of this 6.5 GB project is not recovered: opening the project in the gitlab-ce UI looks like a newly created empty project, with no data at all. Does gitlab-ce have special requirements for restoring big projects, or is this a bug in gitlab-ce?
If you were on GitLab 15.1.0 or higher, you could debug this pretty easily; see "Back up GitLab" in the GitLab docs.
From 15.1.0 and higher, it would be possible to run a backup manually for just the 6.5GB repository. You could then copy/paste the output from the console so that we could then see what is actually happening with the backup, and what is being backed up and not backed up.
I would suggest upgrading your 15.0.7 to 15.1.6 as per the upgrade path (see "Upgrading GitLab" in the GitLab docs), then taking a backup of that 6.5GB repository to see what happens. You can then restore this backup to a new install of GitLab 15.1.6 and see if it restores the full amount. If it does, all is OK. Then do the next upgrade on the upgrade path and repeat. That way you can find out which version upgrade causes the problem. After each upgrade, make sure that background migrations have finished before starting the next one; the GitLab upgrade docs explain how to check that.
Perhaps you missed an upgrade on the upgrade path or didn’t wait for background migrations to finish when you were upgrading to 15.10.7. I doubt this is a bug with Gitlab, since I have upgraded many Gitlab installs from 15.0.x to 16.x without any issues like this.
Thank you for your reply. First, let me describe what I did. The original gitlab-ce was 12.2.4; I upgraded it to 15.10.7 following the upgrade path on the official website, and the background migrations of each version completed. After running gitlab-ce 15.10.7 for 14 days, I found that its backup files could not be restored in a new environment. I then exported the projects from that gitlab-ce 15.10.7 using project export and imported them one by one into a new gitlab-ce 15.10.7. After the import succeeded, the new gitlab-ce 15.10.7 ran for a while and was backed up again. In the new gitlab-ce 15.10.7 instance, I found that the 6.5 GB and 2 GB projects could not be recovered. Does gitlab-ce 15.10.x restrict backup and restore for projects larger than 2 GB?
{"level":"fatal","msg":"create: pipeline: 1 failures encountered:\n - @hashed/84/a5/84a5092e4a5b6fe968fd523fb2fc917dbffae44105f82b6b94c8ed5b9a800223.git (cloud/cloudservers): manager: write bundle: *backup.FilesystemSink write: write file \"/var/opt/gitlab/backups/repositories/@hashed/84/a5/84a5092e4a5b6fe968fd523fb2fc917dbffae44105f82b6b94c8ed5b9a800223/1687633260_2023_06_25_15.10.7/001.bundle\": rpc error: code = Internal desc = cmd wait failed: exit status 1, stderr: \"error: inflate: data stream error (incorrect data check)\nfatal: packed object e57ddaf5456b40e7aa17efe2e272e380414ff9d3 (stored in /var/opt/gitlab/git-data/repositories/@hashed/84/a5/84a5092e4a5b6fe968fd523fb2fc917dbffae44105f82b6b94c8ed5b9a800223.git/objects/pack/pack-746cd47e39c97c34e1c12512ac526a01b91aa5c6.pack) is corrupt\nerror: pack-objects died\n\"\n","pid":16472,"time":"2023-06-24T19:04:20.371Z"}
2023-06-25 03:04:20 +0800 -- Deleting tar staging files ...
2023-06-25 03:04:20 +0800 -- Cleaning up /var/opt/gitlab/backups/db
2023-06-25 03:04:22 +0800 -- Cleaning up /var/opt/gitlab/backups/repositories
2023-06-25 03:04:24 +0800 -- Deleting tar staging files ... done
2023-06-25 03:04:24 +0800 -- Deleting backups/tmp ...
2023-06-25 03:04:24 +0800 -- Deleting backups/tmp ... done
2023-06-25 03:04:24 +0800 -- Deleting backup and restore lock file
rake aborted!
Backup::Error: gitaly-backup exit status 1
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/gitaly_backup.rb:59:in `finish!'
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/repositories.rb:28:in `dump'
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/manager.rb:66:in `run_create_task'
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/manager.rb:211:in `block in run_all_create_tasks'
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/manager.rb:210:in `each_key'
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/manager.rb:210:in `run_all_create_tasks'
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/manager.rb:42:in `create'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/backup.rake:16:in `block (4 levels) in <top (required)>'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/backup.rake:203:in `lock'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/backup.rake:13:in `block (3 levels) in <top (required)>'
/opt/gitlab/embedded/bin/bundle:23:in `load'
/opt/gitlab/embedded/bin/bundle:23:in `<main>'
Tasks: TOP => gitlab:backup:create
(See full trace by running task with --trace)
2023-06-26 03:00:59 +0800 -- Dumping database ...
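The fatal line in the log above is long, and the useful part is near the end: git reports that a specific packed object in a specific pack file is corrupt. That suggests the backup is failing on the repository data on disk rather than on a size limit, though the log alone does not prove it. Below is a minimal Python sketch for pulling the object ID and pack path out of such output. It assumes the message format shown in the log; the function name `find_corrupt_objects` and the regular expression are illustrative, not part of GitLab or git.

```python
import re

# stderr text copied from the gitaly-backup log line above.
STDERR = (
    "error: inflate: data stream error (incorrect data check)\n"
    "fatal: packed object e57ddaf5456b40e7aa17efe2e272e380414ff9d3 "
    "(stored in /var/opt/gitlab/git-data/repositories/@hashed/84/a5/"
    "84a5092e4a5b6fe968fd523fb2fc917dbffae44105f82b6b94c8ed5b9a800223.git"
    "/objects/pack/pack-746cd47e39c97c34e1c12512ac526a01b91aa5c6.pack) "
    "is corrupt\n"
    "error: pack-objects died\n"
)

# Assumed message shape: "packed object <40-hex id> (stored in <path>.pack) is corrupt"
CORRUPT_RE = re.compile(
    r"packed object ([0-9a-f]{40}) \(stored in (\S+\.pack)\) is corrupt"
)


def find_corrupt_objects(text):
    """Return a list of (object_id, pack_path) pairs found in git error output."""
    return CORRUPT_RE.findall(text)


for object_id, pack_path in find_corrupt_objects(STDERR):
    print("corrupt object:", object_id)
    print("pack file:     ", pack_path)
```

Running this over the full backup log would list every repository pack that git refused to read, which narrows down which projects to inspect first.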
Backups can only be restored to the same version. So if you have a backup from 15.0.7 then it can only be restored to 15.0.7. You cannot restore it to newer versions.
If you are missing data after upgrades, then something evidently went wrong during the upgrade: either when the repositories were converted to the new storage format, where the directories under /var/opt/gitlab/git-data go from proper names to hashed ones, or possibly later when the storage was converted to Gitaly.
You can't even restore a backup made on EE onto CE without editing it. (It's quite a simple edit, and one that we did when we downgraded.) The error shown is not the one you get when the versions of backup and restore mismatch, though.
It would be advantageous if @CaptainsDT showed what command produced that output, and told us a bit about why he did that.
Thank you for your reply. I did not restore data across versions. The 15.10.7 backup was restored to a new gitlab-ce instance of the same version, and the data was still lost. I also tried to push a project to a new gitlab-ce instance by mirroring the repository. The versions were consistent, but there was a data stream error and the mirroring failed.
My gitlab-ce is installed in Docker and has always been CE; I have never used EE. It is started through docker-compose, so the backup command is /usr/bin/docker exec -t fa34f0f46b1e gitlab-backup create STRATEGY=copy. Is that helpful for analyzing the problem?
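Since the instance is on 15.10.7, the single-repository backup suggested earlier in the thread should be available. A hedged Python sketch of how the command above could be narrowed to the one failing project follows. It assumes the `REPOSITORIES_PATHS` option described in the GitLab backup docs for 15.1 and later, and takes the project path `cloud/cloudservers` from the log; the helper name `single_repo_backup_command` is made up for illustration.

```python
import shlex


def single_repo_backup_command(container, project_path, strategy="copy"):
    """Build a docker exec command that backs up only one project's repository.

    REPOSITORIES_PATHS is assumed to behave as documented for GitLab 15.1+.
    """
    return [
        "/usr/bin/docker", "exec", "-t", container,
        "gitlab-backup", "create",
        f"STRATEGY={strategy}",
        f"REPOSITORIES_PATHS={project_path}",
    ]


# Container ID from the post above, project path from the fatal log line.
cmd = single_repo_backup_command("fa34f0f46b1e", "cloud/cloudservers")
print(shlex.join(cmd))
```

The printed command can be pasted into a shell on the Docker host, and its console output posted here, so the failure can be looked at for that one repository without waiting for a full instance backup.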