How can I make sure a restore was successful and no data was lost?

I created a GitLab CE v15.10.1 backup and restored it to a new VM.

The git-data/repositories folder on the new VM is 284GB, which is smaller than the 306GB on the old VM. The difference is around 22GB.

The folder names under git-data/repositories on the new VM also do not match the repository names that appear on the old VM. The folders on the new VM are as follows:

root@test-restore-gitlab-backup:/data/git-data/repositories# ls -la
total 28
drwxrws---   6 git git 4096 Sep 14 02:00 .
drwx------   3 git git 4096 Sep  6 10:21 ..
drwxr-sr-x   4 git git 4096 Sep 14 00:20 +gitaly
-rw-------   1 git git   64 Sep  6 10:23 .gitaly-metadata
drwxr-s--- 251 git git 4096 Sep  6 17:39 @hashed
drwxr-s---   3 git git 4096 Sep  6 17:46 @pools
drwxr-s---  36 git git 4096 Sep  6 17:45 @snippets
root@test-restore-gitlab-backup:/data/git-data/repositories# du -sh .
284G    .

How can I check whether the restore was successful and no data was lost, given that the size difference (22GB) is quite large?

I found these references:

However, I am unsure if the repository decreased in size due to the change from the legacy name to the hashed name.
Does anyone know about this?

306>284, so from the numbers it looks like the repositories take up (a little - less than 8%) more space on the new server than they did in the old. On the other hand 284 is the number reported in the output you show and say is from the new server, so you might just have switched the numbers around.

Since you can only restore to the exact same version (though that is not hard to get around if you know what you’re doing), and I believe non-hashed storage has been deprecated for a while, I think hashed storage must already have been in use before the backup was made, so it is not part of the explanation. (I don’t remember the details, but storing different files together can give a small reduction. In most cases it wouldn’t add up to 8%, but if all your repos have many small files, I guess it could.)

My guess would be different block sizes (perhaps from different file systems) between the two servers. The command(s) to check that depend on the file systems used, so I can’t offer a simple pair of commands to check that.
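
That said, one rough check that does not depend on the file system is to compare the allocated size (what du -sh reports) with the apparent size (the sum of the file lengths) on both servers. If the apparent sizes are close while the allocated sizes differ, block size or allocation overhead is a likely cause. This is only a sketch: the helper name size_report is made up for illustration, and --apparent-size assumes GNU du, which Ubuntu ships.

```shell
# Print the allocated and apparent size (both in KiB) of a directory tree.
# Assumption: GNU du is installed; --apparent-size is a GNU extension.
size_report() {
    dir=$1
    # Space actually allocated on disk, in KiB.
    allocated=$(du -sk "$dir" | cut -f1)
    # Sum of file lengths, independent of block size, in KiB.
    apparent=$(du -sk --apparent-size "$dir" | cut -f1)
    echo "allocated_kib=$allocated apparent_kib=$apparent"
}

# Usage, run on both servers (path taken from the listing in the question):
#   size_report /data/git-data/repositories
```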


Hi @grove, I’ve corrected the sizes: the folder on the new VM (284GB) is smaller than the one on the old VM (306GB).

The OS of the new VM is Ubuntu 22.04.3 LTS, and the OS of the old VM is Ubuntu 18.04.6 LTS.

I also found that the git-data/repositories folder on the old VM contains an @hashed folder whose size is 304GB.

I tried restoring yesterday’s backup tar file (294GB) on another new VM, and this time git-data/repositories came to 253GB.

Everything seems fine in the GitLab web interface, but error messages appeared during the restore, so I am not sure whether something went wrong.

2023-09-15 19:13:29 +0800 -- Deleting tar staging files ...
2023-09-15 19:13:29 +0800 -- Cleaning up /data/gitlab-backups/backup_information.yml
2023-09-15 19:13:29 +0800 -- Cleaning up /data/gitlab-backups/db
2023-09-15 19:13:29 +0800 -- Cleaning up /data/gitlab-backups/repositories
2023-09-15 19:13:30 +0800 -- Cleaning up /data/gitlab-backups/uploads.tar.gz
2023-09-15 19:13:30 +0800 -- Cleaning up /data/gitlab-backups/builds.tar.gz
2023-09-15 19:13:30 +0800 -- Cleaning up /data/gitlab-backups/artifacts.tar.gz
2023-09-15 19:13:30 +0800 -- Cleaning up /data/gitlab-backups/pages.tar.gz
2023-09-15 19:13:30 +0800 -- Cleaning up /data/gitlab-backups/lfs.tar.gz
2023-09-15 19:13:30 +0800 -- Cleaning up /data/gitlab-backups/terraform_state.tar.gz
2023-09-15 19:13:30 +0800 -- Cleaning up /data/gitlab-backups/packages.tar.gz
2023-09-15 19:13:30 +0800 -- Deleting tar staging files ... done
2023-09-15 19:13:30 +0800 -- Deleting backups/tmp ...
2023-09-15 19:13:30 +0800 -- Deleting backups/tmp ... done
2023-09-15 19:13:30 +0800 -- Deleting backup and restore lock file
rake aborted!
Backup::Error: gitaly-backup exit status 1
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/gitaly_backup.rb:59:in `finish!'
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/repositories.rb:37:in `restore'
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/manager.rb:100:in `run_restore_task'
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/manager.rb:233:in `block in run_all_restore_tasks'
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/manager.rb:231:in `each_key'
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/manager.rb:231:in `run_all_restore_tasks'
/opt/gitlab/embedded/service/gitlab-rails/lib/backup/manager.rb:75:in `restore'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/backup.rake:26:in `block (4 levels) in <top (required)>'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/backup.rake:203:in `lock'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/backup.rake:23:in `block (3 levels) in <top (required)>'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/rake-13.0.6/lib/rake/task.rb:281:in `block in execute'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/rake-13.0.6/lib/rake/task.rb:281:in `each'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/rake-13.0.6/lib/rake/task.rb:281:in `execute'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/sentry-ruby-core-5.1.1/lib/sentry/rake.rb:26:in `execute'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/rake-13.0.6/lib/rake/task.rb:219:in `block in invoke_with_call_chain'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/rake-13.0.6/lib/rake/task.rb:199:in `synchronize'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/rake-13.0.6/lib/rake/task.rb:199:in `invoke_with_call_chain'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/rake-13.0.6/lib/rake/task.rb:188:in `invoke'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/rake-13.0.6/lib/rake/application.rb:160:in `invoke_task'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/rake-13.0.6/lib/rake/application.rb:116:in `block (2 levels) in top_level'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/rake-13.0.6/lib/rake/application.rb:116:in `each'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/rake-13.0.6/lib/rake/application.rb:116:in `block in top_level'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/rake-13.0.6/lib/rake/application.rb:125:in `run_with_threads'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/rake-13.0.6/lib/rake/application.rb:110:in `top_level'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/rake-13.0.6/lib/rake/application.rb:83:in `block in run'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/rake-13.0.6/lib/rake/application.rb:186:in `standard_exception_handling'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/rake-13.0.6/lib/rake/application.rb:80:in `run'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/rake-13.0.6/exe/rake:27:in `<top (required)>'
/opt/gitlab/embedded/bin/rake:25:in `load'
/opt/gitlab/embedded/bin/rake:25:in `<top (required)>'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/bundler-2.3.15/lib/bundler/cli/exec.rb:58:in `load'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/bundler-2.3.15/lib/bundler/cli/exec.rb:58:in `kernel_load'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/bundler-2.3.15/lib/bundler/cli/exec.rb:23:in `run'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/bundler-2.3.15/lib/bundler/cli.rb:483:in `exec'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/bundler-2.3.15/lib/bundler/vendor/thor/lib/thor/invocation.rb:127:in `invoke_command'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/bundler-2.3.15/lib/bundler/vendor/thor/lib/thor.rb:392:in `dispatch'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/bundler-2.3.15/lib/bundler/cli.rb:31:in `dispatch'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/bundler-2.3.15/lib/bundler/vendor/thor/lib/thor/base.rb:485:in `start'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/bundler-2.3.15/lib/bundler/cli.rb:25:in `start'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/bundler-2.3.15/exe/bundle:48:in `block in <top (required)>'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/bundler-2.3.15/lib/bundler/friendly_errors.rb:117:in `with_friendly_errors'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/bundler-2.3.15/exe/bundle:36:in `<top (required)>'
/opt/gitlab/embedded/bin/bundle:23:in `load'
/opt/gitlab/embedded/bin/bundle:23:in `<main>'
Tasks: TOP => gitlab:backup:restore

I also noticed a size reduction when restoring a backup on a new server in the past. However, I didn’t get any errors as part of the restore process, so I assumed it was just due to some junk that isn’t backed up or perhaps there is something happening with compression. It would be interesting to get an official answer on this.
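
One way to test the "junk that isn't backed up" guess, as a rough sketch: git count-objects -v reports how much of a repository consists of loose objects, packs, and garbage (all in KiB), so running it on the same repository on both servers shows where the difference sits. The helper name summarize_count_objects is hypothetical, and the repository path in the usage comment is a placeholder, not a real path.

```shell
# Condense the output of `git count-objects -v` (read from stdin) into one line.
# git reports the "size", "size-pack" and "size-garbage" fields in KiB.
summarize_count_objects() {
    awk -F': ' '
        $1 == "size"         { loose = $2 }
        $1 == "size-pack"    { packed = $2 }
        $1 == "size-garbage" { garbage = $2 }
        END { printf "loose_kib=%d packed_kib=%d garbage_kib=%d\n", loose, packed, garbage }
    '
}

# Usage, run on both servers for the same repository (placeholder path):
#   git -C /data/git-data/repositories/@hashed/<aa>/<bb>/<hash>.git count-objects -v \
#       | summarize_count_objects
```

If the old server shows a large loose or garbage figure that the new server lacks, that would be consistent with a harmless size drop.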

Anyway, I’ve not become aware of any problems or broken repos and my restore was >9 months ago.

I’d definitely take a closer look at the source of those errors though if I were you.


I’ve unmounted the git-data/repositories disk from the old VM and mounted it on the new VM.
I can then log in to GitLab on the new VM, but I found that PostgreSQL (main) is v13.8 on the new VM while it is v12.12 on the old VM.

I am not sure if this will cause any problems. Should I upgrade PostgreSQL to v13.8 and then mount the disk on the new VM?

I found that this error may be due to insufficient disk space.

My disk is 800G and the backup file is 294G, but the restore used up to 883G (including the backup tar file, the unpacked files, and the files copied or moved into git-data/repositories). The restore therefore did not complete and showed the error message above. As a result, there was only 250G in the git-data/repositories folder, which is smaller than the backup file itself.

After switching to a disk with 1T of space, the restore completed successfully.
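
For anyone hitting the same thing, here is a rough pre-check sketch that compares the free space on the target disk with a multiple of the backup file size before restoring. The factor of 3 is only my observation from the numbers above (294G backup, 883G peak usage), not an official figure. The function name enough_space and the file name in the usage comment are placeholders.

```shell
# Succeed if the file system holding DIR has at least FACTOR times the size
# of BACKUP_FILE available. FACTOR defaults to 3 (an assumption based on the
# 294G backup needing about 883G during restore).
enough_space() {
    backup_file=$1
    dir=$2
    factor=${3:-3}
    # Backup size in KiB, rounded up.
    backup_kib=$(( ($(wc -c < "$backup_file") + 1023) / 1024 ))
    # Column 4 of POSIX-format df output is the available space in KiB.
    avail_kib=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    needed_kib=$((backup_kib * factor))
    echo "needed_kib=$needed_kib available_kib=$avail_kib"
    [ "$avail_kib" -ge "$needed_kib" ]
}

# Usage (placeholder file name):
#   enough_space /data/gitlab-backups/<backup>.tar /data || echo "not enough space"
```

If the tar file already sits on the same disk, this check is on the conservative side, because the 883G figure included the tar file itself.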

I found that these commands may help me answer this question:

sudo gitlab-rake gitlab:check SANITIZE=true
gitlab-rake gitlab:storage:migrate_to_hashed
gitlab-rake gitlab:storage:hashed_projects

I’ve restored the backup on a VM with sufficient disk space (3x the backup size). The specified data_dir folders contain only repositories with hashed names and none named after a username.

I also ran the “gitlab-rake gitlab:check” command, and everything seems OK.
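
Beyond gitlab:check, GitLab also documents integrity-check Rake tasks for repositories, artifacts, LFS objects, and uploads. Below is a small sketch that runs them one after another and counts the failures. The wrapper name run_integrity_checks and the RAKE override are my own additions for illustration, and on an Omnibus install the tasks need root (for example RAKE="sudo gitlab-rake").

```shell
# Run GitLab's check tasks in sequence and count how many fail.
# RAKE can be overridden, e.g. RAKE=echo for a dry run that only prints
# the task names (assumption: gitlab-rake is on PATH otherwise).
run_integrity_checks() {
    rake=${RAKE:-gitlab-rake}
    failures=0
    for task in gitlab:check gitlab:git:fsck gitlab:artifacts:check \
                gitlab:lfs:check gitlab:uploads:check; do
        echo "== $task"
        # $rake is intentionally unquoted so "sudo gitlab-rake" splits into words.
        if ! $rake "$task"; then
            failures=$((failures + 1))
        fi
    done
    echo "failed_tasks=$failures"
    [ "$failures" -eq 0 ]
}

# Usage:
#   RAKE="sudo gitlab-rake"; run_integrity_checks
```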

And it looks fine now, so I think this issue is resolved.

Glad you got it sorted. The newer PostgreSQL version is a good thing (fresh installations sometimes use a newer version while existing installations only upgrade when it is absolutely necessary - or you can do it manually, there is documentation about that).

I doubt you had unmigrated repositories (not in hashed storage) to begin with, as that change happened a long time ago.
