Restore with only the data path (no database)

We have a problem: we lost a file server that stored a medium-sized GitLab instance (500 repos, 250 users).

Of course there were ZFS snapshots (but those were lost with the file server) and an external nightly backup. Sadly, we noticed that due to a configuration change a year ago, only the data and registry paths were saved: somebody commented out gitlab_rails['backup_path'], so backups were written to the default location, which was not backed up.

Therefore we only have the data directory and the registry.

Is it possible to restore a GitLab instance from just that? Or is there any other advice?

(I know we will surely lose user bindings, groups, merge requests, tickets, etc., but as the instance holds several terabytes, we cannot simply try things a few times, since a plain copy takes nearly a day.)

Thanks a lot, Jo

When you create a new backup, you can use SKIP=db, according to the documentation.

< potentially dangerous suggestion >
What was skipped when creating a backup is stored in backup_information.yml in the root of the backup, and that file seems easy to edit. If you do that, you may also want to delete db/database.sql.gz from the backup.
</ potentially dangerous suggestion >

I believe the database contains some info about the repositories, so while the actual data might be restored that way, it might not be usable.

And just in case the tags weren’t clear: you might destroy things if you do that. My guess is that the worst that can happen is that you end up with a useless backup, but it’s only a guess. If you follow that advice, you get to keep all the pieces and don’t get to complain. I would like to know if it worked, though.

I assume this was a recent version of GitLab, so it was using hashed storage.
You can re-create the groups and repos in a new GitLab instance. The data from the backup are just bare git repos, so if you copy them to the correct directories, GitLab should be able to work with them. The hardest part is figuring out which data belongs to which repo, but there used to be a GitLab metadata file in each directory with the repo name and path. I am not sure if it’s still there.

I can’t help with the registry, tho.

I just checked the on-disk files on a running GitLab (the backup is pretty much a copy of those): in /var/opt/gitlab/git-data/repositories/@hashed/8f/1f/8[...].git/config there is a [gitlab] section with a fullpath key that tells you which project these files belong to.
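
To make that concrete, here is a minimal sketch in Python of how one could walk a copy of the repositories directory and read that key from each bare repo. The function names are made up for illustration, and the parser is deliberately simple (it only looks for a fullpath key inside a [gitlab] section), so treat it as a starting point rather than a tested tool.

```python
import os


def read_gitlab_fullpath(config_path):
    """Return the fullpath value from the [gitlab] section, or None if absent."""
    section = None
    with open(config_path, encoding="utf-8", errors="replace") as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith(("#", ";")):
                continue  # skip blank lines and comments
            if line.startswith("[") and line.endswith("]"):
                section = line[1:-1].strip().lower()
                continue
            if section == "gitlab" and "=" in line:
                key, value = line.split("=", 1)
                if key.strip().lower() == "fullpath":
                    return value.strip()
    return None


def map_hashed_repos(repositories_root):
    """Map each *.git directory under repositories_root to its fullpath (or None)."""
    mapping = {}
    for dirpath, dirnames, filenames in os.walk(repositories_root):
        dirnames.sort()  # deterministic traversal order
        if dirpath.endswith(".git") and "config" in filenames:
            mapping[dirpath] = read_gitlab_fullpath(os.path.join(dirpath, "config"))
            dirnames[:] = []  # do not descend into the bare repo itself
    return mapping
```

Calling map_hashed_repos on a copy of /var/opt/gitlab/git-data/repositories should give a dictionary you can print or dump to a file. Entries with None are repos where no fullpath was found and that need manual inspection.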

I had a similar problem yesterday and found a way to bulk-import all projects into a new GitLab server.

If you have the git-data folder, you can import all of the data in it into the new server.
First you have to set up a new GitLab server and copy the git-data folder to it (under a different name). Then you can run this command (import is the folder name):

gitlab-rake gitlab:import:repos['./import/']

GitLab now loads all repositories.
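
One caveat if the old instance used hashed storage: as far as I understand it, the import task derives group and project names from the directory layout, so repos sitting under @hashed/... would first need to be laid out under their real paths. A rough Python sketch of that step follows, assuming you already have a mapping from each hashed repo directory to its full path (for example from the fullpath key mentioned above). The function names and the dry_run flag are hypothetical, not part of GitLab.

```python
import shutil
from pathlib import Path


def plan_import_layout(mapping, import_root):
    """Return (source, destination) pairs placing each repo at <import_root>/<fullpath>.git.

    mapping: dict of {hashed repo directory: full project path such as "group/project"}.
    """
    import_root = Path(import_root)
    plan = []
    for source, fullpath in sorted(mapping.items()):
        parts = [p for p in fullpath.split("/") if p]
        # Refuse paths that are empty or would escape the import directory.
        if not parts or any(p in (".", "..") for p in parts):
            raise ValueError(f"suspicious full path: {fullpath!r}")
        destination = import_root.joinpath(*parts[:-1], parts[-1] + ".git")
        plan.append((Path(source), destination))
    return plan


def apply_plan(plan, dry_run=True):
    """Copy each repo to its destination; with dry_run=True only print what would happen."""
    for source, destination in plan:
        if dry_run:
            print(f"would copy {source} -> {destination}")
            continue
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copytree(source, destination)
```

Given how long a copy of several terabytes takes, running with dry_run=True first and reading through the printed plan is probably worth the few extra minutes.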

Source: repository - Move repo from plain git to GitLab server - Server Fault

Thanks a lot for your proposals. We set up a flexible path so we can try multiple methods. I will keep you informed about whether it worked.