Back up GitLab without (size of repos * 2) provisioned on the GitLab server

I’m trying to implement a proper backup strategy for GitLab before I start using it in production. Ideally, a job on the backup server would pull the backups, rather than have GitLab push them.

If I need to take a backup of GitLab, according to this page I can either use the Omnibus rake tasks, or take a snapshot and rsync the data out.

Option One - Rake Task

This option requires that I provision 2x the storage I actually want available for GitLab, since the backups are created on the local file system. I could work around this by putting the backup path on NFS, but NFS is unauthenticated and would instead require IP-based access control, which I personally find a bit ugly.
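
For concreteness, here is the NFS workaround I mean. The `backup_path` setting is real Omnibus config; the export, mount point, and host names are made up:

```
# Mount an NFS export and point the Omnibus backup path at it
# (backup-server and the paths are hypothetical names).
mount -t nfs backup-server:/export/gitlab-backups /mnt/gitlab-backups

# Then, in /etc/gitlab/gitlab.rb:
#   gitlab_rails['backup_path'] = "/mnt/gitlab-backups"

# Apply the config change:
gitlab-ctl reconfigure
```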

Unless I can somehow get GitLab to stream the backup tarball to stdout, I can’t make this work cleanly without generating writes to the local file system.
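
The best pull-driven version of Option One I can come up with looks like the sketch below, run from the backup server. It assumes root ssh access, a container named gitlab, and the data volume mounted at /srv/gitlab/data (all hypothetical names). The tarball is still written on the GitLab host first, so this only shortens the window during which the extra space is needed, which is why I'm not happy with it:

```
# Trigger the Omnibus backup rake task inside the container, then pull
# the resulting tarball out and clean up (names/paths are assumptions).
ssh root@gitlab-host 'docker exec -t gitlab gitlab-rake gitlab:backup:create'
rsync -aH root@gitlab-host:/srv/gitlab/data/backups/ /backups/gitlab/
ssh root@gitlab-host 'rm -f /srv/gitlab/data/backups/*.tar'
```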

Option Two - Snapshots + Rsync

This is the route I think would scale best, but I simply can’t get it to work. For now, to test, I’m shutting the GitLab services down entirely during the backup instead of taking an LVM snapshot, which should be at least as atomic, if not more so. Even so, the restore doesn’t come back properly.
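
For reference, the snapshot variant I eventually want would look roughly like this, assuming an LVM volume vg0/gitlab-data holding the GitLab data on ext4 (the volume group, LV, and host names are hypothetical):

```
# Take a read-only consistent view of the data via an LVM snapshot,
# rsync it off-box, then drop the snapshot.
lvcreate --snapshot --size 5G --name gitlab-snap /dev/vg0/gitlab-data
mount -o ro /dev/vg0/gitlab-snap /mnt/gitlab-snap
rsync -aHX /mnt/gitlab-snap/ root@backup-server:/srv/gitlab-backup/
umount /mnt/gitlab-snap
lvremove -f /dev/vg0/gitlab-snap
```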

Here’s the process I’m trying to get through, unsuccessfully (scripted out after the list):

  • Shut down the GitLab service (this is just the official GitLab Docker container running on CoreOS)
  • Run rsync -aHX to copy all GitLab data to the backup server
  • Remove the local data
  • Run rsync -aHX to copy all GitLab data back from the backup server
  • Start the GitLab service
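
In script form, the round trip is roughly the following; gitlab.service and the /srv/gitlab data path are stand-ins for however the container and its bind mount are actually named on your host:

```
#!/bin/sh
# Round-trip backup/restore test, run as root on the GitLab host.
set -e

systemctl stop gitlab.service

# Push everything to the backup server.
rsync -aHX /srv/gitlab/ root@backup-server:/srv/gitlab-backup/

# Simulate a restore: wipe the local data and pull it back.
rm -rf /srv/gitlab
rsync -aHX root@backup-server:/srv/gitlab-backup/ /srv/gitlab/

systemctl start gitlab.service
```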

The restore results in a crash on cache-clear, and the GitLab container never stays up. According to the GitLab docs page linked above this should be possible; I’m just not seeing where things are going wrong.
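
My only guess so far is metadata being lost in transit: -aHX preserves hard links and xattrs but not POSIX ACLs, and if either end maps users by name, ownership inside the container’s volume could come back wrong. An untested variant I plan to try:

```
# Guesswork, not a confirmed fix: -A also carries POSIX ACLs, and
# --numeric-ids stops rsync from remapping UIDs/GIDs by user name
# (which matters when the backup server's passwd differs from the
# container's). Requires root on both ends.
rsync -aHAX --numeric-ids /srv/gitlab/ root@backup-server:/srv/gitlab-backup/
rsync -aHAX --numeric-ids root@backup-server:/srv/gitlab-backup/ /srv/gitlab/
```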

Is anyone familiar with how to make a proper GitLab backup without using the default rake tasks?