Installing GitLab on a standalone machine with repos on a NAS

Hello,

I am interested in installing gitlab with LFS to serve as a data repository for a university research lab, and am trying to decide the best way forward. I have used gitlab as a user, but this is the first time I’ve been asked to deploy and manage a gitlab installation.

  • I have gitlab 13.6.2-ee installed via omnibus, on an Ubuntu 18 machine
  • I have an existing network mounted Synology NAS where I want all our repos to sit. The NAS consists of two NAS bays in mirror arrangement for backup purposes. The main purpose of this NAS was to store data, so I can rearrange the NAS configuration as needed.
  • I am aware that some Synology drives have the ability to deploy gitlab via docker, but my particular model doesn’t support this

I was wondering:

  • I saw from the NFS documentation (https://docs.gitlab.com/ee/administration/nfs.html#soft-mount-option) that NFS support will end in gitlab 14. Instead of deploying the NAS on the network, should I move the NAS so it’s directly plugged into my workstation? What is a recommended option in this regard?
  • I know gitlab has its own backup best practices. Should I keep my NAS in its mirror arrangement, or break it and rely on gitlab’s backup setup instead?
  • We will likely be working with large text (~500 MB each) and video files (1-2 GB each). I know that even with git-lfs, it’s not recommended to put large files into git, but I was unable to find a suitable alternative. I mainly want a central version of all our data files so the lab users are not overwriting each other’s updates (i.e. if they clean some data and want to re-upload), and I am less concerned about file history. Is using git for this purpose a bad idea?
  • If my gitlab installation breaks, are there easy ways for me to extract the contents of my repos (ie if I access the NAS directly) while I am fixing the gitlab installation? Or maybe set up an auto-export of the master branch?

Thank you.

  • What you should do depends on what your NAS supports.
    • I haven’t looked into what the end of NFS support means, but there are filesystem features (e.g. locking) that are unlikely to work if you mount it on the server and just make GitLab think it’s a local filesystem.
    • If your NAS supports behaving like an external drive, attaching it as such to the server GitLab runs on will probably work (but I suspect performance won’t be overwhelming).
    • If your NAS can present itself as some kind of object store (we have some that present an S3-compatible interface), I suppose that would be better.
  • RAID (like your mirror) is not backup, and backup is not RAID. The two things serve completely different purposes. Whether you need one, the other or both depends on what you want to protect yourself against. As you haven’t said anything about that, we can’t address your question.
  • I haven’t worked with files of that size, but the task sounds like something I would use git for
  • The repositories will be stored in git’s bare format (and in hashed paths). Whether that’s easy to extract from really depends on your knowledge, but it sounds like it will be hard for you.
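
To make the last point a bit more concrete, here is a rough Python sketch of how a bare repository could be located and one branch exported without going through GitLab. It assumes hashed storage (the directory name is the SHA-256 of the numeric project ID) and the default Omnibus repository location; `REPO_ROOT`, `hashed_repo_path` and `export_branch` are names made up for this example, and the root path will differ if the data directory has been moved to the NAS.

```python
import hashlib
import subprocess
from pathlib import Path

# Assumption: default Omnibus location. Change this if the repository
# storage has been moved (for example to a directory on the NAS).
REPO_ROOT = Path("/var/opt/gitlab/git-data/repositories")


def hashed_repo_path(project_id: int, root: Path = REPO_ROOT) -> Path:
    """Return the bare repository path for a project under hashed storage.

    The hash is the SHA-256 of the project ID written as a decimal string;
    the first two pairs of hex characters become directory levels.
    """
    digest = hashlib.sha256(str(project_id).encode()).hexdigest()
    return root / "@hashed" / digest[:2] / digest[2:4] / f"{digest}.git"


def export_branch(bare_repo: Path, branch: str, out_tar: Path) -> None:
    """Write the files of one branch to a tar archive using `git archive`.

    Needs git installed and read access to the repository directory.
    """
    subprocess.run(
        ["git", "--git-dir", str(bare_repo), "archive",
         "--format=tar", "-o", str(out_tar), branch],
        check=True,
    )
```

For example, `export_branch(hashed_repo_path(1), "master", Path("/tmp/project1.tar"))` would export the master branch of the project with ID 1. Note that files tracked with LFS show up in such an export only as small pointer files, since GitLab keeps the LFS objects outside the git repository, so this alone would not recover the large data files.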

Hi Grove,

Thanks for responding. Yes, unfortunately I am attempting to do this without a prior background in Ubuntu, data redundancy, or network management. So definitely a lot of learning as I go.

RAID vs backup - I did some googling and think I understand some of the nuances, and spoke to the person that set up the NAS to clarify our setup. We currently have 2 NAS bays with 2 drives each. The primary bay has a RAID 1 setup between the two drives, and uses Hyper Backup onto the second bay. So it sounds like the current configuration is both RAID and backup. Our main priorities are 1) protect against data loss, 2) protect against version conflicts, and 3) cap downtime to ~1-2 days for our team of 20 users.

Currently, the NAS is network mounted on the client machines and they just click and drag files from it. We want them to be able to continue to do this for files they don’t need tracked, but also have a partition for gitlab purposes. However, if this is not possible, we can convert all of the NAS to be purely for gitlab purposes.

In terms of the NFS support ending, it seems like the push is to move to Gitaly instead of NFS mounts. My NAS cannot function as an external drive or as an object store, so I think I will have to look into deploying Gitaly. So I guess my question is…should I keep my NAS in its RAID 1/backup arrangement, or break the RAID and move to a multi-cluster Gitaly setup? Gitaly looks like overkill for what we need, but if that’s the direction of gitlab, then I will look into it.

Regarding your first priority: data loss can occur as a result of user error (which you normally protect yourself against with backups) or hardware error (which you normally use RAID to protect yourself against). So it really should be two points (but - if I understood everything correctly - your current solution protects against both).

With regard to your third priority, that could also be more specific: Do you also need that cap if the Russians (or whoever, the point is the same, and this is just an extreme example) drop a bomb on your building? (But that’s not really relevant here.)

I haven’t looked into what gitaly can do, but it sounds like overkill in your case. If the partition holding the data for GitLab is only mounted on the GitLab server (and it’s hard to see why you would need anything more), I see no problem in doing that and making the GitLab server use it as a local dir. It doesn’t need to know that it’s in another box. NFS support only matters if it needs to do something special. (In a GitLab issue about backups I’ve read about people using the “local” upload facility to get backups moved to NFS; GitLab didn’t need to know, and didn’t know, that NFS was involved.)
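
If you want to see what the server itself thinks that directory is, something like the following Python sketch could help. It looks up the filesystem type of the mount that contains a given path, using text in the format of `/proc/mounts` on Linux. The function name `filesystem_type` and the example paths are made up for illustration, and the sketch ignores symlinks and mount points with spaces in their names.

```python
from pathlib import PurePosixPath


def filesystem_type(path: str, mounts_text: str) -> str:
    """Return the filesystem type of the most specific mount containing path.

    mounts_text is expected in /proc/mounts format:
    "<device> <mount point> <fs type> <options> <dump> <pass>" per line.
    """
    target = PurePosixPath(path)
    best_mount, best_type = "", "unknown"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip malformed or empty lines
        mount_point, fs_type = fields[1], fields[2]
        mount = PurePosixPath(mount_point)
        contains = target == mount or mount in target.parents
        # Prefer the longest matching mount point (the most specific one).
        if contains and len(mount_point) > len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type
```

On the server this could be called as `filesystem_type("/mnt/nas/git-data", open("/proc/mounts").read())`. A result such as `nfs` or `nfs4` would mean the directory is NFS-backed as far as the kernel is concerned, whatever GitLab is told.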

A quick search led me to believe that “Hyper Backup” is Synology’s name for a fairly standard backup solution, but I don’t know it, so for this I’m mostly guessing: Depending on how frequently copies are made with the “Hyper Backup” thing, it might be able to help you back to normal operation in case of a small class of user errors (deleting a few files on the GitLab server), but for most cases you’ll want the backups made by GitLab’s included tool. So you might want to consider not using “Hyper Backup” on the partition GitLab uses, and perhaps instead make a partition to upload the backups to.
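
As a rough illustration of that last idea, here is a Python sketch that runs GitLab’s backup command and copies the newest archive to a directory on the NAS. It assumes an Omnibus install, where `gitlab-backup create` writes `*_gitlab_backup.tar` files to `/var/opt/gitlab/backups` by default; the NAS path and the function names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Optional

BACKUP_DIR = Path("/var/opt/gitlab/backups")   # Omnibus default backup path
NAS_DIR = Path("/mnt/nas/gitlab-backups")      # hypothetical NAS partition


def run_backup() -> None:
    """Create a new backup archive; must run as root on the GitLab server."""
    subprocess.run(["gitlab-backup", "create"], check=True)


def newest_backup(backup_dir: Path) -> Optional[Path]:
    """Return the most recently modified backup archive, or None if none exist."""
    archives = sorted(
        backup_dir.glob("*_gitlab_backup.tar"),
        key=lambda p: p.stat().st_mtime,
    )
    return archives[-1] if archives else None


def copy_newest_backup(backup_dir: Path, dest_dir: Path) -> Optional[Path]:
    """Copy the newest archive to dest_dir and return the path of the copy."""
    latest = newest_backup(backup_dir)
    if latest is None:
        return None
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(latest, dest_dir))
```

A nightly job could call `run_backup()` followed by `copy_newest_backup(BACKUP_DIR, NAS_DIR)`. The backup archive does not contain `/etc/gitlab/gitlab.rb` or `/etc/gitlab/gitlab-secrets.json`, so those would need to be copied somewhere safe separately.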

Ah…hopefully even the toughest thesis committee would understand if the building housing our data got levelled. My main priorities are to protect against common user and hardware error and maintain reasonable uptime in light of that. My rough numbers were more to note that our situation is pretty mundane and that having absolutely zero downtime is not mission critical.

Gitaly vs NFS: From https://docs.gitlab.com/ee/administration/nfs.html, it says “From GitLab 13.0, using NFS for Git repositories is deprecated. In GitLab 14.0, support for NFS for Git repositories is scheduled to be removed. Upgrade to Gitaly Cluster as soon as possible.”, so I was under the impression that in Gitlab 14, my only option is to use Gitaly if I don’t have the repo hard drive physically mounted to the computer. I have the NAS mounted via fstab, and am good to use Gitlab’s internal backup system. I will look into Gitlab’s Rake backup then.

Thank you.

GitLab’s built-in backup is based on a rake script, so I think the two things you talk about in the last sentence are the same.

I just checked that link, and on one hand it looks like what I thought, saying

NFS can be used as an alternative for object storage

and that’s not where you are.
On the other hand it doesn’t describe telling GitLab that certain directories are on NFS, which would match my understanding of why NFS support needed documentation - so I really don’t know where you would be.

Hi Grove,

I will stick with using the NFS mount and tell Gitlab the drive is local, and see how far it gets me. I did note the same thing as you did - the documentation doesn’t explicitly say to inform Gitlab about the NFS. It looks like the effort to deploy on NFS is pretty small, so if Gitlab 14 does force everyone to switch to Gitaly, I will put some more time into learning Gitaly at that time.

Thank you for your help.