Migrating from NFS to object and/or Gitaly storage

We maintain a self-hosted GitLab instance (Enterprise Edition Ultimate under the Education Licence) as a small cluster without HA, consisting of three application servers, a database server, a Redis server and a monitoring server. We currently have about 30 TB of data in total, including repositories, artifacts, containers, uploads and LFS files, all stored on NFS shares.

As announced by GitLab, upon the release of GitLab 15.0, technical and engineering support for using NFS to store Git repository data will officially be at end-of-life.

We are looking at this change with some concern, as it means not only migrating to new storage system(s) with which we have little or no experience, but also rethinking our backup concept.

A few questions:

  1. Will the functionality for using NFS to store repository data be completely removed in GitLab 15.0, meaning that we will need to migrate all repositories to another storage platform before upgrading? Or will it still work, but be out of support?

  2. Will it be possible to continue storing other types of data (artifacts, containers, uploads, LFS files) on NFS shares? The Using NFS with GitLab page recommends object storage, but for us the NFS solution is the most cost-effective as we need it for other systems as well and are satisfied with its performance.

  3. The recommended solution to store repositories is Gitaly storage or Gitaly Cluster. Do I understand correctly that they can be used only to store the repositories themselves and we need another solution for the other types of data?

  4. How would you suggest backing up large amounts of repositories? Our current backup and restore concept relies on filesystem level snapshots on the NFS shares combined with database dumps taken at the same time as the snapshots.
    The Gitaly page mentions that Gitaly Cluster does not support snapshot backups and recommends using the official backup and restore rake tasks. At least currently, that is not really a usable solution for us, as running the task would take several days even when skipping all categories other than repositories. It would probably be faster on Gitaly Cluster than on the current NFS setup, but would most likely still take a very long time.
    If we stopped the whole GitLab service (e.g. before an upgrade), could a filesystem level snapshot combined with database dumps from Gitaly and GitLab itself be used as a backup?

If you can answer at least some of these questions, it would be most appreciated.

Upon reading the release notes of GitLab 15.6 I noticed that this topic became more urgent again. Towards the end of the page it reads “NFS as Git repository storage is no longer supported. Migrate to Gitaly Cluster as soon as possible.”

We have not yet moved to Gitaly Cluster, as NFS has been working well for us until now. I would like to see major changes of this kind, which require significant migration efforts, presented more widely in communication channels instead of only being briefly mentioned at the end of release notes. I have seen no articles on the GitLab Blog, and questions here on the forum have so far remained unanswered.

The questions I presented in April are still highly relevant for us. Here again with a small reformulation for questions 1 and 4:

  1. Does GitLab 15.6 include any changes related to repositories stored on NFS compared to version 15.5, which we are running now? Can we upgrade to 15.6 before migrating to Gitaly, or is it better to stay with version 15.5 for now?

  2. Will it be possible to continue storing other types of data (artifacts, containers, uploads, LFS files) on NFS shares? The Using NFS with GitLab page recommends object storage, but for us the NFS solution is the most cost-effective as we need it for other systems as well and are satisfied with its performance.

  3. The recommended solution to store repositories is Gitaly storage or Gitaly Cluster. Do I understand correctly that they can be used only to store the repositories themselves and we need a separate solution for the other types of data?

  4. How would you suggest backing up large amounts of repositories? Our current backup and restore concept relies on filesystem level snapshots on the NFS shares combined with database dumps taken at the same time as the snapshots.
    The Gitaly page mentions that Gitaly Cluster does not support snapshot backups because of possible inconsistency with the Praefect database. If we stop the whole GitLab service (e.g. before an upgrade or once every night), could a filesystem level snapshot combined with database dumps from Gitaly and GitLab itself be used as a backup?
    The recommended method according to the GitLab documentation is the official backup and restore rake tasks, which take a long time to run and require significant additional disk space with a large number of repositories (in our case about 80 000 repositories taking up nearly 10 TB of disk). As a new development, it is now possible to use incremental backups, which would certainly help, but I am still not sure whether that will be performant enough.
    Do you have any figures on how much faster the backup would run on a Gitaly Cluster compared to a high-end NFS appliance? What is the backup strategy on GitLab.com?
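
For a rough sense of scale, here is a back-of-the-envelope estimate in Python. Only the 10 TB figure comes from our setup; the function name and the throughput values are placeholders made up for illustration, not measurements from our system or from GitLab.

```python
def estimate_backup_hours(total_tb: float, throughput_mb_s: float) -> float:
    """Hours needed to copy total_tb terabytes at a sustained rate of throughput_mb_s (MB/s)."""
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    total_mb = total_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return total_mb / throughput_mb_s / 3600


if __name__ == "__main__":
    # Hypothetical sustained rates; replace with values measured on your own storage.
    for rate in (50, 200, 1000):
        hours = estimate_backup_hours(10, rate)
        print(f"{rate:>5} MB/s -> {hours:6.1f} h ({hours / 24:.1f} days)")
```

This only models raw data transfer. With tens of thousands of repositories I would expect per-repository overhead to add to the total, so the real duration is likely to be longer than such an estimate.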

If you can answer at least some of these questions, it would be most appreciated.

Answering my own question, as I was able to clarify most of the points with GitLab Support (we also use NFS on our smaller, commercially licensed GitLab instance for internal use) and I think the topic is of interest to the wider community:

Upgrading to GitLab 15.6 while using NFS is possible, and in fact we have already upgraded both of our instances. As the Statement of Support indicates, technical and engineering support for using NFS to store Git repository data is officially at end-of-life regardless of the GitLab version. There are no technical changes which would make staying with 15.5 or earlier versions preferable.

NFS is still supported for other types of data and only the repositories are handled by Gitaly.

The last question is a trickier one. The potential inconsistency issues are mainly related to the Gitaly Cluster. A single Gitaly instance (either a GitLab instance where all the components including Gitaly are installed on a single VM or a single Gitaly node in a multi-VM instance) would be somewhat easier to handle in this respect.

We are planning to test whether a single dedicated Gitaly node will be able to handle the repository storage of our larger instance while we stay with NFS for the other types of data. If that works, we’ll set up a method of switching to maintenance mode briefly every night (as well as before each monthly upgrade) to take a safe snapshot. That could be combined with a second, less frequent backup using the officially supported GitLab backup scripts, e.g. once a week during the weekend.
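
A minimal sketch of how such a nightly job could be wired up in Python, under several assumptions: maintenance mode is toggled through the application settings API (`PUT /api/v4/application/settings` with the `maintenance_mode` parameter), the host name and the snapshot script path are hypothetical placeholders, and all helper names are invented for illustration. Whether maintenance mode alone is enough for a consistent snapshot is exactly what we still have to test.

```python
import subprocess
import urllib.parse
import urllib.request
from contextlib import contextmanager
from typing import Callable, Iterator


def build_maintenance_request(base_url: str, token: str, enabled: bool) -> urllib.request.Request:
    """Build (but do not send) the API request that switches maintenance mode on or off."""
    query = urllib.parse.urlencode({"maintenance_mode": str(enabled).lower()})
    url = f"{base_url.rstrip('/')}/api/v4/application/settings?{query}"
    return urllib.request.Request(url, method="PUT", headers={"PRIVATE-TOKEN": token})


@contextmanager
def maintenance_mode(set_mode: Callable[[bool], None]) -> Iterator[None]:
    """Enable maintenance mode for the duration of the block and always disable it afterwards."""
    set_mode(True)
    try:
        yield
    finally:
        set_mode(False)


def nightly_snapshot(set_mode: Callable[[bool], None], take_snapshot: Callable[[], None]) -> None:
    """Take the snapshot while maintenance mode is on; the mode is reset even if the snapshot fails."""
    with maintenance_mode(set_mode):
        take_snapshot()


def run_nightly(base_url: str, token: str, snapshot_cmd: list[str]) -> None:
    """Wire the pieces to a real instance. snapshot_cmd is a site-specific (hypothetical) script."""

    def set_mode(enabled: bool) -> None:
        with urllib.request.urlopen(build_maintenance_request(base_url, token, enabled)):
            pass

    def take_snapshot() -> None:
        subprocess.run(snapshot_cmd, check=True)

    nightly_snapshot(set_mode, take_snapshot)
```

A cron job could then call something like `run_nightly("https://gitlab.example.org", token, ["/usr/local/sbin/take-gitaly-snapshot"])`, where both the URL and the script path are made up. The database dump would need to be taken inside the same window.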

I’ll try to remember to report back here and share our experiences after we’ve completed the migration, which we’ll probably do towards the end of Q1/2023.

Arto

Any luck on the migration to a single Gitaly node? We have a similar system to yours (small-ish internal EE GitLab system with multiple web nodes), and I’m not all that comfortable with the risk (and cost) of creating a Gitaly Cluster, given that there isn’t a lot of public info on how it operates and its risks around failure states.

We haven’t had time to do the migration or even load test a single Gitaly node with 10 TB capacity yet. So far we’ve only moved the repositories to a local disk on our smaller internal single-VM instance. The larger instance with several front-end nodes still relies on NFS (which has been working fine for us so far).

Does anyone have experience migrating from NFS to object storage?
In our case we want to move from NFS to Google Cloud object storage.

Hello @Teraes. We are also using NFS, only for GitLab repository storage, but ours is very small, with around 50 GB of repositories. Should I stay with NFS for the upcoming versions?
One more question: could you please tell us how you upgrade multi-node GitLab instances?
Our system has two GitLab nodes, one PostgreSQL server, one Redis server and NFS. Should I stop both GitLab nodes and upgrade them one by one? Any suggested steps would be helpful.