GitLab mirroring options including metadata for multiple instances

Hi,

I have access to an open source license that allows me to fully mirror a repository, including metadata, to a self-hosted GitLab instance. This serves as a fallback and also lets us experiment with self-hosted GitLab and test different things. However, since we are a decentralized team, we would also like others to be able to host their own mirrors on their own infrastructure as additional fallbacks. A mirror in our case means not only the repository/code but also all the metadata. What options are there to achieve this, given that we only have one license for now?

Some ideas that I have had: Getting more licenses would obviously work but doesn’t scale. I looked at other mirroring options (pull mirroring is the one requiring the license) and found push mirroring. In theory I could configure my instance to push to the others, but push mirroring seems to cover only repository data and not metadata (feel free to challenge me on this, but this is what I found). Another possibility might be to regularly back up the whole database and restore it on the other instances. This may be feasible, but I am curious whether others have tried it and whether there are recommended tools that have been used for this successfully.
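For reference, here is a minimal sketch of how the push mirroring idea could be set up through GitLab's remote mirrors REST API instead of the UI. The host names, project path, and token are placeholders, and the helper name `build_push_mirror_request` is made up for illustration. The function only builds the request and does not send it. As noted above, a mirror configured this way carries Git data only, not issues or merge requests.

```python
import json
import urllib.parse
import urllib.request


def build_push_mirror_request(base_url, project, mirror_url, token):
    """Build (but do not send) a POST that adds a push mirror to a project.

    `project` is a numeric ID or a "group/project" path; paths must be
    URL-encoded when used in the API URL.
    """
    project_ref = urllib.parse.quote(str(project), safe="")
    endpoint = f"{base_url.rstrip('/')}/api/v4/projects/{project_ref}/remote_mirrors"
    payload = {"url": mirror_url, "enabled": True}
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"PRIVATE-TOKEN": token, "Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values, for illustration only.
req = build_push_mirror_request(
    "https://gitlab.example.com",
    "team/app",
    "https://mirror-user:SECRET@mirror.example.org/team/app.git",
    "TOKEN",
)
print(req.get_method(), req.full_url)
# To actually create the mirror, pass `req` to urllib.request.urlopen(req).
```

Each additional fallback instance would need its own mirror entry on the source project, so this would have to be repeated per target.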

Happy to receive ideas about potential options and alternative approaches, thanks!

You really need to be looking at configuring High Availability:

Pull/push mirroring is OK if you just want the repo data, but you won’t get the wiki or anything else, as you mentioned. Backup/restore to the other nodes is fine in a disaster recovery scenario, but it is far too time-consuming to keep asking people to back up and restore just to get the latest data. Backing up the database alone wouldn’t be enough either, since the Git repositories and uploaded files are stored outside it.

Can you show/link an example repository, and explain what kind of metadata needs to be synced?

It’s not really related to any particular repository. What I would like is to have all the metadata that is not part of the code available as well, i.e. all the issues + their conversations, all the merge requests + their conversations including inline code review comments, tags, releases, and branches.
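To make the list above concrete, here is a rough sketch that maps each kind of metadata to the GitLab REST API endpoint that exposes it. It only builds URLs and makes no network calls. The instance URL and project path are placeholders, and the helper names `metadata_urls` and `discussions_url` are made up for illustration. This is one way to enumerate the data, not a complete sync tool.

```python
from urllib.parse import quote

# Metadata kinds from the post, mapped to project-level API v4 paths.
METADATA_ENDPOINTS = {
    "issues": "issues",
    "merge_requests": "merge_requests",
    "releases": "releases",
    "tags": "repository/tags",
    "branches": "repository/branches",
}


def _project_base(base_url, project):
    # Project paths such as "group/project" must be URL-encoded.
    return f"{base_url.rstrip('/')}/api/v4/projects/{quote(str(project), safe='')}"


def metadata_urls(base_url, project):
    """Return the list endpoint for each kind of metadata."""
    base = _project_base(base_url, project)
    return {kind: f"{base}/{path}" for kind, path in METADATA_ENDPOINTS.items()}


def discussions_url(base_url, project, kind, iid):
    """Return the discussions endpoint for one issue or merge request.

    Conversations are fetched per item via its project-level IID; for merge
    requests this includes inline code review comments.
    """
    if kind not in ("issues", "merge_requests"):
        raise ValueError("discussions exist only for issues and merge_requests here")
    return f"{_project_base(base_url, project)}/{kind}/{iid}/discussions"


for kind, url in metadata_urls("https://gitlab.example.com", "team/app").items():
    print(f"{kind}: {url}")
print(discussions_url("https://gitlab.example.com", "team/app", "merge_requests", 7))
```

Tags and branches are part of the Git data and travel with any repository mirror. The remaining items live only in the GitLab instance, which is why plain push mirroring does not carry them.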