Minimum setup for High Availability

Problem to solve

I'm planning to set up an on-prem HA system for a small number of users, so the setup needs to be minimal. We already have a highly available Postgres server pair and an HAProxy pair in place.

Can a three-node setup do the job for GitLab? The goal is to be able to shut down one node for OS patching or a GitLab upgrade.

Not sure it’s possible with 3 nodes, but potentially. (FYI: you can remove the stuff from the template that isn’t necessary to your question.)

I say this because if you look at a reference architecture (Reference architecture: Up to 40 RPS or 2,000 users | GitLab Docs), you’ll see that it has quite a few parts. With just 3 nodes, I think the best you might get is 2 running gitlab-rails (the outward-facing part) and the last node doing Sidekiq/Redis. That said, what happens when you need to upgrade that one?
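To make that concern concrete, here is a rough Python sketch that checks which services disappear entirely when one node goes offline. The node names and the 2+1 layout are just my illustration of the split described above, not an official GitLab layout.

```python
# Hypothetical layout: two Rails nodes plus one Sidekiq/Redis node.
# Node names and service labels are made up for illustration.
layout = {
    "node1": {"rails"},
    "node2": {"rails"},
    "node3": {"sidekiq", "redis"},
}


def services_lost(layout, down_node):
    """Return the services that have no remaining instance when down_node is offline."""
    remaining = set()
    for node, services in layout.items():
        if node != down_node:
            remaining |= services
    return sorted(layout[down_node] - remaining)


for node in layout:
    lost = services_lost(layout, node)
    print(node, "down ->", ", ".join(lost) if lost else "nothing lost")
```

Taking either Rails node down loses nothing, but taking the third node down loses both Sidekiq and Redis, so that node is the single point of failure in this layout.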

Honestly, I’d just recommend a simple Omnibus install (and you can use an external Postgres if you want) and then accept a small amount of downtime on upgrades. Just snapshot the VM if you go that way, upgrade, done. Downtime is pretty minimal, and it’s a much less complex system when you’re only serving a few users.

Just my two cents.

Personally I would run Geo on a couple of nodes and have full replicas. We run our prod instance in AWS, and there is a Geo node in our ‘back room’ on a 1U server that is the HA backup.

(And our AWS instance is NOT in the US-EAST-1 region, so in 5+ years we have NOT had the server unavailable because of AWS issues.)

We’ve had to reboot it once or twice during the day in all of those 5 years.

’Course, it gets patched and rebooted every weekend, and it also gets GitLab software updates about every 2 weeks, which is also a reboot and update…

(And we’ve got !40 some users on the machine..)

My plan is to have 3 nodes, all running the same set of services:

  • Consul
  • Redis
  • Praefect
  • Gitaly
  • Rails/Sidekiq

Yes, I agree with running a single Omnibus install, but I was overruled by management, who saw the GitLab HA setup at a previous job. A three-node setup is not too complicated, however, if it means you don’t have to notify users of an outage and do the work after business hours …
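As a quick sanity check on the three-node idea, here is the majority-quorum arithmetic as a small Python sketch. It assumes the clustered pieces (for example Consul servers or Redis Sentinel) need a strict majority of members to keep working; the function names are mine, and this says nothing about how each GitLab component behaves beyond that assumption.

```python
def majority(members):
    """Smallest number of members that forms a strict majority."""
    return members // 2 + 1


def tolerated_failures(members):
    """How many members can be offline while a majority is still reachable."""
    return members - majority(members)


# Compare a few cluster sizes under the majority-quorum assumption.
for size in (2, 3, 5):
    print(f"{size} nodes: majority = {majority(size)}, "
          f"can lose {tolerated_failures(size)}")
```

Under that assumption, three members can lose exactly one and still have a majority, which matches the goal of taking one node down at a time for patching. Two members cannot lose any.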