GitLab HA Issues

I am currently configuring a new GitLab ‘cluster’ for use in our projects. I’ve personally used GitLab for years, but have only ever run it as a standalone installation with an internal database and storage local to the server.

For this new cluster, there is a business need to ensure that it is highly available, and can be scaled. My current configuration is as below (all of this is on Amazon Web Services):

PostgreSQL Servers:
2x PostgreSQL Servers in Master/Slave Replication - this appears to be working fine with no issues

GitLab Servers:
2x GitLab Application Servers - these appear to be behaving ‘ok’. I can access the Web UI on each, and if I create a project on one of them, it appears in the Web UI of the other (when I access them directly).

This is accomplished by pointing both at the external PostgreSQL cluster (obviously); for disk storage, an EFS volume is mounted on both servers at /data. I have changed the GitLab configuration to use this volume for the Git data directories. A manual test confirms that a file created on one server appears on, and is readable from, the other without issue.
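For anyone wanting to repeat the shared-storage check, here is a minimal sketch in Python. It assumes the shared volume is mounted at /data on both servers; the function names and the marker file naming are made up for illustration and are not part of GitLab.

```python
import os
import uuid
from typing import Tuple


def write_marker(shared_dir: str) -> Tuple[str, str]:
    """Write a uniquely named marker file and return (path, token)."""
    token = uuid.uuid4().hex
    path = os.path.join(shared_dir, f"ha-check-{token}.txt")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(token)
        fh.flush()
        os.fsync(fh.fileno())  # push the write out before the other node reads
    return path, token


def read_marker(path: str, token: str) -> bool:
    """Return True if the marker exists and holds the expected token."""
    try:
        with open(path, encoding="utf-8") as fh:
            return fh.read() == token
    except FileNotFoundError:
        return False
```

The idea would be to call write_marker("/data") on Server 1, then call read_marker with the returned path and token on Server 2. This only shows that the mount is shared; it says nothing about whether GitLab on each node is reading the same repository paths.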

There is a load balancer that distributes traffic on the ports 80, 443 (currently not used) and 22.

Once I had configured all of this, I thought, great, I’m done. I then decided to test actually cloning from and committing to a Git repository to make sure it was all working, and this is where it has fallen down.

About 50% of the time I get an error: “GitLab: The project you were looking for could not be found.” This appears to depend on which server the load balancer sends the request to: if I clone directly from each server, it works on Server 1 and fails on Server 2, which confirms my suspicion that the error is tied to one particular server.
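The per-server test described above can be sketched roughly as follows. This is an illustration under assumptions: the repository URLs are placeholders, the helper names are hypothetical, and it uses `git ls-remote` as a cheap stand-in for a full clone.

```python
import os
import subprocess
from typing import Callable, Dict, Iterable


def git_reachable(url: str) -> bool:
    """Return True if `git ls-remote` succeeds against the given URL."""
    # GIT_TERMINAL_PROMPT=0 stops git from waiting on a credential prompt.
    env = {**os.environ, "GIT_TERMINAL_PROMPT": "0"}
    result = subprocess.run(
        ["git", "ls-remote", url],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        env=env,
    )
    return result.returncode == 0


def probe_backends(
    urls: Iterable[str],
    check: Callable[[str], bool] = git_reachable,
    attempts: int = 5,
) -> Dict[str, int]:
    """Return the number of failed attempts for each URL."""
    failures = {}
    for url in urls:
        failures[url] = sum(1 for _ in range(attempts) if not check(url))
    return failures
```

Passing the direct address of each application server (bypassing the load balancer) should show failures concentrated on one node if the problem is node-specific, while probing the load balancer address alone would only show the mixed result.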

The only thing the two GitLab servers don’t share is Redis. What does GitLab actually use Redis for? Could this explain the issues I’m having? I would have expected a shared file storage volume and a shared database to be all that is required.

Do I need to have shared Redis as well? Or is there something else I need to look at doing in the configuration to finish this off?

All servers involved are Ubuntu 16.04 LTS
GitLab Enterprise Edition 10.4.3-ee c65e2ba (unlicensed)

Additionally, we get a lot of 422 errors when trying to do almost anything on the website. At first I thought these were rare, but they seem to be happening a lot.
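One thing that may be worth ruling out, offered as an assumption rather than a confirmed cause: in Rails applications a 422 is often the response to a failed CSRF token check, which can happen when application nodes do not share the same secrets or session store. The sketch below compares copies of a secrets file taken from each node. It assumes an Omnibus install where the file lives at /etc/gitlab/gitlab-secrets.json and that each copy has been fetched to a local path; the helper names are hypothetical.

```python
import hashlib
from typing import Dict


def file_digest(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def digests_match(digests: Dict[str, str]) -> bool:
    """True if every node reports the same digest (or there are none)."""
    return len(set(digests.values())) <= 1
```

For example, building a dictionary such as {"server1": file_digest(copy_from_server1), "server2": file_digest(copy_from_server2)} and passing it to digests_match would show whether the two nodes hold identical secrets. A mismatch would be a lead to follow up, not proof of the cause.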

Did you resolve this eventually?
Did this have anything to do with EFS? I’m specifically looking at the warnings against using EFS in all the GitLab HA docs.