I’ve got a Linode running Ubuntu 20.04 and gitlab-ce. For the first time ever, I ran into an upgrade issue: the web interface is no longer accessible.
After the minor version upgrade from 14.9.3 to 14.10.5, the root domain of my GitLab instance returns a blank HTTP 200 response. All other paths just return 404. I changed no configuration files before this upgrade.
The upgrade itself completed with no errors. All the status checking commands I’ve tried return no errors. The only errors I’ve found have to do with nginx, and this is where it gets mysterious. This error keeps getting posted to /var/log/gitlab/nginx/gitlab_error.log:
2022/07/01 09:43:03 [crit] 2586243#0: *116949 connect() to unix:/var/opt/gitlab/gitlab-workhorse/sockets/socket failed (2: No such file or directory) while connecting to upstream, client: <ip redacted>, server: <my domain>, request: "POST /api/v4/jobs/request HTTP/1.1", upstream: "http://unix:/var/opt/gitlab/gitlab-workhorse/sockets/socket:/api/v4/jobs/request", host: "<my domain>"
And this gets printed in /var/log/gitlab/nginx/registry_gitlab_error.log:
2022/07/01 11:35:57 [error] 4406#0: *34 connect() failed (111: Connection refused) while connecting to upstream, client: <ip redacted>, server: <my registry domain>, request: "POST /api/v4/jobs/request HTTP/1.1", upstream: "http://[::1]:5000/api/v4/jobs/request", host: "<my domain>"
The path /var/opt/gitlab/gitlab-workhorse/sockets/socket exists, and if I curl it directly, I get the correct HTML content:
curl --unix-socket /var/opt/gitlab/gitlab-workhorse/sockets/socket http://<my domain>/users/sign_in
But for some reason, the nginx embedded in Omnibus GitLab is not able to read the socket.
Below are the relevant files/dirs and their permissions:
/var/opt/gitlab/gitlab-workhorse:
drwxr-x--- 3 git gitlab-www 4096 2022-07-01 10:59 gitlab-workhorse
/var/opt/gitlab/gitlab-workhorse/sockets:
drwxr-x--- 2 git gitlab-www 4096 2022-07-01 11:28 sockets
/var/opt/gitlab/gitlab-workhorse/sockets/socket:
srwxrwxrwx 1 git git 0 2022-07-01 11:28 socket
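The same information can be listed for every component of the path in one go, parent directories included. Below is a minimal sketch, assuming bash and GNU stat (as shipped with Ubuntu 20.04); walk_path is a hypothetical helper name, not a GitLab tool. namei -l from util-linux prints a similar listing.

```shell
#!/usr/bin/env bash
# walk_path: hypothetical helper, not part of GitLab or Omnibus.
# Prints mode, owner, group and name for every component of an
# absolute path, from / down to the final entry.
walk_path() {
    local target=$1 current="" part
    local -a parts
    # Split the path on "/" into its components.
    IFS='/' read -r -a parts <<< "${target#/}"
    stat -c '%A %U %G %n' /
    for part in "${parts[@]}"; do
        # Skip empty components caused by doubled or trailing slashes.
        [ -n "$part" ] || continue
        current="$current/$part"
        # stat fails if this component does not exist; stop at the first gap.
        stat -c '%A %U %G %n' "$current" || return 1
    done
}
```

Running walk_path /var/opt/gitlab/gitlab-workhorse/sockets/socket as root shows whether every directory on the way down is traversable by the gitlab-www group. If a component is missing, stat stops there with "No such file or directory", which is the same errno 2 that nginx reports in the log above.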
All the permissions seem fine for the gitlab-www user to reach the socket. The nginx workers are running as that user. I even temporarily set a shell for gitlab-www in /etc/passwd and curled the socket successfully after doing su gitlab-www.
gitlab-ctl reconfigure, gitlab-ctl restart, restarting the whole server, and even upgrading further to 15.x have not helped. I do have backups with which to restore the full server to the pre-upgrade state if needed, but I’d rather not go there unless absolutely necessary.
Any ideas?