Cannot push to repositories after source upgrade from GitLab 10.6.x to 12.7.5 - error "tmp/tests/gitaly/gitaly-hooks: not found"

Hello Gitlab Community Members. Long time reader, first time poster.

We’ve used gitlab-ce for a while, and the person who maintained it in our organisation left years ago. I’ve recently been tasked with upgrading our GitLab installation so we can take advantage of new features. I’m moderately familiar with Ruby, and have only a passing-to-none understanding of Gitaly, Go, and the other requirements.

Aside from some minor quirks during the upgrade process, everything is running well - gitlab itself is running without errors. The only problem I can find (a pretty major one) is that when I attempt to push new commits to a branch (or a new branch) to the remote repository, I get the following error:

[joel.mclean@lin-kil-i-125 playbooks]$ git push origin master
Enter passphrase for key '/home/joel.mclean/.ssh/id_rsa': 
Enumerating objects: 36, done.
Counting objects: 100% (36/36), done.
Delta compression using up to 4 threads
Compressing objects: 100% (19/19), done.
Writing objects: 100% (21/21), 1.79 KiB | 919.00 KiB/s, done.
Total 21 (delta 13), reused 0 (delta 0)
remote: /home/git/gitaly/ruby/git-hooks/pre-receive: 5: exec: tmp/tests/gitaly/gitaly-hooks: not found
To git.micron21.com:micron21/jh-ansible-tools.git
 ! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'git@git.micron21.com:micron21/jh-ansible-tools.git'

The most recent upgrade, from 12.5.9 to 12.7.5, was done today and completed without any errors. For each version I followed the guide at: https://gitlab.com/gitlab-org/gitlab/-/blob/v12.7.5-ee/doc/update/patch_versions.md

The output of rake gitlab:check is:

root@git:/home/git/gitlab# sudo -u git -H bundle exec rake gitlab:check RAILS_ENV=production
Warning: fuzzy message was ignored.
msgid = '1 day
Warning: fuzzy message was ignored.
msgid = 'Assignee
Warning: fuzzy message was ignored.
  /home/git/gitlab/locale/en/gitlab.po: msgid 'Branch'
Warning: fuzzy message was ignored.
msgid = 'ContainerRegistry|Remove tag
Warning: fuzzy message was ignored.
  /home/git/gitlab/locale/en/gitlab.po: msgid 'Fork'
Warning: fuzzy message was ignored.
msgid = 'Milestone
Warning: fuzzy message was ignored.
  /home/git/gitlab/locale/en/gitlab.po: msgid 'Tag'
Warning: fuzzy message was ignored.
msgid = 'project
Checking GitLab subtasks ...

Checking GitLab Shell ...

GitLab Shell: ... GitLab Shell version >= 11.0.0 ? ... OK (11.0.0)
Running /home/git/gitlab-shell/bin/check
Internal API available: OK
Redis available via internal API: OK
gitlab-shell self-check successful

Checking GitLab Shell ... Finished

Checking Gitaly ...

Gitaly: ... default ... OK

Checking Gitaly ... Finished

Checking Sidekiq ...

Sidekiq: ... Running? ... yes
Number of Sidekiq processes ... 1

Checking Sidekiq ... Finished

Checking Incoming Email ...

Incoming Email: ... Reply by email is disabled in config/gitlab.yml

Checking Incoming Email ... Finished

Checking LDAP ...

LDAP: ... LDAP is disabled in config/gitlab.yml

Checking LDAP ... Finished

Checking GitLab App ...

Git configured correctly? ... yes
Database config exists? ... yes
All migrations up? ... yes
Database contains orphaned GroupMembers? ... no
GitLab config exists? ... yes
GitLab config up to date? ... yes
Log directory writable? ... yes
Tmp directory writable? ... yes
Uploads directory exists? ... yes
Uploads directory has correct permissions? ... yes
Uploads directory tmp has correct permissions? ... yes
Init script exists? ... yes
Init script up-to-date? ... yes
Projects have namespace: ... 
Micron21 / Service ... yes
... removed all the projects to save space - they all said "yes" ...
Micron21 / docker-images ... yes
Redis version >= 2.8.0? ... yes
Ruby version >= 2.5.3 ? ... yes (2.6.5)
Git version >= 2.22.0 ? ... yes (2.23.0)
Git user has default SSH configuration? ... yes
Active users: ... 15
Is authorized keys file accessible? ... yes

Checking GitLab App ... Finished


Checking GitLab subtasks ... Finished

My biggest problem is that I never tested pushing a new commit on the earlier versions. I did brief UAT after each upgrade, but didn’t actually push a new branch or a new commit - I just ran “git pull / git push / git fetch” to confirm they ran without errors.

Based on the error, I assume that there’s some problem with Gitaly. On the client in question, I’ve updated my local copy of Git to 2.23.1:

[joel.mclean@lin-kil-i-125 playbooks]$ git --version
git version 2.23.1

I’ve also re-installed gitaly, and even done:

root@git:/home/git/gitaly# make clean
git clean -fdX
Removing .ruby-bundle
Removing _build/
Removing config.toml
Removing gitaly
Removing gitaly-debug
Removing gitaly-hooks
Removing gitaly-ssh
Removing gitaly-wrapper
Removing praefect
Removing ruby/.bundle/
Removing ruby/vendor/bundle/

root@git:/home/git/gitaly# find . -type f -name config.toml
./_support/instrumented-cluster/gitaly1/config.toml
./internal/praefect/config/testdata/config.toml

root@git:/home/git/gitaly# cd /home/git/gitlab
root@git:/home/git/gitlab# sudo -u git -H bundle exec rake "gitlab:gitaly:install[/home/git/gitaly,/home/git/repositories]" RAILS_ENV=production
Warning: fuzzy message was ignored.
msgid = '1 day
Warning: fuzzy message was ignored.
msgid = 'Assignee
Warning: fuzzy message was ignored.
  /home/git/gitlab/locale/en/gitlab.po: msgid 'Branch'
Warning: fuzzy message was ignored.
msgid = 'ContainerRegistry|Remove tag
Warning: fuzzy message was ignored.
  /home/git/gitlab/locale/en/gitlab.po: msgid 'Fork'
Warning: fuzzy message was ignored.
msgid = 'Milestone
Warning: fuzzy message was ignored.
  /home/git/gitlab/locale/en/gitlab.po: msgid 'Tag'
Warning: fuzzy message was ignored.
msgid = 'project

After reinstalling gitaly this way, and restarting gitlab, the services are all working no problem, but I just can’t push to remote repositories. Even if I re-add the remote with git remote add test git@git.micron21.com.etcet.cetc and try to push my local changes to this “new” remote, I get the same error.

Gitlab starts okay, and reports no errors, and I can’t find an example of this error anywhere else on the internet.
Output of systemctl status gitlab is:

root@git:/home/git/gitlab# systemctl restart gitlab
root@git:/home/git/gitlab# systemctl status gitlab
● gitlab.service - LSB: GitLab git repository management
   Loaded: loaded (/etc/init.d/gitlab)
   Active: active (exited) since Thu 2020-02-13 18:48:42 AEDT; 5s ago
  Process: 16899 ExecStop=/etc/init.d/gitlab stop (code=exited, status=0/SUCCESS)
  Process: 17002 ExecStart=/etc/init.d/gitlab start (code=exited, status=0/SUCCESS)

Feb 13 18:48:04 git gitlab[17002]: Starting GitLab web server (unicorn)
Feb 13 18:48:04 git gitlab[17002]: Starting GitLab Sidekiq
Feb 13 18:48:04 git gitlab[17002]: Starting GitLab Workhorse
Feb 13 18:48:04 git gitlab[17002]: Starting Gitaly
Feb 13 18:48:42 git gitlab[17002]: .
Feb 13 18:48:42 git gitlab[17002]: The GitLab web server with pid 17044 is running.
Feb 13 18:48:42 git gitlab[17002]: The GitLab Sidekiq job dispatcher with pid 17132 is running.
Feb 13 18:48:42 git gitlab[17002]: The GitLab Workhorse with pid 17093 is running.
Feb 13 18:48:42 git gitlab[17002]: Gitaly with pid 17091 is running.
Feb 13 18:48:42 git gitlab[17002]: GitLab and all its components are up and running.
Feb 13 18:48:42 git systemd[1]: Started LSB: GitLab git repository management.

My next plan is to migrate everything over to an Omnibus installation, but I wanted to do that ‘later on’ and not have to roll back to various snapshots until I find one where it was working and try again.

As it stands, I am really the only one using this platform, so it can stay in this “broken” state while I try a few hotfixes, if anyone has any ideas.

Thank you in advance for your wisdom and guidance!

Okay, so with a fresh outlook and more coffee, I found the problem: I believe there is an error in how Gitaly’s config.toml is generated when there isn’t one present. I had assumed that removing the config.toml file and letting the Gitaly reinstall generate its own config file might fix the issues I had with Gitaly, but the default config file was generated with a bad bin_dir:

root@git:/home/git/gitaly# cat config.toml
#bin_dir = "tmp/tests/gitaly"
bin_dir = "/home/git/gitaly"
..

I commented out the non-existent tmp/tests/gitaly and replaced it with the correct path to gitaly where gitaly-hooks is actually present.
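For anyone who would rather script that edit than do it by hand, here is a minimal sketch. The function name set_gitaly_bin_dir is my own invention, not part of Gitaly, and it assumes the active bin_dir setting sits on a single uncommented line and that the new path contains no "|" or "&" characters.

```shell
# Hypothetical helper: rewrite the active bin_dir line in a Gitaly config.toml.
# A copy of the original file is kept next to it as config.toml.bak.
set_gitaly_bin_dir() {
    config=$1
    new_dir=$2
    cp "$config" "$config.bak" || return 1
    # Only uncommented bin_dir lines are rewritten. '|' is used as the sed
    # delimiter so the slashes in the path need no escaping.
    sed "s|^[[:space:]]*bin_dir[[:space:]]*=.*|bin_dir = \"$new_dir\"|" \
        "$config.bak" > "$config"
}
```

With the paths from this post that would be called as set_gitaly_bin_dir /home/git/gitaly/config.toml /home/git/gitaly, followed by systemctl restart gitlab. If the file has no active bin_dir line at all, the sketch leaves it unchanged.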

I suspect the bad bin_dir is introduced during Gitaly’s installation steps, which check for a config.toml and, if none exists, create one with this bad setting. The config.toml.example has the correct value, so I’m not sure where the Gitaly installation pulled this tmp directory from.

root@git:/home/git/gitaly# cat config.toml.example 
# Example Gitaly configuration file
# Documentation lives at https://docs.gitlab.com/ee/administration/gitaly/ and
# https://docs.gitlab.com/ee//administration/gitaly/reference

socket_path = "/home/git/gitlab/tmp/sockets/private/gitaly.socket"

# The directory where Gitaly's executables are stored
bin_dir = "/home/git/gitaly"
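To check whether a given config file points at a directory that really contains gitaly-hooks, something like the following sketch could be used. Both function names are hypothetical, and it assumes bin_dir is written on one line with double quotes, as in the files shown above. It also treats a relative bin_dir as a failure, on the assumption (based on the error in this post) that the hook resolves it against whatever directory it happens to run in.

```shell
# Hypothetical helper: print the first active (uncommented) bin_dir value.
gitaly_bin_dir() {
    sed -n 's/^[[:space:]]*bin_dir[[:space:]]*=[[:space:]]*"\([^"]*\)".*$/\1/p' "$1" | head -n 1
}

# Hypothetical helper: succeed only if bin_dir is an absolute path that
# contains an executable gitaly-hooks.
check_gitaly_hooks() {
    bin_dir=$(gitaly_bin_dir "$1")
    case "$bin_dir" in
        "")  echo "no active bin_dir in $1"; return 1 ;;
        /*)  ;;  # absolute path, continue to the file check
        *)   echo "bin_dir is relative: $bin_dir"; return 1 ;;
    esac
    if [ -x "$bin_dir/gitaly-hooks" ]; then
        echo "OK: $bin_dir/gitaly-hooks"
    else
        echo "gitaly-hooks not found in $bin_dir"
        return 1
    fi
}
```

Running check_gitaly_hooks /home/git/gitaly/config.toml after a reinstall, and before restarting GitLab, should flag a generated value such as tmp/tests/gitaly straight away.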

I hope this can help someone else who experiences the same problem as me.

As this is likely an installation bug for gitaly that may only apply to source upgrades, how would I raise this as a bug to be tracked?