All runners show as offline after the update to 16.11, even though they can reach the instance

Hello,

Since today, my runners are no longer shown as active in my GitLab instance. I don’t understand why, since they can still reach it.
Any hints please?

systemctl status gitlab-runner.service
gitlab-runner[3181473]: WARNING: Checking for jobs... failed                runner=ExpRHGXnD status=POST https://<gitlab.instance>/api/v4/jobs/request: 500 Internal Server Error

I see that the POST is failing.
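The log line above already narrows things down: a 5xx status means the runner reached the instance and the failure happened on the server side. A minimal sketch of pulling that status out of the runner log follows. The helper names `extract_http_status` and `classify_status` are hypothetical, and the usage comments assume a systemd-managed runner and an Omnibus GitLab install.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read gitlab-runner log lines on stdin and print the
# HTTP status code of each failed "status=POST <url>: <code> ..." entry.
extract_http_status() {
  sed -n 's/.*status=POST [^ ]*: \([0-9][0-9][0-9]\) .*/\1/p'
}

# Hypothetical helper: give a rough hint about where to look next.
classify_status() {
  case "$1" in
    5??) echo "server-side error: check the GitLab instance logs" ;;
    4??) echo "client-side error: check the runner token and URL" ;;
    *)   echo "unclassified status: $1" ;;
  esac
}

# Example usage (assumptions: systemd on the runner host, Omnibus on the server):
#   journalctl -u gitlab-runner --no-pager | extract_http_status | sort | uniq -c
#   sudo gitlab-ctl tail gitlab-rails    # run on the GitLab server to see the 500
```

With a 500 like the one reported here, the runner-side commands are unlikely to reveal much; the server logs are the more promising place to look.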

I tried to register a new runner, with the same result :frowning:.
I also tried the following commands. Their output looks fine, but the runners still show as offline:

gitlab-runner status
gitlab-runner start
gitlab-runner verify

It only started today. GitLab was updated to 16.11 during the night, and I suppose the runners have been unable to talk to the GitLab instance properly since then.

I updated all my runners to match this version, but nothing changed.

I’m stuck, please help me debug this.

I can confirm that the problem was introduced by the update: I took a backup of my GitLab instance before the update, at v16.10.3, and a runner I register against that backup shows as online in the list of runners.

How can I get the runners back to online status on my self-hosted production GitLab, which was automatically updated to v16.11.0?

Thanks in advance.

When you upgrade GitLab, you should also upgrade your runners.

Also, if you upgraded your runners to 16.11.0, then you should also have upgraded your Gitlab server to 16.11.0.

You don’t mention whether your GitLab server is still on 16.10.3 or has been upgraded as well. The GitLab and runner versions should always match.

The production GitLab server was automatically updated to 16.11.0 last night.

Since some people suggested aligning the runner version with the GitLab server, I did that this morning in the hope of getting the runners back online, without success.

So I checked whether an instance restored from my pre-update backup shows the same symptoms. It does not: on 16.10.3 the runners come online without problems.

Problem solved thanks to help from @Niklas on Discord, who told me the following:

  • You can check the status of the migrations with gitlab-rake db:migrate:status, and if any of them are not up, you should be able to run them with gitlab-rake db:migrate

  • That means that the migrations were not all applied, in most cases it should be safe to retry them, but I would suggest a backup just in case (I created a backup before each upgrade anyways on my instances)

Not all migrations had been applied during the update, and running gitlab-rake db:migrate solved it.
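The check described above can be sketched as a small script. This is only an illustration under assumptions: an Omnibus install where gitlab-rake and gitlab-backup are available, and status output in the usual Rails layout where each row starts with up or down. The function names `count_down_migrations` and `migrate_if_pending` are hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read `gitlab-rake db:migrate:status` output on stdin
# and print how many rows have the status "down" (not yet applied).
count_down_migrations() {
  awk '$1 == "down" { n++ } END { print n + 0 }'
}

# Hypothetical wrapper (assumes an Omnibus install): back up first, then
# run the pending migrations only if any are reported as down.
migrate_if_pending() {
  local pending
  pending=$(sudo gitlab-rake db:migrate:status | count_down_migrations)
  echo "migrations down: $pending"
  if [ "$pending" -gt 0 ]; then
    sudo gitlab-backup create     # backup just in case, as advised above
    sudo gitlab-rake db:migrate   # apply the migrations that are still down
  fi
}

# Example usage on the GitLab server:
#   migrate_if_pending
```

Counting the down rows first makes it easy to confirm, before and after, that db:migrate actually changed something.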


I’m curious whether anything was showing up under Admin → Monitoring → Background Migrations. Obviously we cannot see that now, since you’ve run the command from the CLI. It’s possible that the server was processing the background migrations slowly, although I’m surprised it would have taken as long as it did. On mine, from 16.10.3 to 16.11.0, they were pretty quick, literally only a few minutes.

Anyway, glad it’s solved :slight_smile:

@iwalker

I encountered the same problem. The GitLab admin panel under Monitoring → Background Migrations showed 0 for Queued/Finalizing/Failed jobs. However, running gitlab-rake db:migrate:status revealed that many migrations were actually down. Fortunately, running gitlab-rake db:migrate fixed the issue for me.
