CE Edition: Upgraded to 17.0.0 from 15.11 and now no runner works

We have upgraded (following the steps required) from 15.11 to 17.0 and now our runners do not work:

Jun 27 11:27:42 localhost gitlab-ci-multi-runner: time="2024-06-27T11:27:42+02:00" level=info
Jun 27 11:27:43 localhost gitlab-ci-multi-runner: time="2024-06-27T11:27:43+02:00" level=error msg="Checking for jobs... forbidden" runner=rotJLM5r
Jun 27 11:27:45 localhost gitlab-ci-multi-runner: time="2024-06-27T11:27:45+02:00" level=error msg="Checking for jobs... forbidden" runner=y9hjR8e1
Jun 27 11:27:46 localhost gitlab-ci-multi-runner: time="2024-06-27T11:27:46+02:00" level=error msg="Checking for jobs... forbidden" runner=rotJLM5r
Jun 27 11:27:48 localhost gitlab-ci-multi-runner: time="2024-06-27T11:27:48+02:00" level=error msg="Checking for jobs... forbidden" runner=y9hjR8e1
Jun 27 11:27:49 localhost gitlab-ci-multi-runner: time="2024-06-27T11:27:49+02:00" level=error msg="Checking for jobs... forbidden" runner=rotJLM5r
Jun 27 11:27:49 localhost gitlab-ci-multi-runner: time="2024-06-27T11:27:49+02:00" level=error msg="Runner http://192.168.1.101/rotJLM5rseHPpHy6syhg is not healthy and will be disabled!"
Jun 27 11:27:51 localhost gitlab-ci-multi-runner: time="2024-06-27T11:27:51+02:00" level=error msg="Checking for jobs... forbidden" runner=y9hjR8e1
Jun 27 11:27:51 localhost gitlab-ci-multi-runner: time="2024-06-27T11:27:51+02:00" level=error msg="Runner http://192.168.1.101/y9hjR8e1mznaiNBphGRm is not healthy and will be disabled!"

These runners and their jobs were working perfectly in 15.11.

Any help will be appreciated!

Did you upgrade your runners as well? Usually when upgrading GitLab you should update the runners too, so that they are on the same version.
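
A rough way to check that the two are in step is to compare the major.minor part of each version. This is only a sketch: `major_minor` and `versions_in_sync` are hypothetical helper names, and the version strings are typed in by hand (for example from `gitlab-runner --version` on the runner host and the Help page of the GitLab instance).

```shell
# Hypothetical helpers: compare only the major.minor part of two versions.
major_minor() {
  # "17.1.3" -> "17.1"
  printf '%s\n' "$1" | cut -d. -f1,2
}

versions_in_sync() {
  # Succeeds when both versions share the same major.minor.
  [ "$(major_minor "$1")" = "$(major_minor "$2")" ]
}

# Example with hand-typed version strings:
#   versions_in_sync 17.0.0 17.1.0 || echo "runner and GitLab differ"
```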

Thanks for the reply.

We did a sudo yum update gitlab-runner…

But nothing else…

What do you mean by updating the runners (in plural)?

More info, if we just look at one runner.
This was the log before the upgrade:

Jun 27 06:05:08 localhost gitlab-ci-multi-runner: time="2024-06-27T06:05:08+02:00" level=info msg="Checking for jobs... received" job=24748 repo_url="http://192.168.1.101/mannestech/financegear.git" runner=4b1500df
Jun 27 06:06:07 localhost gitlab-ci-multi-runner: time="2024-06-27T06:06:07+02:00" level=info msg="Job succeeded" job=24748 project=3 runner=4b1500df

And this after:

Jun 27 10:24:14 localhost gitlab-ci-multi-runner: time="2024-06-27T10:24:14+02:00" level=warning msg="Checking for jobs... failed" runner=4b1500df status="couldn't execute POST against http://192.168.1.101/api/v4/jobs/request: Post http://192.168.1.101/api/v4/jobs/request: dial tcp 192.168.1.101:80: getsockopt: connection refused"
Jun 27 10:24:21 localhost gitlab-ci-multi-runner: time="2024-06-27T10:24:21+02:00" level=warning msg="Checking for jobs... failed" runner=4b1500df status="502 Bad Gateway"

But Google did not help with that…
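
For what it is worth, the log lines in this thread fall into a few distinct groups, and sorting them can help narrow things down. Below is a small sketch; `classify_runner_line` is a made-up helper name, and the labels are only my reading of the messages (connection refused or 502 suggests GitLab itself is not serving requests; `forbidden` suggests the jobs API answered but rejected the request), not anything official.

```shell
# Hypothetical helper: label one runner log line by its apparent failure mode.
classify_runner_line() {
  case "$1" in
    *"connection refused"*|*"502 Bad Gateway"*)
      # GitLab could not be reached, or the proxy had no backend to talk to
      echo "server-unreachable" ;;
    *"Checking for jobs... forbidden"*)
      # The API responded but refused this runner's request
      echo "forbidden" ;;
    *"Checking for jobs... received"*|*"Job succeeded"*)
      echo "ok" ;;
    *)
      echo "other" ;;
  esac
}

# Example: label every line of a saved log file (file name is an assumption):
#   while IFS= read -r line; do classify_runner_line "$line"; done < runner.log | sort | uniq -c
```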

I asked on the assumption that you may have more than one runner, since that is a valid way to split jobs across runners rather than have them queue waiting for a single runner to become available.

No, that's not the case; there is only one runner. Thanks anyway.

We are following the hint on the 502 error (bad gateway) that looks different from the situation before the upgrade.

Any thought on that?

Not really sure about that error. I started using runners about two years ago, which means I registered them on a version of GitLab well before 15.11, and my runners are still connected and green.

I’m assuming you’ve checked under Admin Area → CI/CD → Runners and the status there is not green? I know that when I registered my runners I had to do it with tokens. Without knowing whether you did the same, or added your runners before token registration became available, it could well be that this is the reason.

I suppose you could try removing the runner and re-registering it with a token, and see if that helps with anything. Either that, or wait and see if anyone else posts here with a possible solution to your problem.

Thanks again for your replies.

I do not understand the Admin → CI/CD → Runners hint.

I do not see how to access that… I have tried with the superuser account (the one that adds users and does other administrative things), but there is no CI/CD (or build) option…

I am running a CE home installation…

(If you are referring to Settings → CI/CD → Runners, they are all green there.)

Solved the reconfigure problem (by disabling gitlab-kas: How do I disable gilab-kas in CE Linux installation? - #2 by david-mannes) and now I am trying to move forward by installing 17.1 (since I have updated gitlab-runner to 17.1, just to keep them in sync).

Problem solved.

The installation of 17.0 had not finished correctly (gitlab-kas version parsing did not work) and the application was not ready to run (not stable, there were things to finish in the installation).

So the solution was to disable gitlab-kas and then reconfigure.
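
For anyone landing here later: on a Linux package (Omnibus) install this is a setting in /etc/gitlab/gitlab.rb, `gitlab_kas['enable'] = false`, followed by `sudo gitlab-ctl reconfigure`. The sketch below only checks that the line is present and uncommented before reconfiguring; `kas_disabled_in` is a hypothetical helper name, and the path shown is the default one, which may differ on your system.

```shell
# Hypothetical helper: succeed if the given gitlab.rb disables gitlab-kas
# with an uncommented line of the form:  gitlab_kas['enable'] = false
kas_disabled_in() {
  grep -Eq "^[[:space:]]*gitlab_kas\['enable'\][[:space:]]*=[[:space:]]*false" "$1"
}

# Usage on the GitLab host (default config path assumed):
#   kas_disabled_in /etc/gitlab/gitlab.rb && sudo gitlab-ctl reconfigure
```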

We did not check whether 17.0 was working at that point and decided to go for the 17.1 upgrade instead (because we had upgraded gitlab-runner to 17.1).

Now, old ci/cd jobs have resumed working and everything looks ok.

Thank you @iwalker for your support and help.

The kas issue is a strange one, as I think it only occurred for some people and not others, depending on what Linux distro was used underneath.

I personally have disabled all functionality that I don’t need, so a minimal GitLab install for me, service-wise, looks like this:

[root@gitlab ~]# gitlab-ctl status
run: gitaly: (pid 38771) 155048s; run: log: (pid 2431) 514284s
run: gitlab-workhorse: (pid 38758) 155049s; run: log: (pid 2436) 514284s
run: logrotate: (pid 53106) 246s; run: log: (pid 2432) 514284s
run: nginx: (pid 38784) 155048s; run: log: (pid 2460) 514284s
run: postgresql: (pid 2435) 514284s; run: log: (pid 2434) 514284s
run: puma: (pid 38790) 155048s; run: log: (pid 2438) 514284s
run: redis: (pid 2455) 514284s; run: log: (pid 2453) 514284s
run: sidekiq: (pid 38806) 155047s; run: log: (pid 2442) 514284s

A while back, when gitlab-kas appeared as a service, I knew I wasn’t using GitLab with Kubernetes, so I didn’t want resources being used by something I don’t need. Obviously for those who need this service, it’s a different matter. But if you don’t need it, it is best to disable the unnecessary stuff.