Hi there,
I run an up-to-date Debian Buster on a VPS. I installed gitlab-ce and gitlab-runner from packages.gitlab.com, and therefore have the following versions:
# apt show gitlab-ce
Package: gitlab-ce
Version: 12.0.3-ce.0
# apt show gitlab-runner
Package: gitlab-runner
Version: 12.0.1
gitlab-runner is launched by systemd, and the associated file in /etc/systemd/system/gitlab-runner.service contains this:
[Unit]
Description=GitLab Runner
After=syslog.target network.target
ConditionFileIsExecutable=/usr/lib/gitlab-runner/gitlab-runner
[Service]
StartLimitInterval=5
StartLimitBurst=10
ExecStart=/usr/lib/gitlab-runner/gitlab-runner "--debug" "run" "--working-directory" "/home/gitlab-runner" "--config" "/etc/gitlab-runner/config.toml" "--service" "gitlab-runner" "--syslog" "--user" "gitlab-runner"
Restart=always
RestartSec=120
[Install]
WantedBy=multi-user.target
I successfully registered a shared runner, but all jobs fail. For example, one repository contains the following .gitlab-ci.yml:
---
stages:
  - precompile
precompile:
  stage: precompile
  tags:
    - base
  before_script:
    - docker --version
  script:
    - docker build . -t cv:${CI_PIPELINE_ID}
The job is received by gitlab-runner, but immediately fails. Here is the log from systemd:
gitlab-runner[30858]: Checking for jobs... received job=106 repo_url=https://gitlab.dunatotatos.com/dunatotatos/cv.git runner=bUxm5K1S
gitlab-runner[30858]: Checking for jobs... received job=106 repo_url=https://gitlab.dunatotatos.com/dunatotatos/cv.git runner=bUxm5K1S
gitlab-runner[30858]: Failed to requeue the runner: builds=1 runner=bUxm5K1S
gitlab-runner[30858]: Running with gitlab-runner 12.0.1 (0e5417a3) job=106 project=26 runner=bUxm5K1S
gitlab-runner[30858]: on Azazel bUxm5K1S job=106 project=26 runner=bUxm5K1S
gitlab-runner[30858]: Shell configuration: environment: []
gitlab-runner[30858]: dockercommand:
gitlab-runner[30858]: - sh
gitlab-runner[30858]: - -c
gitlab-runner[30858]: - "if [ -x /usr/local/bin/bash ]; then\n\texec /usr/local/bin/bash --login\nelif [
gitlab-runner[30858]: -x /usr/bin/bash ]; then\n\texec /usr/bin/bash --login\nelif [ -x /bin/bash ]; then\n\texec
gitlab-runner[30858]: /bin/bash --login\nelif [ -x /usr/local/bin/sh ]; then\n\texec /usr/local/bin/sh
gitlab-runner[30858]: --login\nelif [ -x /usr/bin/sh ]; then\n\texec /usr/bin/sh --login\nelif [ -x /bin/sh
gitlab-runner[30858]: ]; then\n\texec /bin/sh --login\nelif [ -x /busybox/sh ]; then\n\texec /busybox/sh
gitlab-runner[30858]: --login\nelse\n\techo shell not found\n\texit 1\nfi\n\n"
gitlab-runner[30858]: command: su
gitlab-runner[30858]: arguments:
gitlab-runner[30858]: - -s
gitlab-runner[30858]: - /bin/bash
gitlab-runner[30858]: - gitlab-runner
gitlab-runner[30858]: - -c
gitlab-runner[30858]: - bash --login
gitlab-runner[30858]: passfile: false
gitlab-runner[30858]: extension: ""
gitlab-runner[30858]: job=106 project=26 runner=bUxm5K1S
gitlab-runner[30858]: Using Shell executor... job=106 project=26 runner=bUxm5K1S
gitlab-runner[30858]: Waiting for signals... job=106 project=26 runner=bUxm5K1S
gitlab-runner[30858]: Executing build stage build_stage=prepare_script job=106 project=26 runner=bUxm5K1S
su[2599]: (to gitlab-runner) root on none
su[2599]: pam_unix(su:session): session opened for user gitlab-runner by (uid=0)
su[2599]: pam_unix(su:session): session closed for user gitlab-runner
gitlab-runner[30858]: Executing build stage build_stage=upload_artifacts_on_failure job=106 project=26 runner=bUxm5K1S
su[2619]: (to gitlab-runner) root on none
su[2619]: pam_unix(su:session): session opened for user gitlab-runner by (uid=0)
su[2619]: pam_unix(su:session): session closed for user gitlab-runner
gitlab-runner[30858]: WARNING: Job failed: exit status 1 duration=146.489509ms job=106 project=26 runner=bUxm5K1S
gitlab-runner[30858]: WARNING: Job failed: exit status 1 duration=146.489509ms job=106 project=26 runner=bUxm5K1S
gitlab-runner[30858]: Appending trace to coordinator... ok code=202 job=106 job-log=0-20037 job-status=running runner=bUxm5K1S sent-log=0-20036 status=202 Accepted
gitlab-runner[30858]: Submitting job to coordinator... ok code=200 job=106 job-status= runner=bUxm5K1S
gitlab-runner[30858]: ERROR: Failed to process runner builds=0 error=exit status 1 executor=shell runner=bUxm5K1S
gitlab-runner[30858]: ERROR: Failed to process runner builds=0 error=exit status 1 executor=shell runner=bUxm5K1S
The job output in GitLab itself is not much help:
Running with gitlab-runner 12.0.1 (0e5417a3)
on Azazel bUxm5K1S
Using Shell executor...
Running on azazel...
ERROR: Job failed: exit status 1
Running with CI_DEBUG_TRACE gives more info, but I’m a bit confused. (I would happily post the whole log, but formatting is a pain, and I cannot post attachments.)
Running with gitlab-runner 12.0.1 (0e5417a3)
on Azazel bUxm5K1S
Using Shell executor...
+ set -eo pipefail
+ set +o noclobber
+ :
+ eval 'echo "Running on $(hostname)..."
'
+++ hostname
++ echo 'Running on azazel...'
Running on azazel...
+ exit 0
++ '[' 1 = 1 ']'
++ '[' -x /usr/bin/clear_console ']'
++ /usr/bin/clear_console -q
+ set -eo pipefail
+ set +o noclobber
+ :
+ eval 'export FF_CMD_DISABLE_DELAYED_ERROR_LEVEL_EXPANSION=$'\''false'\''
export FF_USE_LEGACY_BUILDS_DIR_FOR_DOCKER=$'\''false'\''
[...]
++ export CI_RUNNER_EXECUTABLE_ARCH=linux/amd64
++ CI_RUNNER_EXECUTABLE_ARCH=linux/amd64
++ cd /home/gitlab-runner/builds/bUxm5K1S/0/dunatotatos/cv
+ exit 0
++ '[' 1 = 1 ']'
++ '[' -x /usr/bin/clear_console ']'
++ /usr/bin/clear_console -q
ERROR: Job failed: exit status 1
Why would the job fail when executing clear_console, while it went perfectly fine during the first step? I guess the error is not there…
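As a side note, the `'[' 1 = 1 ']'` test followed by `clear_console -q` in the trace above resembles the stock ~/.bash_logout that Debian ships in /etc/skel, although nothing in the log confirms that this is where the lines come from. A minimal sketch to locate which file contains the call, assuming it sits in a logout script (the helper name `find_caller` and the candidate paths in the usage comment are illustrative, not taken from the runner):

```shell
#!/bin/sh
# find_caller PATTERN FILE...
# Print "file:line:text" for every match of PATTERN in the given files.
# -H always prints the file name, -n the line number,
# -s silences complaints about missing or unreadable files.
find_caller() {
    pattern=$1
    shift
    grep -Hns -- "$pattern" "$@"
}

# Candidate logout scripts for the gitlab-runner user (assumed locations):
# find_caller clear_console /home/gitlab-runner/.bash_logout /etc/bash.bash_logout
```

If a match shows up, it would at least tell which script runs clear_console at the end of each build stage; whether that call is what produces the exit status 1 would still need to be checked separately.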
HOWEVER, everything works as expected when gitlab-runner is running from the command line through an SSH connection, with the following command:
/usr/lib/gitlab-runner/gitlab-runner "--debug" "run" "--working-directory" "/home/gitlab-runner" "--config" "/etc/gitlab-runner/config.toml" "--service" "gitlab-runner" "--syslog" "--user" "gitlab-runner"
This is the exact same command launched by systemd. I’ve tried to set a matching $PATH and $PWD, and the behavior is still the same. Jobs crash with systemd, and run successfully when launched from the command line. I’ve also tried to use a runner dedicated to the repository instead of a shared one, and observed the same behavior. Last but not least, I’ve used strace on the process launched by systemd, but the output is a bit too large for a detailed analysis.
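Since only $PATH and $PWD were matched by hand, a fuller comparison of the two environments may be worth a try. The sketch below reads /proc/PID/environ for a running process; the function name `dump_env`, the placeholder variables SYSTEMD_PID and SSH_PID, and the /tmp file names are assumptions for illustration. Note that /proc/PID/environ reflects the environment the process was started with, not later changes, and reading another user's process requires root.

```shell
#!/bin/sh
# dump_env PID
# Print the initial environment of a running process, one VAR=value per line.
# /proc/PID/environ is NUL-separated, so tr converts the separators to newlines;
# sort makes two dumps comparable with diff.
dump_env() {
    tr '\0' '\n' < "/proc/$1/environ" | sort
}

# Hypothetical usage; SYSTEMD_PID and SSH_PID stand for the PIDs of the
# gitlab-runner process started by systemd and the one started over SSH:
# dump_env "$SYSTEMD_PID" > /tmp/env.systemd
# dump_env "$SSH_PID"     > /tmp/env.ssh
# diff /tmp/env.systemd /tmp/env.ssh
```

This only covers environment variables; differences such as the absence of a controlling terminal under systemd would not appear in the diff.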
Is this a bug, or (most likely) a misconfiguration on my side?
Please let me know if anything is missing in my post.
Thanks in advance for your help!
Duna