Describe your question in as much detail as possible:
TL;DR: CI/CD pipelines using GitLab Runner succeed only while user gitlab-runner is logged into a shell on the box. Otherwise, they all fail.
At work we are running GitLab 15.7.5-ee. But we don’t have any GitLab Runners defined, and I wanted to leverage the CI/CD pipeline features. So I followed the instructions in the GitLab docs to install a GitLab Runner on an RHEL8 server we have. I added the GitLab repo, installed gitlab-runner, and registered it to a test project repository I set up to iron out any issues. All this went flawlessly. GitLab sees the runner, shows green, etc.
My intention is to use Podman instead of Docker on the RHEL8 box, so that is all installed and configured, and I’ve run some basic containers to verify things are working at a basic level. GitLab Runner is configured according to the Docker docs which cover this. It is set to run under user gitlab-runner (which it set up when it installed). The config.toml file is adjusted to point at the Podman UNIX socket file. User gitlab-runner is added to
For now all I have in the repo itself is a .gitlab-ci.yml file. And the contents of the latter are the default where it’s just a few stages that all pretty much just execute
However, when I try to run a pipeline (for example, by editing the .gitlab-ci.yml and saving it to trigger this), the stages fail with
Running with gitlab-runner 15.8.2 (4d1ca121)
  on ockness.net.unc.edu U8sxgA3a, system ID: s_4f3a2b84f82c
  feature flags: FF_NETWORK_PER_BUILD:true
Preparing the "docker" executor 00:09
ERROR: Failed to remove network for build
ERROR: Preparation failed: Cannot connect to the Docker daemon at unix:///run/user/984/podman/podman.sock. Is the docker daemon running? (docker.go:753:0s)
Will be retried in 3s ...
failing after 3 tries.
Now I have Google-fu’d my way to finding multiple references to this error, most of which were folks trying to simply use Docker with GitLab Runner and not Podman. So the fixes described (mostly making sure Docker was installed) don’t apply.
Now the oddest part is I have tracked this down to something rather specific. That is, if all I do is log in to the RHEL8 box as user gitlab-runner (which I had to do initially following the instructions in order to install the socket feature/etc.) and just stay logged in and repeat the above to trigger a pipeline run, everything succeeds! If I log back out of gitlab-runner and repeat, it fails once again.
As best as I can figure it, when GitLab Runner tickles the UNIX socket, which should trigger bringing up the Podman service, it doesn’t. But again, if user gitlab-runner is logged into a shell (even doing nothing), then it all works.
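One way to test this hypothesis is to look at the socket file itself while user gitlab-runner is logged in and again after logging out. The sketch below is only a diagnostic under stated assumptions: the function names are hypothetical, and it assumes the socket lives under /run/user/<uid>/ as in the error message. My understanding is that systemd normally tears down a user's service manager (and its /run/user/<uid> directory) when that user's last session ends unless lingering is enabled for the account, which is why the sketch also prints the Linger property; I have not confirmed that this is what happens on this box.

```shell
#!/usr/bin/env bash
# Diagnostic sketch; helper names are hypothetical.
# Only standard tools are used: id, test, loginctl.

# Build the rootless Podman socket path for a numeric UID.
podman_socket_path() {
  printf '/run/user/%s/podman/podman.sock\n' "$1"
}

# Report whether a path currently exists and is a UNIX socket.
socket_state() {
  if [ -S "$1" ]; then
    echo "present"
  else
    echo "missing"
  fi
}

# Print the socket state for a user, plus the systemd Linger property
# when loginctl is available and knows about the user.
report_user_socket() {
  local user="$1" uid sock
  uid="$(id -u "$user")" || return 1
  sock="$(podman_socket_path "$uid")"
  echo "socket $sock: $(socket_state "$sock")"
  if command -v loginctl >/dev/null 2>&1; then
    loginctl show-user "$user" --property=Linger 2>/dev/null \
      || echo "Linger: unknown (no user record)"
  fi
}
```

Running `report_user_socket gitlab-runner` once with a gitlab-runner session open and once without should show whether the socket disappears on logout.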
I cannot find anything in the logs to indicate what the issue here is. But I’m hoping someone else has hit on this and can point me in the right direction.
What are you seeing, and how does that differ from what you expect to see?
I am seeing every pipeline fail when I expect to see them succeed.
Consider including screenshots, error messages, and/or other helpful visuals
What version are you on? Are you using self-managed or GitLab.com?
- GitLab (Hint:
- Runner (Hint:
That doesn’t work for me. I get a 404. But I’m only a user of the system, not an admin.
Add the CI configuration from .gitlab-ci.yml and other configuration if relevant (e.g. docker-compose.yml)
stages:          # List of stages for jobs, and their order of execution
  - build
  - test
  - deploy

build-job:       # This job runs in the build stage, which runs first.
  image: alpine:latest
  stage: build
  script:
    - echo "Compiling the code..."
    - echo "Compile complete."

unit-test-job:   # This job runs in the test stage.
  image: alpine:latest
  stage: test    # It only starts when the job in the build stage completes successfully.
  script:
    - echo "Running unit tests... This will take about 60 seconds."
    - sleep 60
    - echo "Code coverage is 90%"

lint-test-job:   # This job also runs in the test stage.
  image: alpine:latest
  stage: test    # It can run at the same time as unit-test-job (in parallel).
  script:
    - echo "Linting code... This will take about 10 seconds."
    - sleep 10
    - echo "No lint issues found."

deploy-job:      # This job runs in the deploy stage.
  image: alpine:latest
  stage: deploy  # It only runs when *both* jobs in the test stage complete successfully.
  environment: production
  script:
    - echo "Deploying application..."
    - echo "Application successfully deployed."
What troubleshooting steps have you already taken? Can you link to any docs or other resources so we know where you have been?
Oof. Where to begin? Ok, I have tried everything from rebooting the RHEL8 box (no change) to adjusting the config.toml file with various parameter changes and more. And I can consistently make the pipeline succeed if I simply log in to the gitlab-runner account and then go over to GitLab and re-run a pipeline job. And if I log out and do it again, it fails. Again, it feels like the Podman service behind the socket simply isn’t starting up when GitLab triggers a CI/CD pipeline and that user isn’t logged in.
As for docs/resources, for starters:
- Docker executor | GitLab
- Install GitLab Runner | GitLab
- Install GitLab Runner using the official GitLab repositories | GitLab (the RHEL bits)
Finally, Some (Sanitized) Notes I Took of the process
# Add official GitLab repository
curl -L "https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.rpm.sh" | sudo bash

# Install GitLab Runner
sudo dnf install gitlab-runner

# Register runner using info provided in GitLab under Settings | CICD, then expanding 'Runners', which includes a URL and token
sudo gitlab-runner register

# For CentOS/etc. you can simply do this
# yum install podman

# For RHEL8, to install Podman, do this which adds Podman and related tools
sudo dnf module enable -y container-tools:rhel8
sudo dnf module install -y container-tools:rhel8
sudo dnf install podman-docker   # to get Docker CLI
sudo dnf install podman-plugins  # for network aliasing
GitLab Runner adds a user gitlab-runner to the system. We need to make sure this user is configured properly to run GitLab Runner, so we do the following:
# As root, we set the password on this account so we can log in
sudo passwd gitlab-runner

# Next we SSH into this account so we can set things up, using the password from the previous step
ssh gitlab-runner@localhost

# Next we enable and start the Podman socket (very important)
systemctl --user --now enable podman.socket

# And we verify things are working
systemctl status --user podman.socket

# The above command also provides us with the path to the socket we need (e.g., `/run/user/<#>/podman/podman.sock`) which we'll use in the next step
...
executor = "docker"
# ADD THE FOLLOWING LINES TO CREATE NETWORK FOR EACH JOB
[runners.feature_flags]
  FF_NETWORK_PER_BUILD = true
...
[runners.docker]
  # ADD THE FOLLOWING LINE SO GitLab Runner knows where to find Podman
  host = "unix:///run/user/<#>/podman/podman.sock"
...
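Since the runner only ever talks to whatever path is written in that host line, it can help to read the path back out of the file and compare it with the socket that actually exists. The helper below is a hypothetical sketch, not part of GitLab Runner; it assumes the host value is a quoted `unix://` URL on a single line, as in the snippet above, and takes the config file path as an argument (on a package install that is usually /etc/gitlab-runner/config.toml, which is an assumption here).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the filesystem path from every
# `host = "unix://..."` line found in the given TOML file.
runner_socket_paths() {
  sed -n 's|^[[:space:]]*host[[:space:]]*=[[:space:]]*"unix://\([^"]*\)".*|\1|p' "$1"
}
```

For example, `runner_socket_paths /etc/gitlab-runner/config.toml` prints the configured socket path, which can then be checked with `test -S` to see whether it exists at that moment.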
sudo gitlab-runner restart
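After the restart, a quick way to confirm that something is really answering on the socket, independent of GitLab Runner, is to ping the Docker-compatible API over it with curl. This is a minimal sketch: the function name is hypothetical, the socket path is passed in as an argument, and it relies on `/_ping` being served by Podman's Docker-compatible API.

```shell
#!/usr/bin/env bash
# Hypothetical helper: ping the Docker-compatible API behind a Podman socket.
# Returns non-zero if the socket file is absent or the request fails.
ping_podman_socket() {
  local sock="$1"
  if [ ! -S "$sock" ]; then
    echo "no socket at $sock" >&2
    return 1
  fi
  # The host part of the URL is ignored when curl talks to a UNIX socket.
  curl --silent --fail --unix-socket "$sock" http://localhost/_ping
}
```

Calling `ping_podman_socket /run/user/<#>/podman/podman.sock` with the real UID filled in, once with gitlab-runner logged in and once without, should reproduce the success/failure pattern outside of a pipeline.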
Thanks for taking the time to be thorough in your request, it really helps!