Goal: Run GitLab Runner as a service with the “docker” executor, but backed by rootless Podman running as the non-root user “gitlab-runner”.
I based my deployment on a previous thread: “Gitlab runner setup with podman” (GitLab CI/CD, GitLab Forum).
It was helpful, but I think updated runner requirements have since broken that deployment workflow.
Problem to solve
- The GitLab CI setup below yields a registered runner that picks up jobs, but every job fails with an error showing it is looking for the Docker daemon socket, NOT the Podman socket I configured.
- Related to that, I think the runner ignores the documented Podman socket setup, which edits /etc/gitlab-runner/config.toml. Instead it generates a default Docker config in ~/.gitlab-runner/config.toml and appends to it on every run.
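If I understand the runner docs correctly, the default config file depends on who starts the process: root gets /etc/gitlab-runner/config.toml (system-mode), any other user gets ~/.gitlab-runner/config.toml (user-mode). Here is a minimal sketch of that rule. `runner_config_path` is a helper name I made up for illustration; it is not part of gitlab-runner.

```shell
#!/usr/bin/env bash
# Print the config.toml path gitlab-runner is expected to load by default.
# Hypothetical helper: $1 = numeric UID of the process, $2 = that user's home.
runner_config_path() {
    local uid="$1" home="$2"
    if [ "$uid" -eq 0 ]; then
        echo "/etc/gitlab-runner/config.toml"       # system-mode (root)
    else
        echo "${home}/.gitlab-runner/config.toml"   # user-mode (non-root)
    fi
}

# Show the expected path for whoever runs this script.
runner_config_path "$(id -u)" "$HOME"
```

If that is right, a service started as “gitlab-runner” would never read the file under /etc unless it is told to, for example with `gitlab-runner run --config <path>`, and the user would also need read access to that file.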
Steps to reproduce
Errors from the GitLab pipeline:
Running with gitlab-runner 17.10.1 (ef3cc)
on triton ZzoxYnB3b, system ID: s_a456
Preparing the "docker" executor
ERROR: Failed to remove network for build
ERROR: Preparation failed: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running? (docker.go:1041:0s)
Will be retried in 3s ...
ERROR: Failed to remove network for build
ERROR: Preparation failed: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running? (docker.go:1041:0s)
Will be retried in 3s ...
Running the same task from a shell:
gitlab-runner@triton:~ $ /usr/bin/gitlab-runner run
Runtime platform arch=arm64 os=linux pid=8097 revision=ef334dcc version=17.10.1
Starting multi-runner from /home/gitlab-runner/.gitlab-runner/config.toml... builds=0 max_builds=0
WARNING: Running in user-mode.
WARNING: Use sudo for system-mode:
WARNING: $ sudo gitlab-runner...
Usage logger disabled builds=0 max_builds=1
Configuration loaded builds=0 max_builds=1
listen_address not defined, metrics & debug endpoints disabled builds=0 max_builds=1
[session_server].listen_address not defined, session endpoints disabled builds=0 max_builds=1
Initializing executor providers builds=0 max_builds=1
Checking for jobs... received job=9732752354 repo_url=https://gitlab.com/penguinpages/infra/shuffleboard.git runner=ZzoxYnB3b
Added job to processing list builds=1 job=9732752354 max_builds=1 project=68548754 repo_url=https://gitlab.com/penguinpages/infra/shuffleboard.git time_in_queue_seconds=2
ERROR: Failed to remove network for build error=networksManager is undefined job=9732752354 network= project=68548754 runner=ZzoxYnB3b
WARNING: Preparation failed: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running? (docker.go:1041:0s) job=9732752354 project=68548754 runner=ZzoxYnB3b
Will be retried in 3s ... job=9732752354 project=68548754 runner=ZzoxYnB3b
So both show that job containers cannot be launched because the runner can’t connect to the socket: it is looking for the Docker socket instead of the Podman socket defined below.
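As far as I know, when `[runners.docker]` has no `host` entry, the runner falls back to the `DOCKER_HOST` environment variable or to unix:///var/run/docker.sock, which matches the error above. Here is a quick check of which `host` values a given config file really contains, as a sketch. `docker_hosts_in_config` is a hypothetical helper, and it only does a line-based match rather than parsing TOML.

```shell
#!/usr/bin/env bash
# Print every quoted value assigned to "host" in a runner config file.
# Hypothetical helper: $1 = path to a config.toml.
# Prints nothing when no host line exists.
docker_hosts_in_config() {
    sed -n 's/^[[:space:]]*host[[:space:]]*=[[:space:]]*"\(.*\)"[[:space:]]*$/\1/p' "$1"
}
```

Running it against both /etc/gitlab-runner/config.toml and ~/.gitlab-runner/config.toml should show whether the Podman socket ever reached the file the service loads.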
Configuration
CI steps I run as Ansible tasks:
- name: Install dependencies
  ansible.builtin.package:
    name:
      - curl
      - openssh-server
      - ca-certificates
      - tzdata
      - podman
    state: present
- name: Create user gitlab-runner if not exists
  ansible.builtin.user:
    name: gitlab-runner
    state: present
    groups: sudo
  ignore_errors: yes
- name: Download and execute GitLab Runner installation script
  ansible.builtin.shell: |
    curl -L "https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.deb.sh" | sudo bash
  args:
    executable: /bin/bash
- name: Install GitLab Runner
  ansible.builtin.package:
    name: gitlab-runner
    state: present
- name: Ensure gitlab-runner has sudo privileges without password
  ansible.builtin.lineinfile:
    path: /etc/sudoers
    state: present
    regexp: '^gitlab-runner ALL=\(ALL\) NOPASSWD: ALL$'
    line: 'gitlab-runner ALL=(ALL) NOPASSWD: ALL'
    validate: 'visudo -cf %s'
- name: Deploy GitLab runner
  ansible.builtin.shell: |
    sudo -u gitlab-runner gitlab-runner register -n --url https://gitlab.com --registration-token glrt-blah --executor docker --docker-image alpine:latest
  args:
    executable: /bin/bash
  become: yes
  become_user: gitlab-runner
- name: Prepare user environment for user-scoped systemd
  ansible.builtin.shell: loginctl enable-linger gitlab-runner
- name: Add first search line to file
  ansible.builtin.lineinfile:
    path: /etc/containers/registries.conf
    line: '[registries.search]'
    insertafter: EOF
    state: present
- name: Add second search line to file
  ansible.builtin.lineinfile:
    path: /etc/containers/registries.conf
    line: 'registries = ["docker.io"]'
    insertafter: EOF
    state: present
- name: Set XDG_RUNTIME_DIR in .bashrc
  ansible.builtin.lineinfile:
    path: ~/.bashrc
    line: "export XDG_RUNTIME_DIR=/run/user/$(id -u)"
  become: yes
  become_user: gitlab-runner
- name: Start and enable podman.socket works in shell but ansible failing to run correctly
  ansible.builtin.command: "systemctl --user enable --now podman.socket"
  become: yes
  become_user: gitlab-runner
  ignore_errors: True
# Not sure how to make the task below, which works the first time, work every time.
# - name: Get path to the podman socket
#   ansible.builtin.shell: "systemctl --user status podman.socket | grep Listen: | awk '{print $2}'"
#   register: podman_socket_path
#   become: yes
#   become_user: gitlab-runner
- name: Set custom fact
  ansible.builtin.set_fact:
    podman_socket_path: "/run/user/1000/podman/podman.sock"
- name: Display podman socket path
  ansible.builtin.debug:
    msg: "Podman socket path: {{ podman_socket_path }}"
- name: Update config.toml with the podman socket path
  ansible.builtin.lineinfile:
    path: /etc/gitlab-runner/config.toml
    regexp: '^ *host ='
    # podman_socket_path already starts with "/", so "unix://" plus the path
    # gives unix:///run/...; my original "unix:/" produced only two slashes.
    line: '    host = "unix://{{ podman_socket_path }}"'
- name: Create GitLab runner service
  ansible.builtin.copy:
    dest: /etc/systemd/system/gitlab-runner.service
    content: |
      [Unit]
      Description=GitLab Runner
      After=network.target
      [Service]
      User=gitlab-runner
      ExecStart=/usr/bin/gitlab-runner run
      Restart=always
      [Install]
      WantedBy=multi-user.target
- name: Enable and restart GitLab runner service
  ansible.builtin.systemd:
    name: gitlab-runner
    enabled: yes
    state: restarted
- name: Validate GitLab runner service is running
  ansible.builtin.command: systemctl is-active gitlab-runner
  register: gitlab_runner_status
  failed_when: gitlab_runner_status.rc != 0
  changed_when: false
- name: Assert GitLab runner service is running
  ansible.builtin.assert:
    that:
      - gitlab_runner_status.stdout == 'active'
    fail_msg: "GitLab runner service is not running"
    success_msg: "GitLab runner service is running"
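To avoid hard-coding UID 1000 in the playbook, the socket URL could be derived from the user name instead. This is a minimal sketch, assuming the rootless socket lives at /run/user/<uid>/podman/podman.sock as in the set_fact task above. `podman_host_url` is a helper name I made up.

```shell
#!/usr/bin/env bash
# Build the Docker-compatible host URL for a user's rootless Podman socket.
# Hypothetical helper: $1 = user name. Assumes the socket path layout
# /run/user/<uid>/podman/podman.sock used in the playbook.
podman_host_url() {
    local uid
    uid="$(id -u "$1")" || return 1   # fail if the user does not exist
    # "unix://" followed by an absolute path gives three slashes in total.
    echo "unix:///run/user/${uid}/podman/podman.sock"
}
```

`podman_host_url gitlab-runner` should then print the value that belongs in `host` under `[runners.docker]`. If I read the runner CLI help correctly, the same value can be given to `gitlab-runner register` via `--docker-host`, so the config is written with the Podman socket from the start.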
I think the process is broken: each launch appends another entry to the gitlab-runner TOML file, which does not seem correct.
gitlab-runner@triton:~/.gitlab-runner $ cat ~/.gitlab-runner/config.toml
concurrent = 1
check_interval = 0
connection_max_age = "15m0s"
shutdown_timeout = 0

[session_server]
  session_timeout = 1800

[[runners]]
  name = "triton"
  url = "https://gitlab.com"
  id = 46878060
  token = "glrt-blah"
  token_obtained_at = 2025-04-15T16:06:32Z
  token_expires_at = 0001-01-01T00:00:00Z
  executor = "docker"
  [runners.cache]
    MaxUploadedArchiveSize = 0
    [runners.cache.s3]
    [runners.cache.gcs]
    [runners.cache.azure]
  [runners.docker]
    tls_verify = false
    image = "alpine:latest"
    privileged = false
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/cache"]
    shm_size = 0
    network_mtu = 0

[[runners]]
  name = "triton"
  url = "https://gitlab.com"
  id = 46878060
  token = "glrt-blah"
  token_obtained_at = 2025-04-15T16:08:22Z
  token_expires_at = 0001-01-01T00:00:00Z
  executor = "docker"
  [runners.cache]
    MaxUploadedArchiveSize = 0
    [runners.cache.s3]
    [runners.cache.gcs]
    [runners.cache.azure]
  [runners.docker]
    tls_verify = false
    image = "alpine:latest"
    privileged = false
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/cache"]
    shm_size = 0
    network_mtu = 0

(snip: the next 10 entries are identical except for timestamp and ID)
So I think some step is missing: when user “gitlab-runner” launches the runner service, it builds an updated TOML config for the instance. It SHOULD use the Podman settings, but instead it falls back to the default “docker” socket.
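One thing I still need to rule out: as far as I can tell, every `gitlab-runner register` call appends a new `[[runners]]` block, so the growth of the file may come from re-running the register task rather than from starting the service. Here is a sketch for counting the entries before and after each step. `count_runner_entries` is a hypothetical helper that does a plain line match instead of parsing TOML.

```shell
#!/usr/bin/env bash
# Count the [[runners]] sections in a runner config file.
# Hypothetical helper: $1 = path to a config.toml.
count_runner_entries() {
    grep -c '^[[:space:]]*\[\[runners\]\]' "$1"
}
```

If the count only changes after the register task, that task could be skipped whenever the count is already 1 or more.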