Env
- gitlab-ce 12.7.2-ce.0 amd64
- gitlab-runner 12.6.0 amd64
- Debian Stretch
Problem
I have to restart the Docker service on the Debian Stretch host to get networking working again. Otherwise I get a “Connection unreachable” inside the CI Docker container when accessing outside resources.
- What are you seeing, and how does that differ from what you expect to see?
Running with gitlab-runner 12.6.0 (ac8e767a)
on Shared Docker runner vJRY-fex
Using Docker executor with image git.example.com:5555/internal/container/stretch/build:v1 ...
00:34
Starting service docker:dind ...
Pulling docker image docker:dind ...
Using docker image sha256:8489eeb24a264b6bcdb17f3da00140cebe92ee36bd22365f37d07d59390df4ee for docker:dind ...
Waiting for services to be up and running...
*** WARNING: Service runner-vJRY-fex-project-120-concurrent-0-docker-0 probably didn't start properly.
Health check error:
service "runner-vJRY-fex-project-120-concurrent-0-docker-0-wait-for-service" timeout
Health check container logs:
Service container logs:
2020-01-29T10:44:59.621826740Z time="2020-01-29T10:44:59.621556665Z" level=info msg="Starting up"
2020-01-29T10:44:59.625563991Z time="2020-01-29T10:44:59.625412983Z" level=warning msg="could not change group /var/run/docker.sock to docker: group docker not found"
2020-01-29T10:44:59.626422790Z failed to load listeners: can't create unix socket /var/run/docker.sock: device or resource busy
*********
Authenticating with credentials from job payload (GitLab Registry)
Pulling docker image git.example.com:5555/internal/container/stretch/build:v1 ...
Using docker image sha256:5223d58b17d2138d64951fb738ddd44b71bf8734477ac05f7874db64922532cb for git.example.com:5555/internal/container/stretch/build:v1 ...
Running on runner-vJRY-fex-project-120-concurrent-0 via git...
00:01
Fetching changes...
00:06
Reinitialized existing Git repository in /builds/internal/backoffice_ui/.git/
From https://git.example.com/internal/backoffice_ui
* [new ref] refs/pipelines/565 -> refs/pipelines/565
535d10e..7e5d9ca develop -> origin/develop
Checking out 7e5d9caf as develop...
*********
… a lot of other output, but then the important part:
$ apt update
WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
Err:1 http://repos.example.com/debian stretch InRelease
Could not connect to repos.example.com:80 (172.21.1.124), connection timed out
Err:2 http://repos.example.com/debian nodejs_11 InRelease
Unable to connect to repos.example.com:http:
Reading package lists...
Building dependency tree...
Reading state information...
All packages are up to date.
In the end, the build breaks because the required packages can’t be installed.
Workaround
~# service docker restart
- After restart
Running with gitlab-runner 12.6.0 (ac8e767a)
on Shared Docker runner vJRY-fex
Using Docker executor with image git.example.com:5555/internal/container/stretch/build:v1 ...
00:33
Starting service docker:dind ...
Pulling docker image docker:dind ...
Using docker image sha256:8489eeb24a264b6bcdb17f3da00140cebe92ee36bd22365f37d07d59390df4ee for docker:dind ...
Waiting for services to be up and running...
*** WARNING: Service runner-vJRY-fex-project-120-concurrent-0-docker-0 probably didn't start properly.
Health check error:
service "runner-vJRY-fex-project-120-concurrent-0-docker-0-wait-for-service" timeout
Health check container logs:
Service container logs:
2020-02-04T09:01:09.326839247Z time="2020-02-04T09:01:09.326549028Z" level=info msg="Starting up"
2020-02-04T09:01:09.330228908Z time="2020-02-04T09:01:09.330098855Z" level=warning msg="could not change group /var/run/docker.sock to docker: group docker not found"
2020-02-04T09:01:09.331162134Z failed to load listeners: can't create unix socket /var/run/docker.sock: device or resource busy
*********
Authenticating with credentials from job payload (GitLab Registry)
Pulling docker image git.example.com:5555/internal/container/stretch/build:v1 ...
Using docker image sha256:5223d58b17d2138d64951fb738ddd44b71bf8734477ac05f7874db64922532cb for git.example.com:5555/internal/container/stretch/build:v1 ...
Running on runner-vJRY-fex-project-120-concurrent-0 via git...
00:02
Fetching changes...
00:02
Reinitialized existing Git repository in /builds/internal/backoffice_ui/.git/
Checking out 7e5d9caf as develop...
Removing .filename
Removing backoffice-ui-build-deps_0.0.21+0~20200129104742_all.deb
Skipping Git submodules setup
Authenticating with credentials from job payload (GitLab Registry)
05:04
$ echo 'deb http://repos.example.com/debian/ nodejs_11 main' >> /etc/apt/sources.list
$ apt update
WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
Get:1 http://repos.example.com/debian stretch InRelease [3982 B]
....
Everything runs fine now. Just restarting Docker is enough. It then works for a few hours or days … I am not sure how long.
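Until the root cause is found, the manual restart could be automated. The following is only a minimal sketch, not a tested fix. It assumes that a throwaway container on the default bridge network sees the same connectivity as a CI job, that `repos.example.com` resolves inside that container (the runner itself relies on `extra_hosts` for this), and that bash in the `debian:stretch` image supports `/dev/tcp` redirection. The names `probe`, `restart_docker`, `check_and_restart`, `PROBE_FN`, `RESTART_FN`, `PROBE_HOST` and `PROBE_PORT` are made up for this illustration.

```shell
#!/bin/sh
# Hypothetical watchdog: probe outbound connectivity from a container and
# restart Docker only when the probe fails.

# Host and port to probe; defaults taken from the failing "apt update" above.
PROBE_HOST="${PROBE_HOST:-repos.example.com}"
PROBE_PORT="${PROBE_PORT:-80}"

# Try to open a TCP connection from inside a throwaway container.
# Assumption: the name resolves inside the container and bash provides /dev/tcp.
probe() {
    docker run --rm debian:stretch \
        timeout 10 bash -c "exec 3<>/dev/tcp/${PROBE_HOST}/${PROBE_PORT}"
}

# The workaround from this post.
restart_docker() {
    service docker restart
}

# Run the probe; restart Docker only if it fails.
# PROBE_FN and RESTART_FN allow swapping in other commands, e.g. for a dry run.
check_and_restart() {
    if "${PROBE_FN:-probe}" >/dev/null 2>&1; then
        echo "connectivity ok"
    else
        echo "probe failed, restarting docker"
        "${RESTART_FN:-restart_docker}"
    fi
}
```

Sourcing this file and calling `check_and_restart` from a cron job on the runner host would apply the workaround automatically. It only hides the symptom and would also interrupt any job running at that moment.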
Configs
- .gitlab-ci.yml
variables:
  TOOL_ARGS: apt-get -o Debug::pkgProblemResolver=yes --no-install-recommends --yes --allow-unauthenticated
  DEB_PACKAGE_NAME: "backoffice-ui"
stages:
  - build
  - publish
  - deploy
build:stretch: &build
  stage: build
  tags:
    - docker
  image: git.example.com:5555/internal/container/stretch/build:v1
  before_script:
    - echo 'deb http://repos.example.com/debian/ nodejs_11 main' >> /etc/apt/sources.list
    - apt update
    - git reset --hard
    - git clean -fd
    - git checkout $CI_COMMIT_REF_NAME
...
- Dockerfile
FROM debian:stretch
LABEL maintainer="me@example.com"
ENV LANG=C.UTF-8 \
    DEBIAN_FRONTEND=noninteractive
RUN mkdir -p /usr/share/man/man1 \
    && apt-get update \
    && apt-get -qy upgrade \
    && apt-get -qy dist-upgrade \
    && export build_deps=' \
        build-essential \
        ca-certificates \
        fakeroot \
        git-buildpackage \
        lintian \
        pristine-tar' \
    && apt-get -qy install --no-install-recommends $build_deps \
        autodep8 \
        autopkgtest \
        git \
    && apt-get -qy autoremove --purge \
    && apt-get clean \
    && apt-mark auto $build_deps \
    && rm -rf /var/lib/apt/lists/*
ADD overlay /
ENTRYPOINT ["/usr/bin/gitlab-ci-entrypoint"]
- Runner config
[[runners]]
  name = "Shared Docker runner"
  url = "https://git.example.com/"
  token = "secret"
  executor = "docker"
  clone_url = "https://git.example.com"
  environment = ["DOCKER_TLS_CERTDIR=","GIT_SSL_NO_VERIFY=1","DOCKER_DRIVER=overlay2"]
  [runners.custom_build_dir]
  [runners.docker]
    cap_add = ["NET_ADMIN"]
    tls_verify = false
    image = "docker:stable"
    privileged = true
    disable_entrypoint_overwrite = false
    network_mode = "bridge"
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/certs/client", "/var/run/docker.sock:/var/run/docker.sock", "/srv/gitlab-runner/data:/srv/gitlab-runner/data", "/cache", "/opt/aptly/incoming:/publish:rw"]
    extra_hosts = ["git.example.com:192.168.43.18","repos.example.com:172.1.1.1"]
    shm_size = 0
    services = ["docker:dind"]
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
Final
I tried a lot to understand how to get it working again. In tcpdump I could see that the connection tries to get out, but nothing happens. Eventually I realized that the CI works again after a reboot. Then I tested restarting only the Docker service, and voilà, it works again.
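One way to narrow this down would be to capture the host's network state once while CI is broken and once after the restart, then compare the two. The sketch below assumes the cause is on the host side (forwarding or NAT rules that Docker sets up when it starts), which this post does not establish. The function name `snapshot_net_state` and the output file names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: dump host-side network state into a directory so that
# a "broken" and a "working" snapshot can be diffed afterwards.
snapshot_net_state() {
    dir="$1"
    mkdir -p "$dir" || return 1
    # Each command may be missing or fail; record whatever it prints.
    sysctl net.ipv4.ip_forward     > "$dir/ip_forward.txt"      2>&1 || true
    iptables -S FORWARD            > "$dir/filter_forward.txt"  2>&1 || true
    iptables -t nat -S POSTROUTING > "$dir/nat_postrouting.txt" 2>&1 || true
    docker network inspect bridge  > "$dir/docker_bridge.txt"   2>&1 || true
}
```

For example, run `snapshot_net_state /tmp/net-broken` as root while jobs fail, restart Docker, run `snapshot_net_state /tmp/net-working`, and then compare with `diff -ru /tmp/net-broken /tmp/net-working`. Any difference in the forwarding or NAT rules would point to something on the host changing them between restarts.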