Docker:dind and Docker Compose exposed ports not accessible

Hi all,

I’ve run into an odd situation using the shared Gitlab Runner instances on gitlab.com. I’m using docker:dind and Docker Compose, and exposed ports on the ‘docker’ host are not accessible.

Log output:
e[0KRunning with gitlab-runner 12.10.0-rc2 (6c8c540f)
e[0;me[0K on docker-auto-scale 0277ea0f
e[0;msection_start:1587861022:prepare_executor
e[0Ke[0Ke[36;1mPreparing the “docker+machine” executore[0;m
e[0;me[0KUsing Docker executor with image registry.gitlab.com/**/docker/python-test-runner …
e[0;me[0KStarting service docker:dind …
e[0;me[0KPulling docker image docker:dind …
e[0;me[0KUsing docker image sha256:7799c84f00cd5b6d6c537ef3e0211866e35f09cfcfca6f979c9df1098e55b365 for docker:dind …
e[0;me[0KWaiting for services to be up and running…
e[0;m
e[0;33m
WARNING:e[0;m Service runner-0277ea0f-project-12618051-concurrent-0-7b69c8c694017fec-docker-0 probably didn’t start properly.

Health check error:
service "runner-0277ea0f-project-12618051-concurrent-0-7b69c8c694017fec-docker-0-wait-for-service" timeout

Health check container logs:


Service container logs:
2020-04-26T00:30:32.747443503Z time="2020-04-26T00:30:32.747166212Z" level=info msg="Starting up"
2020-04-26T00:30:32.750824283Z time="2020-04-26T00:30:32.750446832Z" level=warning msg="could not change group /var/run/docker.sock to docker: group docker not found"
2020-04-26T00:30:32.750842982Z time="2020-04-26T00:30:32.750650525Z" level=warning msg="[!] DON'T BIND ON ANY IP ADDRESS WITHOUT setting --tlsverify IF YOU DON'T KNOW WHAT YOU'RE DOING [!]"
2020-04-26T00:30:32.754394707Z time="2020-04-26T00:30:32.753631451Z" level=info msg="libcontainerd: started new containerd process" pid=18
2020-04-26T00:30:32.754410918Z time="2020-04-26T00:30:32.753701664Z" level=info msg="parsed scheme: \"unix\"" module=grpc
2020-04-26T00:30:32.754415027Z time="2020-04-26T00:30:32.753711552Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
2020-04-26T00:30:32.754418810Z time="2020-04-26T00:30:32.753740075Z" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock 0  <nil>}] <nil>}" module=grpc
2020-04-26T00:30:32.754423140Z time="2020-04-26T00:30:32.753751798Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
2020-04-26T00:30:32.811454767Z time="2020-04-26T00:30:32.810674880Z" level=info msg="starting containerd" revision=7ad184331fa3e55e52b890ea95e65ba581ae3429 version=v1.2.13 
2020-04-26T00:30:32.811471862Z time="2020-04-26T00:30:32.811010520Z" level=info msg="loading plugin "io.containerd.content.v1.content"..." type=io.containerd.content.v1 
2020-04-26T00:30:32.811476122Z time="2020-04-26T00:30:32.811084545Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.btrfs"..." type=io.containerd.snapshotter.v1 

2020-04-26T00:30:32.811480309Z time=“2020-04-26T00:30:32.811265441Z” level=warning msg=“failed to load plugin io.containerd.snapshotter.v1.btrfs” error=“path /var/lib/docker/containerd/daemon/ io.containerd.snapshotter.v1.btrfs must be a btrfs filesystem to be used with the btrfs snapshotter”
2020-04-26T00:30:32.811486703Z time=“2020-04-26T00:30:32.811276737Z” level=info msg=“loading plugin “io.containerd.snapshotter.v1.aufs”…” type=io.containerd.snapshotter.v1
2020-04-26T00:30:32.823987706Z time=“2020-04-26T00:30:32.823026680Z” level=warning msg=“failed to load plugin io.containerd.snapshotter.v1.aufs” error=“modprobe aufs failed: “ip: can’t find device ‘aufs’\nmodprobe: can’t change directory to ‘/lib/modules’: No such file or directory\n”: exit status 1”
2020-04-26T00:30:32.824004814Z time=“2020-04-26T00:30:32.823066747Z” level=info msg=“loading plugin “io.containerd.snapshotter.v1.native”…” type=io.containerd.snapshotter.v1
2020-04-26T00:30:32.824009142Z time=“2020-04-26T00:30:32.823186422Z” level=info msg=“loading plugin “io.containerd.snapshotter.v1.overlayfs”…” type=io.containerd.snapshotter.v1
2020-04-26T00:30:32.824012850Z time=“2020-04-26T00:30:32.823402377Z” level=info msg=“loading plugin “io.containerd.snapshotter.v1.zfs”…” type=io.containerd.snapshotter.v1
2020-04-26T00:30:32.824024384Z time=“2020-04-26T00:30:32.823660583Z” level=info msg=“skip loading plugin “io.containerd.snapshotter.v1.zfs”…” type=io.containerd.snapshotter.v1
2020-04-26T00:30:32.824027900Z time=“2020-04-26T00:30:32.823669977Z” level=info msg=“loading plugin “io.containerd.metadata.v1.bolt”…” type=io.containerd.metadata.v1
2020-04-26T00:30:32.824037225Z time=“2020-04-26T00:30:32.823726463Z” level=warning msg=“could not use snapshotter btrfs in metadata plugin” error=“path /var/lib/docker/containerd/daemon/ io.containerd.snapshotter.v1.btrfs must be a btrfs filesystem to be used with the btrfs snapshotter”
2020-04-26T00:30:32.824041357Z time=“2020-04-26T00:30:32.823734236Z” level=warning msg=“could not use snapshotter aufs in metadata plugin” error=“modprobe aufs failed: “ip: can’t find device ‘aufs’\nmodprobe: can’t change directory to ‘/lib/modules’: No such file or directory\n”: exit status 1”
2020-04-26T00:30:32.824045308Z time=“2020-04-26T00:30:32.823741698Z” level=warning msg=“could not use snapshotter zfs in metadata plugin” error=“path /var/lib/docker/containerd/daemon/ io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter: skip plugin”
2020-04-26T00:30:32.835421417Z time=“2020-04-26T00:30:32.831264729Z” level=info msg=“loading plugin “io.containerd.differ.v1.walking”…” type=io.containerd.differ.v1
2020-04-26T00:30:32.835440111Z time=“2020-04-26T00:30:32.831313431Z” level=info msg=“loading plugin “io.containerd.gc.v1.scheduler”…” type=io.containerd.gc.v1
2020-04-26T00:30:32.835444343Z time=“2020-04-26T00:30:32.831343517Z” level=info msg=“loading plugin “io.containerd.service.v1.containers-service”…” type=io.containerd.service.v1
2020-04-26T00:30:32.835455827Z time=“2020-04-26T00:30:32.831355557Z” level=info msg=“loading plugin “io.containerd.service.v1.content-service”…” type=io.containerd.service.v1
2020-04-26T00:30:32.835459514Z time=“2020-04-26T00:30:32.831366309Z” level=info msg=“loading plugin “io.containerd.service.v1.diff-service”…” type=io.containerd.service.v1
2020-04-26T00:30:32.835462936Z time=“2020-04-26T00:30:32.831377403Z” level=info msg=“loading plugin “io.containerd.service.v1.images-service”…” type=io.containerd.service.v1
2020-04-26T00:30:32.835466407Z time=“2020-04-26T00:30:32.831401504Z” level=info msg=“loading plugin “io.containerd.service.v1.leases-service”…” type=io.containerd.service.v1
2020-04-26T00:30:32.835469603Z time=“2020-04-26T00:30:32.831412637Z” level=info msg=“loading plugin “io.containerd.service.v1.namespaces-service”…” type=io.containerd.service.v1
2020-04-26T00:30:32.835472974Z time=“2020-04-26T00:30:32.831423097Z” level=info msg=“loading plugin “io.containerd.service.v1.snapshots-service”…” type=io.containerd.service.v1
2020-04-26T00:30:32.835476262Z time=“2020-04-26T00:30:32.831435111Z” level=info msg=“loading plugin “io.containerd.runtime.v1.linux”…” type=io.containerd.runtime.v1
2020-04-26T00:30:32.835479550Z time=“2020-04-26T00:30:32.831694952Z” level=info msg=“loading plugin “io.containerd.runtime.v2.task”…” type=io.containerd.runtime.v2
2020-04-26T00:30:32.835482827Z time=“2020-04-26T00:30:32.831829211Z” level=info msg=“loading plugin “io.containerd.monitor.v1.cgroups”…” type=io.containerd.monitor.v1
2020-04-26T00:30:32.835486050Z time=“2020-04-26T00:30:32.832197565Z” level=info msg=“loading plugin “io.containerd.service.v1.tasks-service”…” type=io.containerd.service.v1
2020-04-26T00:30:32.835489291Z time=“2020-04-26T00:30:32.832223647Z” level=info msg=“loading plugin “io.containerd.internal.v1.restart”…” type=io.containerd.internal.v1
2020-04-26T00:30:32.835501734Z time=“2020-04-26T00:30:32.832273184Z” level=info msg=“loading plugin “io.containerd.grpc.v1.containers”…” type=io.containerd.grpc.v1
2020-04-26T00:30:32.835505394Z time=“2020-04-26T00:30:32.832286377Z” level=info msg=“loading plugin “io.containerd.grpc.v1.content”…” type=io.containerd.grpc.v1
2020-04-26T00:30:32.835508625Z time=“2020-04-26T00:30:32.832319316Z” level=info msg=“loading plugin “io.containerd.grpc.v1.diff”…” type=io.containerd.grpc.v1
2020-04-26T00:30:32.835511820Z time=“2020-04-26T00:30:32.832330712Z” level=info msg=“loading plugin “io.containerd.grpc.v1.events”…” type=io.containerd.grpc.v1
2020-04-26T00:30:32.835515056Z time=“2020-04-26T00:30:32.832340683Z” level=info msg=“loading plugin “io.containerd.grpc.v1.healthcheck”…” type=io.containerd.grpc.v1
2020-04-26T00:30:32.835518291Z time=“2020-04-26T00:30:32.832351089Z” level=info msg=“loading plugin “io.containerd.grpc.v1.images”…” type=io.containerd.grpc.v1
2020-04-26T00:30:32.835522043Z time=“2020-04-26T00:30:32.832360317Z” level=info msg=“loading plugin “io.containerd.grpc.v1.leases”…” type=io.containerd.grpc.v1
2020-04-26T00:30:32.835525277Z time=“2020-04-26T00:30:32.832369746Z” level=info msg=“loading plugin “io.containerd.grpc.v1.namespaces”…” type=io.containerd.grpc.v1
2020-04-26T00:30:32.835528550Z time=“2020-04-26T00:30:32.832379487Z” level=info msg=“loading plugin “io.containerd.internal.v1.opt”…” type=io.containerd.internal.v1
2020-04-26T00:30:32.835531765Z time=“2020-04-26T00:30:32.832727410Z” level=info msg=“loading plugin “io.containerd.grpc.v1.snapshots”…” type=io.containerd.grpc.v1
2020-04-26T00:30:32.835534898Z time=“2020-04-26T00:30:32.832742993Z” level=info msg=“loading plugin “io.containerd.grpc.v1.tasks”…” type=io.containerd.grpc.v1
2020-04-26T00:30:32.835537994Z time=“2020-04-26T00:30:32.832753724Z” level=info msg=“loading plugin “io.containerd.grpc.v1.version”…” type=io.containerd.grpc.v1
2020-04-26T00:30:32.835541244Z time=“2020-04-26T00:30:32.832762999Z” level=info msg=“loading plugin “io.containerd.grpc.v1.introspection”…” type=io.containerd.grpc.v1
2020-04-26T00:30:32.835544527Z time=“2020-04-26T00:30:32.833015930Z” level=info msg=serving… address="/var/run/docker/containerd/containerd-debug.sock"
2020-04-26T00:30:32.835547873Z time=“2020-04-26T00:30:32.833088242Z” level=info msg=serving… address="/var/run/docker/containerd/containerd.sock"
2020-04-26T00:30:32.835551000Z time=“2020-04-26T00:30:32.833098116Z” level=info msg=“containerd successfully booted in 0.024077s”
2020-04-26T00:30:32.841391042Z time=“2020-04-26T00:30:32.840984589Z” level=info msg=“Setting the storage driver from the $DOCKER_DRIVER environment variable (overlay2)”
2020-04-26T00:30:32.841409708Z time=“2020-04-26T00:30:32.841208281Z” level=info msg=“parsed scheme: “unix”” module=grpc
2020-04-26T00:30:32.841413939Z time=“2020-04-26T00:30:32.841221388Z” level=info msg=“scheme “unix” not registered, fallback to default scheme” module=grpc
2020-04-26T00:30:32.841417540Z time=“2020-04-26T00:30:32.841238884Z” level=info msg=“ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock 0 }] }” module=grpc
2020-04-26T00:30:32.841429324Z time=“2020-04-26T00:30:32.841248343Z” level=info msg=“ClientConn switching balancer to “pick_first”” module=grpc
2020-04-26T00:30:32.856086690Z time=“2020-04-26T00:30:32.855419075Z” level=info msg=“parsed scheme: “unix”” module=grpc
2020-04-26T00:30:32.856101575Z time=“2020-04-26T00:30:32.855441097Z” level=info msg=“scheme “unix” not registered, fallback to default scheme” module=grpc
2020-04-26T00:30:32.856117228Z time=“2020-04-26T00:30:32.855459197Z” level=info msg=“ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock 0 }] }” module=grpc
2020-04-26T00:30:32.856122223Z time=“2020-04-26T00:30:32.855482352Z” level=info msg=“ClientConn switching balancer to “pick_first”” module=grpc
2020-04-26T00:30:32.903905939Z time=“2020-04-26T00:30:32.902496194Z” level=info msg=“Loading containers: start.”
2020-04-26T00:30:32.921821961Z time=“2020-04-26T00:30:32.919897938Z” level=warning msg=“Running modprobe bridge br_netfilter failed with message: ip: can’t find device ‘bridge’\nbridge 167936 1 br_netfilter\nstp 16384 1 bridge\nllc 16384 2 bridge,stp\nip: can’t find device ‘br_netfilter’\nbr_netfilter 24576 0 \nbridge 167936 1 br_netfilter\nmodprobe: can’t change directory to ‘/lib/modules’: No such file or directory\n, error: exit status 1”
2020-04-26T00:30:33.010510512Z time=“2020-04-26T00:30:33.009105579Z” level=info msg=“Default bridge (docker0) is assigned with an IP address 172.18.0.0/16. Daemon option --bip can be used to set a preferred IP address”
2020-04-26T00:30:33.090405090Z time=“2020-04-26T00:30:33.090086957Z” level=info msg=“Loading containers: done.”
2020-04-26T00:30:33.116443197Z time=“2020-04-26T00:30:33.116038149Z” level=info msg=“Docker daemon” commit=afacb8b7f0 graphdriver(s)=overlay2 version=19.03.8
2020-04-26T00:30:33.116461134Z time=“2020-04-26T00:30:33.116184269Z” level=info msg=“Daemon has completed initialization”
2020-04-26T00:30:33.146837604Z time=“2020-04-26T00:30:33.146409404Z” level=info msg=“API listen on [::]:2375”
2020-04-26T00:30:33.146867006Z time=“2020-04-26T00:30:33.146491075Z” level=info msg=“API listen on /var/run/docker.sock”

e[0;33m*********e[0;m

e[0KAuthenticating with credentials from job payload (GitLab Registry)
e[0;me[0KPulling docker image registry.gitlab.com/*****/docker/python-test-runner ...
e[0;me[0KUsing docker image sha256:152feb96d8efebe2a28bed404bdbb0242dcd237a501a3d543bbf305db6129a75 for registry.gitlab.com/*****/docker/python-test-runner ...
e[0;msection_end:1587861113:prepare_executor
e[0Ksection_start:1587861113:prepare_script
e[0Ke[0Ke[36;1mPreparing environmente[0;m
e[0;mRunning on runner-0277ea0f-project-12618051-concurrent-0 via runner-0277ea0f-srm-1587860695-399a0912...
section_end:1587861118:prepare_script
e[0Ksection_start:1587861118:get_sources
e[0Ke[0Ke[36;1mGetting source from Git repositorye[0;m
e[0;me[32;1m$ eval "$CI_PRE_CLONE_SCRIPT"e[0;m
e[32;1mFetching changes...e[0;m
Initialized empty Git repository in /builds/*****/.git/
e[32;1mCreated fresh repository.e[0;m

... snipped ...

e[32;1mSkipping Git submodules setupe[0;m
section_end:1587861120:get_sources
e[0Ksection_start:1587861120:restore_cache
e[0Ke[0Ke[36;1mRestoring cachee[0;m
e[0;msection_end:1587861121:restore_cache
e[0Ksection_start:1587861121:download_artifacts
e[0Ke[0Ke[36;1mDownloading artifactse[0;m
e[0;msection_end:1587861123:download_artifacts
e[0Ksection_start:1587861123:build_script
e[0Ke[0Ke[36;1mRunning before_script and scripte[0;m
e[0;me[0KAuthenticating with credentials from job payload (GitLab Registry)
e[0;me[32;1m$ ci/test.she[0;m

... snipped ...

e[1;34mBootstrapping test environmente[0m
Login Succeeded
Client:
 Debug Mode: false

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: 19.03.8
 Storage Driver: overlay2
  Backing Filesystem: <unknown>
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.19.78-coreos
 Operating System: Alpine Linux v3.11 (containerized)
 OSType: linux
 Architecture: x86_64
 CPUs: 1
 Total Memory: 3.607GiB
 Name: 4593f2c2a517
 ID: LGTA:FARS:JHUL:KU5F:W6H2:DXVW:ABEU:V3Z6:BE4M:WF3D:T2EG:6ER3
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine

WARNING: API is accessible on http://0.0.0.0:2375 without encryption.
         Access to the remote API is equivalent to root access on the host. Refer
         to the 'Docker daemon attack surface' section in the documentation for
         more information: https://docs.docker.com/engine/security/security/#docker-daemon-attack-surface
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

... snipped ...

Environment:

... snipped ...

export DB_URL=postgresql://*****:*****@docker:32768/app

... snipped ...

E       psycopg2.OperationalError: could not connect to server: Connection refused
E       	Is the server running on host "docker" (172.17.0.3) and accepting
E       	TCP/IP connections on port 32768?

Relevant portion of the .gitlab-ci.yml:
stages:
- test

.dind-test: &dind-test
  image: registry.gitlab.com/*****/docker/python-test-runner
  services:
    - docker:dind
  variables:
    # Used by bootstrap.sh
    TEST_HOST: docker
    # Needed to connect to the Docker daemon
    DOCKER_HOST: tcp://docker:2375/
    # Improves performance of the dind service
    DOCKER_DRIVER: overlay2
    # Disable TLS for Docker Compose
    DOCKER_TLS_CERTDIR: ""

######
# test
######

test:
  <<: *dind-test
  stage: test
  script:
    - ci/test.sh
  after_script:
    - ci/bootstrap.sh -ds
  artifacts:
    paths:
      - build/coverage
  coverage: '/\d+%$/'

test.sh calls the bootstrap.sh script which calls these Docker Compose commands:
docker-compose pull
docker-compose build --no-cache
docker-compose up --detach

And here is a snippet from the docker-compose.yml file:
app-db:
image: docker.*****.net/ddl/app
ports:
- 5432

This was working as of about 2-3 weeks ago. I can run the same code on my machine using Gitlab Runner 12.10.1 and 12.10.0 without problems.

Does anyone see something here that I don’t? I see that gitlab.com is using 12.10.0rc2, could this be a bug in that build?

Thank you for your help!!