DIND Containers not accepting connections - shared runner

cmeurer · February 15, 2020, 12:19am

What are you seeing, and how does it differ from what you expect to see?

All of our pipelines are failing because one of the containers we run is refusing to accept connections. No changes were made, and previously working pipelines are now failing.

The best example is pipeline #118037545 (project #6672684), which has 3 jobs in the testing stage. The first ran successfully as job #438058283 and was then manually deployed without issue. After we began seeing failures on other pipelines, we re-ran this same, previously successful job, and it failed with the same errors as our other pipelines. We waited some time and re-ran it again, without success. See attached image.

What troubleshooting steps have you already taken? Can you link to any docs or other resources so we know where you have been?

I’ve tried updating the docker image from 19.03.0 to docker:19.03.1, and the dind service from docker:19.03.0-dind to docker:19.03.5-dind, as suggested in the recent threads about hanging docker issues. Neither change made a difference.

danjac · February 15, 2020, 2:12pm

Seeing what might be a similar issue here: tests have been failing for the past 24 hours because they are unable to connect to the postgres container.

nathan.f77 · February 17, 2020, 6:34am

This has also just started happening for me (on GitLab’s hosted CI).

My Rails app tests suddenly started failing a few days ago with the following error:

rake aborted!
 PG::ConnectionBad: could not translate host name "postgres" to address: Name or service not known

I haven’t changed my .gitlab-ci.yml for a long time. Relevant config:

rspec:
  stage: tests
  services:
    - postgres:latest
    - redis:latest
  ...

nathan.f77 · February 17, 2020, 7:11am

This was the issue: https://github.com/docker-library/postgres/issues/681

The official postgres Docker image was updated with stricter security defaults (the container no longer starts without a superuser password). This was a breaking change that caused the CI builds to fail.

My workaround is to use an older tag for the service image:

  services:
    - postgres:9.6.16-alpine
    - redis:latest

danjac · February 17, 2020, 8:18am

Yes, that was the same issue. If you do not wish to downgrade your image, you can also set the environment variable POSTGRES_HOST_AUTH_METHOD=trust or set an explicit password with POSTGRES_PASSWORD.
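
For anyone landing here later, a minimal sketch of what that could look like in .gitlab-ci.yml, reusing the rspec job from the config posted above. It assumes variables defined on the job are passed through to the service containers; the password value is a made-up placeholder, not something from this thread.

```yaml
rspec:
  stage: tests
  services:
    - postgres:latest
    - redis:latest
  variables:
    # Option 1: give the postgres service an explicit superuser password.
    # "example-ci-password" is a placeholder; pick your own value.
    POSTGRES_PASSWORD: example-ci-password
    # Option 2 (instead of option 1): allow connections without a password.
    # POSTGRES_HOST_AUTH_METHOD: trust
  ...
```

With option 1 the app's database config has to use the same password. Option 2 disables password checks entirely, so it is probably only reasonable for throwaway CI databases.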
