GitLab Runner 12.10 - network per build - healthcheck for services times out

Hi, I’m having trouble with the healthcheck for services in GitLab CI when using Network per Build. I have a bunch of services which start just fine and work as expected, but the healthcheck always ends in a timeout, which adds significant time to the build duration. This worked previously, when I was not using the Network per Build feature. I have all the appropriate ports exposed in my Dockerfiles (I cannot share them as they are on my company’s private Artifactory).

I’m seeing the following logs when running a job with defined services (again, the services themselves work as expected, but waiting for the healthcheck prolongs the overall build time)

Version information

  • GitLab: 12.10 - self managed
  • Runner: 12.10 - self managed
  • using Docker with socket mounting

Relevant config files

relevant part of .gitlab-ci.yml (redacted sensitive stuff)

component tests:
  tags:
    ...
  image: **redacted**/php72-cli:2.1.0
  services:
    - name: **redacted**
      alias: **redacted**
    - .... other services
  variables:
    ...
  stage: test
  script:
    - ...

Example Dockerfile

....
    EXPOSE 8091
....

Troubleshooting

I’ve double-checked my exposed ports and checked the documentation on healthchecks and services; there is not much else I can do, I guess.

If anybody has any ideas or clues, it would be much appreciated. Thanks very much in advance :slight_smile:

Hi,

this is an area where I haven’t been before, so please bear with me if my guesses are wrong :wink:

It sounds like a race condition where the service comes up but the health check does not detect it soon enough. Or there is a problem with the exposed ports. Is there any chance that your Dockerfile exposes multiple ports? That could point to



Cheers,
Michael

Hey Michael, thanks a lot for your response.
For exposed ports, I’ve checked with docker inspect and there is only one exposed port for the service.

As for the race condition, I think that is unlikely, because when I remove FF_NETWORK_PER_BUILD: 1 (relevant docs here) from the variables in my .gitlab-ci.yml, the problem goes away. Please see the screenshot:


Unfortunately, without that flag the services do not see each other, which is a problem for me because I’m trying to run integration tests.
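
For reference, the toggle in question is set as a job variable. A minimal sketch of just that fragment, based on the redacted job from the first post (the image, services, tags, other variables and script steps are left out here, exactly as they are elided in the original):

```yaml
component tests:
  variables:
    # As reported in this thread: with this line present, the service
    # healthcheck ends in a timeout; with it removed, the healthcheck
    # passes but the services cannot reach each other.
    FF_NETWORK_PER_BUILD: 1
```

This is only the fragment that changes between the two behaviours, not a complete job definition.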

Do you have any other ideas?

Thanks again for your reply :slight_smile:

Hi,

I’ve asked our engineers - maybe you have hit a bug here which needs to be investigated. Your analysis of enabling the feature flag and using network per build is a good one; it narrows things down to looking into the health checks again … maybe Docker versions introduce trouble here.

That being said, please collect all the details in here. Best would be if you can create a reproducible environment and share it in a new bug report.

Thanks & cheers,
Michael

Hey Michael,
Thanks for getting back to me so quickly. I’ve filed bug report 25660.

I’ve also included a public repo with two pipelines which demonstrate the issue.

I hope it is enough, and thanks for your support :slight_smile: