We’ve recently moved from an Omnibus installation on Docker to the Kubernetes Helm chart, and so far everything is working great, save for one issue: when we run a CI job using a custom image from our own internal registry (also part of the Helm chart), we get the following error:
ERROR: Job failed: image pull failed: rpc error: code = Unknown desc = Error response from daemon: error unmarshalling content: invalid character '<' looking for beginning of value
We use split-horizon DNS so that requests from within our cluster go straight to our load balancer IPs while external requests are routed through an IDP. Having fought through an issue earlier today with a similar error, I figured the request was being routed incorrectly, hitting our IDP, and getting some HTML back, hence the '<'. But I’m stumped after trying the following things:
- Pulling the image as a step in the CI job without using a custom image (works)
- Pulling the image in a before_script step without using a custom image (works)
- Checking the output of nslookup for our registry URL within the CI job (correct value)
- Ensuring that $CI_REGISTRY points to the correct location (it does)
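For anyone wanting to reproduce the checks above, here is a rough sketch of how they could be scripted. It assumes curl and nslookup are available in the job image, and the helper names classify_body and check_registry are made up for illustration. The /v2/ path is the base endpoint of the Docker Registry HTTP API, which should answer with JSON (typically an error body when unauthenticated) rather than HTML.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of GitLab or Docker.

# Classify a response body by its first non-whitespace character.
# An IDP login page would start with '<'; a registry API reply starts with '{'.
classify_body() {
  first_char=$(printf '%s' "$1" | tr -d '[:space:]' | cut -c1)
  case "$first_char" in
    '<') echo "html" ;;
    '{'|'[') echo "json" ;;
    *) echo "unknown" ;;
  esac
}

# Resolve the registry host and report what kind of body /v2/ returns.
# Assumes the argument is a bare host, optionally with a port,
# such as the value of $CI_REGISTRY.
check_registry() {
  host="$1"
  nslookup "${host%%:*}"          # strip any :port before resolving
  body=$(curl -s "https://${host}/v2/")
  echo "registry response looks like: $(classify_body "$body")"
}
```

Calling check_registry "$CI_REGISTRY" from a script step prints the DNS answer and whether the endpoint returned something HTML-like or JSON-like. The same function can be run from any other place a pull is attempted, to compare what each one sees.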
As far as I can tell, from every perspective the runner takes, it is able to resolve the registry and, in most cases, pull the image correctly. But when the image is specified as the image for a CI job to use, we get the above error. If I use an explicit URL instead of $CI_REGISTRY, I get a slightly different error:
ERROR: Job failed: image pull failed: Back-off pulling image "our-registry.our-domain.com/repo/our-runner:latest"
I’m using the simplest of CI jobs to test this:
    image: docker:latest

    services:
      - docker:dind

    variables:
      DOCKER_TLS_CERTDIR: ""
      DOCKER_HOST: tcp://localhost:2375

    stages:
      - test

    test_runner:
      stage: test
      image: our-registry.our-domain.com/repo/our-runner:latest
      script:
        - env
We’re running GitLab 12.10.2-ee on Kubernetes 1.18.2 via Helm v3.
I’m not expecting a solution so much as any pointers on how to debug this. I haven’t been able to get very meaningful output from the Docker client, and the GitLab logs haven’t shown me much else.
Any help is greatly appreciated!