The CI/CD Pipeline suddenly fails without any changes done to the repository

Hello!

As the title suggests, starting today the build job in our pipeline has been failing. The failure reproduces even when no changes have been made to the repository, and even when re-running old jobs that previously succeeded (e.g. a job from 2 months ago).

The error we are facing is the following:

#36 DONE 156.8s
#37 preparing layers for inline cache
WARNING: buildx: git was not found in the system. Current commit information was not captured by the build
ERROR: failed to receive status: rpc error: code = Unavailable desc = error reading from server: EOF
Cleaning up project directory and file based variables
ERROR: Job failed: exit code 1

From what we could piece together and test, the issue seems to be related to the runner itself rather than to the Docker image we are building, but we are stuck here.

Does anyone have any ideas what the problem could be?

I’ll leave the stage in question here:

build:
  stage: build
  image: docker
  services:
    - docker:dind
  script:
    - docker image prune -f

    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY

    # Dev image (includes resources for tests, coverage, etc.)
    - DOCKER_BUILDKIT=1 docker build
      --cache-from $IMAGE_REPO_NAMETAG_LATEST
      --target development
      --build-arg BUILDKIT_INLINE_CACHE=1
      --build-arg SECRET_ELASTIC_ENVIRONMENT=$SECRET_ELASTIC_ENVIRONMENT
      --build-arg SECRET_ELASTIC_FILEBEAT_LOGGER_CLOUD_ID=$SECRET_ELASTIC_FILEBEAT_LOGGER_CLOUD_ID
      --build-arg SECRET_ELASTIC_FILEBEAT_LOGGER_CLOUD_AUTH=$SECRET_ELASTIC_FILEBEAT_LOGGER_CLOUD_AUTH
      -t $IMAGE_REPO_NAMETAG -t $IMAGE_REPO_NAMETAG_LATEST .
    - docker push $IMAGE_REPO_NAMETAG && docker push $IMAGE_REPO_NAMETAG_LATEST

As a sidenote:
We are using a shared runner with the following version: gitlab-runner 15.6.0~beta.186.ga889181a


We ran into this, and it appears to be a new issue with the docker base image. We solved it by pinning to a slightly older version until it’s fixed:

docker@sha256:c8bb6fa5388b56304dd770c4bc0478de81ce18540173b1a589178c0d31bfce90

so you’d do:

build:
  stage: build
  image: docker@sha256:c8bb6fa5388b56304dd770c4bc0478de81ce18540173b1a589178c0d31bfce90
  services:
    - docker:dind@sha256:c8bb6fa5388b56304dd770c4bc0478de81ce18540173b1a589178c0d31bfce90

Thank you very much for the reply and insight!

I can confirm that this was our issue as well.

Thank you so much (both of you!), as I have been struggling with this for several hours today!

Do you know if there is some issue reported to the Docker team that we can use to track progress on the fix?

I’m not really sure if this particular issue was reported, but from what I’ve seen, there have been a number of similar ones reported.

As stated above, it’s probably due to the recent update of the docker latest image, which contains some pretty significant changes under the hood (see the Docker Engine 23.0 release notes in the Docker documentation). The best quick fix at the moment is to pin to an older version such as 20.10, which is what I did after hitting the same error. I’ll follow up when I’ve figured out how to make the new version work.
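
A minimal sketch of such a pin for the job from the first post, assuming the 20.10.22 patch release that others in this thread report as working. Both the job image and the dind service are pinned so that the client and the daemon stay on the same version:

```yaml
build:
  stage: build
  # Assumption: 20.10.22 is used here because it is the 20.10 tag reported in this thread.
  image: docker:20.10.22
  services:
    - docker:20.10.22-dind
  script:
    # Placeholder command; keep the original build and push commands here.
    - docker version
```

Pinning by tag is easier to read than pinning by digest, but a tag can be re-pushed upstream, whereas a digest always refers to the same image.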

Thank you @gkinsman

For what it’s worth, the message WARNING: buildx: git was not found in the system. Current commit information was not captured by the build seems to be intentional, and might be resolved as part of the changes tracked in "Git warning being displayed when building in a git directory with no commits yet" (Issue #1587, docker/buildx on GitHub). Nevertheless, I found that simply adding --load to my build command let the pipeline complete as intended, with no other changes, on the latest docker:dind image (where buildx is the default builder for Linux). The following results in a single image with 2 tags:

docker build -f docker/Dockerfile -t $REPO:$MYTAG1 -t $REPO:$MYTAG2 --load .
docker push --all-tags $MYIMAGE

An alternative looks slightly different and is more geared towards multi-arch builds, since it results in the creation of multiple artifacts. The following results in 3 artifacts: 2 images and an image index.

docker build -f docker/Dockerfile -t $REPO:$MYTAG1 -t $REPO:$MYTAG2 --push .

In both cases the aforementioned warning is still displayed, but it seems to be harmless, at least in my case.
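
Applied to the job from the first post, the --load variant might look roughly like the sketch below. It reuses the variable names from that job and drops the cache and build-arg flags for brevity; it is untested against that project.

```yaml
build:
  stage: build
  image: docker
  services:
    - docker:dind
  script:
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
    # --load exports the build result into the local image store so it can be pushed afterwards.
    - docker build --load -t $IMAGE_REPO_NAMETAG -t $IMAGE_REPO_NAMETAG_LATEST .
    - docker push $IMAGE_REPO_NAMETAG && docker push $IMAGE_REPO_NAMETAG_LATEST
```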

In our case, the warning and subsequent error prevented us from actually building our image. While this was fixed with the solution provided by @gkinsman, I will try yours as well and report back with what I find.

Thank you for the insight @andromeda306


I fixed my GitLab build issue (preparing layers for inline cache, followed by an unexpected EOF) by using the previous docker image docker:20.10.22-dind instead of docker:dind. Thanks for the tip to look at previous docker images!


Thank you so much, that does solve the issue, for now at least!

Hello,

FYI, the error comes from Docker 23.0.0.
In my case, our runner, which is set up with Docker Machine, installed this version of Docker by default.
The solution was to remove BUILDKIT_INLINE_CACHE=1.
The related GitHub issue seems to be closed, though, so it should have been fixed.
I still have to check that.
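
A sketch of what removing the inline cache argument could look like in the job from the first post. This is an assumption, since the job used on this runner was not shared, and the secret build args are omitted for brevity.

```yaml
build:
  stage: build
  image: docker
  services:
    - docker:dind
  script:
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
    # Same build as before, minus --build-arg BUILDKIT_INLINE_CACHE=1.
    - DOCKER_BUILDKIT=1 docker build
      --cache-from $IMAGE_REPO_NAMETAG_LATEST
      --target development
      -t $IMAGE_REPO_NAMETAG -t $IMAGE_REPO_NAMETAG_LATEST .
    - docker push $IMAGE_REPO_NAMETAG && docker push $IMAGE_REPO_NAMETAG_LATEST
```

Without inline cache metadata in the pushed image, --cache-from is unlikely to produce cache hits under BuildKit, so builds may get slower in exchange for completing.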

@gkinsman, @andromeda306 thank you! I fixed the GitLab build error, but I cannot fix web:prod:deploy.

We are running a docker build with gitlab-ci. It works well locally, but we are moving our runners to AWS on Ubuntu 20 and it appears that DNS is not working in the containers. Any ideas are greatly appreciated. Perhaps there is a way to statically set the git repo DNS address globally in the config? I am not finding any variables for this in the docs.

Regards

I have simply updated my image build job with this line as a quick fix for now until the upstream docker image is either working better or there is a more standard documented way of doing builds going forward.

image: docker:20.10.22-dind


I’m getting the same error with the docker:dind tag.

Using docker image sha256:c365741dcfc2b1d1fd3dbd7ded480a147a334836444f0335456f3e7e1b363505 for docker with digest docker@sha256:e4d776dd1e0580dfb670559d887300aa08b53b8a59f5df2d4eaace936ef4d0e9

Environment Details

  • Self-hosted GitLab version 15.9
  • Using Docker executor on GitLab Runner VM
  • GitLab Runner is running Ubuntu 22.04 LTS
  • GitLab CI/CD pipeline file configured to use docker:dind under services: section

I think this problem is now recurring?
How do I find an older version of docker to pin to? Where is the repo for the commits?


We’re seeing what might be a similar/related problem.

ERROR: failed to solve: process "/bin/sh -c unzip awscliv2.zip && ./aws/install" did not complete successfully: exit code: 4294967295

We don’t use gitlab CI though, we use Drone.

We’re having steps in a docker build inside docker-in-docker fail with a 0xffffffff error.

The build agents are running Docker version 24.0.1, build 6802122, and pulling whatever the latest dind resolves to.

The issue started about two weeks ago for us. Is it the same for you @keshav.c ?

I’ve tried setting different versions of the dind image, which didn’t help, and updating our build agents, which also didn’t help.

For other people having the same issue as us, turning off docker buildkit does help (set DOCKER_BUILDKIT: 0), but that causes other issues for us.
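
For Drone, setting the variable on the build step might look like the fragment below. The step name, image, and build command are placeholders, and the Docker daemon wiring (dind service, volumes) is omitted; only the environment entry is the point.

```yaml
steps:
  - name: build              # hypothetical step name
    image: docker            # assumption: an image that ships the Docker CLI
    environment:
      # Quoted so the value is passed as the string "0"; this turns BuildKit off for docker build.
      DOCKER_BUILDKIT: "0"
    commands:
      # Placeholder build command and image name.
      - docker build -t example/image:latest .
```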

Looks like it might be fixed in 24.0.2