I’m running GitLab 14.10.0-ee with GitLab Runner 14.8.2 on AWS. The runner is configured to spawn a new AWS EC2 instance to run each CI/CD job. This worked until yesterday, but today it stopped. I haven’t made any configuration changes and haven’t run any updates, so I’m not sure what the cause could be.
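For context, the runner uses the docker+machine executor with the amazonec2 docker-machine driver. The relevant part of config.toml looks roughly like this (the values below are placeholders, not my exact settings):

[[runners]]
  name = "aws-autoscale-runner"
  url = "https://gitlab.example.com/"
  token = "REDACTED"
  executor = "docker+machine"
  [runners.docker]
    image = "alpine:latest"
  [runners.machine]
    # IdleCount = 0 means worker instances are created on demand, per job
    IdleCount = 0
    MachineDriver = "amazonec2"
    MachineName = "gitlab-docker-machine-%s"
    MachineOptions = [
      "amazonec2-region=eu-west-1",
      "amazonec2-instance-type=t3.medium",
      "amazonec2-vpc-id=vpc-xxxxxxxx",
      "amazonec2-subnet-id=subnet-xxxxxxxx",
      "amazonec2-security-group=gitlab-runner-workers",
    ]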
In the GitLab web interface, the job status is shown as:
Running with gitlab-runner 14.8.2 (c6e7e194) on
Preparing the “docker+machine” executor 10:13
ERROR: Preparation failed: exit status 1
Will be retried in 3s …
ERROR: Preparation failed: exit status 1
Will be retried in 3s …
ERROR: Preparation failed: exit status 1
Will be retried in 3s …
ERROR: Job failed (system failure): exit status 1
In the AWS console, I can see that a new EC2 instance is spawned (as it should be). I also ran netcat against port 22 on the newly spawned EC2 machine to make sure that SSH is available, which it is.
Running journalctl -f on the GitLab Runner host shows the following:
Apr 26 13:01:55 <hostname> gitlab-runner[609]: Running pre-create checks... driver=amazonec2 name=runner-yyyyyy-gitlab-docker-machine-9999999-xxxxx operation=create
Apr 26 13:01:55 <hostname> gitlab-runner[609]: Creating machine... driver=amazonec2 name=runner-yyyyyy-gitlab-docker-machine-9999999-xxxxx operation=create
Apr 26 13:01:55 <hostname> gitlab-runner[609]: (runner-yyyyyy-gitlab-docker-machine-9999999-xxxxx) Launching instance... driver=amazonec2 name=runner-yyyyyy-gitlab-docker-machine-9999999-912d6a0c operation=create
Apr 26 13:01:58 <hostname> gitlab-runner[609]: IdleCount is set to 0 so the machine will be created on demand in job context creating=1 idle=0 idleCount=0 idleCountMin=0 idleScaleFactor=0 maxMachineCreate=0 maxMachines=1 removing=0 runner=yyyyyy total=1 used=0
Apr 26 13:02:00 <hostname> gitlab-runner[609]: Waiting for machine to be running, this may take a few minutes... driver=amazonec2 name=runner-yyyyyy-gitlab-docker-machine-9999999-xxxxx operation=create
Apr 26 13:02:01 <hostname> gitlab-runner[609]: Detecting operating system of created instance... driver=amazonec2 name=runner-yyyyyy-gitlab-docker-machine-9999999-xxxxx operation=create
Apr 26 13:02:01 <hostname> gitlab-runner[609]: Waiting for SSH to be available... driver=amazonec2 name=runner-yyyyyy-gitlab-docker-machine-9999999-xxxxx operation=create
Apr 26 13:02:20 <hostname> gitlab-runner[609]: Detecting the provisioner... driver=amazonec2 name=runner-yyyyyy-gitlab-docker-machine-9999999-xxxxx operation=create
Apr 26 13:02:21 <hostname> gitlab-runner[609]: Provisioning with ubuntu(systemd)... driver=amazonec2 name=runner-yyyyyy-gitlab-docker-machine-9999999-xxxxx operation=create
Apr 26 13:02:35 <hostname> gitlab-runner[609]: Installing Docker... driver=amazonec2 name=runner-yyyyyy-gitlab-docker-machine-9999999-xxxxx operation=create
Apr 26 13:02:58 <hostname> gitlab-runner[609]: WARNING: Problem while reading command output error=read |0: file already closed
Apr 26 13:02:58 <hostname> gitlab-runner[609]: WARNING: Problem while reading command output error=read |0: file already closed
Apr 26 13:02:58 <hostname> gitlab-runner[609]: ERROR: Machine creation failed error=exit status 1 name=runner-yyyyyy-gitlab-docker-machine-9999999-xxxxx time=1m3.021323248s
Apr 26 13:02:58 <hostname> gitlab-runner[609]: WARNING: Requesting machine removal lifetime=1m3.021582815s name=runner-yyyyyy-gitlab-docker-machine-9999999-xxxxx now=2022-04-26 13:02:58.461046121 +0000 UTC m=+1441.281657463 reason=Failed to create used=1m3.02158386s usedCount=0
Apr 26 13:02:58 <hostname> gitlab-runner[609]: WARNING: Stopping machine lifetime=1m3.038607714s name=runner-yyyyyy-gitlab-docker-machine-9999999-xxxxx reason=Failed to create used=16.829134ms usedCount=0
Apr 26 13:02:58 <hostname> gitlab-runner[609]: Stopping "runner-yyyyyy-gitlab-docker-machine-9999999-xxxxx"... name=runner-yyyyyy-gitlab-docker-machine-9999999-912d6a0c operation=stop
To me, it looks like the runner is able to connect to the EC2 instance via SSH, but is then unable to continue for some reason. I don’t know how to interpret this error message:
Problem while reading command output error=read |0: file already closed
I’d be glad to hear any advice on how to interpret this message and how to troubleshoot this problem any further.
That doesn’t seem like a solution to me. The question is why it worked yesterday but not today.
I have the same issue. I think the Ubuntu base image changed (maybe to 22.04), and that’s when it stopped working.
I have the same issue as well. I can confirm that @lpyfm’s solution solved the problem temporarily, but I don’t think relying on a Docker version that is almost 2 years old is a permanent solution.
I’ve also encountered the same issue on AWS. The worker instance gets stuck in the ‘Initializing’ state and then self-terminates, after which another instance starts up and the cycle repeats.
My GitLab Runner had been running fine for over a year; this also just started happening 2 days ago. It would be good to know what cause could suddenly affect so many people at the same time.
Pinning the docker image version did resolve it for me too.
Hi all, unfortunately I don’t have anything useful to add but wanted to add another “me too” to the pile. No changes on our end to our config, yet we’re seeing the same issues as described here.
gitlab-runner[3942]: {"driver":"amazonec2","level":"info","msg":"Installing Docker...","name":"runner-5-xxx-gitlab-runner-docker-machine-xxx-xxx","operation":"create","time":"2022-04-27"}
gitlab-runner[3942]: {"error":"read |0: file already closed","level":"warning","msg":"Problem while reading command output","time":"2022-04-27"}
gitlab-runner[3942]: {"error":"read |0: file already closed","level":"warning","msg":"Problem while reading command output","time":"2022-04-27"}
gitlab-runner[3942]: {"error":"exit status 1","fields.time":xxx,"level":"error","msg":"Machine creation failed","name":"runner-5-xxx-gitlab-runner-docker-machine-xxx-xxx","time":"2022-04-27"}
gitlab-runner[3942]: {"level":"warning","lifetime":xxx,"msg":"Requesting machine removal","name":"runner-5-xxx-gitlab-runner-docker-machine-xxx-xxx","now":"2022-04-27","reason":"Failed to create","time":"2022-04-27","used":xxx,"usedCount":0}
gitlab-runner[3942]: {"level":"warning","lifetime":xxx,"msg":"Stopping machine","name":"runner-5-xxx-gitlab-runner-docker-machine-xxx-xxx","reason":"Failed to create","time":"2022-04-27","used":xxx,"usedCount":0}
This causes a new EC2 instance to be started and stopped every minute, and it is preventing CI jobs from running, which is a big problem for us.
We’ve also run into this issue, but instead I pointed engine-install-url at the previous version of the get.docker.com script, which is here: https://github.com/docker/docker-install/blob/e5f4d99c754ad5da3fc6e060f989bb508b26ebbd/install.sh (using the raw URL in the MachineOptions). You can also see that they did indeed add the installation of the docker-compose-plugin package 3 days ago.
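In config.toml that amounts to something like the following in the [runners.machine] section (the raw URL below is derived from the commit linked above; keep your existing amazonec2-* options as they are):

  [runners.machine]
    MachineOptions = [
      # ... existing amazonec2-* options ...
      # pin the Docker install script to the commit before docker-compose-plugin was added
      "engine-install-url=https://raw.githubusercontent.com/docker/docker-install/e5f4d99c754ad5da3fc6e060f989bb508b26ebbd/install.sh",
    ]

Restarting gitlab-runner afterwards makes sure newly created worker machines pick up the option.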
The same thing happened here at work, around the same time! We applied what @savp suggested and it worked just fine. Now waiting for some feedback on the GitLab issue.
@JorRy, maybe using any version prior to 20.10 would solve the issue too? As @dobrud mentioned, that might be a better solution than relying on an older Docker version.
As JoyRy mentioned, this is due to a change to the script that is pulled in by default from get.docker.com: it now installs docker-compose-plugin, and that package doesn’t exist for the default AMI (Ubuntu 16.04).
I initially got it working with the same change (pulling in a version of the script from the raw GitHub URL of the commit from the day before the change), but I now have a better solution, which is to specify a much newer AMI in the MachineOptions instead. That works with the default Docker install script (I never realised it was using such an old Ubuntu version by default anyway, so this is a good change to make regardless).
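In config.toml that is just an extra entry in MachineOptions (the AMI ID below is only a placeholder; pick a current Ubuntu AMI for your own region):

  [runners.machine]
    MachineOptions = [
      # ... existing amazonec2-* options ...
      # override the default Ubuntu 16.04 AMI with a newer Ubuntu image for your region
      "amazonec2-ami=ami-xxxxxxxxxxxxxxxxx",
    ]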
Could you let us know which particular AMI you’re using?
I’ve tried upgrading from Ubuntu 16.04 in the past, but had several issues with more recent versions of Ubuntu, so it would be interesting to hear which particular version works with GitLab.