Running into issues with Autoscaling EC2 CI/CD Group Runners

I’m currently running into an issue where the gitlab-runner-manager EC2 instance that I set up is throwing this error:

 Missing instance ID, this is likely due to a failure during machine creation

My config is as follows (with certain things redacted):

concurrent = 10
check_interval = 0

[session_server]
  session_timeout = 1800

[[runners]]
  name = "gitlab-runner-manager"
  url = "https://gitlab.com/"
  token = "REDACTED"
  executor = "docker+machine"
  limit = 10
  [runners.custom_build_dir]
  [runners.cache]
    Type = "s3"
    Shared = true
    [runners.cache.s3]
      ServerAddress = "s3.amazonaws.com"
      AccessKey = "REDACTED"
      SecretKey = "REDACTED"
      BucketName = "REDACTED"
      BucketLocation = "us-east-1"
    [runners.cache.gcs]
    [runners.cache.azure]
  [runners.machine]
    IdleCount = 1
    IdleTime = 1800
    MachineDriver = "amazonec2"
    MachineName = "gitlab-docker-machine-%s"
    MachineOptions = [
      "amazonec2-access-key=REDACTED",
      "amazonec2-secret-key=REDACTED",
      "amazonec2-region=us-east-1",
      "amazonec2-vpc-id=vpc-",
      "amazonec2-subnet-id=subnet-",
      "amazonec2-use-private-address=true",
      "amazonec2-tags=runner-manager-name,gitlab-aws-autoscaler,gitlab,true,gitlab-runner-autoscaler,true",
      "amazonec2-security-group=docker-machine-scaler",
      "amazonec2-instance-type=m4.2xlarge",
      "amazonec2-request-spot-instance=true",
      "amazonec2-spot-price=0.50",
      "amazonec2-block-duration-minutes=60",
      "amazonec2-iam-instance-profile=gitlab-runner-manager"
    ]
    [[runners.machine.autoscaling]]
      Periods = ["* * * * * * *"]
      IdleCount = 1
      IdleTime = 3600
      Timezone = "UTC"

Other details that are important:

  • The docker-machine-scaler security group allows port 22 (SSH) and Docker port 2376 traffic from the gitlab-runner-manager EC2 instance.
  • The runner version running on the gitlab-runner-manager EC2 instance is 14.9.1.
  • I followed the documentation here: Autoscaling GitLab Runner on AWS EC2 | GitLab, but I'm wondering whether there are other required variables that need to be set in runners.machine.MachineOptions.
  • I'm currently using gitlab.com, not the enterprise hosted Omnibus version.

Would appreciate any help! Thanks in advance!

What Linux version are you using? docker-machine can fail if the fork of docker-machine you have installed relies on the default Ubuntu 16.04 image, which Docker no longer supports. The fix in this case is to upgrade to the latest beta of the docker-machine fork maintained by GitLab ([drivers/amazonec2/amazonec2.go · main · GitLab.org / Ops Sub-Department / docker-machine · GitLab]).
After this, the output of watch docker-machine ls would sometimes still show the error, but it then corrected itself after about a minute.
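
If it still fails after the upgrade, it may help to create one machine by hand with the same options the runner passes, since docker-machine's debug output usually says more than the "Missing instance ID" line in the runner log. This is only a rough sketch: machine_flags is a made-up helper and test-machine is a placeholder name, and only two of the options from the config above are included to keep it short. It prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: turn MachineOptions entries from config.toml
# into docker-machine command-line flags, one per line.
machine_flags() {
  for opt in "$@"; do
    printf '%s\n' "--$opt"
  done
}

# Dry run: print the command rather than executing it.
# The command substitution is left unquoted on purpose so that each
# flag becomes its own argument.
echo docker-machine --debug create --driver amazonec2 \
  $(machine_flags \
      "amazonec2-region=us-east-1" \
      "amazonec2-instance-type=m4.2xlarge") \
  test-machine
```

Drop the leading echo to actually run it on the runner manager (after adding the remaining options from MachineOptions), and run docker-machine --version first to confirm which build is installed. Remember to remove the test machine with docker-machine rm afterwards.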
Hope this helps!