We have a working self-hosted GitLab setup running on Kubernetes, all at DigitalOcean, and it has been running great. Our Kubernetes cluster has now run thousands of jobs successfully.
On DigitalOcean we have set up a node pool with autoscaling from 1 to 20 nodes.
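For reference, I believe the pool was configured roughly like this (the cluster and pool names below are placeholders, not our real ones, and the exact flags are from memory):

# Assumed doctl invocation - cluster name and pool name are placeholders
doctl kubernetes cluster node-pool update my-cluster ci-pool \
  --auto-scale \
  --min-nodes 1 \
  --max-nodes 20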
In Kubernetes we recently added metrics-server, which is running:
# kubectl get all -n metrics-server
NAME                                 READY   STATUS    RESTARTS   AGE
pod/metrics-server-5b76987ff-cr59s   1/1     Running   0          14d
pod/metrics-server-5b76987ff-lw4cf   1/1     Running   0          14d
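I haven't done much with metrics-server beyond installing it; my understanding is that a quick sanity check would be something like the following (assuming it is wired up correctly):

# Should return CPU/memory figures if metrics-server is healthy
kubectl top nodes
kubectl top pods -A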
I'm pretty certain I am missing part of the setup somewhere, but I have no idea where, as this setup still only uses one node at a time. Our queue times can stretch to 45-60 minutes per job when a lot of jobs try to run at once.
Disclaimer: I am not proficient with K8s or Helm, but I can follow instructions fairly well.
Any suggestions on what we might be missing to get parallelism in our setup? Ideally I'd love a single job to use as many VMs as needed to complete quickly, but simply having multiple VMs running in parallel, each handling one job, would be a big improvement.
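In case it is relevant, I believe the runner was installed with the GitLab Runner Helm chart (that is the part of the setup I understand least), and I haven't touched any concurrency settings. I assume the relevant values would show up with something like this (release name and namespace are guesses on my part):

# Assuming the release is called "gitlab-runner" in the "gitlab-runner" namespace
helm get values gitlab-runner -n gitlab-runner
# I believe the top-level "concurrent" value and a per-runner "limit" in the
# runners config are what cap how many jobs run at once, but I haven't changed them.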