We are a small gamedev studio that uses GitLab CI/CD to keep regular builds available.
For most of that time we only had a single custom shell runner set up that ran every pipeline.
I then started my own runner, so we now have 2 runners available.
But now we have the issue that jobs are distributed between the machines, which means that build results from one job aren’t available to the next anymore.
I know I could share the artifacts, but that would totally kill our pipeline duration, as it would mostly be busy up- and downloading the artifacts.
My question now is: is there any way to make a runner “stick” to a pipeline or - vice versa - have each pipeline be executed by a single runner only?
I know that tags are supposed to solve this problem; however, both of my runners are equal. I just want the pipeline to stick to a single runner. I tried using the CI/CD variables in tags that are available from GitLab 14.1 (Keyword reference for the `.gitlab-ci.yml` file | GitLab) and dynamically set the tag to $CI_RUNNER_ID, but that is evaluated too early, when the runner ID is not yet set.
I tried fiddling around with the runner tags, sadly with no success.
Currently we have two equal runners… let’s call them R1 and R2.
We currently only have one pipeline configuration (let’s call that P1), which consists of:
preparation (downloading project and preparing env vars)
build (compiling and packaging the game)
testing
deliver (uploading the build to our file storage)
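For reference, the P1 structure above might look roughly like this in `.gitlab-ci.yml` (the script names are placeholders, not the studio’s actual commands):

```yaml
stages: [preparation, build, testing, deliver]

preparation:
  stage: preparation
  script: ./prepare_env.sh      # download project, prepare env vars (placeholder)

build:
  stage: build
  script: ./build_game.sh       # compile and package the game (placeholder)

testing:
  stage: testing
  script: ./run_tests.sh        # placeholder

deliver:
  stage: deliver
  script: ./upload_build.sh     # upload the build to file storage (placeholder)
```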
I want every instance of P1 to be executed by either R1 or R2, but without splitting the stages and jobs between them, since the output of e.g. the build step is needed to do the testing.
So a single tag for the runners that is hardcoded into the job config would prevent one of those runners from doing any P1 runs at all.
As I mentioned in the first post, I know I could do that. However, up- and downloading 5 GB of data (build size) would easily increase our total pipeline time to 300%. That’s why I wanted to see if there’s a workaround.
The runners are executed in a shell environment on our dev machines.
That gave us the benefit of persistence between the “build” and “testing” stages, so that the testing stage had the files available.
If I now start sharing the big compiled project between those steps, I might as well shut down the second runner again, since it would take us so long…
What I tried is:
I gave my runners their ID as a tag.
So R1 had the tag R1 and R2 had the tag R2.
I then tried to use that $CI_RUNNER_ID as a variable tag (as mentioned in my first post). It looked like this.
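A sketch of that attempt (reconstructed, since the original snippet wasn’t preserved in the thread):

```yaml
# Does NOT work: tags are evaluated when the pipeline is created,
# before any runner is assigned, so $CI_RUNNER_ID is still empty here.
build:
  tags:
    - $CI_RUNNER_ID
  script:
    - ./build_game.sh   # placeholder
```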
However, the variable doesn’t get expanded the way I hoped it would.
Right, I see your problem, and AFAIK you can’t use variables in tags.
I’m not sure that there’s a straightforward way around this, but I’m wondering whether you can mis-use rules and/or dependencies to do something like:
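(The original snippet was not preserved; the general shape might have been something like the following, where jobs are duplicated per runner and a pipeline-level variable decides which set runs. This is purely a sketch - `PREFERRED_RUNNER` is an invented variable.)

```yaml
build:r1:
  stage: build
  tags: [R1]
  rules:
    - if: '$PREFERRED_RUNNER =~ /^R1$/'
  script: ./build_game.sh    # placeholder

build:r2:
  stage: build
  tags: [R2]
  rules:
    - if: '$PREFERRED_RUNNER =~ /^R2$/'
  script: ./build_game.sh    # placeholder

test:r1:
  stage: testing
  tags: [R1]
  rules:
    - if: '$PREFERRED_RUNNER =~ /^R1$/'
  dependencies: [build:r1]
  script: ./run_tests.sh     # placeholder

test:r2:
  stage: testing
  tags: [R2]
  rules:
    - if: '$PREFERRED_RUNNER =~ /^R2$/'
  dependencies: [build:r2]
  script: ./run_tests.sh     # placeholder
```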
Using dependencies probably isn’t strictly necessary, and I haven’t tested any of this at all. I have also assumed that you have not used other tags on your runners (in which case the regexps here won’t make any sense).
However, this is the sort of thing I’d try next, maybe in a throwaway repo.
I’d be interested to see if other people have different workarounds for this, but it is something that comes up relatively frequently.
According to this doc entry you can use variables in tags.
However, what also doesn’t work is setting a global variable to $CI_RUNNER_ID at the beginning, since you can’t update global variables from the script scope. That’s what I tried as well.
Ah, that’s interesting. I think there are some long-running issues open about cleaning up how variables work - there are lots of corner cases like yours.
When I first tried it, no stage started, since the preparation stage had both runner1 and runner2 as tags, which no runner could fulfil, since all tags have to be matched.
I then removed the tag requirement.
Then the preparation stage got executed, but no other stage was triggered. Probably because the rule was evaluated too early?
I’m giving it a shot with the workflow syntax to maybe set a variable beforehand…
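For what it’s worth, a pipeline-wide variable can be set via `workflow:rules:variables`, though $CI_RUNNER_ID would still be empty at that point, since workflow rules are evaluated before any runner is assigned - so this sketch illustrates the attempt, not a fix:

```yaml
workflow:
  rules:
    - variables:
        TARGET_RUNNER: $CI_RUNNER_ID   # still empty at pipeline creation
      when: always
```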
Hi, we are very much interested in a way to have one runner (we also have 2) handle a pipeline, instead of switching between jobs. I agree with @Gitoza that sharing artifacts via a shared cache would lower performance.
We have the same scenario, where we want to make sure that all jobs in a pipeline are executed on the same runner. We have currently solved it by templating the pipeline configuration and using a parent-child setup.
I.e. the parent pipeline contains a job that dynamically generates the configuration for the child pipeline, and when doing that you can hardcode the unique runner tag on every job.
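A minimal sketch of that parent-child setup, assuming a hypothetical `generate_config.sh` helper that picks a runner tag and writes it into every job of the generated config:

```yaml
# Parent pipeline (.gitlab-ci.yml)
generate-child-config:
  stage: build
  script:
    # Hypothetical helper: picks an idle runner tag and emits child.yml
    # with that tag hardcoded on every job.
    - ./generate_config.sh > child.yml
  artifacts:
    paths: [child.yml]

run-child:
  stage: deploy
  trigger:
    include:
      - artifact: child.yml
        job: generate-child-config
    strategy: depend   # parent waits for the child pipeline to finish
```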
Of course, you need to know which runner to use, but you can use the GitLab API to find out whether a runner is busy. There are of course some corner cases where a runner appears to be available even when it is not, e.g. when it is between jobs.
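The busy check could sit on top of the runner jobs endpoint (GET /runners/:id/jobs?status=running). Here is just the selection logic, separated out so it works without network access - the endpoint shape and the surrounding fetch code are assumptions:

```python
def pick_idle_runner(running_jobs_by_tag):
    """Return the tag of the first runner with no running jobs, or None.

    running_jobs_by_tag maps a runner tag (e.g. "R1") to the list of jobs
    returned by GET /runners/:id/jobs?status=running for that runner.
    """
    for tag, jobs in running_jobs_by_tag.items():
        if not jobs:  # no running jobs -> runner looks idle
            return tag
    return None  # every runner is busy

# Example: R1 is mid-job, R2 is idle
print(pick_idle_runner({"R1": [{"id": 101}], "R2": []}))  # R2
```

Note the corner case mentioned above still applies: a runner can report no running jobs while it is between two jobs of an active pipeline.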
I am also facing the same issue.
Sharing artifacts is one solution, but in my opinion it’s not feasible.
I have a pipeline which downloads a 5 GB file and performs an operation on it in each job as required. If I configure multiple runners with the same tag name, the pipeline fails.
I have shared one scenario at the link below.
I am also looking for a solution to this issue.
An idea about this problem. I haven’t tested it yet, though…
1. 1_gitlab-ci.yaml is triggered and translates a runner tag to an actual runner name.
2. 1_gitlab-ci.yaml triggers 2_gitlab-ci.yaml and passes the runner name as a parameter.
3. 1_gitlab-ci.yaml waits until 2_gitlab-ci.yaml is finished.
I think it should be possible this way to make sure the whole pipeline runs on the same runner when you have more than one.
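Steps 1-3 above might be wired up roughly like this, passing the resolved runner name down as a trigger variable and using it as a tag in the child (a sketch; `TARGET_RUNNER` is an invented variable that would in practice be resolved dynamically, e.g. via the API):

```yaml
# 1_gitlab-ci.yaml (parent)
run-on-one-runner:
  trigger:
    include: 2_gitlab-ci.yaml
    strategy: depend          # parent waits until the child pipeline finishes
  variables:
    TARGET_RUNNER: R1         # resolved runner name/tag, hardcoded here for the sketch

# 2_gitlab-ci.yaml (child) - every job pins the same runner
build:
  tags: ["$TARGET_RUNNER"]
  script: ./build_game.sh     # placeholder

testing:
  tags: ["$TARGET_RUNNER"]
  script: ./run_tests.sh      # placeholder
```

Unlike $CI_RUNNER_ID, a variable passed down explicitly like this is already known when the child pipeline is created, which is why it can work in `tags:`.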
I would also like to have an option to keep pipelines on the same runner - also: if the runner has concurrent instances, these need to be restricted too.
Up- and downloading gigabytes of build files is just too time-consuming and uses too much disk space, which is unnecessary if you keep the files between jobs.
You might consider looking into caching. If the 5 GB is stored in the cache and the same cache is used in the next step, it is as if it were on the same runner, but I don’t think the upload/download time is there. If I understand correctly, you could use a cache key based on the pipeline run, so that it won’t use the cache from a previous build pipeline, only the current one.
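A sketch of that idea, keying the cache to the current pipeline so later jobs reuse exactly what the earlier job produced (one caveat: with a distributed cache backend such as S3, the upload/download cost comes back, so this mainly helps with a runner-local cache):

```yaml
build:
  stage: build
  cache:
    key: "$CI_PIPELINE_ID"   # unique per pipeline run
    paths: [build/]
    policy: push             # only upload the cache at the end
  script: ./build_game.sh    # placeholder

testing:
  stage: testing
  cache:
    key: "$CI_PIPELINE_ID"
    paths: [build/]
    policy: pull             # only download, never re-upload
  script: ./run_tests.sh     # placeholder
```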