Why does a Merge Results pipeline build an unchanged result twice?

I’m trialing the “Ultimate” version of GitLab, in particular the Merge Results Pipeline feature, along with the related Merge Trains feature.

I have Merge Results pipeline & Merge Trains enabled for my project, Merge Method is set to Merge Commit, and my .gitlab-ci.yml has the following rules, as per the docs:

    - if: $CI_PIPELINE_SOURCE == 'merge_request_event'

This all seems to work nicely, but there’s one thing I don’t understand.

Imagine there is just one open Merge Request, and there have been no changes on the target branch main since the related branch was created. When I push to the MR branch, I expect a “merge results” pipeline to run, which as I understand it builds a preemptive merge of the target branch with my MR branch. The pipeline succeeds, and the MR is now eligible to be added to the Merge Train.

And if I do add it to the Merge Train, it builds again and I must wait for this to complete before it can be merged. Why is this necessary? It has already built the result of the merge, and nothing has changed since, and this is detectable, so why the inefficiency?

This leads to my team wanting to use the “Merge Immediately” button instead of adding to the Merge Train, to avoid the superfluous pipeline (and the resulting wait for the actual merge to happen). I don’t want this to become a habit since it defeats the purpose of these two features.

Previously we just used FF-only branch pipelines and although it was painful to constantly rebase or merge the target, at least if the build was green and nothing had changed, the merge could happen immediately.

Perhaps I’m missing something with my configuration?

Never did get a proper answer on this. We just “Merge Immediately” when it seems like a waste of time to rebuild the pipeline a second time, if no Merge Train is active.

The behavior is still the same.

What is especially nasty is that the “merge immediately” button is only available once the pipeline is green.

There is no way to skip the extra merge train build while the pipeline has not yet succeeded, even when no merge train is running.
Compared to a FF-only workflow we see three complete pipeline runs (MR, merge train, master) instead of just one (the MR build which is reused for master due to an unchanged sha1).

I might be missing the point of the question, but isn’t the rule you provided one for a merge request pipeline, not a merged results pipeline? And what stops you from using a merge train pipeline without the extra merged results pipeline if you don’t want it to build twice? An MT pipeline acts as a merged results pipeline if no MRs are already queued.

Hi bobby-wan:

I’m not sure I follow - what would the rule be for a Merge Results pipeline? The project has the option “Enable merge results pipelines” enabled, and the docs for this say that a Merge Results pipeline is a type of Merge Request pipeline.

The reason we want to use a Merged Results pipeline is that we also want to use Merge Trains, so that the main line is always guaranteed to be green (well, within practical limits). If we use a normal Merge Request pipeline then it does not protect us from the case where a prior merge (after the build has completed) breaks the main line. Without Merge Results Pipelines, the way to do this is to enforce FF-Merge only, which basically makes the developers take on the role of the Merge Train themselves.

I’m not sure what a “MT pipeline” is, though. Merge Train pipeline? Is that the same as Merge Results pipeline? That’s not a term I’ve seen before.

Essentially the fundamental issue is that if there’s no current Merge Train, then a successful MR build has to be run again to actually do the merge, rather than detecting that nothing has changed and it can just go ahead and merge it. There is the manual override, but that’s risky and requires developers to actively work around the Merge Results workflow, which I’d rather they didn’t do, but that’s what we do.

I think I understand the problem better now. I might be missing some vital part of GitLab knowledge, but can’t we control exactly what gets triggered, and when, with rules on CI_MERGE_REQUEST_EVENT_TYPE?
It has the values detached, merged_result and merge_train. I believe you can skip the merged results step altogether (in your explanation) and go for a merge train pipeline, which acts as a merged results pipeline (it runs on the merged changes of source + target branch) if there are no previous merge requests in the train.
So a rule of CI_MERGE_REQUEST_EVENT_TYPE == "merge_train" (check the exact string literal value) would only trigger an MT pipeline.
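
A minimal sketch of what such rules could look like, assuming the documented event type values detached, merged_result and merge_train. The job names and script commands are hypothetical placeholders, not taken from anyone's actual project:

```yaml
# Sketch only: job names and script commands are hypothetical placeholders.
mr-tests:
  script:
    - echo "run the full build and test suite"
  rules:
    # Runs in merged results pipelines, and in detached MR pipelines
    # (used when a merged result cannot be created, e.g. on conflicts).
    - if: $CI_MERGE_REQUEST_EVENT_TYPE == "merged_result"
    - if: $CI_MERGE_REQUEST_EVENT_TYPE == "detached"

train-tests:
  script:
    - echo "run the checks that must pass on the train"
  rules:
    # Runs only once the MR has been added to a merge train.
    - if: $CI_MERGE_REQUEST_EVENT_TYPE == "merge_train"
```

Note that this only selects which jobs run in each kind of pipeline. It does not make the train skip its pipeline, and the merge train pipeline still needs at least one job to be created.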

But I want the first build (Merge Results) to run, because that’s what determines whether the MR passes all the tests, or not. It’s not good enough to wait until the merge event to find out that the tests don’t pass, or it doesn’t even build.

What I want to do is skip the Merge Train build and merge immediately if there have been no changes to the target branch since the Merge Results pipeline ran, and there’s no Merge Train currently running. As it stands, I can do this manually, but for every MR the assigned developer has to check those prerequisites just beforehand, which is tedious. The check is also racy (someone else can merge between the check and the click), so it is likely to fail eventually.

It would be much nicer if the Merge Train was smart enough to automatically fall back to a Merge Immediate if those conditions are met, rather than performing an additional, completely unnecessary, extra build of the exact same thing.

I’m not sure what you mean by “builds again” in the MT pipeline, but I guess we don’t hit this issue since we deploy artifacts to a repository during MR and reuse them in MT.
If it’s artifacts being rebuilt, have you tried passing artifacts back and forth between pipelines?

By “builds again” I mean a second pipeline runs, with the exact same commit, and redoes everything, because we build from scratch in every pipeline.

So that’s a good point - if we cached everything fully then the subsequent MT build would be significantly faster. However, I have found artifacts to be insufficient as a cache, and the proper GitLab cache mechanism is local to each GitLab runner (we have several) and to each concurrent worker on each runner (which is insane). This can be worked around with a local MinIO instance acting as a shared cache, but that work hasn’t been done yet.
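
For illustration, a rough sketch of the job-side cache configuration, assuming the runners have already been pointed at a shared (distributed) cache such as the MinIO instance mentioned above. The job name, cache path, and script are hypothetical, and keying on the MR IID is just one possible choice:

```yaml
# Sketch only: job name, cache path, and script are hypothetical placeholders.
build:
  script:
    - echo "build, reusing .build-cache/ when it has been restored"
  cache:
    # Same key for the merged results pipeline and the merge train pipeline
    # of one MR, so the second run can reuse what the first one stored.
    key: "mr-$CI_MERGE_REQUEST_IID"
    paths:
      - .build-cache/
```

Without a shared cache on the runner side, this key only helps when both pipelines happen to land on the same runner, which is exactly the limitation described above.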

To be fair I don’t think there’s a proper answer to this yet - at least not until GitLab make their Merge Train a bit smarter.