How do I recognize duplicate pipelines?

Duplicate pipelines best practices guidelines

Hi GitLab community

I am relatively new to writing good pipelines, and I am not sure how to recognize duplicate pipelines.
I am trying to optimize a CI config file that goes through the following stages:

When a commit is pushed, the following stages are triggered:
lint build

When an MR is submitted:
lint build security-scan

After the MR pipeline succeeds, a merge train kicks off, and it runs exactly the same pipeline that just ran:

lint build security-scan

My questions are:

  • When does it make sense to have lint stages?
  • Is this considered a duplicate pipeline?

Here is my pipeline config file with the possible duplicates:


stages:
  - lint
  - build
  - security-scan
  - scan-check


build_a:
  stage: build
  script:
    - echo 'build a'

  rules:
    - if: $CI_MERGE_REQUEST_ID
    - if: '$CI_COMMIT_BRANCH == "main"'
    - if: $CI_COMMIT_TAG

build_b:
  stage: build
  script:
    - echo "Building b"
  rules:
    - if: '$CI_COMMIT_TAG =~ "/^$/"'
    - if: '$CI_COMMIT_TAG =~ "/^$/" && $CI_OPEN_MERGE_REQUESTS'
      when: never


lint_markdown:
  stage: lint
  script:
    - echo 'markdown lint'
  rules:
    - if: $CI_COMMIT_BRANCH
    - if: $CI_MERGE_REQUEST_ID
    - if: $CI_COMMIT_TAG
    - if: '$CI_COMMIT_BRANCH && $CI_OPEN_MERGE_REQUESTS'
      when: never

lint_helm:
  stage: lint
  script:
    - echo 'helm lint'
  rules:
    - if: $CI_COMMIT_BRANCH
    - if: $CI_MERGE_REQUEST_ID
    - if: $CI_COMMIT_TAG
    - if: '$CI_COMMIT_BRANCH && $CI_OPEN_MERGE_REQUESTS'
      when: never

lint_docker:
  stage: lint
  script:
    - echo 'docker lint'
  rules:
    - if: $CI_COMMIT_BRANCH
    - if: $CI_MERGE_REQUEST_ID
    - if: $CI_COMMIT_TAG
    - if: '$CI_COMMIT_BRANCH && $CI_OPEN_MERGE_REQUESTS'
      when: never

lint_yaml:
  stage: lint
  script:
    - echo 'yaml lint'
  rules:
    - if: $CI_COMMIT_BRANCH
    - if: $CI_MERGE_REQUEST_ID
    - if: $CI_COMMIT_TAG
    - if: '$CI_COMMIT_BRANCH && $CI_OPEN_MERGE_REQUESTS'
      when: never

security_scan:
  stage: security-scan
  script:
     - echo 'scan my pipeline'
  rules:
    - if: $CI_MERGE_REQUEST_ID


security_scan_result:
  stage: security-scan
  script:
     - echo 'checking security scan results here'
  rules:
    - if: $CI_MERGE_REQUEST_ID

Thanks for taking the time to read this :blush:

Here are some things you can use to improve your CI file.

  • First things first, use workflow:rules. They tell GitLab when to create a pipeline and when not to. In other words, if one of the workflow:rules matches, the pipeline is created; otherwise it isn’t.
workflow:
  rules:
    - if: $CI_MERGE_REQUEST_ID
    - if: '$CI_COMMIT_BRANCH == "main"'
    - if: $CI_COMMIT_TAG

This will save you a lot of effort and many lines of YAML: any job that does not define its own rules runs in every pipeline that workflow:rules lets through.

  • As you noticed, I haven’t included a bare $CI_COMMIT_BRANCH rule. I think that most of the time you shouldn’t run a pipeline on every push to every branch; it can waste time and money. Instead, I think $CI_MERGE_REQUEST_ID gives a similar outcome and is often cheaper.
    With $CI_MERGE_REQUEST_ID, once you create an MR it behaves like $CI_COMMIT_BRANCH: a pipeline is created on each push to the source branch until the MR is merged.
  • In your jobs you can still override the rules as you want, for example:
build_a:
  stage: build
  script:
    - echo 'build a'
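build_b:
  stage: build
  script:
    - echo "Building b"
  rules:
    # Skip this job when the condition below matches
    - if: '$CI_COMMIT_TAG =~ "/^$/" && $CI_OPEN_MERGE_REQUESTS'
      when: never
    # Fallback (assumed intent): run otherwise. A job whose only rule is
    # "when: never" is never added to any pipeline.
    - when: on_success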
lint_markdown:
  stage: lint
  script:
    - echo 'markdown lint'
....
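
If you do still want branch pipelines for branches that have no open MR, the order of the rules matters. Rules are evaluated top to bottom and the first match wins, so a when: never rule placed last (as in the original lint jobs) can never take effect. Here is a minimal sketch of the ordering, following what I believe is the pattern the GitLab docs describe for switching between branch and merge request pipelines. The job name placeholder_job is hypothetical, and matching on $CI_PIPELINE_SOURCE instead of $CI_MERGE_REQUEST_ID is my own choice:

```yaml
workflow:
  rules:
    # Merge request pipelines
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
    # Skip the branch pipeline when the branch already has an open MR.
    # This must come BEFORE the plain $CI_COMMIT_BRANCH rule.
    - if: $CI_COMMIT_BRANCH && $CI_OPEN_MERGE_REQUESTS
      when: never
    # Branch pipelines for branches without an open MR
    - if: $CI_COMMIT_BRANCH
    # Tag pipelines
    - if: $CI_COMMIT_TAG

# Hypothetical job, only here so the file is valid on its own
placeholder_job:
  script:
    - echo 'hello'
```

With this ordering, a push to a branch that has an open MR should produce only the merge request pipeline rather than a branch pipeline plus a merge request pipeline.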

Now for the questions:

When does it make sense to have lint stages? In my opinion, on merge requests: this way no one will be able to merge un-linted code. Running the linters on every push is overkill. You can run them during development, before you even push your code. (Tip: you can use git hooks.)
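
To keep the lint jobs on merge requests only, without repeating the same rules in every job, one option is a hidden job combined with extends. This is a minimal sketch, not the only way to do it; the template name .lint_on_mr is hypothetical, and only two of the four lint jobs are shown:

```yaml
stages:
  - lint

# Hidden job (leading dot): it never runs itself and only serves as a template
.lint_on_mr:
  stage: lint
  rules:
    # Run only in merge request pipelines
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"

lint_markdown:
  extends: .lint_on_mr
  script:
    - echo 'markdown lint'

lint_yaml:
  extends: .lint_on_mr
  script:
    - echo 'yaml lint'
```

Changing when the linters run then means editing one rules block instead of four.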

Is this considered a duplicate pipeline? No, it is not! MR pipelines are like branch pipelines: they run on the source branch code only. Quoting the docs:

Merge request pipelines, which run on the changes in the merge request’s source branch. Introduced in GitLab 14.9, these pipelines display a merge request label to indicate that the pipeline ran only on the contents of the source branch, ignoring the target branch. In GitLab 14.8 and earlier, the label is detached.
Merge request pipelines | GitLab

With that in mind, you should always run your non-independent tests on the main branch as well (the target branch in our case). By non-independent I mean tests whose outcome can change once the code is merged.

For example, let’s say your lint job checks for duplicate function names. Consider this:

  1. In your branch, you created a function func get_users and opened your MR. MR pipeline passes!
  2. In my branch, I created a function with the same name, func get_users, and opened my MR. MR pipeline passes!
  3. I merged my branch. Main branch pipeline passes!
  4. You merged your branch. Main branch pipeline fails!
    ==> Because the main branch now contains two functions named get_users
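
The scenario above is why such a check should run both on the MR and on the target branch. A minimal sketch, assuming main is the project's default branch (so $CI_DEFAULT_BRANCH resolves to it):

```yaml
stages:
  - lint

lint_yaml:
  stage: lint
  script:
    - echo 'yaml lint'
  rules:
    # Catch problems in the source branch before merging
    - if: $CI_MERGE_REQUEST_ID
    # Catch problems that only appear once several MRs have been merged
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
```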

There are other types of merge request pipelines (merged results pipelines and merge trains) that help avoid such scenarios without creating a pipeline for the target branch, but they are only available in GitLab Premium.

That’s all for now! Please let me know if something is not clear or if you have more questions or opinions.
