Trouble using needs: job doesn't wait for the jobs it depends on to finish before it is processed

Hello,

In our project, we occasionally see strange pipeline behaviour: a deployment job at the very end of the pipeline, with dependencies clearly defined via the ‘needs’ facility, apparently does not wait for artefacts to be produced by jobs earlier in the pipeline. I am not sure whether this is a known issue, so it would be great if someone could comment.

We are using GitLab 12.8.6-ee, self-managed, with both Windows- and Linux-based runners doing the work. There is one multi-stage pipeline, and all jobs produce artefacts that are consumed in later stages. The last job expects artefacts produced by four different jobs in order to create a package that is published as the final build result. Job dependencies are declared using the ‘needs’ directive as described here. All runners use shell executors.

This is a pretty basic pipeline setup. The final job in the pipeline is defined as follows:

publish:
    stage: publish
    tags: [ publish ]
    script:
        - deploymentscripts/publishArtifacts.sh
    timeout: 25 minutes
    retry:
        max: 1
        when:
            - runner_system_failure
            - stuck_or_timeout_failure
    needs:
        - job: baselibs:gcc:release:package
          artifacts: true
        - job: cortex:gcc:package
          artifacts: true
        - job: hmi:dolphinintern:msvc:package
          artifacts: true
        - job: pilos:package
          artifacts: true

The intention should be obvious, I hope: the job waits for several packages to be produced in previous stages, then calls a script that puts them all together and deploys the result to a remote server location. Usually it works fine, but sometimes the following behaviour is reported:

  1. One or more jobs declared in ‘needs’ are still running.
  2. Despite that, the ‘publish’ job is executed.
  3. It downloads only the available artefacts.
  4. It doesn’t wait for all artefacts to become available.
  5. It produces and publishes an incomplete result.
  6. In the end, the pipeline is marked as successful.
  7. No errors are emitted in the runner output.

I would expect the ‘publish’ job as declared above to wait for all four needs to finish before it gets processed.
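
Until the root cause is known, one possible stopgap is to make the job fail loudly rather than publish an incomplete package. The following is only a minimal sketch of such a guard: the function name `check_artifacts` and the example paths in the comment are hypothetical and not part of the pipeline above. It does not fix the scheduling problem, it only turns a silent partial publish into a failed job.

```shell
#!/usr/bin/env bash
# Hypothetical guard (an assumption, not part of the original pipeline):
# report every expected artifact that does not exist.

# Usage: check_artifacts PATH...
# Prints each missing path to stderr and returns 1 if any path is absent,
# 0 if all of them exist.
check_artifacts() {
    local missing=0
    local path
    for path in "$@"; do
        if [ ! -e "$path" ]; then
            echo "missing artifact: $path" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example call with made-up paths, e.g. at the top of the publish script:
#   check_artifacts out/baselibs.tar.gz out/cortex.tar.gz || exit 1
```

Because the check exits with a non-zero status, the job and therefore the pipeline would be marked as failed instead of successful when an artefact has not arrived yet.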

In case it matters, I have noticed that our runners have different versions: 12.7.1, 12.8.0~beta.2253, 12.9.0~beta.2331 and 12.10.1.

Thank you for your help.

Best regards,
Marko

Have you found a solution for this?

In my case, it only takes one of the needed jobs to run for the last job to start, without waiting for the remaining needed job.

No solution yet.
I am still hoping that someone will comment on this issue.

We haven't observed this issue for some time now.

We updated our GitLab instance to version 13.x in January this year, and we are currently quite up to date. All runners are also on 13.7 or newer.

For now, I consider this issue resolved. :slight_smile: