Unreliable results from GitLab APIs called from a CI pipeline

Problem to solve

I’m working on a CI pipeline that:

  1. Checks whether some artifacts already exist. If they don’t, it sets a flag to trigger building the missing artifacts.
  2. Builds the artifacts if the flags require it.
  3. Deploys a finished “thing” that includes the artifacts, either from this pipeline or from a previously successful pipeline.

The problem I’m having is that the API to check the existence of an artifact sometimes works and sometimes returns a 404 error.

Steps to reproduce

stages:
  - check_artifacts
  - pre-build
  - deploy

check_lkft_artifact_exists:
  tags:
    - webdev
  stage: check_artifacts
  image: curlimages/curl:latest
  rules:
    - if: '$CI_COMMIT_BRANCH == "staging" && $SCHEDULED_BY_SPRINTO != "true"'
    - if: '$CI_COMMIT_BRANCH == "main" && $SCHEDULED_BY_SPRINTO != "true"'
  script:
    - |
      ARTIFACT_URL="https://gitlab.com/api/v4/projects/$CI_PROJECT_ID/jobs/artifacts/$CI_COMMIT_REF_SLUG/raw/lkft_artifact.txt?job=lkft"
      echo "Checking for LKFT artifact at: $ARTIFACT_URL"
      if curl --connect-timeout 5 --retry 5 --retry-delay 1 --show-error --fail --output /dev/null --header "JOB-TOKEN: $CI_JOB_TOKEN" "$ARTIFACT_URL"; then
        echo "LKFT artifact FOUND in previous pipeline. Will *not* set RUN_LKFT_BUILD=true."
        echo "RUN_LKFT_BUILD=false" > build_flags.env # Explicitly set to false
      else
        echo "LKFT artifact NOT FOUND. Will set RUN_LKFT_BUILD=true to trigger build job."
        echo "RUN_LKFT_BUILD=true" > build_flags.env # This variable indicates the LKFT build job should run
      fi
    - cat build_flags.env # For debugging purposes (VERIFY THIS OUTPUT!)
  artifacts:
    reports:
      dotenv: build_flags.env # Make the variables in this file available to downstream jobs
    expire_in: 1 day # This file is temporary

check_documentation_artifact_exists:
  tags:
    - webdev
  stage: check_artifacts
  image: curlimages/curl:latest
  rules:
    - if: '$CI_COMMIT_BRANCH == "staging" && $SCHEDULED_BY_SPRINTO != "true"'
    - if: '$CI_COMMIT_BRANCH == "main" && $SCHEDULED_BY_SPRINTO != "true"'
  script:
    - |
      ARTIFACT_URL="https://gitlab.com/api/v4/projects/$CI_PROJECT_ID/jobs/artifacts/$CI_COMMIT_REF_SLUG/raw/documentation_artifact.txt?job=documentation"
      echo "Checking for Documentation artifact at: $ARTIFACT_URL"
      if curl --connect-timeout 5 --retry 5 --retry-delay 1 --show-error --fail --output /dev/null --header "JOB-TOKEN: $CI_JOB_TOKEN" "$ARTIFACT_URL"; then
        echo "Documentation artifact FOUND in previous pipeline. Will *not* set RUN_DOCUMENTATION_BUILD=true."
        echo "RUN_DOCUMENTATION_BUILD=false" > build_flags.env # Explicitly set to false
      else
        echo "Documentation artifact NOT FOUND. Will set RUN_DOCUMENTATION_BUILD=true to trigger build job."
        echo "RUN_DOCUMENTATION_BUILD=true" > build_flags.env # This variable indicates the documentation build job should run
      fi
    - cat build_flags.env # For debugging purposes (VERIFY THIS OUTPUT!)
  artifacts:
    reports:
      dotenv: build_flags.env # Make the variables in this file available to downstream jobs
    expire_in: 1 day # This file is temporary

lkft:
  tags:
    - webdev
  stage: pre-build
  # Explicitly declare needs for the dotenv variables to be available for rules.
  needs:
    - job: check_lkft_artifact_exists
      artifacts: true # This makes the dotenv variables available for rules evaluation
  rules:
    - if: '$CI_COMMIT_BRANCH == "staging" && $SCHEDULED_BY_SPRINTO != "true"'
    - if: '$CI_COMMIT_BRANCH == "main" && $SCHEDULED_BY_SPRINTO != "true"'
  image: node:20
  script: |
    echo "RUN_LKFT_BUILD=$RUN_LKFT_BUILD" # Debugging
    if [ "$RUN_LKFT_BUILD" = "true" ]; then
      echo "LKFT build job running..."
      touch lkft_artifact.txt
      ls -al
    else
      echo "LKFT build job skipped."
    fi
  artifacts:
    untracked: false
    expire_in: "1 week"
    paths:
      - lkft_artifact.txt

documentation:
  tags:
    - webdev
  stage: pre-build
  needs:
    - job: check_documentation_artifact_exists
      artifacts: true # This makes the dotenv variables available for rules evaluation
  dependencies:
    - check_documentation_artifact_exists # Ensure its artifacts (including dotenv) are downloaded
  rules:
    - if: '$CI_COMMIT_BRANCH == "staging" && $SCHEDULED_BY_SPRINTO != "true"'
    - if: '$CI_COMMIT_BRANCH == "main" && $SCHEDULED_BY_SPRINTO != "true"'
  image: node:20
  script: |
    echo "RUN_DOCUMENTATION_BUILD=$RUN_DOCUMENTATION_BUILD" # Debugging
    if [ "$RUN_DOCUMENTATION_BUILD" = "true" ]; then
      echo "Documentation build job running..."
      touch documentation_artifact.txt
      ls -al
    else
      echo "Documentation build job skipped."
    fi
  artifacts:
    untracked: false
    expire_in: "1 week"
    paths:
      - documentation_artifact.txt

deployment:
  tags:
    - webdev
  stage: deploy
  rules:
    - if: '$CI_COMMIT_BRANCH == "staging" && $SCHEDULED_BY_SPRINTO != "true"'
    - if: '$CI_COMMIT_BRANCH == "main" && $SCHEDULED_BY_SPRINTO != "true"'
  image: node:20
  needs:
    # We need the check_artifact jobs to ensure their `dotenv` variables are loaded
    - job: check_lkft_artifact_exists
      artifacts: true # We need RUN_LKFT_BUILD variable
    - job: check_documentation_artifact_exists
      artifacts: true # We need RUN_DOCUMENTATION_BUILD variable
    # Reference the jobs that produce the artifacts in order to ensure that the
    # job sequence is correct and the artifacts are pulled in if they are built
    # in this pipeline. If they are not built in this pipeline, the artifacts will be
    # pulled in from the previous successful pipeline.
    - job: lkft
      artifacts: true
      optional: true
    - job: documentation
      artifacts: true
      optional: true
  script: |
    echo "Deployment job running..."
    ls -la # Initial view

    echo "RUN_LKFT_BUILD=$RUN_LKFT_BUILD" # Debugging
    echo "RUN_DOCUMENTATION_BUILD=$RUN_DOCUMENTATION_BUILD" # Debugging

    # --- Conditional Artifact Retrieval for LKFT ---
    # The artifact_path part in the current pipeline download URL was causing issues.
    # Remove it for general job artifact download.
    if [ "$RUN_LKFT_BUILD" = "true" ]; then
      echo "LKFT build ran in this pipeline. Artifacts from current build should already be here."
      # URL="https://gitlab.com/api/v4/projects/$CI_PROJECT_ID/jobs/artifacts/$CI_PIPELINE_ID/download?job=lkft"
    else
      echo "LKFT build was skipped in this pipeline. Attempting to retrieve artifacts from latest successful pipeline on branch."
      URL="https://gitlab.com/api/v4/projects/$CI_PROJECT_ID/jobs/artifacts/$CI_COMMIT_REF_SLUG/download?job=lkft"
      echo "$URL"
      curl --connect-timeout 5 --retry 5 --retry-delay 1 --fail --location --header "JOB-TOKEN: $CI_JOB_TOKEN" "$URL" --output lkft_artifacts.zip || { echo "Failed to download current LKFT artifacts. Continuing..."; }
      if [ -f lkft_artifacts.zip ]; then
        unzip lkft_artifacts.zip -d lkft_downloaded_artifacts || { echo "Failed to unzip LKFT artifacts. Continuing..."; }
        rm lkft_artifacts.zip
      else
        echo "No current LKFT artifacts zip found to unzip."
      fi
    fi

    # --- Conditional Artifact Retrieval for Documentation ---
    if [ "$RUN_DOCUMENTATION_BUILD" = "true" ]; then
      echo "Documentation build ran in this pipeline. Artifacts from current build should already be here."
      # URL="https://gitlab.com/api/v4/projects/$CI_PROJECT_ID/jobs/artifacts/$CI_PIPELINE_ID/download?job=documentation"
    else
      echo "Documentation build was skipped in this pipeline. Attempting to retrieve artifacts from latest successful pipeline on branch."
      URL="https://gitlab.com/api/v4/projects/$CI_PROJECT_ID/jobs/artifacts/$CI_COMMIT_REF_SLUG/download?job=documentation"
      echo "$URL"
      curl --connect-timeout 5 --retry 5 --retry-delay 1 --fail --location --header "JOB-TOKEN: $CI_JOB_TOKEN" "$URL" --output documentation_artifacts.zip || { echo "Failed to download current documentation artifacts. Continuing..."; }
      if [ -f documentation_artifacts.zip ]; then
        unzip documentation_artifacts.zip -d docs_downloaded_artifacts || { echo "Failed to unzip documentation artifacts. Continuing..."; }
        rm documentation_artifacts.zip
      else
        echo "No current documentation artifacts zip found to unzip."
      fi
    fi

    echo "Contents after artifact retrieval:"
    ls -la

If I have this CI file in a repository with no artifacts then the expected outcome of the pipeline would be that the “lkft” and “documentation” artifacts would be built, and the “deployment” job would use those artifacts.

On the second run, I would expect the existing artifacts to be detected, the “pre-build” jobs to bail and the “deployment” job to pull in artifacts from the last successful run.

On subsequent runs, I would expect, all things being equal, the same behaviour as run #2. However, that is NOT what I get. Without any explanation, the call to retrieve a file from an artifact fails with a 404 error, even though the artifact exists and the file exists in the artifact!
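One thing that may help narrow this down: with `--fail`, curl exits non-zero for any HTTP error, so the check jobs cannot tell a genuine 404 from, say, a 401 or 403. Below is a minimal sketch of reporting the status code explicitly. It assumes the same URL and token as the check jobs above. `flag_from_status` is a hypothetical helper name, not something GitLab provides, and following redirects with `--location` is an assumption borrowed from the download calls in the deployment job.

```shell
# Hypothetical helper: map an HTTP status code to the build flag value.
# 200 -> artifact exists, no build needed; 404 -> build needed;
# anything else is reported as an error instead of silently triggering a build.
flag_from_status() {
  case "$1" in
    200) echo "false" ;;
    404) echo "true" ;;
    *)   echo "unexpected HTTP status: $1" >&2; return 1 ;;
  esac
}

# In a check job, the status could be captured like this (same URL and token
# as in the pipeline above):
#   STATUS=$(curl --silent --location --output /dev/null --write-out '%{http_code}' \
#     --header "JOB-TOKEN: $CI_JOB_TOKEN" "$ARTIFACT_URL")
#   FLAG=$(flag_from_status "$STATUS") || exit 1
#   echo "RUN_LKFT_BUILD=$FLAG" > build_flags.env
```

With this, an authentication or server problem fails the check job visibly rather than looking like a missing artifact.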

It looks like my issue might be related to GitLab issue #29118, “Artifacts archive downloaded via API cannot be extracted”, but that isn’t entirely clear because the linked issue seems to cover quite a few different problems.

Configuration

No special configuration.

Versions

Please add an x whether options apply, and add the version information.

  • Self-managed
  • GitLab.com SaaS
  • Dedicated

Versions

  • GitLab (Web: /help or self-managed system information sudo gitlab-rake gitlab:env:info):
    GitLab Enterprise Edition 18.1.0-pre 6b1662fa73b

I think I understand what is going wrong.

On the first run, the pre-build jobs do their thing and we have artifacts uploaded. :check_mark:

On the second run, the check_artifacts jobs do their thing and detect we have artifacts from run #1. So the pre-build jobs get run, don’t do anything … and don’t create new artifacts. The deployment job runs and pulls in the artifacts from run #1. :check_mark:

On the third run, the check_artifacts jobs run and the API says there aren’t any artifacts.

I think this is because the last successful run (run #2) didn’t actually create any artifacts: the pre-build jobs succeeded without building anything.

So, despite the fact that the artifacts from run #1 still exist, the API looks only at the latest successful run, finds nothing there and returns a 404.

I cannot immediately see a solution to this problem.
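The only idea that comes to mind, and it is untested, is for the pre-build jobs to re-download the existing file when the build is skipped, so that the job uploads it again and the latest successful pipeline always carries the artifact. A rough sketch, assuming curl is available in the job image; `build_or_carry_forward` is a hypothetical helper, and the `touch` stands in for the real build exactly as in the pipeline above.

```shell
# Hypothetical helper: build the file when the flag says so, otherwise
# re-download it so that this job uploads it again as its own artifact.
# Arguments: flag ("true"/"false"), output file, URL of the existing artifact file.
build_or_carry_forward() {
  if [ "$1" = "true" ]; then
    touch "$2"   # stand-in for the real build
  else
    curl --connect-timeout 5 --retry 5 --retry-delay 1 --show-error --fail \
      --location --header "JOB-TOKEN: $CI_JOB_TOKEN" --output "$2" "$3"
  fi
}

# Possible use in the lkft job (same raw-file URL as in the check job):
#   build_or_carry_forward "$RUN_LKFT_BUILD" lkft_artifact.txt \
#     "https://gitlab.com/api/v4/projects/$CI_PROJECT_ID/jobs/artifacts/$CI_COMMIT_REF_SLUG/raw/lkft_artifact.txt?job=lkft"
```

The trade-off is that every pipeline copies the artifact again, and each copy gets a fresh `expire_in` window.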

I ended up switching from artifacts to packages, and introducing new stages to look for the newest package, check if its timestamp was new enough and building new packages if the timestamp was too old.
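For anyone wanting to do something similar, the timestamp check boils down to something like the sketch below. `needs_rebuild` is a hypothetical helper name, and how the newest package's creation time is looked up and converted to epoch seconds is left out here. The variable name and the one-week limit in the usage comment are assumptions for illustration.

```shell
# Hypothetical helper: decide whether the newest package is too old.
# Arguments: created (epoch seconds), now (epoch seconds), max age (seconds).
# Prints "true" when a rebuild is needed, "false" otherwise.
needs_rebuild() {
  if [ $(( $2 - $1 )) -gt "$3" ]; then
    echo "true"
  else
    echo "false"
  fi
}

# Possible use in a check job (604800 seconds = 7 days):
#   echo "RUN_LKFT_BUILD=$(needs_rebuild "$PACKAGE_CREATED_EPOCH" "$(date +%s)" 604800)" > build_flags.env
```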

A shame that GitLab doesn’t have an interface that allows me to retrieve an archive file from any successful pipeline and not just the latest, but there we go.