CI: Correct use of cache and artifacts to speed up a pipeline

Hi forum,
I’m trying to set up a .gitlab-ci.yml to automate building binaries (using two separate runners) that users can download and work on.

Background: I’m using a custom build system to turn font source files into font binaries and post-process them in multiple steps. The build system and all tools used in it are Python modules and are installed in a local virtual environment in the project folder when a user works on it. An example pipeline looks like this:

Source file → raw binary → processed binary for opening in a special editor → pre-final binary ⇶ special final variants of the previous binary for different target environments → test release.

Users may want to grab a file from anywhere in that pipeline for working on or testing.

I went in with the following expectations for the pipeline:

  1. By default, on non-master branches, only produce the raw binary because that’s what people want most often. The other jobs can be triggered manually. On master, a test release (= everything) should be built by default so we can test everything.
  2. The virtual environment should be shared between jobs in the same pipeline (and possibly all pipelines for a branch) to avoid redundant module fetches. If it gets lost, it will automatically be regenerated, so it’s not vital.
  3. Jobs should use the artifacts from the jobs they depend on without rebuilding anything themselves, i.e. I expected a job to run all stages/jobs it depends on and use all their artifacts for further processing. (The build system puts artifacts into build/ subdirectories, the special editor files go to sources/*.ttf, and the test release goes into release/.)
  4. The build artifacts should only include artifacts the job itself produces.
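
To make expectation 1 more concrete, here is a rough, untested sketch of the kind of split I have in mind for a single job. The hidden key .vtt-template and the job names vtt-fonts-manual and vtt-fonts-auto are made up for illustration; the other jobs would need the same treatment, and any dependencies entries would have to point at the matching variant.

```yaml
# Hypothetical sketch: shared job body kept in a hidden key (names are invented).
.vtt-template: &vtt
  stage: build-vtt
  script:
    - doit ufo2vtt
  artifacts:
    expire_in: 10 days
    paths:
      - sources/*.ttf

# On every branch except master: only runs when triggered by hand.
vtt-fonts-manual:
  <<: *vtt
  when: manual
  except:
    - master

# On master: runs automatically as part of the full test release.
vtt-fonts-auto:
  <<: *vtt
  only:
    - master
```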

After a lot of head-scratching and fiddling, my .gitlab-ci.yml looks like this:

before_script:
  - . bootstrap.sh # Set up the virtual environment unless it exists already.

# Cache the virtual environment and build directory just for the branch
cache:
  key: "$CI_COMMIT_REF_NAME"
  paths:
    - venv
    - .doit.db
    - __pycache__
    - build/**/*.*
    - sources/*.ttf

stages:
  - build
  - build-vtt
  - build-hinted
  - build-production
  - build-test-release

unhinted-fonts:
    stage: build
    script:
      - doit unhinted
    artifacts:
      expire_in: 10 days
      paths:
        - build/unhinted/*.*

vtt-fonts:
    stage: build-vtt
    dependencies:
      - unhinted-fonts
    when: manual
    script:
      - doit ufo2vtt
    artifacts:
      expire_in: 10 days
      paths:
        - sources/*.ttf

hinted-fonts:
    stage: build-hinted
    dependencies:
      - vtt-fonts
    when: manual
    script:
      - doit hinted
    artifacts:
      expire_in: 10 days
      paths:
        - build/hinted/*.*

desktop-fonts:
    stage: build-production
    dependencies:
      - hinted-fonts
    when: manual
    script:
      - doit desktop
    artifacts:
      expire_in: 10 days
      paths:
        - build/desktop/*.*

app-fonts:
    stage: build-production
    dependencies:
      - hinted-fonts
    when: manual
    script:
      - doit app
    artifacts:
      expire_in: 10 days
      paths:
        - build/app/*.*

web-fonts:
    stage: build-production
    dependencies:
      - hinted-fonts
    when: manual
    script:
      - doit web
    artifacts:
      expire_in: 10 days
      paths:
        - build/web/**/*.*

I get the following behavior on GitLab 10.4.3 with the 10.4 shell runner:

  1. Jobs always regenerate the virtual environment. Shouldn’t multiple retries of the unhinted-fonts job reuse the virtual environment?
  2. Jobs don’t reuse the build/ cache.
  3. Manually triggering e.g. the hinted-fonts job (stage 3) after only the unhinted-fonts job (stage 1) ran does not actually run the vtt-fonts job (stage 2). The build system handles this without a problem, but why isn’t vtt-fonts run before that? This would actually be okay if the build files were cached, so that manually triggering stage 2 afterwards would just use the files generated in stage 3.
  4. Going from unhinted-fonts to vtt-fonts downloads the artifacts but the downloaded files are regenerated anyway. I suppose the build system isn’t fooled, maybe because the caching works differently from what I expect?

So, what am I doing wrong?

Side note: I’d like to cache the virtual environment for the entire branch but restrict the build/ cache to the current pipeline, since commits will probably change the source files and the binaries get redone anyway. If this separation isn’t possible, keeping the cache to the pipeline only may work as well, something I’d need to test in practice.
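
A minimal sketch of the separation described in this side note, under the assumption that the cache only carries the virtual environment (keyed per branch) and that build outputs travel between jobs as artifacts, which belong to a single pipeline anyway. Whether doit then treats the downloaded files as up to date without its .doit.db is something I have not verified.

```yaml
# Cache only the virtual environment, shared by all pipelines of a branch.
cache:
  key: "$CI_COMMIT_REF_NAME"
  paths:
    - venv

unhinted-fonts:
  stage: build
  script:
    - doit unhinted
  # Build outputs are artifacts, so they stay scoped to this pipeline.
  artifacts:
    expire_in: 10 days
    paths:
      - build/unhinted/*.*

vtt-fonts:
  stage: build-vtt
  # Restrict the downloaded artifacts to those of unhinted-fonts.
  dependencies:
    - unhinted-fonts
  when: manual
  script:
    - doit ufo2vtt
  artifacts:
    expire_in: 10 days
    paths:
      - sources/*.ttf
```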

Edit: Using the default cache key makes cache passing work, but it’s not the sharing strategy I had in mind…