How to use GitLab cache

Hi there.

I’m trying to figure out how to use the caching feature of GitLab pipelines.

The setup we are using is very simple.
We use the docker executor.
The default image is just a node:alpine image.
Every stage needs to install some npm packages, so initially we started with yarn install in the before_script.

The file looked something like this:

default:
  image: node:alpine

  before_script:
    - yarn install

stages:
  - audit
  - lint
  - test
  - build

audit:
  stage: audit
  script:
    - yarn audit --level moderate

lint:
  stage: lint
  script:
    - yarn lint

test:
  stage: test
  script:
    - yarn coverage

build:
  stage: build
  script:
    - yarn build

Then, once we saw that yarn install takes 4-5 minutes each time, we wanted to cache the result instead of running it in every job.
yarn.lock does not change between jobs, so the output of yarn install is the same as well.
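
A minimal sketch of that idea, assuming the runner actually persists its cache between jobs: key the cache on yarn.lock so that node_modules is reused until the lock file changes. The --frozen-lockfile flag and the trailing slash on the path are choices made for this illustration and are not part of the original file.

```yaml
default:
  image: node:alpine

  # Assumption: the cache is keyed on the lock file, so a new cache is
  # created only when yarn.lock changes.
  cache:
    key:
      files:
        - yarn.lock
    paths:
      - node_modules/

  before_script:
    # Still runs in every job, but should be fast when node_modules/ was restored.
    - yarn install --frozen-lockfile
```

With this shape every job both pulls and pushes the cache (the default policy); whether that is acceptable depends on how large node_modules is.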

I tried every single combination I found on the internet about cache.
None of them worked.

I tried caching node_modules, I tried using a .yarn directory as the cache, etc.
I tried untracked: true, I tried global cache or adding it in every job / stage, etc.

For example, here is one of my attempts (I have plenty of variations; I won’t post all of them):

default:
  image: node:alpine

  before_script:
    - yarn install

stages:
  - audit
  - install-packages
  - lint
  - test
  - build

audit:
  stage: audit
  script:
    - yarn audit --level moderate

install-packages:
  stage: install-packages
  script:
    - yarn install
  cache:
    paths:
      - node_modules
    untracked: true
    policy: pull-push

lint:
  stage: lint
  script:
    - yarn lint
  cache:
    paths:
      - node_modules
    policy: pull

test:
  stage: test
  script:
    - yarn coverage
  cache:
    paths:
      - node_modules
    policy: pull

build:
  stage: build
  script:
    - yarn build
  cache:
    paths:
      - node_modules
    policy: pull

Every time, the result was more or less the same.
The install-packages job I added would end with a message like this:

Saving cache for successful job 00:20
Creating cache default...
node_modules: found 70214 matching files and directories 
untracked: found 59929 files                       
No URL provided, cache will be not uploaded to shared cache server. Cache will be stored only locally. 
Created cache
Job succeeded

but the very next job would say:

Checking out bf6cbf54 as tech-debt/test-pipeline-fix...
Removing node_modules/
Skipping Git submodules setup
Restoring cache 00:00
Checking cache for default...
No URL provided, cache will not be downloaded from shared cache server. Instead a local version of cache will be extracted. 
Successfully extracted cache
Executing "step_script" stage of the job script 00:01
Using docker image sha256:9d..4 for docker.ko...f3 ...
$ yarn lint
yarn run v1.22.5
$ eslint --ext .ts,.tsx ./src/
/bin/sh: eslint: not found
error Command failed with exit code 127.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
ERROR: Job failed: exit code 127

and it goes on to fail because node_modules is not there.

My questions:

  • Where does that cache go exactly? Do we have to set something in the runner config?
  • Why do I see "Removing node_modules/" in the next stage/job? Isn’t the whole point of the cache to add/mount that directory? Why is it being removed?

I looked at the docs and I cannot find any answer, unless I’m missing something.

On paper, it looks so simple.
Mark cache directory path, push or pull, done.
But it’s not working. I must be doing something wrong.
Any help would be appreciated.

Thank you.

So, a cache is for things like dependencies that you install for the pipeline. For example, you might need to install Node, so that your pipeline jobs can run npm.

What you need is to pass artifacts between the pipeline stages, e.g.:

install-packages:
  stage: install-packages
  script:
    - yarn install
  artifacts:
    paths:
      - node_modules
    expire_in: 2 weeks

You will probably want to read about when artifacts are deleted and the related sections on how to keep the latest pipeline.
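
As a rough sketch of how a later job could pick up those artifacts: the lint job below is taken from the original pipeline, and the dependencies entry is an addition for this illustration that limits the download to the install-packages job.

```yaml
lint:
  stage: lint
  # Fetch artifacts only from the job that produced node_modules.
  dependencies:
    - install-packages
  script:
    - yarn lint
```

Note that the default before_script from the original file would still run yarn install in this job unless it is removed or overridden.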

The GitLab documentation specifically mentions caching node modules with cache: in its examples:

https://docs.gitlab.com/ee/ci/caching/

We’ve tried several incarnations of this setup, but none of them restores the cache for the current or subsequent runs of the pipeline. Any advice on how to debug this would be welcome: for example, how can we inspect the cache after a pipeline completes, or at the beginning of a new pipeline?
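
One way to look at what was restored, sketched under assumptions: a throwaway job that only pulls the cache and lists the directory. The job name inspect-cache is made up for this example, and its cache key and paths have to match whatever the pushing job uses (no key is set here, so the default key applies).

```yaml
# Hypothetical debugging job; the name inspect-cache is made up for this sketch.
inspect-cache:
  stage: .pre
  image: node:alpine
  cache:
    # Assumption: must use the same key and paths as the job that pushes the cache.
    paths:
      - node_modules/
    policy: pull
  script:
    # Show what the runner restored into the project directory.
    - ls -la
    - if [ -d node_modules ]; then ls node_modules | wc -l; else echo "node_modules missing after cache restore"; fi
```

Because the job sits in the .pre stage, it reports the cache left behind by a previous pipeline. Any default before_script still runs before the script lines shown here.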

Hi,

With the Docker executor, the cache from jobs is stored in the /cache directory inside the container itself. Since the container is ephemeral, this is not stored on the Docker host out of the box. You need to configure your GitLab Runner for local cache. Make sure you have something like this in your config.toml (other options omitted):

[[runners]]
  cache_dir = "/cache"
  [runners.docker]
    disable_cache = false
    cache_dir = ""
    volumes = ['/cache']

There is also the option to use a distributed cache.

that was the issue!
works great now.

Thank you @balonik

I have the exact same problem, but the solution doesn’t work for me.
My config.toml looks like this:

concurrent = 12
check_interval = 0

[session_server]
  session_timeout = 1800

[[runners]]
  name = "Desktop Runner"
  url = "https://gitlab.com/"
  token = "[HIDDEN]"
  executor = "docker"
  cache_dir = "/cache"
  [runners.custom_build_dir]
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
    [runners.cache.azure]
  [runners.docker]
    tls_verify = false
    image = "ubuntu"
    privileged = false
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/cache"]
    cache_dir = ""
    shm_size = 0

(which is a combination of the default file and your suggestion)

Any idea?

The runner is on my local Windows computer, and the setup is pretty much the default. It’s my first try using a runner on my local computer.

@fortuneNext
Unfortunately, I don’t have recent experience with Docker Desktop on Windows. Does it run in WSL nowadays? I think it used to spin up an Ubuntu VM in VirtualBox in the past. Make sure the path /cache actually exists in the Linux box. I would also check the gitlab-runner pod logs for any errors that might help identify the cause.

I’m really not sure, as I’m not an expert on that stuff either, but I think it runs on WSL, yeah.
Generally, I just followed the full tutorial on how to do it on Windows…

By pod logs, do you mean the logs of the runner itself?

Running with gitlab-runner 14.8.2 (c6e7e194)
  on DESKTOP-UJSU1UD pd5fqeha
Preparing the "docker" executor
00:04
Using Docker executor with image node:lts ...
Pulling docker image node:lts ...
Using docker image sha256:b426ce8b7669391d2b17144e06723bca91cd71420a11a0102e62dc9db43775b6 for node:lts with digest node@sha256:61b6cc81ecc3f94f614dca6bfdc5262d15a6618f7aabfbfc6f9f05c935ee753c ...
Preparing environment
00:01
Running on runner-pd5fqeha-project-22849827-concurrent-0 via DESKTOP-UJSU1UD...
Getting source from Git repository
00:03
Fetching changes with git depth set to 50...
Reinitialized existing Git repository in /builds/fortuneNext/groupschedule/.git/
Checking out 91c0f4c0 as better-package-structure...
Skipping Git submodules setup
Restoring cache
00:01
Not downloading cache e16610382ab78c75dd041a5e0c31a7cb0b3529be due to policy
Executing "step_script" stage of the job script
00:44
Using docker image sha256:b426ce8b7669391d2b17144e06723bca91cd71420a11a0102e62dc9db43775b6 for node:lts with digest node@sha256:61b6cc81ecc3f94f614dca6bfdc5262d15a6618f7aabfbfc6f9f05c935ee753c ...
$ npm ci
npm WARN deprecated source-map-resolve@0.6.0: See https://github.com/lydell/source-map-resolve#deprecated
npm WARN deprecated request@2.88.2: request has been deprecated, see https://github.com/request/request/issues/3142
npm WARN deprecated har-validator@5.1.5: this library is no longer supported
npm WARN deprecated uuid@3.4.0: Please upgrade  to version 7 or higher.  Older versions may use Math.random() in certain circumstances, which is known to be problematic.  See https://v8.dev/blog/math-random for details.
npm WARN deprecated uuid@3.4.0: Please upgrade  to version 7 or higher.  Older versions may use Math.random() in certain circumstances, which is known to be problematic.  See https://v8.dev/blog/math-random for details.
added 1648 packages, and audited 1650 packages in 41s
132 packages are looking for funding
  run `npm fund` for details
12 vulnerabilities (8 moderate, 4 high)
To address all issues (including breaking changes), run:
  npm audit fix --force
Run `npm audit` for details.
Saving cache for successful job
00:14
Creating cache e16610382ab78c75dd041a5e0c31a7cb0b3529be...
node_modules/: found 67097 matching files and directories 
No URL provided, cache will be not uploaded to shared cache server. Cache will be stored only locally. 
Created cache
Cleaning up project directory and file based variables
00:01
Job succeeded
Running with gitlab-runner 14.8.2 (c6e7e194)
  on DESKTOP-UJSU1UD pd5fqeha
Preparing the "docker" executor
00:05
Using Docker executor with image trion/ng-cli ...
Pulling docker image trion/ng-cli ...
Using docker image sha256:9770f767536345d6c59bda1c5090f2bf04c95c56be60c1a4242369fbb0f786d9 for trion/ng-cli with digest trion/ng-cli@sha256:25eba2cedc7960b679ff173c671fa0380f9b62b6a78330b54ad13a072a1721eb ...
Preparing environment
00:01
Running on runner-pd5fqeha-project-22849827-concurrent-2 via DESKTOP-UJSU1UD...
Getting source from Git repository
00:04
Fetching changes with git depth set to 50...
Reinitialized existing Git repository in /builds/fortuneNext/groupschedule/.git/
Checking out 91c0f4c0 as better-package-structure...
Skipping Git submodules setup
Restoring cache
00:03
Checking cache for e16610382ab78c75dd041a5e0c31a7cb0b3529be...
No URL provided, cache will not be downloaded from shared cache server. Instead a local version of cache will be extracted. 
Successfully extracted cache
Executing "step_script" stage of the job script
00:03
Using docker image sha256:9770f767536345d6c59bda1c5090f2bf04c95c56be60c1a4242369fbb0f786d9 for trion/ng-cli with digest trion/ng-cli@sha256:25eba2cedc7960b679ff173c671fa0380f9b62b6a78330b54ad13a072a1721eb ...
$ ng build
Node packages may not be installed. Try installing with 'npm install'.
Could not find the '@angular-devkit/build-angular:browser' builder's node package.
Cleaning up project directory and file based variables
00:03
ERROR: Job failed: exit code 1

There are no cache related errors in the Job logs. What’s your .gitlab-ci.yml?

The missing packages should be in the cache, but they aren’t actually loaded. I don’t know why the other message (regarding the deletion of node_modules) is suddenly missing; it was in the logs before.

GitLab CI config (I suppose the error isn’t in here, as the GitLab shared runners run it without any problem):

image: trion/ng-cli

stages:
  - .pre
  - Build and Test
  - Deploy

.angular:
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
    policy: pull

.functions:
  cache:
    key:
      files:
        - functions/package-lock.json
    paths:
      - functions/node_modules/
    policy: pull

.angularAndFunctions:
  cache:
    - key:
        files:
          - package-lock.json
      paths:
        - node_modules/
      policy: pull
    - key:
        files:
          - functions/package-lock.json
      paths:
        - functions/node_modules/
      policy: pull


Install Angular Dependencies:
  stage: .pre
  image: node:lts
  script: npm ci
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
    policy: push


Install Functions Dependencies:
  stage: .pre
  image: node:lts
  script:
    - cd functions
    - npm ci
  cache:
    key:
      files:
        - functions/package-lock.json
    paths:
      - functions/node_modules/
    policy: push

Lint Angular and Functions:
  stage: Build and Test
  script:
    - ng lint
    - cd functions
    - npm run lint
  extends:
    - .angularAndFunctions

Test Angular:
  stage: Build and Test
  script: ng test --progress false --watch false
  image: trion/ng-cli-karma
  extends: .angular

Build Angular:
  stage: Build and Test
  script: ng build
  artifacts:
    paths:
      - "dist/"
  extends: .angular


Build Angular Prod:
  stage: Build and Test
  script: ng build --prod
  extends: .angular

Build Functions:
  stage: Build and Test
  script:
    - cd functions
    - npm run build
  artifacts:
    paths:
      - "functions/lib"
  extends: .functions

Deploy Angular:
  stage: Deploy
  image: andreysenov/firebase-tools
  script:
    - firebase deploy --only hosting --project=dev --token $FIREBASE_TOKEN
  dependencies:
    - Build Angular

Deploy Functions:
  stage: Deploy
  image: andreysenov/firebase-tools
  script:
    - firebase deploy --only functions --project=dev --token $FIREBASE_TOKEN
  extends: .functions
  dependencies:
    - Build Functions

This seems to be a long-standing bug.

So, I have now moved the GitLab runner into WSL instead of running it directly on Windows.
With and without your additions, one job is suddenly successful (the lint and test step):

Running with gitlab-runner 14.8.2 (c6e7e194)
  on Tims WSL Desktop Runner 8xHBxThv
Preparing the "docker" executor
00:04
Using Docker executor with image trion/ng-cli ...
Pulling docker image trion/ng-cli ...
Using docker image sha256:58a81d3dae88391b1ead7679b7c1cc3410253dc44fe1db23eb2db2848631124c for trion/ng-cli with digest trion/ng-cli@sha256:12c64f3df8d1fdd3832bef5fa5f7c5132a822d7d7f0dd667893fe78cd1f19fad ...
Preparing environment
00:03
Running on runner-8xhbxthv-project-22849827-concurrent-0 via DESKTOP-UJSU1UD...
Getting source from Git repository
00:06
Fetching changes with git depth set to 50...
Reinitialized existing Git repository in /builds/fortuneNext/groupschedule/.git/
Checking out 91c0f4c0 as better-package-structure...
Removing node_modules/
Skipping Git submodules setup
Restoring cache
00:09
Checking cache for e16610382ab78c75dd041a5e0c31a7cb0b3529be...
No URL provided, cache will not be downloaded from shared cache server. Instead a local version of cache will be extracted. 
Successfully extracted cache
Checking cache for 0ac53856ed401c2c35b7b05a3dcc0e16e0d17276...
No URL provided, cache will not be downloaded from shared cache server. Instead a local version of cache will be extracted. 
Successfully extracted cache
Executing "step_script" stage of the job script
00:09
Using docker image sha256:58a81d3dae88391b1ead7679b7c1cc3410253dc44fe1db23eb2db2848631124c for trion/ng-cli with digest trion/ng-cli@sha256:12c64f3df8d1fdd3832bef5fa5f7c5132a822d7d7f0dd667893fe78cd1f19fad ...
$ ng lint
Your global Angular CLI version (13.2.6) is greater than your local version (13.2.5). The local Angular CLI version is used.
To disable this warning use "ng config -g cli.warnings.versionMismatch false".
Linting "GroupSchedule"...
All files pass linting.
$ cd functions
$ npm run lint
> lint
> eslint --ext .js,.ts .
Saving cache for successful job
00:01
Not uploading cache e16610382ab78c75dd041a5e0c31a7cb0b3529be due to policy
Not uploading cache 0ac53856ed401c2c35b7b05a3dcc0e16e0d17276 due to policy
Cleaning up project directory and file based variables
00:01
Job succeeded

Which is interesting, as it now writes "Removing node_modules/" again (but not functions/node_modules!), although it definitely needs both to work.

The other jobs still fail without writing that.
My confusion is complete.
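
On the "Removing node_modules/" line: it is printed during the "Getting source from Git repository" step, where untracked files are cleaned from a reused build directory, and the log sections show that the cache is restored only after that step. If the cleanup itself is unwanted, GitLab documents a GIT_CLEAN_FLAGS variable. A minimal sketch, with the exclude patterns chosen for this example rather than taken from the thread:

```yaml
variables:
  # Assumption: exclude both dependency directories from the pre-job git clean.
  GIT_CLEAN_FLAGS: -ffdx -e node_modules/ -e functions/node_modules/
```

This only changes what survives in a reused build directory on the same runner; it does not replace the cache or explain the failing jobs by itself.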

cache cannot be a list; you cannot have multiple keys defined in .angularAndFunctions.
Ref: docs

For posterity, the answers in this thread are no longer accurate.

The only way I got it to work is described in gitlab-runner issue #36877, "Using multiple caches in gitlab ci broken when not using distributed caching".