Slow Runners caused by caching/extracting cache

Whenever a pipeline starts, the node_modules cache gets checked and extracted. After the task finishes, the runner creates a new cache.
Unfortunately, this takes over 15 minutes for every task.


We are using GitLab 13.0.5, and our Runners are on version 12.8.0~beta (we can’t upgrade them, because on version 13.x the node_modules folder can’t be removed and the pipelines fail).

The node_modules folder is cached globally, because our proxy is very slow. Before a task starts, we run yarn install. The caching part of the CI configuration is below:

[Image: caching part of the CI configuration, not preserved]

The node_modules folder contains a very large number of files. I assume that’s why it is taking so long.
To speed things up, we created three runners so some stage-tasks run in parallel, but a stage with a single task slows down everything.
How could I reduce the time it takes to finish a task?
Is it possible to extract the cache just once for all stages? Or is there another way to speed up the runner’s cache extraction and creation?
It’s very annoying to wait 2 hours in total for a pipeline to finish.

Thanks

Sincerely, Kevin

Hi,

where is the runner cache located? I could imagine that I/O slows operations down quite significantly. Also, some metrics graphs of system performance might provide more insight into where the real bottleneck on the runner hosts is.

Besides, what exactly does

We are using Gitlab 13.0.5, our Runners are on version 12.8.0~beta (we can’t upgrade them, because on version 13.x the node_modules folder can’t be removed and the pipelines fails).

mean? Is there an issue (URL) blocking you from upgrading?

Cheers,
Michael

Hi! I’m going to add to this: I tested running a simple tar -xf cache.tar.gz <files> command during after_script, and compressing and decompressing the cached files that way took about half the time of the built-in cache step. What’s actually going on under the hood that makes the cache step that much slower than simple gzip compression? I use GitLab.com with a custom runner on my own server, if that helps, and the cache is stored locally on that machine.
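
For anyone who wants to try the manual tar approach described above, here is a minimal sketch of how it might be wired into a job. It is not the poster's actual setup: the job name, the yarn build command, and the /srv/ci-cache path are all made up for illustration. It also assumes that the directory persists between jobs on the runner host (for example a shell executor, or a volume mounted into the job container) and that no global cache: section applies to the job.

```yaml
# Sketch only: job name, commands, and archive path are hypothetical.
build:
  variables:
    CACHE_ARCHIVE: /srv/ci-cache/node_modules.tar.gz   # assumed persistent path on the runner
  before_script:
    # Restore node_modules if an earlier job left an archive behind.
    - if [ -f "$CACHE_ARCHIVE" ]; then tar -xzf "$CACHE_ARCHIVE"; fi
    - yarn install
  script:
    - yarn build   # placeholder for the real job commands
  after_script:
    # Write to a temporary file, then rename it, so a half-written archive
    # is never left under the final name if the job is interrupted.
    - tar -czf "$CACHE_ARCHIVE.tmp" node_modules
    - mv "$CACHE_ARCHIVE.tmp" "$CACHE_ARCHIVE"
```

If several jobs can run at the same time on one host, they would share the same temporary file name here, so a per-job suffix would probably be needed in that case.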

You could use policy: pull for jobs that don’t update the cache, so they only extract it and skip re-creating it afterwards. But plain tar / untar is still a lot faster for us.
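
A rough sketch of what that could look like in .gitlab-ci.yml. The job names, stage names, cache key, and the yarn test command are assumptions for illustration, since the original cache configuration is not shown in this thread:

```yaml
# Hypothetical jobs, stages, and cache key; adjust to your own pipeline.
stages:
  - install
  - test

install_deps:
  stage: install
  script:
    - yarn install
  cache:
    key: node-modules      # assumed key
    paths:
      - node_modules/
    policy: pull-push      # the default: extract before the job, re-create after it

unit_tests:
  stage: test
  script:
    - yarn test            # placeholder command
  cache:
    key: node-modules
    paths:
      - node_modules/
    policy: pull           # only extract the cache; skip creating a new one afterwards
```

With this layout only the install job pays for creating the cache archive; the later jobs still pay for extraction, so this removes roughly the "create" half of the overhead rather than all of it.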