Build randomly fails on Kubernetes runner

We host one runner on K8s. Our Node.js/Gatsby build randomly fails without a clear reason.

Helm chart values.yml

gitlabUrl: https://gitlab.com
runnerRegistrationToken: <redacted>
rbac:
  create: true
concurrent: 5
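
We have not set any build-pod resource requests or limits; if that turns out to matter, I believe they would go under runners.config as Kubernetes executor options, roughly like this (a sketch with placeholder values; the exact keys depend on the chart version):

runners:
  config: |
    [[runners]]
      [runners.kubernetes]
        # placeholder resource settings for the build and helper containers
        cpu_request = "1"
        memory_request = "2Gi"
        memory_limit = "4Gi"
        helper_memory_limit = "512Mi"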

.gitlab-ci.yml

cache:
  paths:
    - node_modules/
    - .cache
    - public

image: registry.gitlab.com/<redacted>

variables:
  GIT_DEPTH: 1
  DOCKER_DRIVER: overlay2

# The job must be named 'pages' for GitLab to deploy the static site to GitLab Pages
pages:
  only: # Only run for these branches
  - develop
  - main
  - gitlab-debug

  stage: build

  tags:
  - docker

  before_script:
  - yarn install --no-progress
  - ./prebuild.sh # fetch content repo

  script:
  - ./node_modules/.bin/gatsby build

  artifacts:  # Used by GitLab Pages
    paths:
    - public
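
Side note: the cache block at the top has no key:, so by default the same cache is shared across all branches. I don't think that is the cause, but for completeness a per-branch cache key would look roughly like this (CI_COMMIT_REF_SLUG is the usual choice):

cache:
  key: $CI_COMMIT_REF_SLUG   # per-branch cache instead of the shared default
  paths:
    - node_modules/
    - .cache
    - public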

Pod log:

ERROR: Could not create cache adapter               error=cache factory not found: factory for cache adapter "" was not registered
WARNING: Error streaming logs gitlab/runner-xkkowhyb-project-27801842-concurrent-02qtcc/helper:/logs-27801842-1665683882/output.log: command terminated with exit code 137. Retrying...  job=1665683882 project=27801842 runner=xKKoWHyb
WARNING: Error while executing file based variables removal script  error=pod "runner-xkkowhyb-project-27801842-concurrent-02qtcc" (on namespace "gitlab") is not running and cannot execute commands; current phase is "Failed" job=1665683882 project=27801842 runner=xKKoWHyb
WARNING: Job failed: pod "runner-xkkowhyb-project-27801842-concurrent-02qtcc" status is "Failed"  duration_s=806.933936964 job=1665683882 project=27801842 runner=xKKoWHyb
WARNING: Failed to process runner                   builds=0 error=pod "runner-xkkowhyb-project-27801842-concurrent-02qtcc" status is "Failed" executor=kubernetes runner=xKKoWHyb
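
For what it's worth, exit code 137 is SIGKILL (128 + 9), which on Kubernetes usually points at a container being OOM-killed or the pod being evicted. I have not confirmed that yet; one way to check would be something like this (assuming the job pod has not been cleaned up yet):

# did the kubelet kill a container in the job pod? (look for OOMKilled in Last State)
kubectl describe pod runner-xkkowhyb-project-27801842-concurrent-02qtcc -n gitlab

# any OOM / eviction events around the time of the failure?
kubectl get events -n gitlab --sort-by=.lastTimestamp | grep -iE 'oom|evict|kill'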

Hi @vietnugent :wave:

Based on the factory for cache adapter "" was not registered error, gitlab-runner#27610 seems to describe the same problem. My colleague Julian posted a workaround there.

Could you please try that as well and let us know how it goes?
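
For reference, that error appears when no distributed cache type is registered in the runner's config.toml. With S3, for example, the relevant section looks roughly like this (a sketch only, not necessarily the exact workaround from the issue):

[[runners]]
  [runners.cache]
    Type = "s3"
    Path = "gatsby-cache"              # placeholder prefix
    Shared = true
    [runners.cache.s3]
      ServerAddress = "s3.amazonaws.com"
      BucketName = "my-runner-cache"   # placeholder bucket
      BucketLocation = "us-east-1"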

Thank you & kind regards!

Hi, I don't think it's a cache problem. That message simply means there isn't a remote cache configured, so a local cache is used instead. I have multiple jobs for the same project. The shorter ones (around 8 minutes) usually run fine. The full build (~40 minutes) is the one hitting this random failure.

Similar issue: GitLab K8s Runner fails for unknown reasons

My guess is it's a communication issue between the runner manager pod and the build job pod.
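
If it does turn out to be memory rather than networking, one low-effort thing to try would be giving Node an explicit heap limit below whatever the build pod is allowed, e.g. in .gitlab-ci.yml (placeholder value, not verified to fix this):

variables:
  GIT_DEPTH: 1
  DOCKER_DRIVER: overlay2
  NODE_OPTIONS: "--max-old-space-size=4096"  # hypothetical: cap V8's heap at ~4 GB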