GitLab registry error when kaniko cache attempts to push a layer

Or alternatively: “building and reusing per-MR Docker images with kaniko and the GitLab registry as a cache”.

I have a Merge Request build job for a repository that requires a docker image to be built first. The docker image is based on the Dockerfile stored in the repository. The second stage is then built within the image created by the first stage. I am using kaniko to build the docker image.

Originally I had it set up so that the Docker Build stage would only run if the Dockerfile changed, but this caused problems for Merge Requests, because the wrong image would end up being used between different MR branches. There were some cryptic comments in the docs about using only: refs: [merge_requests], but that didn’t seem to help (the code below has this bit commented out).

As an alternative, I’m looking at building the Docker image every time the job runs, but relying on caching to speed up the build process. Essentially this just retags the image for every build, except in the rare cases where the Dockerfile has changed:

variables:
  GIT_SUBMODULE_STRATEGY: recursive

# always attempt to build the image, but rely on the cache to optimise if there are no changes
build_docker:
  stage: .pre
  image:
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  script:
    - echo "{\"auths\":{\"$CI_REGISTRY\":{\"username\":\"$CI_REGISTRY_USER\",\"password\":\"$CI_REGISTRY_PASSWORD\"}}}" > /kaniko/.docker/config.json
    - /kaniko/executor --cache=true --cache-repo=$CI_REGISTRY --context $CI_PROJECT_DIR --dockerfile $CI_PROJECT_DIR/Dockerfile --destination $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_NAME
#  only:
#    refs:
#      - merge_requests
#    changes:
#      - Dockerfile

build:
  stage: build
  artifacts:
    when: always
    name: "$env:CI_JOB_STAGE-$env:CI_COMMIT_REF_NAME"
    paths:
      - build/Testing
    reports:
      junit: "build/Testing/**/*.xml"
  image: $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_NAME
  script:
    - cmake -B build
    - cmake --build build -v
    - cmake --build build --target tests

I think this should work, but there’s an issue after a build:

 INFO[1215] Pushing layer registry.gitlab.com:2a2c59ff36b85f3f53bce3256345d97acd6b3659bd438ec12430773ce1c1ec9e to cache now 
WARN[1218] error uploading layer to cache: failed to push to destination registry.gitlab.com:fc83d775810318fdd35df579091ebc3bf9b20cda4e288929e1ca79390c89cdc1: HEAD https://index.docker.io/v2/library/registry.gitlab.com/blobs/sha256:b11101269a6284d7ffc132a183bef8b3942fd58e847c8e5c8f398024f78183f4: unsupported status code 401 

Because the layer push to the gitlab registry fails, the image is not available to subsequent builds, so the caching is basically non-operational.

Is there a way to fix this issue?

Also, is there a better way to do all of this? Is there a way to make the first approach work correctly?

I think if you remove --cache-repo=$CI_REGISTRY and leave only --cache=true, then it should work without any problem.

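Right now it fails because $CI_REGISTRY is only the registry host (registry.gitlab.com), not a repository path, so kaniko tries to push the cache to the root of the GitLab Registry - and you don’t have access to it, of course. Your log even shows the bare host name being resolved as a Docker Hub repository (index.docker.io/v2/library/registry.gitlab.com), hence the 401. When --cache-repo is not set, kaniko infers the cache repository from --destination, which points inside your project’s registry.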
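
For reference, here is a minimal sketch of how the build_docker job might look with that change. It assumes everything else in the job stays as posted; the only edit is that --cache-repo is dropped. The commented-out explicit cache repository ($CI_REGISTRY_IMAGE/cache) is an assumption on my side that mirrors what kaniko infers by default, not something tested in this thread.

```yaml
# always attempt to build the image, but rely on the cache to optimise if there are no changes
build_docker:
  stage: .pre
  image:
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  script:
    # registry credentials, unchanged from the original job
    - echo "{\"auths\":{\"$CI_REGISTRY\":{\"username\":\"$CI_REGISTRY_USER\",\"password\":\"$CI_REGISTRY_PASSWORD\"}}}" > /kaniko/.docker/config.json
    # no --cache-repo here: kaniko derives the cache repository from --destination
    # (assumed optional alternative: add --cache-repo=$CI_REGISTRY_IMAGE/cache to set it explicitly)
    - >-
      /kaniko/executor
      --cache=true
      --context $CI_PROJECT_DIR
      --dockerfile $CI_PROJECT_DIR/Dockerfile
      --destination $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_NAME
```

The key point is that the cache location has to be a repository under the project ($CI_REGISTRY_IMAGE), because the CI job credentials only grant push access there, not to the registry host itself.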

@rpadovani thank you for the hint - that fixed the issue, although it does take almost 5 minutes to check the cache. Still much faster than rebuilding the image though.

Well, this can be caused by a lot of different factors: how big the Docker image is, how powerful the runner is, how physically distant the runner is from the registry, and so on…