Container Registry Question

Our pipeline so far is very simple; however, some of the jobs take a while to finish because we have to install a whole bunch of packages every time:

image: php:7.1

...

unit-test:
  stage: test
  before_script:
    - apt-get update -yqq
    - apt-get install git libmcrypt-dev libpq-dev libcurl4-gnutls-dev libicu-dev libvpx-dev libjpeg-dev libpng-dev libxpm-dev zlib1g-dev libfreetype6-dev libxml2-dev libexpat1-dev libbz2-dev libgmp3-dev libldap2-dev unixodbc-dev libsqlite3-dev libaspell-dev libsnmp-dev libpcre3-dev libtidy-dev -yqq
    - docker-php-ext-install mbstring mcrypt pdo_pgsql curl json intl gd xml zip bz2 opcache
    - curl --show-error --silent https://getcomposer.org/installer | php -- --version=1.10.17
    - php composer.phar install
    - pecl install xdebug-2.9.2
    - docker-php-ext-enable xdebug

...

I was reading up on ways to improve our pipelines and saw the Container Registry page. However, I’m not sure I understand the purpose of the container registry, because I’m very new to Docker, images, and DevOps in general.

From my understanding, I could create my own image so that instead of doing image: php7.1, I could do image: custom-php7.1? And this custom PHP image would already have everything we need, so we wouldn’t have to do any apt-get install?

Yes, you can create your own image based on the php7.1 image and add the other dependencies to the new custom image.


What exactly is the container registry?

It’s just a place where we can store our images, essentially like Docker Hub.

I think you have two possible paths:

  • Container [sic] (image) Registry, using custom docker images; or
  • GitLab-CI Cache. This has the benefit of being a bit simpler to set up and maintain.

The custom image has the advantage that, when it is available locally, no external network access is required for the pipeline. However, it does mean you need to maintain (periodically rebuild) the image; otherwise it is prone to falling behind on bug and security fixes.


Container Registry

I have to laugh a little (at the gitlab naming convention), since the docs say

With the GitLab Container Registry, every project can have its
own space to store Docker images.

(bold is mine). From Docker overview | Docker Documentation,

Docker registries

A Docker registry stores Docker images.

(so far, so good)

Images

An image is a read-only template with instructions for creating a Docker container. …

Containers

A container is a runnable instance of an image. …

So an image is a static set of files (that typically include executables), and a container is one instance of that image in some form of running state.
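The distinction is easy to see on the command line. A quick (untested here) shell session, assuming a working docker CLI and daemon, and using `demo` as an arbitrary container name:

```shell
# An image is static; pulling it just downloads files.
docker pull php:7.1
docker images                               # lists stored images

# A container is a runnable instance of that image.
docker run -d --name demo php:7.1 sleep 60
docker ps                                   # lists running containers

# Removing the container leaves the image untouched.
docker rm -f demo
docker images                               # php:7.1 is still there
```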

Bottom line on that part … really, I think gitlab’s “Container Registry” is a “Docker Image Registry”, but perhaps they didn’t want it misinterpreted as some form of picture gallery. I’ll stick with “Container Registry”, but realize that it stores docker images.

By default, the image: tag in your .gitlab-ci.yml references a docker image on hub.docker.com. It can be advantageous to download it from somewhere else, though:

  1. with or without a docker account, hub.docker.com imposes rate limits;
  2. you may want to cache it locally so that you don’t download the same image repeatedly; or
  3. you may want to work on a customized image (that you don’t want to upload to hub.docker.com).

I think the latter is applicable here. There are (at least) two ways to host a custom registry, whether on-prem or on some hub/registry other than Docker Hub. For me, I run a local docker registry separate from gitlab, since almost all of the custom images I create are usable both inside and outside of gitlab. (In fact, some of them are useful when gitlab has not yet started.) For this, I deployed a local registry (including a local-only SSL key/cert pair, required for many use-cases) and access it via on-prem ports.
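As a sketch of the “run your own registry” path (untested here; the port mapping, cert paths, and filenames are assumptions for illustration), the official registry:2 image can be run with TLS via a compose file along these lines:

```yaml
# docker-compose.yml — minimal local registry with TLS
version: "3"
services:
  registry:
    image: registry:2
    ports:
      - "443:5000"
    environment:
      REGISTRY_HTTP_TLS_CERTIFICATE: /certs/domain.crt
      REGISTRY_HTTP_TLS_KEY: /certs/domain.key
    volumes:
      - ./certs:/certs          # your local-only key/cert pair
      - ./data:/var/lib/registry # persistent image storage
```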

Others might not want to put in the effort to run the docker registry external to gitlab, so the Container Registry might be an alternative, in which case one needs to reference GitLab Container Registry administration | GitLab (for enabling/setup) and GitLab Container Registry | GitLab (for use).
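If you do go the GitLab Container Registry route, the push workflow is the same as for any registry. A hedged sketch, where the registry hostname and group/project path are placeholders for your own instance:

```shell
docker login registry.gitlab.example.com
docker build -t registry.gitlab.example.com/mygroup/myproject/myphp:7.1 .
docker push registry.gitlab.example.com/mygroup/myproject/myphp:7.1
```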

I don’t have experience with the Container Registry itself, but if you follow those guidelines (or deploy your own Docker Registry), then you could create a custom docker image with something like the following (untested) Dockerfile:

FROM php:7.1

RUN apt-get update -yqq \
  && apt-get install -yqq git libmcrypt-dev libpq-dev libcurl4-gnutls-dev libicu-dev libvpx-dev libjpeg-dev \
    libpng-dev libxpm-dev zlib1g-dev libfreetype6-dev libxml2-dev libexpat1-dev libbz2-dev libgmp3-dev \
    libldap2-dev unixodbc-dev libsqlite3-dev libaspell-dev libsnmp-dev libpcre3-dev libtidy-dev \
  && docker-php-ext-install mbstring mcrypt pdo_pgsql curl json intl gd xml zip bz2 opcache \
  && curl --show-error --silent https://getcomposer.org/installer | php -- --version=1.10.17 \
  && pecl install xdebug-2.9.2 \
  && docker-php-ext-enable xdebug

# Note: `php composer.phar install` is deliberately omitted here: it needs your
# project's composer.json, which isn't in this image's build context. Run it in
# the CI job instead (composer.phar ends up in the image's working directory).

Then something like:

docker build -t myphp .  # from the dir containing the Dockerfile
docker tag myphp myregistry.mydomain.com/myphp:7.1
docker push myregistry.mydomain.com/myphp:7.1

After that, change your .gitlab-ci.yml to use it, as in

image: myregistry.mydomain.com/myphp:7.1

unit-test:
  stage: test
  # before_script is no longer needed; the image has everything preinstalled
  script:
    - ...

GitLab-CI Cache

With this option, you would not need a registry; you would instead cache the dependency files so that a subsequent apt-get install would be much faster. Since these are OS packages that might have pre-/post-install scripts and such, it would probably be best to cache the *.deb files downloaded into /var/cache/apt/archives/ (and continue to let apt-get run all associated scripts). (Spoiler: googling gitlab ci cache apt-get produces many relevant-looking links.)

I haven’t tested it, but Caching apt (#991) · Issues · GitLab.org / gitlab-runner · GitLab suggests that one can do something like:

image: php:7.1

cache:
  paths:
  - apt-cache/

unit-test:
  stage: test
  script:
    - apt-get update -yqq
    - apt-get -o dir::cache::archives="apt-cache" install -yqq ...
...

(There’s also a later comment, Caching apt (#991) · Issues · GitLab.org / gitlab-runner · GitLab, not sure which is better for you.)

You still run apt-get update and apt-get install, but if all is good then none of the cached .deb files would need to be re-downloaded (the only network activity is apt-get update). When a package is updated, then that one package is downloaded and the cache would be updated (so that the next time the pipeline runs, it is benefiting from the recently-updated package).

The cache is stored on the machine where the gitlab-runner is installed; depending on your use-case, you might want/need a distributed cache, in which case you need something like S3 object storage. In my case, I installed MinIO for on-prem S3-compatible storage, configured gitlab to use it, and voilà, distributed cache.
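For reference, the distributed-cache setup lives in the runner’s config.toml. A minimal S3/MinIO sketch (server address, keys, and bucket name are placeholders), which should look something like:

```toml
[runners.cache]
  Type = "s3"
  Shared = true
  [runners.cache.s3]
    ServerAddress = "minio.example.com:9000"
    AccessKey = "MINIO_ACCESS_KEY"
    SecretKey = "MINIO_SECRET_KEY"
    BucketName = "runner-cache"
    Insecure = false
```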


Wow thanks a lot for the guide @bill.evans - I’m going to try this once HTTPS is enabled on our GitLab server.