Cannot run docker compose in a docker container as part of a job

  • I am attempting to run docker compose inside a container as part of a job in my pipeline. The pipeline has two jobs: Build and Test. The Build job builds the project and places the artifacts in the build/ directory of the runner container. The Test job then mounts the binaries (contained in the artifacts) into a docker compose instance and starts it to run some tests against its containers. The Test job runs in a custom python3.8 image that is stored locally on the runner's host machine.

What are you seeing, and how does that differ from what you expect to see?

  • When the pipeline runs, the docker compose up -d line causes the job to error out with this message: /bin/bash: line 19: docker: command not found

Are you using self-managed or GitLab.com?

  • self-hosted runner

What version are you on?

Version:      15.1.0
Git revision: 76984217
Git branch:   15-1-stable
GO version:   go1.17.7
Built:        2022-06-20T10:10:54+0000
OS/Arch:      linux/amd64

.gitlab-ci.yml

stages:
  - Build
  - Test

Build:
  stage: Build
  script:
    - cmake -E make_directory build
    - cd build && cmake -DOUTPUT_PATH_PREFIX=$(pwd) ../ -DCMAKE_BUILD_TYPE=Debug
    - cmake --build . --config Debug -- -j$(nproc)
  artifacts:
    paths:
      - "build"
    expire_in: 24 hours

Test:
  stage: Test
  script:
    - cd /builds/project/gen1-compose
    - docker compose up -d
    - docker exec main-1 bash ./start.sh
    - python3 main.py
    - docker compose down

config.toml

concurrent = 1
check_interval = 0

[session_server]
  session_timeout = 1800

[[runners]]
  name = "CUSTOM-RUNNER"
  url = "https://gitlab.com"
  token = "ABC-123"
  executor = "docker"
  [runners.custom_build_dir]
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
    [runners.cache.azure]
  [runners.docker]
    tls_verify = false
    image = "https://example/image"
    privileged = true
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/cache", "/var/volatile/tmp"]
    pull_policy = ["if-not-present"]
    shm_size = 0

Custom python3.8 Dockerfile

FROM docker/compose as base

RUN apt-get update && apt-get install wget unzip libgconf-2-4 neovim -y

FROM python:3.8 as py38

ENV DEBIAN_FRONTEND noninteractive
ENV TARGET_CHROME_VERSION 111.0.5563.64-1
ENV TARGET_WEBDRIVER_VERSION 111.0.5563.64

# Set up the Chrome PPA
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add -
RUN echo "deb http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list

# Update the package list and install chrome
RUN apt-get update -y
RUN apt-get install -y "google-chrome-stable=${TARGET_CHROME_VERSION}"

# Set up Chromedriver Environment variables
ENV CHROMEDRIVER_DIR /chromedriver
RUN mkdir $CHROMEDRIVER_DIR

# Download and install Chromedriver
RUN wget -q --continue -P $CHROMEDRIVER_DIR "http://chromedriver.storage.googleapis.com/$TARGET_WEBDRIVER_VERSION/chromedriver_linux64.zip"
RUN unzip $CHROMEDRIVER_DIR/chromedriver* -d $CHROMEDRIVER_DIR

# Put Chromedriver into the PATH
ENV PATH $CHROMEDRIVER_DIR:$PATH

Please let me know if more info is needed and I will gladly add it. Thank you!

Looking at your Dockerfile, I don’t see anything from the base stage going into the py38 stage. In a multi-stage build, each stage starts fresh from its own FROM line and only receives files from an earlier stage through an explicit COPY --from, and there is no such COPY here. That is, I think you get the same image if you remove everything up to the second FROM.
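
For illustration, here is a minimal sketch of the only way files cross stages. The image tag and binary path below are my assumptions, not taken from your post:

# Hypothetical sketch: copying a binary from an earlier stage into a later one
FROM docker:24-cli AS cli

FROM python:3.8 AS py38
# Without an explicit COPY --from like this one, the py38 stage
# contains nothing from the cli stage:
COPY --from=cli /usr/local/bin/docker /usr/local/bin/docker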

Then, based on config.toml, both of your jobs run in the same image (the runner-wide default, since neither job sets image: in .gitlab-ci.yml). Because nothing from the base stage ended up in your custom py38 image, the docker command is not in it and you get the error you see.

I suggest you use two images: one that has everything needed for the Build job (assuming it doesn’t need docker and google-chrome-stable) and one for the Test job.
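
As a sketch, each job could declare its own image in .gitlab-ci.yml rather than relying on the runner-wide default from config.toml. The image names below are placeholders, not your actual images:

Build:
  stage: Build
  image: registry.example.com/builder:latest        # hypothetical image with cmake
  script:
    - ...

Test:
  stage: Test
  image: registry.example.com/py38-docker:latest    # hypothetical image with python3.8 and the docker CLI
  script:
    - ...

One caveat: installing the docker CLI in the Test image only fixes "command not found". The CLI still needs a Docker daemon to talk to, for example a docker:dind service in the job or the host's /var/run/docker.sock added to the volumes list in config.toml.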