I am trying to add CI/CD to our workflow. To do that, I would like to leave the existing repos untouched and add this functionality on top, in side repos that use the source as a git submodule. This ninja approach will ensure that some of my colleagues, who are against version control in general, let alone things such as CI/CD, will have very little to do but will still enjoy the benefits, at least when it comes to automatic packaging and deployment.
A colleague of mine has created a subgroup project (we all have access to it) in our company’s GitLab instance (we do not manage the instance and are simply end users). Inside the project there are multiple repos: some just for experimentation, others actual products for customers. Let’s say the repos that I am interested in are called r1, r2 and r3. Each one is a Python repo with a setup.py that builds a wheel using setuptools.setup, and r2 and r3 use r1 as a dependency (all declared in the respective setup.py files).
My plan is to do the following:
- Create a repository that will take r1, r2 and r3 as git submodules. All three are initialized inside ./components/, each in its own directory (e.g. ./components/r1 for r1).
- Add a .gitlab-ci.yml at the top level of this repo that will handle building the package for each git submodule using the provided setup.py file.
My .gitlab-ci.yml currently contains just a single stage (with some log messages for debugging purposes) with a single job:
.git_vars:
  variables:
    GIT_SUBMODULE_STRATEGY: recursive
    GIT_SUBMODULE_DEPTH: 1

stages:
  - build

build-pypl-pkg:
  stage: build
  rules:
  image: python:latest
  variables: !reference [.git_vars, variables]
  script:
    - echo Installing Twine for publishing PyPI package
    - pip install build twine
    # TODO Remove. Currently for debugging purposes
    - cat .gitmodules
    # TODO Remove. Currently for debugging purposes
    - ls -alhR components/
    - echo Building package for component SWA Generic
    - python -m build components/swa_generic/
    - echo Building package for component SWA Kernel
    - python -m build components/swa_kernel/
    - echo Building package for component SWA Visibility
    - python -m build components/swa_visibility/
    - echo Package will be published at ${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/packages/pypi
    - TWINE_PASSWORD=${CI_JOB_TOKEN} TWINE_USERNAME=gitlab-ci-token python -m twine upload --verbose --repository-url ${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/packages/pypi dist/*
The job is more or less copy-paste from the official GitLab documentation on building PyPI-compatible packages and publishing those to the project’s repository.
The job fails with the following errors (I have used dummy data, but the structure of the URLs and paths is as close to the real thing as possible):
Fetching changes with git depth set to 20...
Initialized empty Git repository in /builds/company/deperatment/group/cloud/services/project/.git/
Created fresh repository.
Checking out 05d65552 as main...
Updating/initializing submodules recursively with git depth set to 1...
Submodule 'r1' (https://gitlab.example.com/company/deperatment/group/project/r1.git) registered for path 'components/r1'
Submodule 'r2' (https://gitlab.example.com/company/deperatment/group/project/r2.git) registered for path 'components/r2'
Submodule 'r3' (https://gitlab.example.com/company/deperatment/group/project/r3.git) registered for path 'components/r3'
Synchronizing submodule url for 'components/r1'
Synchronizing submodule url for 'components/r2'
Synchronizing submodule url for 'components/r3'
Cloning into '/builds/company/deperatment/group/cloud/services/project/components/r1'...
fatal: could not read Username for 'https://gitlab.example.com': No such device or address
fatal: clone of 'https://gitlab.example.com/company/deperatment/group/project/r1.git' into submodule path '/builds/company/deperatment/group/cloud/services/project/components/r1' failed
Failed to clone 'components/r1'. Retry scheduled
Cloning into '/builds/company/deperatment/group/cloud/services/project/components/r2'...
fatal: could not read Username for 'https://gitlab.example.com': No such device or address
fatal: clone of 'https://gitlab.example.com/company/deperatment/group/project/r2.git' into submodule path '/builds/company/deperatment/group/cloud/services/project/components/r2' failed
Failed to clone 'components/r2'. Retry scheduled
Cloning into '/builds/company/deperatment/group/cloud/services/project/components/r3'...
fatal: could not read Username for 'https://gitlab.example.com': No such device or address
fatal: clone of 'https://gitlab.example.com/company/deperatment/group/project/r3.git' into submodule path '/builds/company/deperatment/group/cloud/services/project/components/r3' failed
Failed to clone 'components/r3'. Retry scheduled
Cloning into '/builds/company/deperatment/group/cloud/services/project/components/r1'...
fatal: could not read Username for 'https://gitlab.example.com': No such device or address
fatal: clone of 'https://gitlab.example.com/company/deperatment/group/project/r1.git' into submodule path '/builds/company/deperatment/group/cloud/services/project/components/r1' failed
Failed to clone 'components/r1' a second time, aborting
Cleaning up project directory and file based variables 00:01
ERROR: Job failed: command terminated with exit code 1
According to the GitLab documentation on CI/CD and git submodules, a submodule can be accessed via either absolute or relative URLs. I prefer to stick to absolute URLs since we are still learning about GitLab (and git in general), so things change quite often. In addition, sometimes one would like to use an external repo (outside of the GitLab instance), and referencing it is easier than importing it.
Whenever absolute URLs are used, the documentation states that a Personal Access Token has to be used.
First of all, I find this strange, especially in an environment where a lot of things are shared within a team. Second, it binds the whole procedure to a specific user. If that user disappears (e.g. a layoff), another user will have to take over with their own PAT, and so on. I would hope that CI_JOB_TOKEN can be used here instead?
Last but not least, I still don’t know how to supply my PAT in this particular case if I absolutely have to use it. Should I add git config to my script and set up SSH in the image that builds the package, or set up an access token in the project and pipe it in via stdin? This looks like overkill to me, and I am anything but an expert in this regard.
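To make the git config option concrete, here is a minimal, untested sketch of what such a script step might look like if CI_JOB_TOKEN turns out to be usable. It relies on several assumptions that are not confirmed anywhere above: that the job sets GIT_SUBMODULE_STRATEGY to none so the runner does not try to clone the submodules itself, that r1, r2 and r3 accept this project's job token (depending on the GitLab version, this may require adding the side repo to each project's job token allowlist), and that git is available in the image. The fallback values and the URL used for the check are placeholders taken from the dummy data in the log.

```shell
#!/bin/sh
# Sketch only: let git authenticate HTTPS submodule clones with the CI job token.
# Assumption: GIT_SUBMODULE_STRATEGY is "none" for this job, so the runner
# leaves the submodules alone and this script fetches them instead.
set -eu

# CI_SERVER_HOST and CI_JOB_TOKEN are predefined GitLab CI variables. The
# fallbacks exist only so the snippet can be tried outside of a pipeline.
host="${CI_SERVER_HOST:-gitlab.example.com}"
token="${CI_JOB_TOKEN:-dummy-token}"

# Rewrite every https://<host>/ URL so that it carries the job token as
# credentials. The absolute URLs in .gitmodules stay exactly as they are.
git config --global \
  "url.https://gitlab-ci-token:${token}@${host}/.insteadOf" \
  "https://${host}/"

# Ask git what an absolute submodule URL now resolves to. With --get-url,
# git only expands the URL and does not contact the server.
resolved="$(git ls-remote --get-url "https://${host}/company/deperatment/group/project/r1.git")"

case "$resolved" in
  *"@${host}/"*) echo "URL rewrite is active" ;;
  *) echo "URL rewrite is NOT active" >&2; exit 1 ;;
esac

# In the real job the submodules would then be fetched with:
#   git submodule update --init --recursive --depth 1
```

Because the rewrite lives in the job's git configuration rather than in .gitmodules, nothing user-specific is committed, which is the main appeal over a PAT. Whether the job token is actually granted read access to the three repos is something only the instance and project settings can answer.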