With GitLab Pages and CI/CD, how do you build different branches to different (sub-)directories?

matherion · August 6, 2020, 10:20am

[This is a duplicate of https://stackoverflow.com/questions/63281252/with-gitlab-pages-and-ci-cd-how-do-you-build-different-branches-to-different-s, where I’ll also post the solution if one exists/materializes]

I am writing a number of R BookDown books. These are suitable as living books that can be updated more or less continuously. However, we now want to use one in teaching. Therefore, we want to build an environment where we have a production version (the prod branch) hosted on GitLab Pages, with, for example, the default branch (called dev) and maybe a staging branch rendered into subdirectories (e.g. /dev and /staging).

There are apparently several approaches, but I haven’t managed to get any to work.

One is to use Review Apps (this seems promising), but these seem to require Kubernetes (see here; I’m not entirely sure what that is; and certainly don’t need it for anything else, it seems?).
Another is to build Merge Trains. However, GitLab adamantly keeps insisting that (like here, except that including the proposed solution didn’t work in my case).
I also tried checking out different branches and then building into the right directory, using the suggestion here. However, git complains that “dev did not match any file(s) known to git”.

I am probably (well, quite clearly) way out of my depth here. But it feels like this (i.e. building different branches in sub-directories of Pages’ public directory) should be doable even for people who don’t code for a living…

My current CI configuration is here, and the GitLab Runner complaint about the file it doesn’t know is here, but I’m not sure that’s useful to look at. There seem to be so many different approaches (and I kept lots of fragments there, commented out, in case I’d need them again).

Is there a generic GitLab pages CI configuration that builds one branch into public and other branches into subdirectories?

(tagging @dnsmichi)

dnsmichi · August 10, 2020, 12:27pm

Hi,

It took me a while to fully understand where you are going with this.

Kubernetes is a container orchestrator, and for this specific example it lets you run web server containers. The idea here is to have different containers for different environments: prod, stage, dev.

The main benefit of review apps is that you can dynamically update DNS and have <branchname>.FQDN accessible for that specific environment. That way you do not need to remember IP addresses or care about specific port mappings.

Containers make it easier to start, stop, and deploy specific environments, which are run by CI/CD jobs. If you do not have access to a Kubernetes cluster, you could build the same with your own VMs, with the different deployment branches targeting them.

Another example with Nginx is linked in the docs, as you can also run your own GitLab pages server on your self-hosted GitLab instance.

GitLab Pages with dev, stage, prod

GitLab Pages serves the content via Nginx on GitLab.com, and uses the configured custom domain by default for the main branch. All other branches can be tricked into a deployment with the *.gitlab.io subdomains via special CI config, but this is experimental and does not work reliably yet with replacing specific rewrite base paths.

Environments do let you enable a review app for a specific environment, though, and this is the direction you should be aiming for. The idea here is the mechanism for deployments and running machines.

Pages Versioning

I think the discussion from

leads into using the GitLab API as entry point described in this blog post:

This probably is the foundation for future architecture decisions on GitLab pages but I don’t know when this will happen.

Create your own Versioning

One thing which is pretty common is to create your own sub-folders from generation jobs. Folders such as latest, <version>, and dev could then be the entry points for the documentation. That’s how we have built online docs with mkdocs in the past too; each folder was its own path.

The problem I see here is that the different Git branches require a “merge them all together and then deploy at once” job. At first glance, I would move this into a dedicated job that is only triggered on demand.

deploy_prod:
  stage: deploy
  script:
    - ./scripts/build-docs-content.sh
  when: manual
  only:
  - master
  artifacts:
    paths:
      - ./public

cat scripts/build-docs-content.sh

# define the versions to fetch

# loop over them, clone/checkout, build into target directory underneath public/
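
As a rough illustration of those two comments, here is a minimal sketch of what such a script could look like. The branch names, the REPO_URL variable, the helper names, and the bookdown build command are all assumptions for illustration; they are not taken from the actual project.

```shell
#!/bin/sh
# Sketch of scripts/build-docs-content.sh. Branch names, variable names and
# the build command are assumptions, not the project's real setup.
set -eu

# Versions to fetch. The root branch is listed first so that the other
# branches are built into subdirectories of its output afterwards.
BRANCHES="${BRANCHES:-prod staging dev}"
ROOT_BRANCH="${ROOT_BRANCH:-prod}"
OUT_DIR="${OUT_DIR:-public}"

# Map a branch name to its target directory underneath public/.
target_dir() {
  if [ "$1" = "$ROOT_BRANCH" ]; then
    printf '%s\n' "$OUT_DIR"
  else
    printf '%s\n' "$OUT_DIR/$1"
  fi
}

# Hypothetical build step: render the book found in checkout $1 into $2.
build_book() {
  (cd "$1" && Rscript -e "bookdown::render_book('index.Rmd', output_dir = '$2')")
}

# Loop over the branches: clone each one, then build into its target directory.
build_all() {
  repo_url="$1"
  work="$(mktemp -d)"
  for branch in $BRANCHES; do
    # A fresh shallow clone per branch does not depend on what the runner fetched.
    git clone --quiet --depth 1 --branch "$branch" "$repo_url" "$work/$branch"
    dest="$(pwd)/$(target_dir "$branch")"
    mkdir -p "$dest"
    build_book "$work/$branch" "$dest"
  done
  rm -rf "$work"
}

# Only build when a repository URL is given, so the helpers can be checked offline.
if [ -n "${REPO_URL:-}" ]; then
  build_all "$REPO_URL"
fi
```

In a CI job, one option might be to set REPO_URL to GitLab's predefined CI_REPOSITORY_URL variable; locally, any clone URL or path should do.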

Your question

$ git checkout origin/prod
error: pathspec 'origin/prod' did not match any file(s) known to git
ERROR: Job failed: exit code 1

origin is the remote, and the prod branch might not exist yet, or might not have been fetched into the clone the job is working in.
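
To illustrate the second possibility, the sketch below reproduces the error in a throwaway repository and then resolves it with an explicit fetch. It assumes the CI clone only contains the branch being built, which may or may not match your runner's settings; the temporary repository and its branch names stand in for the real project.

```shell
#!/bin/sh
set -eu
tmp="$(mktemp -d)"

# Throwaway "remote" with a prod and a dev branch (stand-in for the real project).
git init --quiet "$tmp/remote"
cd "$tmp/remote"
git -c user.name=demo -c user.email=demo@example.com \
  commit --quiet --allow-empty -m "initial commit"
git branch prod
git branch dev

# Clone only the dev branch, roughly like a job that fetched a single ref.
git clone --quiet --single-branch --branch dev "file://$tmp/remote" "$tmp/job"
cd "$tmp/job"

# origin/prod is unknown in this clone, so the checkout fails.
if git checkout --quiet origin/prod 2>/dev/null; then
  before=ok
else
  before=failed
fi

# Fetch the branch explicitly into a remote-tracking ref, then check it out
# (this leaves the working copy on a detached HEAD at origin/prod).
git fetch --quiet origin "+refs/heads/prod:refs/remotes/origin/prod"
git checkout --quiet origin/prod
after=ok

echo "before fetch: $before, after fetch: $after"

# Clean up the temporary repositories.
cd /
rm -rf "$tmp"
```

In a real job, only the fetch and checkout lines would be needed; a shallow clone may additionally need a --depth option on the fetch.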

Suggestions

Move the installation portions into a Dockerfile, push that to the registry and then build based on that. Installing everything on every CI run is really cost intensive (see .gitlab-ci.yml · dev · Michael Friedrich / Open Methodologie en Statistiek · GitLab).
Consider your own versioning, hidden in a script, to avoid repeated YAML edit attempts. You can also test the script offline when it uses git clone ... inside.
Upvote the existing issues.

Cheers,
Michael

matherion · August 11, 2020, 10:20am

Wow! Thank you for that extensive answer! There’s quite a bit I have to learn to understand everything, so I might take a while before I can reply - but I wanted to drop a quick thank-you already for this awesome explanation!!!

snim2 · August 11, 2020, 9:40pm

This is Jekyll, not R BookDown, but this .gitlab-ci.yml deploys different branches to different environments. You could change the rules to use tags, rather than branches, if you wish.

The current drawback to this (from your perspective) is that the URLs of the non-production environments are not very clean, and they are messier in my case because I have a project within a subgroup, within a group. For my purposes that doesn’t matter, but YMMV.

A final option might be to deploy your environments somewhere else. I have one repo where I use a webhook to automatically deploy a mkdocs repo to readthedocs, which can deal with versioning docs based on git tags. I don’t think RTD can deal with R BookDown, but maybe there is a similar hosting service that can.

Cheers,

Sarah

matherion · August 12, 2020, 8:36am

Wow! Thank you for chiming in!!! That looks like an awesome example that I can use to learn how to do all this

I can live with non-clean URLs for the non-production branches. Those are, after all, not for end users, anyway - and I don’t think it’s unreasonable to expect people who do the development to be able to deal with non-clean URLs

Deploying the environments elsewhere sounds like it can be great in other cases, but I won’t need that now.

I’ve read up on Kubernetes and Docker a bit, and it seems like Docker is super-useful in any case (to enhance reproducibility), and that I won’t need to dive into Kubernetes further, hopefully, so it looks like your example is perfect.

So - do I see correctly that you actually create a docker image in the meta-build-image job, and then use that for subsequent jobs? That seems super-efficient, impressive!

For now, thank you very much! I’ll be back to report success.*

Thank you!

Gjalt-Jorn

*Of course I won’t; I’ll be back when I get completely stuck, but I thought it was nice to end on a positive note

snim2 · August 12, 2020, 9:31am

You’re exactly right about the meta-build-image job, there’s a really nice GitLab blog post that talks you through how this works.

Good luck!