Is it possible to have a kind of persistent Docker image for GitLab-CI?

Caused by different Runner configuration. It has no effect on your problem.

Well, I created a new account on gitlab.com in order to test the cache. Unfortunately, it doesn’t work there either: a tiny change in a single Markdown source file still results in a job that takes 16 minutes and 58 seconds, although the log reports that a cache was restored.

That’s a private repository.

But why don’t you just replicate it locally? I suppose you have a local clone and Sphinx on your workstation. Just keep the source files (remove everything else), run sphinx-build, see what gets created as cache, and add it to the paths under cache in CI.

(Sorry for the private repository, which is now public.)

I proceeded as you advised and, AFAICS locally, the cache is located in build/doctrees. By a command I added to .gitlab-ci.yml, this is confirmed on the remote repository according to the log of the previous job:

$ ls -alh build/doctrees
total 22M
drwxr-xr-x  9 root root 4.0K Jul 13 21:07 .
drwxr-xr-x  4 root root 4.0K Jul 13 21:07 ..
drwxr-xr-x  4 root root 4.0K Jul 13 21:07 0_cette_faq
drwxr-xr-x  7 root root 4.0K Jul 13 21:07 1_generalites
drwxr-xr-x  8 root root 4.0K Jul 13 21:07 2_programmation
drwxr-xr-x  9 root root 4.0K Jul 13 21:07 3_composition
drwxr-xr-x 18 root root 4.0K Jul 13 21:07 4_domaines_specialises
drwxr-xr-x 11 root root 4.0K Jul 13 21:07 5_fichiers
drwxr-xr-x  7 root root 4.0K Jul 13 21:07 6_distributions
-rw-r--r--  1 root root  22M Jul 13 21:07 environment.pickle
-rw-r--r--  1 root root  13K Jul 13 21:07 index.doctree

Unfortunately, a tiny change in a single source file still regenerates all the 1226 HTML pages:

building [html]: targets for 1226 source files that are out of date
updating environment: 0 added, 1226 changed, 0 removed

although:

That’s strange. Try running ls -alh build/doctrees as a first step, to see whether the cache is really there before running sphinx-build.
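If the raw ls output is hard to compare between jobs, the same check can be sketched in Python. This is only an illustration: the function name summarize_doctree_cache is hypothetical, and the only things it assumes about the cache are what the listing above shows (an environment.pickle file plus *.doctree files under build/doctrees).

```python
from pathlib import Path


def summarize_doctree_cache(doctree_dir):
    """Summarize what a restored Sphinx doctree directory contains.

    Hypothetical helper: it only looks for the files visible in the
    `ls -alh build/doctrees` listing (environment.pickle, *.doctree).
    """
    root = Path(doctree_dir)
    if not root.is_dir():
        # Nothing was restored at this location.
        return {"exists": False, "has_environment": False, "doctree_count": 0}
    # Count pickled doctrees in the directory and all its subdirectories.
    doctrees = list(root.rglob("*.doctree"))
    return {
        "exists": True,
        "has_environment": (root / "environment.pickle").is_file(),
        "doctree_count": len(doctrees),
    }
```

Printing summarize_doctree_cache("build/doctrees") just before sphinx-build would show whether the restored cache is empty, partial, or complete.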

Here it is.

Ok, so the GitLab Runner cache functionality is working and it properly restores cache. The issue is that sphinx-build is not using it. I am out of ideas here.

What is strange is that, locally, sphinx-build does use the cache.

Maybe the problem comes from the fact that the job is run by shared runners that are not the same each time. Is it possible to ask to always use the same shared runner?

The cache on the GitLab SaaS Runners is shared, all runners use the same cache.

In the hope of making the trouble I’m facing more obvious, I created a minimal Sphinx-doc project:

  • with mainly default Sphinx settings,
  • with a minimal conf.py config file,
  • but with many (100) test files in order to highlight the phenomenon.

The .gitlab-ci.yml file contains two consecutive, identical make html instructions. This shows that, after a tiny harmless change to a non-source file only (.gitlab-ci.yml):

  1. the first make html instruction, despite a cache claimed to be restored, regenerates all the HTML pages, with the message building [html]: targets for 95 source files that are out of date,
  2. the second make html instruction doesn’t generate any of the HTML pages, with the message building [html]: targets for 0 source files that are out of date.

The second make html instruction behaves just as the first (and single) one when run locally on my computer.

I got help outside this forum from a guy (but who has no clue about sphinx-doc). Here is what he said:

[Y]ou are shooting yourself in your foot because you use make. make is utterly ineffective in CI pipelines because at the beginning of each pipeline the repo is cloned afresh, meaning the file modification dates of the source files are usually newer than the cached output files even if nothing did change. Because make only considers file modification dates/timestamps, make is definitely not helpful for the first invocation. The second invocation obviously behaves correctly because the first make invocation rebuilds all outputs.
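To make that timestamp argument concrete, here is a minimal Python sketch of a make-style staleness rule. Everything in it is illustrative: the file names page.md and page.html and the timestamps 1000, 2000 and 3000 are arbitrary, and it assumes that restoring the cache keeps the old modification time of the cached output.

```python
import os
from pathlib import Path


def is_out_of_date(source, target):
    """make-style rule: rebuild if the target is missing or older than the source."""
    source, target = Path(source), Path(target)
    if not target.exists():
        return True
    return source.stat().st_mtime > target.stat().st_mtime


def simulate_fresh_clone(workdir):
    """Return (stale_before_clone, stale_after_clone) for one unchanged source file."""
    workdir = Path(workdir)
    source = workdir / "page.md"      # hypothetical source file
    target = workdir / "page.html"    # hypothetical cached output
    source.write_text("# Unchanged content\n")
    target.write_text("<h1>Unchanged content</h1>\n")

    # Previous pipeline: source checked out at t=1000, output built at t=2000.
    os.utime(source, (1000, 1000))
    os.utime(target, (2000, 2000))
    before = is_out_of_date(source, target)

    # Next pipeline: the output comes back from the cache with its old mtime
    # (assumption), but the fresh clone rewrites the source at t=3000.
    os.utime(source, (3000, 3000))
    after = is_out_of_date(source, target)
    return before, after
```

The content of the source never changes in this sketch, yet the rule reports it as stale after the simulated clone, which is the behaviour described in the quote.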

So you can reduce this minimal example by removing make and simply calling sphinx-build. And the pipelines for this new repository also show that a cache is pulled at the beginning and uploaded at the end of the pipeline. So GitLab’s caching is working. It’s now a question of what sphinx-build needs to determine whether to rebuild. If it works like make, you are out of luck because of file timestamps. If it works with another mechanism, you should look up where that is stored and check that all the information needed for the rebuild is cached (maybe more is needed than just the doctrees directory).

I told him that:

  • I had already tried sphinx-build instead of make, but it didn’t work either,
  • I’m afraid sphinx-build doesn’t work with another mechanism and, regarding the cache, only the doctrees directory is involved.

He then answered:

So you need to use one of the solutions modifying the file timestamps to match the git commits to have an effective solution here. There are a variety of tools out there that do this. Just search the internet for git checkouts with mtime preserved.

Does anybody know how to modify the timestamps to match the git commits?
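For reference, one possible way to do this is sketched below in Python. It is not one of the existing tools alluded to above, just a minimal illustration: the function names are hypothetical, it relies only on git ls-files and git log -1 --format=%ct, and it assumes the job has enough git history for git log to find the real last commit of each file.

```python
import os
import subprocess
from pathlib import Path


def tracked_files(repo_dir):
    """List the paths tracked by git, relative to the repository root."""
    out = subprocess.run(
        ["git", "ls-files", "-z"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout
    return [path for path in out.split("\0") if path]


def last_commit_time(repo_dir, relative_path):
    """Committer timestamp (Unix seconds) of the last commit touching the file."""
    out = subprocess.run(
        ["git", "log", "-1", "--format=%ct", "--", relative_path],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout.strip()
    return int(out) if out else None


def apply_mtimes(repo_dir, timestamps):
    """Set access and modification times from a {relative_path: timestamp} mapping."""
    updated = 0
    for relative_path, timestamp in timestamps.items():
        path = Path(repo_dir) / relative_path
        if timestamp is None or not path.is_file():
            continue  # no known commit, or the file is absent from the checkout
        os.utime(path, (timestamp, timestamp))
        updated += 1
    return updated


def restore_mtimes(repo_dir="."):
    """Give every tracked file the time of the last commit that modified it."""
    timestamps = {p: last_commit_time(repo_dir, p) for p in tracked_files(repo_dir)}
    return apply_mtimes(repo_dir, timestamps)
```

Calling restore_mtimes(".") before the build would give unchanged files the same modification time in every pipeline. With a shallow clone the reported commit times can be wrong, so the job would probably need the full history (on GitLab, for example, by setting GIT_DEPTH to 0). Whether this is enough for sphinx-build to reuse its doctrees is not confirmed anywhere in this thread.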

@balonik I wanted to let you know that I’ve finally been able to find a solution to this problem!

I moved heaven and earth: I asked the question on countless forums (including the current one), asked friends of mine, and spent hours and hours checking whether it would work better on a self-hosted GitLab instance with a self-hosted ‘runner’ (it didn’t work any better, but I did manage to create such an instance), etc. I went through moments of terrible doubt. And then, in the end, I came across an extremely simple solution, completely by chance and, above all, by miracle (see below).

You can see the result: for this slight modification on a single source file, the regeneration only took 43 seconds.

The resulting site of our new French LaTeX FAQ can be seen here

What’s absolutely mind-boggling about this story is that stumbling across the extremely simple solution above was actually a miracle. For weeks, I started with a Docker image based on Linux with Python added (necessary because Sphinx-doc is based on it), followed by the installation of Sphinx-doc and then the Sphinx-doc compilation of the source files. I’d been thinking for a long time that I’d have to start with a Docker image that already included Sphinx-doc: a friend of mine built one for me at the beginning of July and it slightly sped up the boot process (but not the compilation). I stopped using his image because it was pinned to version 6.2.1 of Sphinx-doc, which I had to stick to at the time (I’ll spare you the details of why), whereas since mid-July I can (and must) use the most recent version (7.1.2). When, in desperation, I decided to re-test a Docker image incorporating Sphinx-doc, I couldn’t rely on the official Sphinx-doc image because it provided a version that wasn’t up to date; I found another, up-to-date one, which solved the problem. I mentioned that in the issue above, and the Sphinx-doc developer who had helped me so much asked me to test with their official image, this time updated; I did so and, for some reason not yet clear, my problem persisted with this official image. As you can see, the solution I’ve found is a miracle: I could have found a thousand other Docker images integrating Sphinx-doc with which the problem would have persisted!

I wanted to thank you again for your great and nice help!
