CI_SERVER_TLS_CA_FILE file deleted after each job

I used to run gitlab-runner 11.9.0 on a CentOS machine. I recently migrated to an Ubuntu 20.04 machine, installed the latest gitlab-runner version (13.8.0), and ran into an issue in my CI.

I have two projects, let's say ProjectA and ProjectB. ProjectB depends on ProjectA, so the CI of ProjectB has this job:

pull project a:
  stage: pull project a
  script:
  - cd ../ProjectA
  - git reset --hard
  - git fetch https://gitlab-runner-token:$GITLAB_RUNNER_TOKEN@my.git.url.com/projects/ProjectA.git/ 
  - git checkout $CI_COMMIT_REF_NAME
  - git pull https://gitlab-runner-token:$GITLAB_RUNNER_TOKEN@my.git.url.com/projects/ProjectA.git/ $CI_COMMIT_REF_NAME

And during the fetch command I receive this error:

fatal: unable to access 'https://my.git.url.com/projects/ProjectA.git/': Problem with the SSL CA cert (path? access rights?)

I discovered that this happens because the runner uses a file called CI_SERVER_TLS_CA_FILE, located in the project’s temporary directory (ProjectA.tmp in my case), to fetch and pull from repositories. It worked on my old CentOS machine because this file was not deleted after each job. Now that it is deleted after each job, the file no longer exists once the CI of ProjectA has finished. So whenever the CI of ProjectB is triggered, my “pull project a” job fails.

I also tried gitlab-runner 13.1.0 (the oldest version available on Ubuntu focal), without success.

So I am wondering whether this is a bug in GitLab Runner, whether it’s the expected behavior (deleting a certificate wouldn’t surprise me, since it’s security-related), or whether I am just pulling my project dependency the wrong way.

Thank you for your help.

Hi, from a quick google:

could be some useful info there, something that might help:

Adding this environment = ["GIT_SSL_NO_VERIFY=1"] to /etc/gitlab-runner/config.toml works for me.

there might be some other tips to solve it better than not verifying the certs, but apparently that helps.

The issue you linked is about the ‘sslCAInfo’ path value in the .git/config file not being set to the correct value. My config file is correct; the problem is that CI_SERVER_TLS_CA_FILE is automatically deleted, so the path it points to no longer exists.
Also, setting GIT_SSL_NO_VERIFY=1 should work, but disabling certificate verification is never a good fix for an SSL issue.
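To confirm that diagnosis on the runner machine, a small check along these lines may help. It is only a sketch: check_ca_file is a hypothetical helper name, and the directory and URL you pass in are placeholders for your own checkout and server. It asks git which CA file it would use for a given URL (--get-urlmatch also picks up URL-scoped http.<url>.sslCAInfo entries) and reports whether that file still exists.

```shell
# Sketch: report which CA file git is configured to use for a URL in a
# given checkout, and whether that file is still present on disk.
# check_ca_file is a hypothetical helper, not part of git or gitlab-runner.
check_ca_file() {
  repo_dir=$1     # path to the checkout, e.g. ../ProjectA
  remote_url=$2   # URL of the server, e.g. https://my.git.url.com/
  # An empty result means no sslCAInfo applies to this URL.
  ca_path=$(git -C "$repo_dir" config --get-urlmatch http.sslCAInfo "$remote_url" 2>/dev/null || true)
  if [ -z "$ca_path" ]; then
    echo "no sslCAInfo configured for $remote_url"
    return 2
  fi
  if [ -r "$ca_path" ]; then
    echo "ok: $ca_path"
  else
    echo "missing: $ca_path"
    return 1
  fi
}
```

For example, check_ca_file ../ProjectA https://my.git.url.com/ would print a "missing:" line if the configured file was removed after the previous job.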

Yes, as I said, there might be some other tips to solve it better than not verifying the certs, but for debugging your problem it might help you find out more about where the problem actually is. Whether you try it or not is up to you. The log files on your system might also tell you more about what is happening.

Lower down the post: Runner unable to fetch via https - Problem with the SSL CA cert (#2950) · Issues · GitLab.org / gitlab-runner · GitLab

I also had this problem on Gitlab 10.7.1 and Gitlab-runner 10.8.0.
I solved this issue by specifying, for each runner, a builds dir and a cache dir in /etc/gitlab-runner/config.toml:

[[runners]]
  name = "runner-name"
  url = "https://git.lab"
  token = "token1234"
  executor = "ssh"
  builds_dir = "/path/to/builds/dir"
  cache_dir = "/path/to/builds/dir"

I don't know if this is the right way, but it works for me.

Perhaps that can help. There are plenty of posts in there from people with an issue similar to yours, not just the first post where someone is missing a certificate. Others had to downgrade to an earlier version, which you have already tried, but on the latest Ubuntu you can’t downgrade any further.

This post also suggests checking your logs with debug enabled: Runner unable to fetch via https - Problem with the SSL CA cert (#2950) · Issues · GitLab.org / gitlab-runner · GitLab

I suffered the same problem. Adding CI_DEBUG_TRACE: "true" to my .gitlab-ci.yml I found a lot of lines in the log about rvm. I didn't understand why my ci uses rvm. The solution was to remove it completely with rvm implode and rm -rf .rvm, and it worked.

Those are a couple of things you can try to find a solution to your problem.

I added CI_DEBUG_TRACE: "true" to my .gitlab-ci.yml and got:

Cleaning up file based variables
+ set -eo pipefail
+ set +o noclobber
+ :
+ eval '$'\''rm'\'' "-f" "/home/gitlab-runner/builds/<token>/0/<projects>/ProjectB.tmp/CI_SERVER_TLS_CA_FILE"
'
++ rm -f /home/gitlab-runner/builds/<token>/0/<projects>/ProjectB.tmp/CI_SERVER_TLS_CA_FILE

So the deletion of the CI_SERVER_TLS_CA_FILE file at the end of each job is intentional. I wanted to know whether it is configurable, so I looked in the gitlab-runner source code and found:

File /gitlab-runner/shells/bash.go:

func (b *BashWriter) writeScript(w io.Writer) {
	_, _ = io.WriteString(w, "set -eo pipefail\n")
	_, _ = io.WriteString(w, "set +o noclobber\n")
	_, _ = io.WriteString(w, ": | eval "+helpers.ShellEscape(b.String())+"\n")
	_, _ = io.WriteString(w, "exit 0\n")
}

Which is called by /gitlab-runner/shells/abstract.go:

func (b *AbstractShell) writeScript(w ShellWriter, buildStage common.BuildStage, info common.ShellScriptInfo) error {
	methods := map[common.BuildStage]func(ShellWriter, common.ShellScriptInfo) error{
		common.BuildStagePrepare:                  b.writePrepareScript,
		common.BuildStageGetSources:               b.writeGetSourcesScript,
		common.BuildStageRestoreCache:             b.writeRestoreCacheScript,
		common.BuildStageDownloadArtifacts:        b.writeDownloadArtifactsScript,
		common.BuildStageAfterScript:              b.writeAfterScript,
		common.BuildStageArchiveOnSuccessCache:    b.writeArchiveCacheOnSuccessScript,
		common.BuildStageArchiveOnFailureCache:    b.writeArchiveCacheOnFailureScript,
		common.BuildStageUploadOnSuccessArtifacts: b.writeUploadArtifactsOnSuccessScript,
		common.BuildStageUploadOnFailureArtifacts: b.writeUploadArtifactsOnFailureScript,
		common.BuildStageCleanupFileVariables:     b.writeCleanupFileVariablesScript,
	}

	fn, ok := methods[buildStage]
	if !ok {
		return b.writeUserScript(w, info, buildStage)
	}
	return fn(w, info)
}

And this common.BuildStageCleanupFileVariables seems to be set in /gitlab-runner/common/build.go:

// getPredefinedEnv returns whether a stage should be executed on
//  the predefined environment that GitLab Runner provided.
func getPredefinedEnv(buildStage BuildStage) bool {
	env := map[BuildStage]bool{
		BuildStagePrepare:                  true,
		BuildStageGetSources:               true,
		BuildStageRestoreCache:             true,
		BuildStageDownloadArtifacts:        true,
		BuildStageAfterScript:              false,
		BuildStageArchiveOnSuccessCache:    true,
		BuildStageArchiveOnFailureCache:    true,
		BuildStageUploadOnFailureArtifacts: true,
		BuildStageUploadOnSuccessArtifacts: true,
		BuildStageCleanupFileVariables:     true,
	}

	predefined, ok := env[buildStage]
	if !ok {
		return false
	}

	return predefined
}

Looking in the documentation, I didn’t find any way to disable this BuildStageCleanupFileVariables stage, and it doesn’t seem to be one of the predefined variables either. So I concluded that trying to prevent the deletion of the CI_SERVER_TLS_CA_FILE file was not the right way to go.
However, in the debug logs I also found the $CI_SERVER_TLS_CA_FILE variable, which contains the absolute path to the CI_SERVER_TLS_CA_FILE file of the running job. Since both ProjectA and ProjectB come from the same GitLab server, I can use the same TLS certificate to fetch/pull both projects. So I ended up with this job, which works fine:

pull project a:
  stage: pull project a
  script:
  - cd ../ProjectA
  - cp $CI_SERVER_TLS_CA_FILE ../ProjectA.tmp/CI_SERVER_TLS_CA_FILE # Copy the TLS certificate
  - git reset --hard
  - git fetch https://gitlab-runner-token:$GITLAB_RUNNER_TOKEN@my.git.url.com/projects/ProjectA.git/ 
  - git checkout $CI_COMMIT_REF_NAME
  - git pull https://gitlab-runner-token:$GITLAB_RUNNER_TOKEN@my.git.url.com/projects/ProjectA.git/ $CI_COMMIT_REF_NAME
  - rm -f ../ProjectA.tmp/CI_SERVER_TLS_CA_FILE # Delete it before the job ends, since the runner itself cleans it up
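
A possible variation, offered as an untested sketch rather than something confirmed in this thread: git documents the GIT_SSL_CAINFO environment variable as overriding http.sslCAInfo, so it may be possible to point git directly at the current job's certificate instead of copying it. The helper name git_with_job_ca is hypothetical; it only relies on $CI_SERVER_TLS_CA_FILE being set by the runner, as seen in the debug trace.

```shell
# Sketch: run one git command with the running job's CA file, without
# copying it into ProjectA.tmp. git_with_job_ca is a hypothetical helper.
git_with_job_ca() {
  # Fail early if the runner did not provide a readable CA file.
  if [ -z "${CI_SERVER_TLS_CA_FILE:-}" ] || [ ! -r "$CI_SERVER_TLS_CA_FILE" ]; then
    echo "CI_SERVER_TLS_CA_FILE is unset or unreadable" >&2
    return 1
  fi
  # GIT_SSL_CAINFO is passed to this git invocation to override the
  # sslCAInfo path stored in the repository's config.
  GIT_SSL_CAINFO="$CI_SERVER_TLS_CA_FILE" git "$@"
}
```

With the function defined earlier in the same script section, the fetch step would be written as git_with_job_ca fetch followed by the same URL as before, and the cp and rm steps would no longer be needed.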

I’ll mark it as solved.
