Code signing fails under CI

What’s different between letting GitLab CI run its script through an SSH connection and me running the same commands interactively over an SSH connection to the same machine?

I am trying to build a Windows application with a signed ClickOnce installer. It fails under GitLab CI with “signtool Error: No certificates were found that met all the given criteria.”

Setup:

  • GitLab Community Edition 8.14.4, locally hosted
  • Using the VirtualBox runner under Ubuntu with:
      • Guest OS Windows 10 64-bit
      • Visual Studio Community 2015
      • Cygwin SSH server

My .gitlab-ci.yml script invokes a DOS batch file that runs FAKE (F# Make), which invokes MSBuild to run build targets against my Visual Studio solution and then invokes NUnit to run the tests.

Everything works fine if I launch my VirtualBox image, log in, and run the same commands in a Cygwin shell.

Everything works fine if I SSH into the VirtualBox image and run the same commands as my CI script does over the SSH connection. I’m even opening the SSH connection from the Linux machine that hosts the CI runners.

Everything works fine under CI if I disable code signing by modifying my Fake target to pass “/property:SignManifests=false” to msbuild.
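
The workaround can be sketched roughly as follows. This is Python for illustration only, not the actual Fake target; the helper name msbuild_command, the solution name MyApp.sln, and the Build target are hypothetical. The only detail taken from the setup above is the /property:SignManifests=false switch.

```python
def msbuild_command(solution, sign_manifests=True, targets=("Build",)):
    """Build the msbuild argument list for a solution.

    `solution` and `targets` are placeholders for this sketch; the real
    build is driven by a Fake target, not by this helper.
    """
    # /target: takes a semicolon-separated list of target names.
    command = ["msbuild", solution, "/target:" + ";".join(targets)]
    if not sign_manifests:
        # The switch that makes the CI build succeed by skipping signing.
        command.append("/property:SignManifests=false")
    return command


# Hypothetical solution name, used only to show the resulting command line.
print(" ".join(msbuild_command("MyApp.sln", sign_manifests=False)))
```

Passing the resulting list to subprocess.run would run the build; with sign_manifests left at True the extra property is simply omitted.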

But letting the CI runner run the same commands over the SSH connection causes signtool to fail.

I’ve hit Google in depth and found lots of potential solutions but none have worked. Things I have tried in the Win10 guest configuration:

  1. Install the code signing certificate in the user account store for the user account the CI builds run as. (The VS solution is set up to reference a .p12 file for the certificate; it’s a recently purchased cert, and definitely not expired.)
  2. Install the certificate in the local machine store.
  3. Install the certificate in the Cygwin SSH service store.
  4. Use MMC to give full access to the certificates in the local machine store to both the user and Cygwin SSH service accounts.
  5. Use a self-signed certificate instead of the real one.
  6. Use a devenv build instead of msbuild.
  7. Invoke msbuild from the CI script directly in order to cut Fake out of the loop.

In other threads I have observed other people having similar problems with other CI systems but the above list of things I’ve tried encompasses all of the solutions suggested.

Suggestions welcome. To me it’s looking like there is some fundamental difference between me running commands manually over SSH and the CI runner running the same commands automatically over SSH with the same login between the same two machines. I’m baffled as to what the difference might be. The only difference I can see without knowing the internals of the CI runner code is that the CI runner runs as superuser at the Linux end - but the Windows end login that does the build is the same.
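
One way to look for that difference would be to capture the environment in both kinds of session and compare the two. The following is a minimal sketch, assuming Python is available on the Windows guest (it is not part of the setup listed above). All function and file names are hypothetical, and environment variables are only one of several things that could differ between the two sessions.

```python
import json
import os


def snapshot_environment(environ=None):
    """Return a plain dict copy of the given (or current) environment."""
    source = os.environ if environ is None else environ
    return dict(source)


def save_snapshot(path, snapshot):
    """Write a snapshot to a JSON file so it can be compared later."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(snapshot, handle, indent=2, sort_keys=True)


def load_snapshot(path):
    """Read back a snapshot written by save_snapshot."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def diff_environments(interactive, ci):
    """Map each differing variable to (interactive value, CI value).

    A value of None means the variable is missing on that side.
    """
    differences = {}
    for name in sorted(set(interactive) | set(ci)):
        left = interactive.get(name)
        right = ci.get(name)
        if left != right:
            differences[name] = (left, right)
    return differences


def format_report(differences):
    """Render the differences one variable per line."""
    return "\n".join(
        f"{name}: interactive={left!r} ci={right!r}"
        for name, (left, right) in differences.items()
    )
```

To use it, call save_snapshot("interactive.json", snapshot_environment()) from the manual SSH session and save_snapshot("ci.json", snapshot_environment()) from a CI job, then print format_report(diff_environments(load_snapshot("interactive.json"), load_snapshot("ci.json"))). Any variable that shows up in the report is a candidate for further investigation, not a diagnosis.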