Issue With 32-bit Windows Runners

I’m having two issues with our 32-bit Windows runners that I suspect are related. The runners are at version 14.4.0 and our GitLab instance is at version 14.4.1-ee. The runners are tied to specific machines running 32-bit Windows 10 Pro (19043). Here’s a representative config.toml file. There’s nothing fancy going on:

concurrent = 1
check_interval = 0

[session_server]
  session_timeout = 1800

[[runners]]
  name = <hostname>
  url = <url>
  token = <token>
  executor = "shell"
  shell = "powershell"
  output_limit = 81920000
  [runners.custom_build_dir]
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
    [runners.cache.azure]

The relevant jobs install an application built in a previous stage using msiexec and run a lengthy series of tests. The tests require elevated privileges to interact with drivers and hardware. The gitlab-runner service runs as NT AUTHORITY\SYSTEM (the service is configured to log on as the Local System account; switching this to a local administrator account doesn’t change anything). The problems are:

  1. On two of the machines, the runners don’t upload traces to the coordinator. Running in debug mode, it appears they aren’t even attempting to upload traces (I don’t see anything in Wireshark, either). However, they do download and upload build artifacts.

  2. On the same machines, the jobs fail to load a (signed) driver. This typically happens when the application is run without sufficient privileges. However, executing the same commands using either exec or an administrator PowerShell session succeeds.
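
One way to test the permissions theory is to compare the privileges held by the job's process with those of the administrator session where the driver loads. The sketch below is Python and assumes the output of whoami /priv has been saved twice, once from a line added to the job script and once from the elevated PowerShell session, on English-language Windows. The file names and function names are hypothetical. It only reports differences and changes nothing on the machine.

```python
"""Diff two saved `whoami /priv` captures (a sketch, not a definitive tool)."""
import sys


def read_capture(path):
    """Read a capture file, whichever encoding the redirect produced."""
    with open(path, "rb") as handle:
        raw = handle.read()
    # Windows PowerShell's `>` writes UTF-16 with a BOM; cmd.exe writes bytes.
    if raw.startswith((b"\xff\xfe", b"\xfe\xff")):
        return raw.decode("utf-16")
    return raw.decode("utf-8", errors="replace")


def parse_privileges(text):
    """Map privilege name -> state from `whoami /priv` output."""
    privileges = {}
    for line in text.splitlines():
        fields = line.split()
        # Privilege rows start with a name such as SeLoadDriverPrivilege and
        # end with the State column (assumed to read Enabled or Disabled).
        if (len(fields) >= 2 and fields[0].startswith("Se")
                and fields[-1] in ("Enabled", "Disabled")):
            privileges[fields[0]] = fields[-1]
    return privileges


def diff_privileges(job, admin):
    """Return {name: (state_in_job, state_in_admin)} for rows that differ."""
    differences = {}
    for name in sorted(set(job) | set(admin)):
        pair = (job.get(name, "absent"), admin.get(name, "absent"))
        if pair[0] != pair[1]:
            differences[name] = pair
    return differences


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python diff_priv.py job_priv.txt admin_priv.txt
    job = parse_privileges(read_capture(sys.argv[1]))
    admin = parse_privileges(read_capture(sys.argv[2]))
    for name, (in_job, in_admin) in diff_privileges(job, admin).items():
        print(f"{name}: job={in_job} admin={in_admin}")
```

SeLoadDriverPrivilege is the privilege Windows requires for loading and unloading device drivers, so a difference on that row would be the first thing to look at.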

Here is representative debugging output:

PS C:\gitlab-runner> .\gitlab-runner-windows-386.exe --debug run
Runtime platform                                    arch=386 os=windows pid=2364 revision=4b9e985a version=14.4.0
Starting multi-runner from C:\gitlab-runner\config.toml...  builds=0
Checking runtime mode                               GOOS=windows uid=-1
Configuration loaded                                builds=0
listenaddress: ""
sessionserver:
  listenaddress: ""
  advertiseaddress: ""
  sessiontimeout: 1800
concurrent: 1
checkinterval: 0
loglevel: null
logformat: null
user: ""
runners:
- name: <hostname>
  limit: 0
  outputlimit: 81920000
  requestconcurrency: 0
  runnercredentials:
    url: <url>
    token: <token>
    tlscafile: ""
    tlscertfile: ""
    tlskeyfile: ""
  runnersettings:
    executor: shell
    buildsdir: ""
    cachedir: ""
    cloneurl: ""
    environment: []
    preclonescript: ""
    prebuildscript: ""
    postbuildscript: ""
    debugtracedisabled: false
    shell: powershell
    custombuilddir:
      enabled: false
    referees: null
    cache:
      type: ""
      path: ""
      shared: false
      s3:
        serveraddress: ""
        accesskey: ""
        secretkey: ""
        bucketname: ""
        bucketlocation: ""
        insecure: false
        authenticationtype: ""
      gcs:
        cachegcscredentials:
          accessid: ""
          privatekey: ""
        credentialsfile: ""
        bucketname: ""
      azure:
        cacheazurecredentials:
          accountname: ""
          accountkey: ""
        containername: ""
        storagedomain: ""
    gracefulkilltimeout: null
    forcekilltimeout: null
    featureflags: {}
    ssh: null
    docker: null
    parallels: null
    virtualbox: null
    machine: null
    kubernetes: null
    custom: null
sentrydsn: null
modtime: 2021-11-12T19:43:27.7983993-08:00
loaded: true
  builds=0
listen_address not defined, metrics & debug endpoints disabled  builds=0
[session_server].listen_address not defined, session endpoints disabled  builds=0
Starting worker                                     builds=0 worker=0
Feeding runners to channel                          builds=0
Dialing: tcp <url>:443 ...
Checking for jobs... nothing                        runner=fT9zCaM7
Feeding runners to channel                          builds=0
Checking for jobs... received                       job=8686 repo_url=<url/repo>.git runner=fT9zCaM7
Processing chain                                    chain-leaf=[0x13f24840 0x13f24b00 0x13f24dc0] context=certificate-chain-build
Certificate doesn't provide parent URL: exiting the loop  Issuer=ISRG Root X1 IssuerCertURL=[] Serial=172886928669790476064670243504169061120 Subject=ISRG Root X1 context=certificate-chain-build
Failed to requeue the runner                        builds=1 runner=fT9zCaM7
Running with gitlab-runner 14.4.0 (4b9e985a)        job=8686 project=2 runner=fT9zCaM7
  on <hostname> fT9zCaM7                            job=8686 project=2 runner=fT9zCaM7
Preparing the "shell" executor          job=8686 project=2 runner=fT9zCaM7
Shell configuration: environment: []
dockercommand:
- powershell
- -NoProfile
- -NoLogo
- -InputFormat
- text
- -OutputFormat
- text
- -NonInteractive
- -ExecutionPolicy
- Bypass
- -Command
- '-'
command: powershell
arguments:
- -NoProfile
- -NonInteractive
- -ExecutionPolicy
- Bypass
- -Command
passfile: true
extension: ps1
  job=8686 project=2 runner=fT9zCaM7
Using Shell executor...                             job=8686 project=2 runner=fT9zCaM7
Waiting for signals...                              job=8686 project=2 runner=fT9zCaM7
No referees configured                              job=8686 project=2 runner=fT9zCaM7
Executing build stage                               build_stage=prepare_script job=8686 project=2 runner=fT9zCaM7
Preparing environment                   job=8686 project=2 runner=fT9zCaM7
Using new shell command execution                   job=8686 project=2 runner=fT9zCaM7
Executing build stage                               build_stage=get_sources job=8686 project=2 runner=fT9zCaM7
Getting source from Git repository      job=8686 project=2 runner=fT9zCaM7
Using new shell command execution                   job=8686 project=2 runner=fT9zCaM7
Feeding runners to channel                          builds=1
Submitting job to coordinator... ok                 code=200 job=8686 job-status= runner=fT9zCaM7 update-interval=0s
Executing build stage                               build_stage=restore_cache job=8686 project=2 runner=fT9zCaM7
Skipping stage (nothing to do)                      build_stage=restore_cache job=8686 project=2 runner=fT9zCaM7
Executing build stage                               build_stage=download_artifacts job=8686 project=2 runner=fT9zCaM7
Downloading artifacts                   job=8686 project=2 runner=fT9zCaM7
Using new shell command execution                   job=8686 project=2 runner=fT9zCaM7
Executing build stage                               build_stage=step_script job=8686 project=2 runner=fT9zCaM7
Executing "step_script" stage of the job script  job=8686 project=2 runner=fT9zCaM7
Using new shell command execution                   job=8686 project=2 runner=fT9zCaM7
Submitting job to coordinator... ok                 code=200 job=8686 job-status= runner=fT9zCaM7 update-interval=0s
Executing build stage                               build_stage=after_script job=8686 project=2 runner=fT9zCaM7
Running after_script                    job=8686 project=2 runner=fT9zCaM7
Using new shell command execution                   job=8686 project=2 runner=fT9zCaM7
Executing build stage                               build_stage=archive_cache_on_failure job=8686 project=2 runner=fT9zCaM7
Skipping stage (nothing to do)                      build_stage=archive_cache_on_failure job=8686 project=2 runner=fT9zCaM7
Executing build stage                               build_stage=upload_artifacts_on_failure job=8686 project=2 runner=fT9zCaM7
Uploading artifacts for failed job      job=8686 project=2 runner=fT9zCaM7
Using new shell command execution                   job=8686 project=2 runner=fT9zCaM7
Skipping referees execution                         job=8686 project=2 runner=fT9zCaM7
Executing build stage                               build_stage=cleanup_file_variables job=8686 project=2 runner=fT9zCaM7
Cleaning up project directory and file based variables  job=8686 project=2 runner=fT9zCaM7
Using new shell command execution                   job=8686 project=2 runner=fT9zCaM7
WARNING: Job failed: exit status 1
                 duration_s=51.9473531 job=8686 project=2 runner=fT9zCaM7
Submitting job to coordinator... ok                 code=200 job=8686 job-status= runner=fT9zCaM7 update-interval=0s
WARNING: Failed to process runner                   builds=0 error=exit status 1 executor=shell runner=fT9zCaM7
Checking for jobs... nothing                        runner=fT9zCaM7
Feeding runners to channel                          builds=0
WARNING: Starting graceful shutdown, waiting for builds to finish  StopSignal=quit builds=0
Broadcasting interrupt signal                       builds=0
All workers stopped. Can exit now                   builds=0

This only occurs when the jobs are triggered from the CI web interface. It does not happen when the same commands are run using exec or from an elevated command prompt. It doesn’t happen on another 32-bit Windows 10 machine that, as far as I can tell, is virtually identical (except for the underlying hardware). It doesn’t occur on any of our 64-bit test machines. The 32-bit and 64-bit jobs are identical except that they install the 32-bit and 64-bit versions of the application, respectively.
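
Since one 32-bit machine works and two do not, comparing debug logs from a passing and a failing run of the same job may show which step is missing. Here is a minimal Python sketch, assuming both runs were captured with --debug and saved to text files (the file and function names are hypothetical). It reduces each structured log line to its message text, dropping the key=value fields such as job and runner IDs, and lists the messages that appear only in the log from the working machine.

```python
"""List log messages present in one runner debug log but not another (sketch)."""
import re
import sys

# A structured line is the message, two or more spaces, then key=value fields.
FIELDS = re.compile(r"\s{2,}[\w-]+=")


def message_kinds(log_text):
    """Return the set of distinct message texts, with their fields removed."""
    kinds = set()
    for line in log_text.splitlines():
        match = FIELDS.search(line)
        # Skip lines without fields (such as the config dump) and
        # continuation lines that consist of fields only.
        if match and match.start() > 0:
            kinds.add(line[:match.start()].strip())
    return kinds


def only_in_first(good_log, bad_log):
    """Messages logged on the working machine but never on the failing one."""
    return sorted(message_kinds(good_log) - message_kinds(bad_log))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python diff_logs.py good.log bad.log
    with open(sys.argv[1], errors="replace") as good:
        good_text = good.read()
    with open(sys.argv[2], errors="replace") as bad:
        bad_text = bad.read()
    for message in only_in_first(good_text, bad_text):
        print(message)
```

Any message that shows up only on the working machine, such as whatever the runner logs when it sends a trace update, would point at the step the failing machines skip.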

Clearly, there’s a difference between the 32-bit Windows machines that causes one to succeed and two to fail in the same way. I just don’t know what it is, and I’ve been banging my head against the problem for days. My intuition is that it’s some kind of permissions or security setting, but I haven’t been able to figure out which one. The machines are sometimes on different networks, but changing the firewall settings or moving the machines between networks makes no difference. If anyone has any insight into what could be causing these problems, I would appreciate it. I’m probably overlooking something obvious.