GitLab Pages Site Issue Post-Upgrade: OAuth Authentication Problem Beyond Version 16.3.7



After upgrading our GitLab CE server (on RHEL 8) from version 16.3.7 to 16.11.0, a GitLab Pages site that uses OAuth authentication became inaccessible. The site is built with Hugo and served over HTTPS with a self-signed certificate. Users must log in with their GitLab credentials, and access is restricted to project members. I tested the upgrade in our Lab environment first and saw no issues there, but the Lab setup is not identical to Prod (details below), so I suspect the problem is tied to one of the differences between the two environments.

Environment Details:

  • Production Site:

    • Served over HTTPS, uses OAuth for authentication.
    • Aggregates multiple GitLab projects as submodules within the main site project.
    • Utilizes a shared runner for CI/CD pipelines.
  • Lab Environment:

    • Served over HTTP, does not use OAuth authentication.
    • Only one submodule, with the same shared runner setup, and no issues encountered.

Issue Encountered:

Upgrading to any version beyond 16.3.7 results in a 500 Internal Server Error on the site, and the GitLab Pages log shows an OAuth token error (specifically, “token is malformed”). I’ve regenerated the OAuth tokens, cleared caches, restarted services, and reconfigured the server, but none of these steps resolved the issue. Rolling back to 16.3.7 fixes it for now, but I’m looking for a solution that works on the latest version.

Troubleshooting Steps Taken:

  1. Regenerated the OAuth token and cleared caches.
  2. Restarted and reconfigured GitLab services.
  3. Reviewed and adjusted OAuth and GitLab Pages settings.
  4. Compared settings with the Lab environment. Lab uses neither HTTPS nor OAuth and does not show the problem, so it never exercises the failing OAuth path and may not be a valid comparison.
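
For the settings comparison in step 4, here is a minimal Python sketch of how the Pages-related lines of the two `gitlab.rb` files could be diffed. It assumes Omnibus-style lines such as `pages_external_url '...'` and `gitlab_pages['...'] = ...`. The hostnames and values in the sample text are placeholders, not our real configuration, and the function names are only illustrative. Secrets should be redacted before pasting any output into a forum post.

```python
import re

# Matches uncommented lines like:
#   pages_external_url 'https://pages.example'
#   gitlab_pages['access_control'] = true
SETTING = re.compile(
    r"^\s*(pages_external_url|gitlab_pages\['[^']+'\])\s*=?\s*(.+?)\s*$"
)

def parse_settings(text):
    """Collect Pages-related settings from gitlab.rb-style text."""
    settings = {}
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip commented-out lines
        match = SETTING.match(line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings

def diff_settings(prod, lab):
    """Return {key: (prod_value, lab_value)} for every key that differs."""
    diffs = {}
    for key in sorted(set(prod) | set(lab)):
        prod_value, lab_value = prod.get(key), lab.get(key)
        if prod_value != lab_value:
            diffs[key] = (prod_value, lab_value)
    return diffs

# Placeholder sample data (hypothetical hostnames and values).
PROD_SAMPLE = """
pages_external_url 'https://pages.prod.example'
gitlab_pages['access_control'] = true
# commented-out lines are ignored
"""
LAB_SAMPLE = """
pages_external_url 'http://pages.lab.example'
gitlab_pages['access_control'] = false
"""

if __name__ == "__main__":
    differences = diff_settings(parse_settings(PROD_SAMPLE),
                                parse_settings(LAB_SAMPLE))
    for key, (prod_value, lab_value) in differences.items():
        print(f"{key}: prod={prod_value} lab={lab_value}")
```

In practice the two sample strings would be replaced with the contents of each server's `/etc/gitlab/gitlab.rb`.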

Log Output:

For context, here is a sample log from the GitLab Pages service during an access attempt:

{"correlation_id":"01J5S85DQWJFWTH7GJNPFHKBFT","error":"token is malformed: token contains an invalid number of segments","host":"site-name.domain.net","level":"error","msg":"failed to decrypt secure code","path":"/auth","state":"CJicCkqJPBtks73GNJ_OLQ==","time":"2024-08-20T21:17:34-04:00"}
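
To make the entry easier to read, here is a small Python sketch that pulls out the relevant fields. It also illustrates my current understanding of the error text: "invalid number of segments" is the kind of message a JWT parser gives when the string it receives does not have three dot-separated parts. That interpretation is an assumption on my side, not something I have confirmed in the GitLab Pages source. The sample token strings are made up.

```python
import json

LOG_LINE = (
    '{"correlation_id":"01J5S85DQWJFWTH7GJNPFHKBFT",'
    '"error":"token is malformed: token contains an invalid number of segments",'
    '"host":"site-name.domain.net","level":"error",'
    '"msg":"failed to decrypt secure code","path":"/auth",'
    '"state":"CJicCkqJPBtks73GNJ_OLQ==","time":"2024-08-20T21:17:34-04:00"}'
)

def summarize(line):
    """Parse one JSON log line and keep the fields useful for triage."""
    entry = json.loads(line)
    wanted = ("time", "level", "host", "path", "msg", "error", "correlation_id")
    return {key: entry.get(key) for key in wanted}

def looks_like_jwt(token):
    """A compact JWT has exactly three dot-separated segments."""
    return len(token.split(".")) == 3

if __name__ == "__main__":
    for key, value in summarize(LOG_LINE).items():
        print(f"{key}: {value}")
    # Hypothetical inputs: a three-part string passes the segment check,
    # while an opaque code with no dots does not.
    print(looks_like_jwt("header.payload.signature"))
    print(looks_like_jwt("abc123opaquecode"))
```

If that reading is right, whatever value arrives at `/auth` as the code is not in the format the Pages service expects to decrypt. The `correlation_id` should also make it possible to find the matching entries in the other GitLab logs.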

Has anyone else experienced OAuth issues on GitLab Pages after upgrading beyond 16.3.7? Are there known changes to OAuth handling or HTTPS requirements in newer versions? Any suggestions for resolving the “malformed token” error, or pointers to steps I may have missed, would be greatly appreciated. I’m happy to provide more details if needed.
Thank you very much!