Intermittent SAML login failure

We’re seeing intermittent problems with users logging in to GitLab CE after enabling OmniAuth. We’re currently on 12.2.5, installed via the Omnibus installer, and we have had this problem for a while now.

A user trying to log in is redirected to our ADFS login page. When ADFS redirects back to the OmniAuth callback URL, the user gets an ERR_CONNECTION_RESET in Chrome. The failure also occurs in Firefox, and in Chrome incognito and Firefox private sessions.

The callback URL is https://gitlabserver/users/auth/saml/callback

All of the headers and the SAMLResponse form data look appropriate.
The connection reset occurs very quickly. Chrome's developer tools show a response in about 30-40 ms.

The nginx access log shows:
<IP address> - - [01/Oct/2019:19:49:47 +0000] "POST /users/auth/saml/callback HTTP/2.0" 408 0 "<adfs url>" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36"
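In case it helps anyone check their own logs, here is a rough Python sketch for pulling entries like the one above out of the nginx access log and counting them per client IP. It assumes the log lines follow the same layout as the line shown above; the function names (find_callback_failures, count_by_ip) are just names I made up for illustration.

```python
import re
from collections import Counter

# Pattern for access log lines laid out like the one shown above:
# ip, two ignored fields, [timestamp], "METHOD path protocol", status, size.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

CALLBACK_PATH = "/users/auth/saml/callback"


def find_callback_failures(lines, status="408"):
    """Return (ip, timestamp) for each POST to the SAML callback with the given status."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip lines that do not match the assumed layout
        if (
            m.group("method") == "POST"
            and m.group("path").startswith(CALLBACK_PATH)
            and m.group("status") == status
        ):
            hits.append((m.group("ip"), m.group("time")))
    return hits


def count_by_ip(hits):
    """Count failures per client IP to see whether the same users keep failing."""
    return Counter(ip for ip, _ in hits)


# Example use (the path is an assumption about where the access log lives):
#   with open("/var/log/gitlab/nginx/gitlab_access.log") as f:
#       print(count_by_ip(find_callback_failures(f)))
```

Grouping by IP is only a crude stand-in for grouping by user, but it should be enough to show whether the 408s cluster around the same few clients.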

I can find no indication that the request made it past nginx: there is nothing in the Rails production.log, production_json.log, or the Workhorse log file.

The 408 response seems incorrect given that it’s unlikely to have timed out in 30-40ms.

Any ideas? It seems like something in GitLab is rejecting the SAML callback, but I don’t know how to coax out any actual logging that would tell me why.

It’s intermittent in that many users never have the problem, but a user who does run into it is usually unable to log in for a day or two.