OIDC provider timeouts on the GitLab side cause intermittent authentication failures

We are seeing intermittent failures when AWS tries to verify the OIDC tokens that gitlab.com issues.

Our setup uses GitLab as an OIDC identity provider in the AWS root account, which has access to the other accounts in our organization (configured as described in https://docs.gitlab.com/ee/ci/cloud_services/aws/). This has been relatively stable for a while, but we have been getting more and more intermittent failures recently:

An error occurred (InvalidIdentityToken) when calling the AssumeRoleWithWebIdentity operation: Couldn’t retrieve verification key from your identity provider, please reference AssumeRoleWithWebIdentity documentation for requirements.

The response from our service provider, who has more detailed logging than we have access to:

As the error suggests, AWS was trying to validate the token, but it was unable to fetch the verification key which was used by OIDC provider to sign the token. This is an issue from OIDC provider end. It could be possible that the OIDC provider might be down for some time or it was not equipped to handle huge traffic. Since it is an issue from OIDC provider end, we don’t have enough visibility on what happened to the provider during the time it was down.
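To see from the outside whether the verification key is slow or unreachable, one option is to time the same kind of lookup yourself. The following is only a rough sketch: it assumes the standard OIDC discovery document at `https://gitlab.com/.well-known/openid-configuration` and reads `jwks_uri` from it. The helper names `http_get` and `probe_jwks` are made up for illustration, and the time limit AWS itself applies is not stated in this thread, so a fast response here does not prove AWS saw the same thing.

```python
import json
import time
import urllib.request

# Assumption: gitlab.com serves the standard OIDC discovery document here.
DISCOVERY_URL = "https://gitlab.com/.well-known/openid-configuration"


def http_get(url, timeout=5.0):
    """Fetch a URL and return the raw response body (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def probe_jwks(discovery_url=DISCOVERY_URL, fetch=http_get):
    """Return (key IDs, seconds elapsed) for a discovery + JWKS lookup.

    `fetch` is injectable so the probe can be exercised without network access.
    """
    start = time.monotonic()
    # Step 1: read the discovery document to find where the keys are published.
    config = json.loads(fetch(discovery_url))
    # Step 2: download the JSON Web Key Set that token verifiers need.
    jwks = json.loads(fetch(config["jwks_uri"]))
    elapsed = time.monotonic() - start
    key_ids = [key.get("kid") for key in jwks.get("keys", [])]
    return key_ids, elapsed
```

Calling `probe_jwks()` on a schedule and logging the elapsed time alongside any exceptions would give timestamps to compare against the failed CI jobs.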


We are seeing this in our environments too. Did you find any solid resolution for this?

No help from GitLab support on this, and no word on whether they consider it a problem. We've implemented retry logic with exponential backoff, but the root problem still exists.
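For anyone wanting to try the same workaround, a retry with exponential backoff could look roughly like the sketch below. This is not the exact code we run: `retry_with_backoff`, its parameters, and the default delays are illustrative assumptions. Random jitter is added because the failures seem to show up when many parallel jobs authenticate at once, and it keeps them from retrying in lockstep.

```python
import random
import time


def retry_with_backoff(call, is_retryable, max_attempts=5, base_delay=1.0,
                       max_delay=30.0, sleep=time.sleep):
    """Call `call()` until it succeeds, retrying only errors `is_retryable` accepts.

    All names and defaults here are hypothetical; tune them for your pipeline.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception as exc:
            # Give up on the last attempt or on errors we should not retry.
            if attempt == max_attempts or not is_retryable(exc):
                raise
            # Exponential backoff: base, 2*base, 4*base, ... capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: sleep a random time in [0, delay].
            sleep(random.uniform(0, delay))
```

In a job, `call` would wrap whatever performs AssumeRoleWithWebIdentity, and `is_retryable` would return true only when the error is `InvalidIdentityToken`, so that genuine misconfigurations still fail fast.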

If you are on a paid plan, the best path forward is to upvote this issue so that a fix gets prioritized: Parallel jobs attempting to AssumeRoleWithWebIdentity in AWS IAM fail with `InvalidIdentityToken` error (#374001, GitLab.org / GitLab)