Helm based upgrade of 14.6.2-ee to 14.7.1 CrashLoopBackOff due to DB inaccessibility

Last week we triggered a Helm chart upgrade of the GitLab instance running in our Kubernetes cluster, to move from 14.6.2 to 14.7.1. The upgrade is currently stuck: the dependency initContainer on the sidekiq and webservice pods cannot verify DB access.

The dependency container prints this message several times before it exits and forces the pod to restart: “fe_sendauth: error sending password authentication”. I tracked this error down to the PostgreSQL client code.

If I manually run psql in the dependency container before it crashes, using the same credentials, I get the same error.
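
For anyone wanting to reproduce this, a rough sketch of the manual check is below. It assumes psql is on the PATH inside the container and that the password is supplied through the PGPASSWORD environment variable. The function name try_db_login and the host, user, and database arguments are placeholders, not values from our setup.

```shell
# Hypothetical helper: attempt one authenticated query and report the outcome.
# Usage: PGPASSWORD=... try_db_login HOST USER DBNAME
# Any libpq error (such as the fe_sendauth message) is left on stderr.
try_db_login() {
  if psql "host=$1 user=$2 dbname=$3 connect_timeout=5" -c 'SELECT 1;' >/dev/null; then
    echo "login ok"
  else
    echo "login failed"
  fi
}
```

Running it from both the 14.6 and the 14.7 image against the same database should show whether the failure follows the client version.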

The database is an AWS RDS Postgres cluster running engine version 13.5.

Comparing the sidekiq containers for 14.6.1 and 14.7.1, the only difference I can find among the Ruby gems, /scripts/*, and anything else I could think of is that psql was upgraded from version 13.2 to version 14.1.

We are running on a FIPS 140-2 compliant Kubernetes node, but even on the same node the 14.6 pod succeeds where the 14.7 pod fails.

Details:
The error message originates on line 1046 of fe-auth.c in the PostgreSQL source code (src/interfaces/libpq/fe-auth.c).

Any idea why my new 14.7.1 sidekiq pod can’t talk to my database?

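This was tracked down to FIPS restricting the use of MD5, which the database was still using for password authentication. Switching the password encryption to SCRAM-SHA-256 and then re-setting the user's password, so that it is stored with the new method, resolved the problem: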

SET password_encryption = 'scram-sha-256';
ALTER USER "gitlab" WITH PASSWORD 'NOTHISISNOTMYPASSWORD';
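
A note on the two statements above, plus a small sketch for checking the setting afterwards. SET only applies to the current session, so the ALTER USER has to run in that same session. Making SCRAM the default for future password changes would, as far as I know, mean changing password_encryption in the RDS parameter group. The helper below is hypothetical (the function name and its arguments are placeholders) and assumes psql is available and the password is passed via PGPASSWORD.

```shell
# Hypothetical helper: print the password_encryption value a fresh session sees,
# i.e. the server or role default rather than a value SET in another session.
# Usage: PGPASSWORD=... show_password_encryption HOST USER DBNAME
show_password_encryption() {
  # -A (unaligned) and -t (tuples only) reduce the output to the bare value.
  psql "host=$1 user=$2 dbname=$3 connect_timeout=5" -At -c 'SHOW password_encryption;'
}
```

If this still prints md5, any password set later without the explicit SET would again be stored as an MD5 hash.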

