GitLab Helm chart fails to start because Redis won't start

I deployed GitLab via the Helm chart. After a power loss shut down the cluster, everything except GitLab came back up fine. Since I couldn't figure out exactly why it wasn't starting, I decided to update it, so I ran helm upgrade gitlab ...othervars and upgraded the deployment.

But the migration job is failing because Redis isn't starting properly.

Migration logs (shortened; the full stack trace is available here):

Begin parsing .erb files from /var/opt/gitlab/templates
Writing /srv/gitlab/config/resque.yml
Writing /srv/gitlab/config/gitlab.yml
Writing /srv/gitlab/config/database.yml
Copying other config files found in /var/opt/gitlab/templates
Attempting to run '/scripts/wait-for-deps /scripts/db-migrate' as a main process
Checking database connection and schema version
Database Schema - current: 20200221142216, codebase: 20200325152327
Checking database migrations are up-to-date
Performing migrations (this will initialized if needed)
rake aborted!
StandardError: An error has occurred, this and all later migrations canceled:

Error connecting to Redis on gitlab-redis-master:6379 (Redis::TimeoutError)
...
/srv/gitlab/lib/tasks/gitlab/db.rake:49:in `block (3 levels) in <main>'

Caused by:
Redis::CannotConnectError: Error connecting to Redis on gitlab-redis-master:6379 (Redis::TimeoutError)
...
/srv/gitlab/lib/tasks/gitlab/db.rake:49:in `block (3 levels) in <main>'

Caused by:
Redis::TimeoutError: Redis::TimeoutError
...
/srv/gitlab/lib/tasks/gitlab/db.rake:49:in `block (3 levels) in <main>'

Caused by:
IO::EINPROGRESSWaitWritable: Operation now in progress - connect(2) would block
...
/srv/gitlab/lib/tasks/gitlab/db.rake:49:in `block (3 levels) in <main>'
Tasks: TOP => db:migrate
(See full trace by running task with --trace)
== 20200221144534 DropActivatePrometheusServicesBackgroundJobs: migrating =====

gitlab-redis-master-0 pod:

metrics container (running):

time="2020-04-07T02:03:01Z" level=info msg="Redis Metrics Exporter v1.3.5    build date: 2019-12-16-18:43:41    sha1: 14dda66e724e45935782db610aca803594107ff0    Go: go1.13.5    GOOS: linux    GOARCH: amd64"
time="2020-04-07T02:03:01Z" level=info msg="Providing metrics at :9121/metrics"
time="2020-04-07T02:03:07Z" level=error msg="Couldn't connect to redis instance"
time="2020-04-07T02:04:07Z" level=error msg="Couldn't connect to redis instance"
....repeats

gitlab-redis container (CrashLoopBackOff):

 02:39:37.36 INFO  ==> ** Starting Redis **
1:C 07 Apr 2020 02:39:37.379 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 07 Apr 2020 02:39:37.379 # Redis version=5.0.7, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 07 Apr 2020 02:39:37.379 # Configuration loaded
1:M 07 Apr 2020 02:39:37.382 * Running mode=standalone, port=6379.
1:M 07 Apr 2020 02:39:37.382 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:M 07 Apr 2020 02:39:37.382 # Server initialized
1:M 07 Apr 2020 02:39:37.382 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
1:M 07 Apr 2020 02:39:37.383 * Reading RDB preamble from AOF file...
1:M 07 Apr 2020 02:39:37.392 * Reading the remaining AOF tail...
1:M 07 Apr 2020 02:39:38.191 # Bad file format reading the append only file: make a backup of your AOF file, then use ./redis-check-aof --fix <filename>
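
Going by that last log line, the repair Redis is asking for would look roughly like the sketch below. This is only my reading of the message, not something I have been able to run: the function names are made up for illustration, the AOF path is passed in as an argument because I have not confirmed where the chart stores it, and it assumes redis-check-aof is available wherever the volume is mounted.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, and the AOF path must be
# supplied by the caller (its location on the volume is not confirmed here).

backup_aof() {
  # Keep an untouched copy next to the original before any repair attempt,
  # as the Redis log message advises.
  cp -p "$1" "$1.bak"
}

repair_aof() {
  # Refuse to go further if the backup could not be made.
  backup_aof "$1" || return 1
  # redis-check-aof ships with Redis; with --fix it offers to truncate the
  # file at the first invalid entry and asks for confirmation first.
  redis-check-aof --fix "$1"
}

# Example call (path is a placeholder, not taken from my setup):
#   repair_aof /path/to/appendonly.aof
```

Because the gitlab-redis container is in CrashLoopBackOff, I can't simply exec into it to run this, so it would have to run from somewhere else that can see the same volume.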

values.yaml for helm chart:

# helm install -n gitlab gitlab gitlab/gitlab -f manifests/gitlab/values.yml
# helm upgrade -n gitlab gitlab gitlab/gitlab -f manifests/gitlab/values.yml

global:
  edition: ee
  hosts:
    domain: example.com
    https: false
    gitlab:
      name: gitlab.example.com
      https: true
    minio:
      name: minio.example.com
      https: false
    registry:
      name: cr.example.com
      https: true
  ingress:
    configureCertmanager: false
    class: nginx
    enabled: true
    tls:
      enabled: true
    annotations:
      cert-manager.io/cluster-issuer: "letsencrypt-prod"
      kubernetes.io/tls-acme: true
      nginx.ingress.kubernetes.io/proxy-body-size: 512m
      nginx.ingress.kubernetes.io/proxy-connect-timeout: 15
  gitaly:
    persistence:
      size: 8Gi
  psql:
    host: gitlab-postgres-postgresql
    database: gitlab
    user: gitlab
    password:
      secret: gitlab-postgresql
      key: postgres-gitlab-password
  minio:
    enabled: true
  grafana:
    enabled: false
  appConfig:
    ldap:
      servers:
        main:
          label: 'LDAP'
          host: 'ipa.example.com'
          port: 389
          uid: 'uid'
          base: 'asdf'
          active_directory: 'false'
          attributes:
            email: ['mail', 'email']
          bind_dn: 'asdf'
          password:
            secret: ldap-bind
            key: ldap-password
          encryption: 'plain'
  registry:
    enabled: true
    bucket: registry
gitlab:
  migrations:
    enabled: true
  unicorn:
    ingress:
      tls:
        secretName: gitlab-unicorn-tls
upgradeCheck:
  enabled: false
certmanager:
  install: false
nginx-ingress:
  enabled: false
prometheus:
  install: true
redis:
  persistence:
    size: 1Gi
postgresql:
  install: false
registry:
  enabled: true
  ingress:
    enabled: true
    tls:
      enabled: true
      secretName: gitlab-registry-tls
    annotations:
      cert-manager.io/cluster-issuer: "letsencrypt-prod"
      kubernetes.io/tls-acme: true
      nginx.ingress.kubernetes.io/proxy-body-size: 512m
      nginx.ingress.kubernetes.io/proxy-connect-timeout: 15
gitlab-runner:
  install: true
  privileged: true
  rbac:
    create: true
  runners:
    locked: false
    privileged: true
minio:
  persistence:
    size: 10Gi
gitaly:
  persistence:
    size: 8Gi

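How can I repair the corrupted AOF file, or otherwise get Redis to start properly, given that the Redis container is stuck in CrashLoopBackOff?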