"502 Whoops, GitLab is taking too much time to respond." after 13.x update

I previously tried to upgrade to 13.0 from 12.10.6, but it resulted in the same errors I am about to describe. I tried again today to upgrade from 12.10.6 -> 13.0.1 -> 13.1.0 and had the same issue occur. Essentially, the install appears to run successfully but when I go to access our GitLab instance, we forever get a “502 Whoops, GitLab is taking too much time to respond.” error page. We have a gitlab-ce instance installed on Debian Stretch via omnibus. We’re accessing it through an Apache proxy. I’ve tried it on two separate servers, one with 6 CPU cores and 16GB RAM and the other with 4 CPU cores and 8GB RAM.

/var/log/gitlab/gitlab-workhorse/current:

{"correlation_id":"ko0qBWPMkR","duration_ms":0,"error":"badgateway: failed to receive response: dial tcp 127.0.0.1:8182: connect: connection refused","level":"error","method":"GET","msg":"error","time":"2020-06-23T07:27:21-07:00","uri":"/"}

gitlab-ctl status

# gitlab-ctl status
run: alertmanager: (pid 10240) 484s; run: log: (pid 9569) 517s
run: gitaly: (pid 10253) 483s; run: log: (pid 9340) 554s
run: gitlab-exporter: (pid 10261) 483s; run: log: (pid 9557) 518s
run: gitlab-workhorse: (pid 10271) 482s; run: log: (pid 9524) 520s
run: grafana: (pid 10285) 482s; run: log: (pid 9576) 515s
run: logrotate: (pid 10301) 482s; run: log: (pid 9536) 520s
run: node-exporter: (pid 10317) 481s; run: log: (pid 9551) 519s
run: postgres-exporter: (pid 10322) 481s; run: log: (pid 9572) 516s
run: postgresql: (pid 10332) 480s; run: log: (pid 9351) 553s
run: prometheus: (pid 10341) 480s; run: log: (pid 9565) 517s
run: puma: (pid 10356) 479s; run: log: (pid 9515) 521s
run: redis: (pid 10362) 479s; run: log: (pid 9333) 555s
run: redis-exporter: (pid 10367) 479s; run: log: (pid 9562) 518s
run: registry: (pid 10374) 478s; run: log: (pid 9540) 519s
run: sidekiq: (pid 10460) 475s; run: log: (pid 9519) 521s

/var/log/gitlab/puma/puma_stdout.log

{"timestamp":"2020-06-23T14:33:09.522Z","pid":10356,"message":"PumaWorkerKiller: Consuming 1700.8984375 mb with master and 2 workers."}
{"timestamp":"2020-06-23T14:33:29.523Z","pid":10356,"message":"PumaWorkerKiller: Consuming 1701.59765625 mb with master and 2 workers."}
{"timestamp":"2020-06-23T14:33:49.524Z","pid":10356,"message":"PumaWorkerKiller: Consuming 1701.80859375 mb with master and 2 workers."}
{"timestamp":"2020-06-23T14:34:09.524Z","pid":10356,"message":"PumaWorkerKiller: Consuming 1702.35546875 mb with master and 2 workers."}
{"timestamp":"2020-06-23T14:34:29.525Z","pid":10356,"message":"PumaWorkerKiller: Consuming 1702.76171875 mb with master and 2 workers."}
{"timestamp":"2020-06-23T14:34:49.525Z","pid":10356,"message":"PumaWorkerKiller: Consuming 1703.04296875 mb with master and 2 workers."}
{"timestamp":"2020-06-23T14:35:09.526Z","pid":10356,"message":"PumaWorkerKiller: Consuming 1703.8203125 mb with master and 2 workers."}
{"timestamp":"2020-06-23T14:35:29.527Z","pid":10356,"message":"PumaWorkerKiller: Consuming 1704.1171875 mb with master and 2 workers."}
{"timestamp":"2020-06-23T14:35:49.527Z","pid":10356,"message":"PumaWorkerKiller: Consuming 1704.4609375 mb with master and 2 workers."}

/var/log/gitlab/gitlab-rails/production.log

Completed 200 OK in 6ms (Views: 0.4ms | ActiveRecord: 0.0ms | Elasticsearch: 0.0ms | Allocations: 651)
Started GET "/-/metrics" for 127.0.0.1 at 2020-06-23 07:36:38 -0700
Processing by MetricsController#index as HTML
Completed 200 OK in 6ms (Views: 0.3ms | ActiveRecord: 0.0ms | Elasticsearch: 0.0ms | Allocations: 651)
Started GET "/-/metrics" for 127.0.0.1 at 2020-06-23 07:36:53 -0700
Processing by MetricsController#index as HTML
Completed 200 OK in 6ms (Views: 0.3ms | ActiveRecord: 0.0ms | Elasticsearch: 0.0ms | Allocations: 651)
Started GET "/-/metrics" for 127.0.0.1 at 2020-06-23 07:37:08 -0700
Processing by MetricsController#index as HTML
Completed 200 OK in 6ms (Views: 0.3ms | ActiveRecord: 0.0ms | Elasticsearch: 0.0ms | Allocations: 651)

/var/log/gitlab/gitlab-rails/sidekiq_exporter.log

[2020-06-23T07:36:32.070-0700] 127.0.0.1 - - [23/Jun/2020:07:36:32 MST] "GET /metrics HTTP/1.1" 200 10992 "-" "Prometheus/2.16.0"
[2020-06-23T07:36:47.069-0700] 127.0.0.1 - - [23/Jun/2020:07:36:47 MST] "GET /metrics HTTP/1.1" 200 10992 "-" "Prometheus/2.16.0"
[2020-06-23T07:37:02.070-0700] 127.0.0.1 - - [23/Jun/2020:07:37:02 MST] "GET /metrics HTTP/1.1" 200 10992 "-" "Prometheus/2.16.0"
[2020-06-23T07:37:17.069-0700] 127.0.0.1 - - [23/Jun/2020:07:37:17 MST] "GET /metrics HTTP/1.1" 200 10992 "-" "Prometheus/2.16.0"
[2020-06-23T07:37:32.069-0700] 127.0.0.1 - - [23/Jun/2020:07:37:32 MST] "GET /metrics HTTP/1.1" 200 10999 "-" "Prometheus/2.16.0"
[2020-06-23T07:37:47.076-0700] 127.0.0.1 - - [23/Jun/2020:07:37:47 MST] "GET /metrics HTTP/1.1" 200 10999 "-" "Prometheus/2.16.0"
[2020-06-23T07:38:02.069-0700] 127.0.0.1 - - [23/Jun/2020:07:38:02 MST] "GET /metrics HTTP/1.1" 200 10999 "-" "Prometheus/2.16.0"
[2020-06-23T07:38:17.069-0700] 127.0.0.1 - - [23/Jun/2020:07:38:17 MST] "GET /metrics HTTP/1.1" 200 10999 "-" "Prometheus/2.16.0"
[2020-06-23T07:38:32.069-0700] 127.0.0.1 - - [23/Jun/2020:07:38:32 MST] "GET /metrics HTTP/1.1" 200 10999 "-" "Prometheus/2.16.0"
[2020-06-23T07:38:47.069-0700] 127.0.0.1 - - [23/Jun/2020:07:38:47 MST] "GET /metrics HTTP/1.1" 200 10999 "-" "Prometheus/2.16.0"

/var/log/gitlab/gitlab-workhorse/current:

{"correlation_id":"ydcDWwu7qN","duration_ms":0,"host":"git.domain.com","level":"info","method":"POST","msg":"access","proto":"HTTP/1.1","referrer":"","remote_addr":"35.166.34.92:39710","remote_ip":"35.166.34.92","status":502,"system":"http","time":"2020-06-23T07:39:53-07:00","uri":"/api/v4/jobs/request","user_agent":"gitlab-runner 12.10.2 (12-10-stable; go1.13.8; linux/amd64)","written_bytes":24}
{"correlation_id":"sL0J3gzZBd8","duration_ms":0,"host":"git.domain.com","level":"info","method":"POST","msg":"access","proto":"HTTP/1.1","referrer":"","remote_addr":"35.166.34.92:39722","remote_ip":"35.166.34.92","status":204,"system":"http","time":"2020-06-23T07:40:31-07:00","uri":"/api/v4/jobs/request","user_agent":"gitlab-runner 12.10.2 (12-10-stable; go1.13.8; linux/amd64)","written_bytes":0}
{"correlation_id":"QSy6kGgQpr7","duration_ms":0,"error":"badgateway: failed to receive response: dial tcp 127.0.0.1:8182: connect: connection refused","level":"error","method":"POST","msg":"error","time":"2020-06-23T07:40:31-07:00","uri":"/api/v4/jobs/request"}
{"correlation_id":"QSy6kGgQpr7","duration_ms":0,"host":"git.domain.com","level":"info","method":"POST","msg":"access","proto":"HTTP/1.1","referrer":"","remote_addr":"35.166.34.92:39726","remote_ip":"35.166.34.92","status":502,"system":"http","time":"2020-06-23T07:40:31-07:00","uri":"/api/v4/jobs/request","user_agent":"gitlab-runner 12.10.2 (12-10-stable; go1.13.8; linux/amd64)","written_bytes":24}
{"correlation_id":"794AsYF1MH1","duration_ms":0,"host":"git.domain.com","level":"info","method":"POST","msg":"access","proto":"HTTP/1.1","referrer":"","remote_addr":"35.166.34.92:39732","remote_ip":"35.166.34.92","status":204,"system":"http","time":"2020-06-23T07:40:31-07:00","uri":"/api/v4/jobs/request","user_agent":"gitlab-runner 12.10.2 (12-10-stable; go1.13.8; linux/amd64)","written_bytes":0}
{"correlation_id":"lW0HDSZIqX2","duration_ms":0,"host":"git.domain.com","level":"info","method":"POST","msg":"access","proto":"HTTP/1.1","referrer":"","remote_addr":"35.166.34.92:39736","remote_ip":"35.166.34.92","status":204,"system":"http","time":"2020-06-23T07:40:32-07:00","uri":"/api/v4/jobs/request","user_agent":"gitlab-runner 12.10.2 (12-10-stable; go1.13.8; linux/amd64)","written_bytes":0}
{"correlation_id":"o8ivcEiYN44","duration_ms":0,"host":"git.domain.com","level":"info","method":"POST","msg":"access","proto":"HTTP/1.1","referrer":"","remote_addr":"35.166.34.92:39740","remote_ip":"35.166.34.92","status":204,"system":"http","time":"2020-06-23T07:40:32-07:00","uri":"/api/v4/jobs/request","user_agent":"gitlab-runner 12.10.2 (12-10-stable; go1.13.8; linux/amd64)","written_bytes":0}
{"correlation_id":"oj40wlT2iP6","duration_ms":0,"host":"git.domain.com","level":"info","method":"POST","msg":"access","proto":"HTTP/1.1","referrer":"","remote_addr":"35.166.34.92:39744","remote_ip":"35.166.34.92","status":204,"system":"http","time":"2020-06-23T07:40:32-07:00","uri":"/api/v4/jobs/request","user_agent":"gitlab-runner 12.10.2 (12-10-stable; go1.13.8; linux/amd64)","written_bytes":0}
{"correlation_id":"rcR7DG6lK48","duration_ms":0,"error":"badgateway: failed to receive response: dial tcp 127.0.0.1:8182: connect: connection refused","level":"error","method":"POST","msg":"error","time":"2020-06-23T07:40:33-07:00","uri":"/api/v4/jobs/request"}
{"correlation_id":"rcR7DG6lK48","duration_ms":0,"host":"git.domain.com","level":"info","method":"POST","msg":"access","proto":"HTTP/1.1","referrer":"","remote_addr":"35.166.34.92:39748","remote_ip":"35.166.34.92","status":502,"system":"http","time":"2020-06-23T07:40:33-07:00","uri":"/api/v4/jobs/request","user_agent":"gitlab-runner 12.10.2 (12-10-stable; go1.13.8; linux/amd64)","written_bytes":24}
{"correlation_id":"xpOfcOhbcs3","duration_ms":0,"error":"badgateway: failed to receive response: dial tcp 127.0.0.1:8182: connect: connection refused","level":"error","method":"GET","msg":"error","time":"2020-06-23T07:40:39-07:00","uri":"/"}
{"correlation_id":"xpOfcOhbcs3","duration_ms":0,"host":"git.domain.com","level":"info","method":"GET","msg":"access","proto":"HTTP/1.1","referrer":"","remote_addr":"68.109.179.83:39754","remote_ip":"68.109.179.83","status":502,"system":"http","time":"2020-06-23T07:40:39-07:00","uri":"/","user_agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:77.0) Gecko/20100101 Firefox/77.0","written_bytes":2940}
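
For anyone reading along, here is a minimal sketch of how the workhorse log above could be summarized to see which backend address is refusing connections. The function name refused_backends is made up for illustration. It only assumes the JSON-per-line format and the "dial tcp ... connection refused" wording shown in the excerpt.

```python
import json
import re
from collections import Counter

# Matches the backend address in errors such as:
# "badgateway: failed to receive response: dial tcp 127.0.0.1:8182: connect: connection refused"
DIAL_RE = re.compile(r"dial tcp (\S+?): connect: connection refused")


def refused_backends(lines):
    """Count 'connection refused' errors per backend address in JSON log lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log entry
        if not isinstance(entry, dict):
            continue
        match = DIAL_RE.search(str(entry.get("error", "")))
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical usage against the log path from this thread:
# with open("/var/log/gitlab/gitlab-workhorse/current") as log:
#     print(refused_backends(log))
```

If every refused connection points at the same address, that is the address workhorse is configured to use as its backend, and it is worth checking whether anything is actually listening there.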

Our Apache vhost:

<IfModule mod_ssl.c>
  <VirtualHost *:443>
    ServerName git.domain.com
    ServerSignature Off

    ProxyPreserveHost On

    # Ensure that encoded slashes are not decoded but left in their encoded state.
    # http://doc.gitlab.com/ce/api/projects.html#get-single-project
    AllowEncodedSlashes NoDecode

    <Location />
      # New authorization commands for apache 2.4 and up
      # http://httpd.apache.org/docs/2.4/upgrading.html#access
      Require all granted

      #Allow forwarding to gitlab-workhorse
      ProxyPassReverse http://127.0.0.1:8181
      ProxyPassReverse http://git.domain.com/
    </Location>

    # Apache equivalent of nginx try files
    # http://serverfault.com/questions/290784/what-is-apaches-equivalent-of-nginxs-try-files
    # http://stackoverflow.com/questions/10954516/apache2-proxypass-for-rails-app-gitlab
    RewriteEngine on

    #Forward all requests to gitlab-workhorse except existing files like error documents
    RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-f [OR]
    RewriteCond %{REQUEST_URI} ^/uploads/.*
    RewriteRule .* http://127.0.0.1:8181%{REQUEST_URI} [P,QSA,NE]

    RequestHeader set X_FORWARDED_PROTO 'https'
    RequestHeader set X-Forwarded-Ssl on

    # needed for downloading attachments
    DocumentRoot /opt/gitlab/embedded/service/gitlab-rails/public

    #Set up apache error documents, if back end goes down (i.e. 503 error) then a maintenance/deploy page is thrown up.
    ErrorDocument 404 /404.html
    ErrorDocument 422 /422.html
    ErrorDocument 500 /500.html
    ErrorDocument 502 /502.html
    ErrorDocument 503 /503.html

    SSLCertificateFile /etc/letsencrypt/live/git.domain.com/fullchain.pem
    SSLCertificateKeyFile /etc/letsencrypt/live/git.domain.com/privkey.pem
    Include /etc/letsencrypt/options-ssl-apache.conf
  </VirtualHost>
</IfModule>

Note: git.domain.com is just a placeholder. Anywhere you see that, it is set to the correct domain name in the configs/logs

It doesn’t seem like any other log files are being written to actively, but I can provide any logs or configs that may be required. However, I will need to revert to 12.10.6 (again) so my team can continue working.

Does anyone have any ideas on what is going on here? We discovered today that our install of GitLab 12.10.6 is broken now and we cannot create new users without commenting out a line of code in the User model.

Hate to bump this again, but does anyone have any ideas? We’ll be 3 releases behind on Wednesday.

I always use the zero-downtime procedure for upgrading, but after going through that process, I often see our servers giving the message you’re reporting for a short time (typically less than a minute).

I don’t remember if it also happened when I upgraded to 13 (but there were other reasons to expect downtime when I did that upgrade).

I don’t see any clear problems in your logs, so I’m going to go with the simple answer: Have you tried being patient to see if GitLab comes back?

I had a similar problem. My gitlab-workhorse configuration (gitlab_workhorse['auth_backend']) was still using the unicorn service port instead of the puma service port.
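
One rough way to check for this kind of mismatch is to probe the ports directly and see which one accepts connections. This is only a sketch: port_is_open is a hypothetical helper, and the ports 8182 and 8098 in the comment are simply the ones that appear in this thread, not defaults.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused" as well as timeouts.
        return False


# Hypothetical usage on the GitLab server itself:
# print(port_is_open("127.0.0.1", 8182))  # the port workhorse was dialing
# print(port_is_open("127.0.0.1", 8098))  # the port puma is set to listen on
```

If the port workhorse dials is closed while puma's port is open, the two settings are probably out of sync.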

I did. I let it sit for an hour (thankfully my team doesn’t do much work on Saturdays) and it never came back up. I’m used to seeing the error message after an upgrade, but it usually goes away after a few minutes at most.

Thank you. I’ll try this solution after hours tonight. Do you know offhand what the default puma port is?

Edit: Looks like I should just follow these instructions. https://docs.gitlab.com/omnibus/settings/puma.html#converting-unicorn-settings-to-puma

Your note worked, ocanema. Thank you so much. You have no idea how much headache you saved me. Both of my instances are now on 13.2.0.

Note: I’m running Omnibus with Apache so my process and configs may not match yours exactly if someone else finds this and wants to know how to fix it.

I had to:

  1. Edit /etc/gitlab/gitlab.rb (always back this file up before modifying it)
  2. Add (at least) the following, if not all of the Puma defaults from the gitlab.rb template
    puma['enabled'] = true
    # this can be any port so long as it matches gitlab_workhorse['auth_backend']
    puma['port'] = 8098
    
    and modify the following values:
    unicorn['enabled'] = false
    gitlab_workhorse['auth_backend'] = "http://localhost:8098"
    
  3. Update to 13.0.x first via apt install gitlab-ce=13.0.10-ce.0 -V. The GitLab docs seem to imply this is essential: do not go directly to 13.1.0+ from 12.x.
  4. Update to 13.2.x via apt install gitlab-ce gitlab-runner
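
Before reconfiguring, the edit from step 2 could be sanity-checked with something like the sketch below. It is an assumption-laden helper, not an official tool: ports_match is a made-up name, it only understands settings written in the single-quoted style shown above, and it only handles a TCP URL in auth_backend.

```python
import re
from urllib.parse import urlparse

# Only uncommented lines match, because "#" is not allowed before the setting name.
PUMA_PORT_RE = re.compile(r"^\s*puma\['port'\]\s*=\s*(\d+)", re.MULTILINE)
AUTH_BACKEND_RE = re.compile(
    r"^\s*gitlab_workhorse\['auth_backend'\]\s*=\s*[\"']([^\"']+)[\"']",
    re.MULTILINE,
)


def ports_match(gitlab_rb_text):
    """Return True when workhorse's auth_backend port equals puma's port.

    Returns None if either setting is missing or commented out.
    """
    puma = PUMA_PORT_RE.findall(gitlab_rb_text)
    backend = AUTH_BACKEND_RE.findall(gitlab_rb_text)
    if not puma or not backend:
        return None
    # If a setting is assigned more than once, the last assignment wins.
    backend_port = urlparse(backend[-1]).port
    return backend_port == int(puma[-1])


# Hypothetical usage:
# with open("/etc/gitlab/gitlab.rb") as f:
#     print(ports_match(f.read()))
```

A result of False means workhorse would still be dialing a port that puma is not listening on, which is the situation that produced the 502 in this thread.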

I find that it simply means GitLab is still booting up, like the others have said here. Just wait 4-5 minutes and the Ruby server will start and all will be good.

Panic over!