GitLab version: 8.2.1
Nginx version: 1.4.6
We recently upgraded from GitLab 7.14.0 to GitLab 8.2.1 on a 16 GB / 8 CPU VM at DigitalOcean. We have one merge request with 55 comments that simply won’t load in the browser. All other merge requests load fine. The problem is reproducible: we have not been able to view this merge request in the browser at all. We get the following error from Nginx:
2015/12/02 14:49:02 [error] 9094#0: *62 upstream timed out (110: Connection timed out) while reading response header from upstream, client: x.x.x.x, server: gitlab.domain.com, request: "GET /group/project/merge_requests/854 HTTP/1.1", upstream: "http://unix:/home/git/gitlab/tmp/sockets/gitlab.socket/group/project/merge_requests/900", host: "gitlab.domain.com"
(Our errors are different from this question: Page timeouts when accessing merge request (502))
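One detail in that log line may be worth checking: the upstream field names a socket file, which tells you which backend Nginx was waiting on. The sketch below pulls that file name out of the quoted line. The helper name `upstream_socket` is made up for illustration, and the regular expression assumes the `upstream: "http://unix:...socket/..."` shape seen here.

```python
import re

# Error line copied verbatim from the Nginx log quoted above.
LOG_LINE = (
    '2015/12/02 14:49:02 [error] 9094#0: *62 upstream timed out '
    '(110: Connection timed out) while reading response header from upstream, '
    'client: x.x.x.x, server: gitlab.domain.com, '
    'request: "GET /group/project/merge_requests/854 HTTP/1.1", '
    'upstream: "http://unix:/home/git/gitlab/tmp/sockets/gitlab.socket'
    '/group/project/merge_requests/900", '
    'host: "gitlab.domain.com"'
)


def upstream_socket(line):
    """Return the socket file name from the upstream field, or None.

    Hypothetical helper: it only understands unix-socket upstreams
    whose path ends in ".socket", as in the line above.
    """
    match = re.search(r'upstream: "http://unix:([^"]*?\.socket)', line)
    if match is None:
        return None
    # Keep only the file name, dropping the directory part.
    return match.group(1).rsplit("/", 1)[-1]


print(upstream_socket(LOG_LINE))  # gitlab.socket
```

For this line the result is gitlab.socket rather than gitlab-workhorse.socket, which suggests the timed-out request was proxied to the gitlab upstream and not through the @gitlab-workhorse location shown in the config.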
In config/unicorn.rb we have set timeout 1200. We raised it from the original 30 to 300, then 600, and now 1200; 600 worked well with GitLab 7.14.0. We also have worker_processes 12 set in the same file.
GitLab 8.2.1 uses gitlab-workhorse, but I’m not familiar enough with gitlab-workhorse to know whether it has settings of its own that matter here.
Our workhorse settings in Nginx:
    upstream gitlab {
      server unix:/home/git/gitlab/tmp/sockets/gitlab.socket fail_timeout=0;
    }

    upstream gitlab-workhorse {
      server unix:/home/git/gitlab/tmp/sockets/gitlab-workhorse.socket fail_timeout=0;
    }

    location @gitlab-workhorse {
      client_max_body_size 0;
      gzip off;

      proxy_read_timeout      600;
      proxy_connect_timeout   600;
      proxy_redirect          off;
      proxy_buffering         off;

      proxy_set_header    Host                $http_host;
      proxy_set_header    X-Real-IP           $remote_addr;
      proxy_set_header    X-Forwarded-Ssl     on;
      proxy_set_header    X-Forwarded-For     $proxy_add_x_forwarded_for;
      proxy_set_header    X-Forwarded-Proto   $scheme;

      proxy_pass http://gitlab-workhorse;
    }
Output from free:
                 total       used       free     shared    buffers     cached
    Mem:      16433928   11946992    4486936     119052     236972    8130620
    -/+ buffers/cache:    3579400   12854528
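For readers unsure how to interpret those numbers, here is a small sketch of how the second row follows from the first. It assumes the values are in kilobytes (the default unit for free) and uses variable names chosen only for illustration.

```python
# Values from the "Mem:" row of the free output above (assumed to be in KB).
total, used, free, shared, buffers, cached = (
    16433928, 11946992, 4486936, 119052, 236972, 8130620
)

# Memory actually held by applications: "used" minus buffers and page cache.
used_by_apps = used - buffers - cached

# Memory that could be handed to applications: "free" plus reclaimable
# buffers and page cache.
available = free + buffers + cached

print(used_by_apps)  # 3579400, matches the "-/+ buffers/cache" used column
print(available)     # 12854528, matches the "-/+ buffers/cache" free column

# Share of total memory that is still available to applications.
available_pct = 100 * available / total
print(round(available_pct, 1))
```

Most of the "used" figure is page cache, so by this reading the VM does not appear to be short on memory.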
We’ve restarted GitLab with service gitlab restart in case some process was hogging resources, but we saw no difference.
Any suggestions on how we can figure out what is going on and fix this issue?