I have a GitLab server that dies at least once a day, usually at night (when we have more requests). The whole machine becomes inaccessible and I have to reboot it. Looking at the GitLab logs, I did not manage to find any useful clue, besides the fact that a git process usually starts consuming a lot of memory before the crash.

I’m running GitLab Community Edition 10.6.0 8f82e53 on an m3.2xlarge EC2 instance (it used to be an m3.xlarge, but I upgraded it to test whether that would solve the problem. It did not).

More info:

ubuntu@ip-10-x-x-x:~$ uname -a
Linux ip-10-x-x-x 3.13.0-149-generic #199-Ubuntu SMP Thu May 17 10:12:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
ubuntu@ip-10-x-x-x:~$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 14.04.2 LTS
Release:	14.04
Codename:	trusty
ubuntu@ip-10-x-x-x:~$ git version
git version 1.9.1
ubuntu@ip-10-x-x-x:~$ ulimit -c
root@ip-10-x-x-x:~# cat /proc/sys/vm/swappiness
# I also tried to disable swap, but it did not work
root@ip-10-x-x-x:~# cat /proc/sys/vm/min_free_kbytes
ubuntu@ip-10-x-x-x:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1      493G  203G  270G  43% /
none            4.0K     0  4.0K   0% /sys/fs/cgroup
udev             15G   12K   15G   1% /dev
tmpfs           3.0G  384K  3.0G   1% /run
none            5.0M     0  5.0M   0% /run/lock
none             15G     0   15G   0% /run/shm
none            100M     0  100M   0% /run/user
/dev/xvdb        74G   52M   70G   1% /mnt
/dev/xvdf       739G  166G  540G  24% /mnt2

Before crashing:

A log I’ve seen only once:

File descriptors:

ubuntu@ip-10-x-x-x:~$ ulimit -aH
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 240041
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 4096
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) unlimited
max user processes              (-u) 240041
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
# at an arbitrary time at night
ubuntu@ip-10-x-x-x:~$ lsof | wc -l
ubuntu@ip-10-x-x-x:~$ sudo lsof | wc -l
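
A note on reading these numbers: the `open files` limit from `ulimit -n` applies per process, while `lsof | wc -l` counts rows across all processes and includes entries that are not file descriptors (current directory, mapped libraries, and so on). A per-process count may therefore be easier to compare against the limit. The following is only a minimal sketch, assuming a Linux `/proc` filesystem; the function names `count_fds` and `top_fd_users` are made up for illustration, and it needs root to see the descriptors of processes owned by other users.

```shell
#!/bin/sh
# Count open file descriptors for one PID by listing /proc/<pid>/fd (Linux only).
# Prints 0 if the process is gone or its fd directory is not readable.
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Print "<fd count> <pid> <command name>" for each visible process,
# sorted by descriptor count, highest first. $1 = number of rows (default 10).
top_fd_users() {
    for dir in /proc/[0-9]*; do
        pid=${dir#/proc/}
        n=$(count_fds "$pid")
        name=$(cat "$dir/comm" 2>/dev/null)
        echo "$n $pid $name"
    done | sort -rn | head -n "${1:-10}"
}

top_fd_users 10
```

Run from cron every few minutes with the output appended to a file, this could show whether any single process approaches the 4096 limit before the machine hangs.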


I am seeing the same or a similar error: GitLab (hosted locally in a VM on Debian) becomes unresponsive with high CPU load.

