Gitaly timeout on CephFS when creating a new project

Hi all,

I have a gitlab-ce Docker installation with git-data stored on CephFS. We only recently switched from iSCSI to CephFS, and since then we have been getting errors when creating new projects. The cause is that the Gitaly CreateRepository method takes longer than 10 seconds and therefore hits the default fast timeout. Now I am wondering why creating a repository takes this long and how it might be linked to CephFS.

Has anyone had a similar experience, or does anyone have an idea what the cause could be? The CephFS is mounted over a 10Gb network and reaches that speed, with decent IOPS, in the fio benchmarks that GitLab suggests for testing the filesystem.

It seems that Git operations on Ceph filesystems are slow or cause performance issues. Some tuning tips are discussed in "kubernetes - CephFs and CephRBD being to slow to git clone" on Stack Overflow. I also found some performance monitoring tips in "GitLab on CephFS (#1) · Issues · GitLab.com / Operations · GitLab".

Thanks for the fast reply! I have seen the reports about performance issues with Git on Ceph. However, I only see them when creating a new project, not for anything else. Cloning and pushing run at 30-80 MB/s, which is more than enough. If I push a larger repository (~500 MB) to a new remote, creating the project takes longer than writing all the files afterwards. To me, this does not look like a problem with general file I/O.
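
If bulk throughput is fine but repository creation is slow, one possible explanation (an assumption, not something confirmed in this thread) is per-file latency for small, synced writes rather than bandwidth. A minimal Python sketch to compare that latency between mounts; the function name and the file count are made up for illustration:

```python
import os
import sys
import tempfile
import time


def time_small_file_creates(base_dir, count=200):
    """Create `count` one-byte files with fsync under base_dir.

    Returns the average wall-clock seconds per file. The scratch
    directory is removed again when the function returns.
    """
    with tempfile.TemporaryDirectory(dir=base_dir) as workdir:
        start = time.perf_counter()
        for i in range(count):
            path = os.path.join(workdir, f"f{i}")
            with open(path, "wb") as fh:
                fh.write(b"x")
                fh.flush()
                os.fsync(fh.fileno())  # force the write out to the filesystem
        elapsed = time.perf_counter() - start
    return elapsed / count


if __name__ == "__main__":
    # Pass one or more directories, e.g. a path on the CephFS mount
    # and a path on local disk, to compare them side by side.
    for target in sys.argv[1:]:
        per_file = time_small_file_creates(target)
        print(f"{target}: {per_file * 1000:.2f} ms per file")
```

Running it against a directory on the CephFS mount and one on local disk would show whether small synced writes are disproportionately slow on CephFS, which a throughput figure would not reveal.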

I did check the relevant Gitaly code but did not find anything particularly time-consuming in there. And I am not familiar enough with Gitaly debugging to get meaningful logs showing where the time is spent in our production environment.
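
Without Gitaly-level logs, a rough first check is to time a plain `git init --bare` on the same mount, outside of Gitaly. This is only a sketch: it assumes that creating an empty bare repository is a reasonable stand-in for part of what CreateRepository does, which is not confirmed here, and the function name and `probe.git` directory are made up for illustration.

```python
import os
import shutil
import subprocess
import sys
import tempfile
import time


def time_bare_init(base_dir):
    """Time `git init --bare` in a scratch directory under base_dir.

    Returns wall-clock seconds. The scratch directory is removed again
    when the function returns.
    """
    if shutil.which("git") is None:
        raise RuntimeError("git executable not found on PATH")
    with tempfile.TemporaryDirectory(dir=base_dir) as workdir:
        repo = os.path.join(workdir, "probe.git")
        start = time.perf_counter()
        subprocess.run(["git", "init", "--bare", "--quiet", repo], check=True)
        return time.perf_counter() - start


if __name__ == "__main__":
    # Pass one or more directories, e.g. a path on the CephFS mount
    # and a path on local disk, to compare them side by side.
    for target in sys.argv[1:]:
        print(f"{target}: {time_bare_init(target):.3f} s")
```

If this already takes several seconds on CephFS but is near-instant on local disk, the delay is probably in the filesystem rather than in Gitaly itself; if it is fast on both, the time is being spent somewhere else in the request.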

Unfortunately, I’m not familiar with the Gitaly code and how it relates to I/O. Maybe this hits a bug, or it needs more investigation. I’d suggest opening an issue in the Gitaly project to discuss this with our engineers, linking to this forum topic.