CI job hangs for 10 minutes even though job is complete (private runner)

I’ve run into an issue with a private runner communicating with GitLab.com: the job output stops showing up on GitLab after a few messages, even though the job is still progressing on the runner (verified by looking at the console output). I installed mitmproxy to watch the HTTP requests between the runner and GitLab.com, and the problem stems from one request that hangs waiting for a response from GitLab.com. Because the response doesn’t come back, the runner stops sending messages to GitLab. Eventually the runner times out the request (after 10 minutes) and sends the rest of the job output (which had finished some 8 minutes earlier) in another request, which gets an immediate response. The effect is that every job takes 11-12 minutes even though it really finishes in 1-2 minutes.

One would assume this is a network problem and the request never made it to GitLab.com, except that the hanging request actually does receive a response some 16 minutes later, well after the runner has timed out, sent all the remaining job output, and closed the job. The only way I know this is by looking in mitmproxy and seeing the 403 error on the response (“Job is not running”).
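In case it helps anyone reproduce the observation, here is a rough sketch of a mitmproxy addon script that prints only the responses that were slow or not 2xx, so a late 403 like the one above stands out. The 30-second threshold and the names `SLOW_SECONDS`, `response_latency`, and `describe` are my own illustrative choices, not anything from GitLab or the runner; only the `response(flow)` hook and the flow attributes come from mitmproxy.

```python
# Sketch of a mitmproxy addon script. Run with: mitmdump -s slow_responses.py
# SLOW_SECONDS and the helper names are illustrative, not GitLab or mitmproxy settings.

SLOW_SECONDS = 30.0  # assumed threshold for a "suspiciously slow" response


def response_latency(flow):
    """Seconds between the start of the request and the end of the response."""
    start = flow.request.timestamp_start
    end = flow.response.timestamp_end
    if start is None or end is None:
        return None
    return end - start


def describe(flow, threshold=SLOW_SECONDS):
    """Return a one-line summary for slow or non-2xx responses, else None."""
    latency = response_latency(flow)
    status = flow.response.status_code
    slow = latency is not None and latency >= threshold
    failed = not (200 <= status < 300)
    if not (slow or failed):
        return None
    took = "unknown" if latency is None else f"{latency:.1f}s"
    return f"{flow.request.method} {flow.request.path} -> {status} after {took}"


def response(flow):
    """mitmproxy event hook, called once a response has been received."""
    line = describe(flow)
    if line is not None:
        print(line)
```

Note that this hook only fires once a response actually arrives, so a hanging request shows up in the output only when its late response finally comes back.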

Has anyone else experienced this? I’ve tried removing and reregistering the runner multiple times over the span of a month, same result.

I’ve logged this as an issue on the GitLab.com Support Tracker (https://gitlab.com/gitlab-com/support-forum/issues/4844), but it doesn’t seem to be getting any attention. Please help!

Hi, sorry to dig up this topic, but did you happen to solve this issue?

I’m running into exactly the same problem with my company’s GitLab instance. I suspect a proxy we use could be the root cause, but I can’t find a workaround.

Any help would be greatly appreciated :slight_smile:

Hello!

We have the same situation, only we do not use a proxy. Our Docker image build hangs at the stage where the image is pushed to the GitLab registry.

We do not understand how to solve this problem, and our work has come to a stop.