An error occurred while fetching the job log

I’m having trouble loading job logs for finished jobs. When navigating to a finished job (https://gitlab-instance/group/project/-/jobs/number), I see a loading animation that eventually times out, and a banner appears with the message in the title.

If I navigate to the job while it is running, the output streams continuously as expected. But as soon as the job finishes, the loading animation comes up again, and the banner eventually appears.

Viewing the raw log by clicking “Show complete raw” redirects me to the S3-compatible store and shows me the correct log.

Our setup:

  • GitLab 12.5.4-ee (2a57951c0ee) installed on our on-prem Kubernetes cluster using Helm chart version 2.5.5.
  • S3-compatible storage configured for all object storage (artifacts, lfs, registry, uploads, packages, externalDiffs, pseudonymizer).
  • ci_enabled_live_trace is enabled
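For reference, this is the feature flag in question, which can be checked (and toggled) from the Rails console on the GitLab pod. A sketch only; the flag name is the one from our setup above:

```ruby
# Run inside `gitlab-rails console` on the GitLab (unicorn/task-runner) pod.
Feature.enabled?(:ci_enabled_live_trace)  # check whether incremental logging is on
Feature.disable(:ci_enabled_live_trace)   # turn incremental logging off again
```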

Any tips on how to debug this further?

We’ve got the exact same issue. Did you manage to solve this?

No. I suspect it has something to do with the incremental logging architecture (https://docs.gitlab.com/ee/administration/job_logs.html#new-incremental-logging-architecture). But I’m not familiar enough with GitLab or Ruby to know where and what to look for.
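For anyone digging into this: with incremental logging, the log is buffered in chunks before being archived to object storage, so one thing to check is whether a finished job still has unarchived chunks lying around. A rough debugging sketch from the Rails console; the class and method names are taken from the GitLab source and should be treated as assumptions for your version:

```ruby
# Inside `gitlab-rails console` — replace 1234 with the numeric job ID
# from the job URL.
job = Ci::Build.find(1234)
Ci::BuildTraceChunk.where(build_id: job.id).count  # leftover trace chunks, if any
job.trace.exist?                                   # does GitLab think the trace exists?
```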

I have the exact same issue. I don’t suppose you found more information about this, @ctryti?

In our case we have this on one scheduled pipeline. The build stage succeeds, but there are no logs. There are also no artifacts to download.

However, if we run this manually from the UI (even when pressing play on the Schedule) it succeeds. It also succeeds when we run it from a commit.

I’m not sure it’s exactly the same problem. We found out that our problem was unicorn running in multiple pods. A runner would transfer logs to one of these unicorn pods, but when we view the job logs in a browser, the request is load-balanced across all unicorn pods and won’t always land on the pod that actually has the logs.

We solved it by creating a persistent volume that is shared between all unicorn pods, and disabling the incremental logging linked above. It’s not the proper solution, but at least we can consistently view job logs now.
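The failure mode is easy to reproduce in the abstract. A minimal sketch (all names hypothetical): each pod keeps logs only in its own memory, the runner streams to one pod, and reads are load-balanced across all of them, so most reads miss. With a shared backing store, any pod can serve the read:

```ruby
# Hypothetical sketch of the load-balancing failure mode described above.
class Pod
  def initialize
    @logs = {}  # job_id => log text, local to this pod only
  end

  def append(job_id, line)
    (@logs[job_id] ||= "") << line
  end

  def read(job_id)
    @logs[job_id]
  end
end

# Three independent pods: the runner happens to stream to pod 0 only.
pods = Array.new(3) { Pod.new }
pods[0].append(42, "job output\n")

# A browser request may land on any pod; only one of three has the log.
hits = pods.count { |pod| !pod.read(42).nil? }
# hits is 1 — two out of three requests find nothing.

# With a shared store (the persistent-volume fix), every "pod" reads the
# same backing data, so any request succeeds.
shared = Pod.new
pods_sharing = Array.new(3) { shared }
pods_sharing[0].append(42, "job output\n")
hits_shared = pods_sharing.count { |pod| !pod.read(42).nil? }
# hits_shared is 3 — every request finds the log.
```

This is also why the raw-log link works: it goes straight to object storage, bypassing the per-pod state entirely.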