An error occurred while fetching the job log

I’m having trouble loading job logs for finished jobs. When navigating to a finished job (https://gitlab-instance/group/project/-/jobs/number), I see a loading animation that eventually times out, and a banner appears with the message in the title.

If I navigate to the job while it is running, the output streams continuously as expected. But as soon as the job finishes, the loading animation comes up again, and the banner eventually appears.

Viewing the raw log by clicking “Show complete raw” redirects me to the S3-compatible store and shows me the correct log.

Our setup:

  • GitLab 12.5.4-ee (2a57951c0ee) installed on our on-prem Kubernetes cluster using Helm chart version 2.5.5.
  • S3-compatible storage configured for all object storage (artifacts, lfs, registry, uploads, packages, externalDiffs, pseudonymizer).
  • ci_enabled_live_trace is enabled
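For reference, this is the feature flag in question, which can be checked (and toggled) from the Rails console on the GitLab pod. A sketch only; the flag name is the one from our setup above:

```ruby
# Run inside `gitlab-rails console` on the GitLab (unicorn/task-runner) pod.
Feature.enabled?(:ci_enabled_live_trace)  # check whether incremental logging is on
Feature.disable(:ci_enabled_live_trace)   # turn incremental logging off again
```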

Any tips on how to debug this further?

We’ve got the exact same issue. Did you manage to solve this?

No. I suspect it has something to do with the incremental logging architecture (https://docs.gitlab.com/ee/administration/job_logs.html#new-incremental-logging-architecture). But I’m not familiar enough with GitLab or Ruby to know where and what to look for.
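For anyone digging into this: with incremental logging, the log is buffered in chunks before being archived to object storage, so one thing to check is whether a finished job still has unarchived chunks lying around. A rough debugging sketch from the Rails console; the class and method names are taken from the GitLab source and should be treated as assumptions for your version:

```ruby
# Inside `gitlab-rails console` — replace 1234 with the numeric job ID
# from the job URL.
job = Ci::Build.find(1234)
Ci::BuildTraceChunk.where(build_id: job.id).count  # leftover trace chunks, if any
job.trace.exist?                                   # does GitLab think the trace exists?
```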

I have the exact same issue. I don’t suppose you found more information about this, @ctryti?

In our case we have this on one scheduled pipeline. The build stage succeeds, but there are no logs. There are also no artifacts to download.

However, if we run this manually from the UI (even when pressing play on the Schedule) it succeeds. It also succeeds when we run it from a commit.

I’m not sure it’s exactly the same problem. We found out that our problem was unicorn running in multiple pods. A runner would transfer logs to one of these unicorn pods, but when we view the job logs in a browser, the request is load-balanced across all unicorn pods and won’t always land on the pod that actually has the logs.

We solved it by creating a persistent volume that is shared between all unicorn pods, and disabling the incremental logging linked above. It’s not the proper solution, but at least we can consistently view job logs now.
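The failure mode is easy to reproduce in the abstract. A minimal sketch (all names hypothetical): each pod keeps logs only in its own memory, the runner streams to one pod, and reads are load-balanced across all of them, so most reads miss. With a shared backing store, any pod can serve the read:

```ruby
# Hypothetical sketch of the load-balancing failure mode described above.
class Pod
  def initialize
    @logs = {}  # job_id => log text, local to this pod only
  end

  def append(job_id, line)
    (@logs[job_id] ||= "") << line
  end

  def read(job_id)
    @logs[job_id]
  end
end

# Three independent pods: the runner happens to stream to pod 0 only.
pods = Array.new(3) { Pod.new }
pods[0].append(42, "job output\n")

# A browser request may land on any pod; only one of three has the log.
hits = pods.count { |pod| !pod.read(42).nil? }
# hits is 1 — two out of three requests find nothing.

# With a shared store (the persistent-volume fix), every "pod" reads the
# same backing data, so any request succeeds.
shared = Pod.new
pods_sharing = Array.new(3) { shared }
pods_sharing[0].append(42, "job output\n")
hits_shared = pods_sharing.count { |pod| !pod.read(42).nil? }
# hits_shared is 3 — every request finds the log.
```

This is also why the raw-log link works: it goes straight to object storage, bypassing the per-pod state entirely.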