Jobs in K8s exit as expected, but the pod keeps running

  What are you seeing, and how does that differ from what you expect to see?
    Some of the jobs finish with
{"command_exit_code": 0, "script": "/scripts-142-420934/step_script"}
or
{"command_exit_code": 1, "script": "/scripts-142-420934/step_script"}
and then, seemingly at random, the pod stays stuck in a “running” state.
In the UI, the user sees that the job is done and the pipeline completes as expected, but for me, as the person maintaining the cluster, the leftover running pods are very annoying.

    • GitLab (Hint: /help): 14.6.1-ee
    • Runner (Hint: /admin/runners): gitlab-org/gitlab-runner:alpine-v16.3.0
    This is not relevant: the jobs worked fine on a normal Docker runner, so we just moved them to K8s, and they work the same there. Sometimes, though, after the job exits the pod stays stuck in running, so it’s not related to any specific pipeline.

    Nothing - I don’t know where to start
    I have a cron job that checks whether a pod’s last log line contains “"command_exit_code":” and, if so, deletes the pod. This is a workaround, not a solution.
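
For anyone who wants to reproduce the workaround, here is a rough sketch of what such a cleanup script could look like. It is not my exact script, only an illustration under some assumptions: `kubectl` is on the PATH and already points at the right cluster, the job container in the runner pod is named `build`, and the function names (`is_finished_log_line`, `cleanup`, and so on) are made up for this example. It also parses the last log line as JSON instead of doing a plain substring match.

```python
import json
import subprocess


def is_finished_log_line(line: str) -> bool:
    """Return True if the line is a JSON object that has a command_exit_code key."""
    try:
        record = json.loads(line.strip())
    except json.JSONDecodeError:
        return False
    return isinstance(record, dict) and "command_exit_code" in record


def kubectl(*args: str) -> str:
    """Run kubectl with the given arguments and return its stdout."""
    result = subprocess.run(
        ["kubectl", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def running_pods(namespace: str) -> list[str]:
    """List the names of pods in the Running phase in the given namespace."""
    out = kubectl(
        "get", "pods", "-n", namespace,
        "--field-selector=status.phase=Running",
        "-o", "jsonpath={.items[*].metadata.name}",
    )
    return out.split()


def last_log_line(namespace: str, pod: str, container: str) -> str:
    """Fetch the last log line of one container in a pod ('' if there is none)."""
    out = kubectl("logs", pod, "-n", namespace, "-c", container, "--tail=1")
    lines = out.strip().splitlines()
    return lines[-1] if lines else ""


def cleanup(namespace: str, container: str = "build", dry_run: bool = True) -> list[str]:
    """Find running pods whose last log line reports an exit code.

    With dry_run=True the pods are only reported; with dry_run=False
    they are also deleted. The container name 'build' is an assumption.
    """
    finished = []
    for pod in running_pods(namespace):
        if is_finished_log_line(last_log_line(namespace, pod, container)):
            if not dry_run:
                kubectl("delete", "pod", pod, "-n", namespace, "--wait=false")
            finished.append(pod)
    return finished
```

Calling `cleanup("gitlab-runner")` (the namespace name here is just a placeholder) only lists the candidate pods; passing `dry_run=False` deletes them. Like the cron job, this only hides the symptom and does not explain why the pods stay running.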