SSH into CI server

I need to debug a Docker build running on the CI server. Is it possible to SSH into the server for debugging or troubleshooting? I’m dealing with the failing build described here: Runner exits out as "job succeeded" before finishing a job

Greetings!

When using the docker executor, the runner creates a container for the job itself. This makes SSH’ing into it difficult, because the container “dies out” once the job is completed (regardless of exit status).

What you might want to do instead is either enable debug logging or have the .gitlab-ci.yml file produce more output that can assist you in debugging.
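
As a rough sketch of both ideas, a job could look something like the fragment below. The job name, image, and `./build.sh` are placeholders rather than anything from your project; `CI_DEBUG_TRACE` is GitLab's debug logging variable, and `set -x` simply makes the shell echo each command before running it.

```yaml
# Hypothetical job; the name, image, and build command are assumptions.
debug-build:
  image: alpine:latest
  variables:
    # Turns on GitLab's debug logging for this job. It prints variable
    # values (including secrets) to the job log, so enable it only temporarily.
    CI_DEBUG_TRACE: "true"
  script:
    - set -x        # echo every command as it runs
    - pwd           # show the working directory inside the container
    - ls -la        # show which files the job actually sees
    - ./build.sh    # placeholder for the real build step
```

Because debug logging can expose secrets in the job log, it is best removed again once the problem is found.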

Another option is to run the job steps manually on the CI server itself. Spin up a container using the same image, clone the project’s repo into the container, and then run the steps from the job by hand. This lets you see what is happening when the CI/CD job runs.

Hope this helps!

Hi @jcolyer, thank you for taking the time to give me some pointers here. It was helpful. I’ve got a theory, but I want to confirm it with you since you’re more experienced with the CI server.
Does the after_script section of the .gitlab-ci.yml file have its own timeout?

Greetings!

Not in the sense of the pipeline/job settings. Basically, the anatomy of a CI/CD job is:

  • before_script
  • script
  • after_script

The job can be a bit more complex than that, but that is a good general view of a pipeline job.

All of these tie into the timeout settings of the overall job. So if your project settings allow the job to run for 1 hour (the default), that is 1 hour for the whole job to get through those three sections.

One thing to keep in mind: the runner's timeout setting overrides the project-level one.
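
To make the three sections concrete, here is a minimal sketch of a job that uses all of them. The job name, image, and echo commands are made up for illustration; `timeout` is the job-level keyword, shown here with the 1 hour value mentioned above.

```yaml
# Hypothetical job showing the three sections in the order they run.
example-job:
  image: alpine:latest      # assumed image
  timeout: 1h               # job-level timeout
  before_script:
    - echo "set up the environment"
  script:
    - echo "do the main work of the job"
  after_script:
    - echo "clean up afterwards"
```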

I was asking because I noticed that when I move the build script from after_script to script, the image build doesn’t time out. It successfully builds and pushes the image to ECR. Whereas if the script is in after_script, the job exits as “succeeded” before the build has finished and the image has been pushed. Still can’t believe I had been stuck on this for almost 3 weeks. Gosh …

I think I used the after_script section for the wrong purpose. The purpose of after_script is to hold clean-up commands, not the commands that do the build. I had put the build script in the wrong place.
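
For anyone who lands here later, this is roughly the shape of the corrected job: a sketch only, not my exact file. The job name, the `docker:dind` setup, and the `ECR_REPOSITORY` variable are assumptions for illustration, and the ECR login step is left out.

```yaml
# Sketch of the corrected layout; names and variables are hypothetical.
build-and-push:
  image: docker:latest
  services:
    - docker:dind                 # assumes a Docker-in-Docker setup
  script:
    # The build and push belong here, under the job's normal timeout.
    # ECR_REPOSITORY is a made-up CI/CD variable holding the repository URI.
    - docker build -t "$ECR_REPOSITORY:$CI_COMMIT_SHORT_SHA" .
    - docker push "$ECR_REPOSITORY:$CI_COMMIT_SHORT_SHA"
  after_script:
    # Only short clean-up work goes here.
    - docker image prune -f
```

As far as I can tell from the GitLab docs, after_script commands run in a separate shell and get their own, much shorter time limit, which would fit what I was seeing, so it is worth checking the docs for your runner version.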