Not sure what happened, it worked fine a few days ago when I last used it. It’s as if it doesn’t execute the job scripts and quits or something.
Running with gitlab-runner 14.6.0 (5316d4ac)
on LAPTOP-NSD23I7N khE_F_t7
feature flags: FF_USE_FASTZIP:true
Preparing the "shell" executor
00:00
Using Shell executor...
Preparing environment
00:01
Running on LAPTOP-NSD23I7N...
Getting source from Git repository
00:18
Fetching changes with git depth set to 50...
Reinitialized existing Git repository in C:/GitLab-Runner/builds/khE_F_t7/0/emrys90/project-cards/.git/
Checking out 745abf5d as dev...
Removing Library/
git-lfs/3.0.2 (GitHub; windows amd64; go 1.17.2)
Skipping Git submodules setup
Restoring cache
01:53
Version: 14.6.0
Git revision: 5316d4ac
Git branch: 14-6-stable
GO version: go1.13.8
Built: 2021-12-17T17:35:49+0000
OS/Arch: windows/amd64
Checking cache for dev-android-applab...
Runtime platform arch=amd64 os=windows pid=23420 revision=5316d4ac version=14.6.0
No URL provided, cache will not be downloaded from shared cache server. Instead a local version of cache will be extracted.
Successfully extracted cache
Executing "step_script" stage of the job script
00:01
$ bash "ci/build.sh"
Cleaning up project directory and file based variables
00:00
ERROR: Job failed: exit status 1
It begins running the script, but fails because not all of the environment variables that the runner sets are available. Here are the first few lines of the script, which, as you can see from the logs, are not showing up in there:
#!/usr/bin/env bash
set -e
set -x
echo "Building for $BUILD_TARGET"
export BUILD_PATH=./Builds/$BUILD_TARGET/
It’s in the .yml file. Everything worked fine in this whole pipeline a few days ago. If the script was the issue I would still at least get that first echo statement.
Checking cache for dev-android-applab...
Runtime platform arch=amd64 os=windows pid=21916 revision=5316d4ac version=14.6.0
No URL provided, cache will not be downloaded from shared cache server. Instead a local version of cache will be extracted.
Successfully extracted cache
Executing "step_script" stage of the job script
00:01
$ echo "$BUILD_TARGET"
Android
$ bash "ci/build.sh"
Cleaning up project directory and file based variables
00:00
ERROR: Job failed: exit status 1
OK, so if you have the variable available in the YAML file, but it’s just not getting to the script, why not change the script so that it accepts a command-line argument, and pass the variable to it from the YAML file?
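A rough sketch of what the script side of that suggestion could look like. The function name resolve_build_target is made up for illustration; the only names taken from this thread are BUILD_TARGET and ci/build.sh. It takes the target from the first argument and falls back to the environment variable if no argument is given:

```shell
#!/usr/bin/env bash
# Hypothetical helper: pick the build target from the first argument,
# falling back to the BUILD_TARGET environment variable when no
# argument is given.
resolve_build_target() {
  local target="${1:-${BUILD_TARGET:-}}"
  if [ -z "$target" ]; then
    # Neither an argument nor the environment variable was provided.
    echo "usage: build.sh <build-target>" >&2
    return 1
  fi
  printf '%s\n' "$target"
}

# Possible use at the top of ci/build.sh:
#   BUILD_TARGET="$(resolve_build_target "$@")" || exit 1
#   echo "Building for $BUILD_TARGET"
```

The job's script line would then become something like bash ci/build.sh "$BUILD_TARGET". This only helps if the script actually starts, so it will not change anything if bash never gets as far as the first line.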
I think you might be misunderstanding what the issue is? My script isn’t even running. If it was running, I would get the first echo in that script. Even if it wasn’t getting the variable, the echo would still happen.
But your log does show that the runner is executing $ bash "ci/build.sh", so it is trying to execute that line, and you have seen that you can run the script correctly manually as gitlab-runner and get some output.
These are just random ideas for debugging, but I would be inclined to change the she-bang line to #!/bin/bash and see what happens.
Clearly, it’s not a permissions issue because you’re calling bash directly from the YAML file, but you could also change that to chmod +x ci/build.sh && ./ci/build.sh just to avoid the extra shell invocation, and see if you get a different result.
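Another debugging idea along the same lines, as a minimal sketch: wrap the call so that stderr and the exit status both end up in the job log. The helper name run_and_report is hypothetical, and this assumes the snippet itself is run by a working bash (for example saved as its own small script), which may not hold if bash is the part that is failing:

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command, fold stderr into stdout so both
# appear in the job log, and print the exit status afterwards.
run_and_report() {
  local status=0
  "$@" 2>&1 || status=$?
  echo "exit status: $status"
  return 0
}

# Possible use:
#   run_and_report bash -x ci/build.sh
```

If even the exit status line never appears in the log, that would suggest the problem is in how bash is launched rather than in ci/build.sh.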
Changing it to #!/bin/bash had no effect. This is running on Windows, so I don’t think it’s any kind of file permissions error. Plus this whole thing worked as of a few days ago and I can’t think of anything I would have changed to break it.
Well, it’s likely that something has changed in your environment. It is odd though that you can run the script manually as the gitlab-runner user but not automatically.
A longer-term solution, which would prevent this from happening again, would be to use a Docker runner and pin the version of the image that you use. It will mean changing your infrastructure a bit, but that would give you some assurance that the environment is stable.
I’m at a loss as to what could have caused it or how to fix it now… I have no idea what could possibly prevent a script from executing through the runner.
I would guess that the thing that changed was something in your wider environment: maybe a package update, or a Windows update, or something else that changed a setting or an env var somewhere.
Any ideas what I need to do to fix it? It’s worked fine for a long time, and I do not want to have to redo the whole process by moving it all to Docker as a workaround. On top of that, I need to build on Windows for Unity, so I can’t just use a Linux image in Docker for it.
I think you need to gather more information here about exactly what is causing the problem. You might also try creating a new runner and seeing whether that gives you different results (but I’d guess not?).
However, if it’s Unity that you are building, I use this image which is made specifically for building Unity apps on GitLab, and can target any of the relevant platforms, including Windows and Android. There’s an example repo with a simple Unity app that shows how the image is intended to be used.
More information would be helpful for sure, but I honestly have no idea what to look for. I don’t have any idea what could possibly have caused this. The only thing I recently changed on my computer was installing Docker, which I had never used before. I think there might also have been some recent Windows updates that installed.
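As a starting point for what to look for, here is a small diagnostic sketch. Since Docker was installed recently, one thing that may be worth ruling out is that a different bash is now being found first on PATH than the one the job used before. That is only an assumption; nothing in the logs above confirms it. The function name print_diagnostics and the idea of saving this as a separate script are hypothetical:

```shell
#!/usr/bin/env bash
# Diagnostic sketch: print which bash is running and what it can see.
print_diagnostics() {
  echo "bash on PATH: $(command -v bash || echo 'not found')"
  echo "bash version: ${BASH_VERSION:-unknown (not running under bash)}"
  echo "kernel: $(uname -a 2>/dev/null || echo 'unknown')"
  echo "working directory: $(pwd)"
  echo "BUILD_TARGET: ${BUILD_TARGET:-<unset>}"
  # Check whether the build script is reachable from where bash starts.
  if [ -f ci/build.sh ]; then
    echo "ci/build.sh: found"
  else
    echo "ci/build.sh: NOT found from this directory"
  fi
}

print_diagnostics
```

Running this once by hand and once through the runner, then comparing the two outputs, should show whether the job sees a different bash, a different working directory, or a missing variable. If the runner prints nothing at all for it, that points at the bash invocation itself.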
Does anyone have any other ideas on how to fix this issue? I would like to avoid switching to Docker, as that would be a time-consuming process that takes me away from other development needs. This is a production product with lots of ongoing work needed, so I am effectively shut down until this is solved. Also, I used the gableroux image a while back and switched away from it because of an issue I was having. I don’t remember what the issue was, but it makes me more hesitant to try switching to Docker.
This has worked fine for over a year, so I would like to just get this functional again.
I’ve tried removing Docker from the PATH and uninstalling the recent Windows updates, but it’s still not working.