Hi,
After upgrading our self-hosted GitLab on EKS and its runners to 16.4.1, we are seeing pipelines getting stuck in a running state, with some of the manual jobs stuck in a created state. In past versions, all manual jobs would end up in a “manual action” state, and the pipeline would finish executing and move to a “passed” state.
I’m just wondering whether anyone else has seen this behavior or has suggestions on how to fix it.
I’ve tried several variations and have seen the behavior with several types of pipelines.
Here is the CI configuration I have been testing with:
#
# the variables
#
variables:
  UBUNTU: ubuntu:20.04
  MAVEN: ubuntu:20.04
  #MAVEN: maven:3.6.3-openjdk-11
  CI_DEBUG_TRACE: "true"

#
# the stages
#
stages:
  - build
  - test
  - deploy

#
# the build jobs
#
build:
  stage: build
  image: $MAVEN
  script:
    - echo "Start build"
    - sleep 20
    - echo "END build"
  # rules:
  #   - if: $NO_BUILD
  #     when: never
  #   - when: on_success

#
# the test jobs
#
test:
  stage: test
  image: $MAVEN
  script:
    - echo "Start test"
    - sleep 15
    - echo "END test"
  # rules:
  #   - if: $NO_TEST
  #     when: never
  #   - when: on_success

#
# the deploy stage jobs
#
.deploy:
  stage: deploy
  image: $UBUNTU
  script:
    - echo "Start deploy"
    - sleep 20
    - echo "END deploy"
  # rules:
  #   - if: $CI_JOB_NAME =~ /int01/
  #     when: manual
  #   - if: $LIMIT_DEPLOY
  #     when: never
  #   - when: manual

foo-int01:
  extends: .deploy
  environment:
    name: foo-int01
  when: manual

foo-int02:
  extends: .deploy
  environment:
    name: foo-int02
  when: manual

foo-int03:
  extends: .deploy
  environment:
    name: foo-int03
  when: manual

foo-int04:
  extends: .deploy
  environment:
    name: foo-int04
  when: manual

foo-int05:
  extends: .deploy
  environment:
    name: foo-int05
  when: manual

foo-int06:
  extends: .deploy
  environment:
    name: foo-int06
  when: manual

bar-int01:
  extends: .deploy
  environment:
    name: bar-int01
  when: manual

bar-int02:
  extends: .deploy
  environment:
    name: bar-int02
  when: manual

bar-int03:
  extends: .deploy
  environment:
    name: bar-int03
  when: manual

bar-int04:
  extends: .deploy
  environment:
    name: bar-int04
  when: manual

bar-int05:
  extends: .deploy
  environment:
    name: bar-int05
  when: manual
And here’s the screenshot from the pipeline execution. Sometimes, when watching the execution, one or maybe two of the manual jobs will move from the created state to “manual action” about 10-15 seconds after the others that do make it to the “manual action” state.
I’ve searched the forums here and found a few similar issues, but they seem to be generally for more complex pipelines. I’m sorry if I’ve missed a previous topic that is relevant.
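In case it helps anyone trying to reproduce this, below is a reduced sketch of the same pattern: one build job followed by two manual deploy jobs. It is only a diagnostic aid, not a confirmed fix. It moves `when: manual` into the shared `.deploy` template and spells out `allow_failure: true`, which is the documented default for `when: manual` when set outside of `rules`, so an optional manual job should not hold the pipeline in a running state. The reduced job set and the hard-coded image are assumptions made for illustration.

```yaml
# Hypothetical reduced reproduction: one build job and two manual deploy jobs.
stages:
  - build
  - deploy

build:
  stage: build
  image: ubuntu:20.04
  script:
    - echo "Start build"
    - echo "END build"

.deploy:
  stage: deploy
  image: ubuntu:20.04
  script:
    - echo "Start deploy"
    - echo "END deploy"
  when: manual
  # Already the default for `when: manual` outside `rules`;
  # written out only to rule out a changed default after the upgrade.
  allow_failure: true

foo-int01:
  extends: .deploy
  environment:
    name: foo-int01

bar-int01:
  extends: .deploy
  environment:
    name: bar-int01
```

If the manual jobs in this smaller pipeline still stay in the created state, that would suggest the problem is in how the upgraded instance processes pipeline state and not in the number of deploy jobs or in the configuration itself.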