Pipeline stuck in running state, some jobs stuck in created state

Hi,
After upgrading our EKS self-hosted GitLab and runners to 16.4.1, we are seeing pipelines get stuck in a running state, and some of the manual jobs are stuck in a created state. In past versions, all manual jobs would end up in a “manual action” state, and the pipeline would complete execution and move to a “passed” state.

I'm just wondering if anyone else has seen this behavior or has suggestions on how to fix it.
I’ve tried several variations and have seen the behavior with several types of pipelines.

Here is a CI configuration I have been testing with:

#
# the variables
#
variables:
  UBUNTU: ubuntu:20.04
  MAVEN: ubuntu:20.04
  #MAVEN: maven:3.6.3-openjdk-11
  CI_DEBUG_TRACE: "true"

#
# the stages
#
stages:
  - build
  - test
  - deploy

#
# the build jobs
#
build:
  stage: build
  image: $MAVEN
  script:
    - echo "Start build"
    - sleep 20
    - echo "END build"
#  rules:
#    - if: $NO_BUILD
#      when: never
#    - when: on_success

#
# the test jobs
#
test:
  stage: test
  image: $MAVEN
  script:
    - echo "Start test"
    - sleep 15
    - echo "END test"
#  rules:
#    - if: $NO_TEST
#      when: never
#    - when: on_success

#
# the deploy stage jobs
#
.deploy:
  stage: deploy
  image: $UBUNTU
  script:
    - echo "Start deploy"
    - sleep 20
    - echo "END deploy"
#  rules:
#    - if: $CI_JOB_NAME =~ /int01/
#      when: manual
#    - if: $LIMIT_DEPLOY
#      when: never
#    - when: manual

foo-int01:
  extends: .deploy
  environment:
    name: foo-int01
  when: manual

foo-int02:
  extends: .deploy
  environment:
    name: foo-int02
  when: manual

foo-int03:
  extends: .deploy
  environment:
    name: foo-int03
  when: manual

foo-int04:
  extends: .deploy
  environment:
    name: foo-int04
  when: manual

foo-int05:
  extends: .deploy
  environment:
    name: foo-int05
  when: manual

foo-int06:
  extends: .deploy
  environment:
    name: foo-int06
  when: manual

bar-int01:
  extends: .deploy
  environment:
    name: bar-int01
  when: manual

bar-int02:
  extends: .deploy
  environment:
    name: bar-int02
  when: manual

bar-int03:
  extends: .deploy
  environment:
    name: bar-int03
  when: manual

bar-int04:
  extends: .deploy
  environment:
    name: bar-int04
  when: manual

bar-int05:
  extends: .deploy
  environment:
    name: bar-int05
  when: manual

And here’s the screenshot from the pipeline execution. Sometimes when watching the execution, one or two of the manual jobs will move from the created state to “manual action” 10-15 seconds after the others that do make it to the “manual action” state.
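
For reference, a job-level `when: manual` (set outside `rules`) defaults to `allow_failure: true`, which is why the pipeline is expected to finish as “passed” without waiting for the manual jobs. Below is a minimal sketch of a variant that states this explicitly, to rule out a difference in defaults while testing. The `.deploy-explicit` and `foo-int01-explicit` names are hypothetical, and this is a diagnostic idea under those assumptions, not a confirmed fix:

```yaml
# Hypothetical variant of .deploy for testing only; not a confirmed fix.
.deploy-explicit:
  stage: deploy
  image: $UBUNTU
  script:
    - echo "Start deploy"
    - sleep 20
    - echo "END deploy"
  when: manual
  # Already the default for a job-level `when: manual`; stated explicitly so
  # the manual jobs are clearly optional and should not block the pipeline.
  allow_failure: true

foo-int01-explicit:
  extends: .deploy-explicit
  environment:
    name: foo-int01
```

If the jobs still sit in the created state with this variant, the default for `allow_failure` is probably not what changed.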

I’ve searched the forums here and found a few similar issues, but they seem to be generally for more complex pipelines. I’m sorry if I’ve missed a previous topic that is relevant.

Hi,

I am experiencing the same issue.

We are on-premises with GitLab Enterprise Edition [v16.3.5-ee] and runners on 16.3.1.

Were you able to find a solution?

Thanks