The 4th stage is starting at the same time as the 2nd stage. How can I fix that?

Below is my pipeline.

# .gitlab-create-admin.yml used for creating new clusters and deploying packages on existing clusters
.generalgrabclustertrigger:
  rules:
    - if: '$TEST_CREATE_ADMIN && $REGION && $ROLE_ARN && $PACKAGEURL && $TEST_CREATE_ADMIN == "aws" && $SUB_PLATFORM == "aws" && $ROLE_ARN != "" && $PACKAGEURL != "" && $REGION != ""'

.ifcreateadmin:  # jd example
  rules:
    - if: '$ADMIN_SERVER_IP && $ADMIN_SERVER_IP != ""'  # If these variables are set, then don't run the job
      when: never
    - !reference [.generalgrabclustertrigger, rules]


.ifconnectadmin:  # jd example
  rules:
    - if: '$ADMIN_SERVER_IP && $ADMIN_SERVER_IP == ""'
      when: never
    - !reference [.generalgrabclustertrigger, rules]


# .ifteardownordestroy:  # Automatic if triggered from gitlab api AND destroy variable is set
#   rules:
#     - !reference [.generalgrabclustertrigger, rules]
#       when: manual  # This will only add the manual to the last (or only) rule in the !reference
#     - if: '$CI_PIPELINE_SOURCE == "triggered" && $Teardownanddestroy'

# Overall, what is happening:
# A GitLab API call triggers create admin and deploy; teardown and destroy are left manual.
# ANOTHER GitLab API call triggers another pipeline, but this time with the destroy variable set.
#    This would only automatically run teardown and destroy.

variables:
 TEST_CREATE_ADMIN:
   #value: aws
   description: "Platform, currently aws only"
 SUB_PLATFORM:
   value: aws
   description: "Platform, currently aws only"
 REGION:
   value: "us-west-2"
   description: "Region where projectn will be deployed"
 PACKAGEURL:
   value: "rpm url"
   description: "URL of the projectn RPM file"
 ACCOUNT_NAME:
   value: "testsubaccount"
   description: "Account name of the sub-account to refer to in the deployment; it does not need to match the name in AWS"
 ROLE_ARN:
   value: "aws arn"
   description: "ROLE ARN of the user account assuming: aws sts get-caller-identity"
 tfenv_version: "1.1.9"
 DEV_PUB_KEY:
   description: "Optional public key file to add access to admin server" 
 ADMIN_SERVER_IP:
   description: "Existing Admin Server IP Address"
 ADMIN_SERVER_SSH_KEY:
   description: "Existing Admin Server SSH_KEY PEM content"
  
# Exporting the variables below will cause Terraform to use the root account instead of the one specified in the tfvars file
.configure_aws_cli: &configure_aws_cli
    - aws configure set region $REGION
    - aws configure set aws_access_key_id $AWS_FULL_STS_ACCESS_KEY_ID
    - aws configure set aws_secret_access_key $AWS_FULL_STS_ACCESS_KEY_SECRET
    - aws sts get-caller-identity
    - aws configure set source_profile default --profile $ACCOUNT_NAME
    - aws configure set role_arn $ROLE_ARN --profile $ACCOUNT_NAME
    - aws sts get-caller-identity --profile $ACCOUNT_NAME
    - aws configure set region $REGION --profile $ACCOUNT_NAME

.copy_remote_log: &copy_remote_log
- if [ -e outfile ]; then rm outfile; fi
- copy_command="$(cat $CI_PROJECT_DIR/scp_command.txt)"
- new_copy_command=${copy_command/"%s"/"outfile"}
- new_copy_command=${new_copy_command/"~"/"/home/ec2-user/outfile"}
- echo $new_copy_command
- new_copy_command=$(echo "$new_copy_command" | sed s'/\([^.]*\.[^ ]*\) \([^ ]*\) \(.*\)/\1 \3 \2/')
- echo $new_copy_command
- sleep 10
- eval $new_copy_command

.check_remote_log: &check_remote_log
- sleep 10
- grep Error outfile || true
- sleep 10
- returnCode=$(grep -c Error outfile) || true
- echo "Return code received $returnCode"
- if [ $returnCode -ge 1 ]; then exit 1; fi
- echo "No errors"

.prepare_ssh_key: &prepare_ssh_key
- echo $ADMIN_SERVER_SSH_KEY > $CI_PROJECT_DIR/ssh_key.pem
- cat ssh_key.pem
- sed -i -e 's/-----BEGIN RSA PRIVATE KEY-----/-bk-/g' ssh_key.pem
- sed -i -e 's/-----END RSA PRIVATE KEY-----/-ek-/g' ssh_key.pem
- perl -p -i -e 's/\s/\n/g' ssh_key.pem
- sed -i -e 's/-bk-/-----BEGIN RSA PRIVATE KEY-----/g' ssh_key.pem
- sed -i -e 's/-ek-/-----END RSA PRIVATE KEY-----/g' ssh_key.pem
- cat ssh_key.pem
- chmod 400 ssh_key.pem

connect-admin-server:
  stage: build
  allow_failure: true
  image:
    name: amazon/aws-cli:latest
    entrypoint: [ "" ]
  extends:
    - .ifconnectadmin
  script:
    - TF_IN_AUTOMATION=true
    - yum update -y
    - yum install git unzip gettext jq -y
    - echo "Your admin server key and info are added as artifacts"
    # Copy the important terraform outputs to files for artifacts to pass into other jobs
    - *prepare_ssh_key
    - echo "ssh -i ssh_key.pem ec2-user@${ADMIN_SERVER_IP}" > $CI_PROJECT_DIR/ssh_command.txt
    - echo "scp -q -i ssh_key.pem %s ec2-user@${ADMIN_SERVER_IP}:~" > $CI_PROJECT_DIR/scp_command.txt
    - test_pre_command="$(cat "$CI_PROJECT_DIR/ssh_command.txt") -o StrictHostKeyChecking=no"
    - echo $test_pre_command
    - test_command="$(echo $test_pre_command | sed -r 's/(ssh )(.*)/\1-tt \2/')"
    - echo $test_command
    - echo "sudo yum install -yq $PACKAGEURL 2>&1 | tee outfile ; exit 0" | $test_command
    - *copy_remote_log
    - echo "Now checking log file for returnCode"
    - *check_remote_log
  artifacts:
    untracked: true
    when: always
    paths:
      - "$CI_PROJECT_DIR/ssh_key.pem"
      - "$CI_PROJECT_DIR/ssh_command.txt"
      - "$CI_PROJECT_DIR/scp_command.txt"
  after_script:
    - cat $CI_PROJECT_DIR/ssh_key.pem
    - cat $CI_PROJECT_DIR/ssh_command.txt
    - cat $CI_PROJECT_DIR/scp_command.txt
    #testing
    - echo $CI_PIPELINE_ID
    - echo $CI_PIPELINE_IID
    - echo $CI_PIPELINE_SOURCE
    - echo $CI_PROJECT_ID
    - echo $CI_PROJECT_NAME
    - echo $CI_PROJECT_TITLE
    - echo $CI_PROJECT_URL
    - echo $CI_JOB_ID
    - echo $CI_JOB_NAME
    - echo $CI_JOB_STAGE
    - echo $CI_JOB_STATUS
    - echo $CI_JOB_TOKEN
    - echo $CI_JOB_URL
    - echo $CI_JOB_STARTED_AT

create-admin-server:
  stage: build
  allow_failure: false
  image:
    name: amazon/aws-cli:latest
    entrypoint: [ "" ]
  extends:
    - .ifcreateadmin
  script:
    - echo "admin server $ADMIN_SERVER_IP"
    - TF_IN_AUTOMATION=true
    - yum update -y
    - yum install git unzip gettext jq -y
    - *configure_aws_cli
    - aws sts get-caller-identity --profile $ACCOUNT_NAME #to check whether updated correctly or not
    - git clone "https://project-n-setup:$(echo $PERSONAL_GITLAB_TOKEN)@gitlab.com/projectn-oss/project-n-setup.git"
    # Install tfenv
    - git clone https://github.com/tfutils/tfenv.git ~/.tfenv
    - ln -s ~/.tfenv /root/.tfenv
    - ln -s ~/.tfenv/bin/* /usr/local/bin
    # Install terraform 1.1.9 through tfenv
    - tfenv install $tfenv_version
    - tfenv use $tfenv_version
    # Copy the tfvars temp file to the terraform setup directory
    - cp .gitlab/admin_server.temp_tfvars project-n-setup/$SUB_PLATFORM/
    - cd project-n-setup/$SUB_PLATFORM/
    - envsubst < admin_server.temp_tfvars > admin_server.tfvars
    - rm -rf .terraform || exit 0
    - cat ~/.aws/config
    - terraform init -input=false
    - terraform apply -var-file=admin_server.tfvars -input=false -auto-approve
    - echo "Your admin server key and info are added as artifacts"
    # Copy the important terraform outputs to files for artifacts to pass into other jobs
    - terraform output -raw ssh_key > $CI_PROJECT_DIR/ssh_key.pem
    - terraform output -raw ssh_command > $CI_PROJECT_DIR/ssh_command.txt
    - terraform output -raw scp_command > $CI_PROJECT_DIR/scp_command.txt
    - cp $CI_PROJECT_DIR/project-n-setup/$SUB_PLATFORM/terraform.tfstate $CI_PROJECT_DIR
    - cp $CI_PROJECT_DIR/project-n-setup/$SUB_PLATFORM/admin_server.tfvars $CI_PROJECT_DIR
  artifacts:
    untracked: true
    paths:
      - "$CI_PROJECT_DIR/ssh_key.pem"
      - "$CI_PROJECT_DIR/ssh_command.txt"
      - "$CI_PROJECT_DIR/scp_command.txt"
      - "$CI_PROJECT_DIR/terraform.tfstate"
      - "$CI_PROJECT_DIR/admin_server.tfvars"
  after_script:
    - echo $CI_PIPELINE_ID
    - echo $CI_PIPELINE_IID
    - echo $CI_PIPELINE_SOURCE
    - echo $CI_PROJECT_ID
    - echo $CI_PROJECT_NAME
    - echo $CI_PROJECT_TITLE
    - echo $CI_PROJECT_URL
    - echo $CI_JOB_ID
    - echo $CI_JOB_NAME
    - echo $CI_JOB_STAGE
    - echo $CI_JOB_STATUS
    - echo $CI_JOB_TOKEN
    - echo $CI_JOB_URL
    - echo $CI_JOB_STARTED_AT

postconfig-adminserver:
  stage: build_postconfig
  allow_failure: true
  rules:
    - if: '$DEV_PUB_KEY && $DEV_PUB_KEY == ""'
      when: never
    - !reference [.generalgrabclustertrigger, rules]
  image:
    name: ubuntu:latest
  script:   
    - echo "Public key not empty"
    - apt update -yq
    - apt install -yq openssh-client
    - mkdir ~/.ssh
    - echo $DEV_PUB_KEY >> ~/.ssh/dev_pub_key.pub
    - server_name=`awk 'END {print $NF}' $CI_PROJECT_DIR/ssh_command.txt`
    - cp $CI_PROJECT_DIR/ssh_key.pem ~/ssh_key.pem
    - chmod 400 ~/ssh_key.pem
    - ssh-copy-id -f -i ~/.ssh/dev_pub_key.pub -o StrictHostKeyChecking=no -o "IdentityFile ~/ssh_key.pem" $server_name
  after_script:
    - echo $CI_PIPELINE_ID
    - echo $CI_PIPELINE_IID
    - echo $CI_PIPELINE_SOURCE
    - echo $CI_PROJECT_ID
    - echo $CI_PROJECT_NAME
    - echo $CI_PROJECT_TITLE
    - echo $CI_PROJECT_URL
    - echo $CI_JOB_ID
    - echo $CI_JOB_NAME
    - echo $CI_JOB_STAGE
    - echo $CI_JOB_STATUS
    - echo $CI_JOB_TOKEN
    - echo $CI_JOB_URL
    - echo $CI_JOB_STARTED_AT

deployment:
  stage: test
  #when: manual
  allow_failure: true
  extends:
    - .generalgrabclustertrigger
  image:
    name: alpine:3.16  # May need to change this image
    # only really needs to copy files and ssh
    entrypoint: [ "" ]
  script:
    - apk add --no-cache openssh
    - chmod -R 777 $CI_PROJECT_DIR
    - test_pre_command="$(cat "$CI_PROJECT_DIR/ssh_command.txt") -o StrictHostKeyChecking=no"
    - test_command="$(echo $test_pre_command | sed -r 's/(ssh )(.*)/\1-tt \2/')"
    - login_command="$test_command</dev/tty"
    - redirect_command="${test_command}"
    - copy_command="$(cat $CI_PROJECT_DIR/scp_command.txt)"
    - rm -rf ~/.ssh/known_hosts
    - sleep 40
    - chmod 400 *.pem
    - echo "mkdir -p /home/ec2-user/.project-n/aws/default/infrastructure; exit 0" | $test_command
    - uat_secret="{\"default_platform\":\"aws\",\"uat_signing_secret\":\"$UAT_SIGNING_SECRET\"}"
    - echo $uat_secret > config
    - new_copy_command=${copy_command/"%s"/"config"}
    - new_copy_command=${new_copy_command/"~"/"/home/ec2-user/.project-n/config"}
    #copies the projectn config file to the remote machine
    - eval $new_copy_command
    - |+
      until eval "$test_command 'command -v projectn >/dev/null'" 2>log; do
        echo "$(cat log)"
        echo "Waiting for the Project N package to finish installing..."
        sleep 10
      done
    - |+
      # Prepare to deploy
      match='export PATH=~/.local/bin:\$PATH'
      until eval "$test_command 'grep -q '\''$match'\'' /home/ec2-user/.bash_profile'" 2> log
      do
        echo "$(cat log)"
        echo "Waiting for the AWS CLI to finish updating..."
        sleep 10
      done
    #copies the projectn config file to the remote machine
    - eval $new_copy_command
    - echo "projectn deploy --auto-approve 2>&1 | tee outfile ; exit 0" | $test_command
    - *copy_remote_log
    - echo "Now checking log file for returnCode"
    - *check_remote_log
  after_script:
    - returnCode=$(grep -c Error outfile) || true
    - echo "Return code received $returnCode"
    - echo "Pipeline ID $CI_PIPELINE_ID"

teardown:
  stage: teardown
  allow_failure: true
  when: manual
  rules:
    - !reference [.generalgrabclustertrigger, rules]
      #when: manual  # This will only add the manual to the last (or only) rule in the !reference
    - if: '$CI_PIPELINE_SOURCE == "triggered" && $Teardownanddestroy'
      when: always
  #extends:
  #  - .generalgrabclustertrigger #.ifteardownordestroy
  image:
    name: alpine:3.16  # May need to change this image
    # only really needs to copy files and ssh
    entrypoint: [ "" ]
  script:
    - apk add --no-cache openssh
    - chmod -R 777 $CI_PROJECT_DIR
    - test_pre_command="$(cat "$CI_PROJECT_DIR/ssh_command.txt") -o StrictHostKeyChecking=no"
    - test_command="$(echo $test_pre_command | sed -r 's/(ssh )(.*)/\1-tt \2/')"
    - login_command="$test_command</dev/tty"
    - redirect_command="${test_command}"
    - rm -rf ~/.ssh/known_hosts
    - chmod 400 *.pem
    - echo "projectn ls; exit 0" | $test_command
    - |+
      until eval "$test_command 'command -v projectn >/dev/null'" 2>log; do
        echo "$(cat log)"
        echo "Waiting for the Project N package to finish installing..."
        sleep 10
      done
    - |+
      # Prepare to deploy
      match='export PATH=~/.local/bin:\$PATH'
      until eval "$test_command 'grep -q '\''$match'\'' /home/ec2-user/.bash_profile'" 2> log
      do
        echo "$(cat log)"
        echo "Waiting for the AWS CLI to finish updating..."
        sleep 10
      done
    - echo "projectn teardown --auto-approve 2>&1 | tee outfile ; exit 0" | $test_command
    - *copy_remote_log
    - echo "Now checking log file for returnCode"
    - *check_remote_log

destroy-admin-server:
  stage: cleanup
  needs: ["create-admin-server"] #,"teardown"]
  when: manual
  rules:
    - !reference [.ifcreateadmin, rules]
      #when: manual  # This will only add the manual to the last (or only) rule in the !reference
    - if: '$CI_PIPELINE_SOURCE == "triggered" && $Teardownanddestroy'
      when: always
  # extends:
  #   - .generalgrabclustertrigger #.ifteardownordestroy
  allow_failure: true
  interruptible: false
  dependencies: # This is what gets the artifacts from the previous job
    - create-admin-server
  image:
    name: amazon/aws-cli:latest
    entrypoint: [ "" ]
  script:
    - TF_IN_AUTOMATION=true
    - yum update -y
    - yum install git unzip gettext -y
    - *configure_aws_cli
    - aws sts get-caller-identity
    - git clone "https://project-n-setup:$(echo $PERSONAL_GITLAB_TOKEN)@gitlab.com/projectn-oss/project-n-setup.git"
    # Install tfenv
    - git clone https://github.com/tfutils/tfenv.git ~/.tfenv
    - ln -s ~/.tfenv /root/.tfenv
    - ln -s ~/.tfenv/bin/* /usr/local/bin
    # Install terraform 1.1.9 through tfenv
    - tfenv install $tfenv_version
    - tfenv use $tfenv_version
    # Substitute in all the environment variables into the temp file, creating the main var file.
    # Copy state and var file from create-admin-server to terraform directory
    - cp $CI_PROJECT_DIR/terraform.tfstate $CI_PROJECT_DIR/project-n-setup/$SUB_PLATFORM
    - cp $CI_PROJECT_DIR/admin_server.tfvars $CI_PROJECT_DIR/project-n-setup/$SUB_PLATFORM
    - cd $CI_PROJECT_DIR/project-n-setup/$SUB_PLATFORM
    - terraform init -input=false
    - terraform destroy -var-file=admin_server.tfvars -auto-approve

Jobs in the test and cleanup stages are starting at the same time. How can I fix that?
Also, cleanup and teardown should be manual when the pipeline is triggered from the web UI and should run automatically when the pipeline is triggered through an API call, but that is not working either.
Please suggest a fix.
I also want to modularize the rules, so I tried the !reference tags above, but it seems messed up; I need your suggestions on this.

I would expect you to have a stages declaration at the top of the file which would list your stages in the order in which you want them to run. This should fix your out-of-order problem.
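
For example, a sketch only: the stage names below are inferred from the jobs in the posted pipeline, so adjust the list to match your actual stages.

# Sketch: stage names taken from the jobs in the posted pipeline.
# Stages run in the order listed; jobs within the same stage run in parallel.
stages:
  - build
  - build_postconfig
  - test
  - teardown
  - cleanup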

For the web / api issue, you need some rules that use CI_PIPELINE_SOURCE:

rules:
  - if: '$CI_PIPELINE_SOURCE == "api"'
    when: on_success
  - if: '$CI_PIPELINE_SOURCE == "web"'
    when: manual
  - when: never
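
Rules are evaluated top to bottom and the first match wins, so keep these source checks ahead of (or combined with) any rule that has no when and therefore defaults to when: on_success. If you want to keep them modular like the other hidden jobs in your file, they can live in their own hidden job. A sketch only; the name .web_or_api_rules is made up for this example:

# Hypothetical hidden job (the name .web_or_api_rules is an example, not from the original pipeline).
.web_or_api_rules:
  rules:
    - if: '$CI_PIPELINE_SOURCE == "api"'
      when: on_success
    - if: '$CI_PIPELINE_SOURCE == "web"'
      when: manual
    - when: never

A job can then splice these in with - !reference [.web_or_api_rules, rules] as an entry in its own rules list.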

The stages declaration is already specified in the main .gitlab-ci.yml, where this yml file is included.

Where should I set this default set of rules as well?

!reference [.generalgrabclustertrigger, rules]

If you look at the rule below:


  rules:
    - !reference [.ifcreateadmin, rules] && '$CI_PIPELINE_SOURCE == "triggered" && $Teardownanddestroy'
      #when: manual  # This will only add the manual to the last (or only) rule in the !reference
    - if: 
      when: on_success
    - if: '$CI_PIPELINE_SOURCE == "web"'
      when: manual
    - when: never

And the anchor rule:

.generalgrabclustertrigger:
  rules:
    - if: '$TEST_CREATE_ADMIN && $REGION && $ROLE_ARN && $PACKAGEURL && $TEST_CREATE_ADMIN == "aws" && $SUB_PLATFORM == "aws" && $ROLE_ARN != "" && $PACKAGEURL != "" && $REGION != ""'

.ifcreateadmin:  # jd example
  rules:
    - if: '$ADMIN_SERVER_IP && $ADMIN_SERVER_IP != ""'  # If these variables are set, then don't run the job
      when: never
    - !reference [.generalgrabclustertrigger, rules]

It is not working as expected. I thought the anchor rule would merge with the direct rules, but that is not happening.

Even this does not work:


.ifteardownordestroy:  # Automatic if triggered from gitlab api AND destroy variable is set
  rules:
    # - !reference [.generalgrabclustertrigger, rules]
    #   when: manual  # This will only add the manual to the last (or only) rule in the !reference
    # - if: '$CI_PIPELINE_SOURCE == "triggered" && $Teardownanddestroy'
    - !reference [.ifcreateadmin, rules]
    - if: '$CI_PIPELINE_SOURCE == "triggered" && $Teardownanddestroy'
      when: on_success
    - if: '$CI_PIPELINE_SOURCE == "web"'
      when: manual
    - when: never
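
I expected the !reference above to simply splice the referenced list items in place, so in my understanding the rules should flatten to something like this (my own sketch, expanded from the definitions above), but it does not behave the way I expected:

# My own expansion of .ifteardownordestroy after resolving the nested !reference tags (sketch).
.ifteardownordestroy:
  rules:
    - if: '$ADMIN_SERVER_IP && $ADMIN_SERVER_IP != ""'
      when: never
    - if: '$TEST_CREATE_ADMIN && $REGION && $ROLE_ARN && $PACKAGEURL && $TEST_CREATE_ADMIN == "aws" && $SUB_PLATFORM == "aws" && $ROLE_ARN != "" && $PACKAGEURL != "" && $REGION != ""'
    - if: '$CI_PIPELINE_SOURCE == "triggered" && $Teardownanddestroy'
      when: on_success
    - if: '$CI_PIPELINE_SOURCE == "web"'
      when: manual
    - when: never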