Cluster Management Project - This job failed because the necessary resources were not successfully created

Context

  • Using GitLab.com
  • My ultimate goal is to apply custom configuration, via a Helm values file, to the GitLab managed Ingress application. If anything below is not the best way to achieve that, please do say.
  • AFAICT the only way to do this is to use a Cluster Management Project (CMP), as it has the admin-level permissions required to administer GitLab managed apps.
  • Following the instructions here, adding a CMP works fine, but when I try to deploy the application from the Cluster Application Project (CAP, configured to use Auto DevOps), the pipeline fails (or seemingly doesn’t even start) with the error “This job failed because the necessary resources were not successfully created”.
  • The troubleshooting link on the pipeline error page suggests that namespace or service account creation failed, but I don’t know of any way to debug this (no logs are shown and there are no k8s events on the cluster).
  • The troubleshooting page links to instructions on how to view the logs, but these only look relevant to self-hosted GitLab instances rather than GitLab.com, as they mention file paths on the server instance.
  • Connecting to the cluster directly, I can see a project namespace and several other resources have been created, including roles, rolebindings, service accounts and token secrets. I do not know what resources might be missing though.
  • The application hasn’t been installed due to the pipeline failure and reruns do not make any difference.
  • Essentially, I cannot seem to have two projects (CMP and CAP) pointing at the same cluster: the second project’s pipeline always fails and there is no clear way to debug it.
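
For anyone wanting to compare what exists on the cluster after each pipeline run, the manual check described above can be sketched roughly as follows. This is only a sketch: the function name inspect_namespace and the KUBECTL override variable are made up for illustration, the namespace has to be supplied by hand because its name depends on the project, and the list of resource kinds is simply the set mentioned above, not a confirmed list of what GitLab creates.

```shell
#!/bin/sh
# Sketch only: lists the kinds of resources mentioned in the post for one namespace.
# KUBECTL is a hypothetical override (e.g. KUBECTL=echo) to preview the
# commands without touching a cluster.
KUBECTL="${KUBECTL:-kubectl}"

inspect_namespace() {
  # $1 is the project namespace; its name is specific to each project.
  "$KUBECTL" get serviceaccounts,roles,rolebindings,secrets --namespace "$1"
  # Events in the same namespace, sorted by last-seen time, in case a
  # creation step recorded a failure.
  "$KUBECTL" get events --namespace "$1" --sort-by=.lastTimestamp
}

# Run only when a namespace is passed as the first argument.
if [ "$#" -gt 0 ]; then
  inspect_namespace "$1"
fi
```

Saved under a name such as inspect.sh (hypothetical), it could be run once per project namespace, e.g. `sh inspect.sh <namespace>`, and the two listings compared to see which resources exist for one project but not the other.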

Steps to reproduce

The problem only occurs on the second project’s pipeline run. That is, if I run the CMP pipeline first, the CAP Auto DevOps pipeline fails; if I run the CAP Auto DevOps pipeline first, the CMP pipeline fails.

By running the CMP first

  1. Set up a CMP following the instructions linked above
  2. Set up a CAP configured with Auto DevOps
  3. Run the pipeline on the CMP to install the GL managed apps on the cluster
  4. Run the pipeline on the CAP to attempt to install a custom application: FAILS

By running the CAP first

  1. Set up a CMP following the instructions linked above
  2. Set up a CAP configured with Auto DevOps
  3. Run the pipeline on the CAP to install a custom application
  4. Run the pipeline on the CMP to attempt to install the GL managed apps on the cluster: FAILS

Configuration

CMP

include:
   - template: Managed-Cluster-Applications.gitlab-ci.yml

.gitlab/managed-apps/config.yaml

---
certManager:
  installed: true
  letsEncryptClusterIssuer:
    installed: true
    email: 'name@domain.com' # dummy email
cilium:
  installed: false
crossplane:
  installed: false
elasticStack:
  installed: false
gitlabRunner:
  installed: true
ingress:
  installed: true
jupyterhub:
  installed: false
prometheus:
  installed: false
sentry:
  installed: false
vault:
  installed: false

.gitlab/managed-apps/ingress/values.yaml

controller:
  replicaCount: 1
  config:
    use-forwarded-headers: 'true'
    compute-full-forwarded-for: 'true'
    enable-real-ip: 'true'

CAP

Things tried so far

  • Swapped the order of the pipeline runs. The first works fine but the second always fails.
  • Turned off “GitLab managed” in the CMP, as this would otherwise create a deployment namespace just like the CAP does.
  • Set the Cluster Management Project setting in the CAP before running the CMP pipeline. This is so the CMP knows it is a CMP and should only be managing a cluster for the CAP.
  • Tore down the cluster and recreated it from scratch (I use Pulumi, so this is easy) in case any orphaned configuration was getting in the way.

Questions

  1. How can I debug this kind of failure, i.e. one where the pipeline fails before it even logs anything?
  2. Seeing as I get this error only on the second project’s pipeline, could there be a resource creation clash going on here? If so, is there anything I can do to prevent this clash?
  3. Is this the only/best approach to passing custom configuration values to a GL managed Ingress application?
  4. Is the Cluster Management Project considered stable enough for production use? I am guessing not, since it is labelled as “alpha”. If it is not, does this mean that anyone who wants to deviate from the GL-prescribed default configs cannot use GL managed apps?
  5. I’ve noticed notes like this in the docs: “GitLab Managed Apps with one-click installations have been deprecated, and are scheduled for removal in GitLab 14.0.” Does “one-click installs” include CMP installs, or only installs through the UI?
  6. The GL managed apps use Helm 2, which has been deprecated for several months, and your docs suggest GitLab does not offer a way to migrate existing application management on existing clusters from Helm 2 to Helm 3. Does this suggest it is best to roll our own Helm installs for our production applications?

Let me know if any further detail is needed. If anyone can help or make suggestions I’d be extremely grateful.

Many thanks, Nick