I managed to fix this for my own application, so I’m going to post the problems I encountered in case anyone else has similar issues.
Compiled Language Issues
I was trying to deploy an application written in Elixir, which is a compiled language. Some of my environment variables were being read at build time and compiled into static code, which meant my runtime environment variables were effectively ignored. I had to figure out how to keep the build phase from baking in build-time values where I needed runtime values.
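To make the difference concrete, here is a rough analogy in Python rather than Elixir (so it is only an illustration of the idea, not my actual fix). The variable name API_HOST and both host names are made up for the example.

```python
import os

# Simulate the environment of the build machine (hypothetical values).
os.environ["API_HOST"] = "build.example.internal"

# "Baked" value: read once, when this line runs. This plays the role of a
# value that gets read during compilation and frozen into the artifact.
BAKED_API_HOST = os.environ["API_HOST"]


def baked_host() -> str:
    # Always returns whatever the environment held when the value was frozen.
    return BAKED_API_HOST


def runtime_host() -> str:
    # Looks the value up every time it is called, so it sees the
    # environment the application is actually running in.
    return os.environ["API_HOST"]


# Simulate the container starting later with a different environment.
os.environ["API_HOST"] = "prod.example.internal"

print("baked:  ", baked_host())
print("runtime:", runtime_host())
```

The baked function keeps reporting the build machine’s value no matter what the deployment sets, which is the behavior I was seeing.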
Because GitLab pushes the container image (if a Dockerfile is present in your repo) to its own container registry service, you can log in to your GitLab registry and pull down the final image to inspect on your own machine. This allowed me to inspect the artifacts I intended to produce, execute the associated binaries, and ultimately determine that environment variables weren’t being respected. Other people could hit different problems (maybe some assets aren’t getting copied to the final image), so I would highly suggest pulling down the image from the GitLab container registry and inspecting it for problems.
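As a sketch of that workflow, the helper below just assembles the Docker commands I mean. The image reference is hypothetical (copy the real one from your project’s Container Registry page), and it assumes the image contains a shell at sh, which a minimal image may not.

```python
from typing import List


def inspect_commands(image: str) -> List[List[str]]:
    """Build the docker commands for pulling an image and opening a shell in it."""
    # The registry host is everything before the first slash of the reference.
    registry = image.split("/", 1)[0]
    return [
        ["docker", "login", registry],
        ["docker", "pull", image],
        # Override the entrypoint so the application does not start;
        # assumes the image ships a shell named "sh".
        ["docker", "run", "--rm", "-it", "--entrypoint", "sh", image],
    ]


# Hypothetical image reference, for illustration only.
IMAGE = "registry.gitlab.com/my-group/my-app/master:0123abcd"

for argv in inspect_commands(IMAGE):
    print(" ".join(argv))
```

Once inside the container you can look at the release directory, run the binaries by hand, and check which environment variables they actually read.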
CI vs ENV Variables
The variables in the Variables section under the CI/CD Settings are environment variables for the CI/CD jobs, not for your application. I made this mistake, but the documentation was pretty explicit on the fix: just prefix your variable names with K8S_SECRET_. The Auto Deploy phase will find all variables matching the K8S_SECRET_* pattern, remove the prefix, and build a Kubernetes secret on the fly. However, I’ve found there is at least one environment variable that your application might need that doesn’t follow that pattern: DATABASE_URL.
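The prefix handling can be pictured roughly like this. It is my own sketch of the behavior described above, not GitLab’s actual implementation, and the variable names in the example are made up.

```python
from typing import Dict, Mapping

PREFIX = "K8S_SECRET_"


def application_secrets(ci_variables: Mapping[str, str]) -> Dict[str, str]:
    """Keep only the K8S_SECRET_* variables and strip the prefix from their names."""
    return {
        name[len(PREFIX):]: value
        for name, value in ci_variables.items()
        # Skip a variable that is nothing but the prefix itself.
        if name.startswith(PREFIX) and len(name) > len(PREFIX)
    }


# Hypothetical CI/CD variables, for illustration only.
ci_variables = {
    "K8S_SECRET_SECRET_KEY_BASE": "abc123",
    "K8S_SECRET_API_TOKEN": "t0ken",
    "SOME_CI_ONLY_SETTING": "true",
}

print(application_secrets(ci_variables))
```

Anything without the prefix stays in the CI/CD jobs and never reaches the application container.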
Because the Auto Deploy phase can also deploy a Postgres database (which it does by default unless you specifically disable it), the DATABASE_URL variable is a computed value which overrides any K8S_SECRET_DATABASE_URL. However, it will not compute the value if DATABASE_URL is already present as a CI/CD variable, so just set DATABASE_URL (no prefix on this one) and it will be passed down to the application deployment. I figured this out through trial and error: my container logs showed constant failures connecting to the database, and they always showed the wrong connection host regardless of which K8S secret I tried to set.
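The precedence I observed boils down to something like the following. This is a sketch of the behavior as I experienced it, not GitLab’s code; the function name and the URLs are hypothetical.

```python
from typing import Mapping


def effective_database_url(ci_variables: Mapping[str, str], computed_url: str) -> str:
    """Return the DATABASE_URL the application ends up with."""
    # An unprefixed DATABASE_URL set as a CI/CD variable wins.
    explicit = ci_variables.get("DATABASE_URL")
    if explicit:
        return explicit
    # Otherwise the computed value is used, and it also replaces
    # anything supplied as K8S_SECRET_DATABASE_URL.
    return computed_url


# Hypothetical values, for illustration only.
computed = "postgres://user:password@my-app-postgres:5432/my_app"

print(effective_database_url({"K8S_SECRET_DATABASE_URL": "postgres://ignored"}, computed))
print(effective_database_url({"DATABASE_URL": "postgres://user:pw@db.example.com:5432/app"}, computed))
```

The first call shows the trap I fell into: the prefixed variable has no effect, so the application keeps pointing at the computed host.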
Google Cloud SQL Proxy
GitLab can manage a GKE cluster for your project if you let it, and it’s actually quite proficient at doing so. You might think, like I did, to use other Google Cloud services, like Cloud SQL. If you do, keep in mind: you need to run a special Cloud SQL Proxy container as a sidecar. Even with the appropriate service account credentials in the environment variables, I could not connect to Cloud SQL without the proxy running as a sidecar (meaning there’s one proxy container running in each pod your application is deployed to).
Unfortunately, there’s no great way to define sidecar services in the Auto Deploy phase (to my knowledge), so I bailed on Cloud SQL and let Auto Deploy manage my database deployment. Then I removed my custom DATABASE_URL variable and let GitLab compute the appropriate value. My application could finally connect to the database.
Health Checks Against Root Path
This wasn’t an issue for me personally, but it’s worth knowing that the deployment (by default) attempts to run a health check by accessing the root path of the web application (GET /). It expects a “success” response, which for a Kubernetes HTTP probe means any status code from 200 through 399. If you respond with anything else, or if you take too long to respond, your application will be assumed unhealthy, which will ultimately affect the deployment. I know of no way to change this right now, which is unfortunate because I would like to set up a separate endpoint that just handles health check pings. I would also like to change the frequency of the health check.
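For anyone whose root path doesn’t already return a page, this is roughly the minimum an application has to do to pass that check, sketched with Python’s standard library. The port 5000 default is an assumption on my part about what the deployment expects; use whatever port your deployment is configured for.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple


def health_response(path: str) -> Tuple[int, bytes]:
    """Decide the status code and body for a GET request."""
    if path == "/":
        # A fast 200 on the root path is what the default health check looks for.
        return 200, b"ok"
    return 404, b"not found"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        status, body = health_response(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 5000) -> None:
    # Port 5000 is an assumption; match it to your deployment's settings.
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

Calling serve() starts the server. The important part is that GET / does no slow work (no database queries, no rendering) before answering.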
Recap
- My build phase was constructing an incorrect application binary, but I was able to pull down the container image produced by the build phase and inspect it for issues.
- If you have environment variables required by your application, prefix them with K8S_SECRET_ in the Variables section of your CI/CD Settings. The only exception is DATABASE_URL, which should not use any special prefix.
- If you plan to use an external database, such as Cloud SQL, ensure that you can reach the instance without any special “extras” that aren’t simple to set up in the Auto Deploy flow (e.g. Cloud SQL Proxy as a sidecar).
- Health checks default (and can’t be changed, to my knowledge) to GET /. I can’t seem to find the defined timeout, but ultimately you need to ensure it responds “quickly” and with a non-error code (2xx or 3xx).
Once I fixed those four problems, my deploys began to succeed. The build phase (trying to fix compilation issues) was the most difficult to solve on my end, but diagnosing the DATABASE_URL issue probably took the most time, since overriding their computed value is not documented anywhere that I could find.
Hope this helps.