Postgres validation error when upgrading from v17.5.2 to v17.6

Steps to reproduce

We have a script that automatically pulls new release images once they become available. This morning it pulled the new image for v17.6 (docker pull gitlab/gitlab-ce:17.6.0-ce.0). When starting the container with the new image, the following error occurred (from gitlab-rails-db-migrate.log):

main: == 20241015082359 AddContainerRepositoryStatesProjectIdFk: migrating ==========
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: -- execute("ALTER TABLE container_repository_states ADD CONSTRAINT fk_6591698505 FOREIGN KEY (project_id) REFERENCES projects (id) ON DELETE CASCADE NOT VALID;")
main:    -> 0.0016s
main: -- execute("SET statement_timeout TO 0")
main:    -> 0.0002s
main: -- execute("ALTER TABLE container_repository_states VALIDATE CONSTRAINT fk_6591698505;")
main:    -> 0.0145s
main: -- execute("RESET statement_timeout")rake aborted!
StandardError: An error has occurred, all later migrations canceled:

PG::CheckViolation: ERROR:  check constraint "check_83faf1f5e7" of relation "packages_dependencies" is violated by some row

[...]

Tasks: TOP => db:migrate
(See full trace by running task with --trace)
Running db:migrate rake task
main: == [advisory_lock_connection] object_id: 48340, pg_backend_pid: 372
main: == 20241016072342 AddNotNullConstraintToPackagesDependenciesProjectId: migrating 
main: -- current_schema(nil)
main:    -> 0.0011s
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: -- execute("SET statement_timeout TO 0")
main:    -> 0.0003s
main: -- execute("ALTER TABLE packages_dependencies VALIDATE CONSTRAINT check_83faf1f5e7;")
main: -- execute("RESET statement_timeout")
main:    -> 0.0002s
main: == [advisory_lock_connection] object_id: 48340, pg_backend_pid: 372

I’ve already scanned the community forum but did not find any matching post describing this problem. Being a bit under pressure, I decided to simply start GitLab with v17.5.2 again. Since this succeeded, I assumed everything was fine. However, a new issue arose:

Whenever I attempt to trigger a new pipeline run (using an arbitrary project with a .gitlab-ci.yml file available), it results in the following error:

gitlab  | ==> /var/log/gitlab/postgresql/current <==
gitlab  | 2024-11-22_14:49:29.96460 ERROR:  new row for relation "ci_pipelines_config_100" violates check constraint "check_b2a19dd79a"
gitlab  | 2024-11-22_14:49:29.96463 DETAIL:  Failing row contains (47941, 100, ---
gitlab  | 2024-11-22_14:49:29.96464     include:
gitlab  | 2024-11-22_14:49:29.96464     - local: ".gitlab-ci.yml"
gitlab  | 2024-11-22_14:49:29.96464     , null).
gitlab  | 2024-11-22_14:49:29.96464 STATEMENT:  /*application:web,correlation_id:01JDA5JXHWQ70ZW09W9SGBS4FA,endpoint_id:Projects::PipelinesController#create,db_config_database:gitlabhq_production,db_config_name:ci*/ INSERT INTO "p_ci_pipelines_config" ("pipeline_id", "partition_id", "content") VALUES (47941, 100, '---
gitlab  | 2024-11-22_14:49:29.96464     include:
gitlab  | 2024-11-22_14:49:29.96465     - local: ".gitlab-ci.yml"
gitlab  | 2024-11-22_14:49:29.96465     ') RETURNING "pipeline_id"

Note that I had never seen such errors before the most recent upgrade attempt, so I doubt it is an issue with the DB itself.

Questions

How can I find out what check_83faf1f5e7 actually does?
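
My guess is that the definition can be looked up directly in Postgres (e.g. via gitlab-rails dbconsole), though I’m not sure whether that is the recommended approach; something along these lines:

-- Look up the definitions of the two check constraints from the errors above.
SELECT conrelid::regclass AS table_name,
       conname,
       pg_get_constraintdef(oid) AS definition
FROM   pg_constraint
WHERE  conname IN ('check_83faf1f5e7', 'check_b2a19dd79a');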

Any clue what could help me to further debug these issues?
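
Based on the migration name (AddNotNullConstraintToPackagesDependenciesProjectId), my working assumption is that check_83faf1f5e7 requires packages_dependencies.project_id to be non-NULL, so I would expect a query along these lines to reveal the offending rows (the column name is my assumption):

-- Assumption: the failing check enforces packages_dependencies.project_id IS NOT NULL.
SELECT count(*) AS violating_rows
FROM   packages_dependencies
WHERE  project_id IS NULL;

-- A small sample of those rows for inspection.
SELECT *
FROM   packages_dependencies
WHERE  project_id IS NULL
LIMIT  10;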

Configuration

I’m running gitlab-ce in Docker using Docker Compose v2. The host is a VM running Ubuntu 22.04.5 LTS.

Versions

  • Self-managed, GitLab v17.6
  • gitlab-runner v17.5

Gitlab system information

System information
System:		
Current User:	git
Using RVM:	no
Ruby Version:	3.2.5
Gem Version:	3.5.17
Bundler Version:2.5.11
Rake Version:	13.0.6
Redis Version:	7.0.15
Sidekiq Version:7.2.4
Go Version:	unknown

GitLab information
Version:	17.5.2
Revision:	cebb958cb73
Directory:	/opt/gitlab/embedded/service/gitlab-rails
DB Adapter:	PostgreSQL
DB Version:	14.11
URL:		https://xxx
HTTP Clone URL:	https://xxx/some-group/some-project.git
SSH Clone URL:	git@xxx:some-group/some-project.git
Using LDAP:	yes
Using Omniauth:	yes
Omniauth Providers: 

GitLab Shell
Version:	14.39.0
Repository storages:
- default: 	unix:/var/opt/gitlab/gitaly/gitaly.socket
GitLab Shell path:		/opt/gitlab/embedded/service/gitlab-shell

Gitaly
- default Address: 	unix:/var/opt/gitlab/gitaly/gitaly.socket
- default Version: 	17.5.2
- default Git Version: 	2.46.2

Exact same issue on bare-metal Ubuntu 20.04.6 LTS when upgrading from v17.5.2 to v17.6.0.

I retried after all background migration jobs were finished, and it worked.
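
For anyone who wants to verify this before retrying, I believe the still-pending batched background migrations can be listed with a query along these lines (the status values are my assumption, based on the GitLab upgrade docs, where 3 = finished and 6 = finalized):

-- List batched background migrations that are not yet finished or finalized
-- (assumption: status 3 = finished, 6 = finalized).
SELECT job_class_name, table_name, column_name, job_arguments
FROM   batched_background_migrations
WHERE  status NOT IN (3, 6);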

Is there any result?
Actually, I’m using the Docker image and updated yesterday.
I have the same problem.

Actually, I downgraded, and now pipelines are not working:

ERROR: new row for relation "ci_pipelines_config_100" violates check constraint "check_b2a19dd79a"

I’m somewhat relieved to hear that I’m not the only one facing this issue. I’ve created a bug report here: https://gitlab.com/gitlab-org/gitlab/-/issues/505972

The Upgrade Path says you have to upgrade to 17.5.3 first before upgrading to 17.6.1.

Experiencing the same issue on Ubuntu 24.04 LTS with an Omnibus installation (gitlab-ce). The upgrade path and procedures were followed. GitLab was originally on 17.1.1 → 17.3.7 → 17.5.3 → 17.6.0/17.6.1. Upgrades up to 17.5.3 worked well, but the upgrade to either 17.6.0 or 17.6.1 always failed on this migration:

ALTER TABLE packages_dependencies VALIDATE CONSTRAINT check_83faf1f5e7;

We downgraded again by restoring a backup; everything is working on 17.5.3 again, but as soon as we try to upgrade to 17.6 it results in the same error.

I have now strictly followed the upgrade path and first upgraded to 17.5.3, which succeeded. Once that was done, I upgraded to 17.6.1, and this failed again with the same error (similar to what @flourhat described).
Since it seems this is not an isolated problem, could someone please pick this up, or let me know what else I could do on my side to narrow down the root cause of the issue?
