Error when upgrading from GitLab 13.12.15 to 14.0.12

Hello everyone.

A few days ago I upgraded gitlab-ce from 13.12.15 to 14.0.12.
I didn't know about batched background migrations, so I upgraded from 14.0.12 to 14.3.6 immediately afterwards and got an error.

I rolled back to gitlab-ce 14.0.12 and saw that some batched background migrations were stuck.

I queried the data in psql and saw the stuck jobs:

2 | CopyColumnUsingBackgroundMigrationJob | events | id | [["id"], ["id_convert_to_bigint"]]
3 | CopyColumnUsingBackgroundMigrationJob | push_event_payloads | event_id | [["event_id"], ["event_id_convert_to_bigint"]]
4 | CopyColumnUsingBackgroundMigrationJob | ci_job_artifacts | id | [["id", "job_id"], ["id_convert_to_bigint", "job_id_convert_to_bigint"]]
7 | CopyColumnUsingBackgroundMigrationJob | ci_builds | id | [["id", "stage_id"], ["id_convert_to_bigint", "stage_id_convert_to_bigint"]]
10 | CopyColumnUsingBackgroundMigrationJob | taggings | id | [["id", "taggable_id"], ["id_convert_to_bigint", "taggable_id_convert_to_bigint"]]
13 | CopyColumnUsingBackgroundMigrationJob | ci_stages | id | [["id"], ["id_convert_to_bigint"]]
14 | CopyColumnUsingBackgroundMigrationJob | ci_builds_metadata | id | [["id"], ["id_convert_to_bigint"]]

I cannot force the job to run with this command:

gitlab-rake gitlab:background_migrations:finalize[CopyColumnUsingBackgroundMigrationJob,events,id,'[["id"], ["id_convert_to_bigint"]]']
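
One thing worth ruling out with the command above is quoting. The job arguments have to reach rake as JSON with plain straight quotes, and typographic quotes picked up from a copy and paste will not parse. The sketch below builds the command string for one of the stuck jobs. The helper name finalize_command is made up for illustration. The backslash before each comma inside the JSON is an assumption based on how I remember the GitLab documentation writing this command, so please check it against the docs for your version before relying on it.

```python
import json


def finalize_command(job_class, table, column, job_arguments):
    """Build a gitlab-rake finalize invocation for one batched migration.

    Hypothetical helper, not part of GitLab. It only formats a string.
    """
    # json.dumps always emits straight double quotes, never typographic ones.
    args_json = json.dumps(job_arguments)
    # Assumption: commas inside the JSON are escaped with a backslash so that
    # rake does not treat them as separators between task arguments.
    escaped = args_json.replace(",", "\\,")
    return (
        "gitlab-rake gitlab:background_migrations:finalize"
        f"[{job_class},{table},{column},'{escaped}']"
    )


if __name__ == "__main__":
    print(
        finalize_command(
            "CopyColumnUsingBackgroundMigrationJob",
            "events",
            "id",
            [["id"], ["id_convert_to_bigint"]],
        )
    )
```

The printed line can be pasted into a shell as is. The single quotes around the JSON keep the shell from interpreting the brackets and double quotes.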

What can I do now?

I see that in these tables there are rows where id_convert_to_bigint is 0.
Should I update the data with id_convert_to_bigint = id and then change the status of the job manually?

Hi. Can anyone help me with this? I have been stuck here for a long time.

Probably the easiest option is to go back to 13.12.15 and restore your backup from that version. Assuming, of course, that you do have backups?

Thanks for your reply. But I cannot go back to 13.12.15 because the upgrade was done some time ago. From here I can only go forward.

I have read all your posts here:

And here:

But nothing helps.

I built a test environment to test this upgrade, and saw that the table columns are not converted to bigint when upgrading from 13.12.15 to 14.0.12.

The id columns are changed to bigint only when upgrading from 14.0.12 to 14.3.6.

So, as I understand it:
13.12.15 → 14.0.12: copy the data
14.0.12 → 14.3.6: change the column names

So I think I can update the data manually using SQL. Please confirm this.

This person had a similar issue: Error during 14.0.12 > 14.3.6 upgrade - #2 by unique

Not waiting for background migrations to complete before starting the next upgrade is a problem, and it is well documented in the GitLab upgrade docs. I don’t know about the manual update with SQL. Search the forums for bigint: there are plenty of posts already, and I’m pretty sure one of them has a manual way around this. I seem to remember seeing something like this before on this forum.

Thanks for your suggestion. He did the same thing you said: rolled back to 13.12.15 and then upgraded.

But I cannot do that right now. I think it is safe to manually update the data with SQL.

Yes, there are other posts on here to search through that do suggest manual fixes for your issue. Searching for “bigint” and other info from your first post should help you locate them. As I said, I’m pretty sure I saw something about that during the last year or so.

In the end I did it myself. I manually updated the data with SQL and checked that all the data is OK.
Then I took a backup and upgraded to 14.3.6.

I’m checking for any issues. It seems OK.

Thanks.

It would be nice to post exactly what you did to share with others in case they have the same problem.

I manually updated the data.

Example for one job:
SELECT id, job_class_name, table_name, column_name, job_arguments, status FROM batched_background_migrations WHERE status <> 3;

7 | CopyColumnUsingBackgroundMigrationJob | ci_builds | id | [["id", "stage_id"], ["id_convert_to_bigint", "stage_id_convert_to_bigint"]]

SELECT started_at, finished_at, finished_at - started_at AS duration, min_value, max_value, batch_size, sub_batch_size, status FROM batched_background_migration_jobs WHERE batched_background_migration_id = 7 ORDER BY id DESC limit 10;
\d ci_builds;

select count(*) from ci_builds where id != id_convert_to_bigint;
select count(*) from ci_builds where stage_id != stage_id_convert_to_bigint;

select id, id_convert_to_bigint, stage_id, stage_id_convert_to_bigint, created_at from ci_builds order by id desc limit 10;

update ci_builds set id_convert_to_bigint = id where id_convert_to_bigint != id;

update batched_background_migration_jobs set status = 3 where batched_background_migration_id = 7;
update batched_background_migrations set status = 3 where id = 7;

I did this for all pending jobs and then ran the upgrade.

But now I have another job that hangs:
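
For anyone repeating this across several tables, the steps above can be sketched as a small generator that prints the SQL for each stuck migration, so the statements can be reviewed before being pasted into psql. This is only a sketch under assumptions: the function name manual_fix_sql and the STUCK list layout are made up for illustration, the ids and column pairs are copied from the job list earlier in this thread, and the generated UPDATE covers every column pair (for ci_builds that includes stage_id as well as id). It also uses IS DISTINCT FROM rather than != so that NULLs would be caught too, which is a deliberate difference from the statements above. On large tables a single UPDATE may run for a long time, and a backup should exist first.

```python
# Stuck migrations as listed in this thread:
# (migration id, table, source columns, bigint copy columns)
STUCK = [
    (2, "events", ["id"], ["id_convert_to_bigint"]),
    (3, "push_event_payloads", ["event_id"], ["event_id_convert_to_bigint"]),
    (4, "ci_job_artifacts", ["id", "job_id"],
     ["id_convert_to_bigint", "job_id_convert_to_bigint"]),
    (7, "ci_builds", ["id", "stage_id"],
     ["id_convert_to_bigint", "stage_id_convert_to_bigint"]),
    (10, "taggings", ["id", "taggable_id"],
     ["id_convert_to_bigint", "taggable_id_convert_to_bigint"]),
    (13, "ci_stages", ["id"], ["id_convert_to_bigint"]),
    (14, "ci_builds_metadata", ["id"], ["id_convert_to_bigint"]),
]


def manual_fix_sql(migration_id, table, sources, targets):
    """Return the SQL statements for one stuck migration, in execution order.

    Hypothetical helper: it only builds strings and never touches a database.
    """
    statements = []
    for src, dst in zip(sources, targets):
        # NULL-safe comparison between the original column and its copy.
        mismatch = f"{dst} IS DISTINCT FROM {src}"
        # Count the rows that were never copied, then copy them.
        statements.append(f"SELECT count(*) FROM {table} WHERE {mismatch};")
        statements.append(f"UPDATE {table} SET {dst} = {src} WHERE {mismatch};")
    # Mark the batch jobs and the migration itself with status 3, as above.
    statements.append(
        "UPDATE batched_background_migration_jobs SET status = 3 "
        f"WHERE batched_background_migration_id = {migration_id};"
    )
    statements.append(
        f"UPDATE batched_background_migrations SET status = 3 WHERE id = {migration_id};"
    )
    return statements


if __name__ == "__main__":
    for migration in STUCK:
        print(f"-- migration {migration[0]} on {migration[1]}")
        print("\n".join(manual_fix_sql(*migration)))
        print()
```

Running the script only prints text, so nothing changes until the statements are executed by hand.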

Migration Progress Status
BackfillIntegrationsTypeNew: integrations 0.00% Active

 id |       job_class_name        |  table_name  | column_name | job_arguments | status
----+-----------------------------+--------------+-------------+---------------+--------
 15 | BackfillIntegrationsTypeNew | integrations | id          |               |      1

 started_at | finished_at | duration | min_value | max_value | batch_size | sub_batch_size | status
------------+-------------+----------+-----------+-----------+------------+----------------+--------
            |             |          |        35 |        65 |       1000 |            100 |      0

@iwalker: After several days at status 1, the job 15 | BackfillIntegrationsTypeNew | integrations | id | | 1 changed to status 5. I don’t know if that is OK.
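
To make the numeric status columns easier to read, here is a small lookup sketch. The mappings are an assumption: they are the enum values as I recall them from the GitLab 14.x source, not something confirmed in this thread, so verify them against your installed version. They are at least consistent with what is shown above: 3 is the value the queries treat as finished, and the migration at status 1 is displayed as Active. Under this mapping, status 5 would mean the migration is being finalized.

```python
# Assumption: values recalled from the GitLab 14.x source; verify before use.
MIGRATION_STATUS = {  # batched_background_migrations.status
    0: "paused",
    1: "active",
    3: "finished",
    4: "failed",
    5: "finalizing",
}

JOB_STATUS = {  # batched_background_migration_jobs.status
    0: "pending",
    1: "running",
    2: "failed",
    3: "succeeded",
}


def describe(code, mapping):
    """Translate a numeric status into a label, flagging unknown codes."""
    return mapping.get(code, f"unknown ({code})")


if __name__ == "__main__":
    # Migration 15 (BackfillIntegrationsTypeNew) as reported in this thread.
    print("migration status 1:", describe(1, MIGRATION_STATUS))
    print("migration status 5:", describe(5, MIGRATION_STATUS))
    print("job status 0:", describe(0, JOB_STATUS))
```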