yum update automatically upgraded gitlab-ci to 13.4
After the upgrade finished, no repo is available any more.
I can see the projects listed on the web, but clicking on any of them shows “repository does not exist / create empty repository”.
/var/opt/gitlab/git-data/repositories
Here I can see that all my project directories are empty (I see the group dirs but nothing in them), and there’s a @hashed directory that is the same size as all the others used to be. So it looks like the 13.4 upgrade tried to migrate storage to hashed, but after it moved/renamed/reorganized all the files something died and the process did not finish.
I unfortunately have only a month-old backup, so I don’t know what to do now. Is there any way to restart the migration or to roll it back? All the files were moved to @hashed and all the repos now show as gone …
Maybe give it some time to complete, there is a counter in the monitoring dashboard thing.
See here: ‘You can monitor the progress in the Admin Area > Monitoring > Background Jobs page. On the Queues tab, you can watch the hashed_storage:hashed_storage_project_rollback queue to see how long the process will take to finish.’
I’ve looked at the gitlab install from @archi - I know him IRL.
Jobs failed with Error Class OpenSSL::Cipher::CipherError
I’ve checked /etc/gitlab/gitlab-secrets.json and gitlab.yml and /var/opt/gitlab/gitlab-rails/etc/secrets.yml are present. gitlab-rake check reports no problem. gitlab-ctl reinstall doesn’t alter the secrets. Restart doesn’t help. cache:clear doesn’t help. Running hashed jobs again from rake restarts the tasks, but they fail again.
I’m out of ideas. I’m guessing that the certs fail to verify since the dirs are empty - and they skip the next step, since the data has been moved to @hashed. Still, there are no links in the database, so the data is just “hanging”…
@davodm when it crashed, what state was your system in? Here the files were all moved from …/repositories/groupname/projectname/* to …/repositories/@hashed/somegarblednames, and when you click on any repo on the web it shows that the repo does not exist.
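For anyone trying to match the garbled names under @hashed back to projects: as far as I understand the hashed storage layout, the directory name is the SHA-256 hex digest of the project’s numeric ID, nested under its first two and next two characters. Here is a minimal Python sketch under that assumption. The function name is my own, and the result should be compared against what is actually on disk before relying on it.

```python
import hashlib


def hashed_repo_path(project_id: int) -> str:
    """Return the relative @hashed path a project ID would map to.

    Assumption: the layout is @hashed/<first 2 chars>/<next 2 chars>/<digest>.git,
    where <digest> is the SHA-256 hex digest of the project ID as a string.
    """
    digest = hashlib.sha256(str(project_id).encode("ascii")).hexdigest()
    return f"@hashed/{digest[:2]}/{digest[2:4]}/{digest}.git"


if __name__ == "__main__":
    # Hypothetical project IDs; take the real ones from the project pages.
    for pid in (1, 2, 3):
        print(pid, hashed_repo_path(pid))
```

The printed paths are relative to the repositories directory, so they can be checked directly against the contents of @hashed.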
Trying to revert this %#%^@# back to a usable state, but 24 hours later we are still not operational … I fear we lost a month of work (the backup is 1 month old).
I tried everything that came to mind without success. I can’t revert back to the old storage, nor can I get this new hashed storage to work. Some 10+ repositories are offline and the backup is a month old … so I’m trying to figure out how to move forward or backwards, but for now nothing.
Update from me. This time I did the migration manually before updating and that went fine. So updating and letting it do the auto migration seems to be the problem.
Yes, it was in the queue to set up daily offsite backups … but the git backups have not been working properly for the past month: all the backups are without repositories, only the DB was backed up … and I set the VM to be backed up only once a month (that’s why I have a backup that’s a month old) …
anyhow I can’t believe there is no way to somehow get this hashed directory to revert back to old storage ?!?!?!?
We have quite a few projects (~3600, though a few will have been created after hashed storage became the default for new projects) that will need conversion.
I’m trying to get an idea of how long that will take, is there any good way to get such an idea?
Even though I know that there are a lot of local factors, could any of you who have been through this share some numbers on how many projects you have and how long it took?
@steep_suzuki: If you’re asking me: NO. I had seen that point in the documentation, and except that I got the project ids from the project home page rather than through the admin area, I had done that for some of my own projects (where downtime was expected, and due to a typo for some other projects too, which hasn’t made anyone complain). But I’ve gotten timings from 1/16th of a second to 50 seconds per project. Scale that to 3600 projects and I get estimates ranging from 3-4 minutes to 2-3 days. As that is quite a wide range, I would like more knowledge/measurements/…
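The range above is just a linear extrapolation, which can be written out as a small Python sketch. It assumes the projects are migrated one after another at a constant per-project time. That is an assumption on my part: jobs running in parallel would shorten the total, and a few very large repositories could lengthen it. The function name is only for illustration.

```python
def estimate_total_seconds(project_count: int, seconds_per_project: float) -> float:
    """Linear estimate: every project takes the same time, migrated serially."""
    return project_count * seconds_per_project


if __name__ == "__main__":
    projects = 3600
    fastest = estimate_total_seconds(projects, 1 / 16)  # best timing measured
    slowest = estimate_total_seconds(projects, 50)      # worst timing measured
    print(f"fastest: {fastest / 60:.2f} minutes")
    print(f"slowest: {slowest / 3600:.1f} hours ({slowest / 86400:.2f} days)")
```

With the two measured extremes this gives 3.75 minutes and 50 hours (a bit over 2 days), which matches the rough range quoted above.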