Upgrade to GitLab 13.4.0 (b0481767fe4) killed all repositories

Nope. That, again, fails with "OpenSSL::Cipher::CipherError: "

well you did answer but your answer does not work for me :frowning:

e.g.

[root@git backups]# gitlab-rake gitlab:storage:migrate_to_hashed ID_FROM=25 ID_TO=25
Enqueueing storage migration of elco/5kwtenk (ID=25)...
 Done!
[root@git backups]# cd /var/opt/gitlab/git-data/repositories/elco
[root@git elco]# pwd
/var/opt/gitlab/git-data/repositories/elco
[root@git elco]# ls -la
total 16
drwxr-sr-x.  4 git root 4096 Sep 23 12:14 .
drwxrws---. 10 git root 4096 Sep 24 13:56 ..
drwxr-sr-x.  6 git root 4096 Sep 23 03:27 knowledgeBase.git
drwxr-s---.  7 git root 4096 Sep 23 09:22 knowledgeBase.wiki.git
[root@git elco]#

what do you have in git-data/repositories for that id240 ?

[root@git arhi]# pwd
/var/opt/gitlab/git-data/repositories/arhi
[root@git arhi]# ls -la
total 8
drwxr-sr-x.  2 git root 4096 Sep 23 03:27 .
drwxrws---. 10 git root 4096 Sep 24 13:56 ..
[root@git arhi]#

if there is no data in the old storage dir … looks like this cipher fails 'cause dir is empty

looks like we solved it.

we deleted registration tokens https://docs.gitlab.com/ce/raketasks/backup_restore.html#reset-runner-registration-tokens

and then gitlab-rake gitlab:storage:migrate_to_hashed works :smiley: (so far so good)

Resolved according to https://docs.gitlab.com/ee/raketasks/backup_restore.html#when-the-secrets-file-is-lost

My repositories were half-migrated: the files on disk were in the new location, but in the database they were still marked as legacy projects.

I wrote a little script that goes through the hashed storage, looks at each config file, reads the legacy path out of it, and moves the directory back to its old position.

THIS IS A VERY UNSAFE SCRIPT!

DO NOT USE THIS UNLESS YOU ARE SURE YOU HAVE THE SAME PROBLEM!

require 'fileutils'

# All paths below are relative to the repository storage root.
Dir.chdir 'data/git-data/repositories/'

puts "=== find hashed projects and fix them ==="
moved = 0
Dir.glob("@hashed/*/*/*.git/config").each do |configfile|
  puts configfile
  # Wiki repositories are moved together with their project further down.
  next if configfile =~ /wiki/
  stem = configfile.sub(/\.git\/config\z/, '')
  config = File.read(configfile)
  if config =~ /fullpath = (.*)/
    legacypath = $1.strip
    puts "move #{stem} to #{legacypath}"
    if File.exist?("#{stem}.wiki.git")
      # A wiki directory already sitting at the legacy path is deleted first.
      if File.exist?("#{legacypath}.wiki.git")
        puts "  rm -rf #{legacypath}.wiki.git"
        FileUtils.rm_rf("#{legacypath}.wiki.git")
      end
      puts "  mv #{stem}.wiki.git to #{legacypath}.wiki.git"
      File.rename("#{stem}.wiki.git", "#{legacypath}.wiki.git")
    end

    if File.exist?("#{stem}.git")
      moved += 1
      puts "  mv #{stem}.git to #{legacypath}.git"
      File.rename("#{stem}.git", "#{legacypath}.git")
    end
  else
    puts "no fullpath entry in config, skipping #{stem}"
  end
  # break if moved > 30
end
puts "=== done (#{moved} repositories moved) ==="
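
Before running anything that moves directories, it may help to list what would be moved first. The following is only a dry-run sketch based on the same idea as the script above (read `fullpath` from each repository's config file). The function name `planned_moves` and its `repo_root` parameter are made up for illustration, and it assumes the config files contain a `fullpath = ...` line as described in this post.

```ruby
require 'fileutils'

# Dry run: returns [hashed_repo_dir, legacy_repo_dir] pairs and moves nothing.
# `repo_root` is the repository storage root (hypothetical parameter name).
def planned_moves(repo_root)
  pattern = File.join(repo_root, '@hashed', '*', '*', '*.git', 'config')
  Dir.glob(pattern).sort.filter_map do |configfile|
    # Wiki repositories follow their project, so they are not listed separately.
    next if configfile.end_with?('.wiki.git/config')

    # Pull the legacy path out of the "fullpath = ..." line, if there is one.
    fullpath = File.read(configfile)[/^\s*fullpath\s*=\s*(.+)$/, 1]
    next unless fullpath

    [File.dirname(configfile), File.join(repo_root, "#{fullpath.strip}.git")]
  end
end
```

Printing each pair and checking a few by hand is a cheap sanity check before any real move.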

Most people in this topic had exactly the same problem. It was fixed by:

  1. delete registration tokens as explained on the backup_restore.html
  2. restart the migration (rerun migrate_to_hashed script)

hm … wikis are not back :frowning:

repo with wiki:

gitlab-rake gitlab:storage:migrate_to_hashed ID_FROM=8 ID_TO=8

supposedly worked ok

but is still on the old storage format and wiki not available :frowning:

Hi arhi,

thank you for the information. I didn’t notice it because after the upgrade (12.10.14 -> 13.0.12 -> 13.4.1) I checked only the web dashboard. Besides this, why did you do it (a question for my understanding)? It seems that hashed storage has become “the rule”: https://docs.gitlab.com/ee/administration/repository_storage_types.html#hashed-storage

thank you
cheers
Stefano

solved the repo with wiki by deleting wiki dir
so

rm -rf /var/opt/gitlab/git-data/repositories/arhi/repo.wiki.git
gitlab-rake gitlab:storage:migrate_to_hashed ID_FROM=8 ID_TO=8

this migrated it to hashed and restored the wiki (from who knows where)

Not sure I understand. Why did I do what?

@bjelline were you able to resolve this problem on your server? My migration jobs are also failing with:

Gitlab::Git::CommandError: 2:NoMethodError: undefined method `relative_path' for nil:NilClass.

I can’t seem to figure out what’s causing it.

I created an issue for this bug on the gitlab issue tracker here.

I’m still looking for some way to get access to my repos again. I don’t use runners, so I can’t imagine that clearing secrets out of the database as suggested above will do me much good.

I have not used any runners either, but that solved the problem.

Interesting. I went ahead and gave it a try. Unfortunately, I’m getting the same error. :frowning: I think since we saw different error messages in the Sidekiq logs, we’re experiencing different issues.

Hi all, I will look at all the information you provided here and track the problem in https://gitlab.com/gitlab-org/gitlab/-/issues/259605. This is a high-priority issue to get fixed, as it shouldn’t have caused problems in the first place. The way the migration was coded makes it very hard to lose data. For those of you who have it in an inconsistent state, it may be the case that the database has flagged the storage as migrated but the repositories are still located on the legacy storage format, or the opposite: the repositories were migrated but the database update failed for some reason. In any case, it’s possible to get it back to normal.
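
To see which of the two inconsistent states a project is in, one option is to compare what is on disk at the legacy path and at the hashed path. The sketch below relies on the hashed storage layout documented by GitLab (`@hashed/<first two hex chars>/<next two>/<SHA256 of the project ID>.git`). The helper names `hashed_repo_path` and `storage_state` are hypothetical, and this only inspects the filesystem, not the database flag.

```ruby
require 'digest'

# Relative path a project gets under hashed storage, derived from its numeric ID.
def hashed_repo_path(project_id)
  hash = Digest::SHA256.hexdigest(project_id.to_s)
  File.join('@hashed', hash[0, 2], hash[2, 2], "#{hash}.git")
end

# Reports which on-disk locations exist for one project.
# `repo_root` is the storage root, `legacy_path` is e.g. "elco/knowledgeBase".
def storage_state(repo_root, project_id, legacy_path)
  {
    legacy: Dir.exist?(File.join(repo_root, "#{legacy_path}.git")),
    hashed: Dir.exist?(File.join(repo_root, hashed_repo_path(project_id)))
  }
end
```

For example, `storage_state('/var/opt/gitlab/git-data/repositories', 25, 'elco/knowledgeBase')` would report where the repository for project 25 from earlier in this thread currently lives. Whether that matches what the database believes still has to be checked separately.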

Please follow the issue to get notified of any solution. We will probably have a patch release for 13.4.x with a fix, but I will also provide instructions on how to fix it manually, so you don’t have to wait.

The OpenSSL::Cipher::CipherError means some encrypted data in the database couldn’t be read with the existing keys in /etc/gitlab/gitlab-secrets.json. The only reason I can think of for this happening is when you have GitLab installed in either HA or sort-of HA, where you are running on multiple nodes to spread the load. In that scenario, if you have Sidekiq on a different node and you have forgotten to copy the secrets there, you may end up with this kind of issue.
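
To illustrate why a key mismatch surfaces as this particular exception, here is a minimal standalone sketch using Ruby's OpenSSL bindings. It is not GitLab's actual encryption code. The `encrypt` and `decrypt` helpers and the choice of AES-256-GCM are assumptions made for the example. It only shows that data encrypted with one key cannot be decrypted with another.

```ruby
require 'openssl'

# Encrypts plaintext with AES-256-GCM and returns everything needed to decrypt it.
def encrypt(plaintext, key)
  cipher = OpenSSL::Cipher.new('aes-256-gcm')
  cipher.encrypt
  cipher.key = key
  iv = cipher.random_iv
  data = cipher.update(plaintext) + cipher.final
  { iv: iv, tag: cipher.auth_tag, data: data }
end

# Decrypts a box produced by encrypt; raises OpenSSL::Cipher::CipherError
# when the key (or the data) does not match.
def decrypt(box, key)
  cipher = OpenSSL::Cipher.new('aes-256-gcm')
  cipher.decrypt
  cipher.key = key
  cipher.iv = box[:iv]
  cipher.auth_tag = box[:tag]
  cipher.update(box[:data]) + cipher.final
end
```

Encrypting a token with one random 32-byte key and calling `decrypt` with a different one raises `OpenSSL::Cipher::CipherError`, which is the same class of failure as a database whose encrypted columns were written with secrets that are no longer present on the node.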

@arhi Could you please provide additional insights here: https://gitlab.com/gitlab-org/gitlab/-/issues/259605 ?

For those of you having issues related with OpenSSL::Cipher::CipherError please look at the documentation here: https://docs.gitlab.com/ee/administration/raketasks/doctor.html#verify-database-values-can-be-decrypted-using-the-current-secrets

Hi, that was not the case :frowning:

I had GitLab running on a single node, a simple yum install gitlab-ci. It was running for a while, then it started having some issues with upgrades (I think “registry” was added, so the https keys were not loading properly), and after every upgrade I had to manually “fix” the config by pointing the registry to the proper https keys. Then I decided to move GitLab from the host to a VM for easier backup, as the whole system started to be super important. So what we did was backup-etc + backup, installed it on the VM, restored etc and the backup (so the rb files were copied to the new system), and everything worked ok. I did a few upgrades, everything worked ok, and then this upgrade came, trying to do a filesystem migration that crashed big time. I had ~25 repos there: 2 of them had wiki pages, 2-3 of them were already on the hashed system, and all the others were on the old system. After the migration, all repos in the “old system” were empty (empty folders), and no repos except those that were on hashed before were visible. Starting the migration again crashed with this crypto error. I then ran

UPDATE projects SET runners_token = null, runners_token_encrypted = null;
UPDATE namespaces SET runners_token = null, runners_token_encrypted = null;
UPDATE application_settings SET runners_registration_token_encrypted = null;
UPDATE ci_runners SET token = null, token_encrypted = null;

and started the migration again.
All repos became visible now, but the 2 repos with a wiki had an empty wiki.
I checked the filesystem: those 2 repos that had a wiki now had a repo.wiki (empty) folder in the old filesystem repo format. I tried a few things to restore the wiki but could not do it. The repos were also in the old format, and the migration was not moving them to hashed. Then I deleted the .wiki folder and ran the migration again, the repos migrated successfully to hashed, and the wiki automagically got restored…

so I’m now fully functional but can’t say I’m very confident in gitlab after the whole ordeal :frowning: so I’m manually backing up the whole VM non stop and storing those backups… not very optimal but…

When you back up and restore, you also need to copy the secrets, as they are not part of the backup: https://docs.gitlab.com/ee/raketasks/backup_restore.html#storing-configuration-files
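
Until something like that exists, copying the configuration files next to each backup can be scripted. The following is only a sketch. The function name `copy_config_files` is made up, and the default paths assume an Omnibus install that keeps its configuration in /etc/gitlab (the secrets file path is the one mentioned earlier in this thread). Adjust both to your setup.

```ruby
require 'fileutils'

# Assumed Omnibus locations; change these if your install differs.
DEFAULT_CONFIG_FILES = [
  '/etc/gitlab/gitlab-secrets.json',
  '/etc/gitlab/gitlab.rb'
].freeze

# Copies the config files that exist into dest_dir and returns the copied paths.
# Files that are missing are skipped rather than treated as an error.
def copy_config_files(dest_dir, sources = DEFAULT_CONFIG_FILES)
  FileUtils.mkdir_p(dest_dir)
  sources.select { |path| File.file?(path) }.map do |path|
    target = File.join(dest_dir, File.basename(path))
    FileUtils.cp(path, target)
    File.chmod(0o600, target) # the copy contains secrets, keep it private
    target
  end
end
```

The destination should be stored as carefully as the backup itself, since anyone holding the secrets file can decrypt the encrypted columns of a restored database.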

I think we should consider a solution to be able to store them inside the backup files as well. I will create an issue to follow up on that.