Project pages give HTTP 500 errors after data migration to another server

I moved my GitLab (v16.4.1) data from a Debian 10 server to a Debian 12 server (same official GitLab Linux package, gitlab-ce, and same version, 16.4.1-ce.0) using the gitlab-backup create and gitlab-backup restore BACKUP=… commands (and also copying my gitlab-secrets.json file).

The restore went fine and I could log in as an admin or with LDAP credentials.

But I had (had, because I fixed some, but not all) HTTP 500 errors on:

  • Admin Area > Runners (/admin/runners)
  • All pages of every project I tested (disclaimer: I tested a lot of projects, but not all):
    • home page (/group/subgroup/project/-/settings/repository)
    • settings (/group/subgroup/project/-/settings/repository)
    • sub settings page (/group/subgroup/project/-/settings/repository)
    • merge requests (/group/subgroup/project/-/merge_requests)

I managed to “fix” the 500 error on the Runners page using the following 3 solutions, which I found on this forum, StackOverflow or GitLab.com:

Run the following SQL using gitlab-rails dbconsole:

UPDATE projects SET runners_token = null, runners_token_encrypted = null;
UPDATE namespaces SET runners_token = null, runners_token_encrypted = null;
UPDATE application_settings SET runners_registration_token_encrypted = null;
UPDATE application_settings SET encrypted_ci_jwt_signing_key = null;
UPDATE ci_runners SET token = null, token_encrypted = null;

TRUNCATE web_hooks CASCADE;

Run the following Ruby using gitlab-rails console:

ApplicationSetting.first.delete
ApplicationSetting.first

But the 500 error is still present on project pages.

The /var/log/gitlab/gitlab-rails/production_json.log complains about "exception.cause_class": "OpenSSL::Cipher::CipherError".

Full log line from `production_json.log`
{
  "method": "GET",
  "path": "/group/subgroup/project/-/merge_requests",
  "format": "html",
  "controller": "Projects::MergeRequestsController",
  "action": "index",
  "status": 500,
  "time": "2023-10-13T12:15:26.433Z",
  "params": [
    {
      "key": "namespace_id",
      "value": "group/subgroup"
    },
    {
      "key": "project_id",
      "value": "project"
    }
  ],
  "correlation_id": "01HCMFD0A53A0J225DBP8AMTWS",
  "meta.caller_id": "Projects::MergeRequestsController#index",
  "meta.remote_ip": "78.192.54.143",
  "meta.feature_category": "code_review_workflow",
  "meta.user": "cduv",
  "meta.user_id": 2,
  "meta.project": "group/subgroup/project",
  "meta.root_namespace": "parc_informatique",
  "meta.client_id": "user/2",
  "remote_ip": "78.192.54.143",
  "user_id": 2,
  "username": "cduv",
  "ua": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/118.0",
  "queue_duration_s": 0.045187,
  "request_urgency": "low",
  "target_duration_s": 5,
  "redis_calls": 48,
  "redis_allowed_cross_slot_calls": 1,
  "redis_duration_s": 0.007925,
  "redis_read_bytes": 5309,
  "redis_write_bytes": 5133,
  "redis_cache_calls": 6,
  "redis_cache_duration_s": 0.001123,
  "redis_cache_read_bytes": 113,
  "redis_cache_write_bytes": 2355,
  "redis_feature_flag_calls": 23,
  "redis_feature_flag_duration_s": 0.003771,
  "redis_feature_flag_read_bytes": 4532,
  "redis_feature_flag_write_bytes": 1397,
  "redis_repository_cache_calls": 12,
  "redis_repository_cache_duration_s": 0.001231,
  "redis_repository_cache_read_bytes": 449,
  "redis_repository_cache_write_bytes": 621,
  "redis_sessions_calls": 3,
  "redis_sessions_allowed_cross_slot_calls": 1,
  "redis_sessions_duration_s": 0.001195,
  "redis_sessions_read_bytes": 210,
  "redis_sessions_write_bytes": 593,
  "redis_shared_state_calls": 4,
  "redis_shared_state_duration_s": 0.000605,
  "redis_shared_state_read_bytes": 5,
  "redis_shared_state_write_bytes": 167,
  "db_count": 51,
  "db_write_count": 1,
  "db_cached_count": 11,
  "db_replica_count": 0,
  "db_primary_count": 51,
  "db_main_count": 51,
  "db_ci_count": 0,
  "db_main_replica_count": 0,
  "db_ci_replica_count": 0,
  "db_replica_cached_count": 0,
  "db_primary_cached_count": 11,
  "db_main_cached_count": 11,
  "db_ci_cached_count": 0,
  "db_main_replica_cached_count": 0,
  "db_ci_replica_cached_count": 0,
  "db_replica_wal_count": 0,
  "db_primary_wal_count": 0,
  "db_main_wal_count": 0,
  "db_ci_wal_count": 0,
  "db_main_replica_wal_count": 0,
  "db_ci_replica_wal_count": 0,
  "db_replica_wal_cached_count": 0,
  "db_primary_wal_cached_count": 0,
  "db_main_wal_cached_count": 0,
  "db_ci_wal_cached_count": 0,
  "db_main_replica_wal_cached_count": 0,
  "db_ci_replica_wal_cached_count": 0,
  "db_replica_duration_s": 0,
  "db_primary_duration_s": 0.044,
  "db_main_duration_s": 0.044,
  "db_ci_duration_s": 0,
  "db_main_replica_duration_s": 0,
  "db_ci_replica_duration_s": 0,
  "cpu_s": 0.649799,
  "mem_objects": 768235,
  "mem_bytes": 76154142,
  "mem_mallocs": 364768,
  "mem_total_bytes": 106883542,
  "pid": 2506031,
  "worker_id": "puma_10",
  "rate_limiting_gates": [],
  "exception.class": "ActionView::Template::Error",
  "exception.message": "",
  "exception.backtrace": [
    "app/models/concerns/integrations/has_data_fields.rb:15:in `project_url'",
    "app/models/integrations/base_issue_tracker.rb:67:in `issue_tracker_path'",
    "lib/sidebars/projects/menus/external_issue_tracker_menu.rb:9:in `link'",
    "lib/sidebars/menu.rb:158:in `serialize_as_menu_item_args'",
    "lib/sidebars/projects/menus/external_issue_tracker_menu.rb:53:in `serialize_as_menu_item_args'",
    "lib/sidebars/concerns/super_sidebar_panel.rb:25:in `block in transform_old_menus'",
    "lib/sidebars/concerns/super_sidebar_panel.rb:20:in `each'",
    "lib/sidebars/concerns/super_sidebar_panel.rb:20:in `transform_old_menus'",
    "lib/sidebars/projects/super_sidebar_panel.rb:35:in `configure_menus'",
    "lib/sidebars/panel.rb:16:in `initialize'",
    "app/helpers/sidebars_helper.rb:122:in `new'",
    "app/helpers/sidebars_helper.rb:122:in `super_sidebar_nav_panel'",
    "app/views/layouts/_page.html.haml:8",
    "app/views/layouts/application.html.haml:20",
    "app/views/layouts/project.html.haml:29",
    "app/controllers/application_controller.rb:161:in `render'",
    "app/controllers/application_controller.rb:569:in `block in allow_gitaly_ref_name_caching'",
    "lib/gitlab/gitaly_client.rb:352:in `allow_ref_name_caching'",
    "app/controllers/application_controller.rb:568:in `allow_gitaly_ref_name_caching'",
    "app/controllers/application_controller.rb:520:in `set_current_admin'",
    "lib/gitlab/session.rb:11:in `with_session'",
    "app/controllers/application_controller.rb:511:in `set_session_storage'",
    "lib/gitlab/i18n.rb:107:in `with_locale'",
    "lib/gitlab/i18n.rb:113:in `with_user_locale'",
    "app/controllers/application_controller.rb:502:in `set_locale'",
    "app/controllers/application_controller.rb:495:in `set_current_context'",
    "lib/gitlab/metrics/elasticsearch_rack_middleware.rb:16:in `call'",
    "lib/gitlab/middleware/memory_report.rb:13:in `call'",
    "lib/gitlab/middleware/speedscope.rb:13:in `call'",
    "lib/gitlab/database/load_balancing/rack_middleware.rb:23:in `call'",
    "lib/gitlab/middleware/rails_queue_duration.rb:33:in `call'",
    "lib/gitlab/etag_caching/middleware.rb:21:in `call'",
    "lib/gitlab/metrics/rack_middleware.rb:16:in `block in call'",
    "lib/gitlab/metrics/web_transaction.rb:46:in `run'",
    "lib/gitlab/metrics/rack_middleware.rb:16:in `call'",
    "lib/gitlab/jira/middleware.rb:19:in `call'",
    "lib/gitlab/middleware/go.rb:20:in `call'",
    "lib/gitlab/middleware/query_analyzer.rb:11:in `block in call'",
    "lib/gitlab/database/query_analyzer.rb:37:in `within'",
    "lib/gitlab/middleware/query_analyzer.rb:11:in `call'",
    "lib/gitlab/middleware/multipart.rb:173:in `call'",
    "lib/gitlab/middleware/read_only/controller.rb:50:in `call'",
    "lib/gitlab/middleware/read_only.rb:18:in `call'",
    "lib/gitlab/middleware/same_site_cookies.rb:27:in `call'",
    "lib/gitlab/middleware/basic_health_check.rb:25:in `call'",
    "lib/gitlab/middleware/handle_malformed_strings.rb:21:in `call'",
    "lib/gitlab/middleware/handle_ip_spoof_attack_error.rb:25:in `call'",
    "lib/gitlab/middleware/request_context.rb:15:in `call'",
    "lib/gitlab/middleware/webhook_recursion_detection.rb:15:in `call'",
    "config/initializers/fix_local_cache_middleware.rb:11:in `call'",
    "lib/gitlab/middleware/compressed_json.rb:44:in `call'",
    "lib/gitlab/middleware/rack_multipart_tempfile_factory.rb:19:in `call'",
    "lib/gitlab/middleware/sidekiq_web_static.rb:20:in `call'",
    "lib/gitlab/metrics/requests_rack_middleware.rb:79:in `call'",
    "lib/gitlab/middleware/release_env.rb:13:in `call'"
  ],
  "exception.cause_class": "OpenSSL::Cipher::CipherError",
  "db_duration_s": 0.18521,
  "view_duration_s": 0,
  "duration_s": 0.65254
}
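To see how widespread the problem is, the log can be filtered for entries like the one above. This is only a minimal sketch: it assumes the file holds one JSON object per line, and it relies solely on fields visible in the entry above (status, path, exception.cause_class, exception.backtrace). The function name cipher_error_requests is made up for this example.

```python
import json


def cipher_error_requests(lines):
    """Yield (path, first backtrace frame) for 500s caused by a CipherError."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or non-JSON lines
        if not isinstance(entry, dict):
            continue
        if entry.get("status") != 500:
            continue
        if entry.get("exception.cause_class") != "OpenSSL::Cipher::CipherError":
            continue
        # The first frame hints at which model failed to decrypt its data.
        backtrace = entry.get("exception.backtrace") or [""]
        yield entry.get("path"), backtrace[0]


# In-memory sample built from fields of the log entry above.
sample_line = json.dumps({
    "status": 500,
    "path": "/group/subgroup/project/-/merge_requests",
    "exception.cause_class": "OpenSSL::Cipher::CipherError",
    "exception.backtrace": [
        "app/models/concerns/integrations/has_data_fields.rb:15:in `project_url'"
    ],
})
for path, frame in cipher_error_requests([sample_line]):
    print(path, "<-", frame)
```

On a real server, the same function could be fed the open file handle of /var/log/gitlab/gitlab-rails/production_json.log instead of the sample list. Grouping the results by first frame should show whether the failures all come from the same model.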

The issues I can find about this OpenSSL::Cipher::CipherError are usually related to a missing or incorrect gitlab-secrets.json, which is not my case (I double-checked the SHA1 of gitlab-secrets.json on both servers, and I guess I could not log in with the admin account if the keys were not OK?).
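For reference, the checksum comparison mentioned above can be sketched as follows. The helper name file_digest is hypothetical, and SHA-256 is used by default here instead of SHA1, which is only a choice made for this example. The function would be run on each server and the two hex strings compared by hand.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a file, read in chunks to keep memory use low."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_digest(old_server_digest, new_server_digest):
    """Compare two hex digests, ignoring case and surrounding whitespace."""
    return old_server_digest.strip().lower() == new_server_digest.strip().lower()
```

For example, file_digest("/etc/gitlab/gitlab-secrets.json", "sha1") would reproduce the SHA1 check. Note that a matching digest only shows that the two files are identical now; it says nothing about whether the file was already in place when the restore ran.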

So I am a bit lost on what I can do: everything works fine on the other/old server (still available for tests) :-/

For more context (if relevant): I am using the official Linux package with its embedded Nginx, but I am using the Træfik reverse proxy in front of the Nginx, with HTTPS TLS termination on Træfik and a TLS certificate signed by a self-signed/internal CA, which is present in /etc/gitlab/trusted-certs/.

Looking at the backtrace:

  "exception.class": "ActionView::Template::Error",
  "exception.message": "",
  "exception.backtrace": [
    "app/models/concerns/integrations/has_data_fields.rb:15:in `project_url'",

Could the “Integrations” be the cause?

I can access the instance-level settings (“Instance-level integration management” /admin/application_settings/integrations) and list which projects have custom settings, but I cannot see those custom settings or change/disable them, because that is a “project” page, which returns an HTTP 500 error.

Maybe I could try disabling/deleting those settings using gitlab-rails dbconsole and the correct SQL queries?

It was the integrations: when I manually disable all integrations of a given project (UPDATE integrations SET active = false WHERE project_id = xx), the project pages stop returning HTTP 500 errors.

I can also load the page listing all the integrations of a project, but I cannot open the settings of a given integration.

In the end I deleted all the integrations I had (DELETE FROM integrations) and re-created them using the “old” server as a model.
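Before running that kind of UPDATE, it may help to record which rows are about to change so they can be re-enabled later. The sketch below only illustrates the idea on an in-memory SQLite table that stands in for the real PostgreSQL one. The table layout (id, project_id, active), the 0/1 booleans, the project id 42 and the function name are all assumptions for this example, not GitLab's actual schema.

```python
import sqlite3


def disable_project_integrations(conn, project_id):
    """Deactivate all active integrations of one project and return their ids."""
    rows = conn.execute(
        "SELECT id FROM integrations WHERE project_id = ? AND active = 1 ORDER BY id",
        (project_id,),
    ).fetchall()
    ids = [row[0] for row in rows]
    conn.execute(
        "UPDATE integrations SET active = 0 WHERE project_id = ? AND active = 1",
        (project_id,),
    )
    return ids


# Toy stand-in for the real table (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE integrations (id INTEGER PRIMARY KEY, project_id INTEGER, active INTEGER)"
)
conn.executemany(
    "INSERT INTO integrations (id, project_id, active) VALUES (?, ?, ?)",
    [(1, 42, 1), (2, 42, 0), (3, 7, 1)],
)
disabled = disable_project_integrations(conn, 42)
conn.commit()
print("disabled integration ids:", disabled)
```

Keeping the returned ids makes it possible to re-enable exactly those rows once the underlying decryption problem is solved, instead of guessing which integrations were active before.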

I only had Mattermost, Redmine and Custom Issue Tracker integrations.

What remains unknown: how can the same data set work on one instance but not on another? Maybe the data migration process has some flaw with this data?

This is very interesting. I will note that very, very old versions (10, 11) had issues going from EE to CE. I’m unsure whether you did that, but back then you had to run console commands to remove entries that were not compatible between EE and CE.

Were you ever on EE then downgraded?

No, it always was CE.

I’ve just upgraded to v16.5.0 and I am again getting “500” errors on pages related to CI/CD:

  • Runners list (/admin/runners)
  • CI/CD variables (/group/subgroup/project/-/settings/ci_cd)

I’m curious about the restore procedure you followed. From what you mention, the problems look like they were due to gitlab-secrets.json not being in place before the restore started. So was this file put in place after the restore? If you can clarify, that would be great.

When I’ve restored, I’ve never had issues with secrets, but then I would normally do:

  1. Install gitlab-ce package - same version as server backup.
  2. Put /etc/gitlab/gitlab.rb in place from old server, as well as /etc/gitlab/gitlab-secrets.json.
  3. Run gitlab-ctl reconfigure
  4. Follow restore procedures as per Gitlab docs.
  5. After restore, run gitlab-ctl reconfigure and gitlab-ctl restart

So just wondering at which stage you put the secrets file in place, before or after restore.
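The ordering in the steps above can be sketched as a small pre-flight check. This is only an illustration: it assumes the gitlab-ce package is already installed (step 1), backup_id is a placeholder for whatever value goes into BACKUP=…, the function name restore_plan is made up, and the additional steps from the official restore procedure are not reproduced here.

```python
import os

CONFIG_PATH = "/etc/gitlab/gitlab.rb"
SECRETS_PATH = "/etc/gitlab/gitlab-secrets.json"


def restore_plan(backup_id, config_path=CONFIG_PATH, secrets_path=SECRETS_PATH,
                 exists=os.path.exists):
    """Return the commands in order, refusing to plan a restore without the secrets."""
    # Step 2: both files must be copied from the old server before anything else.
    missing = [path for path in (config_path, secrets_path) if not exists(path)]
    if missing:
        raise FileNotFoundError(
            "copy these from the old server before restoring: " + ", ".join(missing)
        )
    return [
        ["gitlab-ctl", "reconfigure"],                        # step 3
        ["gitlab-backup", "restore", f"BACKUP={backup_id}"],  # step 4 (see official docs)
        ["gitlab-ctl", "reconfigure"],                        # step 5
        ["gitlab-ctl", "restart"],                            # step 5
    ]
```

The function only builds the list and runs nothing. The point is that the restore command is never reached unless gitlab-secrets.json is already present.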

I always ensure that, each day a backup is created, a backup of gitlab-secrets.json is also made, since tokens/secrets could have been added during each day of using GitLab.

gitlab-rake gitlab:doctor:secrets VERBOSE=1 is indeed now reporting issues with CI variables and runners.

I might (but I’m not sure) have put the gitlab-secrets.json in place after running the restore command.

This is why I “repaired” my integrations and CI/CD variables by manually recreating them from my old instance.

But I can’t understand why, once repaired and working, it breaks again after an upgrade (during which the gitlab-secrets.json file was not changed/moved/etc.).

Error in production.log:

OpenSSL::Cipher::CipherError ():
  lib/gitlab/crypto_helper.rb:28:in `aes256_gcm_decrypt'
  app/models/concerns/token_authenticatable_strategies/encryption_helper.rb:16:in `decrypt_token'
  app/models/concerns/token_authenticatable_strategies/encrypted.rb:78:in `get_encrypted_token'
  app/models/concerns/token_authenticatable_strategies/encrypted.rb:50:in `get_token'
  app/models/concerns/token_authenticatable.rb:41:in `block in add_authentication_token_field'
  app/models/ci/runner.rb:401:in `short_sha'

It would probably have been better to use the official GitLab docs for resetting those items rather than the StackOverflow post: Back up GitLab | GitLab

There is a chance that those DB updates haven’t completely reset everything, and this is why you are experiencing problems; either that, or some steps were skipped. If it were me, I would do the restore of 16.4.1 again, making sure that the latest version of gitlab-secrets.json, along with a backup taken at the same time as that secrets file, is copied to the new server and reconfigured before starting the restore, similar to the summary steps in my post above.

I basically ran the provided SQL commands to delete the tokens/variables.

If I get it right, there was a mismatch between the secrets/keys (from gitlab-secrets.json file) and the data from the backup.

But the problem continues when I replace gitlab-secrets.json with the keys from when the restore occurred (I kept the other file).

You changed it after the restore? That won’t work. It needs to be in place before the restore. Either that, or your secrets file is not from the same time that the backup was created, or it is damaged in some way, assuming the restore was repeated from scratch rather than attempted on an already restored install.