Upgrade to 12.6 gives panic runtime error

So far I have never had any issues upgrading GitLab. However, the latest upgrade, from 12.4.x to 12.6.x (I skipped one release in November), fails for some reason. See more info below.
I have no clue how to debug this one, since I don’t really understand what’s wrong… Any ideas?

  1. I’m getting a 500 HTTP status.

  2. Running sudo gitlab-ctl tail gives:


    ==> /var/log/gitlab/prometheus/current <==
    2020-01-11_12:29:38.99855 level=info ts=2020-01-11T12:29:38.998Z caller=main.go:293 msg="no time or size retention was set so using the default time retention" duration=15d
    2020-01-11_12:29:38.99864 level=info ts=2020-01-11T12:29:38.998Z caller=main.go:329 msg="Starting Prometheus" version="(version=2.12.0, branch=master, revision=)"
    2020-01-11_12:29:38.99869 level=info ts=2020-01-11T12:29:38.998Z caller=main.go:330 build_context="(go=go1.12.13, user=GitLab-Omnibus, date=)"
    2020-01-11_12:29:38.99875 level=info ts=2020-01-11T12:29:38.998Z caller=main.go:331 host_details="(Linux 4.9.0-11-rt-amd64 #1 SMP PREEMPT RT Debian 4.9.189-3+deb9u2 (2019-11-11) x86_64 GitLab0 (none))"
    2020-01-11_12:29:38.99881 level=info ts=2020-01-11T12:29:38.998Z caller=main.go:332 fd_limits="(soft=50000, hard=50000)"
    2020-01-11_12:29:38.99886 level=info ts=2020-01-11T12:29:38.998Z caller=main.go:333 vm_limits="(soft=unlimited, hard=unlimited)"
    2020-01-11_12:29:39.00167 panic: runtime error: slice bounds out of range
    2020-01-11_12:29:39.00172
    2020-01-11_12:29:39.00174 goroutine 1 [running]:
    2020-01-11_12:29:39.00180 github.com/prometheus/prometheus/promql.parseBrokenJson(0xc0008c3000, 0x4e21, 0x4e21, 0x2a74ce0, 0xc0005307b0, 0x0, 0x0, 0xc0002dc0f0)
    2020-01-11_12:29:39.00200 /var/cache/omnibus/src/prometheus/src/github.com/prometheus/prometheus/promql/query_logger.go:45 +0x131
    2020-01-11_12:29:39.00209 github.com/prometheus/prometheus/promql.logUnfinishedQueries(0xc0002dc0f0, 0x2e, 0x4e21, 0x2a74ce0, 0xc0005307b0)
    2020-01-11_12:29:39.00222 /var/cache/omnibus/src/prometheus/src/github.com/prometheus/prometheus/promql/query_logger.go:70 +0x552
    2020-01-11_12:29:39.00230 github.com/prometheus/prometheus/promql.NewActiveQueryTracker(0x7fff4c7b7eb6, 0x1f, 0x14, 0x2a74ce0, 0xc0005307b0, 0x2a74ce0)
    2020-01-11_12:29:39.00245 /var/cache/omnibus/src/prometheus/src/github.com/prometheus/prometheus/promql/query_logger.go:108 +0x131
    2020-01-11_12:29:39.00255 main.main()
    2020-01-11_12:29:39.00259 /var/cache/omnibus/src/prometheus/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:361 +0x52bd
    ^CTraceback (most recent call last):
    5: from /opt/gitlab/embedded/bin/omnibus-ctl:23:in `<main>'
    4: from /opt/gitlab/embedded/bin/omnibus-ctl:23:in `load'
    3: from /opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/omnibus-ctl-0.6.0/bin/omnibus-ctl:31:in `<top (required)>'
    2: from /opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/omnibus-ctl-0.6.0/lib/omnibus-ctl.rb:746:in `run'
    1: from /opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/omnibus-ctl-0.6.0/lib/omnibus-ctl.rb:584:in `tail'
    /opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/omnibus-ctl-0.6.0/lib/omnibus-ctl.rb:584:in `system': Interrupt

  3. Running sudo gitlab-ctl reconfigure also fails:

    • storage_directory[/var/opt/gitlab/backups] action create

      • ruby_block[directory resource: /var/opt/gitlab/backups] action run

        ================================================================================
        Error executing action `run` on resource 'ruby_block[directory resource: /var/opt/gitlab/backups]'

        Errno::EPERM

        'root' cannot chown /var/opt/gitlab/backups. If using NFS mounts you will need to re-export them in 'no_root_squash' mode and try again.
        Operation not permitted @ apply2files - /var/opt/gitlab/backups

        ================================================================================
        Error executing action `create` on resource 'storage_directory[/var/opt/gitlab/backups]'
        ================================================================================

      Errno::EPERM

      ruby_block[directory resource: /var/opt/gitlab/backups] (/opt/gitlab/embedded/cookbooks/cache/cookbooks/package/resources/storage_directory.rb line 34) had an error: Errno::EPERM: 'root' cannot chown /var/opt/gitlab/backups. If using NFS mounts you will need to re-export them in 'no_root_squash' mode and try again.
      Operation not permitted @ apply2files - /var/opt/gitlab/backups

      System Info:

      chef_version=14.13.11
      platform=debian
      platform_version=9.11
      ruby=ruby 2.6.3p62 (2019-04-16 revision 67580) [x86_64-linux]
      program_name=/opt/gitlab/embedded/bin/chef-client
      executable=/opt/gitlab/embedded/bin/chef-client

    Running handlers:
    There was an error running gitlab-ctl reconfigure:

    storage_directory[/var/opt/gitlab/backups] (gitlab::gitlab-rails line 116) had an error: Errno::EPERM: ruby_block[directory resource: /var/opt/gitlab/backups] (/opt/gitlab/embedded/cookbooks/cache/cookbooks/package/resources/storage_directory.rb line 34) had an error: Errno::EPERM: 'root' cannot chown /var/opt/gitlab/backups. If using NFS mounts you will need to re-export them in 'no_root_squash' mode and try again.
    Operation not permitted @ apply2files - /var/opt/gitlab/backups

    Running handlers complete
    Chef Client failed. 0 resources updated in 06 seconds

Hi,

please don’t create multiple topics for one problem; it doesn’t speed up answers.

Regarding the Prometheus error: try disabling the integration prior to upgrading and see whether the upgrade can complete. This sounds like a problem with Prometheus and its newly introduced query logging (maybe disable that as well). A rough sketch follows.
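For example, something like this (just a sketch, assuming the bundled Prometheus from the Omnibus package and that you can do without monitoring while you upgrade):

    # Temporarily disable the bundled Prometheus in /etc/gitlab/gitlab.rb:
    #   prometheus['enable'] = false
    # then apply the change and retry the package upgrade:
    sudo gitlab-ctl reconfigure
    sudo apt-get update && sudo apt-get install gitlab-ce   # or gitlab-ee, whichever you run
    # once the upgrade succeeds, set prometheus['enable'] = true again and reconfigure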

Regarding the permission problem for /var/opt/gitlab/backups: verify that the GitLab user owns that directory and that nothing else is holding a lock on it. If it really is on an NFS mount, it may be exported read-only (or with root_squash) for some reason; a few commands to check are sketched below.
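For example (the path and the git user below are the Omnibus defaults; adjust if yours differ):

    # Check ownership of the backups directory (the Omnibus default owner is the git user):
    ls -ld /var/opt/gitlab/backups
    # If it is a plain local directory, make the GitLab user the owner and re-run reconfigure:
    sudo chown git /var/opt/gitlab/backups
    sudo gitlab-ctl reconfigure
    # If it is an NFS mount, check the mount options and the server-side export;
    # root_squash stops root from chown-ing on the client, which matches the Errno::EPERM:
    mount | grep /var/opt/gitlab/backups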

Cheers,
Michael

This seems to be caused by a bug in the Prometheus query logger, which panics while reading an existing queries.active file:

As suggested in the GitHub issue, I renamed my queries.active file to queries.active.old and GitLab started fine again.
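For anyone hitting the same thing, the workaround boils down to something like this (the path assumes the default Omnibus Prometheus data directory; I moved the file rather than deleting it, in case it is needed later):

    # Move the stale queries.active file aside, then restart Prometheus:
    sudo mv /var/opt/gitlab/prometheus/data/queries.active \
            /var/opt/gitlab/prometheus/data/queries.active.old
    sudo gitlab-ctl restart prometheus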
