Gitlab selfhosted database size over 27 GB

Hi,
We have a selfhosted Gitlab and the database size increased a while ago over 27GB.
There are 3 huge tables:

                           Table                           |    Size
-----------------------------------------------------------+------------
 pm_package_versions                                       | 13 GB
 pm_package_version_licenses                               | 9949 MB
 pm_packages                                               | 2223 MB
 pm_affected_packages                                      | 348 MB
 pm_advisories                                             | 127 MB
 raw_usage_data                                            | 3720 kB
 events                                                    | 2824 kB

I’ve seen this post and followed the instructions, but it doesn’t change anything:

The tables don’t shrink…

I also tried this:

gitlabhq_production=# VACUUM pm_package_versions;
ERROR:  canceling statement due to statement timeout
CONTEXT:  while vacuuming index "pm_package_versions_pkey" of relation "public.pm_package_versions"
gitlabhq_production=# VACUUM pm_affected_packages;
VACUUM
gitlabhq_production=# VACUUM pm_advisories;
VACUUM
gitlabhq_production=# VACUUM pm_packages;
VACUUM
gitlabhq_production=# VACUUM pm_package_version_licenses;
ERROR:  canceling statement due to statement timeout
gitlabhq_production=#

The output from the Ruby console, concerning this:

--------------------------------------------------------------------------------
 Ruby:         ruby 3.1.4p223 (2023-03-30 revision 957bb7cb81) [x86_64-linux]
 GitLab:       16.7.3-ee (256ff8e8dae) EE
 GitLab Shell: 14.32.0
 PostgreSQL:   13.12
-----------------------------------------------------------[ booted in 126.97s ]
Feature.enabled?(:package_metadata_synchronization) && Feature.disable(:package_metadata_synchronization)
Loading production environment (Rails 7.0.8)
irb(main):001:0> Feature.enabled?(:package_metadata_synchronization) && Feature.
disable(:package_metadata_synchronization)
=> false
irb(main):002:0> PackageMetadata::PackageVersionLicense.count
=> 0
irb(main):003:0> PackageMetadata::PackageVersion.count
irb(main):004:0*
ActiveRecord::Base.connection.execute('SET statement_timeout TO 0')
PackageMetadata::PackageVersionLicense.delete_all
PackageMetadata::PackageVersion.delete_all
irb(main):005:0> ActiveRecord::Base.connection.execute('SET statement_timeout TO
 0')
=> #<PG::Result:0x00007fe2711b0cf0 status=PGRES_COMMAND_OK ntuples=0 nfields=0 cmd_tuples=0>
irb(main):006:0> PackageMetadata::PackageVersionLicense.delete_all
irb(main):007:0> PackageMetadata::PackageVersion.delete_all
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/activerecord-7.0.8/lib/active_record/connection_adapters/postgresql_adapter.rb:768:in `exec_params': PG::ConnectionBad: PQconsumeInput() FATAL:  terminating connection due to administrator command (ActiveRecord::StatementInvalid)
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
/opt/gitlab/embedded/lib/ruby/gems/3.1.0/gems/activerecord-7.0.8/lib/active_record/connection_adapters/postgresql_adapter.rb:768:in `exec_params': PQconsumeInput() FATAL:  terminating connection due to administrator command (PG::ConnectionBad)
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
irb(main):008:0>

Any ideas how to get rid of the 3 huge tables?

Best,
Chris

Hi Chris,

Maybe all this space is used by index of those tables.
If you run this you should be able to see the size of the index compared to the data itself in the table:

SELECT
   relname  as table_name,
   pg_size_pretty(pg_total_relation_size(relid)) As "Total Size",
   pg_size_pretty(pg_indexes_size(relid)) as "Index Size",
   pg_size_pretty(pg_relation_size(relid)) as "Actual Size"
   FROM pg_catalog.pg_statio_user_tables 
ORDER BY pg_total_relation_size(relid) DESC;

If that is the issue, you might have to rebuild those index.

Hope this helps.

Kind regards,

1 Like

Thanks for your answer, meanwhile I didn’t have enough time to test.
OK, now I’ve shrinked some tables, but now I have 2 giant tables…

gitlabhq_production=# SELECT                                                                                                              relname  as table_name,                                                                                                                pg_size_pretty(pg_total_relation_size(relid)) As "Total Size",                                                                         pg_size_pretty(pg_indexes_size(relid)) as "Index Size",                                                                                pg_size_pretty(pg_relation_size(relid)) as "Actual Size"                                                                               FROM pg_catalog.pg_statio_user_tables                                                                                               ORDER BY pg_total_relation_size(relid) DESC;
                        table_name                         | Total Size | Index Size | Actual Size
-----------------------------------------------------------+------------+------------+-------------
 pm_package_versions                                       | 14 GB      | 9373 MB    | 5129 MB
 pm_package_versions                                       | 14 GB      | 9373 MB    | 5129 MB
 pm_packages                                               | 1658 MB    | 493 MB     | 1158 MB
 pm_affected_packages                                      | 273 MB     | 92 MB      | 180 MB
 pm_advisories                                             | 123 MB     | 5152 kB    | 110 MB
 raw_usage_data                                            | 4016 kB    | 32 kB      | 24 kB
 events                                                    | 2936 kB    | 2616 kB    | 280 kB
 ci_builds                                                 | 2472 kB    | 1696 kB    | 512 kB
 merge_request_diff_files                                  | 2032 kB    | 48 kB      | 792 kB
 analytics_usage_trends_measurements                       | 1920 kB    | 816 kB     | 1080 kB

And if I do this:
gitlabhq_production=# REINDEX TABLE pm_package_versions;
I get an error after a short time:
ERROR: canceling statement due to statement timeout

Do I have to rise Memory? Cores?

Best,
Chris