GitLab database noticeable size increase since 16.3.3

Hi There,

I am self-hosting GitLab EE Ultimate for personal projects, homelab, etc.
On the 18th of September, GitLab EE 16.3.3 was installed by my automation. Because this happens automatically, I did not look closely at the upgrade, and GitLab worked fine afterwards.
However, what I did not notice is that the GitLab database kept growing until it filled up the whole disk. The PostgreSQL database went from around 2 GB to 4.8 GB.
I have now upgraded to GitLab EE 16.3.4, and the database growth appears to have stopped.
I looked into the database itself, and the growth might be due to pm_packages, pm_packages_versions and pm_packages_versions_licenses.
Has any of you noticed the same behavior? Is there anything I could do to reclaim this space? Or is it working as intended?
I have already tried running a vacuum, but it did not free up anything.
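
For anyone wanting to repeat that check, here is a minimal sketch that builds a query listing the largest tables whose names start with a given prefix. The `pm_` prefix default, the function name and the row limit are illustrative choices, not anything GitLab ships. The query relies on PostgreSQL's `pg_total_relation_size`, which counts indexes and TOAST data as well as the table itself.

```python
import re


def table_size_query(prefix: str = "pm_", limit: int = 10) -> str:
    """Return SQL listing the largest user tables whose name starts with `prefix`."""
    # Only allow simple identifier characters so the prefix can be inlined safely.
    if not re.fullmatch(r"[A-Za-z0-9_]+", prefix):
        raise ValueError(f"unexpected characters in prefix: {prefix!r}")
    if limit < 1:
        raise ValueError("limit must be positive")
    return (
        "SELECT relname,\n"
        "       pg_size_pretty(pg_total_relation_size(relid)) AS total_size\n"
        "FROM pg_catalog.pg_statio_user_tables\n"
        f"WHERE starts_with(relname::text, '{prefix}')\n"
        "ORDER BY pg_total_relation_size(relid) DESC\n"
        f"LIMIT {limit};"
    )


if __name__ == "__main__":
    # Print the query so it can be pasted into a psql session.
    print(table_size_query())
```

The printed statement can be pasted into a psql session connected to the GitLab database. Running it a day apart shows which tables are actually growing.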

Kind regards,

Those tables have started to grow again.
Looking at the data, it seems that my self-hosted GitLab is pulling data from gitlab.com.
Here is a sample from pm_packages:

id	purl_type	"name"	created_at	updated_at	licenses
2165253	4	github.com/tencentcloud/tencentcloud-sdk-go/tencentcloud/rum	2023-09-18 09:42:00.918 +1200	2023-09-21 17:38:18.422 +1200	[[2], "1.0.221", "1.0.752", []]
2167029	4	github.com/tencentcloud/tencentcloud-sdk-go/tencentcloud/eb	2023-09-18 09:42:10.084 +1200	2023-09-21 17:50:50.485 +1200	[[2], "1.0.326", "1.0.752", []]
2165211	4	github.com/kiegroup/kogito-operator/test	2023-09-18 09:41:59.877 +1200	2023-09-21 17:50:54.344 +1200	[[2], "0.0.0-20210916151339-fa5dd0381930", "0.0.0-20230919182244-28b2d3dc945e", []]
119539	1	designcafe/cafeapi	2023-09-18 05:41:43.245 +1200	2023-09-20 02:25:55.537 +1200	[[11], "1.0.0", "1.0.0", []]
119536	1	designbycode/guardian	2023-09-18 05:41:43.245 +1200	2023-09-20 02:25:55.537 +1200	[[11], "1.0.0", "1.4.1", []]

Any idea what is happening? I can give more storage to my instance but don’t see/understand the usefulness of this data.

OK, last post from me, as I have resolved this issue. Hopefully this helps somebody else.
In 16.3.3, package metadata sync is enabled by default:
Backport Enable sync with package metadata db by default

To disable this:

  1. Go to Admin Area > Settings > Security and Compliance
  2. Expand License Compliance
  3. Untick all options (or keep those you need) and save changes
  4. You can then follow this documentation to delete downloaded packages: License scanning of CycloneDX files | GitLab
  5. Finally, on the PostgreSQL server, you can run a vacuum or wait for autovacuum
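
On the last step: as far as I know, a plain VACUUM mostly marks the space of deleted rows as reusable inside the table files, while VACUUM FULL rewrites the table and hands the space back to the operating system, at the cost of an exclusive lock on the table while it runs. Below is a small sketch that prints the statements for either variant. The function name is made up, and the table names are copied from the first post, so check them against your own schema before running anything.

```python
import re

# Table names as written in the first post; verify them in your own database.
TABLES = ["pm_packages", "pm_packages_versions", "pm_packages_versions_licenses"]


def vacuum_statements(tables, full=False):
    """Build one VACUUM statement per table; full=True uses VACUUM FULL."""
    options = "FULL, VERBOSE, ANALYZE" if full else "VERBOSE, ANALYZE"
    statements = []
    for table in tables:
        # Reject anything that is not a plain identifier before quoting it.
        if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
            raise ValueError(f"unexpected table name: {table!r}")
        statements.append(f'VACUUM ({options}) "{table}";')
    return statements


if __name__ == "__main__":
    # VACUUM FULL blocks reads and writes on each table while it runs.
    for statement in vacuum_statements(TABLES, full=True):
        print(statement)
```

If the disk is not under pressure, the plain variant (or simply waiting for autovacuum) is the gentler option, since the freed space is reused by later writes anyway.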

Signing off :slight_smile:
