I am running a test suite composed of 741 tests and 3390 assertions with PHPUnit in the context of a Laravel application. Most of those tests are integration tests which use a database connexion, redis and s3 services in the CI. I am producing a code coverage report with this test run. Here is the command I run in the CI.
I am using Gitlab.com CI with the shared runners.
vendor/bin/phpunit --verbose --testdox --colors=never --exclude slow --stop-on-error --stop-on-failure --coverage-clover coverage.xml
Everythings works fine. The testing process is long, but expected. Every test takes between 0.5 and 2.5 seconds in general. But later on, when I have about 1000 lines of output in my CI job console, my tests suddenly take about 5-7 minutes each to complete. Eventually, I end up hitting the timeout and the job fails. Here is an excerpt of the PHPUnit output with the timings.
Reservable Controller
✔ List reservables for organization [1433.97 ms]
✔ List reservables for organization does not show private scoped reservables [407.01 ms]
✔ List reservables for organization filter by category [398.82 ms]
User Controller
✔ Index [1093.55 ms]
✔ Index with user type filter [720.10 ms]
✔ Index with university affiliation filter [8091.16 ms]
✔ Index with profile and formation [15232.75 ms]
Organization Controller
✔ Me [1549.88 ms]
Publication Controller
✔ List publication for specific user [555.23 ms]
✔ List publication for specific user with dois [439.23 ms]
Activity Report
✔ Get new or updated dataset count for user projects [2517.39 ms]
✔ Get new or updated dataset count for user projects with multiple projects [2703.64 ms]
✔ Get new or updated dataset count for user projects count only updated [2061.95 ms]
✔ Get team most active user [5881.47 ms]
✔ Get team most active user ex aequo [18263.78 ms]
✔ Get team most active user none [445702.22 ms]
✔ Get team most active user past activity doesnt count [232573.56 ms]
ERROR: Job failed: execution took longer than 1h0m0s seconds
We can clearly see that the time it takes to complete drastically changes at Get team most active user ex aequo
, which is very similar in terms of work load than the test just before.
I also noted that:
- When I run the tests on my local machine, there is no drastical change in test time to complete.
- I also tagged all the tests from Activity Report as
@group slow
and when running the tests again by excluding those, other unrelated tests were slow after, again, a thousand lines of output.
Here is the relevant part of my .gitlab.ci.yml
file
services:
- mariadb:10.3
- name: minio/minio
alias: minio
command: ['server', '--address', ':9005', '/data']
- redis
- name: docker:dind
entrypoint: [ "env", "-u", "DOCKER_HOST" ]
command: [ "dockerd-entrypoint.sh" ]
backend_tests:
image: registry.gitlab.com/redacted/redacted
stage: test
interruptible: true
dependencies:
- composer
artifacts:
when: always
paths:
- src/api/storage/logs/
expire_in: 1 day
variables:
XDEBUG_MODE: coverage
SENTRY_RELEASE: $CI_COMMIT_SHORT_SHA
rules:
- if: $CI_MERGE_REQUEST_IID
changes:
- src/api/*
- if: $CI_COMMIT_TAG
changes:
- src/api/*
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
changes:
- src/api/*
script:
- cd src/api
- cp .env.gitlab .env
- rm phpunit.xml && cp phpunit.gitlab.xml phpunit.xml
- php artisan key:generate
- php artisan migrate:fresh
- php artisan db:seed --class TestingSeeder
- vendor/bin/phpunit --verbose --testdox --colors=never --exclude slow --stop-on-error --stop-on-failure --coverage-clover coverage.xml
- bash <(curl -s https://codecov.io/bash)
I don’t really have a good idea where to start to diagnose this problem. It takes a lot of time to try something out as each iteration takes 20 min to run and also I am burning my CI minutes very fast doing so.
My gut feeling is that it seems like if the runner was suddenly starting to swap memory on disk. But this is only a wild guest based on nothing.
Anyone has experienced something similar or has an idea to where to start looking?