Advanced search - Elasticsearch migration halted before migrating to 15.11.4-ee

Recently we wanted to upgrade GitLab to version 15.11.4-ee, and in Advanced Search we discovered the error "Elasticsearch migration halted".

In the Advanced Search migrations index, I checked when the last migration completed successfully:

curl -s "http://localhost:9200/gitlab-production-migrations/_search?q=*" | jq '.hits.hits[0]'
{
  "_index": "gitlab-production-migrations",
  "_id": "20230111142636",
  "_score": 1,
  "_source": {
    "completed": true,
    "state": {},
    "name": "AddInternalToNotes",
    "started_at": "2023-02-24T07:00:03.553Z",
    "completed_at": "2023-02-24T07:00:03.553Z"
  }
}
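One caveat about the check above: `q=*` gives every hit the same score, so `hits[0]` is not guaranteed to be the most recent migration. A minimal sketch of a sorted query follows. It assumes that `completed_at` is mapped as a date in this index (not verified here) and that Elasticsearch listens on localhost:9200 as in the curl command; the function names are hypothetical.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # same host as in the curl command above
MIGRATIONS_INDEX = "gitlab-production-migrations"


def latest_completed_query(size=1):
    """Build a search body for the most recently completed migrations."""
    return {
        "size": size,
        "query": {"term": {"completed": True}},
        # Assumption: completed_at is mapped as a date and therefore sortable.
        "sort": [{"completed_at": {"order": "desc"}}],
    }


def fetch_latest_completed(es_url=ES_URL, index=MIGRATIONS_INDEX, size=1):
    """POST the query to the _search endpoint and return the list of hits."""
    body = json.dumps(latest_completed_query(size)).encode("utf-8")
    request = urllib.request.Request(
        f"{es_url}/{index}/_search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["hits"]["hits"]


# Usage (needs a reachable Elasticsearch node):
#   for hit in fetch_latest_completed(size=5):
#       print(hit["_id"], hit["_source"]["name"], hit["_source"]["completed_at"])
```

Sorting on `completed_at` rather than on `_id` is a deliberate choice here, since sorting on `_id` may be disabled on the cluster.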

From the Elasticsearch logs:

[2023-05-15T09:46:12,674][WARN ][o.e.t.LoggingTaskListener] [gl-advanced-search] 18016975 failed with exception
org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution of coordinating operation [coordinating_and_primary_bytes=0, replica_bytes=0, all_bytes=0, coordinating_operation_bytes=442971850, max_coordinating_and_primary_bytes=415655526]
	at org.elasticsearch.index.IndexingPressure.markCoordinatingOperationStarted(IndexingPressure.java:84) ~[elasticsearch-8.5.3.jar:?]
	at org.elasticsearch.action.bulk.TransportBulkAction.doExecute(TransportBulkAction.java:192) ~[elasticsearch-8.5.3.jar:?]
	at org.elasticsearch.action.bulk.TransportBulkAction.doExecute(TransportBulkAction.java:88) ~[elasticsearch-8.5.3.jar:?]
	at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:86) ~[elasticsearch-8.5.3.jar:?]
	at org.elasticsearch.action.support.ActionFilter$Simple.apply(ActionFilter.java:53) ~[elasticsearch-8.5.3.jar:?]
	at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:84) ~[elasticsearch-8.5.3.jar:?]
	at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:61) ~[elasticsearch-8.5.3.jar:?]
	at org.elasticsearch.tasks.TaskManager.registerAndExecute(TaskManager.java:202) ~[elasticsearch-8.5.3.jar:?]
	at org.elasticsearch.client.internal.node.NodeClient.executeLocally(NodeClient.java:112) ~[elasticsearch-8.5.3.jar:?]
	at org.elasticsearch.client.internal.node.NodeClient.doExecute(NodeClient.java:90) ~[elasticsearch-8.5.3.jar:?]
	at org.elasticsearch.client.internal.support.AbstractClient.execute(AbstractClient.java:380) ~[elasticsearch-8.5.3.jar:?]
	at org.elasticsearch.client.internal.FilterClient.doExecute(FilterClient.java:57) ~[elasticsearch-8.5.3.jar:?]
	at org.elasticsearch.client.internal.ParentTaskAssigningClient.doExecute(ParentTaskAssigningClient.java:55) ~[elasticsearch-8.5.3.jar:?]
	at org.elasticsearch.client.internal.support.AbstractClient.execute(AbstractClient.java:380) ~[elasticsearch-8.5.3.jar:?]
	at org.elasticsearch.client.internal.support.AbstractClient.bulk(AbstractClient.java:460) ~[elasticsearch-8.5.3.jar:?]
	at org.elasticsearch.action.bulk.Retry$RetryHandler.execute(Retry.java:207) ~[elasticsearch-8.5.3.jar:?]
	at org.elasticsearch.action.bulk.Retry$RetryHandler.lambda$retry$3(Retry.java:138) ~[elasticsearch-8.5.3.jar:?]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:825) ~[elasticsearch-8.5.3.jar:?]
	at org.elasticsearch.threadpool.ThreadPool$1.run(ThreadPool.java:436) ~[elasticsearch-8.5.3.jar:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:577) ~[?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:317) ~[?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
	at java.lang.Thread.run(Thread.java:1589) ~[?:?]

It looks like ES does not have enough memory. The current configuration is a 4 GB heap. I found a similar issue, "Elasticsearch Bulk Index Requests Rejection", but the fix suggested there did not work for me because I could not set indexing_pressure.memory.limit for ES. From the docs, it looks like this setting is only available for Elasticsearch on a paid plan. We then increased the heap to 8 GB and ran the migration again. Judging by the logs it went fine, but the index gitlab-production-migrations does not show any new successful migration; the last one still has the same id, 20230111142636.
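To put numbers on the rejection, here is a small sketch that parses the byte counters from the log message above and compares the bulk request size with the limit. The 10% figure is an assumption: it is the documented default for indexing_pressure.memory.limit, and whether this node actually runs with the default has not been confirmed.

```python
import re

# The rejection message copied from the Elasticsearch log above.
LOG_LINE = (
    "rejected execution of coordinating operation "
    "[coordinating_and_primary_bytes=0, replica_bytes=0, all_bytes=0, "
    "coordinating_operation_bytes=442971850, "
    "max_coordinating_and_primary_bytes=415655526]"
)

MIB = 1024 ** 2
GIB = 1024 ** 3


def parse_counters(line):
    """Extract every name=value byte counter from the rejection message."""
    return {name: int(value) for name, value in re.findall(r"(\w+)=(\d+)", line)}


counters = parse_counters(LOG_LINE)
request_bytes = counters["coordinating_operation_bytes"]
limit_bytes = counters["max_coordinating_and_primary_bytes"]

print(f"bulk request: {request_bytes / MIB:.1f} MiB")
print(f"limit:        {limit_bytes / MIB:.1f} MiB")
print(f"over by:      {(request_bytes - limit_bytes) / MIB:.1f} MiB")

# Assumption: the limit is the default 10% of the JVM heap.
LIMIT_FRACTION = 0.10
implied_heap = limit_bytes / LIMIT_FRACTION
print(f"implied heap: {implied_heap / GIB:.2f} GiB")

# With twice the heap the same default would allow twice as many bytes.
doubled_limit = 2 * limit_bytes
print(f"fits with doubled heap: {request_bytes <= doubled_limit}")
```

If the assumption holds, the limit in the log corresponds to a heap of just under 4 GiB, and a single bulk request of roughly 420 MiB exceeded it. That would be consistent with the rejections disappearing once the heap was raised to 8 GB.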

Current resource usage of ES:

  • MEM: 4.6GB HEAP
  • Disk usage: 6.3G

Questions:

  • Why is there no information about the latest migration run in the index gitlab-production-migrations? How can I determine whether a migration has completed?
  • 4.6 GB of heap for 6.3 GB of data looks like a lot. I can't imagine the heap usage with larger indices. Is heap usage like this normal?
  • Why could I not change indexing_pressure.memory.limit?
  • Is there a better way to solve this issue?
  • Does anybody have experience with rollover indices for GitLab? Right now there is one index each for commits, migrations, users, etc., so each index will keep growing until it hits the limits of ES.

Thank you in advance.