Workspaces agent fails reconciliation loop

I’ve spun up a MicroK8s Kubernetes cluster and got the agent working. It seems to connect to my GitLab instance just fine, but its reconciliation loop is failing: the API call it makes is 404ing, and the agent doesn’t know what to do about that.

I’ve got everything behind a reverse proxy, so things might be a little wonky because of that, but this particular problem doesn’t necessarily seem related to it.

Here are the logs from the agent trying to run its reconciliation:

{"level":"info","time":"2024-01-24T23:59:11.946Z","msg":"starting partial update","mod_name":"remote_development","agent_id":1}
{"level":"debug","time":"2024-01-24T23:59:11.946Z","msg":"Running reconciliation loop","mod_name":"remote_development","agent_id":1}
{"level":"debug","time":"2024-01-24T23:59:11.946Z","msg":"Making GitLab request","mod_name":"remote_development","agent_id":1}
{"level":"debug","time":"2024-01-24T23:59:12.035Z","msg":"Made request to the Rails API","mod_name":"remote_development","status_code":404,"request_id":"01HMYYKP36E9Y8R60R29Z239CH","duration_in_ms":88,"agent_id":1}
{"level":"debug","time":"2024-01-24T23:59:12.035Z","msg":"Reconciliation loop ended","mod_name":"remote_development","agent_id":1}
{"level":"error","time":"2024-01-24T23:59:12.035Z","msg":"Remote Dev - partial sync cycle ended with error","mod_name":"remote_development","error":"unexpected status code: 404","agent_id":1}
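As a sketch, failing calls like this can be filtered out of the agent’s JSON logs with jq; the sample lines below are copied from the output above, and against a live cluster you would pipe the agent pod’s logs in instead of a file:

```shell
# Sketch: filter the agent's JSON log lines for failed Rails API calls.
# The sample lines are copied from the log output above.
cat > agent.log <<'EOF'
{"level":"info","msg":"starting partial update","agent_id":1}
{"level":"debug","msg":"Made request to the Rails API","status_code":404,"request_id":"01HMYYKP36E9Y8R60R29Z239CH","agent_id":1}
EOF
# keep only entries whose status_code is an error, and print the useful fields
jq -c 'select((.status_code // 0) >= 400) | {status_code, request_id}' agent.log
```

This prints just the 404 entry with its request_id, which is handy for matching against the Rails-side logs.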

Specifically, it’s looking for /api/v4/internal/kubernetes/modules/remote_development/reconcile, according to my reverse proxy logs.

When I go there myself, I get a 404, and I can’t find any mention of that API path in the documentation.

What’s gone wrong? I appear to be the only person online with this problem, since my googling has turned up nothing.

Here’s another bit of info: I’ve traced this through the logs, and it seems the request reaches gitlab-rails and 404s there:

{
	"time": "2024-03-20T19:38:34.908Z",
	"severity": "INFO",
	"duration_s": 0.00119,
	"db_duration_s": 0.0,
	"view_duration_s": 0.00119,
	"status": 404,
	"method": "POST",
	"path": "/api/v4/internal/kubernetes/modules/remote_development/reconcile",
	"params": [
		{
			"key": "update_type",
			"value": "full"
		},
		{
			"key": "workspace_agent_infos",
			"value": []
		}
	],
	"host": "gitlab.hostname.tld",
	"remote_ip": ",,",
	"ua": "gitlab-kas/v16.9.2/d5b98591",
	"route": "/api/:version/*path",
	"db_count": 1,
	"db_write_count": 0,
	"db_cached_count": 0,
	"db_replica_count": 0,
	"db_primary_count": 1,
	"db_main_count": 1,
	"db_ci_count": 0,
	"db_main_replica_count": 0,
	"db_ci_replica_count": 0,
	"db_replica_cached_count": 0,
	"db_primary_cached_count": 0,
	"db_main_cached_count": 0,
	"db_ci_cached_count": 0,
	"db_main_replica_cached_count": 0,
	"db_ci_replica_cached_count": 0,
	"db_replica_wal_count": 0,
	"db_primary_wal_count": 0,
	"db_main_wal_count": 0,
	"db_ci_wal_count": 0,
	"db_main_replica_wal_count": 0,
	"db_ci_replica_wal_count": 0,
	"db_replica_wal_cached_count": 0,
	"db_primary_wal_cached_count": 0,
	"db_main_wal_cached_count": 0,
	"db_ci_wal_cached_count": 0,
	"db_main_replica_wal_cached_count": 0,
	"db_ci_replica_wal_cached_count": 0,
	"db_replica_duration_s": 0.0,
	"db_primary_duration_s": 0.007,
	"db_main_duration_s": 0.007,
	"db_ci_duration_s": 0.0,
	"db_main_replica_duration_s": 0.0,
	"db_ci_replica_duration_s": 0.0,
	"cpu_s": 0.029511,
	"mem_objects": 8861,
	"mem_bytes": 1348226,
	"mem_mallocs": 3266,
	"mem_total_bytes": 1702666,
	"pid": 1525,
	"worker_id": "puma_13",
	"rate_limiting_gates": [],
	"correlation_id": "01HSENYQFNZ1K6CSFH68EPHNA3",
	"meta.caller_id": "* /api/:version/*path",
	"meta.remote_ip": "",
	"meta.feature_category": "not_owned",
	"meta.client_id": "ip/",
	"content_length": "49",
	"request_urgency": "default",
	"target_duration_s": 1
}
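One detail worth noting in that entry: route is the API catch-all /api/:version/*path rather than a concrete endpoint, i.e. the request didn’t match any registered route. To chase a single request like this across GitLab’s various JSON logs, grepping by correlation ID is the usual approach; a minimal sketch, where the echoed line stands in for the real /var/log/gitlab/gitlab-rails/api_json.log:

```shell
# Sketch: find every log line for one request via its correlation ID.
# The echoed line stands in for the real api_json.log on the GitLab host;
# the ID is the correlation_id from the Rails log entry above.
echo '{"status":404,"route":"/api/:version/*path","correlation_id":"01HSENYQFNZ1K6CSFH68EPHNA3"}' > api_json.log
grep -n '01HSENYQFNZ1K6CSFH68EPHNA3' api_json.log
```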

I’ve added a comment to issue #440241 (“Workspace reconcile should be resilient to any individual workspace errors”) as an example of how to improve the error messages. I’d suggest adding your feedback there, too.

From the original problem, I suspect the reverse proxy is causing trouble with the connection. Maybe there is an option to run the Kubernetes cluster with an external cluster IP address/domain instead, to verify that the GitLab instance and the remote dev setup work as expected?

I’ve started a thread in the Discord support server with some more info.

Copied below as of now

I have run grep -Rnw . -e '01HSEQ2MDYTA7SC3V9M1BP6R3T' and gotten nothing other than the logs linked there. I’ll put my docker-compose file there as well for completeness.

Further, when I hop into the Docker container running gitlab-ce and run GITLAB_LOG_LEVEL=0 gitlab-ctl restart, I get some debug output, but nothing new with any of the correlation IDs I grab (after the log update, of course).

In GitLab (the web UI for adding clusters to the repo) there appears to be an initial handshake, but after that it’s all the reconcile failures, so the agent stops being considered connected.

A new repo I made in the GitLab instance appears to connect fine; I’ll continue testing with that.

To be clear, even with that done and a workspace config set up, the instructions at Workspaces | GitLab don’t work (I only see the Web IDE as an option).

In the logs again, still in debug mode, nothing shows up when searching for “workspace” except some GraphQL entries that appear unrelated.

config.yaml just has

enabled: true

Removing that stops the errors fwiw

I updated it to the following and it’s still not happy:

enabled: true
namespace: "gitlab-workspaces"
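For comparison, the workspaces docs describe the agent’s config.yaml with the module settings nested under a remote_development key rather than at the top level. A hedged sketch of that shape, written out via a heredoc (the dns_zone value is a placeholder, and should match the domain the workspaces proxy serves):

```shell
# Hypothetical config.yaml shape per the workspaces docs: settings nested
# under remote_development; the dns_zone value here is a placeholder.
cat > config.yaml <<'EOF'
remote_development:
  enabled: true
  dns_zone: "workspaces.hostname.tld"
EOF
cat config.yaml
```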

I tried changing the agent folder name; that made it stop throwing errors, but I assume that’s because it no longer recognized any config.yaml to use.

To be clear, there’s no gitlab_workspaces_proxy installed here, but it’s not obvious that that would break the reconciliation call that’s throwing the error in the agent.

I tried adding a devfile from a random GitLab example project; no luck.

I’ve now installed the workspaces proxy and can connect to it over workspaces.hostname.tld (though not yet the wildcard subdomain, due to Let’s Encrypt restrictions).

So now I’m not sure what to try.

Please point me in the right direction and I’m happy to debug for you, but as of now this seems impossible. I’ve spent at least 20 hours on this over the past few days and gotten nowhere.

Hi @garrett.

Thanks for the feedback and testing!

Based on this:

the docker container running gitlab-ce

…I assume you may be running a Free-tier GitLab installation, but Workspaces is only available on Premium or Ultimate: Workspaces | GitLab

Is this the case? If so, that’s the issue.

As far as the logging goes, based on this line in ee/lib/ee/api/internal/kubernetes.rb (commit 6ece5c344bd79d417e3240efd7c6dd49a7f62980):

…I would expect you to see the message "remote_development" licensed feature is not available somewhere in the logs.

However, it may be getting swallowed somewhere higher in the stack when it is turned into a 404.
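A quick way to confirm whether that message appears anywhere is a recursive grep over the log directory; a sketch, where the sample file stands in for the real /var/log/gitlab tree on the GitLab host:

```shell
# Sketch: recursively grep the log directory for the licensed-feature message.
# The sample file below stands in for the real /var/log/gitlab tree.
mkdir -p logs
echo '"remote_development" licensed feature is not available' > logs/application.log
grep -Rn 'licensed feature is not available' logs/
```

If the grep comes back empty against the real logs, the message is being swallowed as described above.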

I agree that the error messaging in this case could definitely be improved, as Michael mentioned above. We are well aware of this, and already have several issues related to revisiting these areas.

Let us know if you have more questions.

– Chad

Ah, yes, I was under the impression that it was available for free users as well, so long as it was self-hosted.

Is that ever planned to change, or is it likely to stay Premium+ only?

FWIW, I ran grep -Rnw . -e 'licensed feature' from /var/log/gitlab in the Docker container running this and got nothing, so yeah, that message is getting eaten somewhere.

Well, that feels bad. I’m not sure where I misread the docs about which tiers it was available in, but I sure wasted a bunch of time.

I have a Kubernetes cluster now and feel like I should use it. CI/CD deployments look like they’re available to free users, but from what I can tell I can’t set that up.

On the operate->environments screen, I have no options in the dropdown despite the agent supposedly being connected. Where do I head to troubleshoot that?

On the operate->environments screen, I have no options in the dropdown despite the agent supposedly being connected. Where do I head to troubleshoot that?

If the agent is correctly configured and connected, you should see the agent under Operate -> Kubernetes

This is documented here: Managing the agent for Kubernetes instances | GitLab

It should look like this:

If the agent isn’t connected, see the other docs in that section to make sure everything is correctly configured.

As for the free tier, we appreciate your feedback, and are sorry that you wasted time trying to set up the feature. I can let you know that we have opened an internal discussion based on your feedback.

Don’t get me wrong, it’s on me for wasting my time. I greatly appreciate the work y’all do, and I can’t expect everything to be free. I just have no idea whether the MO for features is premium first → free later or what, so I figured I’d ask.

The agent is connected - it looks exactly like the remotedev line in yours, except no warning triangle about the version.

But then you see what it looks like when I go to select the agent to be the host for the environment.

To that end, the only thing I have in my config.yaml for it is

    level: debug
    grpc_level: warn

And the docs are not clear at all about what I have to put in that config file to get it to show up as an agent for my environment.

I’m not sure about that, I’ve only used the agent in the context of Remote Development Workspaces. The agent itself is a separate feature owned by a separate group.

I’ll forward a link to your message to the Environments Group which owns the GitLab Agent for Kubernetes feature.

You need to configure user access documented at Grant users Kubernetes access | GitLab
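For reference, the user-access docs describe granting access in the agent’s config.yaml with a user_access block; a hedged sketch of that shape (the project path is a placeholder for your own project):

```shell
# Hypothetical user_access block per the "Grant users Kubernetes access"
# docs; access_as: agent means users act via the agent's service account.
# The project path is a placeholder for your own project.
cat > config.yaml <<'EOF'
user_access:
  access_as:
    agent: {}
  projects:
    - id: your-group/your-project
EOF
cat config.yaml
```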

I agree with you that it’s not easy to get from this page: Dashboard for Kubernetes | GitLab to the page I linked. We tried to mention it in the prerequisites section. If you have ideas for improvement, please open an MR to contribute them to GitLab.
