Self-hosted gitlab-kas behind SSL-terminating proxy gives 'GRPC::Unimplemented' when registering agent

We’re using self-hosted GitLab behind an SSL-terminating load balancer and trying to register an agent with a cluster. When we try to register an agent in our project, we receive an error saying GRPC::Unimplemented.

Looking at the gitlab-rails logs (tail -Fqn10000 /var/log/gitlab/gitlab-rails/*json.log | grep gitlab-kas), we see requests from gitlab-kas being answered with status 401, like this:

{"time":"2024-05-06T14:41:58.096Z","severity":"INFO","duration_s":0.00259,"db_duration_s":0.00041,"view_duration_s":0.00218,"status":401,"method":"GET","path":"/api/v4/internal/kubernetes/agent_info","params":[],"host":"127.0.0.1","remote_ip":"127.0.0.1","ua":"gitlab-kas/v16.11.1/v16.11.1","route":"/api/:version/internal/kubernetes/agent_info","db_count":1,"db_write_count":0,"db_cached_count":0,"db_txn_count":0,"db_replica_txn_count":0,"db_primary_txn_count":0,"db_main_txn_count":0,"db_ci_txn_count":0,"db_main_replica_txn_count":0,"db_ci_replica_txn_count":0,"db_replica_count":0,"db_primary_count":1,"db_main_count":1,"db_ci_count":0,"db_main_replica_count":0,"db_ci_replica_count":0,"db_replica_cached_count":0,"db_primary_cached_count":0,"db_main_cached_count":0,"db_ci_cached_count":0,"db_main_replica_cached_count":0,"db_ci_replica_cached_count":0,"db_replica_wal_count":0,"db_primary_wal_count":0,"db_main_wal_count":0,"db_ci_wal_count":0,"db_main_replica_wal_count":0,"db_ci_replica_wal_count":0,"db_replica_wal_cached_count":0,"db_primary_wal_cached_count":0,"db_main_wal_cached_count":0,"db_ci_wal_cached_count":0,"db_main_replica_wal_cached_count":0,"db_ci_replica_wal_cached_count":0,"db_replica_txn_max_duration_s":0.0,"db_primary_txn_max_duration_s":0.0,"db_main_txn_max_duration_s":0.0,"db_ci_txn_max_duration_s":0.0,"db_main_replica_txn_max_duration_s":0.0,"db_ci_replica_txn_max_duration_s":0.0,"db_replica_txn_duration_s":0.0,"db_primary_txn_duration_s":0.0,"db_main_txn_duration_s":0.0,"db_ci_txn_duration_s":0.0,"db_main_replica_txn_duration_s":0.0,"db_ci_replica_txn_duration_s":0.0,"db_replica_duration_s":0.0,"db_primary_duration_s":0.0,"db_main_duration_s":0.0,"db_ci_duration_s":0.0,"db_main_replica_duration_s":0.0,"db_ci_replica_duration_s":0.0,"cpu_s":0.009433,"mem_objects":6379,"mem_bytes":407744,"mem_mallocs":1696,"mem_total_bytes":662904,"pid":1427,"worker_id":"puma_2","rate_limiting_gates":[],"correlation_id":"2eeadc2c-9531-477d-afc1-e682c1807500","meta.caller_id":"GET /api/:version/internal/kubernetes/agent_info","meta.remote_ip":"127.0.0.1","meta.feature_category":"deployment_management","meta.client_id":"ip/127.0.0.1","request_urgency":"low","target_duration_s":5}
{"time":"2024-05-06T14:43:05.283Z","severity":"INFO","duration_s":0.00302,"db_duration_s":0.00041,"view_duration_s":0.00261,"status":401,"method":"GET","path":"/api/v4/internal/kubernetes/agent_info","params":[],"host":"127.0.0.1","remote_ip":"127.0.0.1","ua":"gitlab-kas/v16.11.1/v16.11.1","route":"/api/:version/internal/kubernetes/agent_info","db_count":1,"db_write_count":0,"db_cached_count":0,"db_txn_count":0,"db_replica_txn_count":0,"db_primary_txn_count":0,"db_main_txn_count":0,"db_ci_txn_count":0,"db_main_replica_txn_count":0,"db_ci_replica_txn_count":0,"db_replica_count":0,"db_primary_count":1,"db_main_count":1,"db_ci_count":0,"db_main_replica_count":0,"db_ci_replica_count":0,"db_replica_cached_count":0,"db_primary_cached_count":0,"db_main_cached_count":0,"db_ci_cached_count":0,"db_main_replica_cached_count":0,"db_ci_replica_cached_count":0,"db_replica_wal_count":0,"db_primary_wal_count":0,"db_main_wal_count":0,"db_ci_wal_count":0,"db_main_replica_wal_count":0,"db_ci_replica_wal_count":0,"db_replica_wal_cached_count":0,"db_primary_wal_cached_count":0,"db_main_wal_cached_count":0,"db_ci_wal_cached_count":0,"db_main_replica_wal_cached_count":0,"db_ci_replica_wal_cached_count":0,"db_replica_txn_max_duration_s":0.0,"db_primary_txn_max_duration_s":0.0,"db_main_txn_max_duration_s":0.0,"db_ci_txn_max_duration_s":0.0,"db_main_replica_txn_max_duration_s":0.0,"db_ci_replica_txn_max_duration_s":0.0,"db_replica_txn_duration_s":0.0,"db_primary_txn_duration_s":0.0,"db_main_txn_duration_s":0.0,"db_ci_txn_duration_s":0.0,"db_main_replica_txn_duration_s":0.0,"db_ci_replica_txn_duration_s":0.0,"db_replica_duration_s":0.0,"db_primary_duration_s":0.0,"db_main_duration_s":0.0,"db_ci_duration_s":0.0,"db_main_replica_duration_s":0.0,"db_ci_replica_duration_s":0.0,"cpu_s":0.010945,"mem_objects":6438,"mem_bytes":483120,"mem_mallocs":2431,"mem_total_bytes":740640,"pid":30000,"worker_id":"puma_3","rate_limiting_gates":[],"correlation_id":"79bea46d-b3de-4fd1-a943-cccfcb42cf0f","meta.caller_id":"GET /api/:version/internal/kubernetes/agent_info","meta.remote_ip":"127.0.0.1","meta.feature_category":"deployment_management","meta.client_id":"ip/127.0.0.1","request_urgency":"low","target_duration_s":5}
{"time":"2024-05-06T14:44:19.855Z","severity":"INFO","duration_s":0.00305,"db_duration_s":0.00071,"view_duration_s":0.00234,"status":401,"method":"GET","path":"/api/v4/internal/kubernetes/agent_info","params":[],"host":"127.0.0.1","remote_ip":"127.0.0.1","ua":"gitlab-kas/v16.11.1/v16.11.1","route":"/api/:version/internal/kubernetes/agent_info","db_count":1,"db_write_count":0,"db_cached_count":0,"db_txn_count":0,"db_replica_txn_count":0,"db_primary_txn_count":0,"db_main_txn_count":0,"db_ci_txn_count":0,"db_main_replica_txn_count":0,"db_ci_replica_txn_count":0,"db_replica_count":0,"db_primary_count":1,"db_main_count":1,"db_ci_count":0,"db_main_replica_count":0,"db_ci_replica_count":0,"db_replica_cached_count":0,"db_primary_cached_count":0,"db_main_cached_count":0,"db_ci_cached_count":0,"db_main_replica_cached_count":0,"db_ci_replica_cached_count":0,"db_replica_wal_count":0,"db_primary_wal_count":0,"db_main_wal_count":0,"db_ci_wal_count":0,"db_main_replica_wal_count":0,"db_ci_replica_wal_count":0,"db_replica_wal_cached_count":0,"db_primary_wal_cached_count":0,"db_main_wal_cached_count":0,"db_ci_wal_cached_count":0,"db_main_replica_wal_cached_count":0,"db_ci_replica_wal_cached_count":0,"db_replica_txn_max_duration_s":0.0,"db_primary_txn_max_duration_s":0.0,"db_main_txn_max_duration_s":0.0,"db_ci_txn_max_duration_s":0.0,"db_main_replica_txn_max_duration_s":0.0,"db_ci_replica_txn_max_duration_s":0.0,"db_replica_txn_duration_s":0.0,"db_primary_txn_duration_s":0.0,"db_main_txn_duration_s":0.0,"db_ci_txn_duration_s":0.0,"db_main_replica_txn_duration_s":0.0,"db_ci_replica_txn_duration_s":0.0,"db_replica_duration_s":0.0,"db_primary_duration_s":0.001,"db_main_duration_s":0.001,"db_ci_duration_s":0.0,"db_main_replica_duration_s":0.0,"db_ci_replica_duration_s":0.0,"cpu_s":0.009863,"mem_objects":6427,"mem_bytes":483056,"mem_mallocs":2429,"mem_total_bytes":740136,"pid":30000,"worker_id":"puma_3","rate_limiting_gates":[],"correlation_id":"8a99884e-6993-4a8d-a694-0ab2b43f0367","meta.caller_id":"GET /api/:version/internal/kubernetes/agent_info","meta.remote_ip":"127.0.0.1","meta.feature_category":"deployment_management","meta.client_id":"ip/127.0.0.1","request_urgency":"low","target_duration_s":5}
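
Since these entries are long, it can help to reduce them to the few fields that matter. Below is a minimal sketch in Ruby, assuming each log entry is one JSON object per line with the "ua", "status", "time", "method" and "path" fields seen above. The function names are hypothetical, not part of GitLab, and lines that do not parse as JSON are simply skipped.

```ruby
require "json"

# Parse one log line; return nil if it is not valid JSON
# (for example, a line that was wrapped by a terminal).
def parse_json_line(line)
  JSON.parse(line)
rescue JSON::ParserError
  nil
end

# Hypothetical helper: keep only requests whose user agent starts with
# "gitlab-kas" and that Rails answered with an error status (>= 400).
def failed_kas_requests(lines)
  lines.map { |line| parse_json_line(line) }.select do |entry|
    entry.is_a?(Hash) &&
      entry["ua"].to_s.start_with?("gitlab-kas") &&
      entry["status"].to_i >= 400
  end
end

# Hypothetical helper: one short summary string per failed request.
def summarize_requests(entries)
  entries.map { |e| [e["time"], e["status"], e["method"], e["path"]].join(" ") }
end
```

Calling summarize_requests(failed_kas_requests(File.foreach(path))) on one of the *json.log files (path being whichever file you choose) would give one short line per failed request instead of the full entries.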

The gitlab-kas process itself has logs like this:

2024-05-06_14:43:20.42933 {"level":"info","time":"2024-05-06T14:43:20.429Z","msg":"Running KAS gitlab-kas/v16.11.1/v16.11.1"}
2024-05-06_14:43:20.43335 {"level":"info","time":"2024-05-06T14:43:20.433Z","msg":"Using own private API URL","url":"grpc://localhost:8155"}
2024-05-06_14:43:20.43344 {"level":"info","time":"2024-05-06T14:43:20.433Z","msg":"[core] [Channel #1]Channel created"}
2024-05-06_14:43:20.43345 {"level":"info","time":"2024-05-06T14:43:20.433Z","msg":"[core] [Channel #1]original dial target is: \"passthrough:pipe\""}
2024-05-06_14:43:20.43348 {"level":"info","time":"2024-05-06T14:43:20.433Z","msg":"[core] [Channel #1]parsed dial target is: resolver.Target{URL:url.URL{Scheme:\"passthrough\", Opaque:\"pipe\", User:(*url.Userinfo)(nil), Host:\"\", Path:\"\", RawPath:\"\", OmitHost:false, ForceQuery:false, RawQuery:\"\", Fragment:\"\", RawFragment:\"\"}}"}
2024-05-06_14:43:20.43349 {"level":"info","time":"2024-05-06T14:43:20.433Z","msg":"[core] [Channel #1]Channel authority set to \"pipe\""}
2024-05-06_14:43:20.43359 {"level":"info","time":"2024-05-06T14:43:20.433Z","msg":"[core] [Server #2]Server created"}
2024-05-06_14:43:20.43359 {"level":"info","time":"2024-05-06T14:43:20.433Z","msg":"[core] [Server #3]Server created"}
2024-05-06_14:43:20.43368 {"level":"info","time":"2024-05-06T14:43:20.433Z","msg":"[core] [Server #4]Server created"}
2024-05-06_14:43:20.43375 {"level":"info","time":"2024-05-06T14:43:20.433Z","msg":"[core] [Server #5]Server created"}
2024-05-06_14:43:20.43378 {"level":"info","time":"2024-05-06T14:43:20.433Z","msg":"[core] [Channel #6]Channel created"}
2024-05-06_14:43:20.43378 {"level":"info","time":"2024-05-06T14:43:20.433Z","msg":"[core] [Channel #6]original dial target is: \"passthrough:pipe\""}
2024-05-06_14:43:20.43379 {"level":"info","time":"2024-05-06T14:43:20.433Z","msg":"[core] [Channel #6]parsed dial target is: resolver.Target{URL:url.URL{Scheme:\"passthrough\", Opaque:\"pipe\", User:(*url.Userinfo)(nil), Host:\"\", Path:\"\", RawPath:\"\", OmitHost:false, ForceQuery:false, RawQuery:\"\", Fragment:\"\", RawFragment:\"\"}}"}
2024-05-06_14:43:20.43382 {"level":"info","time":"2024-05-06T14:43:20.433Z","msg":"[core] [Channel #6]Channel authority set to \"pipe\""}
2024-05-06_14:43:20.43385 {"level":"info","time":"2024-05-06T14:43:20.433Z","msg":"[core] [Server #7]Server created"}
2024-05-06_14:43:20.43509 {"level":"info","time":"2024-05-06T14:43:20.435Z","msg":"Kubernetes API endpoint is up","mod_name":"kubernetes_api","net_network":"tcp","net_address":"127.0.0.1:8154"}
2024-05-06_14:43:20.43526 {"level":"info","time":"2024-05-06T14:43:20.435Z","msg":"[core] [Server #7 ListenSocket #8]ListenSocket created"}
2024-05-06_14:43:20.43531 {"level":"info","time":"2024-05-06T14:43:20.435Z","msg":"[core] [Server #3 ListenSocket #9]ListenSocket created"}
2024-05-06_14:43:20.43547 {"level":"info","time":"2024-05-06T14:43:20.435Z","msg":"Observability endpoint is up","mod_name":"observability","net_network":"tcp","net_address":"127.0.0.1:8151"}
2024-05-06_14:43:20.43548 {"level":"info","time":"2024-05-06T14:43:20.435Z","msg":"Agentk API endpoint is up","net_network":"tcp","net_address":"127.0.0.1:8150","is_websocket":true}
2024-05-06_14:43:20.43562 {"level":"info","time":"2024-05-06T14:43:20.435Z","msg":"[core] [Server #4 ListenSocket #10]ListenSocket created"}
2024-05-06_14:43:20.43570 {"level":"info","time":"2024-05-06T14:43:20.435Z","msg":"API endpoint is up","net_network":"tcp","net_address":"127.0.0.1:8153"}
2024-05-06_14:43:20.43571 {"level":"info","time":"2024-05-06T14:43:20.435Z","msg":"Private API endpoint is up","net_network":"tcp","net_address":"127.0.0.1:8155"}
2024-05-06_14:43:20.43576 {"level":"info","time":"2024-05-06T14:43:20.435Z","msg":"[core] [Server #2 ListenSocket #12]ListenSocket created"}
2024-05-06_14:43:20.43578 {"level":"info","time":"2024-05-06T14:43:20.435Z","msg":"[core] [Server #5 ListenSocket #11]ListenSocket created"}
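
One thing these startup logs allow is a quick cross-check between the addresses gitlab-kas says it is listening on and the URL Rails is configured to use. The following Ruby sketch assumes each line has the form "<timestamp> <JSON object>" as above, and assumes that the "API endpoint is up" address is the one gitlab_kas_internal_url should point at. The function names are hypothetical.

```ruby
require "json"
require "uri"

# Parse the JSON part of a gitlab-kas log line ("<timestamp> {...}").
# Returns nil if the line has no JSON object or it cannot be parsed.
def parse_kas_line(line)
  json_start = line.index("{")
  return nil unless json_start
  JSON.parse(line[json_start..-1])
rescue JSON::ParserError
  nil
end

# Hypothetical helper: map each "<name> endpoint is up" message to the
# address logged with it, e.g. { "API" => "127.0.0.1:8153" }.
def kas_endpoints(lines)
  lines.each_with_object({}) do |line, endpoints|
    entry = parse_kas_line(line)
    next unless entry.is_a?(Hash)
    msg = entry["msg"].to_s
    next unless msg.end_with?("endpoint is up") && entry["net_address"]
    endpoints[msg.sub(/ endpoint is up\z/, "")] = entry["net_address"]
  end
end

# Hypothetical check: does the port in the internal URL match the port
# logged for the "API" endpoint? (Only ports are compared, not hosts.)
def internal_url_matches?(internal_url, endpoints)
  api_address = endpoints["API"]
  return false unless api_address
  URI.parse(internal_url).port == api_address.split(":").last.to_i
end
```

With the log above and an internal URL of grpc://localhost:8153, this check passes, which suggests Rails is at least pointed at a port gitlab-kas is listening on.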

The gitlab-kas process also logs a number of errors when it first starts up after a gitlab-ctl reconfigure. We are including them here, but suspect they are just gitlab-kas failing to connect while Rails was still starting up.

2024-05-03_21:51:55.82389 {"level":"error","time":"2024-05-03T21:51:55.823Z","msg":"AgentInfo()","grpc_service":"gitlab.agent.reverse_tunnel.rpc.ReverseTunnel","grpc_method":"Connect","error":"Get \"http://127.0.0.1:8080/api/v4/internal/kubernetes/agent_info\": dial tcp 127.0.0.1:8080: connect: connection refused"}
2024-05-03_21:51:55.97758 {"level":"error","time":"2024-05-03T21:51:55.977Z","msg":"AgentInfo()","grpc_service":"gitlab.agent.reverse_tunnel.rpc.ReverseTunnel","grpc_method":"Connect","error":"Get \"http://127.0.0.1:8080/api/v4/internal/kubernetes/agent_info\": dial tcp 127.0.0.1:8080: connect: connection refused"}
2024-05-03_21:51:56.00996 {"level":"error","time":"2024-05-03T21:51:56.009Z","msg":"AgentInfo()","grpc_service":"gitlab.agent.reverse_tunnel.rpc.ReverseTunnel","grpc_method":"Connect","error":"Get \"http://127.0.0.1:8080/api/v4/internal/kubernetes/agent_info\": dial tcp 127.0.0.1:8080: connect: connection refused"}
2024-05-03_21:51:56.15261 {"level":"error","time":"2024-05-03T21:51:56.152Z","msg":"AgentInfo()","grpc_service":"gitlab.agent.reverse_tunnel.rpc.ReverseTunnel","grpc_method":"Connect","error":"Get \"http://127.0.0.1:8080/api/v4/internal/kubernetes/agent_info\": dial tcp 127.0.0.1:8080: connect: connection refused"}
2024-05-03_21:51:56.16499 {"level":"error","time":"2024-05-03T21:51:56.164Z","msg":"AgentInfo()","grpc_service":"gitlab.agent.reverse_tunnel.rpc.ReverseTunnel","grpc_method":"Connect","error":"Get \"http://127.0.0.1:8080/api/v4/internal/kubernetes/agent_info\": dial tcp 127.0.0.1:8080: connect: connection refused"}
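
To check whether these errors really are confined to startup, one option is to measure the time window they span. This is a rough Ruby sketch, assuming the same "<timestamp> <JSON object>" line format and the "level", "error" and "time" fields shown above; the function names are hypothetical.

```ruby
require "json"
require "time"

# Parse the JSON part of a gitlab-kas log line ("<timestamp> {...}").
def parse_kas_line(line)
  json_start = line.index("{")
  return nil unless json_start
  JSON.parse(line[json_start..-1])
rescue JSON::ParserError
  nil
end

# Hypothetical helper: count "connection refused" errors and report the
# first and last time they occurred. Returns nil if there are none.
def connection_refused_window(lines)
  times = lines.map { |line| parse_kas_line(line) }
               .select do |e|
                 e.is_a?(Hash) && e["level"] == "error" && e["time"] &&
                   e["error"].to_s.include?("connection refused")
               end
               .map { |e| Time.iso8601(e["time"]) }
  return nil if times.empty?
  { count: times.size, first: times.min, last: times.max,
    seconds: times.max - times.min }
end
```

If the window is short and ends around the time Rails came up, that supports the startup explanation; if the errors keep appearing later, something else is likely wrong.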

Here’s our relevant config:

gitlab_rails['gitlab_kas_enabled'] = true
gitlab_rails['gitlab_kas_internal_url'] = 'grpc://localhost:8153'
gitlab_rails['gitlab_kas_external_k8s_proxy_url'] = 'https://<hostname>/-/kubernetes-agent/'
gitlab_rails['gitlab_kas_external_url'] = 'wss://<hostname>/-/kubernetes-agent/'
gitlab_kas_external_url "wss://<hostname>/-/kubernetes-agent/"
gitlab_kas['enable'] = true
gitlab_kas['log_level'] = 'info'
gitlab_kas['grpc_log_level'] = 'info'
gitlab_kas['gitlab_address'] = 'http://127.0.0.1:8080'
gitlab_kas['listen_address'] = 'localhost:8150'
gitlab_kas['listen_network'] = 'tcp'
gitlab_kas['listen_websocket'] = true

Notably, we’re not even at the point of connecting an agent; we’re just trying to get a registration token to begin that process. This seems like an issue internal to GitLab.

GraphQL errors

If helpful: the GraphQL query getAgents that runs when I go to my project’s -/clusters URL returns a list of the ‘agents’ we’ve created, along with a number of errors like the following:

    "errors": [
        {
            "message": "GRPC::Unimplemented",
            "locations": [
                {
                    "line": 56,
                    "column": 3
                }
            ],
            "path": [
                "project",
                "clusterAgents",
                "nodes",
                0,
                "connections"
            ]
        },
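
When the response contains many of these, grouping them by message makes it easier to see which paths are affected. A small Ruby sketch, assuming the complete response body is a JSON object with an "errors" array shaped like the fragment above; the function name is hypothetical.

```ruby
require "json"

# Hypothetical helper: group GraphQL errors by message and list the
# dotted path of each one, e.g.
#   { "GRPC::Unimplemented" => ["project.clusterAgents.nodes.0.connections"] }
def summarize_graphql_errors(response_body)
  errors = JSON.parse(response_body).fetch("errors", [])
  errors.group_by { |error| error["message"] }
        .transform_values do |group|
          group.map { |error| Array(error["path"]).join(".") }
        end
end
```

If every entry ends in "connections", as in the fragment above, the agents themselves are being listed and only the lookup of their connection status is failing.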

What I’d like to see

I’d like to be able to register an agent but can’t get a registration token.

Versions

  • Self-managed
    We’re using 16.11.1-ee (Omnibus Linux package) with a Premium license, on a single-node installation.

Any help would be greatly appreciated!

Hello,

Same here: the agent page is blank, and the response to the GraphQL request is full of “GRPC::Unimplemented” errors, with the same path as in the original post.

GitLab Omnibus, v16.11.1-ee, Premium

Thanks

Replying to myself:

explanation and workaround here:

The workaround did in fact work. This was also fixed in 16.11.2!