Issues connecting gitlab-agent with my GitLab CE instance

Hello!

I am currently configuring a gitlab-agent that is running on my Kubernetes cluster at DigitalOcean. I am trying to connect it to my GitLab CE instance (15.4 omnibus), which is running behind an NGINX reverse proxy.

The problem I'm having right now is that my gitlab-agent does not seem to be able to make a connection with my GitLab CE instance. I installed the gitlab-agent on the cluster through Helm with the following command:

helm upgrade --install my-cluster gitlab/gitlab-agent --namespace gitlab-agent --create-namespace --set image.tag=v15.4.0 --set config.token=<token> --set config.kasAddress=wss://<my-instance>/-/kubernetes-agent/

When I look into the logs of the gitlab-agent, I see the following error message come up when it tries to connect:

{"level":"error","time":"2022-11-06T18:23:37.960Z","msg":"Error handling a connection","mod_name":"reverse_tunnel","error":"Connect(): rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing failed to WebSocket dial: expected handshake response status code 101 but got 302\""}

My suspicion is that I might be missing something in my NGINX configuration, but I have been stuck on this problem for two days now and I cannot see what is going wrong. This is the configuration I am currently using:

upstream gitlab-workhorse {
  # On GitLab versions before 13.5, the location is
  # `/var/opt/gitlab/gitlab-workhorse/socket`. Change the following line
  # accordingly.
  server unix:/var/opt/gitlab/gitlab-workhorse/sockets/socket fail_timeout=0;
}

server {
  listen 0.0.0.0:80;
  listen [::]:80 ipv6only=on;
  server_name  <my-domain>;
  server_tokens off;

  return 301 https://$http_host$request_uri;

  ## Individual nginx logs for this GitLab vhost
  access_log  /var/log/nginx/gitlab_access.log;
  error_log   /var/log/nginx/gitlab_error.log;
}

## HTTPS host
server {
  listen 0.0.0.0:443 ssl;
  listen [::]:443 ipv6only=on ssl;
  server_name <my-domain>;
  server_tokens off; ## Don't show the nginx version number, a security best practice
  root /opt/gitlab/embedded/service/gitlab-rails/public;

  ## Strong SSL Security
  ## https://raymii.org/s/tutorials/Strong_SSL_Security_On_nginx.html & https://cipherli.st/
  ssl_certificate /etc/letsencrypt/live/<my-domain>/fullchain.pem; # managed by Certbot
  ssl_certificate_key /etc/letsencrypt/live/<my-domain>/privkey.pem; # managed by Certbot
  include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
  ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot


  # GitLab needs backwards compatible ciphers to retain compatibility with Java IDEs
  #ssl_ciphers "ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA256:ECDHE-RSA-AES256-SHA:ECDHE-RSA-AES128-SHA:ECDHE-RSA-DES-CBC3-SHA:AES256-GCM-SHA384:AES128-GCM-SHA256:AES256-SHA256:AES128-SHA256:AES256-SHA:AES128-SHA:DES-CBC3-SHA:!aNULL:!eNULL:!EXPORT:!DES:!MD5:!PSK:!RC4";
  #ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
  #ssl_prefer_server_ciphers on;
  #ssl_session_cache shared:SSL:10m;
  #ssl_session_timeout 5m;

  ## See app/controllers/application_controller.rb for headers set

  ## [Optional] Enable HTTP Strict Transport Security
  ## HSTS is a feature improving protection against MITM attacks
  ## For more information see: https://www.nginx.com/blog/http-strict-transport-security-hsts-and-nginx/
  # add_header Strict-Transport-Security "max-age=31536000; includeSubDomains";

  ## [Optional] If your certificate has OCSP, enable OCSP stapling to reduce the overhead and latency of running SSL.
  ## Replace with your ssl_trusted_certificate. For more info see:
  ## - https://medium.com/devops-programming/4445f4862461
  ## - https://www.ruby-forum.com/topic/4419319
  ## - https://www.digitalocean.com/community/tutorials/how-to-configure-ocsp-stapling-on-apache-and-nginx
  # ssl_stapling on;
  # ssl_stapling_verify on;
  # ssl_trusted_certificate /etc/nginx/ssl/stapling.trusted.crt;
  # resolver 208.67.222.222 208.67.222.220 valid=300s; # Can change to your DNS resolver if desired
  # resolver_timeout 5s;

  ## [Optional] Generate a stronger DHE parameter:
  ##   sudo openssl dhparam -out /etc/ssl/certs/dhparam.pem 4096
  ##
  # ssl_dhparam /etc/ssl/certs/dhparam.pem;

  ## Individual nginx logs for this GitLab vhost
  access_log  /var/log/nginx/gitlab_access.log;
  error_log   /var/log/nginx/gitlab_error.log;

  location / {
    client_max_body_size 0;
    gzip off;

    ## https://github.com/gitlabhq/gitlabhq/issues/694
    ## Some requests take more than 30 seconds.
    proxy_read_timeout      300;
    proxy_connect_timeout   300;
    proxy_redirect          off;

    proxy_http_version 1.1;

    proxy_set_header    Host                $http_host;
    proxy_set_header    X-Real-IP           $remote_addr;
    proxy_set_header    X-Forwarded-Ssl     on;
    proxy_set_header    X-Forwarded-For     $proxy_add_x_forwarded_for;
    proxy_set_header    X-Forwarded-Proto   $scheme;

    proxy_pass http://gitlab-workhorse;
  }
}
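
(In case it matters: after every change to this config I validate and reload NGINX with the usual commands:)

sudo nginx -t && sudo systemctl reload nginx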

Things that I have tried so far:

#1 Enabling GitLab KAS

gitlab_kas['enable'] = true
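
(This line goes into /etc/gitlab/gitlab.rb on the Omnibus host, and I applied it with a reconfigure:)

sudo gitlab-ctl reconfigure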

#2 Adding an extra location block to the NGINX config

  location /-/kubernetes-agent/ {
    proxy_pass http://gitlab-workhorse;
    proxy_http_version 1.1;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-NginX-Proxy true;
    proxy_set_header Host $host;
    proxy_set_header Sec-WebSocket-Protocol $http_sec_websocket_protocol;
    proxy_set_header Sec-WebSocket-Extensions $http_sec_websocket_extensions;
    proxy_set_header Sec-WebSocket-Key $http_sec_websocket_key;
    proxy_set_header Sec-WebSocket-Version $http_sec_websocket_version;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection $connection_upgrade;
    proxy_cache_bypass $http_upgrade;
  }
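
(A note on this block: $connection_upgrade is not a built-in NGINX variable, so it has to be defined with a map block in the http context, along these lines:)

map $http_upgrade $connection_upgrade {
  default upgrade;
  ''      close;
}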

#3 Using ws:// instead of wss://

#4 Tailing the logs of the server

It seems that the Omnibus GitLab instance does receive the request from the agent, but Rails handles it as an unmatched route and redirects it to the sign-in page, which would explain the 302 that the agent reports. These are the logs I got out of it:

{"severity":"INFO","time":"2022-11-06T22:48:23.520Z","retry":0,"queue":"default","version":0,"status_expiration":1800,"class":"Ci::DeleteObjectsWork                                                                                         er","args":[],"jid":"55a1c644a0a38774619100ac","created_at":"2022-11-06T22:48:22.974Z","meta.caller_id":"Ci::ScheduleDeleteObjectsCronWorker","corre                                                                                         lation_id":"622389d998d8ee97dfbbca5f1ea427bd","meta.root_caller_id":"Cronjob","meta.feature_category":"continuous_integration","meta.client_id":"ip/                                                                                         ","worker_data_consistency":"always","size_limiter":"validated","enqueued_at":"2022-11-06T22:48:23.086Z","job_size_bytes":2,"pid":1608,"message":"Ci                                                                                         ::DeleteObjectsWorker JID-55a1c644a0a38774619100ac: done: 0.398426 sec","job_status":"done","scheduling_latency_s":0.035892,"redis_calls":5,"redis_d                                                                                         uration_s":0.007452,"redis_read_bytes":5,"redis_write_bytes":487,"redis_queues_calls":1,"redis_queues_duration_s":0.000255,"redis_queues_read_bytes"                                                                                         :1,"redis_queues_write_bytes":63,"redis_shared_state_calls":4,"redis_shared_state_duration_s":0.007197,"redis_shared_state_read_bytes":4,"redis_shar                                                                                         ed_state_write_bytes":424,"db_count":4,"db_write_count":3,"db_cached_count":0,"db_replica_count":0,"db_primary_count":4,"db_main_count":4,"db_main_r                                                                                         eplica_count":0,"db_replica_cached_count":0,"db_primary_cached_count":0,"db_main_cached_count":0,"db_main_replica_cached_count":0,"db_replica_wal_co                                                                                         unt":0,"db_primary_wal_count":0,"db_main_wal_count":0,"db_main_replica_wal_count":0,"db_replica_wal_cached_count":0,"db_primary_wal_cached_count":0,                                                                                         "db_main_wal_cached_count":0,"db_main_replica_wal_cached_count":0,"db_replica_duration_s":0.0,"db_primary_duration_s":0.047,"db_main_duration_s":0.0                                                                                         47,"db_main_replica_duration_s":0.0,"cpu_s":0.015457,"mem_objects":3814,"mem_bytes":329736,"mem_mallocs":750,"mem_total_bytes":482296,"worker_id":"s                                                                                         idekiq_0","rate_limiting_gates":[],"duration_s":0.398426,"completed_at":"2022-11-06T22:48:23.520Z","load_balancing_strategy":"primary","db_duration_                                                                                         s":0.235641}

==> /var/log/gitlab/gitlab-rails/production.log <==
Started GET "/-/kubernetes-agent/" for 128.199.58.185 at 2022-11-06 23:48:23 +0100
Processing by ApplicationController#route_not_found as HTML
  Parameters: {"unmatched_route"=>"-/kubernetes-agent"}
Redirected to http://<domain>/users/sign_in
Completed 302 Found in 17ms (ActiveRecord: 0.0ms | Elasticsearch: 0.0ms | Allocations: 4482)
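
For what it's worth, the redirect can also be reproduced outside the agent with a plain request against the path (hypothetical check from my laptop):

curl -i https://<my-instance>/-/kubernetes-agent/

If the request falls through to Rails, this should show the same 302 to /users/sign_in as in the production.log above.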

I would very much appreciate some help with this issue. Thanks in advance!

You should try connecting to KAS over gRPC directly by using the following configuration:

gitlab_kas['listen_address'] = '0.0.0.0:8150'
gitlab_kas['listen_network'] = 'tcp'
gitlab_kas['listen_websocket'] = false
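
These settings go into /etc/gitlab/gitlab.rb, followed by a reconfigure. Because this bypasses the NGINX proxy, port 8150 also has to be reachable from the cluster, so open it in your firewall if needed (ufw shown as an example):

sudo gitlab-ctl reconfigure
sudo ufw allow 8150/tcp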

Then connect with:

helm .... --set config.kasAddress=grpc://<hostname|ip>:8150/

(Mind the trailing slash)
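
For example, adapting your original install command, it would look something like this:

helm upgrade --install my-cluster gitlab/gitlab-agent --namespace gitlab-agent --create-namespace --set image.tag=v15.4.0 --set config.token=<token> --set config.kasAddress=grpc://<my-instance>:8150/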