GitLab ELK

Hi John,

It is good that someone is finally working on this. Will this feature be in the Community Edition as well?

Since I am using GitLab Community Edition and don’t have the Elasticsearch integration, I installed Filebeat on the GitLab servers and send the logs over SSL to Logstash on a separate ELK server.

In the Filebeat configuration, I created two new fields, log_group and log_id, for each GitLab log, as below:

fields:
  log_group: gitlab_research
  log_id: audit_json

This is used to differentiate the different GitLab instances I manage. I also want to monitor other, non-GitLab instances later, which is why log_group is important to me. I read the following logs to get all the information I want (see the Filebeat sketch after this list):

user commits over SSH: /var/log/gitlab/gitlab-shell/gitlab-shell.log
user commits over HTTPS: /var/log/gitlab/gitlab-rails/production_json.log
user logins: /var/log/gitlab/gitlab-rails/audit_json.log
user geo information: /var/log/gitlab/gitlab-rails/production_json.log
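
For reference, here is roughly what the Filebeat side looks like. This is a minimal sketch rather than my exact file: the Logstash host, port and certificate path are placeholders, the per-input log_id values simply mirror the log files above, and depending on the Filebeat version the section is called filebeat.inputs or filebeat.prospectors. Note that for Logstash to see [log_group] and [log_id] at the top level (as the filter below expects), the inputs need fields_under_root: true; otherwise the fields end up under [fields].

# filebeat.yml (sketch): one input per GitLab log, tagged with log_group/log_id
filebeat.inputs:
  - type: log
    paths:
      - /var/log/gitlab/gitlab-shell/gitlab-shell.log
    fields:
      log_group: gitlab_research
      log_id: gitlab_shell
    fields_under_root: true
  - type: log
    paths:
      - /var/log/gitlab/gitlab-rails/production_json.log
    fields:
      log_group: gitlab_research
      log_id: production_json
    fields_under_root: true
  - type: log
    paths:
      - /var/log/gitlab/gitlab-rails/audit_json.log
    fields:
      log_group: gitlab_research
      log_id: audit_json
    fields_under_root: true

# ship to Logstash on the ELK server over SSL (host and CA path are placeholders)
output.logstash:
  hosts: ["elk.example.com:5044"]
  ssl.certificate_authorities: ["/etc/filebeat/ca.crt"]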

I also created a disk usage log (as I can’t find this information in the GitLab logs) in order to get the remaining disk space on the Gitaly repository volume, using the script below.

#!/bin/bash
# Emit one JSON line per "data" filesystem with total and available space,
# so Filebeat can ship it like any other JSON log.

# df -P gives stable POSIX columns; grep keeps only the data volume(s);
# jq -R reads raw lines and -c prints compact single-line JSON.
df -P | grep data | jq -R -c '
  split("\n") | .[] |
  # keep only real device lines (they start with "/")
  if test("^/") then
    # collapse repeated spaces, split into columns, and build the record:
    # .[0]=filesystem, .[1]=total 1K-blocks, .[3]=available, .[5]=mount point
    gsub(" +"; " ") | split(" ") |
    {time: (now | todate), mount: .[0], spacetotal: .[1], spaceavail: .[3], mounton: .[5]}
  else
    empty
  end
'
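
The script only writes to stdout, so something has to land its output in a file that Filebeat watches. Below is a minimal sketch of how that could be wired up; the script location, log path and cron schedule are placeholders, and the log_id just has to match the diskusage_json check in the Logstash filter further down.

# cron entry (e.g. in /etc/cron.d/gitlab-diskusage): append one JSON line every 5 minutes
*/5 * * * * root /usr/local/bin/diskusage.sh >> /var/log/gitlab/diskusage_json.log

# matching Filebeat input, following the same convention as the other inputs
  - type: log
    paths:
      - /var/log/gitlab/diskusage_json.log
    fields:
      log_group: gitlab_research
      log_id: diskusage_json
    fields_under_root: true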

On the ELK server, make the following changes to Logstash: 1. identify which kind of GitLab instance the event came from; 2. reshape the fields to what I want, i.e. map the log’s date field to @timestamp, run remote_ip through geoip so that I can use a geo map, and convert the disk space values to integers, since Logstash treats those fields as strings by default.

filter {
  # only touch events from my GitLab instances, matched via the Filebeat log_group field
  if [log_group] =~ "gitlab_research" {
    # use the log's own timestamp instead of the ingest time
    date {
      match => [ "time", "ISO8601" ]
      target => "@timestamp"
    }
    # production_json carries the client IP; enrich it for the Kibana geo map
    if [log_id] =~ "production_json" {
      geoip {
        source => "remote_ip"
      }
    }
    # the disk usage script emits numbers as strings; convert them so they can be charted
    if [log_id] == "diskusage_json" {
      mutate {
        convert => {
          "spaceavail" => "integer"
          "spacetotal" => "integer"
        }
      }
    }
  }
}

It is important to have some way to debug when things go wrong, which is always. Adding stdout to the output helps identify a lot of problems.

output {
  # print every event to the console for debugging; remove once things are stable
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
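
For completeness, the filter and output above sit behind a Beats input, and since Filebeat ships over SSL the input needs the server certificate configured. A minimal sketch, with the port and certificate paths as placeholders:

input {
  beats {
    port => 5044
    ssl => true
    ssl_certificate => "/etc/logstash/certs/logstash.crt"
    ssl_key => "/etc/logstash/certs/logstash.key"
  }
}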

I didn’t change anything else in Elasticsearch or Kibana other than building the charts: GitLab User Login over timestamp, GitLab Geo information, GitLab User commits over HTTPS, GitLab User Commits over SSH, and GitLab available diskspace on user repository.

It’s working okay so far, but I am still testing the accuracy before I can apply these settings to other GitLab instances. The final goal is to automate all of these steps with Ansible.

Hope this helps.