Explaining fashionable server monitoring stacks for self-hosting


Written by Solène, on 11 September 2022.
Tags: #nixos, #monitoring, #efficiency

Hello 👋, it has been a very long time since I last had to look into server monitoring. I set up a Grafana server six years ago, and I was using Munin for my personal servers.

However, I recently moved my server to a small virtual machine with CPU and memory constraints (1 core / 1 GB of memory), and Munin did not work very well there. I was curious to learn whether the Grafana stack had changed since the last time I used it, and YES, it has.

There is a project named Prometheus which is now used absolutely everywhere; it was time for me to learn about it. And as I like to go against the flow, I tried various alternatives to the industry-standard stack by using VictoriaMetrics.

In this article, I am using NixOS configurations for the examples, but it should be obvious enough that you can still understand the components even if you do not know anything about NixOS.

VictoriaMetrics is a Prometheus drop-in replacement that is far more efficient (faster, and using fewer resources), and which also provides various APIs such as Graphite or InfluxDB. It is the component storing the data. It comes with various programs, like the VictoriaMetrics agent (vmagent), to replace various parts of Prometheus.

VictoriaMetrics official website

Prometheus is a time series database which also provides a collecting agent named Node Exporter. It is also able to pull (scrape) data from remote services offering a Prometheus API.

Prometheus official website

Node Exporter GitHub page

NixOS is an operating system built with the Nix package manager; it has a declarative approach that requires rebuilding the system configuration whenever you need to make a change.

NixOS official website
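
As a quick illustration of the declarative approach (a minimal sketch, unrelated to the setups below), enabling a service is a one-line change in the system configuration, applied by rebuilding:

```nix
# /etc/nixos/configuration.nix
# After editing, apply the change with: nixos-rebuild switch
{ config, pkgs, ... }:
{
  # Declaring the service is all it takes; the rebuild does the rest.
  services.openssh.enable = true;
}
```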

Collectd is an agent gathering metrics from the system and sending them to a remote compatible database.

Collectd official website

Grafana is a powerful Web interface pulling data from time series databases to render it as useful charts for analysis.

Grafana official website

Node exporter full Grafana dashboard

Setup 1: Prometheus + node_exporter §

In this setup, a Prometheus server runs on a server along with Grafana, and connects to remote servers running node_exporter to gather data.

Running it on my server, Grafana takes 67 MB of memory, the local node_exporter 12.5 MB, and Prometheus 63 MB.

USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
grafana   837975  0.1  6.7 1384152 67836 ?       Ssl  01:19   1:07 grafana-server
node-ex+  953784  0.0  1.2 941292 12512 ?        Ssl  16:24   0:01 node_exporter
prometh+  983975  0.3  6.3 1226012 63284 ?       Ssl  17:07   0:00 prometheus

Setup 1 diagram

  • model: pull; Prometheus connects to all servers

Pros §

  • it is the industry standard
  • can use the “Node Exporter Full” Grafana dashboard

Cons §

  • uses more memory
  • you need to be able to reach all the remote nodes

Server §

{
  services.grafana.enable = true;
  services.prometheus.exporters.node.enable = true;

  services.prometheus = {
    enable = true;
    scrapeConfigs = [
      {
        job_name = "kikimora";
        static_configs = [
          {targets = ["10.43.43.2:9100"];}
        ];
      }
      {
        job_name = "interbus";
        static_configs = [
          {targets = ["127.0.0.1:9100"];}
        ];
      }
    ];
  };
}
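
On the Grafana side, you still need to add Prometheus as a data source. You can click through the Web interface, or declare it in NixOS too; here is a minimal sketch (assuming Grafana and Prometheus run on the same host, and the provisioning options of a 2022-era NixOS module):

```nix
{
  services.grafana.provision = {
    enable = true;
    datasources = [
      {
        name = "Prometheus";
        type = "prometheus";
        # Prometheus listens on port 9090 by default.
        url = "http://127.0.0.1:9090";
        isDefault = true;
      }
    ];
  };
}
```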

Client §

{
  networking.firewall.allowedTCPPorts = [9100];
  services.prometheus.exporters.node.enable = true;
}
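
Opening port 9100 to everyone exposes your metrics to the whole network, so you may want to restrict it to the monitoring server only. A possible sketch using raw firewall rules instead of allowedTCPPorts (assuming 10.43.43.1 is the Prometheus server's address):

```nix
{
  services.prometheus.exporters.node.enable = true;
  # Accept connections to node_exporter only from the monitoring server.
  networking.firewall.extraCommands = ''
    iptables -A nixos-fw -p tcp -s 10.43.43.1 --dport 9100 -j nixos-fw-accept
  '';
}
```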

Setup 2: VictoriaMetrics + vmagent §

In this setup, a VictoriaMetrics server runs on a server along with Grafana. A VictoriaMetrics agent runs locally to gather data from remote servers running node_exporter.

Running it on my server, Grafana takes 67 MB of memory, the local node_exporter 12.5 MB, VictoriaMetrics 30 MB, and its agent 13.8 MB.

USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
grafana   837975  0.1  6.7 1384152 67836 ?       Ssl  01:19   1:07 grafana-server
node-ex+  953784  0.0  1.2 941292 12512 ?        Ssl  16:24   0:01 node_exporter
victori+  986126  0.1  3.0 1287016 30052 ?       Ssl  18:00   0:03 victoria-metric
root      987944  0.0  1.3 1086276 13856 ?       Sl   18:30   0:00 vmagent

Setup 2 diagram

  • model: pull; the VictoriaMetrics agent connects to all servers

Pros §

  • can use the “Node Exporter Full” Grafana dashboard
  • lightweight and more performant than Prometheus

Cons §

  • you need to be able to reach all the remote nodes

Server §

let
  configure_prom = builtins.toFile "prometheus.yml" ''
    scrape_configs:
    - job_name: 'kikimora'
      stream_parse: true
      static_configs:
      - targets:
        - 10.43.43.1:9100
    - job_name: 'interbus'
      stream_parse: true
      static_configs:
      - targets:
        - 127.0.0.1:9100
  '';
in {
  services.victoriametrics.enable = true;
  services.grafana.enable = true;

  systemd.services.export-to-prometheus = {
    path = with pkgs; [victoriametrics];
    enable = true;
    after = ["network-online.target"];
    wantedBy = ["multi-user.target"];
    script = "vmagent -promscrape.config=${configure_prom} -remoteWrite.url=http://127.0.0.1:8428/api/v1/write";
  };
}
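
VictoriaMetrics answers the Prometheus query API on its HTTP port (8428 by default), so Grafana can use it as a regular Prometheus data source; a minimal provisioning sketch (assuming everything runs on the same host):

```nix
{
  services.grafana.provision = {
    enable = true;
    datasources = [
      {
        name = "VictoriaMetrics";
        type = "prometheus";
        url = "http://127.0.0.1:8428";
        isDefault = true;
      }
    ];
  };
}
```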

Client §

{
  networking.firewall.allowedTCPPorts = [9100];
  services.prometheus.exporters.node.enable = true;
}

Setup 3: VictoriaMetrics + vmagent on every server §

In this setup, a VictoriaMetrics server runs on a server along with Grafana; on every remote server, node_exporter and a VictoriaMetrics agent run to export data to the central VictoriaMetrics server.


Running it on my server, Grafana takes 67 MB of memory, the local node_exporter 12.5 MB, VictoriaMetrics 30 MB, and its agent 13.8 MB, which is exactly the same as setup 2, except that the VictoriaMetrics agent runs on all the remote servers.

USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
grafana   837975  0.1  6.7 1384152 67836 ?       Ssl  01:19   1:07 grafana-server
node-ex+  953784  0.0  1.2 941292 12512 ?        Ssl  16:24   0:01 node_exporter
victori+  986126  0.1  3.0 1287016 30052 ?       Ssl  18:00   0:03 victoria-metric
root      987944  0.0  1.3 1086276 13856 ?       Sl   18:30   0:00 vmagent

Setup 3 diagram

  • model: push; each agent connects to the VictoriaMetrics server

Pros §

  • can use the “Node Exporter Full” Grafana dashboard
  • memory efficient
  • can bypass firewalls easily

Cons §

  • more maintenance, as you have one extra agent on each remote server
  • may be bad for security: you need to allow remote servers to write to your VictoriaMetrics server

Server §

{
  networking.firewall.allowedTCPPorts = [8428];
  services.victoriametrics.enable = true;
  services.grafana.enable = true;
  services.prometheus.exporters.node.enable = true;
}
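
To mitigate the security con above, VictoriaMetrics can require HTTP basic authentication for its API; a sketch (the credentials are placeholders, and note that Grafana would then need the same credentials in its data source):

```nix
{
  networking.firewall.allowedTCPPorts = [8428];
  services.victoriametrics = {
    enable = true;
    # Require basic auth on all HTTP endpoints, including /api/v1/write;
    # agents then use http://vmwriter:changeme@host:8428/api/v1/write.
    extraOptions = [
      "-httpAuth.username=vmwriter"
      "-httpAuth.password=changeme"
    ];
  };
  services.grafana.enable = true;
  services.prometheus.exporters.node.enable = true;
}
```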

Client §

let
  configure_prom = builtins.toFile "prometheus.yml" ''
    scrape_configs:
    - job_name: '${config.networking.hostName}'
      stream_parse: true
      static_configs:
      - targets:
        - 127.0.0.1:9100
  '';
in {
  services.prometheus.exporters.node.enable = true;

  systemd.services.export-to-prometheus = {
    path = with pkgs; [victoriametrics];
    enable = true;
    after = ["network-online.target"];
    wantedBy = ["multi-user.target"];
    script = "vmagent -promscrape.config=${configure_prom} -remoteWrite.url=http://victoria-server.domain:8428/api/v1/write";
  };
}

Setup 4: VictoriaMetrics + Collectd §

In this setup, a VictoriaMetrics server runs on a server along with Grafana, while the remote servers run Collectd, sending data to the VictoriaMetrics Graphite API.

Running it on my server, Grafana takes 67 MB of memory, VictoriaMetrics 30 MB, and Collectd 172 kB (yes).

USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
grafana   837975  0.1  6.7 1384152 67836 ?       Ssl  01:19   1:07 grafana-server
victori+  986126  0.1  3.0 1287016 30052 ?       Ssl  18:00   0:03 victoria-metric
collectd  844275  0.0  0.0 610432   172 ?        Ssl  02:07   0:00 collectd

Setup 4 diagram

  • model: push; VictoriaMetrics receives data from the Collectd servers

Pros §

  • super memory efficient
  • can bypass firewalls easily

Cons §

  • you can’t use the “Node Exporter Full” Grafana dashboard
  • may be bad for security: you need to allow remote servers to write to your VictoriaMetrics server
  • you need to configure Collectd for each host

Server §

The server requires VictoriaMetrics to run with its Graphite API exposed on port 2003.

Note that in Grafana, you will have to escape “-” characters using “\-” in the queries. I also did not find a way to automatically discover hosts in the data in order to use variables in the dashboard.

UPDATE: using the write_tsdb plugin in Collectd, and exposing a TSDB API with VictoriaMetrics, you can set a label on each host, and then use the query “label_values(status)” in Grafana to automatically discover hosts.
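
For that write_tsdb variant, the idea is sketched below (the port and tag name are assumptions; check the VictoriaMetrics and Collectd documentation): open an OpenTSDB listener on the server, and point Collectd’s write_tsdb plugin at it with a per-host tag.

```nix
{
  # Server: expose an OpenTSDB API next to (or instead of) Graphite.
  services.victoriametrics.extraOptions = [
    "-opentsdbListenAddr=:4242"
  ];

  # Client: write_tsdb pushes metrics over TCP; HostTags attaches a
  # label to every metric sent by this host.
  services.collectd.plugins."write_tsdb" = ''
    <Node>
      Host "victoria-server.fqdn"
      Port "4242"
      HostTags "status=production"
    </Node>
  '';
}
```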

{
  networking.firewall.allowedTCPPorts = [2003];
  services.victoriametrics = {
    enable = true;
    extraOptions = [
      "-graphiteListenAddr=:2003"
    ];
  };
  services.grafana.enable = true;
}

Client §

We only need to enable Collectd on the client:

{
  services.collectd = {
    enable = true;
    autoLoadPlugin = true;
    extraConfig = ''
      Interval 30
    '';
    plugins = {
      "write_graphite" = ''
        <Node "${config.networking.hostName}">
          Host "victoria-server.fqdn"
          Port "2003"
          Protocol "tcp"
          LogSendErrors true
          Prefix "collectd_"
        </Node>
      '';
      cpu = ''
        ReportByCpu false
      '';
      memory = "";
      df = ''
        Mountpoint "/"
        Mountpoint "/nix/store"
        Mountpoint "/home"
        ValuesPercentage True
        ValuesAbsolute False
      '';
      load = "";
      uptime = "";
      swap = ''
        ReportBytes false
        ReportIO false
        ValuesPercentage true
      '';
      interface = ''
        ReportInactive false
      '';
    };
  };
}

The first section named “#!/bin/introduction” is on purpose and not a mistake. It felt super fun when I started writing the article, and I wanted to keep it that way.

The Collectd setup is the most minimalistic while still being powerful, but it requires a lot of work to create the dashboards and configure the plugins correctly.

The setup I like best is setup 2.
