Ref

https://prometheus.io/docs/guides/cadvisor/

https://prometheus.io/docs/guides/node-exporter/

 

cAdvisor: metrics agent for the containers on the Docker Swarm cluster

node_exporter: metrics agent for the Linux hosts

Prometheus

Server that collects metrics from each agent

config

Configure the Prometheus scrape jobs, one job per agent type

prometheus.yml
...
  - job_name: 'dsg-container'
    scrape_interval: 60s
    static_configs:
      - targets: ['192.168.0.2:8080', '192.168.0.3:8080']

  - job_name: 'dsg-node_exporter'
    scrape_interval: 60s
    static_configs:
      - targets: ['192.168.0.2:9100', '192.168.0.3:9100']
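
When nodes are added to the swarm, both target lists have to be updated by hand. As a rough sketch, a small shell helper can print the targets entry for a given port and host list. The function name targets_line is made up for this example and is not part of Prometheus.

```shell
# Hypothetical helper (not a Prometheus tool): print a static_configs
# "targets" entry for one port across a list of hosts.
# Usage: targets_line PORT HOST [HOST...]
targets_line() (
  port=$1
  shift
  out=""
  for host in "$@"; do
    # Append "'host:port'", separated by ", " after the first entry
    out="${out:+$out, }'${host}:${port}'"
  done
  printf '%s\n' "- targets: [${out}]"
)

# Example:
# targets_line 8080 192.168.0.2 192.168.0.3
# targets_line 9100 192.168.0.2 192.168.0.3
```

After editing the file, it can be checked with promtool check config prometheus.yml, which ships with Prometheus.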

Docker Swarm

Deploy the metrics agents on the cluster

cAdvisor

cAdvisor port number: 8080

# Run on a swarm manager node.
# Deploys one cAdvisor container on every node (--mode=global).
# published=8080 pins the host port to the one the Prometheus job scrapes;
# without it Docker picks a random high-numbered port.
docker service create --name cadvisor --mode=global \
  --publish published=8080,target=8080,mode=host \
  --mount type=bind,src=/var/run/docker.sock,dst=/var/run/docker.sock,ro \
  --mount type=bind,src=/,dst=/rootfs,ro \
  --mount type=bind,src=/var/run,dst=/var/run \
  --mount type=bind,src=/sys,dst=/sys,ro \
  --mount type=bind,src=/var/lib/docker,dst=/var/lib/docker,ro \
  google/cadvisor -docker_only

Node_exporter

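node_exporter port number: 9100

# Deploys one node_exporter container on every node (--mode=global).
# mode=host binds port 9100 to the local task, so each address in the scrape job
# reaches that node's own exporter instead of one picked by the routing mesh.
docker service create --name node_exporter --mode=global \
  --publish published=9100,target=9100,mode=host \
  --mount type=bind,src=/,dst=/host,ro,bind-propagation=rslave \
  quay.io/prometheus/node-exporter --path.rootfs=/host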

Check that each service is running on every node

# docker service ls |egrep 'node_exporter|cadvisor'
[CONTAINERID]        cadvisor                                                     global              2/2                    google/cadvisor:latest                                                *:8080->8080/tcp
[CONTAINERID]        node_exporter                                                global              2/2                    quay.io/prometheus/node-exporter:latest                               *:9100->9100/tcp

You can also open the metrics URLs in a web browser

- http://192.168.0.2:8080/metrics

- http://192.168.0.3:9100/metrics
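
The same check can be done from a terminal. The sketch below assumes curl is installed and that the machine running it can reach the nodes. The helper names metrics_url and check_metrics are made up for this example.

```shell
# Hypothetical helpers for illustration only.

# Build the metrics URL for a host and port.
metrics_url() {
  # $1 = host, $2 = port
  printf 'http://%s:%s/metrics\n' "$1" "$2"
}

# Succeed when the endpoint answers with a non-error HTTP status within 5 seconds.
check_metrics() {
  curl --silent --fail --max-time 5 --output /dev/null "$(metrics_url "$1" "$2")"
}

# Example: check both agents on both nodes.
# for host in 192.168.0.2 192.168.0.3; do
#   for port in 8080 9100; do
#     if check_metrics "$host" "$port"; then
#       echo "OK   $host:$port"
#     else
#       echo "FAIL $host:$port"
#     fi
#   done
# done
```

Port 8080 is cAdvisor and port 9100 is node_exporter, as deployed above.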

Check on Prometheus

In the Prometheus web UI, Status > Targets lists the scrape jobs configured earlier
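
The target state can also be read without the UI, through the built-in up metric and the Prometheus HTTP API. This is a minimal sketch that assumes curl is installed and that Prometheus listens on localhost:9090 (its default port); set PROM_URL if yours differs. The helper names up_selector and prom_query are made up for this example.

```shell
# Assumption: Prometheus is reachable here; override PROM_URL otherwise.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Build the PromQL selector for the "up" metric of one scrape job.
up_selector() {
  printf 'up{job="%s"}\n' "$1"
}

# Run an instant query against the HTTP API and print the raw JSON response.
prom_query() {
  curl --silent --fail --get --data-urlencode "query=$1" "$PROM_URL/api/v1/query"
}

# Example:
# prom_query "$(up_selector dsg-container)"
# prom_query "$(up_selector dsg-node_exporter)"
```

Each series in the result has the value 1 when the last scrape of that target succeeded and 0 when it failed.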

Grafana

Create or import a new dashboard

Import an existing dashboard from the community (Grafana Labs: https://grafana.com/grafana/dashboards?pg=dashboards&plcmt=featured-sub1)

 


Sample query

Sample queries for monitoring a Docker Swarm cluster

Docker node count

count(cadvisor_version_info)

System load on each node

$instance is a Grafana dashboard variable; define it in the dashboard settings with the query label_values(instance)

node_load5{instance=~"$instance"}

Available memory on node

node_memory_MemAvailable_bytes{instance=~"$instance"}

Memory usage per container

$topk, $service_name and $node are Grafana dashboard variables, like $instance above

label_replace(
  topk($topk,
    sum(container_memory_usage_bytes{
      container_label_com_docker_stack_namespace=~".+",
      container_label_com_docker_swarm_service_name=~"$service_name",
      container_label_com_docker_swarm_node_id=~"$node"
    }) by (name, container_label_com_docker_swarm_task_name)
  ),
  "task_name", "$1", "container_label_com_docker_swarm_task_name", "(.*\\.[0-9]*).*\\..*"
)
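
The label_replace call copies a shortened task name into a new task_name label. To see what the regular expression keeps, the sketch below applies the same pattern with sed -E as a stand-in for the Prometheus regex engine. It assumes task names of the form service.slot.taskid; the sample name and the function short_task_name are made up for this example.

```shell
# Hypothetical helper: apply the label_replace pattern to one task name.
# PromQL anchors the regex to the whole label value, so ^ and $ are added here.
short_task_name() {
  printf '%s\n' "$1" | sed -E 's/^(.*\.[0-9]*).*\..*$/\1/'
}

# Example with a made-up task name:
# short_task_name 'mystack_web.1.abc123def'
```

The capture group keeps the service name and the slot number and drops the trailing task ID, which gives shorter legends in Grafana.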

 
