
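SpringBoot + Actuator + Prometheus + Consul + Grafana Monitoring Series (Part 2)

The previous article set up a simple health-indicator check. This part continues the integration.

The project address is given at the end of the article.

Downloading the required tools

Tools needed to build the monitoring platform:

grafana: visualizes the monitoring data;

Official download: https://grafana.com/

prometheus: collects the monitoring data;

Official download: https://prometheus.io/download/#prometheus

node_exporter: exports the server's own metrics for Prometheus to scrape;

Official download: https://prometheus.io/download/#node_exporter

consul: service discovery;

Official download: https://www.consul.io/

Building the monitoring platform on the server

Writing the install and uninstall scripts

To make the environment easy to migrate, and easy for others to use, everything here is wrapped into one-click install, one-click start, and one-click uninstall scripts.

Upload the downloaded tools to a dedicated directory on the server. Here that directory is /data/monitor, which keeps everything in one place.

It contains two subdirectories: install and exporter-install.

1: Upload the grafana, prometheus, and consul packages to the install directory and write the install script there.
Script name: install-monitor.sh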
#!/bin/bash

installdir="/data/monitor"
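# Resolve the directory that contains this script, so that it also works
# when the script is invoked as "bash install-monitor.sh" (no slash in $0)
toolsdir=$(cd "$(dirname "$0")" && pwd)
echo "$toolsdir"
cd "$toolsdir" || exit 1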
echo "======================unpackaging monitor app to $installdir/app ========================="
mkdir -p $installdir/app
rm -r -f $installdir/app/*
mkdir -p $installdir/app/grafana
mkdir -p $installdir/app/prometheus
mkdir -p $installdir/app/consul

tar -zxvf grafana-6.1.3.linux-amd64.tar.gz -C $installdir/app/grafana
mv $installdir/app/grafana/grafana*/*   $installdir/app/grafana
rmdir $installdir/app/grafana/grafana*

tar -zxvf prometheus-2.8.1.linux-amd64.tar.gz -C $installdir/app/prometheus
mv $installdir/app/prometheus/prometheus*/* $installdir/app/prometheus
rmdir $installdir/app/prometheus/prometheus-*

unzip consul_1.4.4_linux_amd64.zip -d $installdir/app/consul

mkdir -p $installdir/data
rm -r -f $installdir/data/*
mkdir -p $installdir/data/consul
mkdir -p $installdir/data/prometheus
mkdir -p $installdir/data/grafana
mkdir -p $installdir/cfg
rm -r -f $installdir/cfg/*
mkdir -p $installdir/cfg/consul
mkdir -p $installdir/cfg/prometheus
mkdir -p $installdir/cfg/grafana

mkdir -p $installdir/log
mkdir -p $installdir/bin
echo "==============================unpackage monitor app success  ============================="

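2: While we are at it, write the uninstall script as well. Before uninstalling, stop everything with the one-click stop script; the start and stop scripts are written further below.
Script name: uninstall-monitor.sh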
#!/bin/bash

installdir="/data/monitor"
# Resolve the directory that contains this script, even when $0 has no slash
toolsdir=$(cd "$(dirname "$0")" && pwd)
echo "$toolsdir"
cd "$toolsdir" || exit 1

echo "==========================uninstall app from $installdir=============================="

rm -r -f $installdir/app
rm -r -f $installdir/cfg
rm -r -f $installdir/data
rm -r -f $installdir/log

echo "==========================uninstall app success=============================="

3: Go back up one directory level and, in the exporter-install directory, write the node_exporter install and uninstall scripts:
Script name: install-node-exporter.sh
#!/bin/bash

installdir="/data/monitor"
# Resolve the directory that contains this script, even when $0 has no slash
toolsdir=$(cd "$(dirname "$0")" && pwd)
echo "$toolsdir"
cd "$toolsdir" || exit 1
echo "======================unpackaging monitor exporter to $installdir/exporter ========================="
mkdir -p $installdir/exporter
mkdir -p $installdir/exporter/node_exporter
rm -r -f $installdir/exporter/node_exporter/*
mkdir -p $installdir/log
mkdir -p $installdir/bin

tar -zxvf node_exporter-0.17.0.linux-amd64.tar.gz -C $installdir/exporter/node_exporter
mv $installdir/exporter/node_exporter/node_exporter*/*   $installdir/exporter/node_exporter
rmdir $installdir/exporter/node_exporter/node_exporter-*


echo "==============================unpackage monitor exporter success  ============================="
4: exporter uninstall script:
Script name: uninstall-node-exporter.sh
#!/bin/bash

installdir="/data/monitor"
# Resolve the directory that contains this script, even when $0 has no slash
toolsdir=$(cd "$(dirname "$0")" && pwd)
echo "$toolsdir"
cd "$toolsdir" || exit 1
echo "======================uninstall monitor exporter from $installdir/exporter ========================="

rm -r -f "$installdir/exporter/node_exporter"

echo "==============================uninstall monitor exporter success  ============================="
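
The install scripts above unpack each archive and then lift its contents up one level with mv and rmdir. One possible alternative is GNU tar's --strip-components=1 option, which drops the archive's top-level directory during extraction. The following is only a sketch under that assumption; the function name extract_flat is hypothetical and is not part of the original scripts.

```shell
#!/bin/bash
# Hypothetical helper (not in the original scripts): extract a .tar.gz archive
# into a target directory while dropping the archive's single top-level folder.
extract_flat() {
    local archive="$1"
    local target="$2"
    mkdir -p "$target" || return 1
    # --strip-components=1 removes the leading "name-version/" path element,
    # which replaces the separate mv + rmdir steps.
    tar -zxf "$archive" -C "$target" --strip-components=1
}
```

With this helper, the Grafana step could be written as: extract_flat grafana-6.1.3.linux-amd64.tar.gz /data/monitor/app/grafana. The consul package is a zip that holds a single binary, so it would still be unpacked with unzip.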
           

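Configuration files

Copy each tool's configuration file from its installation package into the matching directory under /data/monitor/cfg/, leaving the original configuration files untouched.

1: Grafana configuration file (the start script further below loads it from /data/monitor/cfg/grafana/custom.ini). The main setting to change is the host; the port defaults to 3000 and is left unchanged here. For other settings, see:
https://grafana.com/docs/installation/configuration/
2: Prometheus configuration file (prometheus.yml), which defines the jobs to monitor. Reference: https://prometheus.io/docs/prometheus/latest/configuration/configuration/
Core configuration: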
scrape_configs:
    - job_name: 'prometheus'
      static_configs:
          - targets: ['localhost:7002']
    # metrics of the server itself (exposed by node_exporter)
    - job_name: 'node_server'
      static_configs:
          - targets: ['localhost:9100']
    # obtain the scrape targets from the Consul registry
    - job_name: 'metrics'
      metrics_path: /monitor/actuator/prometheus
      consul_sd_configs:
          - server: localhost:7001
            tag: metrics
      relabel_configs:
          - source_labels: ["__meta_consul_service"]
            regex: "(.*)"
            replacement: $1
            action: replace
            target_label: "service"
3: Consul configuration file. Only the main settings are configured here; for a description of each key, see the official documentation:
https://www.consul.io/docs/agent/options.html

{
  "bootstrap_expect" : 1,
  "data_dir" : "/data/monitor/data/consul",
  "pid_file" : "/data/monitor/data/consul/consul.pid",
  "node_name" : "agent-one",
  "bind_addr" : "127.0.0.1",
  "client_addr" : "0.0.0.0",
  "ports" : {
    "http" : 7001
  },
  "ui" : true
}
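
The 'metrics' job in the Prometheus configuration only picks up services that are registered in Consul with the tag metrics. Purely as an illustration of what such a registration looks like, a service can be registered by hand through Consul's HTTP API (PUT /v1/agent/service/register) on the HTTP port 7001 configured above. This is a sketch, not the way the project itself registers services; the function names and the service name, address, and port in the usage note are hypothetical.

```shell
#!/bin/bash
# Build the JSON body for Consul's /v1/agent/service/register endpoint.
# The "metrics" tag is what the consul_sd_configs filter in prometheus.yml matches.
build_registration() {
    local name="$1" address="$2" port="$3"
    printf '{"Name": "%s", "Tags": ["metrics"], "Address": "%s", "Port": %s}\n' \
        "$name" "$address" "$port"
}

# Send the registration to the local Consul agent (HTTP port 7001, as in the
# Consul configuration file). Requires curl and a running agent.
register_service() {
    build_registration "$@" | curl -s -X PUT --data-binary @- \
        "http://localhost:7001/v1/agent/service/register"
}
```

For example, register_service demo-service 127.0.0.1 8080 would register a made-up service. If that service really exposes /monitor/actuator/prometheus, the 'metrics' job should then discover and scrape it.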
           

That completes the configuration files.

One-click start and stop scripts

1: Grafana start script (start_grafana.sh)
#!/bin/bash

monitorpath="/data/monitor"
cd ${monitorpath}/app/grafana
echo "start grafana at : `date` " >${monitorpath}/log/grafana_runtime.log
nohup ${monitorpath}/app/grafana/bin/grafana-server --config ${monitorpath}/cfg/grafana/custom.ini > ${monitorpath}/log/grafana.log 2>&1 &
echo $! > ${monitorpath}/log/grafana.pid
2: Grafana stop script (stop_grafana.sh)
#!/bin/bash

echo "stop program at : `date` " > /data/monitor/log/grafana_runtime.log
if [ -f "/data/monitor/log/grafana.pid" ]; then
    kill -9 `cat /data/monitor/log/grafana.pid`
    rm /data/monitor/log/grafana.pid
else
    echo "file grafana.pid not exist"
fi

3: Prometheus start script (start_prometheus.sh)
#!/bin/bash
monitorpath="/data/monitor"
cd ${monitorpath}/app/prometheus

echo "start prometheus at : `date` " > ${monitorpath}/log/prometheus_runtime.log
nohup ${monitorpath}/app/prometheus/prometheus --web.listen-address=:7002 \
        --config.file=${monitorpath}/cfg/prometheus/prometheus.yml \
        --web.read-timeout=5m \
        --web.enable-admin-api \
        --web.max-connections=10 \
        --query.timeout=2m \
        --query.max-concurrency=20 \
        --storage.tsdb.path=${monitorpath}/data/prometheus/ > ${monitorpath}/log/prometheus.log 2>&1 &
echo $! > ${monitorpath}/log/prometheus.pid
4: Prometheus stop script (stop_prometheus.sh)
#!/bin/bash
echo "stop prometheus at : `date` " > /data/monitor/log/prometheus_runtime.log
if [ -f "/data/monitor/log/prometheus.pid" ]; then
    kill -9 `cat /data/monitor/log/prometheus.pid`
    rm /data/monitor/log/prometheus.pid
else
    echo "file prometheus.pid not exist"
fi
5: node_exporter start script (start_node_exporter.sh)
#!/bin/bash
monitorpath="/data/monitor"
cd ${monitorpath}/exporter/node_exporter
echo "start node_exporter at : `date` " > ${monitorpath}/log/node_exporter_runtime.log
nohup ${monitorpath}/exporter/node_exporter/node_exporter > ${monitorpath}/log/node_exporter.log 2>&1 &
echo $! > ${monitorpath}/log/node_exporter.pid
6: node_exporter stop script (stop_node_exporter.sh)
#!/bin/bash
echo "stop node_exporter at : `date` " > /data/monitor/log/node_exporter_runtime.log
if [ -f "/data/monitor/log/node_exporter.pid" ]; then
    kill -9 `cat /data/monitor/log/node_exporter.pid`
    rm /data/monitor/log/node_exporter.pid
else
    echo "file node_exporter.pid not exist"
fi
7: Consul start script (start_consul.sh)
#!/bin/bash

monitorpath="/data/monitor"
cd ${monitorpath}/app/consul

echo "start consul at : `date` " > ${monitorpath}/log/consul_runtime.log
nohup ${monitorpath}/app/consul/consul agent -server -config-dir="${monitorpath}/cfg/consul" > ${monitorpath}/log/consul.log 2>&1 &
echo $! > ${monitorpath}/log/consul.pid
8: Consul stop script (stop_consul.sh)
#!/bin/bash

echo "stop consul at: `date` " > /data/monitor/log/consul_runtime.log
if [ -f "/data/monitor/log/consul.pid" ]; then
    kill -9 `cat /data/monitor/log/consul.pid`
    rm /data/monitor/log/consul.pid
else
    echo "file consul.pid not exist"
fi
9: One-click start script
#!/bin/bash

cd `dirname $0`
# start node_exporter
echo "start node_exporter"
./start_node_exporter.sh

#start consul
echo "start consul"
./start_consul.sh

#start prometheus
echo "start prometheus"
./start_prometheus.sh

#start grafana
echo "start grafana"
./start_grafana.sh

echo "see logs at /data/monitor/log"

10: One-click stop script
#!/bin/bash

cd `dirname $0`

# stop grafana
echo "stop grafana"
./stop_grafana.sh

# stop prometheus
echo "stop prometheus"
./stop_prometheus.sh

# stop consul
echo "stop consul"
./stop_consul.sh


# stop node_exporter
echo "stop node_exporter"
./stop_node_exporter.sh
echo "see logs at /data/monitor/log"
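
The four stop scripts above share one pattern and all use kill -9, which gives a process no chance to shut down cleanly. A gentler variant, sketched here as an assumption rather than as part of the original project, sends SIGTERM first and only falls back to SIGKILL after a timeout. The function name stop_by_pidfile and the 10-second default are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: stop the process recorded in a pid file.
# Sends SIGTERM first and falls back to SIGKILL after a timeout (in seconds).
stop_by_pidfile() {
    local pidfile="$1"
    local timeout="${2:-10}"
    if [ ! -f "$pidfile" ]; then
        echo "file $pidfile not exist"
        return 1
    fi
    local pid
    pid=$(cat "$pidfile")
    kill "$pid" 2>/dev/null          # polite request (SIGTERM)
    local waited=0
    # kill -0 sends no signal; it only tests whether the process still exists
    while kill -0 "$pid" 2>/dev/null && [ "$waited" -lt "$timeout" ]; do
        sleep 1
        waited=$((waited + 1))
    done
    if kill -0 "$pid" 2>/dev/null; then
        kill -9 "$pid" 2>/dev/null   # still alive after the timeout: force it
    fi
    rm -f "$pidfile"
}
```

A stop script could then shrink to a single call such as stop_by_pidfile /data/monitor/log/prometheus.pid.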
           

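That completes all the basic scripts. Go to the /data/monitor/bin directory and run the one-click start script, then check that every process is running:

ps -ef | grep consul

ps -ef | grep prometheus

ps -ef | grep grafana

ps -ef | grep node_export

You can also query the exported data directly: curl localhost:9100/metrics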
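
The output of ps -ef | grep also lists the grep process itself, so it takes a moment to read. Because every start script writes a pid file under /data/monitor/log, the same check can be sketched with kill -0, which only tests whether a process exists. The function names below are hypothetical and not part of the original project.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the process named in a pid file is alive.
is_running() {
    local pidfile="$1"
    [ -f "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null
}

# Print the status of the four services, based on the pid files written
# by the start scripts (default directory: /data/monitor/log).
check_all() {
    local logdir="${1:-/data/monitor/log}"
    local name
    for name in node_exporter consul prometheus grafana; do
        if is_running "$logdir/$name.pid"; then
            echo "$name: running"
        else
            echo "$name: not running"
        fi
    done
}
```

Calling check_all without arguments reads the pid files from /data/monitor/log and prints one line per service.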

For reasons of length, the rest continues in the next article.

GitHub: https://github.com/qinyunsurd/monitor