Related reading:
Monitoring a Kubernetes Cluster with Prometheus Operator
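I. Introduction
An earlier article showed how to deploy Prometheus in a Kubernetes cluster, which already lets Prometheus monitor data such as Pod status and the status of core components. How, then, do we integrate an application's own metrics?
First, Prometheus expects the data it collects in a specific text format, shown below. Prometheus only scrapes data that conforms to this format, so the application must expose its key monitoring metrics in this form.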
nginx_http_connections{state="active"} 2
nginx_http_connections{state="reading"} 0
nginx_http_connections{state="waiting"} 1
nginx_http_connections{state="writing"} 1
nginx_http_request_bytes_sent{host="10.46.0.4"} 11055968
nginx_http_request_bytes_sent{host="testservers"} 4640
nginx_http_request_time_bucket{host="10.46.0.4",le="00.005"} 3960
...
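To make the structure of these lines concrete, here is a minimal Python sketch that splits one sample line into its metric name, labels, and value. It is a simplified illustration, not the parser Prometheus itself uses: the helper name `parse_sample` is hypothetical, and optional timestamps and escaped characters inside label values are not handled.

```python
import re

# One sample line has the shape: name{label="value",...} value
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Split a sample line into (metric name, labels dict, float value)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a valid sample line: {line!r}")
    name, raw_labels, raw_value = match.groups()
    labels = dict(LABEL_RE.findall(raw_labels or ""))
    # float() also accepts "+Inf", "-Inf" and "NaN", which the format allows.
    return name, labels, float(raw_value)


if __name__ == "__main__":
    print(parse_sample('nginx_http_connections{state="active"} 2'))
```

Each line therefore carries three pieces of information: the metric name, an optional set of label pairs in braces, and a numeric value.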
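How the metrics get exposed usually depends on the application itself. The Prometheus community also provides a large number of official and third-party exporters, which let adopters quickly set up monitoring for key services and infrastructure.
For the official and third-party exporters, see the exporter list in the Prometheus documentation.
Here we recommend using a Prometheus client library directly to register a metrics endpoint in the application. Client libraries currently exist for most programming languages; see the client library list in the Prometheus documentation for the supported languages.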
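As a rough illustration of what such a metrics endpoint does, the sketch below serves a single counter at `/metrics` using only the Python standard library. It deliberately does not use the official client library (which handles registration, metric types, and thread safety for you); the metric name `demo_http_requests_total`, the `REQUESTS` dictionary, and the `serve` helper are all hypothetical names chosen for this example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process counter, keyed by HTTP status code.
REQUESTS = {"200": 0}


def render_metrics(requests_by_status):
    """Render the counters in the Prometheus text exposition format."""
    lines = [
        "# HELP demo_http_requests_total Number of HTTP requests",
        "# TYPE demo_http_requests_total counter",
    ]
    for status, count in sorted(requests_by_status.items()):
        lines.append(f'demo_http_requests_total{{status="{status}"}} {count}')
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(REQUESTS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=9527):
    """Block and serve /metrics; 9527 mirrors the metrics port of the demo image."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` starts the endpoint. In a real application a client library is the better choice, since it also covers gauges, histograms, and concurrent updates.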
II. Application Monitoring
This walkthrough uses an nginx demo image.
1. Create the Deployment and Service
# apps/v1beta1 was removed in Kubernetes 1.16; apps/v1 is the current Deployment API.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-demo
  labels:
    app: nginx-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-demo
  template:
    metadata:
      labels:
        app: nginx-demo
    spec:
      containers:
      - name: nginx-demo
        image: billy98/nginx-prometheus-metrics:latest
        ports:
        - name: http-metrics
          containerPort: 9527
        - name: web
          containerPort: 80
        - name: test
          containerPort: 1314
        imagePullPolicy: IfNotPresent
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: nginx-demo
  name: nginx-demo
  namespace: default
spec:
  ports:
  - name: http-metrics
    port: 9527
    protocol: TCP
    targetPort: 9527
  - name: web
    port: 80
    protocol: TCP
    targetPort: 80
  - name: test
    port: 1314
    protocol: TCP
    targetPort: 1314
  selector:
    app: nginx-demo
  type: ClusterIP
2. Create the ServiceMonitor
The Prometheus resource is configured to pick up only ServiceMonitors that carry the label release: p, so the ServiceMonitor for our application must include this label.
[root@node-01 ~]# kubectl -n monitoring get prometheus p-prometheus -o yaml
...
  serviceMonitorSelector:
    matchLabels:
      release: p
... (remaining content omitted)
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: nginx-demo
    release: p
  name: nginx-demo
  namespace: monitoring   # the namespace Prometheus runs in
spec:
  endpoints:
  - interval: 15s
    port: http-metrics
  namespaceSelector:
    matchNames:
    - default             # the namespace of the nginx demo
  selector:
    matchLabels:
      app: nginx-demo
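This needs some explanation. ServiceMonitor is an abstraction introduced by Prometheus Operator. Its purpose is to turn the configuration of Prometheus scrape targets into dynamic discovery: a ServiceMonitor hooks into the Service that fronts a Deployment, selects that Service through a label selector, and automatically discovers the backend containers. Note that in this setup the metadata namespace field is always monitoring, the namespace Prometheus runs in, while namespaceSelector names the namespace where the application lives.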
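The selection logic described above can be sketched in a few lines of Python. This is a simplified model of the behavior, not the operator's actual implementation: the dictionaries mirror only the fields used in the manifests above, the function names are hypothetical, and the `tomcat` entry's label and port name are made up for contrast.

```python
# Simplified mirror of the ServiceMonitor manifest above.
SERVICE_MONITOR = {
    "spec": {
        "endpoints": [{"port": "http-metrics", "interval": "15s"}],
        "namespaceSelector": {"matchNames": ["default"]},
        "selector": {"matchLabels": {"app": "nginx-demo"}},
    }
}

# Simplified Services; the tomcat label and port name are assumptions.
SERVICES = [
    {
        "namespace": "default",
        "name": "nginx-demo",
        "labels": {"app": "nginx-demo"},
        "ports": [
            {"name": "http-metrics", "targetPort": 9527},
            {"name": "web", "targetPort": 80},
            {"name": "test", "targetPort": 1314},
        ],
    },
    {
        "namespace": "default",
        "name": "tomcat",
        "labels": {"app": "tomcat"},
        "ports": [{"name": "http", "targetPort": 8080}],
    },
]


def labels_match(match_labels, labels):
    """True when every key/value pair in matchLabels is present in labels."""
    return all(labels.get(key) == value for key, value in match_labels.items())


def select_scrape_ports(service_monitor, services):
    """Return (namespace, service name, pod port) tuples the monitor selects."""
    spec = service_monitor["spec"]
    namespaces = spec["namespaceSelector"]["matchNames"]
    wanted_ports = {endpoint["port"] for endpoint in spec["endpoints"]}
    selected = []
    for svc in services:
        if svc["namespace"] not in namespaces:
            continue  # filtered out by namespaceSelector
        if not labels_match(spec["selector"]["matchLabels"], svc["labels"]):
            continue  # filtered out by the label selector
        for port in svc["ports"]:
            if port["name"] in wanted_ports:  # endpoints[].port is a port *name*
                selected.append((svc["namespace"], svc["name"], port["targetPort"]))
    return selected


if __name__ == "__main__":
    print(select_scrape_ports(SERVICE_MONITOR, SERVICES))
```

The point to notice is that `port: http-metrics` in the ServiceMonitor refers to the named Service port, so the name in the Service and the ServiceMonitor must agree.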
Once everything is created, the endpoints are visible:
[root@node-01 ~]# kubectl get ep
NAME                       ENDPOINTS                                      AGE
heketi                     10.45.0.3:8080                                 36d
heketi-storage-endpoints   10.31.90.204:1,10.31.90.205:1,10.31.90.206:1   36d
kubernetes                 10.31.90.201:6443                              38d
nginx-demo                 10.46.0.4:9527,10.46.0.4:80,10.46.0.4:1314     6h
tomcat                     10.34.0.0:8080                                 38d
Next, access 10.46.0.4:1314 to generate some test metrics.
[root@node-01 ~]# curl 10.46.0.4:1314
hello world
Then access 10.46.0.4:9527/metrics to view the monitoring data:
[root@node-01 ~]# curl 10.46.0.4:9527/metrics
# HELP nginx_http_connections Number of HTTP connections
# TYPE nginx_http_connections gauge
nginx_http_connections{state="active"} 2
nginx_http_connections{state="reading"} 0
nginx_http_connections{state="waiting"} 1
nginx_http_connections{state="writing"} 1
# HELP nginx_http_request_bytes_sent Number of HTTP request bytes sent
# TYPE nginx_http_request_bytes_sent counter
nginx_http_request_bytes_sent{host="10.46.0.4"} 16885674
nginx_http_request_bytes_sent{host="testservers"} 4800
# HELP nginx_http_request_time HTTP request time
# TYPE nginx_http_request_time histogram
nginx_http_request_time_bucket{host="10.46.0.4",le="00.005"} 4437
nginx_http_request_time_bucket{host="10.46.0.4",le="00.010"} 4437
nginx_http_request_time_bucket{host="10.46.0.4",le="00.020"} 4437
nginx_http_request_time_bucket{host="10.46.0.4",le="00.030"} 4437
nginx_http_request_time_bucket{host="10.46.0.4",le="00.050"} 4437
nginx_http_request_time_bucket{host="10.46.0.4",le="00.075"} 4437
nginx_http_request_time_bucket{host="10.46.0.4",le="00.100"} 4437
nginx_http_request_time_bucket{host="10.46.0.4",le="00.200"} 4437
nginx_http_request_time_bucket{host="10.46.0.4",le="00.300"} 4437
nginx_http_request_time_bucket{host="10.46.0.4",le="00.400"} 4437
nginx_http_request_time_bucket{host="10.46.0.4",le="00.500"} 4437
nginx_http_request_time_bucket{host="10.46.0.4",le="00.750"} 4437
nginx_http_request_time_bucket{host="10.46.0.4",le="01.000"} 4437
nginx_http_request_time_bucket{host="10.46.0.4",le="01.500"} 4437
nginx_http_request_time_bucket{host="10.46.0.4",le="02.000"} 4437
nginx_http_request_time_bucket{host="10.46.0.4",le="03.000"} 4443
nginx_http_request_time_bucket{host="10.46.0.4",le="04.000"} 4443
nginx_http_request_time_bucket{host="10.46.0.4",le="05.000"} 4443
nginx_http_request_time_bucket{host="10.46.0.4",le="10.000"} 4443
nginx_http_request_time_bucket{host="10.46.0.4",le="+Inf"} 4443
nginx_http_request_time_bucket{host="testservers",le="00.005"} 30
nginx_http_request_time_bucket{host="testservers",le="00.010"} 30
nginx_http_request_time_bucket{host="testservers",le="00.020"} 30
nginx_http_request_time_bucket{host="testservers",le="00.030"} 30
nginx_http_request_time_bucket{host="testservers",le="00.050"} 30
nginx_http_request_time_bucket{host="testservers",le="00.075"} 30
nginx_http_request_time_bucket{host="testservers",le="00.100"} 30
nginx_http_request_time_bucket{host="testservers",le="00.200"} 30
nginx_http_request_time_bucket{host="testservers",le="00.300"} 30
nginx_http_request_time_bucket{host="testservers",le="00.400"} 30
nginx_http_request_time_bucket{host="testservers",le="00.500"} 30
nginx_http_request_time_bucket{host="testservers",le="00.750"} 30
nginx_http_request_time_bucket{host="testservers",le="01.000"} 30
nginx_http_request_time_bucket{host="testservers",le="01.500"} 30
nginx_http_request_time_bucket{host="testservers",le="02.000"} 30
nginx_http_request_time_bucket{host="testservers",le="03.000"} 30
nginx_http_request_time_bucket{host="testservers",le="04.000"} 30
nginx_http_request_time_bucket{host="testservers",le="05.000"} 30
nginx_http_request_time_bucket{host="testservers",le="10.000"} 30
nginx_http_request_time_bucket{host="testservers",le="+Inf"} 30
nginx_http_request_time_count{host="10.46.0.4"} 4443
nginx_http_request_time_count{host="testservers"} 30
nginx_http_request_time_sum{host="10.46.0.4"} 12.044999361038
nginx_http_request_time_sum{host="testservers"} 0
# HELP nginx_http_requests Number of HTTP requests
# TYPE nginx_http_requests counter
nginx_http_requests{host="10.46.0.4",status="200"} 4443
nginx_http_requests{host="testservers",status="200"} 30
# HELP nginx_http_upstream_bytes_received Number of HTTP upstream bytes received
# TYPE nginx_http_upstream_bytes_received counter
nginx_http_upstream_bytes_received{addr="10.12.13.14:80"} 0
nginx_http_upstream_bytes_received{addr="127.0.0.1:80"} 4800
# HELP nginx_http_upstream_connect_time HTTP upstream connect time
# TYPE nginx_http_upstream_connect_time histogram
nginx_http_upstream_connect_time_bucket{addr="10.12.13.14:80",le="03.000"} 6
nginx_http_upstream_connect_time_bucket{addr="10.12.13.14:80",le="04.000"} 6
nginx_http_upstream_connect_time_bucket{addr="10.12.13.14:80",le="05.000"} 6
nginx_http_upstream_connect_time_bucket{addr="10.12.13.14:80",le="10.000"} 6
nginx_http_upstream_connect_time_bucket{addr="10.12.13.14:80",le="+Inf"} 6
nginx_http_upstream_connect_time_bucket{addr="127.0.0.1:80",le="00.005"} 30
nginx_http_upstream_connect_time_bucket{addr="127.0.0.1:80",le="00.010"} 30
nginx_http_upstream_connect_time_bucket{addr="127.0.0.1:80",le="00.020"} 30
nginx_http_upstream_connect_time_bucket{addr="127.0.0.1:80",le="00.030"} 30
nginx_http_upstream_connect_time_bucket{addr="127.0.0.1:80",le="00.050"} 30
nginx_http_upstream_connect_time_bucket{addr="127.0.0.1:80",le="00.075"} 30
nginx_http_upstream_connect_time_bucket{addr="127.0.0.1:80",le="00.100"} 30
nginx_http_upstream_connect_time_bucket{addr="127.0.0.1:80",le="00.200"} 30
nginx_http_upstream_connect_time_bucket{addr="127.0.0.1:80",le="00.300"} 30
nginx_http_upstream_connect_time_bucket{addr="127.0.0.1:80",le="00.400"} 30
nginx_http_upstream_connect_time_bucket{addr="127.0.0.1:80",le="00.500"} 30
nginx_http_upstream_connect_time_bucket{addr="127.0.0.1:80",le="00.750"} 30
nginx_http_upstream_connect_time_bucket{addr="127.0.0.1:80",le="01.000"} 30
nginx_http_upstream_connect_time_bucket{addr="127.0.0.1:80",le="01.500"} 30
nginx_http_upstream_connect_time_bucket{addr="127.0.0.1:80",le="02.000"} 30
nginx_http_upstream_connect_time_bucket{addr="127.0.0.1:80",le="03.000"} 30
nginx_http_upstream_connect_time_bucket{addr="127.0.0.1:80",le="04.000"} 30
nginx_http_upstream_connect_time_bucket{addr="127.0.0.1:80",le="05.000"} 30
nginx_http_upstream_connect_time_bucket{addr="127.0.0.1:80",le="10.000"} 30
nginx_http_upstream_connect_time_bucket{addr="127.0.0.1:80",le="+Inf"} 30
nginx_http_upstream_connect_time_count{addr="10.12.13.14:80"} 6
nginx_http_upstream_connect_time_count{addr="127.0.0.1:80"} 30
nginx_http_upstream_connect_time_sum{addr="10.12.13.14:80"} 12.015
nginx_http_upstream_connect_time_sum{addr="127.0.0.1:80"} 0.006
# HELP nginx_http_upstream_header_time HTTP upstream header time
# TYPE nginx_http_upstream_header_time histogram
nginx_http_upstream_header_time_bucket{addr="10.12.13.14:80",le="03.000"} 6
nginx_http_upstream_header_time_bucket{addr="10.12.13.14:80",le="04.000"} 6
nginx_http_upstream_header_time_bucket{addr="10.12.13.14:80",le="05.000"} 6
nginx_http_upstream_header_time_bucket{addr="10.12.13.14:80",le="10.000"} 6
nginx_http_upstream_header_time_bucket{addr="10.12.13.14:80",le="+Inf"} 6
nginx_http_upstream_header_time_bucket{addr="127.0.0.1:80",le="00.005"} 30
nginx_http_upstream_header_time_bucket{addr="127.0.0.1:80",le="00.010"} 30
nginx_http_upstream_header_time_bucket{addr="127.0.0.1:80",le="00.020"} 30
nginx_http_upstream_header_time_bucket{addr="127.0.0.1:80",le="00.030"} 30
nginx_http_upstream_header_time_bucket{addr="127.0.0.1:80",le="00.050"} 30
nginx_http_upstream_header_time_bucket{addr="127.0.0.1:80",le="00.075"} 30
nginx_http_upstream_header_time_bucket{addr="127.0.0.1:80",le="00.100"} 30
nginx_http_upstream_header_time_bucket{addr="127.0.0.1:80",le="00.200"} 30
nginx_http_upstream_header_time_bucket{addr="127.0.0.1:80",le="00.300"} 30
nginx_http_upstream_header_time_bucket{addr="127.0.0.1:80",le="00.400"} 30
nginx_http_upstream_header_time_bucket{addr="127.0.0.1:80",le="00.500"} 30
nginx_http_upstream_header_time_bucket{addr="127.0.0.1:80",le="00.750"} 30
nginx_http_upstream_header_time_bucket{addr="127.0.0.1:80",le="01.000"} 30
nginx_http_upstream_header_time_bucket{addr="127.0.0.1:80",le="01.500"} 30
nginx_http_upstream_header_time_bucket{addr="127.0.0.1:80",le="02.000"} 30
nginx_http_upstream_header_time_bucket{addr="127.0.0.1:80",le="03.000"} 30
nginx_http_upstream_header_time_bucket{addr="127.0.0.1:80",le="04.000"} 30
nginx_http_upstream_header_time_bucket{addr="127.0.0.1:80",le="05.000"} 30
nginx_http_upstream_header_time_bucket{addr="127.0.0.1:80",le="10.000"} 30
nginx_http_upstream_header_time_bucket{addr="127.0.0.1:80",le="+Inf"} 30
nginx_http_upstream_header_time_count{addr="10.12.13.14:80"} 6
nginx_http_upstream_header_time_count{addr="127.0.0.1:80"} 30
nginx_http_upstream_header_time_sum{addr="10.12.13.14:80"} 12.015
nginx_http_upstream_header_time_sum{addr="127.0.0.1:80"} 0.03
# HELP nginx_http_upstream_requests Number of HTTP upstream requests
# TYPE nginx_http_upstream_requests counter
nginx_http_upstream_requests{addr="10.12.13.14:80",status="504"} 6
nginx_http_upstream_requests{addr="127.0.0.1:80",status="200"} 30
# HELP nginx_http_upstream_response_time HTTP upstream response time
# TYPE nginx_http_upstream_response_time histogram
nginx_http_upstream_response_time_bucket{addr="10.12.13.14:80",le="03.000"} 6
nginx_http_upstream_response_time_bucket{addr="10.12.13.14:80",le="04.000"} 6
nginx_http_upstream_response_time_bucket{addr="10.12.13.14:80",le="05.000"} 6
nginx_http_upstream_response_time_bucket{addr="10.12.13.14:80",le="10.000"} 6
nginx_http_upstream_response_time_bucket{addr="10.12.13.14:80",le="+Inf"} 6
nginx_http_upstream_response_time_bucket{addr="127.0.0.1:80",le="00.005"} 30
nginx_http_upstream_response_time_bucket{addr="127.0.0.1:80",le="00.010"} 30
nginx_http_upstream_response_time_bucket{addr="127.0.0.1:80",le="00.020"} 30
nginx_http_upstream_response_time_bucket{addr="127.0.0.1:80",le="00.030"} 30
nginx_http_upstream_response_time_bucket{addr="127.0.0.1:80",le="00.050"} 30
nginx_http_upstream_response_time_bucket{addr="127.0.0.1:80",le="00.075"} 30
nginx_http_upstream_response_time_bucket{addr="127.0.0.1:80",le="00.100"} 30
nginx_http_upstream_response_time_bucket{addr="127.0.0.1:80",le="00.200"} 30
nginx_http_upstream_response_time_bucket{addr="127.0.0.1:80",le="00.300"} 30
nginx_http_upstream_response_time_bucket{addr="127.0.0.1:80",le="00.400"} 30
nginx_http_upstream_response_time_bucket{addr="127.0.0.1:80",le="00.500"} 30
nginx_http_upstream_response_time_bucket{addr="127.0.0.1:80",le="00.750"} 30
nginx_http_upstream_response_time_bucket{addr="127.0.0.1:80",le="01.000"} 30
nginx_http_upstream_response_time_bucket{addr="127.0.0.1:80",le="01.500"} 30
nginx_http_upstream_response_time_bucket{addr="127.0.0.1:80",le="02.000"} 30
nginx_http_upstream_response_time_bucket{addr="127.0.0.1:80",le="03.000"} 30
nginx_http_upstream_response_time_bucket{addr="127.0.0.1:80",le="04.000"} 30
nginx_http_upstream_response_time_bucket{addr="127.0.0.1:80",le="05.000"} 30
nginx_http_upstream_response_time_bucket{addr="127.0.0.1:80",le="10.000"} 30
nginx_http_upstream_response_time_bucket{addr="127.0.0.1:80",le="+Inf"} 30
nginx_http_upstream_response_time_count{addr="10.12.13.14:80"} 6
nginx_http_upstream_response_time_count{addr="127.0.0.1:80"} 30
nginx_http_upstream_response_time_sum{addr="10.12.13.14:80"} 12.015
nginx_http_upstream_response_time_sum{addr="127.0.0.1:80"} 0.03
# HELP nginx_metric_errors_total Number of nginx-lua-prometheus errors
# TYPE nginx_metric_errors_total counter
nginx_metric_errors_total 0
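The histogram series in this output are cumulative: each `le` bucket counts all observations less than or equal to its bound. As a small worked example, the Python sketch below derives the mean request time and the per-interval counts for host 10.46.0.4, using values copied from the output above. The function names are hypothetical, and only three of the buckets are included to keep the example short.

```python
def mean_from_histogram(total_sum, total_count):
    """Mean observation: the _sum series divided by the _count series."""
    return total_sum / total_count if total_count else 0.0


def per_bucket_counts(cumulative):
    """Convert cumulative (upper bound, count) pairs into per-interval counts."""
    counts = []
    previous = 0
    for upper_bound, count in cumulative:
        counts.append((upper_bound, count - previous))
        previous = count
    return counts


# Values taken from nginx_http_request_time for host="10.46.0.4".
REQUEST_TIME_SUM = 12.044999361038
REQUEST_TIME_COUNT = 4443
BUCKETS = [("02.000", 4437), ("03.000", 4443), ("+Inf", 4443)]

if __name__ == "__main__":
    print(mean_from_histogram(REQUEST_TIME_SUM, REQUEST_TIME_COUNT))
    print(per_bucket_counts(BUCKETS))
```

Read this way, 4437 requests finished within 2 seconds and 6 took between 2 and 3 seconds, which is consistent with the 6 upstream requests to 10.12.13.14:80 that returned status 504.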
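3. Verification
Open Prometheus to verify that data is being collected. Under Status, open Service Discovery; scraping is working when the number of active targets equals the number of Pods.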
![](https://img.laitimes.com/img/9ZDMuAjOiMmIsIjOiQnIsISPrdEZwZ1Rh5WNXp1bwNjW1ZUba9VZwlHdsATOfd3bkFGazxCMx8VesATMfhHLlN3XnxCMwEzX0xiRGZkRGZ0Xy9GbvNGLpZTY1EmMZVDUSFTU4VFRR9Fd4VGdsYTMfVmepNHLrJXYtJXZ0F2dvwVZnFWbp1zczV2YvJHctM3cv1Ce-cmbw5CZlZTMxE2YhRWO2ETY3YGNmBDMzcjZxU2MmV2N3kzN58CX0AzLcNDM5EDMy8CXn9Gbi9CXzV2Zh1WavwVbvNmLvR3YxUjLxM3Lc9CX6MHc0RHaiojIsJye.png)
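The same check can be scripted against the Prometheus HTTP API, whose `/api/v1/targets` endpoint lists active targets with their labels and health. The sketch below is a hedged example: the base URL is whatever address your Prometheus is reachable at, the helper names are hypothetical, and filtering on a `service` label assumes the target labels that Prometheus Operator usually attaches, which may differ in your setup.

```python
import json
from urllib.request import urlopen


def fetch_targets(base_url):
    """Fetch the target list from a Prometheus server (performs a network call)."""
    with urlopen(f"{base_url}/api/v1/targets") as resp:
        return json.load(resp)


def count_healthy(targets_response, wanted_labels):
    """Count active targets that carry wanted_labels and report health 'up'."""
    active = targets_response["data"]["activeTargets"]
    return sum(
        1
        for target in active
        if target["health"] == "up"
        and all(target["labels"].get(k) == v for k, v in wanted_labels.items())
    )
```

For example, `count_healthy(fetch_targets(url), {"service": "nginx-demo"})` should equal the number of nginx-demo Pods when scraping is healthy.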
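Open the Graph page, select the name of one of the metrics we just exposed, and click Execute to view the collected data. For the query syntax, see the query examples in the Prometheus documentation.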
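What the Graph page does can also be reproduced with an instant query against the Prometheus HTTP API at `/api/v1/query`. The following Python sketch assumes a hypothetical server address and uses hypothetical helper names; only `instant_query` touches the network.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, promql):
    """Build the instant-query URL for a PromQL expression."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


def instant_query(base_url, promql):
    """Run the query against a live Prometheus server."""
    with urlopen(build_query_url(base_url, promql)) as resp:
        return json.load(resp)


def vector_to_rows(response):
    """Flatten an instant-vector response into (labels, float value) pairs."""
    if response.get("status") != "success":
        raise RuntimeError(f"query failed: {response}")
    # Each sample is {"metric": {...labels...}, "value": [timestamp, "value"]}.
    return [
        (sample["metric"], float(sample["value"][1]))
        for sample in response["data"]["result"]
    ]
```

A call such as `vector_to_rows(instant_query("http://prometheus.example:9090", 'nginx_http_requests{status="200"}'))` would return one row per matching series; the host name here is a placeholder.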