
Common kube-state-metrics Metrics and Their Meanings

Node metrics

Number of nodes:
sum(kube_node_info)

Unschedulable (cordoned) nodes:
sum(kube_node_spec_unschedulable)

CPU cores per node:
sum(kube_node_status_capacity{resource="cpu"}) by (node)

Memory capacity per node (bytes):
sum(kube_node_status_capacity{resource="memory"}) by (node)

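Nodes under disk pressure (the series value is 1 only while the condition holds, hence the == 1 filter):
kube_node_status_condition{condition="DiskPressure",status="true"} == 1

Nodes under memory pressure:
kube_node_status_condition{condition="MemoryPressure",status="true"} == 1

Nodes under PID pressure:
kube_node_status_condition{condition="PIDPressure",status="true"} == 1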
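The node queries above can also be run programmatically. The following is a minimal sketch, not a definitive client: it builds an instant-query URL for the Prometheus HTTP API (`/api/v1/query`) and extracts node names from a response. The address `http://prometheus.example:9090`, the helper names, and the sample payload are assumptions made for illustration.

```python
from __future__ import annotations

import json
from urllib.parse import urlencode

# Hypothetical Prometheus address (assumption); replace with your own.
PROM_URL = "http://prometheus.example:9090"


def build_query_url(promql: str, base_url: str = PROM_URL) -> str:
    """Return the instant-query URL for a PromQL expression."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


def nodes_from_response(body: str) -> list[str]:
    """Extract the `node` label from each series of an instant-vector response."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError("query failed")
    return sorted(s["metric"]["node"] for s in payload["data"]["result"])


# Hand-written sample in the shape the HTTP API returns for an instant vector.
SAMPLE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {"node": "node-b", "condition": "DiskPressure", "status": "true"},
                "value": [1700000000, "1"],
            },
        ],
    },
})

query = 'kube_node_status_condition{condition="DiskPressure",status="true"} == 1'
url = build_query_url(query)
print(url)
print(nodes_from_response(SAMPLE))
```

In a real setup the response body would come from an HTTP GET on the printed URL; the sample string only stands in for that call.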
           

Deployment metrics

Replicas per deployment:
kube_deployment_status_replicas

Total replicas across all deployments:
sum(kube_deployment_status_replicas)

Updated replicas:
kube_deployment_status_replicas_updated

Unavailable replicas:
kube_deployment_status_replicas_unavailable
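To see how the replica metrics combine, the sketch below joins two instant-vector results on the `namespace` and `deployment` labels and keeps only deployments that report unavailable replicas. The function names and the sample data are hypothetical; the input shape mirrors the `data.result` list of a Prometheus instant-query response.

```python
from __future__ import annotations


def by_deployment(result: list[dict]) -> dict[tuple[str, str], float]:
    """Index series values by (namespace, deployment)."""
    return {
        (s["metric"]["namespace"], s["metric"]["deployment"]): float(s["value"][1])
        for s in result
    }


def degraded(
    replicas: list[dict], unavailable: list[dict]
) -> dict[tuple[str, str], tuple[float, float]]:
    """Map each degraded deployment to (unavailable, reported replicas)."""
    total = by_deployment(replicas)
    return {
        key: (count, total.get(key, 0.0))
        for key, count in by_deployment(unavailable).items()
        if count > 0
    }


# Hypothetical results for kube_deployment_status_replicas and
# kube_deployment_status_replicas_unavailable.
REPLICAS = [
    {"metric": {"namespace": "shop", "deployment": "web"}, "value": [1700000000, "4"]},
    {"metric": {"namespace": "shop", "deployment": "api"}, "value": [1700000000, "2"]},
]
UNAVAILABLE = [
    {"metric": {"namespace": "shop", "deployment": "web"}, "value": [1700000000, "1"]},
    {"metric": {"namespace": "shop", "deployment": "api"}, "value": [1700000000, "0"]},
]

print(degraded(REPLICAS, UNAVAILABLE))
```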
           

Pod metrics

Pod phase (the value is 1 when the pod is in the given phase, otherwise 0):
kube_pod_status_phase{phase="Running"}
kube_pod_status_phase{phase="Failed"}
kube_pod_status_phase{phase="Succeeded"}
kube_pod_status_phase{phase="Pending"}
kube_pod_status_phase{phase="Unknown"}

Pods with a container restart in the last 30 minutes:
sum by (namespace, pod) (changes(kube_pod_container_status_restarts_total[30m])) > 0
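It helps to be clear about what `changes()` measures: it counts how many times the sample value changed inside the window, not how far the counter rose. The hypothetical `count_changes` helper below imitates that behavior on a plain list of scraped values; it is an illustration of the idea, not Prometheus code.

```python
def count_changes(samples):
    """Count how many consecutive sample pairs differ, in the spirit of PromQL changes()."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)


# Restart counter as seen on five consecutive scrapes (made-up values).
# The counter rose by 3, but two restarts fell between the last two scrapes.
scrapes = [3, 3, 4, 4, 6]
print(count_changes(scrapes))
```

Because several restarts between two scrapes register as a single change, `increase()` over the same range may be the better fit when the restart count itself matters.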
           

Container metrics

Container state:
kube_pod_container_status_running
kube_pod_container_status_waiting
kube_pod_container_status_terminated

Containers restarted in the last 30 minutes:
changes(kube_pod_container_status_restarts_total[30m]) > 0

Requested CPU (cores):
kube_pod_container_resource_requests{resource="cpu"}

Requested memory (bytes):
kube_pod_container_resource_requests{resource="memory"}
           

PV/PVC metrics

PVC phase:
kube_persistentvolumeclaim_status_phase{phase="Bound"}
kube_persistentvolumeclaim_status_phase{phase="Pending"}
kube_persistentvolumeclaim_status_phase{phase="Lost"}

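PVC requested size (GiB):
sum(kube_persistentvolumeclaim_resource_requests_storage_bytes / 1024 / 1024 / 1024) by (namespace, persistentvolumeclaim)

PV capacity (bytes):
sum(kube_persistentvolume_capacity_bytes) by (persistentvolume)

Volume utilization (used / capacity); the kubelet_volume_stats_* metrics are exposed by the kubelet rather than by kube-state-metrics:
kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes

Available volume space (bytes) per PVC:
sum(kubelet_volume_stats_available_bytes) by (persistentvolumeclaim)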
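The arithmetic behind the storage queries is simple enough to check by hand. As a small sketch with made-up numbers, the helpers below reproduce the division by 1024 three times (bytes to GiB) and the used/capacity ratio; the function names are illustrative only.

```python
GIB = 1024 ** 3  # bytes per GiB, the same as dividing by 1024 three times


def bytes_to_gib(size_bytes: float) -> float:
    """Convert a byte count to GiB."""
    return size_bytes / GIB


def utilization(used_bytes: float, capacity_bytes: float) -> float:
    """Return used / capacity as a fraction between 0 and 1."""
    if capacity_bytes <= 0:
        raise ValueError("capacity must be positive")
    return used_bytes / capacity_bytes


# Hypothetical volume: 5 GiB used out of 20 GiB.
used = 5 * GIB
capacity = 20 * GIB
print(bytes_to_gib(capacity))
print(utilization(used, capacity))
```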
           
