
Implementing a Custom HPA with a Custom Prometheus

Question 1: Why customize the HPA?

Traditionally, both OCP and K8s implement HPA based on CPU utilization. Memory-based HPA is also possible, but it is less effective than CPU (a Java application's memory usage does not fluctuate as visibly as its CPU usage).

Scaling on CPU utilization alone is too one-dimensional, so we want a custom HPA, for example one driven by HTTP request volume, which tracks the application's actual load much more closely.

Question 2: What are the HPA scale-up and scale-down delays?

For CPU-based HPA, pods are added as soon as utilization is detected above the threshold. From a business perspective, the more responsive the scale-up, the better.

So when CPU utilization drops below the HPA threshold, does the HPA scale down immediately?

From a business perspective, we certainly do not want an immediate scale-down. We need a buffer period: "let the bullets fly for a while."

The scale-down buffer in OCP and K8s HPA defaults to 5 minutes. The relevant controller flags and their defaults are:

--horizontal-pod-autoscaler-cpu-initialization-period = 5 minutes
--horizontal-pod-autoscaler-downscale-stabilization = 5 minutes
--horizontal-pod-autoscaler-initial-readiness-delay = 30 seconds
--horizontal-pod-autoscaler-sync-period = 15 seconds
--horizontal-pod-autoscaler-tolerance = 0.1

These HPA defaults can be overridden, but doing so per HPA requires the autoscaling/v2beta2 API (its `behavior` field), which is only available in Kubernetes 1.18 and later. Since that API is still in beta, I do not recommend changing it on OCP: if a change does not take effect, I cannot guarantee that Red Hat support (the 800 hotline) will cover it. In practice the 5-minute default is also good enough. It is enough to know this capability exists; once the API goes GA it will not be too late to use it.

Here is roughly what such a modification looks like:

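A minimal sketch (not from this lab), assuming Kubernetes 1.18+ with autoscaling/v2beta2: the per-HPA `behavior` field overrides the controller-wide defaults, here shortening the scale-down buffer from 300 seconds to 120. The `example-app` Deployment name is hypothetical, for illustration only.

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa            # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-app          # hypothetical target Deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 120   # shorten the default 5-minute buffer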

Question 3: How to customize the HPA on OCP

OCP ships with Prometheus out of the box, but if you need custom scrape configuration I recommend deploying a separate Prometheus instance and letting the built-in one keep doing its default job. Strictly speaking, a custom HPA does not require a new Prometheus; OCP's built-in monitoring stack would work too. But following the principle of separating business monitoring from infrastructure monitoring, it makes sense to create a Prometheus dedicated to scraping application-level metrics, such as the HTTP request counts used in this lab.

At its core, Prometheus scrapes metrics from applications. In other words, for Prometheus to collect a metric from an application, the application must expose that metric (either instrumented directly in the code, or via a sidecar exporter).
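To make "exposing a metric" concrete, here is a minimal sketch assuming a Python app instrumented with the prometheus_client library (this is not the lab's actual demo app). The client serves the counter on :8080/metrics as http_requests_total, ready for Prometheus to scrape:

# Minimal sketch: expose an http_requests counter for Prometheus to scrape.
import time
from prometheus_client import Counter, start_http_server

REQUESTS = Counter("http_requests", "Total HTTP requests served")

if __name__ == "__main__":
    start_http_server(8080)   # serve /metrics on port 8080
    while True:
        REQUESTS.inc()        # stand-in for real request handling
        time.sleep(0.5)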


To implement HPA in OCP based on a custom metric such as http_requests, we need two things:

1. The application must expose the metric; otherwise Prometheus has nothing to scrape.

2. We must deploy a Prometheus adapter that knows about the http_requests metric. Once deployed, the adapter serves an API. After Prometheus scrapes the application's http_requests data, that data becomes available through the adapter; the HPA then talks to the adapter's API, and only then can it scale on http_requests. The adapter is the bridge that lets the HPA consume the http_requests metric. (A sample adapter rule is sketched below.)
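For reference, a prometheus-adapter rule that turns the scraped http_requests_total series into a pods/http_requests custom metric typically looks like the following. This is a sketch in the upstream k8s-prometheus-adapter config format, not necessarily the exact ConfigMap the lab's YAML creates:

rules:
- seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
  resources:
    overrides:
      namespace: {resource: "namespace"}
      pod: {resource: "pod"}
  name:
    matches: "^(.*)_total$"
    as: "${1}"                # exposed as pods/http_requests
  metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'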

The logic of the steps below is:

a new HPA ==> a new metric ==> a new Prometheus adapter ==> a new Prometheus instance ==> a new ServiceMonitor instance ==> a new namespace ==> a new application to be monitored

The configuration steps are as follows:

Deploy the Prometheus Operator (here via the web console UI):

$ oc new-project my-prometheus

In OCP's OperatorHub, install the Prometheus Operator into the my-prometheus project. To verify, click Operators > Installed Operators and confirm that Prometheus Operator is listed.
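If you prefer the CLI over the console, an OperatorHub install boils down to creating a Subscription object. A sketch, assuming the community Prometheus Operator package; the channel and catalog source names may differ on your cluster, and an OperatorGroup for the namespace is also required:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: prometheus
  namespace: my-prometheus
spec:
  name: prometheus                       # package name in the catalog
  channel: beta                          # assumed channel; check your catalog
  source: community-operators
  sourceNamespace: openshift-marketplace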

Create the ServiceMonitor instance

(This ServiceMonitor will watch pods based on a matchLabels selector, and will itself be selected by a Prometheus instance based on the ServiceMonitor's own labels.)

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: pod-autoscale
  labels:
    lab: custom-hpa
spec:
  namespaceSelector:
    matchNames:
      - my-prometheus
      - my-hpa
  selector:
    matchLabels:
      app: pod-autoscale
  endpoints:
  - port: 8080-tcp
    interval: 30s
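Save the manifest as, say, servicemonitor.yaml (the file name is arbitrary) and create it in the my-prometheus project:

$ oc create -f servicemonitor.yaml -n my-prometheus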

Create the Prometheus instance

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: my-prometheus
  labels:
    prometheus: my-prometheus
  namespace: my-prometheus
spec:
  replicas: 2
  serviceAccountName: prometheus-k8s
  securityContext: {}
  serviceMonitorSelector:
    matchLabels:
      lab: custom-hpa
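As with the ServiceMonitor, save this as prometheus.yaml (name is arbitrary) and create it; the Operator then brings up the Prometheus pods along with the prometheus-operated Service used below:

$ oc create -f prometheus.yaml -n my-prometheus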

Create and view the route for Prometheus:

$ oc expose svc prometheus-operated -n my-prometheus
route.route.openshift.io/prometheus-operated exposed

$ oc get route prometheus-operated -o jsonpath='{.spec.host}{"\n"}' -n my-prometheus
prometheus-operated-my-prometheus.apps.weixinyucluster.bluecat.ltd

With the ServiceMonitor and Prometheus instance deployed, we should be able to query the http_requests_total metric in the Prometheus UI, yet there is no data. Two key pieces are missing:

1. Prometheus does not have the RBAC permissions needed to query other namespaces.

2. No adapter is in place to translate Prometheus metrics into a form the Kubernetes HPA can consume.

First, fix the RBAC problem by granting the ServiceAccount used by Prometheus in the my-prometheus namespace appropriate access to the my-hpa namespace:

$ echo "---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: my-prometheus-hpa
  namespace: my-hpa
subjects:
  - kind: ServiceAccount
    name: prometheus-k8s
    namespace: my-prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view" | oc create -f -

Return to the Prometheus UI and run the http_requests_total query again; this time you should see results. If they do not appear immediately, be patient and retry.
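For example, a query along these lines (a suggestion, not prescribed by the lab) shows the per-pod request rate that the adapter will later expose to the HPA:

sum(rate(http_requests_total[2m])) by (pod)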


Prometheus is now working, so the next step is wiring it into Kubernetes so that the HPA can act on the custom metric. The objects to create are:

APIService

ServiceAccount

ClusterRole - custom-metrics-server-resources

ClusterRole - custom-metrics-resource-reader

ClusterRoleBinding - custom-metrics:system:auth-delegator

ClusterRoleBinding - custom-metrics-resource-reader

ClusterRoleBinding - hpa-controller-custom-metrics

RoleBinding - custom-metrics-auth-reader

Secret

ConfigMap

Deployment

Service

Create all the objects:

$ oc create -f https://raw.githubusercontent.com/redhat-gpte-devopsautomation/ocp_advanced_deployment_resources/master/ocp4_adv_deploy_lab/custom_hpa/custom_adapter_kube_objects.yaml

apiservice.apiregistration.k8s.io/v1beta1.custom.metrics.k8s.io created

serviceaccount/my-metrics-apiserver created

clusterrole.rbac.authorization.k8s.io/my-metrics-server-resources created

clusterrole.rbac.authorization.k8s.io/my-metrics-resource-reader created

clusterrolebinding.rbac.authorization.k8s.io/my-metrics:system:auth-delegator created

clusterrolebinding.rbac.authorization.k8s.io/my-metrics-resource-reader created

clusterrolebinding.rbac.authorization.k8s.io/my-hpa-controller-custom-metrics created

rolebinding.rbac.authorization.k8s.io/my-metrics-auth-reader created

secret/cm-adapter-serving-certs created

configmap/adapter-config created

deployment.apps/custom-metrics-apiserver created

service/my-metrics-apiserver created

Verify that the API service has been created:

$ oc get apiservice v1beta1.custom.metrics.k8s.io

NAME                            SERVICE                              AVAILABLE   AGE

v1beta1.custom.metrics.k8s.io   my-prometheus/my-metrics-apiserver   True        19s

Verify that the API includes the pods/http metric:

$ oc get --raw /apis/custom.metrics.k8s.io/v1beta1/ | jq -r '.resources[] | select(.name | contains("pods/http"))'
{
  "name": "pods/http_requests",
  "singularName": "",
  "namespaced": true,
  "kind": "MetricValueList",
  "verbs": [
    "get"
  ]
}
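You can also query the adapter for live values of the metric. A sketch using the standard custom-metrics API URL pattern (output will vary with your workload):

$ oc get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/my-hpa/pods/*/http_requests" | jq .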


Verify the application's custom HPA

$ echo "---
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta1
metadata:
  name: pod-autoscale-custom
  namespace: my-hpa
spec:
  scaleTargetRef:
    kind: DeploymentConfig
    name: pod-autoscale
    apiVersion: apps.openshift.io/v1
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Pods
      pods:
        metricName: http_requests
        targetAverageValue: 500m" | oc create -f -

horizontalpodautoscaler.autoscaling/pod-autoscale-custom created
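Note the target: since the adapter typically exposes http_requests as a per-second rate, targetAverageValue: 500m means an average of 0.5 requests per second per pod. The curl loop below sends roughly 2 requests per second, so the HPA should compute ceil(1 × 2 / 0.5) = 4 desired replicas, which matches the describe output further down.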

To generate load, open another SSH terminal and run:

$ AUTOSCALE_ROUTE=$(oc get route pod-autoscale -n my-hpa -o jsonpath='{ .spec.host}')

$ while true; do curl http://$AUTOSCALE_ROUTE; sleep .5; done

Hello! My name is pod-autoscale-2-pvrw8. I have served 19 requests so far.

Hello! My name is pod-autoscale-2-pvrw8. I have served 20 requests so far.

Hello! My name is pod-autoscale-2-pvrw8. I have served 21 requests so far.

Hello! My name is pod-autoscale-2-pvrw8. I have served 22 requests so far.

Hello! My name is pod-autoscale-2-pvrw8. I have served 23 requests so far.

Check the HPA status:

$ oc describe hpa pod-autoscale-custom -n my-hpa

Name:                       pod-autoscale-custom

Namespace:                  my-hpa

Labels:                     <none>

Annotations:                <none>

CreationTimestamp:          Fri, 31 Jul 2020 12:58:08 +0000

Reference:                  DeploymentConfig/pod-autoscale

Metrics:                    ( current / target )

  "http_requests" on pods:  2 / 500m

Min replicas:               1

Max replicas:               5

DeploymentConfig pods:      1 current / 4 desired

Conditions:

  Type            Status  Reason              Message

  ----            ------  ------              -------

  AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 4

  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from pods metric http_requests

  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range

Events:

  Type    Reason             Age               From                       Message

  ----    ------             ----              ----                       -------

  Normal  SuccessfulRescale  8s (x3 over 38s)  horizontal-pod-autoscaler  New size: 4; reason: pods metric http_requests above target


Confirm that the pods have scaled out, and that the reason given is: pods metric http_requests above target.

$ oc get pods -n my-hpa

NAME                     READY   STATUS              RESTARTS   AGE

pod-autoscale-1-deploy   0/1     Completed           0          26m

pod-autoscale-2-2vrgc    0/1     ContainerCreating   0          1s

pod-autoscale-2-deploy   0/1     Completed           0          24m

pod-autoscale-2-dqdrg    0/1     ContainerCreating   0          1s

pod-autoscale-2-pvrw8    1/1     Running             0          24m

pod-autoscale-2-t52hd    0/1     ContainerCreating   0          1s
