
1. Prerequisites
The software environment requirements for this best practice are as follows:
Application environment:
① Container Service for Kubernetes (ACK) based on Apsara Stack (dedicated cloud) V3.10.0.
② Cloud Enterprise Network (CEN) on the public cloud.
③ Auto Scaling (ESS) on the public cloud.
Configuration requirements:
1) Use Container Service on the dedicated cloud, or manually deploy a flexible PaaS on ECS.
2) Set up a leased-line connection to link the VPC where the Container Service cluster resides with the VPC on the public cloud.
3) Activate the Auto Scaling service (ESS) on the public cloud.
2. Background
In this practice, a Kubernetes-based business cluster runs on the dedicated cloud and a stress test is run against a test workload. The practice relies mainly on the following three products and capabilities:
① Alibaba Cloud CEN and a leased line are used to connect the dedicated cloud with the public cloud, so that the VPCs on the two clouds can communicate with each other.
② The Kubernetes HPA capability is used to scale pods horizontally.
③ The Kubernetes Cluster Autoscaler together with Alibaba Cloud Auto Scaling (ESS) is used to scale nodes automatically.
HPA (Horizontal Pod Autoscaler) is a Kubernetes resource object. Based on metrics such as CPU and memory usage, it dynamically adjusts the number of pods in objects such as StatefulSets and Deployments, giving the services running on them a degree of self-adaptation to changes in those metrics.
When the metrics of the workload under test reach the configured threshold, HPA automatically scales out the business pods. When the business cluster can no longer host additional pods, the public cloud ESS service is triggered: new ECS instances are created on the public cloud and automatically added to the dedicated cloud Kubernetes cluster.
Figure 1: Architecture diagram
3. Configure HPA
This example creates an nginx application with HPA enabled. Once it is created, the application scales out horizontally when pod CPU utilization exceeds the 20% threshold set in this example, and scales back in when utilization drops below 20%.
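For reference, the same scaling policy can also be created imperatively with kubectl instead of the YAML manifest in step 2) below. This is only a sketch: it assumes the Deployment is named hpa-test as in this example, and the --max value of 10 is an illustrative choice, since the example leaves maxReplicas up to you.

# Scale the hpa-test Deployment between 1 and 10 replicas at 20% average CPU utilization.
kubectl autoscale deployment hpa-test --cpu-percent=20 --min=1 --max=10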
1. If you use a self-built Kubernetes cluster, configure HPA through a YAML file
1) Create an nginx application (named hpa-test in this example). A resource request (request) must be set for the application; otherwise HPA does not take effect.
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: hpa-test
spec:
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: hpa-test
    spec:
      dnsPolicy: ClusterFirst
      terminationGracePeriodSeconds: 30
      containers:
        - image: '192.168.**.***:5000/admin/hpa-example:v1'
          imagePullPolicy: IfNotPresent
          terminationMessagePolicy: File
          terminationMessagePath: /dev/termination-log
          name: hpa-test
          resources:
            requests:
              cpu:              # the request value must be set, otherwise HPA will not take effect
      securityContext: {}
      restartPolicy: Always
      schedulerName: default-scheduler
  replicas: 1
  selector:
    matchLabels:
      app: hpa-test
  revisionHistoryLimit: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
  progressDeadlineSeconds: 600
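A quick way to confirm that the Deployment is running and that the CPU request was picked up; the file name hpa-test-deploy.yaml is a placeholder for wherever you saved the manifest above.

kubectl apply -f hpa-test-deploy.yaml
kubectl get deployment hpa-test
kubectl describe deployment hpa-test | grep -A 2 Requests   # the CPU request should appear here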
2) Create the HPA.
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-test
  namespace: default
spec:
  maxReplicas:                          # set the maximum number of pods
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1beta2
    kind: Deployment
    name: hpa-test                      # must match the Deployment created in step 1)
  targetCPUUtilizationPercentage: 20    # CPU threshold, 20% in this example
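After the HPA is created, its target and current utilization can be checked with:

kubectl get hpa hpa-test
kubectl describe hpa hpa-test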
2. If you use Alibaba Cloud Container Service, select and configure HPA when deploying the application.
Figure 2: Access settings
4. Configure Cluster Autoscaler
Correct and reasonable resource requests (request) are a prerequisite for auto scaling. The node autoscaler component makes its scaling decisions based on how the Kubernetes scheduler has allocated resources, and the resources allocated on a node are calculated from the resource requests (request) of its pods.
When a pod cannot be scheduled because its resource request cannot be satisfied and it enters the Pending state, the node autoscaler component calculates the number of nodes required, based on the instance specifications and constraints configured for the scaling group.
If the scaling conditions are met, nodes from the scaling group are added to the cluster. Conversely, when a node belongs to the scaling group and the total resource requests of the pods on it fall below the threshold, the node autoscaler scales that node in.
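A quick way to see what the node autoscaler is reacting to in a running cluster; the status ConfigMap name below is the one granted to the autoscaler in the RBAC rules of the ca.yml shown later in this document.

kubectl get pods --all-namespaces --field-selector=status.phase=Pending   # pods waiting for capacity
kubectl -n kube-system get configmap cluster-autoscaler-status -o yaml    # scale-up / scale-down status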
1. Configure the Auto Scaling (ESS) scaling group
1) Create an ESS scaling group and record its minimum and maximum numbers of instances.
Figure 3: Modify scaling group
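If you prefer the command line over the console, a scaling group with the same bounds could be created with the aliyun CLI along the following lines. This is only a sketch: the scaling group name and VSwitch ID are hypothetical placeholders.

# 0 to 5 instances, matching the --nodes=0:5:<scaling-group-id> flag passed to the autoscaler later.
aliyun ess CreateScalingGroup \
  --RegionId cn-huhehaote \
  --ScalingGroupName hybrid-k8s-asg \
  --MinSize 0 \
  --MaxSize 5 \
  --VSwitchId vsw-xxxxxxxx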
2) Create a scaling configuration and record its ID. The script below installs Docker on the new ECS instance, joins it to the on-premises Kubernetes cluster, and configures its registry settings; it can be supplied to the scaling configuration, for example as the instance user data.
Figure 4: Scaling configuration
#!/bin/sh
# Sync time, then download and run the script that joins this ECS instance
# to the on-premises Kubernetes cluster as a worker node.
yum install -y ntpdate && ntpdate -u ntp1.aliyun.com && \
  curl http://example.com/public/hybrid/attach_local_node_aliyun.sh | bash -s -- \
    --docker-version 17.06.2-ce-3 \
    --token 9s92co.y2gkocbumal4fz1z \
    --endpoint 192.168.**.***:6443 \
    --cluster-dns 10.254.**.** \
    --region cn-huhehaote
# Point Docker at the VPC registry mirror and trust the private registry.
echo "{" > /etc/docker/daemon.json
echo "\"registry-mirrors\": [" >> /etc/docker/daemon.json
echo "\"https://registry-vpc.cn-huhehaote.aliyuncs.com\"" >> /etc/docker/daemon.json
echo "]," >> /etc/docker/daemon.json
echo "\"insecure-registries\": [\"https://192.168.**.***:5000\"]" >> /etc/docker/daemon.json
echo "}" >> /etc/docker/daemon.json
systemctl restart docker
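Once an ECS instance launched from this scaling configuration has run the script, it should register with the on-premises cluster; a quick check:

kubectl get nodes -o wide   # the new ECS instance should appear and become Ready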
2. Deploy the autoscaler in the Kubernetes cluster
Create the autoscaler from the ca.yml file below and apply it:
kubectl apply -f ca.yml
Be sure to change the following values to match your actual environment (they are stored base64-encoded in the Secret):
access-key-id: "TFRBSWlCSFJyeHd2QXZ6****"
access-key-secret: "bGIyQ3NuejFQOWM0WjFUNjR4WTVQZzVPRXND****"
region-id: "Y24taHVoZWhh****"
The content of ca.yml is as follows:
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
  name: cluster-autoscaler
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["events", "endpoints"]
    verbs: ["create", "patch"]
  - apiGroups: [""]
    resources: ["pods/eviction"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["pods/status"]
    verbs: ["update"]
  - apiGroups: [""]
    resources: ["endpoints"]
    resourceNames: ["cluster-autoscaler"]
    verbs: ["get", "update"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["watch", "list", "get", "update"]
  - apiGroups: [""]
    resources: ["pods", "services", "replicationcontrollers", "persistentvolumeclaims", "persistentvolumes"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["extensions"]
    resources: ["replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["policy"]
    resources: ["poddisruptionbudgets"]
    verbs: ["watch", "list"]
  - apiGroups: ["apps"]
    resources: ["statefulsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["watch", "list", "get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["create", "list", "watch"]
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
    verbs: ["delete", "get", "update", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system
---
apiVersion: v1
kind: Secret
metadata:
  name: cloud-config
  namespace: kube-system
type: Opaque
data:
  # base64-encoded AccessKey and region of the account that operates the ESS scaling group
  access-key-id: "TFRBSWlCSFJyeHd2********"
  access-key-secret: "bGIyQ3NuejFQOWM0WjFUNjR4WTVQZzVP*********"
  region-id: "Y24taHVoZW********"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      dnsConfig:
        nameservers:
          - 100.XXX.XXX.XXX
          - 100.XXX.XXX.XXX
      nodeSelector:
        ca-key: ca-value
      priorityClassName: system-cluster-critical
      serviceAccountName: admin
      containers:
        - image: 192.XXX.XXX.XXX:XX/admin/autoscaler:v1.3.1-7369cf1
          name: cluster-autoscaler
          resources:
            limits:
              cpu: 100m
              memory: 300Mi
            requests:
              cpu: 100m
              memory: 300Mi
          command:
            - ./cluster-autoscaler
            - '--v=5'
            - '--stderrthreshold=info'
            - '--cloud-provider=alicloud'
            - '--scan-interval=30s'
            - '--scale-down-delay-after-add=8m'
            - '--scale-down-delay-after-failure=1m'
            - '--scale-down-unready-time=1m'
            - '--ok-total-unready-count=1000'
            - '--max-empty-bulk-delete=50'
            - '--expander=least-waste'
            - '--leader-elect=false'
            - '--scale-down-unneeded-time=8m'
            - '--scale-down-utilization-threshold=0.2'
            - '--scale-down-gpu-utilization-threshold=0.3'
            - '--skip-nodes-with-local-storage=false'
            - '--nodes=0:5:asg-hp3fbu2zeu9bg3clraqj'   # min:max:scaling-group-id of the ESS scaling group created above
          imagePullPolicy: "Always"
          env:
            - name: ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  name: cloud-config
                  key: access-key-id
            - name: ACCESS_KEY_SECRET
              valueFrom:
                secretKeyRef:
                  name: cloud-config
                  key: access-key-secret
            - name: REGION_ID
              valueFrom:
                secretKeyRef:
                  name: cloud-config
                  key: region-id
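After applying ca.yml, confirm that the autoscaler is running and follow its decisions in the logs:

kubectl -n kube-system get pods -l app=cluster-autoscaler
kubectl -n kube-system logs deployment/cluster-autoscaler --tail=50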
5. Results
Simulate business traffic:
Start a busybox image and run the following command inside the pod to access the service of the application above; you can start several such pods at the same time to increase the load.
while true; do wget -q -O- http://hpa-test/index.html; done
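One way to start such a load generator (the pod name load-gen-1 is arbitrary; repeat with different names to add more load):

kubectl run load-gen-1 --image=busybox --restart=Never -- \
  /bin/sh -c "while true; do wget -q -O- http://hpa-test/index.html; done"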
Observe the HPA:
Before applying load:
Figure 5: Before load
After applying load:
When the CPU value reaches the threshold, horizontal scale-out of the pods is triggered.
Figure 6: After load (1)
Figure 7: After load (2)
Observe the pods:
When cluster resources are insufficient, the newly created pods stay in the Pending state, which triggers the Cluster Autoscaler to scale out nodes automatically.
Figure 8: Scaling activity
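The same behaviour can be followed from the command line while the load is running (assumes the resource names used above):

kubectl get hpa hpa-test -w                   # current vs. target CPU utilization and replica count
kubectl get pods -l app=hpa-test -o wide -w   # new pods and the nodes they are scheduled on
kubectl get nodes -w                          # ECS nodes added by the Cluster Autoscaler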