Introduction
ASM
ASM is currently in public beta.
Alibaba Cloud Service Mesh (ASM) provides a fully managed service mesh platform compatible with the open-source Istio service mesh. It simplifies service governance, covering traffic routing and splitting between service calls, authentication and security for inter-service communication, and mesh observability, which greatly reduces the development and operations workload. The ASM architecture is illustrated below:
ASM targets core scenarios such as hybrid cloud, multi-cloud, multi-cluster, and migration of non-containerized applications, delivering a managed, unified service mesh capability that provides Alibaba Cloud users with the following features:
- Consistent management
Manage application services running on ACK managed Kubernetes clusters, dedicated Kubernetes clusters, Serverless Kubernetes clusters, and clusters attached from hybrid-cloud or multi-cloud environments in a single consistent way, providing uniform observability and traffic control.
- Unified traffic management
Manage traffic uniformly across mixed container and virtual machine environments.
- Managed control-plane core components
The core control-plane components are hosted by ASM, minimizing users' resource overhead and operations cost.
ArgoCD
Argo CD is a Kubernetes configuration management tool for continuous delivery. Argo CD follows the GitOps pattern: it watches the state of currently running applications, compares it with the desired state declared in the Git repository, and automatically deploys updates to the environment. The Argo CD architecture is illustrated below:
Flagger
Flagger is a Kubernetes operator that fully automates progressive application delivery. It analyzes metrics collected by Prometheus and uses traffic-management technologies and tools such as Istio and App Mesh to release applications progressively. Its architecture is illustrated below:
Create an ASM instance
Following the ASM help documentation, create an ASM instance and add the two ACK clusters mesh01 and mesh02:
Deploy the ingress gateway service to the mesh01 cluster:
Create a namespace named test on the control plane:
$ kubectl create ns test
Create a Gateway on the control plane:
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: public-gateway
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "*"
Deploy Flagger
On each of the two ACK clusters, mesh01 and mesh02, deploy Flagger and its dependencies as follows:
Deploy Prometheus:
$ kubectl apply -k github.com/haoshuwei/argocd-samples/flagger/prometheus/
Create a secret from the ASM instance's kubeconfig:
$ kubectl -n istio-system create secret generic istio-kubeconfig --from-file kubeconfig
$ kubectl -n istio-system label secret istio-kubeconfig istio/multiCluster=true
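The two commands above amount to applying a secret like the sketch below: the ASM instance's kubeconfig is stored under the kubeconfig key, and the istio/multiCluster=true label marks it for multi-cluster discovery.

```yaml
# Sketch of the resulting secret; the kubeconfig body comes from the
# ASM instance and is elided here.
apiVersion: v1
kind: Secret
metadata:
  name: istio-kubeconfig
  namespace: istio-system
  labels:
    istio/multiCluster: "true"
type: Opaque
stringData:
  kubeconfig: |
    # contents of the ASM instance's kubeconfig file
```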
Install Flagger with Helm:
$ helm repo add flagger https://flagger.app
$ helm repo update
$ kubectl apply -f https://raw.githubusercontent.com/weaveworks/flagger/master/artifacts/flagger/crd.yaml
$ helm upgrade -i flagger flagger/flagger --namespace=istio-system --set crd.create=false --set meshProvider=istio --set metricsServer=http://prometheus:9090 --set istio.kubeconfig.secretName=istio-kubeconfig --set istio.kubeconfig.key=kubeconfig
Deploy Grafana:
$ helm upgrade -i flagger-grafana flagger/grafana --namespace=istio-system --set url=http://prometheus:9090
We can then create a virtual service for the Grafana service on the ASM instance's console to expose it externally:
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: grafana
  namespace: istio-system
spec:
  hosts:
  - "grafana.istio.example.com"
  gateways:
  - public-gateway.istio-system.svc.cluster.local
  http:
  - route:
    - destination:
        host: flagger-grafana
Access the service:
建立命名空間并添加标簽
$ kubectl create ns test
$ kubectl label namespace test istio-injection=enabled
Deploy ArgoCD
Argo CD can be deployed on either of the two ACK clusters.
Deploy the Argo CD server:
$ kubectl create namespace argocd
$ kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
Install the Argo CD CLI:
$ VERSION=$(curl --silent "https://api.github.com/repos/argoproj/argo-cd/releases/latest" | grep '"tag_name"' | sed -E 's/.*"([^"]+)".*/\1/')
$ curl -sSL -o /usr/local/bin/argocd https://github.com/argoproj/argo-cd/releases/download/$VERSION/argocd-linux-amd64
$ chmod +x /usr/local/bin/argocd
Retrieve the initial login password and change it:
$ kubectl get pods -n argocd -l app.kubernetes.io/name=argocd-server -o name | cut -d'/' -f 2
$ argocd login ip:port
$ argocd account update-password
Example GitOps workflow for fully automated progressive delivery
ArgoCD添加叢集并部署應用
In this example, we will deploy the sample application podinfo to the mesh02 cluster and the load-testing application loadtester to the mesh01 cluster, both in the test namespace.
Add the Git repository https://github.com/haoshuwei/argocd-samples.git to Argo CD:
$ argocd repo add https://github.com/haoshuwei/argocd-samples.git --name argocd-samples
repository 'https://github.com/haoshuwei/argocd-samples.git' added
$ argocd repo list
TYPE NAME REPO INSECURE LFS CREDS STATUS MESSAGE
git argocd-samples https://github.com/haoshuwei/argocd-samples.git false false false Successful
Add the mesh01 and mesh02 clusters to Argo CD using their kubeconfig files:
$ argocd cluster add mesh01 --kubeconfig=mesh01
INFO[0000] ServiceAccount "argocd-manager" created in namespace "kube-system"
INFO[0000] ClusterRole "argocd-manager-role" created
INFO[0000] ClusterRoleBinding "argocd-manager-role-binding" created
$ argocd cluster add mesh02 --kubeconfig=mesh02
INFO[0000] ServiceAccount "argocd-manager" created in namespace "kube-system"
INFO[0000] ClusterRole "argocd-manager-role" created
INFO[0000] ClusterRoleBinding "argocd-manager-role-binding" created
$ argocd cluster list |grep mesh
https://xx.xx.xxx.xx:6443 mesh02 1.16+ Successful
https://xx.xxx.xxx.xx:6443 mesh01 1.16+ Successful
Deploy the podinfo application to the mesh02 cluster:
$ argocd app create --project default --name podinfo --repo https://github.com/haoshuwei/argocd-samples.git --path flagger/podinfo --dest-server https://xx.xx.xxx.xx:6443 --dest-namespace test --revision latest --sync-policy automated
The command above creates an application named podinfo, sourced from the flagger/podinfo subdirectory of the Git repository https://github.com/haoshuwei/argocd-samples.git on the latest branch. The application is deployed to the test namespace of the https://xx.xx.xxx.xx:6443 cluster, with the sync policy set to automated.
The flagger/podinfo subdirectory contains four manifests: deployment.yaml, hpa.yaml, kustomization.yaml, and canary.yaml. The canary.yaml file is the core manifest that drives the fully automated progressive canary release in this example:
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: podinfo
  namespace: test
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  progressDeadlineSeconds: 60
  autoscalerRef:
    apiVersion: autoscaling/v2beta1
    kind: HorizontalPodAutoscaler
    name: podinfo
  service:
    port: 9898
    gateways:
    - public-gateway.istio-system.svc.cluster.local
    hosts:
    - app.istio.example.com
    trafficPolicy:
      tls:
        # use ISTIO_MUTUAL when mTLS is enabled
        mode: DISABLE
  analysis:
    interval: 30s
    threshold: 5
    maxWeight: 50
    stepWeight: 5
    metrics:
    - name: request-success-rate
      threshold: 99
      interval: 30s
    - name: request-duration
      threshold: 500
      interval: 30s
    webhooks:
    - name: load-test
      url: http://loadtester.test/
      timeout: 5s
      metadata:
        cmd: "hey -z 1m -q 10 -c 2 http://podinfo-canary.test:9898/"
The canary.yaml file defines several key sections:
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  progressDeadlineSeconds: 60
  autoscalerRef:
    apiVersion: autoscaling/v2beta1
    kind: HorizontalPodAutoscaler
    name: podinfo
These fields tell the Canary resource to watch and reference the Deployment and the HorizontalPodAutoscaler named podinfo.
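For context, a minimal Deployment satisfying this targetRef might look like the sketch below; the actual deployment.yaml in the sample repository may differ in image tag and labels.

```yaml
# Hypothetical sketch of the referenced Deployment; replicas are
# omitted because the HPA referenced by autoscalerRef manages scaling.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: podinfo          # must match targetRef.name
  namespace: test
spec:
  selector:
    matchLabels:
      app: podinfo
  template:
    metadata:
      labels:
        app: podinfo
    spec:
      containers:
      - name: podinfod
        image: stefanprodan/podinfo:3.1.0   # example tag, assumed
        ports:
        - containerPort: 9898               # matches service.port in the Canary
```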
  service:
    port: 9898
    gateways:
    - public-gateway.istio-system.svc.cluster.local
    hosts:
    - app.istio.example.com
    trafficPolicy:
      tls:
        # use ISTIO_MUTUAL when mTLS is enabled
        mode: DISABLE
These fields cause Flagger to automatically create a virtual service, also named podinfo, for the podinfo application in the ASM console.
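Based on Flagger's documented behavior, the generated virtual service is roughly equivalent to the sketch below; during analysis, Flagger shifts the weights step by step between the primary and canary destinations.

```yaml
# Sketch of the virtual service Flagger generates (steady state:
# all traffic on the primary, none on the canary).
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: podinfo
  namespace: test
spec:
  gateways:
  - public-gateway.istio-system.svc.cluster.local
  hosts:
  - app.istio.example.com
  http:
  - route:
    - destination:
        host: podinfo-primary
      weight: 100
    - destination:
        host: podinfo-canary
      weight: 0
```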
  analysis:
    interval: 30s
    threshold: 5
    maxWeight: 50
    stepWeight: 5
    metrics:
    - name: request-success-rate
      threshold: 99
      interval: 30s
    - name: request-duration
      threshold: 500
      interval: 30s
    webhooks:
    - name: load-test
      url: http://loadtester.test/
      timeout: 5s
      metadata:
        cmd: "hey -z 1m -q 10 -c 2 http://podinfo-canary.test:9898/"
These fields define the analysis run against the new version of the podinfo application during a release. interval: 30s means a check runs every 30 seconds; threshold: 5 means the release is considered failed after 5 failed checks; maxWeight: 50 caps the traffic weight shifted to the canary at 50; stepWeight: 5 increases the canary's weight by 5 at each step. The metrics section defines two metrics: the request success rate (request-success-rate) must not fall below 99, and the average request duration (request-duration) must not exceed 500ms. The load-generation tasks for the test are defined under the webhooks field.
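With stepWeight: 5 and maxWeight: 50, a successful analysis walks through ten weight steps, one per 30-second interval, so promotion takes about five minutes of analysis. A quick sketch of the implied schedule:

```python
# Canary weight schedule implied by the analysis settings above.
step_weight = 5    # stepWeight: weight added to the canary per step
max_weight = 50    # maxWeight: cap on canary traffic weight
interval_s = 30    # interval: seconds between checks

weights = list(range(step_weight, max_weight + 1, step_weight))
print(weights)                     # → [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]
print(len(weights) * interval_s)   # → 300 seconds of analysis
```

This matches the release log below, where Flagger advances the canary weight from 5 to 50 before promoting.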
Deploy the loadtester test application to the mesh01 cluster:
$ argocd app create --project default --name loadtester --repo https://github.com/haoshuwei/argocd-samples.git --path flagger/charts/loadtester --dest-server https://xx.xxx.xxx.xx:6443 --dest-namespace test --revision latest --sync-policy automated
Because the sync policy is set to automated, the applications are deployed to the mesh01 and mesh02 clusters automatically once created. We can view the application details on the Argo CD UI:
podinfo details:
loadtester details:
On the ASM console, we can view the virtual services and destination rules dynamically created by Flagger:
GitOps automated release
Create a branch, change the application's container image version, commit, and open a pull request targeting the latest branch:
管理者審批并merge pr後,latest分支有新代碼進入,ArgoCD會自動把更新同步叢集環境中,flagger檢測到podinfo應用有新版本變更,則開始自動化漸進式地釋出新版本應用,通過以下指令可以檢視應用釋出進度:
$ watch kubectl get canaries --all-namespaces
Every 2.0s: kubectl get canaries --all-namespaces Tue Mar 17 19:04:20 2020
NAMESPACE NAME STATUS WEIGHT LASTTRANSITIONTIME
test podinfo Progressing 10 2020-03-17T11:04:01Z
Accessing the application shows some traffic being shifted to the new version:
Meanwhile, the Grafana dashboard shows the metrics for the new version under test:
The messages emitted during the release process are as follows:
"msg":"New revision detected! Scaling up podinfo.test","canary":"podinfo.test"
"msg":"Starting canary analysis for podinfo.test","canary":"podinfo.test"
"msg":"Advance podinfo.test canary weight 5","canary":"podinfo.test"
"msg":"Advance podinfo.test canary weight 10","canary":"podinfo.test"
"msg":"Advance podinfo.test canary weight 15","canary":"podinfo.test"
"msg":"Advance podinfo.test canary weight 20","canary":"podinfo.test"
"msg":"Advance podinfo.test canary weight 25","canary":"podinfo.test"
"msg":"Advance podinfo.test canary weight 30","canary":"podinfo.test"
"msg":"Advance podinfo.test canary weight 35","canary":"podinfo.test"
"msg":"Advance podinfo.test canary weight 40","canary":"podinfo.test"
"msg":"Advance podinfo.test canary weight 45","canary":"podinfo.test"
"msg":"Advance podinfo.test canary weight 50","canary":"podinfo.test"
"msg":"Copying podinfo.test template spec to podinfo-primary.test","canary":"podinfo.test"
"msg":"Halt advancement podinfo-primary.test waiting for rollout to finish: 3 of 4 updated replicas are available","canary":"podinfo.test"
"msg":"Routing all traffic to primary","canary":"podinfo.test"
"msg":"Promotion completed! Scaling down podinfo.test","canary":"podinfo.test"
Once the release completes, all traffic is switched to the new version:
若新版本應用測試名額不達标,則應用自動復原到初始穩定狀态:
"msg":"New revision detected! Scaling up podinfo.test","canary":"podinfo.test"
"msg":"Starting canary analysis for podinfo.test","canary":"podinfo.test"
"msg":"Advance podinfo.test canary weight 10","canary":"podinfo.test"
"msg":"Halt advancement no values found for istio metric request-success-rate probably podinfo.test is not receiving traffic","canary":"podinfo.test"
"msg":"Halt advancement no values found for istio metric request-duration probably podinfo.test is not receiving traffic","canary":"podinfo.test"
"msg":"Halt advancement no values found for istio metric request-duration probably podinfo.test is not receiving traffic","canary":"podinfo.test"
"msg":"Halt advancement no values found for istio metric request-duration probably podinfo.test is not receiving traffic","canary":"podinfo.test"
"msg":"Halt advancement no values found for istio metric request-duration probably podinfo.test is not receiving traffic","canary":"podinfo.test"
"msg":"Synced test/podinfo"
"msg":"Rolling back podinfo.test failed checks threshold reached 5","canary":"podinfo.test"
"msg":"Canary failed! Scaling down podinfo.test","canary":"podinfo.test"
References
- https://medium.com/google-cloud/automated-canary-deployments-with-flagger-and-istio-ac747827f9d1
- https://docs.flagger.app/
- https://docs.flagger.app/dev/upgrade-guide#istio-telemetry-v2
- https://github.com/weaveworks/flagger
- https://help.aliyun.com/document_detail/149550.html