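Preface
- I am learning K8s and organizing my notes here so that they stick.
- This post covers:
- Basic theory of the two probe types, LivenessProbe and ReadinessProbe
- Demos of the three check methods: ExecAction, TCPSocketAction, and HTTPGetAction
"The Mid-Autumn moon shines on mansions and on poor homes alike. It is a great comfort to the heart." 烽火戲諸侯, 《劍來》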
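Pod health checks and service availability checks
Purpose of health checks
Purpose of probing: to keep Pods robust. When a Pod dies, the Deployment creates a new one. But if the Pod is still running while something inside it has gone wrong, the Deployment cannot detect that. A probe is therefore needed to check whether the Pod is actually serving requests.
Probe types
Kubernetes checks Pod health with two kinds of probes, LivenessProbe and ReadinessProbe. The kubelet runs both periodically to diagnose container health. Both are configured per container in the Pod spec, for example in the Pod template of a Deployment.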
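Probe type | Description |
---|---|
LivenessProbe | Determines whether the container is alive (Running state). If the LivenessProbe finds the container unhealthy, the kubelet kills the container and then acts according to its restart policy. If a container defines no LivenessProbe, the kubelet treats the probe result as always Success. |
ReadinessProbe | Determines whether the container's service is available (Ready state). Only Pods that reach the Ready state can receive requests. For Pods managed by a Service, the association between the Service and the Pod's Endpoint is also set according to whether the Pod is Ready. If Ready becomes False at runtime, the system automatically removes the Pod from the Service's backend Endpoint list, and adds it back once it returns to Ready. This ensures that clients accessing the Service are not forwarded to a Pod instance whose service is unavailable. |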
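Check methods and parameters
LivenessProbe and ReadinessProbe can each be configured with one of the following three methods.
Method | Description |
---|---|
ExecAction | Runs a command inside the container. If the command exits with code 0, the container is considered healthy. |
TCPSocketAction | Performs a TCP check against the container's IP address and port. If a TCP connection can be established, the container is considered healthy. |
HTTPGetAction | Issues an HTTP GET against the container's IP address, port, and path. If the response status code is at least 200 and below 400, the container is considered healthy. |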
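As a rough sketch of how the three methods in the table look inside a container spec, the fragments below show one probe per method. The file name, path, and port values are illustrative assumptions, and a single probe takes exactly one of these handlers.

```yaml
# Three alternative handlers; a single probe uses exactly one of them.
# File name, path and ports are illustrative assumptions.

# ExecAction: healthy if the command exits with code 0
livenessProbe:
  exec:
    command: ["cat", "/tmp/healthy"]
---
# TCPSocketAction: healthy if a TCP connection can be opened
livenessProbe:
  tcpSocket:
    port: 80
---
# HTTPGetAction: healthy if the status code is >= 200 and < 400
livenessProbe:
  httpGet:
    path: /index.html
    port: 80
```

The same three handlers can be placed under `readinessProbe` instead of `livenessProbe`.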
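Each method takes parameters such as initialDelaySeconds and timeoutSeconds. Their meanings are as follows.
Parameter | Description |
---|---|
initialDelaySeconds | How long to wait after the container starts before the first health check, in seconds. |
timeoutSeconds | How long to wait for a response after the health check request is sent, in seconds. Default 1, minimum 1. A timeout counts as a failed probe; for a liveness probe, the kubelet restarts the container once failureThreshold consecutive failures are reached. |
periodSeconds | How often the probe runs. Default 10 seconds, minimum 1 second. |
successThreshold | Minimum number of consecutive successes, after a failure, for the probe to be considered successful. Default 1, minimum 1. Must be 1 for liveness. |
failureThreshold | When a probe fails, the number of times Kubernetes retries before giving up. For a liveness probe, giving up means restarting the container. For a readiness probe, giving up means the Pod is marked Unready. Default 3, minimum 1. |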
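To make the interaction between these parameters concrete, here is a minimal sketch of a readiness probe with every timing field written out. The path and port are assumptions for illustration; the timing values are the Kubernetes defaults.

```yaml
# Illustrative readiness probe with every timing field spelled out.
readinessProbe:
  httpGet:
    path: /index.html       # assumed path
    port: 80                # assumed port
  initialDelaySeconds: 0    # default: no extra wait before the first probe
  periodSeconds: 10         # default
  timeoutSeconds: 1         # default
  successThreshold: 1       # default
  failureThreshold: 3       # default
# With these values a container that stops answering is marked Unready after
# roughly failureThreshold x periodSeconds = 3 x 10 = 30 seconds.
```

The 30-second figure is only an estimate: the exact moment also depends on when in the probe cycle the failure begins and on how long each failing probe takes to time out.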
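The ReadinessProbe mechanism may not be enough for some complex applications to judge whether the service inside a container is available.
Starting with version 1.11, Kubernetes therefore introduced the Pod Ready++ feature to extend readiness probing. It reached GA (stable) in 1.14 under the name Pod Readiness Gates.
With Pod Readiness Gates, users can attach custom readiness conditions to a Pod to help Kubernetes decide when the Pod is available (Ready). For a custom condition to take effect, the user must provide an external controller that sets the corresponding Condition status.
Pod Readiness Gates are set in the readinessGates field of the Pod spec, for example a new gate whose conditionType is www.example.com/feature-1.
The status of the new custom Condition is set by the user's external controller and defaults to False. Kubernetes sets the Pod to Ready (Ready is True) only when all readinessGates conditions are True.
I do not fully understand this part yet and need to study it further.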
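A minimal sketch of what a Pod with such a gate could look like is shown below. The Pod name, container name, and image are assumptions made for illustration; only the conditionType comes from the text above.

```yaml
# Sketch of a Pod declaring a readiness gate.
apiVersion: v1
kind: Pod
metadata:
  name: pod-readiness-gate      # hypothetical name
spec:
  readinessGates:
  - conditionType: "www.example.com/feature-1"
  containers:
  - name: web                   # hypothetical container
    image: nginx
# An external controller is expected to patch the Pod status with a matching
# condition, roughly like this (status is not written by the user in the manifest):
#
# status:
#   conditions:
#   - type: "www.example.com/feature-1"
#     status: "True"
```

Until some controller sets that condition to True, the Pod would be expected to stay not Ready even if its containers pass their own readiness probes.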
Preparing the environment
┌──[[email protected]]-[~/ansible]
└─$mkdir liveness-probe
┌──[[email protected]]-[~/ansible]
└─$cd liveness-probe/
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl create ns liveness-probe
namespace/liveness-probe created
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl config current-context
kubernetes-admin@kubernetes
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl config set-context $(kubectl config current-context) --namespace=liveness-probe
Context "kubernetes-admin@kubernetes" modified.
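A LivenessProbe determines whether the container is alive (Running state). If the probe finds the container unhealthy, the kubelet kills the container and then acts according to its restart policy.
ExecAction method: command
Resource file definition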
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$cat liveness-probe.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: pod-liveness
  name: pod-liveness
spec:
  containers:
  - args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 10
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5 # no probing during the first 5s after the container starts
      periodSeconds: 5 # probe every 5s
    image: busybox
    imagePullPolicy: IfNotPresent
    name: pod-liveness
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
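Apply the manifest. Once the Pod is created, the container creates the file, sleeps 30s, deletes the file, and sleeps again. The liveness probe checks that the file exists.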
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl apply -f liveness-probe.yaml
pod/pod-liveness created
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl get pods
NAME READY STATUS RESTARTS AGE
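pod-liveness 1/1 Running 1 (8s ago) 41s # no restart during the first 30s, while the file exists
After more than 30s the file is deleted, so the health check fails and the container is restarted according to the restart policy.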
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl get pods
NAME READY STATUS RESTARTS AGE
pod-liveness 1/1 Running 2 (34s ago) 99s
After 99s it has already been restarted a second time.
┌──[[email protected]]-[~/ansible]
└─$ansible 192.168.26.83 -m shell -a "docker ps | grep pod-liveness"
192.168.26.83 | CHANGED | rc=0 >>
00f4182c014e 7138284460ff "/bin/sh -c 'touch /…" 6 seconds ago Up 5 seconds k8s_pod-liveness_pod-liveness_liveness-probe_81b4b086-fb28-4657-93d0-bd23e67f980a_0
01c5cfa02d8c registry.aliyuncs.com/google_containers/pause:3.5 "/pause" 7 seconds ago Up 6 seconds k8s_POD_pod-liveness_liveness-probe_81b4b086-fb28-4657-93d0-bd23e67f980a_0
┌──[[email protected]]-[~/ansible]
└─$kubectl get pods
NAME READY STATUS RESTARTS AGE
pod-liveness 1/1 Running 0 25s
┌──[[email protected]]-[~/ansible]
└─$kubectl get pods
NAME READY STATUS RESTARTS AGE
pod-liveness 1/1 Running 1 (12s ago) 44s
┌──[[email protected]]-[~/ansible]
└─$ansible 192.168.26.83 -m shell -a "docker ps | grep pod-liveness"
192.168.26.83 | CHANGED | rc=0 >>
1eafd7e8a12a 7138284460ff "/bin/sh -c 'touch /…" 15 seconds ago Up 14 seconds k8s_pod-liveness_pod-liveness_liveness-probe_81b4b086-fb28-4657-93d0-bd23e67f980a_1
01c5cfa02d8c registry.aliyuncs.com/google_containers/pause:3.5 "/pause" 47 seconds ago Up 47 seconds k8s_POD_pod-liveness_liveness-probe_81b4b086-fb28-4657-93d0-bd23e67f980a_0
┌──[[email protected]]-[~/ansible]
└─$
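Checking the container IDs in Docker on the node shows that the application container's ID differs before and after, while the pause container keeps the same ID. This confirms that the container was killed and recreated inside the same Pod.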
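One detail worth keeping in mind: with periodSeconds: 5 and the default failureThreshold of 3, the kubelet needs about three consecutive failed probes, roughly 15s after the file disappears, before it kills the container. If the container's own command finishes sooner than that, the restart may come from the process exiting under restartPolicy: Always rather than from the probe. A variant that keeps the process alive long enough for the probe to be the trigger might look like the sketch below. The Pod name and the 600s sleep are assumptions, not what was run above.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-liveness-long       # hypothetical name
  labels:
    run: pod-liveness-long
spec:
  restartPolicy: Always
  containers:
  - name: pod-liveness-long
    image: busybox
    imagePullPolicy: IfNotPresent
    args:
    - /bin/sh
    - -c
    # keep the process running well past the ~15s failure window
    - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 5
      failureThreshold: 3       # default, written out for clarity
```

In such a run, `kubectl describe pod pod-liveness-long` would be expected to list liveness probe failure events shortly before the restart, which is one way to tell a probe-triggered restart from a plain process exit.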
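HTTPGetAction method
Issues an HTTP GET against the container's IP address, port, and path. If the response status code is at least 200 and below 400, the container is considered healthy.
Create the resource file, which also shows how the related parameters are used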
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$cat liveness-probe-http.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: pod-livenss-probe
  name: pod-livenss-probe
spec:
  containers:
  - image: nginx
    imagePullPolicy: IfNotPresent
    name: pod-livenss-probe
    livenessProbe:
      failureThreshold: 3 # consecutive failures tolerated before Kubernetes gives up and restarts the container
      httpGet:
        path: /index.html
        port: 80
        scheme: HTTP
      initialDelaySeconds: 10 # seconds to wait after the container starts before the first probe
      periodSeconds: 10 # probe interval; default 10s, minimum 1s
      successThreshold: 1 # minimum consecutive successes, after a failure, to count as healthy
      timeoutSeconds: 10 # probe timeout; default 1s, minimum 1s
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
Apply the manifest. The probe requests Nginx's default welcome page.
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$vim liveness-probe-http.yaml
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl apply -f liveness-probe-http.yaml
pod/pod-livenss-probe created
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl get pods
NAME READY STATUS RESTARTS AGE
pod-livenss-probe 1/1 Running 0 15s
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl exec -it pod-livenss-probe -- rm /usr/share/nginx/html/index.html
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl get pods
NAME READY STATUS RESTARTS AGE
pod-livenss-probe 1/1 Running 1 (1s ago) 2m31s
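Once the welcome page is deleted, the request returns an error, the probe fails, and the container is restarted.
TCPSocketAction method
Performs a TCP check against the container's IP address and port. If a TCP connection can be established, the container is considered healthy.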
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$cat liveness-probe-tcp.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: pod-livenss-probe
  name: pod-livenss-probe
spec:
  containers:
  - image: nginx
    imagePullPolicy: IfNotPresent
    name: pod-livenss-probe
    livenessProbe:
      failureThreshold: 3
      tcpSocket:
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 10
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
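The probe connects to port 8080, but nothing in the container listens on 8080 (Nginx listens on 80), so the connection cannot be established, the probe fails, and the container is restarted.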
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl apply -f liveness-probe-tcp.yaml
pod/pod-livenss-probe created
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl get pods
NAME READY STATUS RESTARTS AGE
pod-livenss-probe 1/1 Running 0 8s
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl get pods
NAME READY STATUS RESTARTS AGE
pod-livenss-probe 1/1 Running 1 (4s ago) 44s
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$
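For contrast, here is a sketch of the same kind of probe pointed at the port Nginx listens on by default. With this change the TCP connection should succeed and no restart would be expected. The Pod and container names are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-liveness-tcp-ok     # hypothetical name
spec:
  containers:
  - name: web                   # hypothetical container name
    image: nginx
    imagePullPolicy: IfNotPresent
    livenessProbe:
      tcpSocket:
        port: 80                # nginx's default listening port
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 10
      failureThreshold: 3
```

Note that a TCP check only proves that something accepts connections on the port; it says nothing about whether the application behind it answers correctly, which is where an HTTP GET check is more informative.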
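A ReadinessProbe determines whether the container's service is available (Ready state). Only Pods that reach the Ready state can receive requests; otherwise they cannot be reached through the Service.
Resource file definition; a postStart hook creates the file that the probe checks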
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$cat readiness-probe.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: pod-liveness
  name: pod-liveness
spec:
  containers:
  - readinessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5 # no probing during the first 5s after the container starts
      periodSeconds: 5 # probe every 5s
    image: nginx
    imagePullPolicy: IfNotPresent
    name: pod-liveness
    resources: {}
    lifecycle:
      postStart:
        exec:
          command: ["/bin/sh", "-c","touch /tmp/healthy"]
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
Create 3 Pods running Nginx, then create a Service from a Pod for testing
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$sed 's/pod-liveness/pod-liveness-1/' readiness-probe.yaml | kubectl apply -f -
pod/pod-liveness-1 created
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$sed 's/pod-liveness/pod-liveness-2/' readiness-probe.yaml | kubectl apply -f -
pod/pod-liveness-2 created
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod-liveness 1/1 Running 0 3m1s 10.244.70.50 vms83.liruilongs.github.io <none> <none>
pod-liveness-1 1/1 Running 0 2m 10.244.70.51 vms83.liruilongs.github.io <none> <none>
pod-liveness-2 1/1 Running 0 111s 10.244.70.52 vms83.liruilongs.github.io <none> <none>
Change the home page text
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$serve=pod-liveness
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl exec -it $serve -- sh -c "echo $serve > /usr/share/nginx/html/index.html"
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl exec -it $serve -- sh -c "cat /usr/share/nginx/html/index.html"
pod-liveness
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$serve=pod-liveness-1
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl exec -it $serve -- sh -c "echo $serve > /usr/share/nginx/html/index.html"
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$serve=pod-liveness-2
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl exec -it $serve -- sh -c "echo $serve > /usr/share/nginx/html/index.html"
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$
Change the labels
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl get pods --show-labels
NAME READY STATUS RESTARTS AGE LABELS
pod-liveness 1/1 Running 0 15m run=pod-liveness
pod-liveness-1 1/1 Running 0 14m run=pod-liveness-1
pod-liveness-2 1/1 Running 0 14m run=pod-liveness-2
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl edit pods pod-liveness-1
pod/pod-liveness-1 edited
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl edit pods pod-liveness-2
pod/pod-liveness-2 edited
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl get pods --show-labels
NAME READY STATUS RESTARTS AGE LABELS
pod-liveness 1/1 Running 0 17m run=pod-liveness
pod-liveness-1 1/1 Running 0 16m run=pod-liveness
pod-liveness-2 1/1 Running 0 16m run=pod-liveness
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$
Confirm that the file to be deleted exists
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl exec -it pod-liveness -- ls /tmp/
healthy
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl exec -it pod-liveness-1 -- ls /tmp/
healthy
Create a Service from the Pod
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl expose --name=svc pod pod-liveness --port=80
service/svc exposed
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl get ep
NAME ENDPOINTS AGE
svc 10.244.70.50:80,10.244.70.51:80,10.244.70.52:80 16s
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
svc ClusterIP 10.104.246.121 <none> 80/TCP 36s
┌──[[email protected]]-[~/ansible/liveness-probe]
└─$kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod-liveness 1/1 Running 0 24m 10.244.70.50 vms83.liruilongs.github.io <none> <none>
pod-liveness-1 1/1 Running 0 23m 10.244.70.51 vms83.liruilongs.github.io <none> <none>
pod-liveness-2 1/1 Running 0 23m 10.244.70.52 vms83.liruilongs.github.io <none> <none>
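The `kubectl expose` call above generates a Service whose selector is copied from the Pod's labels, which is why all three Pods relabeled to run=pod-liveness show up as endpoints. A declarative sketch that should be roughly equivalent is shown below. The field values are taken from the commands above; setting targetPort to the same value as port is an assumption about what the command produced.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: svc
  namespace: liveness-probe
spec:
  type: ClusterIP
  selector:
    run: pod-liveness      # matches all three relabeled Pods
  ports:
  - port: 80
    targetPort: 80         # assumed to equal the service port
    protocol: TCP
```

Only Pods that match the selector and are Ready are listed as endpoints, which is the behavior the following test relies on.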
Test that the Service works: requests are load-balanced across the three Pods
┌──[[email protected]]-[~/ansible]
└─$while true; do curl 10.104.246.121 ; sleep 1
> done
pod-liveness
pod-liveness-2
pod-liveness
pod-liveness-1
pod-liveness-2
^C
Delete the file and test again
┌──[[email protected]]-[~/ansible]
└─$kubectl exec -it pod-liveness -- rm -rf /tmp/
┌──[[email protected]]-[~/ansible]
└─$kubectl exec -it pod-liveness -- ls /tmp/
ls: cannot access '/tmp/': No such file or directory
command terminated with exit code 2
┌──[[email protected]]-[~/ansible]
└─$while true; do curl 10.104.246.121 ; sleep 1; done
pod-liveness-2
pod-liveness-2
pod-liveness-2
pod-liveness-1
pod-liveness-2
pod-liveness-2
pod-liveness-1
^C
The pod-liveness Pod no longer receives requests.
Health checks used by kubeadm
kube-apiserver.yaml uses both probe types together
┌──[[email protected]]-[~/ansible]
└─$cat /etc/kubernetes/manifests/kube-apiserver.yaml | grep -A 8 readi
    readinessProbe:
      failureThreshold: 3
      httpGet:
        host: 192.168.26.81
        path: /readyz
        port: 6443
        scheme: HTTPS
      periodSeconds: 1
      timeoutSeconds: 15
┌──[[email protected]]-[~/ansible]
└─$cat /etc/kubernetes/manifests/kube-apiserver.yaml | grep -A 9 liveness
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 192.168.26.81
        path: /livez
        port: 6443
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
┌──[[email protected]]-[~/ansible]
└─$