
K8s Troubleshooting: REJECT Rule Not Taking Effect for a Service Without Endpoints

Author: 雲原生知識星球

Problem Background

A customer's firewall captured requests addressed to a Service that has no Endpoints. From the Kubernetes point of view this should not happen: under normal circumstances, a request to a Service without Endpoints is rejected by an iptables rule.
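In iptables mode it is kube-proxy that installs these REJECT rules, each tagged with a "has no endpoints" comment. A small self-contained sketch of how such services can be enumerated from an `iptables-save` dump (the sample rule below is embedded so the script runs anywhere; on a real node you would pipe in `iptables-save` instead):

```shell
#!/bin/sh
# Enumerate Services that kube-proxy has flagged as having no endpoints.
# "sample" stands in for real `iptables-save` output.
sample='-A KUBE-SERVICES -d 10.96.52.101/32 -p tcp -m comment --comment "kube-system/grafana-service111: has no endpoints" -m tcp --dport 3000 -j REJECT --reject-with icmp-port-unreachable'

# Keep only the REJECT rules for endpoint-less services and extract the
# "namespace/name" part of the comment.
printf '%s\n' "$sample" |
  grep 'has no endpoints' |
  sed 's/.*--comment "\([^:"]*\).*/\1/'
```

Running this over a full dump lists every endpoint-less service, one per line (here: `kube-system/grafana-service111`).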

Analysis

First, reproduce the issue locally by creating a service with no backends, e.g. grafana-service111:

[root@node01 ~]# kubectl get svc -A
NAMESPACE     NAME                     TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE
default       kubernetes               ClusterIP   10.96.0.1       <none>        443/TCP                  2d
kube-system   grafana-service          ClusterIP   10.96.78.163    <none>        3000/TCP                 2d
kube-system   grafana-service111       ClusterIP   10.96.52.101    <none>        3000/TCP                 13s

[root@node01 ~]# kubectl get ep -A
NAMESPACE     NAME                      ENDPOINTS                                                       AGE
default       kubernetes                10.10.72.15:6443                                                2d
kube-system   grafana-service           10.78.104.6:3000,10.78.135.5:3000             2d
kube-system   grafana-service111        <none>                                                    18s           

Exec into a workload Pod and request grafana-service111; the request hangs and is eventually terminated by a timeout:

[root@node01 ~]# kubectl exec -it -n kube-system   influxdb-rs1-5bdc67f4cb-lnfgt bash
root@influxdb-rs1-5bdc67f4cb-lnfgt:/# time curl http://10.96.52.101:3000
curl: (7) Failed to connect to 10.96.52.101 port 3000: Connection timed out

real    2m7.307s
user    0m0.006s
sys     0m0.008s           

Check the iptables rules for grafana-service111: a REJECT rule does exist, but judging from the test above it is not taking effect:

[root@node01 ~]# iptables-save |grep 10.96.52.101
-A KUBE-SERVICES -d 10.96.52.101/32 -p tcp -m comment --comment "kube-system/grafana-service111: has no endpoints" -m tcp --dport 3000 -j REJECT --reject-with icmp-port-unreachable           

Capture packets on the Pod's container interface: no reply packet is seen (not as expected):

[root@node01 ~]# tcpdump -n -i calie2568ca85e4 host 10.96.52.101
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on calie2568ca85e4, link-type EN10MB (Ethernet), capture size 262144 bytes
20:31:34.647286 IP 10.78.166.136.39230 > 10.96.52.101.hbci: Flags [S], seq 1890821953, win 29200, options [mss 1460,sackOK,TS val 792301056 ecr 0,nop,wscale 7], length 0           
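A working REJECT rule would answer every SYN with an ICMP port-unreachable reply, so a quick way to confirm the rule never fired is to count SYNs against ICMP replies in the capture. A sketch over the SYN line from the capture above (options trimmed; this helper is illustrative, not part of the original diagnosis):

```shell
#!/bin/sh
# Count outgoing SYNs vs ICMP unreachable replies in tcpdump text output.
# On the faulty node the capture contains only the SYN, so the counts
# come out unbalanced.
capture='20:31:34.647286 IP 10.78.166.136.39230 > 10.96.52.101.hbci: Flags [S], seq 1890821953, win 29200, length 0'

syns=$(printf '%s\n' "$capture" | grep -c 'Flags \[S\]')
icmp=$(printf '%s\n' "$capture" | grep -c 'unreachable' || true)
echo "SYN=$syns ICMP-unreachable=$icmp"
```

Here it prints `SYN=1 ICMP-unreachable=0`, matching the "no response packet" observation.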

Capture packets on the node NIC: the service request packets show up there (not as expected, since rejected traffic should never leave the node):

[root@node01 ~]# tcpdump -n -i eth0 host 10.96.52.101
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
20:33:36.994881 IP 10.10.72.10.41234 > 10.96.52.101.hbci: Flags [S], seq 3530065013, win 29200, options [mss 1460,sackOK,TS val 792423403 ecr 0,nop,wscale 7], length 0
20:33:37.995298 IP 10.10.72.10.41234 > 10.96.52.101.hbci: Flags [S], seq 3530065013, win 29200, options [mss 1460,sackOK,TS val 792424404 ecr 0,nop,wscale 7], length 0
20:33:39.999285 IP 10.10.72.10.41234 > 10.96.52.101.hbci: Flags [S], seq 3530065013, win 29200, options [mss 1460,sackOK,TS val 792426408 ecr 0,nop,wscale 7], length 0           

Since the REJECT rule is present, two components could plausibly be interfering with it:

  1. kube-proxy
  2. calico-node

Following the previous article "Deploying a K8s Cluster in One Step with Kubeasz", the same test was run on a cluster built from the latest versions, and the problem did not reproduce, indicating it has been fixed in newer releases. Searching the Kubernetes and Calico issue trackers showed this to be a Calico bug; see references [1] and [2] for the issues and [3] for the fix.

Here is how the old and new Calico versions differ in their handling of the cali-FORWARD chain:

The problematic environment:
[root@node4 ~]# iptables -t filter -S  cali-FORWARD
-N cali-FORWARD
-A cali-FORWARD -m comment --comment "cali:vjrMJCRpqwy5oRoX" -j MARK --set-xmark 0x0/0xe0000
-A cali-FORWARD -m comment --comment "cali:A_sPAO0mcxbT9mOV" -m mark --mark 0x0/0x10000 -j cali-from-hep-forward
-A cali-FORWARD -i cali+ -m comment --comment "cali:8ZoYfO5HKXWbB3pk" -j cali-from-wl-dispatch
-A cali-FORWARD -o cali+ -m comment --comment "cali:jdEuaPBe14V2hutn" -j cali-to-wl-dispatch
-A cali-FORWARD -m comment --comment "cali:12bc6HljsMKsmfr-" -j cali-to-hep-forward
-A cali-FORWARD -m comment --comment "cali:MH9kMp5aNICL-Olv" -m comment --comment "Policy explicitly accepted packet." -m mark --mark 0x10000/0x10000 -j ACCEPT
// The problem is this last rule: it ACCEPTs policy-accepted packets inside cali-FORWARD,
// which terminates FORWARD traversal before kube-proxy's KUBE-SERVICES REJECT rule is
// ever evaluated. Newer Calico versions moved this rule into the FORWARD chain itself.

A healthy environment:
[root@node01 ~]# iptables -t filter -S cali-FORWARD
-N cali-FORWARD
-A cali-FORWARD -m comment --comment "cali:vjrMJCRpqwy5oRoX" -j MARK --set-xmark 0x0/0xe0000
-A cali-FORWARD -m comment --comment "cali:A_sPAO0mcxbT9mOV" -m mark --mark 0x0/0x10000 -j cali-from-hep-forward
-A cali-FORWARD -i cali+ -m comment --comment "cali:8ZoYfO5HKXWbB3pk" -j cali-from-wl-dispatch
-A cali-FORWARD -o cali+ -m comment --comment "cali:jdEuaPBe14V2hutn" -j cali-to-wl-dispatch
-A cali-FORWARD -m comment --comment "cali:12bc6HljsMKsmfr-" -j cali-to-hep-forward
-A cali-FORWARD -m comment --comment "cali:NOSxoaGx8OIstr1z" -j cali-cidr-block           
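The differing rule can be isolated mechanically by diffing the two chain dumps. Since the `cali:*` comment tags are randomly generated and differ on every rule, the sketch below compares only the rule targets (transcribed by hand from the listings above):

```shell
#!/bin/sh
# Diff the jump targets of the faulty vs fixed cali-FORWARD chains.
old=$(mktemp); new=$(mktemp)
printf '%s\n' MARK cali-from-hep-forward cali-from-wl-dispatch \
  cali-to-wl-dispatch cali-to-hep-forward ACCEPT > "$old"
printf '%s\n' MARK cali-from-hep-forward cali-from-wl-dispatch \
  cali-to-wl-dispatch cali-to-hep-forward cali-cidr-block > "$new"

# diff exits non-zero when the files differ, which is expected here.
changes=$(diff "$old" "$new" || true)
printf '%s\n' "$changes"
rm -f "$old" "$new"
```

The only difference is the final rule: the buggy chain ends in the early `ACCEPT`, while the fixed chain ends in `cali-cidr-block`.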

Below is the record of the same test run on the up-to-date cluster, for comparison with the faulty environment.

Simulate a client workload pod:

[root@node01 home]# kubectl run busybox --image=busybox-curl:v1.0 --image-pull-policy=IfNotPresent -- sleep 300000
pod/busybox created

[root@node01 home]# kubectl get pod -A -owide
NAMESPACE     NAME      READY   STATUS    RESTARTS   AGE   IP             NODE
default       busybox   1/1     Running   0          14h   10.78.153.73   10.10.11.49

Simulate a server-side service, metrics-server111, which has no backend endpoints:

[root@node01 home]# kubectl get svc -A
NAMESPACE     NAME                        TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                  AGE
default       kubernetes                  ClusterIP   10.68.0.1      <none>        443/TCP                  18h
kube-system   dashboard-metrics-scraper   ClusterIP   10.68.174.38   <none>        8000/TCP                 17h
kube-system   kube-dns                    ClusterIP   10.68.0.2      <none>        53/UDP,53/TCP,9153/TCP   17h
kube-system   kube-dns-upstream           ClusterIP   10.68.41.41    <none>        53/UDP,53/TCP            17h
kube-system   kubernetes-dashboard        NodePort    10.68.160.45   <none>        443:30861/TCP            17h
kube-system   metrics-server              ClusterIP   10.68.65.249   <none>        443/TCP                  17h
kube-system   metrics-server111           ClusterIP   10.68.224.53   <none>        443/TCP                  14h
kube-system   node-local-dns              ClusterIP   None           <none>        9253/TCP                 17h

[root@node01 ~]# kubectl get ep -A
NAMESPACE     NAME                        ENDPOINTS                                           AGE
default       kubernetes                  172.28.11.49:6443                                   18h
kube-system   dashboard-metrics-scraper   10.78.153.68:8000                                   18h
kube-system   kube-dns                    10.78.153.67:53,10.78.153.67:53,10.78.153.67:9153   18h
kube-system   kube-dns-upstream           10.78.153.67:53,10.78.153.67:53                     18h
kube-system   kubernetes-dashboard        10.78.153.66:8443                                   18h
kube-system   metrics-server              10.78.153.65:4443                                   18h
kube-system   metrics-server111           <none>                                              15h
kube-system   node-local-dns              172.28.11.49:9253                                   18h           

Exec into the client pod and run a curl test; the request is refused immediately (as expected):

[root@node01 02-k8s]# kubectl exec -it busybox bash
/ # curl -i -k https://10.68.224.53:443
curl: (7) Failed to connect to 10.68.224.53 port 443 after 2 ms: Connection refused           

tcpdump on the container interface shows "tcp port https unreachable" (as expected):

tcpdump -n -i cali12d4a061371
21:54:42.697437 IP 10.78.153.73.41606 > 10.68.224.53.https: Flags [S], seq 3510100476, win 29200, options [mss 1460,sackOK,TS val 2134372616 ecr 0,nop,wscale 7], length 0
21:54:42.698804 IP 10.10.11.49 > 10.78.153.73: ICMP 10.68.224.53 tcp port https unreachable, length 68           

tcpdump on the node NIC shows no request leaving the cluster from the test container (as expected):

[root@node01 bin]# tcpdump -n -i eth0 host 10.68.224.53
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
^C
0 packets captured
2 packets received by filter
0 packets dropped by kernel           

Solution

Upgrade Calico to v3.16.0 or later.
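Before rolling out the upgrade, it can be handy to check each node's installed Calico version against the fixed release. A minimal sketch using `sort -V` for version comparison; the `installed` value is a hypothetical example (on a real node it could come from e.g. `calico-node -v`):

```shell
#!/bin/sh
# Check whether the installed Calico version is at or above the release
# that contains the fix (v3.16.0).
required="3.16.0"
installed="3.15.1"   # hypothetical example value

# sort -V orders version strings numerically; if the smallest of the two
# is the required version, the installed one is new enough.
lowest=$(printf '%s\n%s\n' "$required" "$installed" | sort -V | head -n1)
if [ "$lowest" = "$required" ]; then
  echo "Calico $installed is OK"
else
  echo "Calico $installed is affected; upgrade to >= v$required"
fi
```

With the sample value above it reports that 3.15.1 is affected; swapping in 3.16.2 would report OK.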

References

[1] https://github.com/projectcalico/calico/issues/1055

[2] https://github.com/projectcalico/calico/issues/3901

[3] https://github.com/projectcalico/felix/pull/2424
