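Preface
Troubleshooting is painful, but it is also fun.
In operations, and in IT generally, troubleshooting ability is one of the main things that sets people apart.
This post records one of my troubleshooting journeys.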
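Background
All of our company's services now run on Alibaba Cloud managed Kubernetes. The front end has to be reachable from outside, so it needs public domain names. Applying for HTTPS certificates by hand is too much trouble, so I wanted a free tool that generates TLS certificates automatically.
The documents I found online, including Alibaba Cloud's own, hide some serious pitfalls, so I think it is worth writing a pitfall-free guide to issuing TLS certificates automatically with cert-manager.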
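Approach
cert-manager is an add-on that manages SSL certificates on Kubernetes. Together with nginx-ingress it can enable HTTPS for a site, and Let's Encrypt provides the SSL certificates for free, which gives the free bundle cert-manager + nginx-ingress + Let's Encrypt. For details, see cert-manager on GitHub.
This post covers requesting single-domain and wildcard certificates on Alibaba Cloud managed Kubernetes with cert-manager.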
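Deployment
Note: most guides online deploy with Helm, and deploying with Helm is indeed very simple. Still, I think it is best not to use someone else's Helm chart: when something goes wrong you do not know how the deployment is put together and end up analyzing the resource manifests anyway.
Prerequisites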
kubectl create namespace cert-manager # create the cert-manager namespace
kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true # label the cert-manager namespace to disable resource validation
Install the CRDs
wget https://github.com/jetstack/cert-manager/releases/download/v0.15.1/cert-manager-legacy.crds.yaml
kubectl apply --validate=false -f cert-manager-legacy.crds.yaml
Install cert-manager
# wget https://github.com/jetstack/cert-manager/releases/download/v0.15.1/cert-manager.yaml
# kubectl apply --validate=false -f cert-manager.yaml
At this point the pods stay stuck in ContainerCreating:
# kubectl get pods -n cert-manager
NAME READY STATUS RESTARTS AGE
cert-manager-75856bb467-fr5zz 0/1 ContainerCreating 0 16m
cert-manager-cainjector-597f5b4768-jqsvp 0/1 ContainerCreating 0 16m
cert-manager-webhook-5c9f7b5f75-gnphd 0/1 ContainerCreating 0 16m
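# Describing the pods shows that the cause is an image pull problem. My workaround:
# pull the images on a Hong Kong machine, retag them, push them to an image registry, then change the image addresses in cert-manager.yaml
# images in the original manifest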
# cat cert-manager.yaml |grep image
image: "quay.io/jetstack/cert-manager-cainjector:v0.15.1"
image: "quay.io/jetstack/cert-manager-controller:v0.15.1"
image: "quay.io/jetstack/cert-manager-webhook:v0.15.1"
# on the Hong Kong machine: pull --> tag --> push
# docker pull quay.io/jetstack/cert-manager-cainjector:v0.15.1
# docker tag quay.io/jetstack/cert-manager-cainjector:v0.15.1 registry.cn-shenzhen.aliyuncs.com/test_test/cert-manager:cert-manager-cainjector-v0.15.1
# docker push registry.cn-shenzhen.aliyuncs.com/test_test/cert-manager:cert-manager-cainjector-v0.15.1
# docker pull quay.io/jetstack/cert-manager-controller:v0.15.1
# docker tag quay.io/jetstack/cert-manager-controller:v0.15.1 registry.cn-shenzhen.aliyuncs.com/test_test/cert-manager:cert-manager-controller-v0.15.1
# docker push registry.cn-shenzhen.aliyuncs.com/test_test/cert-manager:cert-manager-controller-v0.15.1
# docker pull quay.io/jetstack/cert-manager-webhook:v0.15.1
# docker tag quay.io/jetstack/cert-manager-webhook:v0.15.1 registry.cn-shenzhen.aliyuncs.com/test_test/cert-manager:cert-manager-webhook-v0.15.1
# docker push registry.cn-shenzhen.aliyuncs.com/test_test/cert-manager:cert-manager-webhook-v0.15.1
# images in the updated manifest [the temporary registry is public]
# cat cert-manager.yaml |grep image
#image: "quay.io/jetstack/cert-manager-cainjector:v0.15.1"
image: "registry.cn-shenzhen.aliyuncs.com/test_test/cert-manager:cert-manager-cainjector-v0.15.1"
#image: "quay.io/jetstack/cert-manager-controller:v0.15.1"
image: "registry.cn-shenzhen.aliyuncs.com/test_test/cert-manager:cert-manager-controller-v0.15.1"
#image: "quay.io/jetstack/cert-manager-webhook:v0.15.1"
image: "registry.cn-shenzhen.aliyuncs.com/test_test/cert-manager:cert-manager-webhook-v0.15.1"
# kubectl apply --validate=false -f cert-manager.yaml
# kubectl get pods -n cert-manager
NAME READY STATUS RESTARTS AGE
cert-manager-75856bb467-fr5zz 1/1 Running 0 1h
cert-manager-cainjector-597f5b4768-jqsvp 1/1 Running 0 1h
cert-manager-webhook-5c9f7b5f75-gnphd 1/1 Running 0 1h
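The pull, tag, push and manifest edit above can also be scripted. The following is only a sketch under the same assumptions as the commands above: the registry path is the one used in this post, the function names are made up for illustration, and the image steps must run on a host that can reach quay.io.

```shell
# Sketch: mirror the cert-manager images and rewrite the manifest.
# MIRROR_REPO is the registry path used in this post; change it to your own.
MIRROR_REPO="registry.cn-shenzhen.aliyuncs.com/test_test/cert-manager"

# Map quay.io/jetstack/<name>:<tag> to <MIRROR_REPO>:<name>-<tag>.
mirror_ref() {
  local src="$1"
  local name_tag="${src##*/}"   # e.g. cert-manager-webhook:v0.15.1
  local name="${name_tag%%:*}"  # cert-manager-webhook
  local tag="${name_tag##*:}"   # v0.15.1
  printf '%s:%s-%s\n' "$MIRROR_REPO" "$name" "$tag"
}

# Pull, retag and push one image.
mirror_image() {
  local src="$1" dst
  dst="$(mirror_ref "$src")"
  docker pull "$src" && docker tag "$src" "$dst" && docker push "$dst"
}

# Rewrite every quay.io/jetstack image reference in a manifest in place
# (a backup copy is kept as <file>.bak).
rewrite_manifest() {
  sed -i.bak -E \
    "s#quay\.io/jetstack/(cert-manager-[a-z]+):(v[0-9.]+)#${MIRROR_REPO}:\1-\2#g" \
    "$1"
}

# Usage (not run automatically):
#   for c in cainjector controller webhook; do
#     mirror_image "quay.io/jetstack/cert-manager-${c}:v0.15.1"
#   done
#   rewrite_manifest cert-manager.yaml
```

Unlike the hand edit above, the sed rewrite replaces the original image lines instead of leaving them behind as comments; the .bak file keeps the original manifest for reference.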
Verify cert-manager
# cat test-cert-manager.yaml
##########################################################################
#Author: zisefeizhu
#QQ: 2********0
#Date: 2020-08-11
#FileName: test-cert-manager.yaml
#URL: https://www.cnblogs.com/zisefeizhu/
#Description: The test script
#Copyright (C): 2020 All rights reserved
###########################################################################
apiVersion: v1
kind: Namespace
metadata:
  name: cert-manager-test
---
apiVersion: cert-manager.io/v1alpha2
kind: Issuer
metadata:
  name: test-selfsigned
  namespace: cert-manager-test
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: selfsigned-cert
  namespace: cert-manager-test
spec:
  commonName: example.com
  secretName: selfsigned-cert-tls
  issuerRef:
    name: test-selfsigned
# kubectl apply -f test-cert-manager.yaml
# kubectl describe certificate -n cert-manager-test
Normal GeneratedKey 20s cert-manager Generated a new private key
Normal Requested 20s cert-manager Created new CertificateRequest resource "selfsigned-cert-2334779822"
Normal Issued 20s cert-manager Certificate issued successfully
# kubectl delete -f test-cert-manager.yaml
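Instead of reading the events by eye before deleting the test resources, one could block until the Certificate reports Ready. This is a small sketch, not part of the original procedure; wait_cert_ready is a hypothetical helper name and the 120s default timeout is an arbitrary choice.

```shell
# Block until a Certificate reports the Ready condition, or fail after the timeout.
# $1 = certificate name, $2 = namespace, $3 = timeout (defaults to 120s)
wait_cert_ready() {
  kubectl wait --for=condition=Ready "certificate/$1" -n "$2" --timeout="${3:-120s}"
}

# Usage (not run automatically):
#   wait_cert_ready selfsigned-cert cert-manager-test
```

A non-zero exit status means the certificate did not become Ready in time, which makes the check usable in a script.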
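Configure domain certificates through DNS
This section demonstrates the workflow with Alibaba Cloud DNS. For other platforms you can develop your own webhook or look for an existing open-source one. This is the official example: https://github.com/jetstack/cert-manager-webhook-example
The package used here is: https://github.com/pragkent/alidns-webhook
Install the alidns webhook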
# wget https://raw.githubusercontent.com/pragkent/alidns-webhook/master/deploy/bundle.yaml
# kubectl apply -f bundle.yaml
# kubectl get pods -n cert-manager # check the webhook
NAME READY STATUS RESTARTS AGE
cert-manager-webhook-alidns-6b87bc8597-tc9pk 1/1 Running 2 1h
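Configure the Issuer
cert-manager provides two kinds of issuers, Issuer and ClusterIssuer. An Issuer can only issue certificates in its own namespace, while a ClusterIssuer can issue certificates in any namespace.
Create an account through Alibaba Cloud RAM and grant it AliyunDNSFullAccess (permission to manage Alibaba Cloud DNS). Note down the account's AccessKey (AK) and create the secrets with the commands below. The webhook uses these secrets during DNS validation: it writes a TXT record into the DNS zone and deletes it once validation completes.
Create the alidns AccessKey ID and Secret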
# kubectl -n cert-manager create secret generic alidns-access-key-id --from-literal=accessKeyId='xxxxxxx'
# kubectl -n cert-manager create secret generic alidns-access-key-secret --from-literal=accessKeySecret='xxxxxxx'
I use a ClusterIssuer as the example here. Create the file letsencrypt-prod.yaml:
# cat letsencrypt-prod.yaml
##########################################################################
#Author: zisefeizhu
#QQ: 2********0
#Date: 2020-08-10
#FileName: letsencrypt-prod.yaml
#URL: https://www.cnblogs.com/zisefeizhu/
#Description: The test script
#Copyright (C): 2020 All rights reserved
###########################################################################
apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  labels:
    name: letsencrypt-prod
  name: letsencrypt-prod # custom issuer name, referenced later
spec:
  acme:
    email: [email protected] # your email; you get a reminder when a certificate is about to expire, although automatic renewal can be set up
    solvers:
    - http01:
        ingress:
          class: nginx
    privateKeySecretRef:
      name: letsencrypt-prod # the Secret in which this issuer's private key is stored
    server: https://acme-v02.api.letsencrypt.org/directory # the ACME server; we use Let's Encrypt
Let's look at what the ACME server's directory returns:
{
"Dmrr3rQDHDQ": "https://community.letsencrypt.org/t/adding-random-entries-to-the-directory/33417",
"keyChange": "https://acme-v02.api.letsencrypt.org/acme/key-change",
"meta": {
"caaIdentities": [
"letsencrypt.org"
],
"termsOfService": "https://letsencrypt.org/documents/LE-SA-v1.2-November-15-2017.pdf",
"website": "https://letsencrypt.org"
},
"newAccount": "https://acme-v02.api.letsencrypt.org/acme/new-acct",
"newNonce": "https://acme-v02.api.letsencrypt.org/acme/new-nonce",
"newOrder": "https://acme-v02.api.letsencrypt.org/acme/new-order",
"revokeCert": "https://acme-v02.api.letsencrypt.org/acme/revoke-cert"
}
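To look up a single endpoint in that directory from the command line, a rough sketch could look like the following. acme_endpoint is a hypothetical helper, and the sed expression assumes one key per line, as in the pretty-printed response shown above; a proper JSON tool would be more robust.

```shell
# Print the URL stored under one key of an ACME directory document.
# $1 = key name (for example newOrder); the JSON is read from stdin.
acme_endpoint() {
  sed -n "s/.*\"$1\": *\"\([^\"]*\)\".*/\1/p"
}

# Usage (not run automatically, needs network access):
#   curl -s https://acme-v02.api.letsencrypt.org/directory | acme_endpoint newOrder
```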
Apply the YAML
# kubectl apply -f letsencrypt-prod.yaml
Check the status
# kubectl get ClusterIssuer
NAME READY AGE
letsencrypt-prod True 1h
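At this point, automatic TLS certificate issuance with cert-manager on Kubernetes is, in theory, fully deployed. Now let's verify it!
Verification
I will show two ways to verify: 1. issuing a certificate manually 2. issuing a certificate automatically
Note: there is a major pitfall here, so watch out!
Issue a certificate manually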
# cat test-manual-cert.yaml
##########################################################################
#Author: zisefeizhu
#QQ: 2********0
#Date: 2020-08-11
#FileName: test-manual-cert.yaml
#URL: https://www.cnblogs.com/zisefeizhu/
#Description: The test script
#Copyright (C): 2020 All rights reserved
###########################################################################
apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: test-monkeyrun-net-cert
spec:
  secretName: tls-test-monkeyrun-net # name of the secret that stores the certificate
  duration: 2160h # 90d
  renewBefore: 360h # 15d
  organization:
  - jetstack
  isCA: false
  keySize: 2048
  keyAlgorithm: rsa
  keyEncoding: pkcs1
  dnsNames:
  - test01.advance.test.com
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
    group: cert-manager.io
# kubectl apply -f test-manual-cert.yaml
certificate.cert-manager.io/test-monkeyrun-net-cert created
The big pitfall
Warning: here comes the pitfall!!!
# kubectl get certificate # check whether the certificate was generated
NAME READY SECRET AGE
test-monkeyrun-net-cert False tls-test-monkeyrun-net 27s
# kubectl describe certificate test-monkeyrun-net-cert # show details
Status:
Conditions:
Last Transition Time: 2020-08-11T02:53:06Z
Message: Waiting for CertificateRequest "test-monkeyrun-net-cert-1270901994" to complete
Reason: InProgress
Status: False
Type: Ready
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Requested 40s cert-manager Created new CertificateRequest resource "test-monkeyrun-net-cert-1270901994"
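Certificate issuance failed here. The reported reason: Waiting for CertificateRequest "test-monkeyrun-net-cert-1270901994" to complete
The request stays pending, which means it can never be fulfilled. This problem had been bothering me since last Friday. It is a major pitfall, and almost none of the documents I could find mention it.
Solution
Check which nodes the certificate-related components landed on: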
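When a Certificate hangs like this, it can help to look one level down at the objects cert-manager creates on the way to the ACME server (CertificateRequest, Order, Challenge). The helper below is only a sketch with a made-up name; it assumes the certificate lives in the default namespace unless told otherwise.

```shell
# List the objects cert-manager creates between a Certificate and the ACME server.
# $1 = namespace (defaults to "default")
acme_chain() {
  kubectl get certificaterequests,orders,challenges -n "${1:-default}"
}

# The Events section of a pending Challenge usually names the failing step
# (not run automatically):
#   acme_chain default
#   kubectl describe challenges -n default
```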
# kubectl get pods -n cert-manager -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
cert-manager-75856bb467-fr5zz 1/1 Running 0 18h 192.168.0.98 cn-shenzhen.172.16.0.123 <none> <none>
cert-manager-cainjector-597f5b4768-jqsvp 1/1 Running 0 18h 192.168.0.101 cn-shenzhen.172.16.0.123 <none> <none>
cert-manager-webhook-5c9f7b5f75-gnphd 1/1 Running 0 18h 192.168.0.99 cn-shenzhen.172.16.0.123 <none> <none>
cert-manager-webhook-alidns-6b87bc8597-tc9pk 1/1 Running 2 17h 192.168.0.13 cn-shenzhen.172.16.0.122 <none> <none>
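They are on different nodes. Then it hit me, and I remembered something: https://www.cnblogs.com/zisefeizhu/p/13262239.html Could this be the same problem as that one? The certificate issuer's pod and the load balancer are bound to different nodes, so it cannot talk to itself through the ingress. Quite possibly.
Log in to Alibaba Cloud and look at the external traffic policy of this Kubernetes cluster
Sure enough. From what I already understood about how externalTrafficPolicy works, I was 90% sure this was the problem, so I changed it to Cluster.
Note: why log in to the Alibaba Cloud console and click to change this, rather than export the resource manifest and change it on the command line? Because Alibaba Cloud's managed k8s has a pitfall: changing it from the command line changes the IP of the nginx-ingress load balancer, that is, the public IP, and then all your domain names stop working because the IP changed. The IP needs to stay fixed.
Test again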
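The two checks in this section can be done read-only from the command line, without touching the Service. As far as I understand it, with externalTrafficPolicy: Local a node only forwards load balancer traffic to ingress controller pods running on that same node, so a self-check request from a pod on a node without one can fail. The sketch below rests on assumptions: the helper names are made up, the Service name nginx-ingress-lb in kube-system is what I would expect on Alibaba Cloud managed Kubernetes and may differ in your cluster, and the column position matches the `-o wide` output format shown above.

```shell
# Print the distinct NODE values from `kubectl get pods -o wide` output read
# on stdin (NODE is the 7th column in the format shown above; the header row
# is skipped).
pod_nodes() {
  awk 'NR > 1 { print $7 }' | sort -u
}

# Read (never modify) the traffic policy of the ingress LoadBalancer Service.
# $1 = namespace, $2 = Service name; both defaults are assumptions.
ingress_traffic_policy() {
  kubectl -n "${1:-kube-system}" get svc "${2:-nginx-ingress-lb}" \
    -o jsonpath='{.spec.externalTrafficPolicy}'
}

# Usage (not run automatically):
#   kubectl get pods -n cert-manager -o wide | pod_nodes
#   ingress_traffic_policy
```

If pod_nodes prints more than one node and the policy reads Local, the situation matches the one described here.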
# kubectl delete -f test-manual-cert.yaml
certificate.cert-manager.io "test-monkeyrun-net-cert" deleted
# kubectl apply -f test-manual-cert.yaml
certificate.cert-manager.io/test-monkeyrun-net-cert created
# kubectl get certificate
NAME READY SECRET AGE
test-monkeyrun-net-cert True tls-test-monkeyrun-net 4s
🆗! Problem solved, and manual issuance works. Next, automatic issuance!
Issue a certificate automatically
Resource manifest
# cat test-nginx.yaml
##########################################################################
#Author: zisefeizhu
#QQ: 2********0
#Date: 2020-08-10
#FileName: test-nginx.yaml
#URL: https://www.cnblogs.com/zisefeizhu/
#Description: The test script
#Copyright (C): 2020 All rights reserved
###########################################################################
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      run: test-nginx
  template:
    metadata:
      labels:
        run: test-nginx
    spec:
      containers:
      - name: test-nginx
        image: nginx
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: test-nginx
  labels:
    app: test-nginx
spec:
  ports:
  - port: 80
    protocol: TCP
    name: http
  selector:
    run: test-nginx
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: test-nginx
  annotations:
    kubernetes.io/ingress.class: "nginx"
    kubernetes.io/tls-acme: "true"
    cert-manager.io/cluster-issuer: "letsencrypt-prod" # cert-manager v0.11 and later read the cert-manager.io/ prefix, not certmanager.k8s.io/
spec:
  rules:
  - host: test.axxx.rxxxox.com
    http:
      paths:
      - backend:
          serviceName: test-nginx
          servicePort: 80
        path: /
  tls:
  - secretName: tls-test-monkeyrun-net
    hosts:
    - test.axxxxe.rexxxox.com
Test
# kubectl apply -f test-nginx.yaml
deployment.apps/test-nginx created
service/test-nginx created
ingress.extensions/test-nginx created
# kubectl get pods,ingress -n test
NAME READY STATUS RESTARTS AGE
pod/test-nginx-6dcd7c6dc5-fd5xx 1/1 Running 0 31s
NAME HOSTS ADDRESS PORTS AGE
ingress.extensions/test-nginx test.xxxce.rxxxxxxbox.com xxxxxxx 80, 443 31s
# kubectl get secrets -n test
NAME TYPE DATA AGE
acr-credential-560b66540f01e51c18524b09ad7f575f kubernetes.io/dockerconfigjson 1 87m
acr-credential-5dee66918cdf5d93de4aa5cd90247f73 kubernetes.io/dockerconfigjson 1 87m
acr-credential-6731ef77d88edc24b279ebf20860f30f kubernetes.io/dockerconfigjson 1 87m
acr-credential-bab42ef118a2913b05cd8cdb95441d70 kubernetes.io/dockerconfigjson 1 87m
acr-credential-be55512166dd26eda658d0706de5a06a kubernetes.io/dockerconfigjson 1 87m
default-token-22cwv kubernetes.io/service-account-token 3 87m
tls-test-monkeyrun-net kubernetes.io/tls 3 10m
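The presence of the Secret alone does not show who signed the certificate or when it expires. One way to check is to decode tls.crt and hand it to openssl; secret_cert_pem below is a hypothetical helper, and the sketch assumes GNU-style `base64 -d`.

```shell
# Print the PEM certificate stored in a kubernetes.io/tls Secret.
# $1 = secret name, $2 = namespace
secret_cert_pem() {
  kubectl get secret "$1" -n "$2" -o jsonpath='{.data.tls\.crt}' | base64 -d
}

# Usage (not run automatically):
#   secret_cert_pem tls-test-monkeyrun-net test | openssl x509 -noout -subject -issuer -dates
```

The issuer line should name Let's Encrypt if the certificate came from the letsencrypt-prod ClusterIssuer.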
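Log in to Alibaba Cloud to check the certificate information
Container management platform --> Configuration management --> Metadata
Access from a browser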

🆗!
Getting hands-on is like climbing a mountain: every step opens up a new sky.