The data for every application deployed in Kubernetes is stored in etcd. This data is critical and should be backed up regularly, in case it is ever needed.
Here we use a Kubernetes CronJob to run the backup task. The CronJob's pod must run on the same node as the etcd pod, which is enforced with nodeAffinity.
Backing up etcd data
apiVersion: batch/v2alpha1
kind: CronJob
metadata:
  name: etcd-disaster-recovery
  namespace: cron
spec:
  schedule: "0 22 * * *"
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            app: etcd-disaster-recovery
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: kubernetes.io/role
                    operator: In
                    values:
                    - master
          containers:
          - name: etcd
            image: coreos/etcd:v3.0.17
            command:
            - sh
            - -c
            - "export ETCDCTL_API=3; \
               etcdctl --endpoints $ENDPOINT snapshot save /snapshot/$(date +%Y%m%d_%H%M%S)_snapshot.db; \
               echo etcd backup success"
            env:
            - name: ENDPOINT
              value: "127.0.0.1:2379"
            volumeMounts:
            - mountPath: "/snapshot"
              name: snapshot
              subPath: data/etcd-snapshot
            - mountPath: /etc/localtime
              name: lt-config
            - mountPath: /etc/timezone
              name: tz-config
          restartPolicy: OnFailure
          hostNetwork: true
          volumes:
          - name: snapshot
            persistentVolumeClaim:
              claimName: cron-nas
          - name: lt-config
            hostPath:
              path: /etc/localtime
          - name: tz-config
            hostPath:
              path: /etc/timezone
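Because this CronJob writes a new snapshot every night, the NAS share will fill up over time. A small cleanup sketch can be run periodically alongside it; the directory path and the 7-day retention window here are assumptions, not part of the original setup:

```shell
#!/bin/sh
# Prune old etcd snapshots from the NAS share mounted by the backup CronJob.
# SNAPSHOT_DIR and the 7-day retention are example values; adjust to taste.
SNAPSHOT_DIR="${SNAPSHOT_DIR:-/mnt/nas/data/etcd-snapshot}"

if [ -d "$SNAPSHOT_DIR" ]; then
    # Delete snapshot files older than 7 days, printing each file removed.
    find "$SNAPSHOT_DIR" -name '*_snapshot.db' -type f -mtime +7 -print -delete
fi
```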
Restoring etcd data
The steps below are performed on izbp10mfzkjb2hv7ayu190z; the other two nodes (izbp10mfzkjb2hv7ayu191z and izbp10mfzkjb2hv7ayu192z) are handled the same way.
1. First stop etcd and the apiserver on this node
[root@izbp1ijmrejjh7t2wv7fi0z~]# mv /etc/kubernetes/manifests/etcd.yaml ~/etcd_restore/manifests_backup
[root@izbp1ijmrejjh7t2wv7fi0z~]# mv /etc/kubernetes/manifests/kube-apiserver.yaml ~/etcd_restore/manifests_backup
Confirm that the etcd and kube-apiserver containers have exited:
[root@izbp1ijmrejjh7t2wv7fi0z~]# docker ps -a | grep -E ".*(etcd|kube-api).*kube-system.*"
If the command still shows running containers, run the following:
[root@izbp1ijmrejjh7t2wv7fi0z~]# systemctl restart kubelet
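The manual check above can be automated with a small polling loop. This is a sketch that assumes Docker is the container runtime, as in the original cluster; it blocks until the static etcd and kube-apiserver pods are actually gone:

```shell
#!/bin/sh
# Wait until no etcd / kube-apiserver container from kube-system is still
# running on this node before proceeding with the restore.
while docker ps | grep -E ".*(etcd|kube-api).*kube-system.*" >/dev/null; do
    echo "waiting for etcd/kube-apiserver containers to stop ..."
    sleep 2
done
echo "etcd and kube-apiserver are down on this node"
```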
2. Restore the etcd backup data
[root@izbp1ijmrejjh7t2wv7fi0z~]# rm -rf /var/lib/etcd/member
[root@izbp1ijmrejjh7t2wv7fi0z~]#
ETCDCTL_API=3 etcdctl snapshot restore /mnt/nas/data/etcd-snapshot/20170915_snapshot.db \
    --name etcd-master --initial-cluster etcd-master=http://master.k8s:2380,etcd-master1=http://master1.k8s:2380,etcd-master2=http://master2.k8s:2380 \
    --initial-cluster-token etcd-cluster \
    --initial-advertise-peer-urls http://master.k8s:2380 \
    --data-dir /var/lib/etcd
Note:
Each of these parameters can differ from host to host; they must match the corresponding parameters in /etc/kubernetes/manifests/etcd.yaml on each host.
The data is restored into /var/lib/etcd on the host, because the etcd container started in step 3 mounts that directory.
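For example, on the node that runs the etcd-master1 member (a name taken from the --initial-cluster list above), only the member-specific flags change. This is a sketch; the actual values must match that node's /etc/kubernetes/manifests/etcd.yaml:

```shell
# Restore on the etcd-master1 node: --name and --initial-advertise-peer-urls
# change per node, while --initial-cluster, the token, and the snapshot file
# stay identical on every member.
ETCDCTL_API=3 etcdctl snapshot restore /mnt/nas/data/etcd-snapshot/20170915_snapshot.db \
    --name etcd-master1 \
    --initial-cluster etcd-master=http://master.k8s:2380,etcd-master1=http://master1.k8s:2380,etcd-master2=http://master2.k8s:2380 \
    --initial-cluster-token etcd-cluster \
    --initial-advertise-peer-urls http://master1.k8s:2380 \
    --data-dir /var/lib/etcd
```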
3. Start etcd and the apiserver
[root@izbp1ijmrejjh7t2wv7fi0z~]# mv ~/etcd_restore/manifests_backup/etcd.yaml /etc/kubernetes/manifests/etcd.yaml
[root@izbp1ijmrejjh7t2wv7fi0z~]# mv ~/etcd_restore/manifests_backup/kube-apiserver.yaml /etc/kubernetes/manifests/kube-apiserver.yaml
Verify that etcd and the apiserver are UP:
[root@izbp1ijmrejjh7t2wv7fi0z etcd-snapshot]# kubectl get pod -n kube-system | grep -E ".*(etcd|kube-api).*"
etcd-izbp1ijmrejjh7t2wv7fhyz 1/1 Running 879 23d
etcd-izbp1ijmrejjh7t2wv7fhzz 1/1 Running 106 1d
etcd-izbp1ijmrejjh7t2wv7fi0z 1/1 Running 101 2d
kube-apiserver-izbp1ijmrejjh7t2wv7fhyz 1/1 Running 1 2d
kube-apiserver-izbp1ijmrejjh7t2wv7fhzz 1/1 Running 6 1d
kube-apiserver-izbp1ijmrejjh7t2wv7fi0z 1/1 Running 0 2d
4. Verify that the pods in kube-system and the kubelet service logs on every node show no errors.
Verify that the applications in all namespaces have come back up.
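A quick way to sweep for problems after the restore is sketched below; the grep patterns are examples, not an exhaustive health check:

```shell
# List any pod, in any namespace, that is not fully up
# (anything other than Running/Completed shows in the output).
kubectl get pods --all-namespaces | grep -vE "Running|Completed"

# Scan recent kubelet logs on this node for errors (assumes systemd).
journalctl -u kubelet --since "30 min ago" | grep -i error
```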