Table of Contents
- kubeadm install of k8s 1.12
- 1 Preparing the environment
  - 1.1 Check the output of these two files
  - 1.2 Disable selinux, firewalld, and swap
  - 1.3 ntp configuration
- 2 Install docker
- 3 k8s environment
  - 3.1 Sync the Aliyun kubernetes yum repo locally
  - 3.2 Point to the local repo
  - 3.3 Configure hosts and passwordless SSH between nodes
  - 3.4 Install kubeadm, kubectl, kubelet
  - 3.5 Configure kubelet
  - 3.6 kubeadm init on the master node
  - 3.7 Worker node installation
  - 3.8 heapster installation
  - 3.9 Removing a node
  - 3.10 Fixing the cni0 SNAT problem
  - 3.11 Adding GPU support
kubeadm install of k8s 1.12
Our product needs to upgrade from k8s 1.6 to the latest stable 1.12.3. kubeadm is now GA, so this time we use it to replace the large pile of ansible scripts we relied on before. Because older workloads still depend on components such as heapster, the upgrade had a few twists, recorded here.
1 Preparing the environment
1.1 Check the output of these two files
#make sure both values are 1
cat /proc/sys/net/bridge/bridge-nf-call-ip6tables
cat /proc/sys/net/bridge/bridge-nf-call-iptables
#if either value is not 1, do the following:
#create /etc/sysctl.d/k8s.conf with the following content:
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
#run these two commands to apply the change:
modprobe br_netfilter
sysctl -p /etc/sysctl.d/k8s.conf
1.2 Disable selinux, firewalld, and swap
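The commands for this step were not preserved; a typical CentOS 7 sketch would be:

```shell
setenforce 0                                    # SELinux off immediately
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
systemctl stop firewalld && systemctl disable firewalld
swapoff -a                                      # swap off now
sed -i '/ swap / s/^/#/' /etc/fstab             # and keep it off after reboot
```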
1.3 ntp configuration
- The ntpd and ntpdate services are assumed to be preinstalled on the servers; if not, install them with yum.
systemctl enable ntpd
systemctl start ntpd
- Configure /etc/ntp.conf. Master node configuration:
- Worker node configuration:
- Note: unify the timezone to Shanghai
cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
- Sync the time on the worker nodes:
- Check ntp status
Master node:
Worker nodes:
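The ntp.conf contents above were lost in extraction. As a sketch only, assuming the master (10.6.6.190, the address used in the join command later) syncs from a public pool and the workers sync from the master:

```shell
# /etc/ntp.conf on the master (upstream pool is an assumption)
#   server 0.centos.pool.ntp.org iburst
#   restrict 10.6.6.0/24 nomodify notrap   # allow LAN clients to query
# /etc/ntp.conf on the workers (assumed: sync from the master)
#   server 10.6.6.190 iburst
# one-off sync on a worker, then verify peer status:
ntpdate 10.6.6.190
ntpq -p
```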
2 安裝docker
yum install -y yum-utils device-mapper-persistent-data
yum install -y docker-ce-18.06.0.ce-3.el7.x86_64
#安裝docker 18.06版本,如果之前安裝過更高版本的docker,可能在yum時報錯conflict,需要rpm -e 指定的版本
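The install above assumes the docker-ce yum repo is already reachable and that the daemon still needs to be enabled; a sketch of those assumed steps:

```shell
# assumed: add the upstream docker-ce repo (skip if you use a local mirror)
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
systemctl enable docker && systemctl start docker
docker version    # should report 18.06.0-ce
```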
3 k8s environment
3.1 Sync the Aliyun kubernetes yum repo locally
3.2 Point to the local repo
The nginx setup and so on is omitted here; look up how to build a self-hosted yum repo yourself.
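A sketch of what the resulting repo file might look like (the baseurl host is a placeholder for your own mirror; gpgcheck is disabled only because the mirror is assumed trusted):

```ini
# /etc/yum.repos.d/kubernetes.repo -- self-hosted mirror, illustrative only
[kubernetes]
name=Kubernetes
baseurl=http://yum.example.internal/kubernetes/
enabled=1
gpgcheck=0
```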
3.3 Configure hosts and passwordless SSH between nodes
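No commands survived for this step; a minimal sketch, assuming node1 is the master at 10.6.6.190 (the address used in the join command below) and node2 is a worker whose IP here is illustrative:

```shell
cat >> /etc/hosts <<'EOF'
10.6.6.190 node1
10.6.6.191 node2
EOF
# passwordless SSH from the master to each node
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
ssh-copy-id root@node2
```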
3.4 Install kubeadm, kubectl, kubelet
yum install -y kubeadm-1.12.3-0.x86_64 kubectl-1.12.3-0.x86_64 kubelet-1.12.3-0.x86_64
3.5 Configure kubelet
systemctl enable kubelet
Note: only enable it here; do NOT start it
By inspecting the config files kubeadm drops, you can see where kubelet reads its startup arguments from
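To my knowledge, these are the files involved on a kubeadm 1.12 install (paths assume a CentOS 7 rpm install; adjust for other distros):

```shell
# systemd drop-in that assembles the kubelet command line
cat /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
# flags generated by kubeadm at init/join time
cat /var/lib/kubelet/kubeadm-flags.env
# place for user-supplied extra flags (KUBELET_EXTRA_ARGS)
cat /etc/sysconfig/kubelet
# kubelet's own config file, written by kubeadm
cat /var/lib/kubelet/config.yaml
```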
3.6 kubeadm init on the master node
kubeadm init --kubernetes-version=v1.12.3 --pod-network-cidr=172.30.0.0/16 --service-cidr=10.96.0.0/16 --ignore-preflight-errors=Swap
#you can also run kubeadm config images pull to pre-pull the images
A kubeadm-installed master carries a taint by default and does not accept ordinary pod scheduling, so the coredns pods cannot be created successfully
Copy the admin kubeconfig to /root/.kube on each node
cp /etc/kubernetes/admin.conf /root/.kube/config
kubectl taint nodes node1 node-role.kubernetes.io/master-
#this removes the taint so the master node can take workloads
#install flannel
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# modify flannel to the following configuration
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: flannel
rules:
  - apiGroups:
      - ""
    resources:
      - pods
    verbs:
      - get
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - nodes/status
    verbs:
      - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
  - kind: ServiceAccount
    name: flannel
    namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flannel
  namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "ipMasq": false,
      "plugins": [
        {
          "type": "flannel",
          "ipMasq": false,
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "172.30.0.0/16",
      "Backend": {
        "Type": "host-gw"
      }
    }
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-amd64
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      hostNetwork: true
      nodeSelector:
        beta.kubernetes.io/arch: amd64
      tolerations:
        - operator: Exists
          effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
        - name: install-cni
          image: quay.io/coreos/flannel:v0.10.0-amd64
          command:
            - cp
          args:
            - -f
            - /etc/kube-flannel/cni-conf.json
            - /etc/cni/net.d/10-flannel.conflist
          volumeMounts:
            - name: cni
              mountPath: /etc/cni/net.d
            - name: flannel-cfg
              mountPath: /etc/kube-flannel/
      containers:
        - name: kube-flannel
          image: quay.io/coreos/flannel:v0.10.0-amd64
          command:
            - /opt/bin/flanneld
          args:
            - --ip-masq=false
            - --kube-subnet-mgr
          resources:
            requests:
              cpu: "100m"
              memory: "50Mi"
            limits:
              cpu: "100m"
              memory: "50Mi"
          securityContext:
            privileged: true
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          volumeMounts:
            - name: run
              mountPath: /run
            - name: flannel-cfg
              mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
You also need to open the insecure port on the master, by configuring /etc/kubernetes/manifests/kube-apiserver.yaml as follows:
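The exact flag values were lost above. As a sketch: in k8s 1.12 the insecure port is still controlled by these apiserver flags (port 8080 matches the curl commands later in this post; binding on 0.0.0.0 is my assumption):

```yaml
# add under the kube-apiserver container's command: list
    - --insecure-port=8080
    - --insecure-bind-address=0.0.0.0
```

The static pod restarts automatically once the manifest file is saved.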
3.7 Worker node installation
Install kubeadm, kubelet, and kubectl on the workers the same way as on the master;
# join the worker to the cluster
kubeadm join 10.6.6.190:6443 --token i3r115.q4ucaz0uio489438 --discovery-token-ca-cert-hash sha256:eb9daf9067902d23069c3910caefc5651d3156809fec18afee17e158c57f6afe --ignore-preflight-errors=Swap
#if the images were not pre-loaded on the worker, it also needs to pull them through the proxy.
#if you forgot the token and ca hash, generate a new token and recover the hash with:
kubeadm token create
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
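Purely to illustrate what that pipeline produces, here is a self-contained run against a throwaway certificate (the paths and CN are made up; on a real cluster you would point it at /etc/kubernetes/pki/ca.crt):

```shell
# generate a throwaway "CA" cert, then compute the sha256 digest of its
# DER-encoded public key -- the format kubeadm expects after "sha256:"
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=demo-ca" \
  -keyout /tmp/demo-ca.key -out /tmp/demo-ca.crt -days 1 2>/dev/null
hash=$(openssl x509 -pubkey -in /tmp/demo-ca.crt \
  | openssl rsa -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 -hex | sed 's/^.* //')
echo "$hash"    # 64 hex characters
```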
scp ~/.kube/config from the master to the corresponding directory on the workers
3.8 heapster installation
wget https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/rbac/heapster-rbac.yaml
wget https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/heapster.yaml
wget https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/influxdb.yaml
#note! heapster will fail, with an error like:
E0823 04:43:05.012122 1 manager.go:101] Error in scraping containers from kubelet:192.168.1.67:10255: failed to get all container stats from Kubelet URL "http://192.168.1.67:10255/stats/container/": Post http://192.168.1.67:10255/stats/container/: dial tcp 192.168.1.67:10255: getsockopt: connection refused
#fix: configure the following in the yaml file
- --source=kubernetes:https://kubernetes.default?kubeletHttps=true&kubeletPort=10250&insecure=true
- --sink=influxdb:http://monitoring-influxdb.kube-system.svc.cluster.local:8086
influxdb will, because of dns, resolve a wrong address and try to bind it on port 8088
Fix:
Note again:
the proxy api path changed in k8s 1.12
it used to be /api/v1/proxy/namespaces/xxx/pods/xxx
now it is /api/v1/namespaces/xxx/pods/xxx/proxy
kubectl cluster-info lists the services
Accessing an nginx service
The heapster access path therefore becomes:
curl 10.6.6.190:8080/api/v1/namespaces/kube-system/services/heapster/proxy/api/v1/model/nodes/lct-k8s-1/metrics/cpu/usage
3.9 Removing a node
#on the master node:
kubectl drain node2 --delete-local-data --force --ignore-daemonsets
kubectl delete node node2
#on node2:
kubeadm reset
ifconfig cni0 down
ip link delete cni0
ifconfig flannel.1 down
ip link delete flannel.1
rm -rf /var/lib/cni/
3.10 Fixing the cni0 SNAT problem
# insert a rule in the first position
iptables -t nat -I POSTROUTING 1 -s 172.30.0.0/16 -d 172.30.0.0/16 -j ACCEPT
#to delete the rule again
iptables -t nat -D POSTROUTING -s 172.30.0.0/16 -d 172.30.0.0/16 -j ACCEPT
3.11 Adding GPU support
- Install nvidia-docker
# If you have nvidia-docker 1.0 installed, remove it and all existing GPU containers
docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
sudo yum remove nvidia-docker
# Add the package repositories
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | \
  sudo tee /etc/yum.repos.d/nvidia-docker.repo
# Install nvidia-docker2 and reload the Docker daemon configuration
sudo yum install -y nvidia-docker2
sudo pkill -SIGHUP dockerd
# Test nvidia-smi with the latest official CUDA image
docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi
- Install the device plugin
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v1.11/nvidia-device-plugin.yml
- kubelet configuration
--feature-gates=DevicePlugins=true
- Restart the kubelet service
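One way to wire in that flag and restart on a kubeadm node (the KUBELET_EXTRA_ARGS file path assumes a CentOS rpm install; adjust if yours differs):

```shell
# append the feature gate to kubelet's extra args, then restart
echo 'KUBELET_EXTRA_ARGS=--feature-gates=DevicePlugins=true' >> /etc/sysconfig/kubelet
systemctl daemon-reload
systemctl restart kubelet
```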