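Overview
This article is based on
https://segmentfault.com/a/1190000012755243. It reorganizes the original, adds explanations, and points out pitfalls to avoid.
Pitfalls encountered: "Installing k8s 1.9.0 in Practice: Problem Collection"
Environment
Environment information (one master node plus two worker nodes)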
192.168.1.137 tensorflow0 node
192.168.1.138 tensorflow1 master
192.168.1.139 tensorflow2 node
Operating system version:
[root@tensorflow1 ~]# cat /etc/redhat-release
CentOS Linux release 7.4.1708 (Core)
Kernel version:
[root@tensorflow1 ~]# cat /proc/version
Linux version 3.10.0-693.el7.x86_64 ([email protected]) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC) ) #1 SMP Tue Aug 22 21:09:27 UTC 2017
Software versions:
kubernetes v1.9.0
docker:17.03.2-ce
kubeadm:v1.9.0
kube-apiserver:v1.9.0
kube-controller-manager:v1.9.0
kube-scheduler:v1.9.0
k8s-dns-sidecar:1.14.7
k8s-dns-kube-dns:1.14.7
k8s-dns-dnsmasq-nanny:1.14.7
kube-proxy:v1.9.0
etcd:3.1.10
pause:3.0
flannel:v0.9.1
kubernetes-dashboard:v1.8.1
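Installing with kubeadm
kubeadm is the automated deployment tool officially recommended by Kubernetes. It deploys the Kubernetes components as pods on the master and worker nodes and automatically handles steps such as certificate generation.
By default kubeadm pulls its images from Google's image registry, which is currently unreachable from mainland China, so I downloaded the images in advance. You only need to import the images from the offline package onto each node.
Getting started
Download
Link:
https://pan.baidu.com/s/1c2O1gIW (password: 9s92)
Verify the md5 checksum and extract the offline package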
MD5 (k8s_images.tar.bz2) = b60ad6a638eda472b8ddcfa9006315ee
Extract the downloaded offline package
tar -xjvf k8s_images.tar.bz2
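The checksum comparison above can be scripted so that extraction only happens when the digest matches. The following is a minimal sketch, assuming GNU coreutils `md5sum` is available on the node (the `MD5 (file) = ...` line above looks like the output of the BSD/macOS `md5` tool instead); `verify_md5` is a hypothetical helper name, not something shipped with the offline package.

```shell
# Hypothetical helper: compare a file's MD5 digest with an expected value.
# Assumes GNU coreutils `md5sum` is installed.
verify_md5() {
    local file="$1" expected="$2" actual
    actual=$(md5sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual)" >&2
        return 1
    fi
}

# Example usage with the digest published above:
# verify_md5 k8s_images.tar.bz2 b60ad6a638eda472b8ddcfa9006315ee \
#     && tar -xjvf k8s_images.tar.bz2
```

Chaining the extraction with `&&` means a corrupted or incomplete download is never unpacked.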
Steps for all nodes
Environment setup
Bind hosts
Add each node's IP address and hostname to the hosts file
[root@tensorflow1 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.137 tensorflow0
192.168.1.138 tensorflow1
192.168.1.139 tensorflow2
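Because this step is repeated on every node, it can help to add the entries idempotently instead of editing the file by hand. A minimal sketch follows; `add_host_entry` is a hypothetical helper, and the simple pattern match assumes hostnames without regular-expression metacharacters.

```shell
# Hypothetical helper: append "ip name" to a hosts file only if the name
# does not already appear as a whole word on a non-comment line.
add_host_entry() {
    local hosts_file="$1" ip="$2" name="$3"
    if grep -Eq "^[^#]*[[:space:]]${name}([[:space:]]|\$)" "$hosts_file"; then
        return 0   # already present, nothing to do
    fi
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
}

# Example usage with the addresses from this article (run as root):
# add_host_entry /etc/hosts 192.168.1.137 tensorflow0
# add_host_entry /etc/hosts 192.168.1.138 tensorflow1
# add_host_entry /etc/hosts 192.168.1.139 tensorflow2
```

Running the helper twice leaves the file unchanged the second time, so it is safe to include in a setup script.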
Disable the firewall
systemctl stop firewalld && systemctl disable firewalld
Disable SELinux
Edit /etc/selinux/config and set SELINUX to disabled
SELINUX=disabled
setenforce 0
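The edit to the SELinux config file can also be made non-interactively. This is a sketch assuming GNU `sed` (for `-i`); `set_selinux_disabled` is a hypothetical helper name. The pattern is anchored on `SELINUX=` so the separate `SELINUXTYPE=` line is left untouched.

```shell
# Hypothetical helper: rewrite the SELINUX= line in the given config file.
# Assumes GNU sed for in-place editing.
set_selinux_disabled() {
    local config="$1"
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$config"
}

# Example usage (run as root):
# set_selinux_disabled /etc/selinux/config
# setenforce 0
```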
Disable swap
swapoff -a
Disable swap permanently
Edit /etc/fstab and comment out the swap line with #.
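Commenting out the swap entry can be scripted as well. The sketch below assumes GNU `sed` and a conventional fstab layout in which the swap entry contains the word `swap` as a whitespace-separated field; `comment_out_swap` is a hypothetical helper, and it writes a `.bak` copy first since a broken fstab can prevent booting.

```shell
# Hypothetical helper: prefix active swap entries in an fstab file with '#'.
# Assumes GNU sed; keeps a backup copy next to the original.
comment_out_swap() {
    local fstab="$1"
    cp "$fstab" "$fstab.bak"
    # Only lines that are not already comments and contain a "swap" field.
    sed -i -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$fstab"
}

# Example usage (run as root):
# comment_out_swap /etc/fstab
```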
Configure the kernel bridge parameters so that kubeadm does not report routing warnings
echo "
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
" >> /etc/sysctl.conf
sysctl -p
Install Docker
Install docker-ce 17.03 (kubeadm v1.9 supports docker-ce up to 17.03)
rpm -ihv docker-ce-selinux-17.03.2.ce-1.el7.centos.noarch.rpm
rpm -ivh docker-ce-17.03.2.ce-1.el7.centos.x86_64.rpm
Start docker-ce
systemctl start docker && systemctl enable docker
Check the docker service
systemctl status docker
If the status shows active (running), Docker is working.
Install k8s
Import the images
docker load <etcd-amd64_v3.1.10.tar
docker load <flannel\:v0.9.1-amd64.tar
docker load <k8s-dns-dnsmasq-nanny-amd64_v1.14.7.tar
docker load <k8s-dns-kube-dns-amd64_1.14.7.tar
docker load <k8s-dns-sidecar-amd64_1.14.7.tar
docker load <kube-apiserver-amd64_v1.9.0.tar
docker load <kube-controller-manager-amd64_v1.9.0.tar
docker load <kube-scheduler-amd64_v1.9.0.tar
docker load <kube-proxy-amd64_v1.9.0.tar
docker load <pause-amd64_3.0.tar
docker load <kubernetes-dashboard_v1.8.1.tar
Note that kubernetes-dashboard_v1.8.1.tar is not in the same directory as the other packages; it is one level up.
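Instead of typing one `docker load` per archive, the import can be written as a loop over every `.tar` file in a directory. This is only a sketch: `load_images` is a hypothetical helper, and the `DOCKER` variable is an assumption added here so the loop can be dry-run with another command; it defaults to the real `docker` binary.

```shell
# Hypothetical helper: load every *.tar image archive found in a directory.
# DOCKER is an optional override (an assumption for dry runs).
load_images() {
    local dir="$1" f
    for f in "$dir"/*.tar; do
        [ -e "$f" ] || continue          # skip when the glob matches nothing
        echo "loading $f"
        "${DOCKER:-docker}" load -i "$f" || return 1
    done
}

# Example usage from the directory that holds the image archives:
# load_images .
```

Since the dashboard archive sits one directory up, it still has to be loaded separately or by calling the helper on that directory as well.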
Install the kubelet, kubeadm, and kubectl packages
rpm -ivh socat-1.7.3.2-2.el7.x86_64.rpm
rpm -ivh kubernetes-cni-0.6.0-0.x86_64.rpm kubelet-1.9.0-0.x86_64.rpm kubectl-1.9.0-0.x86_64.rpm
rpm -ivh kubeadm-1.9.0-0.x86_64.rpm
Modify the kubelet configuration file
Check the Docker cgroup driver:
docker info|grep Cgroup
There are two possible drivers, systemd and cgroupfs. Change the kubelet service configuration to match the one Docker uses.
vi /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
Find the line Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd" and, if Docker reports cgroupfs, change systemd to cgroupfs.
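The check and the edit can be combined so that kubelet always follows whatever Docker reports. A minimal sketch, assuming GNU `sed` and that `docker info` prints a `Cgroup Driver: <name>` line as it does in docker-ce 17.03; both function names are hypothetical.

```shell
# Hypothetical helper: read `docker info` output on stdin and print the
# cgroup driver name.
parse_cgroup_driver() {
    awk -F': ' '/Cgroup Driver/ {print $2; exit}'
}

# Hypothetical helper: rewrite the --cgroup-driver flag in a kubelet
# drop-in file. Assumes GNU sed.
set_kubelet_cgroup_driver() {
    local conf="$1" driver="$2"
    sed -i -E "s/--cgroup-driver=[a-z]+/--cgroup-driver=${driver}/" "$conf"
}

# Example usage (run as root):
# driver=$(docker info 2>/dev/null | parse_cgroup_driver)
# set_kubelet_cgroup_driver /etc/systemd/system/kubelet.service.d/10-kubeadm.conf "$driver"
# systemctl daemon-reload
```

The `systemctl daemon-reload` at the end is there because systemd does not pick up changes to a unit drop-in until it is reloaded.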
Start kubelet
systemctl enable kubelet && sudo systemctl start kubelet
Check the kubelet service
systemctl status kubelet
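After kubelet starts, it is normal for it to restart continuously and report that the CA file does not exist. The CA file is generated when kubeadm init runs in a later step, after which kubelet runs normally.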
The kubelet is now restarting every few seconds, as it waits in a crashloop for kubeadm to tell it what to do. This crashloop is expected and normal, please proceed with the next step and the kubelet will start running normally.
Steps for the master node
Initialize the master
kubeadm init --kubernetes-version=v1.9.0 --pod-network-cidr=10.244.0.0/16
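Kubernetes supports multiple network plugins such as flannel, weave, and calico. This guide uses flannel, so the --pod-network-cidr parameter must be set. 10.244.0.0/16 is the default subnet configured in kube-flannel.yml; to use a different subnet, set the --pod-network-cidr parameter of kubeadm init and the Network value in kube-flannel.yml (applied later) to the same subnet.
Save the kubeadm join xxx command from the output; the worker nodes need it later to join the cluster
For example: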
kubeadm join --token 5ce44e.47b6dc4e4b66980f 192.168.1.138:6443 --discovery-token-ca-cert-hash sha256:9d7eac82d66744405c783de5403e1f2bb7191b4c1b350d721b7b8570c62ff83a
If you lose the token, you can list it on the master with kubeadm token list
kubeadm token list
By default a token expires after 24 hours; to join machines after that, generate a new token
kubeadm token create
To obtain the sha256 hash, run the following on the master node:
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
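Once a token and the CA certificate hash are known, the join command has the same shape as the example above and can be assembled from its three parts. The following sketch only builds the command string; `build_join_command` is a hypothetical helper, and it assumes that `kubeadm token create` prints the new token on standard output.

```shell
# Hypothetical helper: assemble a kubeadm join command from a token,
# the API server endpoint, and the sha256 hash of the CA public key.
build_join_command() {
    local token="$1" endpoint="$2" ca_hash="$3"
    printf 'kubeadm join --token %s %s --discovery-token-ca-cert-hash sha256:%s\n' \
        "$token" "$endpoint" "$ca_hash"
}

# Example usage on the master (run as root):
# token=$(kubeadm token create)
# ca_hash=$(openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt \
#     | openssl rsa -pubin -outform der 2>/dev/null \
#     | openssl dgst -sha256 -hex | sed 's/^.* //')
# build_join_command "$token" 192.168.1.138:6443 "$ca_hash"
```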
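As the kubeadm init output indicates, the root user cannot yet control the cluster with kubectl; the environment needs to be configured first
For non-root users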
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
For the root user
The setting can go directly into ~/.bash_profile
echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile
Source the profile to load the environment variable
source ~/.bash_profile
Test with kubectl version
kubectl version
Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.0", GitCommit:"925c127ec6b946659ad0fd596fa959be43f0cc05", GitTreeState:"clean", BuildDate:"2017-12-15T21:07:38Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.0", GitCommit:"925c127ec6b946659ad0fd596fa959be43f0cc05", GitTreeState:"clean", BuildDate:"2017-12-15T20:55:30Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
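Install the pod network. flannel, calico, weave, and macvlan are all options; here we use flannel, taking the manifest straight from the offline package.
To change the subnet, keep the value here in sync with the --pod-network-cidr= value passed to kubeadm init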
vi kube-flannel.yml
Modify the Network entry
"Network": "10.244.0.0/16",
Apply it
kubectl create -f kube-flannel.yml
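If a non-default subnet is used, the Network value can be changed with a one-line substitution before the manifest is applied, which helps keep it identical to the value given to kubeadm init. A sketch assuming GNU `sed` and a manifest that contains the `"Network": "..."` entry shown above; `set_flannel_network` is a hypothetical helper.

```shell
# Hypothetical helper: replace the "Network" CIDR in a flannel manifest.
# Uses '#' as the sed delimiter because the CIDR contains '/'.
set_flannel_network() {
    local manifest="$1" cidr="$2"
    sed -i -E "s#(\"Network\":[[:space:]]*\")[^\"]*\"#\1${cidr}\"#" "$manifest"
}

# Example usage; POD_CIDR is an illustrative variable that should hold the
# same value that was passed to kubeadm init --pod-network-cidr:
# POD_CIDR=10.244.0.0/16
# set_flannel_network kube-flannel.yml "$POD_CIDR"
```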
Steps for the worker nodes
Run the kubeadm join command that kubeadm init printed earlier
kubeadm join --token 5ce44e.47b6dc4e4b66980f 192.168.1.138:6443 --discovery-token-ca-cert-hash sha256:9d7eac82d66744405c783de5403e1f2bb7191b4c1b350d721b7b8570c62ff83a
Confirm on the master node
[root@tensorflow1 hadoop]# kubectl get nodes
NAME          STATUS    ROLES     AGE       VERSION
tensorflow0   Ready     <none>    1d        v1.9.0
tensorflow1   Ready     master    1d        v1.9.0
tensorflow2   Ready     <none>    1d        v1.9.0
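When more nodes are joined, scanning this table by eye becomes error-prone, so the check can be reduced to a filter over the `kubectl get nodes` output. A minimal sketch; `not_ready_nodes` is a hypothetical helper that treats any STATUS other than exactly `Ready` as a problem, which means a cordoned node (for example `Ready,SchedulingDisabled`) is also listed.

```shell
# Hypothetical helper: read `kubectl get nodes` output on stdin and print
# the names of nodes whose STATUS column is not exactly "Ready".
not_ready_nodes() {
    awk 'NR > 1 && $2 != "Ready" {print $1}'
}

# Example usage on the master:
# kubectl get nodes | not_ready_nodes
```

An empty result means every node in the table reported Ready.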
Kubernetes creates a flannel pod and a kube-proxy pod on every node
[root@tensorflow1 hadoop]# kubectl get pods --all-namespaces
NAMESPACE     NAME                                  READY     STATUS    RESTARTS   AGE
kube-system   etcd-tensorflow1                      1/1       Running   0          1d
kube-system   kube-apiserver-tensorflow1            1/1       Running   0          1d
kube-system   kube-controller-manager-tensorflow1   1/1       Running   0          1d
kube-system   kube-dns-6f4fd4bdf-59ttf              3/3       Running   0          1d
kube-system   kube-flannel-ds-fb75p                 1/1       Running   0          1d
kube-system   kube-flannel-ds-ppm2t                 1/1       Running   0          1d
kube-system   kube-flannel-ds-w54wh                 1/1       Running   0          1d
kube-system   kube-proxy-4lftj                      1/1       Running   0          1d
kube-system   kube-proxy-cj4st                      1/1       Running   0          1d
kube-system   kube-proxy-kd7vb                      1/1       Running   0          1d
kube-system   kube-scheduler-tensorflow1            1/1       Running   0          1d
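At this point the basic Kubernetes cluster installation is complete.
-- Dashboard deployment will be added in a later update --
Install the NVIDIA GPU components
These components are needed only if containers must use GPUs; otherwise skip this section.
Edit /etc/docker/daemon.json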
cat /etc/docker/daemon.json
{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}
Restart Docker
systemctl restart docker
Edit the kubelet configuration file
To expose GPU resources, add the line Environment="KUBELET_EXTRA_ARGS=--feature-gates=DevicePlugins=true". Note that it must be placed before the ExecStart= line.
Restart kubelet
systemctl daemon-reload && systemctl restart kubelet
Download the GPU plugin image that matches your GPU model and load it
docker load <
# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
nvidia/k8s-device-plugin 1.9 3325c3b04513 2 weeks ago 63 MB
Start the plugin from the YAML file
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v1.9/nvidia-device-plugin.yml
The file contents are as follows:
[root@tensorflow1 tf_gpu]# cat nvidia-device-plugin.yml
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: nvidia-device-plugin-daemonset
  namespace: kube-system
spec:
  template:
    metadata:
      # Mark this pod as a critical add-on; when enabled, the critical add-on scheduler
      # reserves resources for critical add-on pods so that they can be rescheduled after
      # a failure. This annotation works in tandem with the toleration below.
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ""
      labels:
        name: nvidia-device-plugin-ds
    spec:
      tolerations:
      # Allow this pod to be rescheduled while the node is in "critical add-ons only" mode.
      # This, along with the annotation above marks this pod as a critical add-on.
      - key: CriticalAddonsOnly
        operator: Exists
      containers:
      - image: nvidia/k8s-device-plugin:1.9
        name: nvidia-device-plugin-ctr
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop: ["ALL"]
        volumeMounts:
          - name: device-plugin
            mountPath: /var/lib/kubelet/device-plugins
      volumes:
        - name: device-plugin
          hostPath:
            path: /var/lib/kubelet/device-plugins
systemctl daemon-reload && systemctl restart kubelet
This article is reposted from CSDN:
Offline installation of k8s 1.9.0