
Setting up a local KubeSphere + k3s cluster with k3d

Author: 勇往直前的程式員

Preface

With the widespread adoption of containerization technologies such as Docker and Kubernetes (k8s), it has become necessary for software developers to understand and master them. However, k8s itself is heavyweight and resource-hungry, which makes it inconvenient to install on a personal computer and, to some extent, limits our ability to study and experiment with it.

The heavyweight nature of k8s was noticed long ago, so the Rancher team developed k3s, a CNCF-certified lightweight Kubernetes distribution whose API is compatible with Kubernetes and whose binary is only around 100 MB. Small as it is, it has everything you need, which makes it ideal for learning k8s. Combined with k3d (k3s in Docker), it is simply a joy to use.

Many k3d tutorials online simply copy the official examples and are quite dated, so they are of limited value. That is why I wrote this article: to share my experience building a local k3s cluster, the problems I ran into, and how I solved them. I hope it helps you.

Software environment

Windows 11

Docker Desktop for Windows v4.16.3

k3d 5.4.7

k3s v1.24.10+k3s1

kubesphere v3.3.1

It is best not to substitute other versions for the k3d, k3s, and KubeSphere versions listed above.

How to install Docker on Windows is outside the scope of this article; please look it up yourself.

Alternatively, you can install a Linux system with VirtualBox or VMware, install Docker inside the virtual machine, and build the k3s cluster there.

Note: when I installed KubeSphere v3.3.2 on k3s v1.25.6+k3s1, the DevOps plugin could not be enabled no matter whether I enabled it before or after installation, and the logs offered no useful information.

Preparing the k3d and k3s files

  1. Download k3d-windows-amd64.exe from the k3d GitHub releases page, rename the file to k3d.exe, and add its directory to the Path environment variable (so the k3d command can be run directly from a terminal).
  2. Download the offline (airgap) image archive for your architecture from the k3s GitHub releases page (this works around cluster components failing to start when the docker.io domain is unreachable).
(Figure: the k3s GitHub repository)

(Figure: the k3s offline Docker image files, version v1.24.10+k3s1)

If you cannot access GitHub, consider searching for k3d and k3s on a GitHub mirror (https://hub.nuaa.cf/) and downloading the corresponding files from there.

GitHub - k3d-io/k3d: Little helper to run CNCF's k3s in Docker

GitHub - k3s-io/k3s: Lightweight Kubernetes
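As a convenience, the two downloads above can be scripted. The sketch below only assembles the release URLs; the asset file names follow the usual naming on the k3d/k3s release pages, so verify them there before relying on this:

```shell
# Build the GitHub release download URLs for the exact versions used in this article.
K3D_VERSION="v5.4.7"
K3S_VERSION="v1.24.10+k3s1"

# '+' is not valid in a URL path, so encode it as %2B.
K3S_TAG=$(printf '%s' "$K3S_VERSION" | sed 's/+/%2B/')

K3D_URL="https://github.com/k3d-io/k3d/releases/download/${K3D_VERSION}/k3d-windows-amd64.exe"
K3S_IMAGES_URL="https://github.com/k3s-io/k3s/releases/download/${K3S_TAG}/k3s-airgap-images-amd64.tar"

echo "$K3D_URL"
echo "$K3S_IMAGES_URL"
# Uncomment to actually download:
# curl -LO "$K3D_URL"
# curl -LO "$K3S_IMAGES_URL"
```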

Note: k3d itself needs two Docker images (ghcr.io/k3d-io/k3d-proxy:5.4.7 and ghcr.io/k3d-io/k3d-tools:5.4.7). It is recommended to pull them through a Docker registry mirror first and then rename them with docker tag to the names given in parentheses above.
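A sketch of that pull-and-retag step; the prefix registry.example.com/mirror is a placeholder for whatever accelerator or mirror namespace you actually use:

```shell
# Generate the pull/retag commands for the two images k3d needs.
# MIRROR is a placeholder; substitute your own registry mirror's address.
MIRROR="registry.example.com/mirror"

CMDS=""
for img in k3d-proxy k3d-tools; do
  CMDS="${CMDS}docker pull ${MIRROR}/${img}:5.4.7
docker tag ${MIRROR}/${img}:5.4.7 ghcr.io/k3d-io/${img}:5.4.7
"
done

# Print the commands; pipe to sh to execute once MIRROR points at a real mirror.
printf '%s' "$CMDS"
```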

Preparing the KubeSphere installation files

KubeSphere provides Docker images hosted in China, so you can follow the official tutorial, Minimal KubeSphere Installation on Kubernetes. Below are the two YAML files used for the installation.

kubesphere-installer.yaml

---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: clusterconfigurations.installer.kubesphere.io
spec:
  group: installer.kubesphere.io
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              x-kubernetes-preserve-unknown-fields: true
            status:
              type: object
              x-kubernetes-preserve-unknown-fields: true
  scope: Namespaced
  names:
    plural: clusterconfigurations
    singular: clusterconfiguration
    kind: ClusterConfiguration
    shortNames:
      - cc

---
apiVersion: v1
kind: Namespace
metadata:
  name: kubesphere-system

# Manually create a kubesphere-devops-worker namespace (the v3.3.2 install failed complaining
# that this namespace was missing, so on v3.3.1 I add it proactively to see whether it helps)
---
apiVersion: v1
kind: Namespace
metadata:
  name: kubesphere-devops-worker

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ks-installer
  namespace: kubesphere-system

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: ks-installer
rules:
- apiGroups:
  - ""
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - apps
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - extensions
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - batch
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - rbac.authorization.k8s.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - apiregistration.k8s.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - apiextensions.k8s.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - tenant.kubesphere.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - certificates.k8s.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - devops.kubesphere.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - monitoring.coreos.com
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - logging.kubesphere.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - jaegertracing.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - storage.k8s.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - admissionregistration.k8s.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - policy
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - autoscaling
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - networking.istio.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - config.istio.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - iam.kubesphere.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - notification.kubesphere.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - auditing.kubesphere.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - events.kubesphere.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - core.kubefed.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - installer.kubesphere.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - storage.kubesphere.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - security.istio.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - monitoring.kiali.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - kiali.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - networking.k8s.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - edgeruntime.kubesphere.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - types.kubefed.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - monitoring.kubesphere.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - application.kubesphere.io
  resources:
  - '*'
  verbs:
  - '*'


---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: ks-installer
subjects:
- kind: ServiceAccount
  name: ks-installer
  namespace: kubesphere-system
roleRef:
  kind: ClusterRole
  name: ks-installer
  apiGroup: rbac.authorization.k8s.io

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ks-installer
  namespace: kubesphere-system
  labels:
    app: ks-installer
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ks-installer
  template:
    metadata:
      labels:
        app: ks-installer
    spec:
      serviceAccountName: ks-installer
      containers:
      - name: installer
        # Changed to a registry mirror hosted in China
        image: registry.cn-beijing.aliyuncs.com/kubesphereio/ks-installer:v3.3.1
        imagePullPolicy: "Always"
        resources:
          limits:
            cpu: "1"
            memory: 1Gi
          requests:
            cpu: 20m
            memory: 100Mi
        volumeMounts:
        - mountPath: /etc/localtime
          name: host-time
          readOnly: true
      volumes:
      - hostPath:
          path: /etc/localtime
          type: ""
        name: host-time           

cluster-configuration.yaml

---
apiVersion: installer.kubesphere.io/v1alpha1
kind: ClusterConfiguration
metadata:
  name: ks-installer
  namespace: kubesphere-system
  labels:
    version: v3.3.1
spec:
  persistence:
    storageClass: ""        # If there is no default StorageClass in your cluster, you need to specify an existing StorageClass here.
  authentication:
    # adminPassword: ""     # Custom password of the admin user. If the parameter exists but the value is empty, a random password is generated. If the parameter does not exist, P@88w0rd is used.
    jwtSecret: ""           # Keep the jwtSecret consistent with the Host Cluster. Retrieve the jwtSecret by executing "kubectl -n kubesphere-system get cm kubesphere-config -o yaml | grep -v "apiVersion" | grep jwtSecret" on the Host Cluster.
  # Replace the registry address with a mirror hosted in China
  local_registry: "registry.cn-beijing.aliyuncs.com"        # Add your private registry address if it is needed.
  # dev_tag: ""               # Add your kubesphere image tag you want to install, by default it's same as ks-installer release version.
  etcd:
    monitoring: false       # Enable or disable etcd monitoring dashboard installation. You have to create a Secret for etcd before you enable it.
    endpointIps: localhost  # etcd cluster EndpointIps. It can be a bunch of IPs here.
    port: 2379              # etcd port.
    tlsEnable: true
  common:
    core:
      console:
        enableMultiLogin: true  # Enable or disable simultaneous logins. It allows different users to log in with the same account at the same time.
        port: 30880
        type: NodePort

    # apiserver:            # Enlarge the apiserver and controller manager's resource requests and limits for the large cluster
    #  resources: {}
    # controllerManager:
    #  resources: {}
    redis:
      enabled: false
      enableHA: false
      volumeSize: 2Gi # Redis PVC size.
    # Enabled ahead of time to avoid the DevOps installation failing
    openldap:
      enabled: true
      volumeSize: 2Gi   # openldap PVC size.
    minio:
      volumeSize: 20Gi # Minio PVC size.
    monitoring:
      # type: external   # Whether to specify the external prometheus stack, and need to modify the endpoint at the next line.
      endpoint: http://prometheus-operated.kubesphere-monitoring-system.svc:9090 # Prometheus endpoint to get metrics data.
      GPUMonitoring:     # Enable or disable the GPU-related metrics. If you enable this switch but have no GPU resources, Kubesphere will set it to zero.
        enabled: false
    gpu:                 # Install GPUKinds. The default GPU kind is nvidia.com/gpu. Other GPU kinds can be added here according to your needs.
      kinds:
      - resourceName: "nvidia.com/gpu"
        resourceType: "GPU"
        default: true
    es:   # Storage backend for logging, events and auditing.
      # master:
      #   volumeSize: 4Gi  # The volume size of Elasticsearch master nodes.
      #   replicas: 1      # The total number of master nodes. Even numbers are not allowed.
      #   resources: {}
      # data:
      #   volumeSize: 20Gi  # The volume size of Elasticsearch data nodes.
      #   replicas: 1       # The total number of data nodes.
      #   resources: {}
      logMaxAge: 7             # Log retention time in built-in Elasticsearch. It is 7 days by default.
      elkPrefix: logstash      # The string making up index names. The index name will be formatted as ks-<elk_prefix>-log.
      basicAuth:
        enabled: false
        username: ""
        password: ""
      externalElasticsearchHost: ""
      externalElasticsearchPort: ""
  alerting:                # (CPU: 0.1 Core, Memory: 100 MiB) It enables users to customize alerting policies to send messages to receivers in time with different time intervals and alerting levels to choose from.
    enabled: false         # Enable or disable the KubeSphere Alerting System.
    # thanosruler:
    #   replicas: 1
    #   resources: {}
  auditing:                # Provide a security-relevant chronological set of records, recording the sequence of activities happening on the platform, initiated by different tenants.
    enabled: false         # Enable or disable the KubeSphere Auditing Log System.
    # operator:
    #   resources: {}
    # webhook:
    #   resources: {}
  # Enable the DevOps plugin before installation (on k3s v1.25.6+k3s1 with KubeSphere v3.3.2, enabling DevOps failed for me whether done before or after installation)
  devops:                  # (CPU: 0.47 Core, Memory: 8.6 G) Provide an out-of-the-box CI/CD system based on Jenkins, and automated workflow tools including Source-to-Image & Binary-to-Image.
    enabled: true             # Enable or disable the KubeSphere DevOps System.
    # resources: {}
    jenkinsMemoryLim: 2Gi      # Jenkins memory limit.
    jenkinsMemoryReq: 512Mi   # Jenkins memory request.
    jenkinsVolumeSize: 8Gi     # Jenkins volume size.
  events:                  # Provide a graphical web console for Kubernetes Events exporting, filtering and alerting in multi-tenant Kubernetes clusters.
    enabled: false         # Enable or disable the KubeSphere Events System.
    # operator:
    #   resources: {}
    # exporter:
    #   resources: {}
    # ruler:
    #   enabled: true
    #   replicas: 2
    #   resources: {}
  logging:                 # (CPU: 57 m, Memory: 2.76 G) Flexible logging functions are provided for log query, collection and management in a unified console. Additional log collectors can be added, such as Elasticsearch, Kafka and Fluentd.
    enabled: false         # Enable or disable the KubeSphere Logging System.
    logsidecar:
      enabled: true
      replicas: 2
      # resources: {}
  metrics_server:                    # (CPU: 56 m, Memory: 44.35 MiB) It enables HPA (Horizontal Pod Autoscaler).
    enabled: false                   # Enable or disable metrics-server.
  monitoring:
    storageClass: ""                 # If there is an independent StorageClass you need for Prometheus, you can specify it here. The default StorageClass is used by default.
    node_exporter:
      port: 9100
      # resources: {}
    # kube_rbac_proxy:
    #   resources: {}
    # kube_state_metrics:
    #   resources: {}
    # prometheus:
    #   replicas: 1  # Prometheus replicas are responsible for monitoring different segments of data source and providing high availability.
    #   volumeSize: 20Gi  # Prometheus PVC size.
    #   resources: {}
    #   operator:
    #     resources: {}
    # alertmanager:
    #   replicas: 1          # AlertManager Replicas.
    #   resources: {}
    # notification_manager:
    #   resources: {}
    #   operator:
    #     resources: {}
    #   proxy:
    #     resources: {}
    gpu:                           # GPU monitoring-related plug-in installation.
      nvidia_dcgm_exporter:        # Ensure that gpu resources on your hosts can be used normally, otherwise this plug-in will not work properly.
        enabled: false             # Check whether the labels on the GPU hosts contain "nvidia.com/gpu.present=true" to ensure that the DCGM pod is scheduled to these nodes.
        # resources: {}
  multicluster:
    clusterRole: none  # host | member | none  # You can install a solo cluster, or specify it as the Host or Member Cluster.
  network:
    networkpolicy: # Network policies allow network isolation within the same cluster, which means firewalls can be set up between certain instances (Pods).
      # Make sure that the CNI network plugin used by the cluster supports NetworkPolicy. There are a number of CNI network plugins that support NetworkPolicy, including Calico, Cilium, Kube-router, Romana and Weave Net.
      enabled: false # Enable or disable network policies.
    ippool: # Use Pod IP Pools to manage the Pod network address space. Pods to be created can be assigned IP addresses from a Pod IP Pool.
      type: none # Specify "calico" for this field if Calico is used as your CNI plugin. "none" means that Pod IP Pools are disabled.
    topology: # Use Service Topology to view Service-to-Service communication based on Weave Scope.
      type: none # Specify "weave-scope" for this field to enable Service Topology. "none" means that Service Topology is disabled.
  openpitrix: # An App Store that is accessible to all platform tenants. You can use it to manage apps across their entire lifecycle.
    store:
      # Enabled ahead of time to avoid the DevOps installation failing
      enabled: true # Enable or disable the KubeSphere App Store.
  servicemesh:         # (0.3 Core, 300 MiB) Provide fine-grained traffic management, observability and tracing, and visualized traffic topology.
    enabled: false     # Base component (pilot). Enable or disable KubeSphere Service Mesh (Istio-based).
    istio:  # Customizing the istio installation configuration, refer to https://istio.io/latest/docs/setup/additional-setup/customize-installation/
      components:
        ingressGateways:
        - name: istio-ingressgateway
          enabled: false
        cni:
          enabled: false
  edgeruntime:          # Add edge nodes to your cluster and deploy workloads on edge nodes.
    enabled: false
    kubeedge:        # kubeedge configurations
      enabled: false
      cloudCore:
        cloudHub:
          advertiseAddress: # At least a public IP address or an IP address which can be accessed by edge nodes must be provided.
            - ""            # Note that once KubeEdge is enabled, CloudCore will malfunction if the address is not provided.
        service:
          cloudhubNodePort: "30000"
          cloudhubQuicNodePort: "30001"
          cloudhubHttpsNodePort: "30002"
          cloudstreamNodePort: "30003"
          tunnelNodePort: "30004"
        # resources: {}
        # hostNetWork: false
      iptables-manager:
        enabled: true 
        mode: "external"
        # resources: {}
      # edgeService:
      #   resources: {}
  gatekeeper:        # Provide admission policy and rule management, A validating (mutating TBA) webhook that enforces CRD-based policies executed by Open Policy Agent.
    enabled: false   # Enable or disable Gatekeeper.
    # controller_manager:
    #   resources: {}
    # audit:
    #   resources: {}
  terminal:
    # image: 'alpine:3.15' # There must be an nsenter program in the image
    timeout: 600         # Container timeout, if set to 0, no timeout will be used. The unit is seconds           

Installing the k3s cluster

Create a k3s cluster named local (run k3d --help for usage):

k3d cluster create local -i rancher/k3s:v1.24.10-k3s1 --api-port 6443 -p 30880:30880@loadbalancer --agents 1 --servers 1  --registry-config registry.yaml --verbose --trace           

registry.yaml

mirrors:
  # local private Docker registry
  "192.168.138.1:5000":
    endpoint:
      - http://192.168.138.1:5000
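If pulls from docker.io itself also need to go through a mirror, the same file accepts a docker.io entry (this follows the k3s private-registry configuration format; the endpoint below is a placeholder for the accelerator address you actually use):

```yaml
mirrors:
  "docker.io":
    endpoint:
      - https://registry.example.com   # placeholder: your Docker registry mirror
  "192.168.138.1:5000":
    endpoint:
      - http://192.168.138.1:5000
```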

Wait a while for the k3s cluster to finish initializing, then check progress with kubectl get pod -n kube-system. Output like the figure below means initialization is complete.

(Figure: example output after cluster initialization completes)
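Instead of polling kubectl get pod by hand, readiness can also be awaited with kubectl wait (standard kubectl; the guard below simply skips the step on machines without kubectl installed):

```shell
# Block until every kube-system pod reports Ready, up to 5 minutes.
# HAVE_KUBECTL records whether kubectl was found; the step is skipped otherwise.
HAVE_KUBECTL=0
if command -v kubectl >/dev/null 2>&1; then
  HAVE_KUBECTL=1
  kubectl -n kube-system wait --for=condition=Ready pod --all --timeout=300s
fi
```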

Installing KubeSphere

Save the modified kubesphere-installer.yaml and cluster-configuration.yaml to a local directory (mine is D:\study\kubesphere).

Open a cmd terminal in the D:\study\kubesphere directory and run the following two commands:

kubectl apply -f kubesphere-installer.yaml
kubectl apply -f cluster-configuration.yaml           

Then list the pods with kubectl -n kubesphere-system get pod, find the one whose name starts with ks-installer- (call it ks-installer-xxx), and view the installation log with:

kubectl -n kubesphere-system -v 8 logs ks-installer-xxx -f           
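To avoid copying the pod name by hand, it can be captured with a label selector; app=ks-installer comes from the ks-installer Deployment above, and the guard skips the step where kubectl is unavailable:

```shell
# Resolve the ks-installer pod name via its Deployment label, then follow its log.
SELECTOR="app=ks-installer"
if command -v kubectl >/dev/null 2>&1; then
  POD=$(kubectl -n kubesphere-system get pod -l "$SELECTOR" \
        -o jsonpath='{.items[0].metadata.name}')
  kubectl -n kubesphere-system logs "$POD" -f
fi
```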

Output like the figure below means the installation succeeded (it takes quite a while; be patient).

(Figure: log output from a successful KubeSphere installation)

At this point the entire k3s cluster and KubeSphere setup is complete. Enter localhost:30880 in your browser's address bar and, barring surprises, you will see the KubeSphere login page (the first visit is a bit slow).

Problems encountered and how to solve them

Entering admin/P@88w0rd fails to log in to KubeSphere, with a message like this:

request to http://ks-apiserver/oauth/token failed, reason: connect ECONNREFUSED 10.43.51.184:80

Root cause analysis:

The message is clear: a network connectivity problem. At first I suspected the KubeSphere services had been created incorrectly, so requests could not reach ks-apiserver.

# Check the services (ks-apiserver is present)
kubectl -n kubesphere-system get service -o wide
# Check the endpoints to make sure each service matched a pod (ks-apiserver maps to 10.42.1.24:9090; nothing abnormal)
kubectl -n kubesphere-system get endpoints -o wide

Clearly the pods and services were fine, so where was the problem? With that question in mind, I checked the logs of the k3d-local-server-0 node and found this error:

"Failed to execute iptables-restore" err=<           

The log points to iptables. Since k8s uses iptables by default for pod-to-pod communication, I opened a terminal inside k3d-local-server-0 (via the container list in Docker Desktop, or with docker exec -it) and checked whether any iptables rule mentions port 9090:

iptables-save | grep 9090
# The output is empty: the rule really is missing, so the cause is found (I did not dig into the deeper reason, given my limited expertise)

Solution: restart the cluster with k3d (if one restart does not work, try a few more times):

k3d cluster stop local
k3d cluster start local
# After the restart, enter the k3d-local-server-0 node's terminal again; iptables-save | grep 9090 now produces output

# Check whether the kubesphere-system pods start normally (if any pod is in an abnormal state, give it more time)
kubectl -n kubesphere-system get pod -o wide
(Figure: some pods may restart repeatedly after the cluster is restarted)

Open http://localhost:30880 again; after entering the account and password, you will reach the KubeSphere console.
