
Installing Ceph with cephadm: environment, prerequisites, errors, and bootstrapping a new cluster

Environment

Operating system and kernel version (minimal install; Python is not included)

[root@node-01 ~]# cat /etc/redhat-release 
CentOS Linux release 8.0.1905 (Core) 
[root@node-01 ~]# uname -r
4.18.0-80.el8.x86_64
[root@node-01 ~]# uname -a
Linux node-01 4.18.0-80.el8.x86_64 #1 SMP Tue Jun 4 09:19:46 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
           

IP addresses and hostnames

192.168.91.133 node-01
192.168.91.134 node-02
192.168.91.135 node-03
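
The article does not say where these mappings are configured; presumably they are appended to /etc/hosts on every node. A minimal sketch:

cat >> /etc/hosts <<'EOF'
192.168.91.133 node-01
192.168.91.134 node-02
192.168.91.135 node-03
EOF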
           

Prerequisites

Disable the firewall and SELinux

Run on all nodes

systemctl disable firewalld
systemctl stop firewalld
sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config 
setenforce 0
           

Configure the yum repositories

Run on all nodes

Note: the official CentOS mirrors no longer serve the CentOS 8.0 (1905) repositories, so downloading cephadm will fail; switch the repositories to the vault source.

[root@node-01 ~]# cat /etc/yum.repos.d/CentOS-AppStream.repo 
# CentOS-AppStream.repo
#
# The mirror system uses the connecting IP address of the client and the
# update status of each mirror to pick mirrors that are updated to and
# geographically close to the client.  You should use this for CentOS updates
# unless you are manually picking other mirrors.
#
# If the mirrorlist= does not work for you, as a fall back you can try the
# remarked out baseurl= line instead.
#
#

[AppStream]
name=CentOS-$releasever - AppStream
#mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=AppStream&infra=$infra
#baseurl=http://mirror.centos.org/$contentdir/$releasever/AppStream/$basearch/os/
# changed the baseurl line
baseurl=http://vault.centos.org/$contentdir/$releasever/AppStream/$basearch/os/
gpgcheck=1
enabled=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-centosofficial

[root@node-01 ~]# cat /etc/yum.repos.d/CentOS-Base.repo 
# CentOS-Base.repo
#
# The mirror system uses the connecting IP address of the client and the
# update status of each mirror to pick mirrors that are updated to and
# geographically close to the client.  You should use this for CentOS updates
# unless you are manually picking other mirrors.
#
# If the mirrorlist= does not work for you, as a fall back you can try the
# remarked out baseurl= line instead.
#
#

[BaseOS]
name=CentOS-$releasever - Base
#mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=BaseOS&infra=$infra
#baseurl=http://mirror.centos.org/$contentdir/$releasever/BaseOS/$basearch/os/
# changed the baseurl line
baseurl=http://vault.centos.org/$contentdir/$releasever/BaseOS/$basearch/os/
gpgcheck=1
enabled=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-centosofficial
           
dnf clean all
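
The same edit can be made in bulk with sed (a sketch; it assumes the stock CentOS 8 repo files shown above, so review the result before refreshing the cache):

sed -i -e 's|^mirrorlist=|#mirrorlist=|g' \
       -e 's|^#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g' \
       /etc/yum.repos.d/CentOS-AppStream.repo /etc/yum.repos.d/CentOS-Base.repo
dnf clean all && dnf makecache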
           

Time synchronization

Run on all nodes

Install the time synchronization software
dnf -y install chrony
           
# enable at boot
systemctl enable chronyd
# start
systemctl start chronyd
# verify
chronyc sources
MS Name/IP address         Stratum Poll Reach LastRx Last sample               
===============================================================================
^+ 139.199.214.202               2   6    33     4  +4521us[+4521us] +/-   79ms
^? makaki.miuku.net              0   6     0     -     +0ns[   +0ns] +/-    0ns
^- de-user.deepinid.deepin.>     3   6    17     6    +34ms[  +34ms] +/-  164ms
^* time.cloudflare.com           3   6    17     7   +122us[  +21ms] +/-  128ms
           

Install podman

Run on all nodes

Error

invalid literal for int() with base 10

[root@node-01 ~]# cephadm bootstrap --mon-ip 192.168.91.133
Traceback (most recent call last):
  File "/usr/sbin/cephadm", line 8571, in <module>
    main()
  File "/usr/sbin/cephadm", line 8557, in main
    check_container_engine(ctx)
  File "/usr/sbin/cephadm", line 2014, in check_container_engine
    engine.get_version(ctx)
  File "/usr/sbin/cephadm", line 197, in get_version
    self._version = _parse_podman_version(out)
  File "/usr/sbin/cephadm", line 1603, in _parse_podman_version
    return tuple(map(to_int, version_str.split('.')))
  File "/usr/sbin/cephadm", line 1601, in to_int
    return to_int(val[0:-1], org_e or e)
  File "/usr/sbin/cephadm", line 1597, in to_int
    raise org_e
  File "/usr/sbin/cephadm", line 1599, in to_int
    return int(val)
ValueError: invalid literal for int() with base 10: ''
           

Solution:

Install podman
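
The install command itself is not captured in the original; on CentOS 8 it is typically:

dnf -y install podman
podman --version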

Install cephadm

Run on all nodes.

Note: installing cephadm also pulls in python3 as a dependency, and the later steps (such as adding hosts) need a python3 environment, so this single install takes care of python3 as well.

dnf install --assumeyes centos-release-ceph-pacific.noarch
dnf install --assumeyes cephadm
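
A quick sanity check that both cephadm and python3 are now present (not part of the original walkthrough):

rpm -q cephadm python3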
           

Bootstrap a new cluster

Modify the cephadm script

Change the container image addresses to mirrors that download faster.

cat /usr/sbin/cephadm |head -75
           

After the change, the image section of the script reads as follows:

# Default container images -----------------------------------------------------
#DEFAULT_IMAGE = 'quay.io/ceph/ceph:v16'
#DEFAULT_IMAGE_IS_MASTER = False
#DEFAULT_IMAGE_RELEASE = 'pacific'
#DEFAULT_PROMETHEUS_IMAGE = 'quay.io/prometheus/prometheus:v2.18.1'
#DEFAULT_NODE_EXPORTER_IMAGE = 'quay.io/prometheus/node-exporter:v0.18.1'
#DEFAULT_ALERT_MANAGER_IMAGE = 'quay.io/prometheus/alertmanager:v0.20.0'
#DEFAULT_GRAFANA_IMAGE = 'quay.io/ceph/ceph-grafana:6.7.4'
#DEFAULT_HAPROXY_IMAGE = 'docker.io/library/haproxy:2.3'
#DEFAULT_KEEPALIVED_IMAGE = 'docker.io/arcts/keepalived'
#DEFAULT_REGISTRY = 'docker.io'   # normalize unqualified digests to this
# ------------------------------------------------------------------------------

# Default container images -----------------------------------------------------
DEFAULT_IMAGE = 'docker.io/ceph/ceph:v16'
DEFAULT_IMAGE_IS_MASTER = False
DEFAULT_IMAGE_RELEASE = 'pacific'
DEFAULT_PROMETHEUS_IMAGE = 'docker.io/bitnami/prometheus:latest'
DEFAULT_NODE_EXPORTER_IMAGE = 'docker.io/bitnami/node-exporter:latest'
DEFAULT_ALERT_MANAGER_IMAGE = 'docker.io/prom/alertmanager:latest'
DEFAULT_GRAFANA_IMAGE = 'docker.io/ceph/ceph-grafana:latest'
DEFAULT_HAPROXY_IMAGE = 'docker.io/library/haproxy:2.3'
DEFAULT_KEEPALIVED_IMAGE = 'docker.io/arcts/keepalived'
DEFAULT_REGISTRY = 'docker.io'   # normalize unqualified digests to this
# ------------------------------------------------------------------------------
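
One way to apply these replacements non-interactively (a sketch; back up the script first, and point the images at whichever mirror you prefer):

cp /usr/sbin/cephadm /usr/sbin/cephadm.bak
sed -i \
  -e "s|^DEFAULT_IMAGE = .*|DEFAULT_IMAGE = 'docker.io/ceph/ceph:v16'|" \
  -e "s|^DEFAULT_PROMETHEUS_IMAGE = .*|DEFAULT_PROMETHEUS_IMAGE = 'docker.io/bitnami/prometheus:latest'|" \
  -e "s|^DEFAULT_NODE_EXPORTER_IMAGE = .*|DEFAULT_NODE_EXPORTER_IMAGE = 'docker.io/bitnami/node-exporter:latest'|" \
  -e "s|^DEFAULT_ALERT_MANAGER_IMAGE = .*|DEFAULT_ALERT_MANAGER_IMAGE = 'docker.io/prom/alertmanager:latest'|" \
  -e "s|^DEFAULT_GRAFANA_IMAGE = .*|DEFAULT_GRAFANA_IMAGE = 'docker.io/ceph/ceph-grafana:latest'|" \
  /usr/sbin/cephadm

With the image defaults patched, run the bootstrap on the first node: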
           
[root@node-01 ~]# cephadm bootstrap --mon-ip 192.168.91.133
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
podman (/usr/bin/podman) version 3.3.1 is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Host looks OK
Cluster fsid: 2e99a36a-bfb8-11ec-8fe2-000c29779b64
Verifying IP 192.168.91.133 port 3300 ...
Verifying IP 192.168.91.133 port 6789 ...
Mon IP `192.168.91.133` is in CIDR network `192.168.91.0/24`
- internal network (--cluster-network) has not been provided, OSD replication will default to the public_network
Pulling container image docker.io/ceph/ceph:v16...
Ceph version: ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Waiting for mon to start...
Waiting for mon...
mon is available
Assimilating anything we can from ceph.conf...
Generating new minimal ceph.conf...
Restarting the monitor...
Setting mon public_network to 192.168.91.0/24
Wrote config to /etc/ceph/ceph.conf
Wrote keyring to /etc/ceph/ceph.client.admin.keyring
Creating mgr...
Verifying port 9283 ...
Waiting for mgr to start...
Waiting for mgr...
mgr not available, waiting (1/15)...
mgr not available, waiting (2/15)...
mgr not available, waiting (3/15)...
mgr not available, waiting (4/15)...
mgr not available, waiting (5/15)...
mgr is available
Enabling cephadm module...
Waiting for the mgr to restart...
Waiting for mgr epoch 5...
mgr epoch 5 is available
Setting orchestrator backend to cephadm...
Generating ssh key...
Wrote public SSH key to /etc/ceph/ceph.pub
Adding key to root@localhost authorized_keys...
Adding host node-01...
Deploying mon service with default placement...
Deploying mgr service with default placement...
Deploying crash service with default placement...
Deploying prometheus service with default placement...
Deploying grafana service with default placement...
Deploying node-exporter service with default placement...
Deploying alertmanager service with default placement...
Enabling the dashboard module...
Waiting for the mgr to restart...
Waiting for mgr epoch 9...
mgr epoch 9 is available
Generating a dashboard self-signed certificate...
Creating initial admin user...
Fetching dashboard port number...
Ceph Dashboard is now available at:

	     URL: https://node-01:8443/
	    User: admin
	Password: ld8nohjdgd

Enabling client.admin keyring and conf on hosts with "admin" label
You can access the Ceph CLI with:

	sudo /usr/sbin/cephadm shell --fsid 2e99a36a-bfb8-11ec-8fe2-000c29779b64 -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring

Please consider enabling telemetry to help improve Ceph:

	ceph telemetry on

For more information see:

	https://docs.ceph.com/docs/pacific/mgr/telemetry/

Bootstrap complete.
           

Save the following information:

Ceph Dashboard is now available at:

	     URL: https://node-01:8443/
	    User: admin
	Password: ld8nohjdgd

Enabling client.admin keyring and conf on hosts with "admin" label
You can access the Ceph CLI with:

	sudo /usr/sbin/cephadm shell --fsid 2e99a36a-bfb8-11ec-8fe2-000c29779b64 -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring

Please consider enabling telemetry to help improve Ceph:

	ceph telemetry on

For more information see:

	https://docs.ceph.com/docs/pacific/mgr/telemetry/

Bootstrap complete.
           

Access the Ceph dashboard

Note: use https; otherwise the page will not load.

https://192.168.91.133:8443


Add hosts

Log into the Ceph CLI (every Ceph operation below is run from the Ceph CLI; this can be avoided with extra setup, which this article does not cover).
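
The command itself is not shown in the original; judging from the "Inferring fsid" output below, it is the standard cephadm shell (the long form with --fsid, -c and -k printed at the end of the bootstrap works just as well):

cephadm shell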

Output

Inferring fsid 2e99a36a-bfb8-11ec-8fe2-000c29779b64
Using recent ceph image docker.io/ceph/ceph@sha256:829ebf54704f2d827de00913b171e5da741aad9b53c1f35ad59251524790eceb
           

Add node-01

Output

Add node-02

Output

Add node-03

Output
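
The add-host commands themselves are not captured above; based on the error messages in the next subsection, they take this form:

ceph orch host add node-01 192.168.91.133
ceph orch host add node-02 192.168.91.134
ceph orch host add node-03 192.168.91.135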

Errors

[ceph: root@node-01 /]# ceph orch host add node-02 192.168.91.134
Error EINVAL: Host node-02 (192.168.91.134) failed check(s): []
           

Cause: podman (or docker) is not installed on that node.

[ceph: root@node-01 /]# ceph orch host add node-03 192.168.91.135
Error EINVAL: Can't communicate with remote host `192.168.91.135`, possibly because python3 is not installed there: cannot send (already closed?)
           

Cause: python3 is not installed on that node.

Solution:

dnf install --assumeyes centos-release-ceph-pacific.noarch
dnf install --assumeyes cephadm
           

View available hosts and devices

[ceph: root@node-01 /]# ceph orch host ls
HOST     ADDR            LABELS  STATUS  
node-01  192.168.91.133                  
node-02  192.168.91.134                  
node-03  192.168.91.135                  
[ceph: root@node-01 /]# ceph orch device ls
Hostname  Path      Type  Serial  Size   Health   Ident  Fault  Available  
node-01   /dev/sdb  hdd           21.4G  Unknown  N/A    N/A    Yes        
node-02   /dev/sdb  hdd           21.4G  Unknown  N/A    N/A    Yes        
           

Note: node-03's device does not show up, even though node-03 does have an available device. Why?

Because the containers for node-03 have not started yet (a quick way to check is sketched below).
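
Not part of the original, but either of these shows whether the node-03 daemons are up yet:

ceph orch ps node-03
# or, directly on node-03:
podman ps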

Once node-03's containers are up, check again:

[ceph: root@node-01 /]# ceph orch host ls
HOST     ADDR            LABELS  STATUS  
node-01  192.168.91.133                  
node-02  192.168.91.134                  
node-03  192.168.91.135                  
[ceph: root@node-01 /]# ceph orch device ls
Hostname  Path      Type  Serial  Size   Health   Ident  Fault  Available  
node-01   /dev/sdb  hdd           21.4G  Unknown  N/A    N/A    Yes        
node-02   /dev/sdb  hdd           21.4G  Unknown  N/A    N/A    Yes        
node-03   /dev/sdb  hdd           21.4G  Unknown  N/A    N/A    Yes       
           

Containers currently running on each host

[root@node-01 ~]# podman ps -a
CONTAINER ID  IMAGE                                                                                        COMMAND               CREATED       STATUS           PORTS       NAMES
a620ac0a433a  docker.io/ceph/ceph:v16                                                                      -n mon.node-01 -f...  22 hours ago  Up 22 hours ago              ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-mon-node-01
5fbcb8b47e20  docker.io/ceph/ceph:v16                                                                      -n mgr.node-01.vk...  22 hours ago  Up 22 hours ago              ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-mgr-node-01-vkduxo
8e15dc167b28  docker.io/ceph/ceph@sha256:829ebf54704f2d827de00913b171e5da741aad9b53c1f35ad59251524790eceb  -n client.crash.n...  22 hours ago  Up 22 hours ago              ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-crash.node-01
5a39b8488648  docker.io/prom/node-exporter:v0.18.1                                                         --no-collector.ti...  22 hours ago  Up 22 hours ago              ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-node-exporter.node-01
dc95861f41f8  docker.io/ceph/ceph@sha256:829ebf54704f2d827de00913b171e5da741aad9b53c1f35ad59251524790eceb                        21 hours ago  Up 21 hours ago              romantic_varahamihira
7f35cdb22c80  docker.io/prom/alertmanager:v0.20.0                                                          --cluster.listen-...  19 hours ago  Up 19 hours ago              ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-alertmanager.node-01
82a2ec351349  docker.io/prom/prometheus:v2.18.1                                                            --config.file=/et...  2 hours ago   Up 2 hours ago               ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-prometheus.node-01
           
[root@node-02 ~]# podman ps -a
CONTAINER ID  IMAGE                                                                                        COMMAND               CREATED       STATUS           PORTS       NAMES
a333d817326f  docker.io/ceph/ceph@sha256:829ebf54704f2d827de00913b171e5da741aad9b53c1f35ad59251524790eceb  -n client.crash.n...  19 hours ago  Up 19 hours ago              ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-crash.node-02
a3dfbe105d18  docker.io/ceph/ceph@sha256:829ebf54704f2d827de00913b171e5da741aad9b53c1f35ad59251524790eceb  -n mgr.node-02.kn...  19 hours ago  Up 19 hours ago              ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-mgr.node-02.knnehw
0deab1a1d01e  docker.io/ceph/ceph@sha256:829ebf54704f2d827de00913b171e5da741aad9b53c1f35ad59251524790eceb  -n mon.node-02 -f...  19 hours ago  Up 19 hours ago              ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-mon.node-02
ddd55502e17b  docker.io/prom/node-exporter:v0.18.1                                                         --no-collector.ti...  19 hours ago  Up 19 hours ago              ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-node-exporter.node-02
           
[root@node-03 ~]# podman ps -a
CONTAINER ID  IMAGE                                                                                        COMMAND               CREATED         STATUS             PORTS       NAMES
47ba78c7a826  docker.io/ceph/ceph@sha256:829ebf54704f2d827de00913b171e5da741aad9b53c1f35ad59251524790eceb  -n client.crash.n...  2 hours ago     Up 2 hours ago                 ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-crash.node-03
008e3dc211a3  docker.io/ceph/ceph@sha256:829ebf54704f2d827de00913b171e5da741aad9b53c1f35ad59251524790eceb  -n mon.node-03 -f...  2 hours ago     Up 2 hours ago                 ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-mon.node-03
24166557cea6  docker.io/prom/node-exporter:v0.18.1                                                         --no-collector.ti...  2 hours ago     Up 2 hours ago                 ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-node-exporter.node-03
cc53a5de5faf  docker.io/ceph/ceph@sha256:829ebf54704f2d827de00913b171e5da741aad9b53c1f35ad59251524790eceb  -n osd.0 -f --set...  10 minutes ago  Up 10 minutes ago              ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-osd.0
           

Create OSDs

There are two ways to create OSDs; this article uses the first one.

Method 1: tell Ceph to use every available and unused device
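
The command is not shown in the original; the standard form that produces the "Scheduled osd.all-available-devices update..." output below is:

ceph orch apply osd --all-available-devices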

Output

Scheduled osd.all-available-devices update...
           

Method 2: create an OSD from a specific device on a specific host
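
Again the command itself is not shown; the usual form, using the same host and device as method 1, is:

ceph orch daemon add osd node-01:/dev/sdb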

Output (the OSD on node-01:/dev/sdb was already created by method 1, so this message is returned)

Created no osd(s) on host node-01; already created?
           

驗證叢集狀态

Note: at this point the cluster is healthy.

Verify from the Ceph CLI

[ceph: root@node-01 /]# ceph -s
  cluster:
    id:     2e99a36a-bfb8-11ec-8fe2-000c29779b64
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum node-01,node-02,node-03 (age 2h)
    mgr: node-01.vkduxo(active, since 22h), standbys: node-02.knnehw
    osd: 3 osds: 3 up (since 7m), 3 in (since 7m)
 
  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   15 MiB used, 60 GiB / 60 GiB avail
    pgs:     1 active+clean
           

Verify from the web UI

(dashboard screenshot)