Environment
Operating system and kernel version (minimal install; Python is not included)
[root@node-01 ~]# cat /etc/redhat-release
CentOS Linux release 8.0.1905 (Core)
[root@node-01 ~]# uname -r
4.18.0-80.el8.x86_64
[root@node-01 ~]# uname -a
Linux node-01 4.18.0-80.el8.x86_64 #1 SMP Tue Jun 4 09:19:46 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
IP addresses and hostnames
192.168.91.133 node-01
192.168.91.134 node-02
192.168.91.135 node-03
Preliminary steps
Disable the firewall and SELinux
Run on all nodes
systemctl disable firewalld
systemctl stop firewalld
sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config
setenforce 0
Configure the yum repositories
Run on all nodes
Note: the official CentOS mirrors no longer maintain the CentOS 8.0 (1905) repositories, so downloading cephadm fails; switch the repositories to the vault source.
[root@node-01 ~]# cat /etc/yum.repos.d/CentOS-AppStream.repo
# CentOS-AppStream.repo
#
# The mirror system uses the connecting IP address of the client and the
# update status of each mirror to pick mirrors that are updated to and
# geographically close to the client. You should use this for CentOS updates
# unless you are manually picking other mirrors.
#
# If the mirrorlist= does not work for you, as a fall back you can try the
# remarked out baseurl= line instead.
#
#
[AppStream]
name=CentOS-$releasever - AppStream
#mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=AppStream&infra=$infra
#baseurl=http://mirror.centos.org/$contentdir/$releasever/AppStream/$basearch/os/
#changed the baseurl line
baseurl=http://vault.centos.org/$contentdir/$releasever/AppStream/$basearch/os/
gpgcheck=1
enabled=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-centosofficial
[root@node-01 ~]# cat /etc/yum.repos.d/CentOS-Base.repo
# CentOS-Base.repo
#
# The mirror system uses the connecting IP address of the client and the
# update status of each mirror to pick mirrors that are updated to and
# geographically close to the client. You should use this for CentOS updates
# unless you are manually picking other mirrors.
#
# If the mirrorlist= does not work for you, as a fall back you can try the
# remarked out baseurl= line instead.
#
#
[BaseOS]
name=CentOS-$releasever - Base
#mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=BaseOS&infra=$infra
#baseurl=http://mirror.centos.org/$contentdir/$releasever/BaseOS/$basearch/os/
#changed the baseurl line
baseurl=http://vault.centos.org/$contentdir/$releasever/BaseOS/$basearch/os/
gpgcheck=1
enabled=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-centosofficial
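The per-file edits above can also be scripted. A sketch, assuming stock CentOS 8 repo files in which the mirrorlist= line is active and the baseurl= line is commented out:

```shell
# For every CentOS repo file: comment out mirrorlist= and point the
# commented-out baseurl= at vault.centos.org instead of mirror.centos.org.
sed -i \
  -e 's|^mirrorlist=|#mirrorlist=|' \
  -e 's|^#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|' \
  /etc/yum.repos.d/CentOS-*.repo
```

Run dnf clean all afterwards so the next dnf invocation rebuilds its metadata cache against the vault mirrors.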
dnf clean all
Time synchronization
Run on all nodes
Install the time-synchronization software
dnf -y install chrony
#enable at boot
systemctl enable chronyd
#start
systemctl start chronyd
#verify
chronyc sources
MS Name/IP address Stratum Poll Reach LastRx Last sample
===============================================================================
^+ 139.199.214.202 2 6 33 4 +4521us[+4521us] +/- 79ms
^? makaki.miuku.net 0 6 0 - +0ns[ +0ns] +/- 0ns
^- de-user.deepinid.deepin.> 3 6 17 6 +34ms[ +34ms] +/- 164ms
^* time.cloudflare.com 3 6 17 7 +122us[ +21ms] +/- 128ms
Install podman
Run on all nodes
Error (cephadm bootstrap fails because it cannot parse the podman version string)
invalid literal for int() with base 10
[root@node-01 ~]# cephadm bootstrap --mon-ip 192.168.91.133
Traceback (most recent call last):
File "/usr/sbin/cephadm", line 8571, in <module>
main()
File "/usr/sbin/cephadm", line 8557, in main
check_container_engine(ctx)
File "/usr/sbin/cephadm", line 2014, in check_container_engine
engine.get_version(ctx)
File "/usr/sbin/cephadm", line 197, in get_version
self._version = _parse_podman_version(out)
File "/usr/sbin/cephadm", line 1603, in _parse_podman_version
return tuple(map(to_int, version_str.split('.')))
File "/usr/sbin/cephadm", line 1601, in to_int
return to_int(val[0:-1], org_e or e)
File "/usr/sbin/cephadm", line 1597, in to_int
raise org_e
File "/usr/sbin/cephadm", line 1599, in to_int
return int(val)
ValueError: invalid literal for int() with base 10: ''
Resolution: install podman.
Install cephadm
Run on all nodes.
Note: installing cephadm also pulls in python3, which the later operations (such as adding hosts) require, so this one install takes care of python3 as well.
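A minimal install sketch (podman ships in the CentOS 8 AppStream repository configured above):

```shell
# Install podman on every node; the traceback above came from cephadm
# failing to parse an empty/unusable podman version string.
dnf install --assumeyes podman
# Confirm podman reports a clean version string
podman --version
```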
dnf install --assumeyes centos-release-ceph-pacific.noarch
dnf install --assumeyes cephadm
Bootstrap a new cluster
Modify the cephadm script
Change the container image addresses to mirrors that download quickly
cat /usr/sbin/cephadm |head -75
The replaced content is as follows
# Default container images -----------------------------------------------------
#DEFAULT_IMAGE = 'quay.io/ceph/ceph:v16'
#DEFAULT_IMAGE_IS_MASTER = False
#DEFAULT_IMAGE_RELEASE = 'pacific'
#DEFAULT_PROMETHEUS_IMAGE = 'quay.io/prometheus/prometheus:v2.18.1'
#DEFAULT_NODE_EXPORTER_IMAGE = 'quay.io/prometheus/node-exporter:v0.18.1'
#DEFAULT_ALERT_MANAGER_IMAGE = 'quay.io/prometheus/alertmanager:v0.20.0'
#DEFAULT_GRAFANA_IMAGE = 'quay.io/ceph/ceph-grafana:6.7.4'
#DEFAULT_HAPROXY_IMAGE = 'docker.io/library/haproxy:2.3'
#DEFAULT_KEEPALIVED_IMAGE = 'docker.io/arcts/keepalived'
#DEFAULT_REGISTRY = 'docker.io' # normalize unqualified digests to this
# ------------------------------------------------------------------------------
# Default container images -----------------------------------------------------
DEFAULT_IMAGE = 'docker.io/ceph/ceph:v16'
DEFAULT_IMAGE_IS_MASTER = False
DEFAULT_IMAGE_RELEASE = 'pacific'
DEFAULT_PROMETHEUS_IMAGE = 'docker.io/bitnami/prometheus:latest'
DEFAULT_NODE_EXPORTER_IMAGE = 'docker.io/bitnami/node-exporter:latest'
DEFAULT_ALERT_MANAGER_IMAGE = 'docker.io/prom/alertmanager:latest'
DEFAULT_GRAFANA_IMAGE = 'docker.io/ceph/ceph-grafana:latest'
DEFAULT_HAPROXY_IMAGE = 'docker.io/library/haproxy:2.3'
DEFAULT_KEEPALIVED_IMAGE = 'docker.io/arcts/keepalived'
DEFAULT_REGISTRY = 'docker.io' # normalize unqualified digests to this
# ------------------------------------------------------------------------------
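The manual edit above can also be scripted with sed. A sketch using the same image mappings shown above (keep a backup of the script first):

```shell
# Back up the orchestration script, then swap the default quay.io images
# for the docker.io mirrors chosen above. The source tags are the Pacific
# cephadm defaults shown in the commented-out block.
cp /usr/sbin/cephadm /usr/sbin/cephadm.bak
sed -i \
  -e "s#quay.io/ceph/ceph:v16#docker.io/ceph/ceph:v16#" \
  -e "s#quay.io/prometheus/prometheus:v2.18.1#docker.io/bitnami/prometheus:latest#" \
  -e "s#quay.io/prometheus/node-exporter:v0.18.1#docker.io/bitnami/node-exporter:latest#" \
  -e "s#quay.io/prometheus/alertmanager:v0.20.0#docker.io/prom/alertmanager:latest#" \
  -e "s#quay.io/ceph/ceph-grafana:6.7.4#docker.io/ceph/ceph-grafana:latest#" \
  /usr/sbin/cephadm
```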
[root@node-01 ~]# cephadm bootstrap --mon-ip 192.168.91.133
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
podman (/usr/bin/podman) version 3.3.1 is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Host looks OK
Cluster fsid: 2e99a36a-bfb8-11ec-8fe2-000c29779b64
Verifying IP 192.168.91.133 port 3300 ...
Verifying IP 192.168.91.133 port 6789 ...
Mon IP `192.168.91.133` is in CIDR network `192.168.91.0/24`
- internal network (--cluster-network) has not been provided, OSD replication will default to the public_network
Pulling container image docker.io/ceph/ceph:v16...
Ceph version: ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Waiting for mon to start...
Waiting for mon...
mon is available
Assimilating anything we can from ceph.conf...
Generating new minimal ceph.conf...
Restarting the monitor...
Setting mon public_network to 192.168.91.0/24
Wrote config to /etc/ceph/ceph.conf
Wrote keyring to /etc/ceph/ceph.client.admin.keyring
Creating mgr...
Verifying port 9283 ...
Waiting for mgr to start...
Waiting for mgr...
mgr not available, waiting (1/15)...
mgr not available, waiting (2/15)...
mgr not available, waiting (3/15)...
mgr not available, waiting (4/15)...
mgr not available, waiting (5/15)...
mgr is available
Enabling cephadm module...
Waiting for the mgr to restart...
Waiting for mgr epoch 5...
mgr epoch 5 is available
Setting orchestrator backend to cephadm...
Generating ssh key...
Wrote public SSH key to /etc/ceph/ceph.pub
Adding key to root@node-01 authorized_keys...
Adding host node-01...
Deploying mon service with default placement...
Deploying mgr service with default placement...
Deploying crash service with default placement...
Deploying prometheus service with default placement...
Deploying grafana service with default placement...
Deploying node-exporter service with default placement...
Deploying alertmanager service with default placement...
Enabling the dashboard module...
Waiting for the mgr to restart...
Waiting for mgr epoch 9...
mgr epoch 9 is available
Generating a dashboard self-signed certificate...
Creating initial admin user...
Fetching dashboard port number...
Ceph Dashboard is now available at:
URL: https://node-01:8443/
User: admin
Password: ld8nohjdgd
Enabling client.admin keyring and conf on hosts with "admin" label
You can access the Ceph CLI with:
sudo /usr/sbin/cephadm shell --fsid 2e99a36a-bfb8-11ec-8fe2-000c29779b64 -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring
Please consider enabling telemetry to help improve Ceph:
ceph telemetry on
For more information see:
https://docs.ceph.com/docs/pacific/mgr/telemetry/
Bootstrap complete.
Save the following information
Ceph Dashboard is now available at:
URL: https://node-01:8443/
User: admin
Password: ld8nohjdgd
Enabling client.admin keyring and conf on hosts with "admin" label
You can access the Ceph CLI with:
sudo /usr/sbin/cephadm shell --fsid 2e99a36a-bfb8-11ec-8fe2-000c29779b64 -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring
Please consider enabling telemetry to help improve Ceph:
ceph telemetry on
For more information see:
https://docs.ceph.com/docs/pacific/mgr/telemetry/
Bootstrap complete.
Access the Ceph dashboard
Note: use https, otherwise the page cannot be reached.
https://192.168.91.133:8443
Add hosts
Log in to the Ceph command line (every Ceph operation below requires logging into the Ceph CLI first; a passwordless-login setup exists, but this article does not configure it)
Output
Inferring fsid 2e99a36a-bfb8-11ec-8fe2-000c29779b64
Using recent ceph image docker.io/ceph/ceph@sha256:829ebf54704f2d827de00913b171e5da741aad9b53c1f35ad59251524790eceb
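The command that produces the output above is the cephadm shell; a sketch:

```shell
# Open a containerized shell with the ceph CLI and the admin keyring mounted;
# the fsid is inferred from /etc/ceph/ceph.conf
cephadm shell
# One-off commands also work without an interactive shell:
cephadm shell -- ceph -s
```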
Add the node-01 node
Output
Add the node-02 node
Output
Add the node-03 node
Output
Errors
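The commands behind the three add-host steps above follow the pattern visible in the error messages below; a sketch, assuming the cluster's SSH public key must first be copied to each new host:

```shell
# Copy the cluster's public SSH key to the new hosts (run on node-01)
ssh-copy-id -f -i /etc/ceph/ceph.pub root@node-02
ssh-copy-id -f -i /etc/ceph/ceph.pub root@node-03
# Then, inside "cephadm shell", register each host with the orchestrator
ceph orch host add node-01 192.168.91.133
ceph orch host add node-02 192.168.91.134
ceph orch host add node-03 192.168.91.135
```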
[ceph: root@node-01 /]# ceph orch host add node-02 192.168.91.134
Error EINVAL: Host node-02 (192.168.91.134) failed check(s): []
Cause: podman (or docker) is not installed on that node.
[ceph: root@node-01 /]# ceph orch host add node-03 192.168.91.135
Error EINVAL: Can't communicate with remote host `192.168.91.135`, possibly because python3 is not installed there: cannot send (already closed?)
Cause: python3 is not installed on that node.
Resolution (run on the failing nodes):
dnf install --assumeyes centos-release-ceph-pacific.noarch
dnf install --assumeyes cephadm
View available hosts and devices
[ceph: root@node-01 /]# ceph orch host ls
HOST ADDR LABELS STATUS
node-01 192.168.91.133
node-02 192.168.91.134
node-03 192.168.91.135
[ceph: root@node-01 /]# ceph orch device ls
Hostname Path Type Serial Size Health Ident Fault Available
node-01 /dev/sdb hdd 21.4G Unknown N/A N/A Yes
node-02 /dev/sdb hdd 21.4G Unknown N/A N/A Yes
Note: node-03's device is not listed, even though node-03 actually has an available device. Why?
The node-03 containers have not started yet.
Check again once the node-03 containers have started
[ceph: root@node-01 /]# ceph orch host ls
HOST ADDR LABELS STATUS
node-01 192.168.91.133
node-02 192.168.91.134
node-03 192.168.91.135
[ceph: root@node-01 /]# ceph orch device ls
Hostname Path Type Serial Size Health Ident Fault Available
node-01 /dev/sdb hdd 21.4G Unknown N/A N/A Yes
node-02 /dev/sdb hdd 21.4G Unknown N/A N/A Yes
node-03 /dev/sdb hdd 21.4G Unknown N/A N/A Yes
Containers currently running on each host
[root@node-01 ~]# podman ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a620ac0a433a docker.io/ceph/ceph:v16 -n mon.node-01 -f... 22 hours ago Up 22 hours ago ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-mon-node-01
5fbcb8b47e20 docker.io/ceph/ceph:v16 -n mgr.node-01.vk... 22 hours ago Up 22 hours ago ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-mgr-node-01-vkduxo
8e15dc167b28 docker.io/ceph/ceph@sha256:829ebf54704f2d827de00913b171e5da741aad9b53c1f35ad59251524790eceb -n client.crash.n... 22 hours ago Up 22 hours ago ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-crash.node-01
5a39b8488648 docker.io/prom/node-exporter:v0.18.1 --no-collector.ti... 22 hours ago Up 22 hours ago ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-node-exporter.node-01
dc95861f41f8 docker.io/ceph/ceph@sha256:829ebf54704f2d827de00913b171e5da741aad9b53c1f35ad59251524790eceb 21 hours ago Up 21 hours ago romantic_varahamihira
7f35cdb22c80 docker.io/prom/alertmanager:v0.20.0 --cluster.listen-... 19 hours ago Up 19 hours ago ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-alertmanager.node-01
82a2ec351349 docker.io/prom/prometheus:v2.18.1 --config.file=/et... 2 hours ago Up 2 hours ago ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-prometheus.node-01
[root@node-02 ~]# podman ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a333d817326f docker.io/ceph/ceph@sha256:829ebf54704f2d827de00913b171e5da741aad9b53c1f35ad59251524790eceb -n client.crash.n... 19 hours ago Up 19 hours ago ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-crash.node-02
a3dfbe105d18 docker.io/ceph/ceph@sha256:829ebf54704f2d827de00913b171e5da741aad9b53c1f35ad59251524790eceb -n mgr.node-02.kn... 19 hours ago Up 19 hours ago ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-mgr.node-02.knnehw
0deab1a1d01e docker.io/ceph/ceph@sha256:829ebf54704f2d827de00913b171e5da741aad9b53c1f35ad59251524790eceb -n mon.node-02 -f... 19 hours ago Up 19 hours ago ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-mon.node-02
ddd55502e17b docker.io/prom/node-exporter:v0.18.1 --no-collector.ti... 19 hours ago Up 19 hours ago ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-node-exporter.node-02
[root@node-03 ~]# podman ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
47ba78c7a826 docker.io/ceph/ceph@sha256:829ebf54704f2d827de00913b171e5da741aad9b53c1f35ad59251524790eceb -n client.crash.n... 2 hours ago Up 2 hours ago ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-crash.node-03
008e3dc211a3 docker.io/ceph/ceph@sha256:829ebf54704f2d827de00913b171e5da741aad9b53c1f35ad59251524790eceb -n mon.node-03 -f... 2 hours ago Up 2 hours ago ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-mon.node-03
24166557cea6 docker.io/prom/node-exporter:v0.18.1 --no-collector.ti... 2 hours ago Up 2 hours ago ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-node-exporter.node-03
cc53a5de5faf docker.io/ceph/ceph@sha256:829ebf54704f2d827de00913b171e5da741aad9b53c1f35ad59251524790eceb -n osd.0 -f --set... 10 minutes ago Up 10 minutes ago ceph-2e99a36a-bfb8-11ec-8fe2-000c29779b64-osd.0
Create OSDs
There are two ways to create OSDs; this article uses the first.
Method 1: tell Ceph to consume every available, unused device
Output
Scheduled osd.all-available-devices update...
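The scheduler message above is the output of the all-available-devices apply; a sketch:

```shell
# Inside "cephadm shell": have the orchestrator create OSDs on every
# available, unused device, including devices that become available later
ceph orch apply osd --all-available-devices
```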
Method 2: create an OSD from a specific device on a specific host
Output (because method 1 already created an OSD from node-01:/dev/sdb, this message is printed)
Created no osd(s) on host node-01; already created?
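Method 2 names the host and device explicitly; a sketch matching the node-01:/dev/sdb example above:

```shell
# Inside "cephadm shell": create one OSD from a specific device on a
# specific host (host:device syntax)
ceph orch daemon add osd node-01:/dev/sdb
```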
Verify cluster status
Note: at this point the cluster status is healthy.
Verify from the Ceph CLI
[ceph: root@node-01 /]# ceph -s
cluster:
id: 2e99a36a-bfb8-11ec-8fe2-000c29779b64
health: HEALTH_OK
services:
mon: 3 daemons, quorum node-01,node-02,node-03 (age 2h)
mgr: node-01.vkduxo(active, since 22h), standbys: node-02.knnehw
osd: 3 osds: 3 up (since 7m), 3 in (since 7m)
data:
pools: 1 pools, 1 pgs
objects: 0 objects, 0 B
usage: 15 MiB used, 60 GiB / 60 GiB avail
pgs: 1 active+clean
Verify from the web UI