

Deploying a GFS Cluster on CentOS 7

  • Deploying a GFS Cluster on CentOS 7
    • Preparation
      • Learn about GFS (read the documentation thoroughly!)
      • Set the hostname and hosts file on each cluster node
      • Make sure every node is connected to the SAN or IP-SAN
    • Install the required packages
    • Configure the corosync service
    • Configure LVM
    • Start the related services
    • Create the clustered volume group and format the GFS partition
    • Mount the gfs2 logical volume
    • Common errors

Preparation

Learn about GFS (read the documentation thoroughly!)

The GFS discussed in this article is Red Hat's Global File System, not Google's Google File System. The official GFS documentation is available at the links below:

RHEL 6 edition (Chinese): https://access.redhat.com/documentation/zh-cn/red_hat_enterprise_linux/6/html/global_file_system_2/

RHEL 7 edition (English): https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/global_file_system_2/index

Set the hostname and hosts file on each cluster node

On every node in the cluster, edit the hostname file so that each node's hostname is unique within the cluster, and list the hostnames and IP addresses of all cluster nodes in the hosts file. For example (replace the <IP-of-…> placeholders with each node's real address):

[root@a01 ~]# cat /etc/hostname
a01

[root@a01 ~]# cat /etc/hosts
127.0.0.1       localhost localhost.localdomain localhost4 localhost4.localdomain4
::1             localhost localhost.localdomain localhost6 localhost6.localdomain6
<IP-of-a01>     a01
<IP-of-a02>     a02
<IP-of-a03>     a03
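
The hostname itself can be set with hostnamectl; a minimal sketch, to be repeated on each node with its own name:

# on the first node; run the equivalent on a02 and a03
hostnamectl set-hostname a01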
           

Make sure every node is connected to the SAN or IP-SAN

You can refer to an earlier article of mine (but do not perform its last step, "partitioning and mounting the iSCSI storage volume on the client"; doing so leads to many unexpected problems):

https://blog.csdn.net/huzhenwei/article/details/80690623
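
Before moving on, it is worth confirming on every node that the iSCSI session is established and the multipath device is visible. A quick check, assuming the iscsiadm and multipath tools set up in the referenced article are present:

# list active iSCSI sessions
iscsiadm -m session
# show multipath devices (e.g. /dev/mapper/mpatha, used later in this article)
multipath -ll
# confirm the block device is visible to the kernel
lsblk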

Install the required packages

yum -y install gfs2-utils  lvm2-cluster
yum install lvm2-sysvinit
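
Note that the dlm service started later in this article is shipped in its own package on CentOS 7; if it is not already installed on your nodes, add it as well (an assumption based on the stock CentOS 7 repositories, where the package is simply named dlm):

yum -y install dlm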
           

Configure the corosync service

Create the /etc/corosync/corosync.conf file. A sample is shown below:

# Please read the corosync.conf.5 manual page
totem {
        version: 2
        cluster_name: cluster0
        # crypto_cipher and crypto_hash: Used for mutual node authentication.
        # If you choose to enable this, then do remember to create a shared
        # secret with "corosync-keygen".
        # enabling crypto_cipher, requires also enabling of crypto_hash.
        crypto_cipher: none
        crypto_hash: none
        clear_node_high_bit: yes

        # interface: define at least one interface to communicate
        # over. If you define more than one interface stanza, you must
        # also set rrp_mode.
        interface {
                # Rings must be consecutively numbered, starting at 0.
                ringnumber: 0
                # This is normally the *network* address of the
                # interface to bind to. This ensures that you can use
                # identical instances of this configuration file
                # across all your cluster nodes, without having to
                # modify this option.
                bindnetaddr: <network address of the cluster interface, e.g. 192.168.1.0>
                # However, if you have multiple physical network
                # interfaces configured for the same subnet, then the
                # network address alone is not sufficient to identify
                # the interface Corosync should bind to. In that case,
                # configure the *host* address of the interface
                # instead:
                # bindnetaddr: <host address of the interface>
                # When selecting a multicast address, consider RFC
                # 2365 (which, among other things, specifies that
                # 239.255.x.x addresses are left to the discretion of
                # the network administrator). Do not reuse multicast
                # addresses across multiple Corosync clusters sharing
                # the same network.
                mcastaddr: 239.255.1.1
                # Corosync uses the port you specify here for UDP
                # messaging, and also the immediately preceding
                # port. Thus if you set this to 5405, Corosync sends
                # messages over UDP ports 5405 and 5404.
                mcastport: 5405
                # Time-to-live for cluster communication packets. The
                # number of hops (routers) that this ring will allow
                # itself to pass. Note that multicast routing must be
                # specifically enabled on most network routers.
                ttl: 1
        }
}

logging {
        # Log the source file and line where messages are being
        # generated. When in doubt, leave off. Potentially useful for
        # debugging.
        fileline: off
        # Log to standard error. When in doubt, set to no. Useful when
        # running in the foreground (when invoking "corosync -f")
        to_stderr: no
        # Log to a log file. When set to "no", the "logfile" option
        # must not be set.
        to_logfile: yes
        logfile: /var/log/cluster/corosync.log
        # Log to the system log daemon. When in doubt, set to yes.
        to_syslog: yes
        # Log debug messages (very verbose). When in doubt, leave off.
        debug: on
        # Log messages with time stamps. When in doubt, set to on
        # (unless you are only logging to syslog, where double
        # timestamps can be annoying).
        timestamp: on
        logger_subsys {
                subsys: QUORUM
                debug: off
        }
}

quorum {
        # Enable and configure quorum subsystem (default: off)
        # see also corosync.conf.5 and votequorum.5
        provider: corosync_votequorum
        expected_votes: 3
        votes: 1
}

           

Key configuration items to note:

cluster_name: the cluster name set here is used later when formatting the logical volume as gfs2.

clear_node_high_bit: yes: required by the dlm service; without it, dlm cannot start.

The quorum {} block: the cluster's quorum voting settings; it must be configured. For a two-node cluster, you can use a configuration like the following (two_node: 1 with expected_votes: 2 is the standard votequorum setting for two nodes):

quorum {
        # Enable and configure quorum subsystem (default: off)
        # see also corosync.conf.5 and votequorum.5
        provider: corosync_votequorum
        two_node: 1
        expected_votes: 2
        #votes: 1
}
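
The same corosync.conf must be present on every node in the cluster (along with the authkey, if you enable crypto). A minimal sketch for distributing it, assuming root SSH access and the a01/a02/a03 hostnames from the hosts file above:

# copy the configuration from a01 to the other cluster nodes
scp /etc/corosync/corosync.conf root@a02:/etc/corosync/
scp /etc/corosync/corosync.conf root@a03:/etc/corosync/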
           

Configure LVM

# Enable LVM's clustered locking mode.
# This command automatically changes the values of the locking_type and
# use_lvmetad options in /etc/lvm/lvm.conf.
[root@a01 ~]# lvmconf --enable-cluster

# The resulting differences compared with the backed-up original:
[root@a01 ~]# diff /etc/lvm/lvm.conf /etc/lvm/lvm.conf.lvmconfold
771c771
<     locking_type = 3
---
>       locking_type = 1
940c940
<     use_lvmetad = 0
---
>       use_lvmetad = 1
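
A quick way to confirm the change took effect on each node (a simple grep, nothing specific to this setup):

# verify the clustered locking settings
grep -E '^[[:space:]]*(locking_type|use_lvmetad)[[:space:]]*=' /etc/lvm/lvm.conf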
           

Start the related services

Mind the order: start corosync on all cluster nodes first, and only after the cluster state is healthy start the dlm and clvmd services on each node.

# Start the corosync service on all cluster nodes
systemctl enable corosync
systemctl start corosync
           

Once corosync is running, use corosync-quorumtool -s to check the cluster state and make sure Quorate is Yes. For example:

[root@a01 ~]# corosync-quorumtool -s
Quorum information
------------------
Date:             Fri Jun 15 16:56:07 2018
Quorum provider:  corosync_votequorum
Nodes:            3
Node ID:          1084777673
Ring ID:          1084777673/380
Quorate:          Yes
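
If Quorate is not Yes, a first step is to confirm that each node can see the totem ring (corosync-cfgtool ships with corosync):

# show the local node ID and the status of each configured ring
corosync-cfgtool -s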
           

Start the dlm and clvmd services on each cluster node:

systemctl enable dlm
systemctl start dlm

systemctl enable clvmd
systemctl start clvmd
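
To make sure the lock manager and clustered LVM daemons came up cleanly on every node, a minimal check (dlm_tool is part of the dlm package):

# check that both services are active on this node
systemctl --no-pager status dlm clvmd
# show the dlm cluster status as seen by this node
dlm_tool status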
           

Create the clustered volume group and format the GFS partition

Run the following commands on one node of the cluster:

# List physical volume information. If the block device from the storage
# server is not listed, make sure every node is connected to the SAN or IP-SAN.
pvscan
pvs

# Create the volume group and logical volume, then format the LV as gfs2.
vgcreate -Ay -cy gfsvg /dev/mapper/mpatha
lvcreate -L <size>G -n gfsvol1 gfsvg
lvs -o +devices gfsvg
# The -j 4 option below sets the number of journals to 4; this is usually the
# number of cluster nodes N plus 1. If the cluster grows later, additional
# journals can be added with the gfs2_jadd command.
# "cluster0" after -t is the cluster name; make sure it matches the
# cluster_name configured in corosync.conf.
mkfs.gfs2 -p lock_dlm -t cluster0:gfsvolfs -j 4 /dev/gfsvg/gfsvol1
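
As mentioned in the comments above, journals can be added later with gfs2_jadd when the cluster grows; a minimal sketch (the filesystem must already be mounted, here at the mount point created in the next section):

# add one more journal to the mounted gfs2 filesystem
gfs2_jadd -j 1 /mnt/iscsigfs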
           

Mount the gfs2 logical volume

Run the following commands on all cluster nodes:

# Create the mount point directory
mkdir /mnt/iscsigfs
# Mount the gfs2 partition.
# If a node reports that the device /dev/gfsvg/gfsvol1 does not exist, it is because
# the node booted before this logical volume was created; rebooting the node fixes it.
mount -t gfs2 /dev/gfsvg/gfsvol1 /mnt/iscsigfs -o noatime,nodiratime
           

To mount the filesystem automatically at boot, see:

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/storage_administration_guide/iscsi-api
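
Following the approach in that guide, the /etc/fstab entry on each node might look like the sketch below (untested here; the _netdev option delays the mount until the network and the iSCSI session are up):

# /etc/fstab
/dev/gfsvg/gfsvol1  /mnt/iscsigfs  gfs2  noatime,nodiratime,_netdev  0 0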

Common errors

  • Running pvs, vgcreate, or other LVM commands reports a connect() error

    An example of the error is shown below; the cause is that clvmd is not running properly:

    connect() failed on local socket: No such file or directory
    Internal cluster locking initialisation failed.
    WARNING: Falling back to local file-based locking.
    Volume Groups with the clustered attribute will be inaccessible.

  • Running pvs, vgcreate, or other LVM commands reports Skipping clustered volume group XXX or Device XXX excluded by a filter

    This usually happens after operations such as vgremove, or when stale volume group metadata is left on the block device. It can be resolved by reformatting the block device, for example:

    mkfs.xfs /dev/mapper/mpatha
