
Corosync+Pacemaker+DRBD+NFS High Availability Example Configuration

Environment:

Operating system: CentOS 6.6 x64. Corosync, Pacemaker, DRBD, and NFS are all installed from RPM packages in this article.

[root@app1 soft]# vi /etc/hosts   

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4    

::1         localhost localhost.localdomain localhost6 localhost6.localdomain6    

192.168.0.24         app1    

192.168.0.25         app2    

10.10.10.24          app1-priv    

10.10.10.25          app2-priv

Note: the 10.10.10.x addresses are the heartbeat IPs, the 192.168.0.x addresses are the service IPs, and the VIP used is 192.168.0.26.

sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config 

setenforce 0 

chkconfig iptables off 

service iptables stop
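A quick way to confirm both changes took effect on each node (verification commands added here for convenience; they are not part of the original steps):

# getenforce                 # should now report Permissive (Disabled after a reboot)

# chkconfig --list iptables  # every runlevel should show "off"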

app1: 

[root@app1 ~]# ssh-keygen  -t rsa -f ~/.ssh/id_rsa  -N ''  

[root@app1 ~]# ssh-copy-id -i .ssh/id_rsa.pub root@app2

app2: 

[root@app2 ~]# ssh-keygen  -t rsa -f ~/.ssh/id_rsa  -N '' 

[root@app2 ~]# ssh-copy-id -i .ssh/id_rsa.pub root@app1
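Before continuing, it is worth confirming that passwordless SSH works in both directions (a simple check; the hostnames are assumed to resolve via the /etc/hosts entries above):

[root@app1 ~]# ssh app2 hostname   # should print app2 with no password prompt

[root@app2 ~]# ssh app1 hostname   # should print app1 with no password prompt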

app1: /dev/sdb1  —> app2: /dev/sdb1
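The layout above assumes /dev/sdb1 already exists on both nodes. If the second disk is still blank, a minimal partitioning sketch (assuming /dev/sdb is an empty disk; adjust to your environment, and do not create a filesystem on the partition, DRBD will use it raw):

# fdisk /dev/sdb      # create one primary partition spanning the disk: n, p, 1, Enter, Enter, w

# partprobe /dev/sdb  # re-read the partition table

# fdisk -l /dev/sdb   # confirm that /dev/sdb1 now exists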

(1) Download the DRBD packages. On CentOS 6.6, the kmod-drbd84-8.4.5-504.1 package is the one that works. 

http://rpm.pbone.net/

drbd84-utils-8.9.1-1.el6.elrepo.x86_64.rpm 

kmod-drbd84-8.4.5-504.1.el6.x86_64.rpm

# rpm -ivh drbd84-utils-8.9.1-1.el6.elrepo.x86_64.rpm kmod-drbd84-8.4.5-504.1.el6.x86_64.rpm 

preparing...                ########################################### [100%] 

   1:drbd84-utils           ########################################### [ 50%] 

   2:kmod-drbd84            ########################################### [100%] 

working. this may take some time ... 

done. 

#

(2) Load the drbd kernel module

Run the following on both app1 and app2, and also add the modprobe command to /etc/rc.local so the module is loaded at boot (a sketch follows the commands below). 

modprobe drbd 

lsmod | grep drbd
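One way to make the module load at boot, as mentioned above (a simple sketch; run on both nodes):

# echo "modprobe drbd" >> /etc/rc.local

# tail -1 /etc/rc.local    # confirm the line was appended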

[root@app1 ~]# vi /etc/drbd.d/global_common.conf 

global { 
        usage-count no; 
} 

common { 
        protocol C; 

        disk { 
                on-io-error detach; 
                no-disk-flushes; 
                no-md-flushes;  
        } 

        net { 
                sndbuf-size 512k; 
                max-buffers     8000; 
                unplug-watermark   1024; 
                max-epoch-size  8000; 
                cram-hmac-alg "sha1"; 
                shared-secret "hdhwxes23syehart8t"; 
                after-sb-0pri disconnect; 
                after-sb-1pri disconnect; 
                after-sb-2pri disconnect; 
                rr-conflict disconnect; 
        } 

        syncer { 
                rate 300M; 
                al-extents 517; 
        } 
}

Define the DRBD resource (for example in /etc/drbd.d/data.res):

resource data { 
      on app1 { 
               device    /dev/drbd0; 
               disk      /dev/sdb1; 
               address   10.10.10.24:7788; 
               meta-disk internal; 
      } 
      on app2 { 
               device     /dev/drbd0; 
               disk       /dev/sdb1; 
               address    10.10.10.25:7788; 
               meta-disk  internal; 
      } 
}
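The DRBD configuration must be identical on both nodes; assuming it was edited on app1 (and the resource saved as /etc/drbd.d/data.res as suggested above), it can be copied over with scp:

[root@app1 ~]# scp /etc/drbd.d/global_common.conf /etc/drbd.d/data.res root@app2:/etc/drbd.d/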

Run on both app1 and app2:

# drbdadm create-md data

initializing activity log 

not initializing bitmap 

writing meta data... 

new drbd meta data block successfully created.

Run on both app1 and app2 (alternatively, use drbdadm up data):

# service drbd start

starting drbd resources: [ 

     create res: data 

   prepare disk: data 

    adjust disk: data 

     adjust net: data 

.......... 

cat /proc/drbd       # or simply use the drbd-overview command

Node 1: 

[root@app1 drbd.d]# cat /proc/drbd  

version: 8.4.5 (api:1/proto:86-101) 

git-hash: 1d360bde0e095d495786eaeb2a1ac76888e4db96 build by [email protected], 2015-01-02 12:06:20

0: cs:connected ro:secondary/secondary ds:inconsistent/inconsistent c r----- 

    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:20964116

Node 2: 

[root@app2 drbd.d]# cat /proc/drbd  

We need to promote one of the nodes to Primary. On the node that is to become Primary, either of the following two commands will do: 

drbdadm -- --overwrite-data-of-peer primary data   

drbdadm primary --force data

Check the synchronization status on the primary node: 

0: cs:syncsource ro:primary/secondary ds:uptodate/inconsistent c r----- 

    ns:1229428 nr:0 dw:0 dr:1230100 al:0 bm:0 lo:0 pe:2 ua:0 ap:0 ep:1 wo:d oos:19735828 

        [>...................] sync'ed:  5.9% (19272/20472)m 

        finish: 0:27:58 speed: 11,744 (11,808) k/sec 

[root@app1 drbd.d]#

The filesystem can only be mounted on the Primary node, and the DRBD device can only be formatted after a Primary has been set. Format it and do a manual mount test:

[root@app1 ~]# mkfs.ext4 /dev/drbd0 

[root@app1 ~]# mount /dev/drbd0 /data
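A slightly fuller version of this manual test, including creating the mount point and unmounting again afterwards so that Pacemaker can take over the mount later (a sketch; /data matches the directory used throughout the article):

[root@app1 ~]# mkdir -p /data            # create the mount point on both nodes

[root@app1 ~]# mount /dev/drbd0 /data

[root@app1 ~]# touch /data/abc.txt       # simple write test

[root@app1 ~]# umount /data              # unmount again; the cluster will manage the mount from here on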

# vi /etc/exports 

/data 192.168.0.0/24(rw,no_root_squash)

# service rpcbind start 

# service nfs start 

# chkconfig rpcbind on 

# chkconfig nfs on
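To confirm the export is active and visible (a quick check with standard NFS tools):

# exportfs -rv             # re-export everything in /etc/exports and list the result

# showmount -e localhost   # should list /data exported to 192.168.0.0/24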

# yum install corosync pacemaker -y

Since RHEL 6.4 the cluster command-line configuration tool crmsh is no longer shipped, so crmsh has to be installed separately in order to manage cluster resources. 

[root@app1 crm]# yum install python-dateutil -y   

Note: pssh and python-pssh depend on the python-dateutil package.

[root@app1 crm]# rpm -ivh pssh-2.3.1-4.2.x86_64.rpm python-pssh-2.3.1-4.2.x86_64.rpm crmsh-2.1-1.6.x86_64.rpm 

warning: pssh-2.3.1-4.2.x86_64.rpm: header v3 rsa/sha1 signature, key id 17280ddf: nokey 

   1:python-pssh            ########################################### [ 33%] 

   2:pssh                   ########################################### [ 67%] 

   3:crmsh                  ########################################### [100%] 

[root@app1 crm]# 

[root@app1 crm]#

cd /etc/corosync/ 

cp corosync.conf.example corosync.conf

vi /etc/corosync/corosync.conf 

# please read the corosync.conf.5 manual page 
compatibility: whitetank 

totem {    
        version: 2 
        secauth: on 
        threads: 0 
        interface { 
                ringnumber: 0 
                bindnetaddr: 10.10.10.0 
                mcastaddr: 226.94.8.8 
                mcastport: 5405 
                ttl: 1 
        } 
} 

logging { 
        fileline: off 
        to_stderr: no 
        to_logfile: yes 
        to_syslog: no 
        logfile: /var/log/cluster/corosync.log 
        debug: off 
        timestamp: on 
        logger_subsys { 
                subsys: AMF 
                debug: off 
        } 
} 

amf { 
        mode: disabled 
} 

service { 
        ver:  1                   
        name: pacemaker        
} 

aisexec { 
        user: root 
        group:  root 
}

Communication between the nodes must be authenticated, so a shared key is required. After generation it is written to /etc/corosync/authkey (the current directory here) with permissions 400.

[root@app1 corosync]# corosync-keygen 

corosync cluster engine authentication key generator. 

gathering 1024 bits for key from /dev/random. 

press keys on your keyboard to generate entropy. 

press keys on your keyboard to generate entropy (bits = 128). 

press keys on your keyboard to generate entropy (bits = 192). 

press keys on your keyboard to generate entropy (bits = 256). 

press keys on your keyboard to generate entropy (bits = 320). 

press keys on your keyboard to generate entropy (bits = 384). 

press keys on your keyboard to generate entropy (bits = 448). 

press keys on your keyboard to generate entropy (bits = 512). 

press keys on your keyboard to generate entropy (bits = 576). 

press keys on your keyboard to generate entropy (bits = 640). 

press keys on your keyboard to generate entropy (bits = 704). 

press keys on your keyboard to generate entropy (bits = 768). 

press keys on your keyboard to generate entropy (bits = 832). 

press keys on your keyboard to generate entropy (bits = 896). 

press keys on your keyboard to generate entropy (bits = 960). 

writing corosync key to /etc/corosync/authkey. 

[root@app1 corosync]#

# scp authkey corosync.conf  root@app2:/etc/corosync/  

Node 1:   

[root@app1 ~]# service corosync start    

starting corosync cluster engine (corosync):               [ok]

[root@app1 ~]# service pacemaker start 

starting pacemaker cluster manager                         [ok]

Enable the services to start at boot: 

chkconfig corosync on 

chkconfig pacemaker on

Node 2:   

[root@app2 ~]# service corosync start    

[root@app2 ~]# service pacemaker start

(1) Check the node status

[root@app1 ~]# crm status 

last updated: tue jan 26 13:13:19 2016 

last change: mon jan 25 17:46:04 2016 via cibadmin on app1 

stack: classic openais (with plugin) 

current dc: app1 - partition with quorum 

version: 1.1.10-14.el6-368c726 

2 nodes configured, 2 expected votes 

0 resources configured

online: [ app1 app2 ]

(2) Check that the ports are listening

# netstat -tunlp 

active internet connections (only servers) 

proto recv-q send-q local address               foreign address             state       pid/program name   

udp        0      0 10.10.10.25:5404            0.0.0.0:*                               2828/corosync       

udp        0      0 10.10.10.25:5405            0.0.0.0:*                               2828/corosync       

udp        0      0 226.94.8.8:5405             0.0.0.0:*                               2828/corosync      

(3) Check the logs

[root@app1 corosync]# tail -f  /var/log/cluster/corosync.log

Key messages to look for in the log: 

jan 23 16:09:30 corosync [main  ] corosync cluster engine ('1.4.7'): started and ready to provide service. 

jan 23 16:09:30 corosync [main  ] successfully read main configuration file '/etc/corosync/corosync.conf'. 

.... 

jan 23 16:09:30 corosync [totem ] initializing transmit/receive security: libtomcrypt sober128/sha1hmac (mode 0). 

jan 23 16:09:31 corosync [totem ] the network interface [10.10.10.24] is now up. 

jan 23 16:09:31 corosync [totem ] a processor joined or left the membership and a new membership was formed. 

jan 23 16:09:48 corosync [totem ] a processor joined or left the membership and a new membership was formed. 

Pacemaker enables STONITH by default, but the cluster being configured here has no STONITH device yet, so it has to be disabled in the cluster-wide properties.

# crm 

crm(live)# configure                                      ## enter configure mode 

crm(live)configure# property stonith-enabled=false        ## disable STONITH 

crm(live)configure# property no-quorum-policy=ignore      ## action to take when the cluster has no quorum 

crm(live)configure# rsc_defaults resource-stickiness=100  ## default resource stickiness, i.e. preference to stay on the current node 

crm(live)configure# verify                                ## validate the configuration 

crm(live)configure# commit                                ## commit only after verify reports no errors 

crm(live)configure# show                                  ## show the current configuration 

node app1 

node app2 

property cib-bootstrap-options: \ 

        dc-version=1.1.11-97629de \ 

        cluster-infrastructure="classic openais (with plugin)" \ 

        expected-quorum-votes=2 \ 

        stonith-enabled=false \ 

        default-resource-stickiness=100 \ 

        no-quorum-policy=ignore

Or:

# crm configure property stonith-enabled=false 

# crm configure property no-quorum-policy=ignore 

# crm configure property default-resource-stickiness=100

# Tip: if verify reports an error, you can either quit without committing, or use edit to correct the configuration until verify passes. 

# crm configure edit    # edit the configuration file directly

Do not commit each resource on its own; define all resources and constraints first, then commit them together. 

crm(live)configure# primitive vip ocf:heartbeat:IPaddr params ip=192.168.0.26 cidr_netmask=24 nic=eth0:1 op monitor interval=30s timeout=20s on-fail=restart 

crm(live)configure# verify    # check that the parameters are correct

Explanation (a resource-agent metadata lookup sketch follows this list):

primitive                   : the command that defines a resource 

vip                         : the resource ID, freely chosen 

ocf:heartbeat:IPaddr        : the resource agent (RA) 

params ip=192.168.0.26      : defines the VIP 

op monitor                  : monitor this resource 

interval=30s                : monitoring interval 

timeout=20s                 : operation timeout 

on-fail=restart             : if the service stops abnormally, restart it; if it cannot be restarted, move it to another node
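The full parameter list and default operation timeouts for any resource agent can be queried from crmsh itself, for example:

# crm ra info ocf:heartbeat:IPaddr

# crm ra info ocf:linbit:drbd

# crm ra info ocf:heartbeat:Filesystem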

crm(live)configure# primitive mydrbd ocf:linbit:drbd params drbd_resource=data op monitor role=Master interval=20 timeout=30 op monitor role=Slave interval=30 timeout=30 op start timeout=240 op stop timeout=100 

crm(live)configure# verify

crm(live)configure# ms ms_mydrbd mydrbd meta master-max=1 master-node-max=1 clone-max=2  clone-node-max=1 notify=true 

crm(live)configure# primitive mystore ocf:heartbeat:Filesystem params device=/dev/drbd0 directory=/data fstype=ext4 op start timeout=60s op stop timeout=60s op monitor interval=30s timeout=40s on-fail=restart 

Create a group resource containing vip and mystore.                    

crm(live)configure# group g_service vip mystore 

Create a colocation constraint: the group must run where the DRBD Master runs.

crm(live)configure# colocation c_g_service inf: g_service ms_mydrbd:Master 

Create another colocation constraint: the mystore mount depends on the DRBD Master node.

crm(live)configure# colocation mystore_with_drbd_master inf: mystore ms_mydrbd:Master 

Create an order constraint: promote DRBD first, then start the g_service group.

crm(live)configure# order o_g_service inf: ms_mydrbd:promote g_service:start 

crm(live)configure# verify 

crm(live)configure# commit

[root@app1 ~]# crm status

last updated: mon jan 25 22:24:55 2016 

last change: mon jan 25 22:24:46 2016 

current dc: app2 - partition with quorum 

version: 1.1.11-97629de 

4 resources configured

master/slave set: ms_mydrbd [mydrbd] 

     masters: [ app1 ] 

     slaves: [ app2 ] 

resource group: g_service 

     vip        (ocf::heartbeat:ipaddr):        started app1 

     mystore    (ocf::heartbeat:filesystem):    started app1 

[root@app1 ~]#

# Note: during failover tests, warning messages sometimes linger and obscure the real state. They can be cleared as follows; clean up whichever resource reported the failure, then run crm status again and the output should look normal. 

failed actions: 

mystore_stop_0 on app1 'unknown error' (1): call=97, status=complete, last-rc-change='tue jan 26 14:39:21 2016', queued=6390ms, exec=0ms

[root@app1 ~]# crm resource cleanup mystore 

cleaning up mystore on app1 

cleaning up mystore on app2 

waiting for 2 replies from the crmd.. ok 

(1) Check the DRBD mount

[root@app2 ~]# df -h 

filesystem            size  used avail use% mounted on 

/dev/mapper/vg_app2-lv_root 

                       35g  3.7g   30g  11% / 

tmpfs                 497m   45m  452m  10% /dev/shm 

/dev/sda1             477m   34m  418m   8% /boot 

192.168.1.26:/data     20g   44m   19g   1% /mnt 

/dev/drbd0             20g   44m   19g   1% /data 

[root@app2 ~]#

(2) Check the DRBD primary/secondary roles

[root@app2 ~]# cat /proc/drbd 

0: cs:connected ro:primary/secondary ds:uptodate/uptodate c r----- 

    ns:20484 nr:336 dw:468 dr:21757 al:4 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0

[root@app1 ~]# cat /proc/drbd 

0: cs:connected ro:secondary/primary ds:uptodate/uptodate c r----- 

    ns:0 nr:20484 dw:20484 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0

(3) The NFS client mounts and reads/writes correctly

[root@vm15 ~]# df -h 

/dev/sda3              21g  4.6g   15g  24% / 

/dev/sda1              99m   23m   72m  25% /boot 

tmpfs                 7.4g     0  7.4g   0% /dev/shm 

/dev/mapper/vg-data    79g   71g  4.2g  95% /data 

192.168.0.26:/data/   5.0g  138m  4.6g   3% /mnt 

[root@vm15 ~]# 

[root@vm15 ~]# cd /mnt 

[root@vm15 mnt]# ls 

abc.txt  lost+found 

[root@vm15 mnt]# cp abc.txt a.txt 

[root@vm15 mnt]# 

a.txt  abc.txt  lost+found 

[root@vm15 mnt]#
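The client-side mount command itself is not shown above; a typical invocation against the VIP (the mount options are only a suggestion, not part of the original) would be:

[root@vm15 ~]# mount -t nfs -o vers=3,hard,intr 192.168.0.26:/data /mnt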

(1) Shut down node app1; all resources start on node 2

[root@app2 ~]# crm status 

last updated: tue jan 26 13:31:54 2016 

last change: tue jan 26 13:30:21 2016 via cibadmin on app1 

online: [ app2 ] 

offline: [ app1 ]

     masters: [ app2 ] 

     stopped: [ app1 ] 

     vip        (ocf::heartbeat:ipaddr):        started app2 

     mystore    (ocf::heartbeat:filesystem):    started app2 

(2) The disk is mounted successfully 

filesystem                   size  used avail use% mounted on 

/dev/mapper/vg_app2-lv_root   36g  3.7g   30g  11% / 

tmpfs                       1004m   44m  960m   5% /dev/shm 

/dev/sda1                    485m   39m  421m   9% /boot 

/dev/drbd0                   5.0g  138m  4.6g   3% /data 

(3) DRBD has also switched to Primary: 

version: 8.4.3 (api:1/proto:86-101) 

git-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-11-29 12:28:00 

0: cs:wfconnection ro:primary/unknown ds:uptodate/dunknown c r----- 

    ns:0 nr:144 dw:148 dr:689 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:4 

After node 1 comes back up it simply rejoins the cluster, and the resources do not need to fail back.

# crm node standby app2         # take app2 offline (standby)

Checking the resources shows they switch straight over to app1, although a full reboot test still gives the cleanest result. 

last updated: tue jan 26 14:30:05 2016 

last change: tue jan 26 14:29:59 2016 via crm_attribute on app2 

node app2: standby 

online: [ app1 ]

     stopped: [ app2 ] 

This setup uses VMware ESXi 5.1 virtual machines, so STONITH uses the VMware ESXi fence agent fence_vmware_soap.

Note: during corosync+pacemaker testing there were cases where a node could not be rebooted or shut down quickly; configuring STONITH is very useful for dealing with servers that cannot restart on their own.

The fence-agents package needs to be installed on both app1 and app2.

# yum install fence-agents

After installation, the agent's path and its STONITH functionality can be checked, for example by listing the virtual machines:

[root@app1 ~]# /usr/sbin/fence_vmware_soap -a 192.168.0.61 -z -l root -p 876543 -o list  

...

...                  

drbd_heartbeat_app1,564d09c3-e8ee-9a01-e5f4-f1b11f03c810

drbd_heartbeat_app2,564dddb8-f4bf-40e6-dbad-9b97b97d3d25

For example, to reboot a virtual machine:

[root@app1 ~]# /usr/sbin/fence_vmware_soap -a 192.168.0.61 -z -l root -p 876543 -n drbd_heartbeat_app2 -o reboot

[root@app1 ~]# crm

crm(live)# configure

crm(live)configure# primitive vm-fence-app1 stonith:fence_vmware_soap params ipaddr=192.168.0.61 login=root passwd=876543 port=app1 ssl="1" pcmk_host_list="drbd_heartbeat_app1" retry_on="10" shell_timeout="120" login_timeout="120" action="reboot" op start interval="0" timeout="120"

crm(live)configure# primitive vm-fence-app2 stonith:fence_vmware_soap params ipaddr=192.168.0.61 login=root passwd=876543 port=app2 ssl="1" pcmk_host_list="drbd_heartbeat_app2" retry_on="10" shell_timeout="120" login_timeout="120" action="reboot" op start interval="0" timeout="120"

crm(live)configure# location l-vm-fence-app1 vm-fence-app1 -inf: app1

crm(live)configure# location l-vm-fence-app2 vm-fence-app2 -inf: app2

crm(live)configure# property stonith-enabled=true
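As with the resources defined earlier, the fencing configuration only takes effect once it has been validated and committed (the same crmsh workflow used before):

crm(live)configure# verify

crm(live)configure# commit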

[root@app1 ~]# crm status

last updated: tue jan 26 16:50:53 2016

last change: tue jan 26 16:50:27 2016 via crmd on app2

stack: classic openais (with plugin)

current dc: app2 - partition with quorum

version: 1.1.10-14.el6-368c726

2 nodes configured, 2 expected votes

6 resources configured

 master/slave set: ms_mydrbd [mydrbd]

     masters: [ app2 ]

     slaves: [ app1 ]

 resource group: g_service

 vm-fence-app1  (stonith:fence_vmware_soap):    started app2 

 vm-fence-app2  (stonith:fence_vmware_soap):    started app1

[root@app1 ~]# crm 

crm(live)configure# show xml

<?xml version="1.0" ?>
<cib num_updates="4" dc-uuid="app2" update-origin="app2" crm_feature_set="3.0.7" validate-with="pacemaker-1.2" update-client="crmd" epoch="91" admin_epoch="0" cib-last-written="Tue Jan 26 16:50:27 2016" have-quorum="1">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.1.10-14.el6-368c726"/>
        <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="classic openais (with plugin)"/>
        <nvpair id="cib-bootstrap-options-expected-quorum-votes" name="expected-quorum-votes" value="2"/>
        <nvpair name="stonith-enabled" value="false" id="cib-bootstrap-options-stonith-enabled"/>
        <nvpair name="no-quorum-policy" value="ignore" id="cib-bootstrap-options-no-quorum-policy"/>
        <nvpair name="default-resource-stickiness" value="100" id="cib-bootstrap-options-default-resource-stickiness"/>
        <nvpair id="cib-bootstrap-options-last-lrm-refresh" name="last-lrm-refresh" value="1453798227"/>
      </cluster_property_set>
    </crm_config>
    <nodes>
      <node id="app2" uname="app2">
        <instance_attributes id="nodes-app2">
          <nvpair id="nodes-app2-standby" name="standby" value="off"/>
        </instance_attributes>
      </node>
      <node id="app1" uname="app1">
        <instance_attributes id="nodes-app1">
          <nvpair id="nodes-app1-standby" name="standby" value="off"/>
        </instance_attributes>
      </node>
    </nodes>
    <resources>
      <primitive id="vm-fence-app1" class="stonith" type="fence_vmware_soap">
        <instance_attributes id="vm-fence-app1-instance_attributes">
          <nvpair name="ipaddr" value="192.168.0.61" id="vm-fence-app1-instance_attributes-ipaddr"/>
          <nvpair name="login" value="root" id="vm-fence-app1-instance_attributes-login"/>
          <nvpair name="passwd" value="xjj876543" id="vm-fence-app1-instance_attributes-passwd"/>
          <nvpair name="port" value="app1" id="vm-fence-app1-instance_attributes-port"/>
          <nvpair name="ssl" value="1" id="vm-fence-app1-instance_attributes-ssl"/>
          <nvpair name="pcmk_host_list" value="drbd_heartbeat_app1" id="vm-fence-app1-instance_attributes-pcmk_host_list"/>
          <nvpair name="retry_on" value="10" id="vm-fence-app1-instance_attributes-retry_on"/>
          <nvpair name="shell_timeout" value="120" id="vm-fence-app1-instance_attributes-shell_timeout"/>
          <nvpair name="login_timeout" value="120" id="vm-fence-app1-instance_attributes-login_timeout"/>
          <nvpair name="action" value="reboot" id="vm-fence-app1-instance_attributes-action"/>
        </instance_attributes>
        <operations>
          <op name="start" interval="0" timeout="120" id="vm-fence-app1-start-0"/>
        </operations>
      </primitive>
      <primitive id="vm-fence-app2" class="stonith" type="fence_vmware_soap">
        <instance_attributes id="vm-fence-app2-instance_attributes">
          <nvpair name="ipaddr" value="192.168.0.61" id="vm-fence-app2-instance_attributes-ipaddr"/>
          <nvpair name="login" value="root" id="vm-fence-app2-instance_attributes-login"/>
          <nvpair name="passwd" value="xjj876543" id="vm-fence-app2-instance_attributes-passwd"/>
          <nvpair name="port" value="app2" id="vm-fence-app2-instance_attributes-port"/>
          <nvpair name="ssl" value="1" id="vm-fence-app2-instance_attributes-ssl"/>
          <nvpair name="pcmk_host_list" value="drbd_heartbeat_app2" id="vm-fence-app2-instance_attributes-pcmk_host_list"/>
          <nvpair name="retry_on" value="10" id="vm-fence-app2-instance_attributes-retry_on"/>
          <nvpair name="shell_timeout" value="120" id="vm-fence-app2-instance_attributes-shell_timeout"/>
          <nvpair name="login_timeout" value="120" id="vm-fence-app2-instance_attributes-login_timeout"/>
          <nvpair name="action" value="reboot" id="vm-fence-app2-instance_attributes-action"/>
        </instance_attributes>
        <operations>
          <op name="start" interval="0" timeout="120" id="vm-fence-app2-start-0"/>
        </operations>
      </primitive>
      <group id="g_service">
        <primitive id="vip" class="ocf" provider="heartbeat" type="IPaddr">
          <instance_attributes id="vip-instance_attributes">
            <nvpair name="ip" value="192.168.0.26" id="vip-instance_attributes-ip"/>
            <nvpair name="cidr_netmask" value="24" id="vip-instance_attributes-cidr_netmask"/>
            <nvpair name="nic" value="eth0:1" id="vip-instance_attributes-nic"/>
          </instance_attributes>
          <operations>
            <op name="monitor" interval="30s" timeout="20s" on-fail="restart" id="vip-monitor-30s"/>
          </operations>
        </primitive>
        <primitive id="mystore" class="ocf" provider="heartbeat" type="Filesystem">
          <instance_attributes id="mystore-instance_attributes">
            <nvpair name="device" value="/dev/drbd0" id="mystore-instance_attributes-device"/>
            <nvpair name="directory" value="/data" id="mystore-instance_attributes-directory"/>
            <nvpair name="fstype" value="ext4" id="mystore-instance_attributes-fstype"/>
          </instance_attributes>
          <operations>
            <op name="start" timeout="60s" interval="0" id="mystore-start-0"/>
            <op name="stop" timeout="60s" interval="0" id="mystore-stop-0"/>
            <op name="monitor" interval="30s" timeout="40s" on-fail="restart" id="mystore-monitor-30s"/>
          </operations>
        </primitive>
      </group>
      <master id="ms_mydrbd">
        <meta_attributes id="ms_mydrbd-meta_attributes">
          <nvpair name="master-max" value="1" id="ms_mydrbd-meta_attributes-master-max"/>
          <nvpair name="master-node-max" value="1" id="ms_mydrbd-meta_attributes-master-node-max"/>
          <nvpair name="clone-max" value="2" id="ms_mydrbd-meta_attributes-clone-max"/>
          <nvpair name="clone-node-max" value="1" id="ms_mydrbd-meta_attributes-clone-node-max"/>
          <nvpair name="notify" value="true" id="ms_mydrbd-meta_attributes-notify"/>
        </meta_attributes>
        <primitive id="mydrbd" class="ocf" provider="linbit" type="drbd">
          <instance_attributes id="mydrbd-instance_attributes">
            <nvpair name="drbd_resource" value="data" id="mydrbd-instance_attributes-drbd_resource"/>
          </instance_attributes>
          <operations>
            <op name="monitor" role="Master" interval="20" timeout="30" id="mydrbd-monitor-20"/>
            <op name="monitor" role="Slave" interval="30" timeout="30" id="mydrbd-monitor-30"/>
            <op name="start" timeout="240" interval="0" id="mydrbd-start-0"/>
            <op name="stop" timeout="100" interval="0" id="mydrbd-stop-0"/>
          </operations>
        </primitive>
      </master>
    </resources>
    <constraints>
      <rsc_colocation id="c_g_service" score="INFINITY" rsc="g_service" with-rsc="ms_mydrbd" with-rsc-role="Master"/>
      <rsc_colocation id="mystore_with_drbd_master" score="INFINITY" rsc="mystore" with-rsc="ms_mydrbd" with-rsc-role="Master"/>
      <rsc_order id="o_g_service" score="INFINITY" first="ms_mydrbd" first-action="promote" then="g_service" then-action="start"/>
      <rsc_location id="l-vm-fence-app1" rsc="vm-fence-app1" score="-INFINITY" node="app1"/>
      <rsc_location id="l-vm-fence-app2" rsc="vm-fence-app2" score="-INFINITY" node="app2"/>
    </constraints>
  </configuration>
</cib>

How to clear all resources and reconfigure from scratch:

[root@app2 ~]# crm status         

last updated: wed jan 27 10:39:24 2016

[root@app2 ~]# 

First stop the resources one by one:

[root@app2 ~]# crm resource stop vm-fence-app2

[root@app2 ~]# crm resource stop vm-fence-app1

[root@app2 ~]# crm resource stop mystore

[root@app2 ~]# crm resource stop vip

[root@app2 ~]# crm resource stop ms_mydrbd

[root@app2 ~]# crm status

last updated: wed jan 27 10:40:28 2016

last change: wed jan 27 10:40:23 2016 via cibadmin on app2

Then erase the configuration:

[root@app2 ~]# crm configure erase

info: resource references in colocation:c_g_service updated

info: resource references in colocation:mystore_with_drbd_master updated

info: resource references in order:o_g_service updated

last updated: wed jan 27 10:40:58 2016

last change: wed jan 27 10:40:52 2016 via crmd on app2

After that, the cluster can be configured again from scratch.

The earlier unsuccessful attempts at this setup came down to the colocation and ordering of the resources, which made both failover and startup fail; getting these constraints right is the key point to understand in a corosync+pacemaker configuration. DRBD can be combined with many other services in the same way; this article is intended only as a technical reference.
