
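Heartbeat + DRBD Implementation

Continuing from the previous steps: now that DRBD is deployed, we combine it with Heartbeat to make the DRBD service highly available, so that the DRBD device is mounted automatically on the primary node and fails over automatically when that node goes down.

Building on the earlier deployment, only the Heartbeat resource definition needs to change, that is, the contents of /etc/ha.d/haresources.

1. Preparation

Note: before configuring high availability for DRBD, make sure the DRBD service is running on both nodes and that both sides are in the Secondary state, as shown below: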

[root@heartbeat01 ~]# cat /proc/drbd

version: 8.3.16 (api:88/proto:86-97)

GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37

 0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----

    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

For this reason, DRBD must be set to start automatically at boot on both DRBD nodes.

/etc/init.d/drbd start

chkconfig drbd on
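
The precondition described above (connection state Connected, roles Secondary/Secondary) can be checked with a small script before Heartbeat is touched. The following is only a sketch: `drbd_both_secondary` is a hypothetical helper name, not a DRBD or Heartbeat command, and it assumes the /proc/drbd layout of DRBD 8.3 shown earlier.

```shell
# Sketch: succeed only if every DRBD resource line read from stdin shows
# cs:Connected and ro:Secondary/Secondary.
# drbd_both_secondary is a hypothetical helper, not part of DRBD.
drbd_both_secondary() {
    awk '
        /cs:/ {
            seen = 1
            if ($0 !~ /cs:Connected/ || $0 !~ /ro:Secondary\/Secondary/) bad = 1
        }
        END { exit ((seen && !bad) ? 0 : 1) }
    '
}

# Intended use on a node:
#   drbd_both_secondary < /proc/drbd && echo "safe to start heartbeat"
```

Reading from stdin rather than opening /proc/drbd directly keeps the helper usable against saved output from either node.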

Once the above is done, edit the haresources file so that it contains the following:

[root@heartbeat01 ~]# tail -1 /etc/ha.d/haresources 

heartbeat01.contoso.com  IPaddr::172.16.49.100/24/eth1 drbddisk::test Filesystem::/dev/drbd0::/data::ext4

# heartbeat01 is shown here as the example; the configuration on heartbeat02 is identical to that on heartbeat01.
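
To make the haresources entry easier to read: the first field is the preferred node, and each following field is a resource script with its arguments separated by `::`. Heartbeat starts the resources from left to right and releases them in the reverse order, so here the VIP comes up first, then DRBD is promoted for the resource `test` (presumably the resource name from the earlier DRBD deployment), and finally the filesystem is mounted. The sketch below just splits such a line for inspection; `explain_haresources` is a hypothetical helper, not a Heartbeat tool.

```shell
# Sketch: print the preferred node and the start order of one haresources line.
# explain_haresources is a hypothetical helper, not shipped with Heartbeat.
explain_haresources() {
    # $1: one haresources line, "<preferred-node> <resource> [<resource> ...]"
    printf '%s\n' "$1" | awk '{
        print "node: " $1
        for (i = 2; i <= NF; i++) {
            cmd = $i
            gsub(/::/, " ", cmd)      # script::arg1::arg2 -> script arg1 arg2
            print "start " (i - 1) ": " cmd
        }
    }'
}
```

Called with the line above, it prints `node: heartbeat01.contoso.com` followed by `start 1: IPaddr 172.16.49.100/24/eth1`, `start 2: drbddisk test` and `start 3: Filesystem /dev/drbd0 /data ext4`.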

2. Start Heartbeat

Then start the Heartbeat service on both nodes at the same time:

/etc/init.d/heartbeat start

3. Check the services on both nodes

1) Status on node 1 (heartbeat01):

[root@heartbeat01 ~]# ip a |grep 49.100

    inet 172.16.49.100/24 brd 172.16.49.255 scope global secondary eth1

As shown, node 1 (heartbeat01) has acquired the VIP.

[root@heartbeat01 ~]# cat /proc/drbd

 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----

    ns:4 nr:0 dw:4 dr:709 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

In addition, heartbeat01 is the Primary node in DRBD.

[root@heartbeat01 ~]# mount 

/dev/mapper/VolGroup-lv_root on / type ext4 (rw)

proc on /proc type proc (rw)

sysfs on /sys type sysfs (rw)

devpts on /dev/pts type devpts (rw,gid=5,mode=620)

tmpfs on /dev/shm type tmpfs (rw)

/dev/sda1 on /boot type ext4 (rw)

none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)

/dev/drbd0 on /data type ext4 (rw)

heartbeat01 has automatically mounted /dev/drbd0 on /data.
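
Instead of scanning the mount table by eye, the same check can be scripted. This is a minimal sketch; `is_mounted_at` is a hypothetical helper name, and it assumes the `<device> on <mountpoint> type ...` layout of the `mount` output shown above.

```shell
# Sketch: succeed if the given device is mounted on the given mount point.
# Reads `mount` output on stdin. is_mounted_at is a hypothetical helper.
is_mounted_at() {
    # $1: device (e.g. /dev/drbd0), $2: mount point (e.g. /data)
    awk -v dev="$1" -v mp="$2" '
        $1 == dev && $2 == "on" && $3 == mp { found = 1 }
        END { exit !found }
    '
}

# Intended use on a node:
#   mount | is_mounted_at /dev/drbd0 /data && echo "DRBD device is mounted"
```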

[root@heartbeat01 ~]# ls /data

10.txt  1.txt   29.txt  38.txt  47.txt  56.txt  65.txt  74.txt  83.txt  92.txt

11.txt  20.txt  2.txt   39.txt  48.txt  57.txt  66.txt  75.txt  84.txt  93.txt

12.txt  21.txt  30.txt  3.txt   49.txt  58.txt  67.txt  76.txt  85.txt  94.txt

13.txt  22.txt  31.txt  40.txt  4.txt   59.txt  68.txt  77.txt  86.txt  95.txt

14.txt  23.txt  32.txt  41.txt  50.txt  5.txt   69.txt  78.txt  87.txt  96.txt

15.txt  24.txt  33.txt  42.txt  51.txt  60.txt  6.txt   79.txt  88.txt  97.txt

16.txt  25.txt  34.txt  43.txt  52.txt  61.txt  70.txt  7.txt   89.txt  98.txt

17.txt  26.txt  35.txt  44.txt  53.txt  62.txt  71.txt  80.txt  8.txt   99.txt

18.txt  27.txt  36.txt  45.txt  54.txt  63.txt  72.txt  81.txt  90.txt  9.txt

19.txt  28.txt  37.txt  46.txt  55.txt  64.txt  73.txt  82.txt  91.txt  lost+found

The files previously synchronized by DRBD are all still there as well.

2) Status on node 2 (heartbeat02):

[root@heartbeat02 ~]# ip a |grep 49.100

Node 2 does not hold the VIP.
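
The `ip a | grep 49.100` check used on both nodes matches any line containing something that looks like `49.100`. A slightly stricter version compares the full address; the following is a sketch only, with `has_vip` being a hypothetical helper name.

```shell
# Sketch: succeed if the exact VIP appears as an IPv4 address in `ip a` output.
# Reads `ip a` output on stdin. has_vip is a hypothetical helper.
has_vip() {
    # $1: VIP without prefix length, e.g. 172.16.49.100
    awk -v vip="$1" '
        $1 == "inet" {
            split($2, a, "/")          # "172.16.49.100/24" -> "172.16.49.100"
            if (a[1] == vip) found = 1
        }
        END { exit !found }
    '
}

# Intended use on a node:
#   ip a | has_vip 172.16.49.100 && echo "this node holds the VIP"
```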

[root@heartbeat02 ~]# cat /proc/drbd

 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----

    ns:0 nr:4 dw:4 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

Node 2 (heartbeat02) is in the Secondary state in DRBD.
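
In /proc/drbd the `ro:` field lists the local role first and the peer role second, which is why heartbeat02 reports Secondary/Primary while heartbeat01 reports Primary/Secondary. A small sketch for pulling out just the local role (the name `drbd_local_role` is hypothetical, and only the first resource line is considered):

```shell
# Sketch: print the local DRBD role (the part of ro: before the slash)
# for the first resource found in /proc/drbd-style text on stdin.
# drbd_local_role is a hypothetical helper, not part of DRBD.
drbd_local_role() {
    sed -n 's/.* ro:\([A-Za-z]*\)\/[A-Za-z]*.*/\1/p' | head -n 1
}

# Intended use on a node:
#   drbd_local_role < /proc/drbd
```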

[root@heartbeat02 ~]# mount -n

Likewise, heartbeat02 has not mounted /dev/drbd0.

[root@heartbeat02 ~]# ll /data

total 0

Naturally, there is nothing under /data.

4. Simulating a failover

Next, stop the Heartbeat service on heartbeat01 and check whether the DRBD device is automatically mounted on heartbeat02.

[root@heartbeat01 ~]# /etc/init.d/heartbeat stop

Stopping High-Availability services: Done.

[root@heartbeat01 ~]# ip a|grep 49.100

    ns:16 nr:4 dw:20 dr:1418 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

[root@heartbeat01 ~]# ll /data

2) Status on node 2 (heartbeat02):

    ns:4 nr:16 dw:20 dr:705 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

[root@heartbeat02 ~]# ls /data

3) Check the log on heartbeat02

Sep 26 00:32:04 heartbeat02.contoso.com heartbeat: [4084]: info: Received shutdown notice from 'heartbeat01.contoso.com'.

Sep 26 00:32:04 heartbeat02.contoso.com heartbeat: [4084]: info: Resources being acquired from heartbeat01.contoso.com.

Sep 26 00:32:04 heartbeat02.contoso.com heartbeat: [4150]: info: acquire local HA resources (standby).

Sep 26 00:32:04 heartbeat02.contoso.com heartbeat: [4150]: info: local HA resource acquisition completed (standby).

Sep 26 00:32:04 heartbeat02.contoso.com heartbeat: [4084]: info: Standby resource acquisition done [all].

Sep 26 00:32:04 heartbeat02.contoso.com heartbeat: [4151]: info: No local resources [/usr/share/heartbeat/ResourceManager listkeys heartbeat02.contoso.com] to acquire.

harc(default)[4176]: 2016/09/26_00:32:04 info: Running /etc/ha.d//rc.d/status status

mach_down(default)[4193]: 2016/09/26_00:32:04 info: Taking over resource group IPaddr::172.16.49.100/24/eth1

ResourceManager(default)[4220]: 2016/09/26_00:32:04 info: Acquiring resource group: heartbeat01.contoso.com IPaddr::172.16.49.100/24/eth1 drbddisk::test Filesystem::/dev/drbd0::/data::ext4

/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_172.16.49.100)[4248]: 2016/09/26_00:32:04 INFO:  Resource is stopped

ResourceManager(default)[4220]: 2016/09/26_00:32:04 info: Running /etc/ha.d/resource.d/IPaddr 172.16.49.100/24/eth1 start

IPaddr(IPaddr_172.16.49.100)[4373]: 2016/09/26_00:32:04 INFO: Adding inet address 172.16.49.100/24 with broadcast address 172.16.49.255 to device eth1

IPaddr(IPaddr_172.16.49.100)[4373]: 2016/09/26_00:32:04 INFO: Bringing device eth1 up

IPaddr(IPaddr_172.16.49.100)[4373]: 2016/09/26_00:32:04 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-172.16.49.100 eth1 172.16.49.100 auto not_used not_used

/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_172.16.49.100)[4347]: 2016/09/26_00:32:04 INFO:  Success

ResourceManager(default)[4220]: 2016/09/26_00:32:04 info: Running /etc/ha.d/resource.d/drbddisk test start

/usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd0)[4505]: 2016/09/26_00:32:04 INFO:  Resource is stopped

ResourceManager(default)[4220]: 2016/09/26_00:32:04 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /data ext4 start

Filesystem(Filesystem_/dev/drbd0)[4595]: 2016/09/26_00:32:04 INFO: Running start for /dev/drbd0 on /data

/usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd0)[4587]: 2016/09/26_00:32:04 INFO:  Success

mach_down(default)[4193]: 2016/09/26_00:32:04 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired

mach_down(default)[4193]: 2016/09/26_00:32:04 info: mach_down takeover complete for node heartbeat01.contoso.com.

Sep 26 00:32:04 heartbeat02.contoso.com heartbeat: [4084]: info: mach_down takeover complete.

Sep 26 00:32:36 heartbeat02.contoso.com heartbeat: [4084]: WARN: node heartbeat01.contoso.com: is dead

Sep 26 00:32:36 heartbeat02.contoso.com heartbeat: [4084]: info: Dead node heartbeat01.contoso.com gave up resources.

Sep 26 00:32:36 heartbeat02.contoso.com heartbeat: [4084]: info: Link heartbeat01.contoso.com:eth1 dead.

Sep 26 00:32:36 heartbeat02.contoso.com ipfail: [4110]: info: Status update: Node heartbeat01.contoso.com now has status dead

Sep 26 00:32:38 heartbeat02.contoso.com ipfail: [4110]: info: NS: We are dead. :<

Sep 26 00:32:38 heartbeat02.contoso.com ipfail: [4110]: info: Link Status update: Link heartbeat01.contoso.com/eth1 now has status dead

Sep 26 00:32:39 heartbeat02.contoso.com ipfail: [4110]: info: We are dead. :<

Sep 26 00:32:39 heartbeat02.contoso.com ipfail: [4110]: info: Asking other side for ping node count.
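
The takeover order is easier to see once the log is filtered down to the resource scripts that ResourceManager ran. The sketch below does that for log text on stdin; `takeover_steps` is a hypothetical helper name, and the log file location is an assumption because it depends on the logfile setting in ha.cf.

```shell
# Sketch: list, in order, the /etc/ha.d/resource.d scripts that Heartbeat ran.
# Reads Heartbeat log text on stdin. takeover_steps is a hypothetical helper.
takeover_steps() {
    sed -n 's/.*info: Running \(\/etc\/ha\.d\/resource\.d\/.*\)$/\1/p'
}

# Intended use (the log path is an assumption, adjust to your ha.cf):
#   takeover_steps < /var/log/ha-log
```

Against the log above this leaves the IPaddr, drbddisk and Filesystem start calls, in the same left-to-right order as the haresources line.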

This article is reposted from jerry1111111's 51CTO blog. Original link: http://blog.51cto.com/jerry12356/1856566. To repost it, please contact the original author.
