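I. HA (high availability) concepts
FailOver: failover. HA resources include IP addresses, services, and STONITH devices.
FailBack: failback, moving resources back to the node they originally ran on.
Failover domain: the set of nodes a resource is allowed to fail over to.
Resource stickiness: which node a resource prefers to run on.
Messaging Layer: the cluster messaging layer. It only passes messages between nodes; it does not compute or compare anything with them afterwards.
CRM (cluster resource manager): collects the state of every resource in the cluster and, from that state and the needs of the service itself, calculates which node each resource should run on.
DC (Designated Coordinator): the transaction coordinator.
PE (Policy Engine): a sub-function of the CRM.
TE (Transition Engine): the engine that directs execution of what the PE decided.
LRM (local resource manager): carries out the work on each node.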
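Resource constraints (Constraint)
Colocation constraint (colocation):
whether resources may run on the same node
score:
positive: the resources can run together
negative: the resources cannot run together
Location constraint (location), score:
positive: the resource prefers this node
negative: the resource prefers to stay away from this node
Order constraint (order):
defines the order in which resources are started and stopped
vip, ipvs
ipvs-->vip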
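To make the score semantics concrete, here is a small shell sketch. It only illustrates "the highest score wins"; it is not how the CRM actually evaluates constraints, and the function name best_node and the sample scores are made up for this example.

```shell
#!/bin/sh
# best_node: read "<node> <score>" lines on stdin and print the node with
# the highest location score. A negative score means "avoid this node".
best_node() {
    sort -k2,2nr | awk 'NR == 1 { print $1 }'
}

# Hypothetical scores for the two nodes used in this article.
printf '%s\n' 'snn.abc.com 100' 'datanode4.abc.com -50' | best_node
```

With these sample scores the resource would be placed on snn.abc.com.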
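Resource isolation (fencing)
Node level: STONITH
Resource level:
for example, an FC SAN switch can deny a node access at the storage level
STONITH
split-brain: occurs when cluster nodes cannot reliably obtain the state of the other nodes
One consequence: the nodes compete for the shared storage
Quorum disk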
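The reason a third vote (a quorum disk or a ping node) helps against split-brain can be shown with a little arithmetic. This is a sketch of the majority rule only, under the assumption that every member has one vote; has_quorum is a name invented here, not a heartbeat command.

```shell
#!/bin/sh
# has_quorum TOTAL_VOTES VISIBLE_VOTES
# Succeeds (exit 0) only when the visible partition holds a strict majority.
has_quorum() {
    total=$1
    visible=$2
    [ $((visible * 2)) -gt "$total" ]
}

# Two-node cluster after a link failure: each side sees only itself.
if has_quorum 2 1; then echo "quorum"; else echo "no quorum"; fi
# Same cluster with a third vote from an arbiter one side can still reach.
if has_quorum 3 2; then echo "quorum"; else echo "no quorum"; fi
```

With two votes neither half can claim a majority, so both might grab the shared storage; with three votes exactly one side can.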
II. Example environment
snn: 192.168.1.5
datanode4: 192.168.1.6
vip: 192.168.1.7

Server name        System              CPU architecture  Kernel                 IP address   Role
snn.abc.com        CentOS release 6.5  x86_64            2.6.32-431.el6.x86_64  192.168.1.5  master
datanode4.abc.com                                                               192.168.1.6  slave
The EPEL repository has the packages we need:
heartbeat - Heartbeat subsystem for High-Availability Linux (the core package)
heartbeat-devel - Heartbeat development package
heartbeat-gui - Provides a gui interface to manage heartbeat clusters
heartbeat-ldirectord - Monitor daemon for maintaining high availability resources (generates the ipvs rules automatically and health-checks the back-end real servers)
heartbeat-pils - Provides a general plugin and interface loading library
heartbeat-stonith - Provides an interface to Shoot The Other Node In The Head
III. Preliminary configuration
1. Hostname resolution
[root@snn ~]# cat /etc/hosts
192.168.1.5 snn.abc.com snn
192.168.1.6 datanode4.abc.com datanode4
[root@snn ~]# hostname
[root@snn ~]# cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=snn.abc.com
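Since heartbeat identifies nodes by name, it can be worth confirming that each name resolves to the expected address through the hosts file on every node. A minimal sketch follows; host_ip is a helper invented for this example, and the demo works on a temporary copy of the entries rather than the real /etc/hosts.

```shell
#!/bin/sh
# host_ip HOSTS_FILE NAME: print the address a hosts-format file maps NAME to.
host_ip() {
    awk -v name="$2" '
        /^[ \t]*#/ { next }    # skip comment lines
        { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }
    ' "$1"
}

# Demo on a temporary copy of the entries shown above.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
192.168.1.5 snn.abc.com snn
192.168.1.6 datanode4.abc.com datanode4
EOF
for n in snn.abc.com datanode4.abc.com; do
    echo "$n -> $(host_ip "$tmp" "$n")"
done
rm -f "$tmp"
```

On a real node the call would be something like host_ip /etc/hosts "$(uname -n)", run on both machines.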
2. Mutual SSH trust between the two nodes
snn
[root@snn ~]# ssh-keygen -t rsa -f ~/.ssh/id_rsa -P ''
[root@snn ~]# ssh-copy-id -i .ssh/id_rsa.pub root@192.168.1.6
Run a quick test:
[root@snn ~]# ssh 192.168.1.6 'ifconfig'
datanode4
[root@datanode4 ~]# ssh-keygen -t rsa -f ~/.ssh/id_rsa -P ''
[root@datanode4 ~]# ssh-copy-id -i .ssh/id_rsa.pub root@192.168.1.5
[root@datanode4 ~]# ssh 192.168.1.5 'ifconfig'
3. Time synchronization
[root@snn ~]# crontab -e
*/2 * * * * /usr/sbin/ntpdate time.nist.gov &> /dev/null
[root@snn ~]# scp /var/spool/cron/root datanode4:/var/spool/cron/
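After the cron job has run on both nodes, the clocks can be compared directly. The sketch below only does the arithmetic; skew_seconds is a name made up here, and the ssh call in the comment assumes the passwordless login set up in step 2.

```shell
#!/bin/sh
# skew_seconds A B: absolute difference between two epoch timestamps.
skew_seconds() {
    d=$(( $1 - $2 ))
    if [ "$d" -lt 0 ]; then
        d=$(( 0 - d ))
    fi
    echo "$d"
}

# On the cluster the two values would come from something like:
#   local_ts=$(date +%s); remote_ts=$(ssh datanode4 date +%s)
# Fixed sample values are used here so the example runs anywhere.
skew_seconds 1434187735 1434187742
```

For the sample values the result is 7 seconds.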
IV. Install heartbeat
1. Install the dependency packages
[root@snn heartbeat]# yum install perl-TimeDate PyXML libnet net-snmp-libs -y
2. Only these four packages need to be installed
(1) Try the installation
[root@snn heartbeat]# rpm -ivh heartbeat-2.1.4-12.el6.x86_64.rpm heartbeat-gui-2.1.4-12.el6.x86_64.rpm heartbeat-pils-2.1.4-12.el6.x86_64.rpm heartbeat-stonith-2.1.4-12.el6.x86_64.rpm
error: Failed dependencies:
libnet.so.1()(64bit) is needed by heartbeat-2.1.4-12.el6.x86_64
pygtk2-libglade is needed by heartbeat-gui-2.1.4-12.el6.x86_64
(2) Resolve the dependencies
Download and install EPEL:
[root@snn heartbeat]# rpm -ivh epel-release-latest-6.noarch.rpm
(3) Install the libnet dependency
[root@snn heartbeat]# yum install libnet
(4) Install again
[root@snn heartbeat]# rpm -ivh heartbeat-2.1.4-12.el6.x86_64.rpm heartbeat-gui-2.1.4-12.el6.x86_64.rpm heartbeat-pils-2.1.4-12.el6.x86_64.rpm heartbeat-stonith-2.1.4-12.el6.x86_64.rpm
Preparing... ########################################### [100%]
1:heartbeat-pils ########################################### [ 25%]
2:heartbeat-stonith ########################################### [ 50%]
3:heartbeat ########################################### [ 75%]
4:heartbeat-gui ########################################### [100%]
3. Copy the packages to the .6 node (datanode4) with scp
[root@snn heartbeat]# scp epel-release-latest-6.noarch.rpm heartbeat-2.1.4-12.el6.x86_64.rpm heartbeat-gui-2.1.4-12.el6.x86_64.rpm heartbeat-pils-2.1.4-12.el6.x86_64.rpm heartbeat-stonith-2.1.4-12.el6.x86_64.rpm datanode4:/root/heartbeat/
V. Configuration
1. The three configuration files do not exist by default
[root@snn ha.d]# ls /etc/ha.d/
harc rc.d README.config resource.d shellfuncs
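(1) authkeys: the key file, which must have mode 600
(2) ha.cf: the configuration of the heartbeat service itself
(3) haresources: the resource management configuration file
2. Copy the sample files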
[root@snn ha.d]# cp /usr/share/doc/heartbeat-2.1.4/{authkeys,haresources,ha.cf} ./
3. Set the permissions of authkeys to 600
[root@snn ha.d]# chmod 600 authkeys
4. Generate a random string
[root@snn ha.d]# dd if=/dev/random count=1 bs=512 | md5sum
0+1 records in
0+1 records out
29 bytes (29 B) copied, 8.0656e-05 s, 360 kB/s
71cc2b8ff1bd825fce13ceaea932501d -
[root@snn ha.d]# vim authkeys
auth 1
1 md5 71cc2b8ff1bd825fce13ceaea932501d
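Steps 3 and 4 can be combined into one small script. This is a sketch under a few assumptions: write_authkeys is a name invented here, it reads /dev/urandom (which does not block, whereas the dd run above got only 29 bytes out of /dev/random), and it writes the key with the sha1 method instead of the md5 used above; both are among the methods listed in heartbeat's sample authkeys file.

```shell
#!/bin/sh
# write_authkeys FILE: create a heartbeat authkeys file with a random key.
write_authkeys() {
    # md5sum is used here only to turn random bytes into a printable string.
    key=$(dd if=/dev/urandom bs=512 count=1 2>/dev/null | md5sum | awk '{ print $1 }')
    # Create the file with a restrictive umask so it is never world-readable.
    ( umask 077; printf 'auth 1\n1 sha1 %s\n' "$key" > "$1" )
    chmod 600 "$1"
}

# Demo in a scratch directory; on a real node the target is /etc/ha.d/authkeys.
dir=$(mktemp -d)
write_authkeys "$dir/authkeys"
cat "$dir/authkeys"
rm -rf "$dir"
```

The key differs on every run, so the file still has to be copied to the other node rather than generated there.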
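5. The main configuration file, ha.cf
ha.cf
debugfile: where debug messages are written
logfile: the log file
logfacility: the syslog facility to log to
keepalive: interval between heartbeat messages
deadtime: how long a node may stay silent before it is declared dead and its resources are taken over
warntime: how long before a late-heartbeat warning is logged
initdead: how long to wait for the other nodes when heartbeat first starts
udpport: the UDP port used for heartbeats
bcast: broadcast
mcast: multicast, for example 225.0.0.1
ucast: unicast
auto_failback: whether resources move back automatically
stonith: STONITH device
ping: a ping node, used as an arbitration device
node: the cluster nodes; use node names (as reported by uname -n), IP addresses are not allowed
ping_group: a group of ping nodes
debug: debug level
compression: compression algorithm for transmitted data
compression_threshold: size threshold above which data is compressed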
[root@snn ha.d]# vim ha.cf
bcast eth0 # Linux
node snn.abc.com
node datanode4.abc.com
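Because the node lines must match the names the machines report for themselves, a quick check of ha.cf against uname -n can catch a mismatch before heartbeat is started. The helpers hacf_nodes and node_declared below are hypothetical names for this sketch, and the demo uses a temporary file in place of /etc/ha.d/ha.cf.

```shell
#!/bin/sh
# hacf_nodes FILE: print every node name declared in a ha.cf-style file.
hacf_nodes() {
    awk '$1 == "node" {
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^#/) break    # ignore a trailing comment
            print $i
        }
    }' "$1"
}

# node_declared FILE NAME: succeed if NAME is one of the declared nodes.
node_declared() {
    hacf_nodes "$1" | grep -Fxq "$2"
}

# Demo on a temporary file holding the three lines edited above.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
bcast eth0 # Linux
node snn.abc.com
node datanode4.abc.com
EOF
hacf_nodes "$tmp"
if node_declared "$tmp" "$(uname -n)"; then
    echo "local node is declared"
else
    echo "local node $(uname -n) is NOT declared"
fi
rm -f "$tmp"
```

Run against the real file on snn and datanode4, both machines should report that the local node is declared.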
6. Install the httpd service on both hosts
[root@snn ha.d]# yum install httpd
[root@snn ha.d]# echo "<h1>snn.abc.com</h1>" >> /var/www/html/index.html
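After verifying that the page is served, stop the service and disable it at boot, because heartbeat is the one that will start and stop httpd: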
[root@snn ha.d]# service httpd stop
[root@datanode4 ha.d]# chkconfig httpd off
[root@snn ha.d]# chkconfig httpd off
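7. Define the haresources file
Each line names the primary node first, then its resources, for example:
node1.magedu.com VIP httpd
The resource.d directory holds the RAs (resource agents).
heartbeat looks for an agent in resource.d first, then in /etc/rc.d/init.d/.
VIP format:
ip/netmask/interface/broadcast address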
[root@snn ha.d]# vim haresources
snn.abc.com IPaddr::192.168.1.7/24/eth0 httpd
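The line format is easier to see when it is taken apart. The following sketch splits a haresources line into the node, the agents, and the "::"-separated arguments; explain_haresources is an invented helper for illustration, not part of heartbeat.

```shell
#!/bin/sh
# explain_haresources LINE: print the primary node, then one line per
# resource with its agent name and its "::"-separated arguments.
explain_haresources() {
    echo "$1" | awk '{
        print "node: " $1
        for (i = 2; i <= NF; i++) {
            n = split($i, part, "::")
            args = ""
            for (j = 2; j <= n; j++) args = args " " part[j]
            print "agent: " part[1] (n > 1 ? " args:" args : "")
        }
    }'
}

explain_haresources 'snn.abc.com IPaddr::192.168.1.7/24/eth0 httpd'
```

For the line above this prints the node snn.abc.com, the IPaddr agent with the argument 192.168.1.7/24/eth0, and the httpd agent with no arguments.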
8. Every node needs these files; scp -p preserves the original attributes (including the 600 mode of authkeys)
[root@snn ha.d]# scp -p authkeys ha.cf haresources datanode4:/etc/ha.d/
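If the files are edited later, it is easy to forget to copy one of them again. One way to check, sketched here with an invented helper named same_files, is to compare the local files with a copy fetched from the other node; the demo uses two scratch directories to stand in for the two nodes' /etc/ha.d.

```shell
#!/bin/sh
# same_files DIR_A DIR_B NAME...: succeed only if every NAME has identical
# content in both directories; report the first file that differs.
same_files() {
    a=$1; b=$2; shift 2
    for f in "$@"; do
        cmp -s "$a/$f" "$b/$f" || { echo "differs: $f"; return 1; }
    done
    return 0
}

# Demo with two scratch directories standing in for the two nodes.
local_dir=$(mktemp -d); remote_dir=$(mktemp -d)
for f in authkeys ha.cf haresources; do
    echo "sample $f" > "$local_dir/$f"
    cp -p "$local_dir/$f" "$remote_dir/$f"
done
same_files "$local_dir" "$remote_dir" authkeys ha.cf haresources && echo "in sync"
rm -rf "$local_dir" "$remote_dir"
```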
VI. Start the service
[root@snn ha.d]# service heartbeat start
[root@snn ha.d]# ssh datanode4 'service heartbeat start'
[root@snn ha.d]# tail -f /var/log/messages
Jun 13 17:28:55 snn heartbeat: [3061]: info: Link 192.168.1.1:192.168.1.1 up.
Jun 13 17:28:55 snn heartbeat: [3061]: info: Status update for node 192.168.1.1: status ping
Jun 13 17:28:55 snn heartbeat: [3061]: info: Link snn.abc.com:eth0 up. // both nodes come up
Jun 13 17:29:02 snn heartbeat: [3061]: info: Link datanode4.abc.com:eth0 up.
Jun 13 17:29:02 snn heartbeat: [3061]: info: Status update for node datanode4.abc.com: status up // status check
Jun 13 17:29:02 snn harc[3069]: info: Running /etc/ha.d/rc.d/status status
Jun 13 17:29:03 snn heartbeat: [3061]: info: Comm_now_up(): updating status to active
Jun 13 17:29:03 snn heartbeat: [3061]: info: Local status now set to: 'active'
Jun 13 17:29:03 snn heartbeat: [3061]: info: Status update for node datanode4.abc.com: status active
Jun 13 17:29:03 snn harc[3088]: info: Running /etc/ha.d/rc.d/status status
Jun 13 17:29:13 snn heartbeat: [3061]: info: remote resource transition completed.
Jun 13 17:29:13 snn heartbeat: [3061]: info: Initial resource acquisition complete (T_RESOURCES(us))
Jun 13 17:29:14 snn IPaddr[3141]: INFO: Resource is stopped
Jun 13 17:29:14 snn heartbeat: [3105]: info: Local Resource acquisition completed.
Jun 13 17:29:14 snn harc[3192]: info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
Jun 13 17:29:14 snn ip-request-resp[3192]: received ip-request-resp IPaddr::192.168.1.7/24/eth0 OK yes
Jun 13 17:29:14 snn ResourceManager[3213]: info: Acquiring resource group: snn.abc.com IPaddr::192.168.1.7/24/eth0 httpd
Jun 13 17:29:14 snn IPaddr[3240]: INFO: Resource is stopped
Jun 13 17:29:14 snn ResourceManager[3213]: info: Running /etc/ha.d/resource.d/IPaddr 192.168.1.7/24/eth0 start // the IP resource is started
Jun 13 17:29:14 snn IPaddr[3338]: INFO: Using calculated netmask for 192.168.1.7: 255.255.255.0
Jun 13 17:29:14 snn IPaddr[3338]: INFO: eval ifconfig eth0:0 192.168.1.7 netmask 255.255.255.0 broadcast 192.168.1.255
Jun 13 17:29:14 snn IPaddr[3309]: INFO: Success
Jun 13 17:29:14 snn ResourceManager[3213]: info: Running /etc/init.d/httpd start //http
[root@snn ha.d]# netstat -tlunp | grep 80
tcp 0 0 :::80 :::* LISTEN 3464/httpd
[root@snn ha.d]# ifconfig
eth0 Link encap:Ethernet HWaddr 00:0C:29:B1:89:48
inet addr:192.168.1.5 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:feb1:8948/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:35659 errors:0 dropped:0 overruns:0 frame:0
TX packets:10024 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:4539049 (4.3 MiB) TX bytes:2100109 (2.0 MiB)
eth0:0 Link encap:Ethernet HWaddr 00:0C:29:B1:89:48
inet addr:192.168.1.7 Bcast:192.168.1.255 Mask:255.255.255.0
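Rather than reading the ifconfig output by eye, the check for the VIP can be scripted. The sketch below assumes the classic net-tools output format shown above (interface name in column 1, indented "inet addr:" lines); vip_iface is a made-up helper and the sample text is a shortened copy of that output.

```shell
#!/bin/sh
# vip_iface VIP: read ifconfig-style text on stdin and print the interface
# (or alias) that carries VIP; print nothing if no interface has it.
vip_iface() {
    awk -v vip="$1" '
        /^[^ \t]/ { iface = $1 }    # a new interface block starts in column 1
        index($0, "inet addr:" vip " ") { print iface; exit }
    '
}

# Demo on a shortened sample; on a node this would be: ifconfig | vip_iface 192.168.1.7
vip_iface 192.168.1.7 <<'EOF'
eth0      Link encap:Ethernet  HWaddr 00:0C:29:B1:89:48
          inet addr:192.168.1.5  Bcast:192.168.1.255  Mask:255.255.255.0
eth0:0    Link encap:Ethernet  HWaddr 00:0C:29:B1:89:48
          inet addr:192.168.1.7  Bcast:192.168.1.255  Mask:255.255.255.0
EOF
```

For the sample this prints eth0:0, the alias on which heartbeat brought up the VIP.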
VII. Simulate a primary/standby switchover with a script
[root@snn ha.d]# sh /usr/lib64/heartbeat/hb_standby
2015/06/13_17:42:27 Going standby [all].
Jun 13 17:42:28 snn ResourceManager[3568]: info: Running /etc/init.d/httpd stop
Jun 13 17:42:28 snn ResourceManager[3568]: info: Running /etc/ha.d/resource.d/IPaddr 192.168.1.7/24/eth0 stop
Jun 13 17:42:29 snn IPaddr[3663]: INFO: ifconfig eth0:0 down
Jun 13 17:42:29 snn IPaddr[3634]: INFO: Success
Jun 13 17:42:29 snn heartbeat: [3555]: info: all HA resource release completed (standby).
Jun 13 17:42:29 snn heartbeat: [3061]: info: Local standby process completed [all].
Jun 13 17:42:30 snn heartbeat: [3061]: WARN: 1 lost packet(s) for [datanode4.abc.com] [819:821]
Jun 13 17:42:30 snn heartbeat: [3061]: info: remote resource transition completed.
Jun 13 17:42:30 snn heartbeat: [3061]: info: No pkts missing from datanode4.abc.com!
Jun 13 17:42:30 snn heartbeat: [3061]: info: Other node completed standby takeover of all resources. // the other node has taken over all resources
Now look at the .6 host (datanode4):
[root@datanode4 ha.d]# ifconfig
eth0 Link encap:Ethernet HWaddr 00:0C:29:E1:2F:66
inet addr:192.168.1.6 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fee1:2f66/64 Scope:Link
RX packets:37277 errors:0 dropped:0 overruns:0 frame:0
TX packets:3812 errors:0 dropped:0 overruns:0 carrier:0
RX bytes:5065186 (4.8 MiB) TX bytes:648956 (633.7 KiB)
eth0:0 Link encap:Ethernet HWaddr 00:0C:29:E1:2F:66
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
[root@datanode4 ha.d]# netstat -ltunp | grep 80
tcp 0 0 :::80 :::* LISTEN 2782/httpd
VIII. Providing the web content through an NFS mount
1. Bring up another host, 192.168.1.4 (datanode.abc.com, short name datanode), as the NFS server
[root@datanode ~]# mkdir /web/htdocs -p
2. Export the shared directory
[root@datanode ~]# vim /etc/exports
/web/htdocs 192.168.1.0/24(ro)
3. Start the NFS service
[root@datanode ~]# service nfs start
Starting NFS services: [ OK ]
Starting NFS quotas: [ OK ]
Starting NFS mountd: [ OK ]
Starting NFS daemon: [ OK ]
Starting RPC idmapd: [ OK ]
[root@datanode ~]# showmount -e 192.168.1.4
Export list for 192.168.1.4:
/web/htdocs 192.168.1.0/24
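An export only admits clients whose address falls inside the exported network, so it is worth checking the cluster nodes against the network in /etc/exports before trying to mount. The following is a plain-shell sketch of that membership test; ip_to_int and in_cidr are names invented for this example.

```shell
#!/bin/sh
# ip_to_int A.B.C.D: print the address as a 32-bit integer.
# The body runs in a subshell so the IFS change does not leak out.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# in_cidr IP NETWORK/PREFIX: succeed if IP lies inside the network.
in_cidr() {
    net=${2%/*}
    bits=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

for net in 192.168.0.0/24 192.168.1.0/24; do
    if in_cidr 192.168.1.5 "$net"; then
        echo "192.168.1.5 is inside $net"
    else
        echo "192.168.1.5 is outside $net"
    fi
done
```

The node 192.168.1.5 is outside 192.168.0.0/24 and inside 192.168.1.0/24, so only the second form of the export would let snn mount the directory.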
4. On the snn host, stop heartbeat on both nodes first, then change the resource configuration file
[root@snn ha.d]# ssh datanode4 '/etc/init.d/heartbeat stop'
Stopping High-Availability services:
Done.
[root@snn ha.d]# service heartbeat stop
[root@snn ha.d]# mount -t nfs 192.168.1.4:/web/htdocs /mnt
[root@snn ha.d]# mount -l | grep mnt
192.168.1.4:/web/htdocs on /mnt type nfs (rw,vers=4,addr=192.168.1.4,clientaddr=192.168.1.5)
[root@snn ~]# cat /mnt/index.html
<h1>datanode.abc.com</h1>
The test mount works, so unmount it again:
[root@snn ~]# umount /mnt
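IX. Let the resource manager mount the file system (on snn)
The order of the resources matters:
configure the IP first, then the file system, then the service.
The file system must always come before the service.
[root@snn ~]# vim /etc/ha.d/haresources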
snn.abc.com IPaddr::192.168.1.7/24/eth0 Filesystem::192.168.1.4:/web/htdocs::/var/www/html::nfs httpd
[root@snn ~]# scp /etc/ha.d/haresources datanode4:/etc/ha.d/haresources
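heartbeat acquires the resources on a haresources line from left to right and releases them from right to left, which is why the file system has to be listed before httpd. A small check along those lines might look like this; resource_pos and fs_before_service are hypothetical helper names, not heartbeat tools.

```shell
#!/bin/sh
# resource_pos LINE NAME: print the 1-based position of the first resource
# whose agent name is NAME (the node name in field 1 is not counted).
resource_pos() {
    echo "$1" | awk -v want="$2" '{
        for (i = 2; i <= NF; i++) {
            split($i, part, "::")
            if (part[1] == want) { print i - 1; exit }
        }
    }'
}

# fs_before_service LINE SERVICE: succeed if Filesystem is listed before SERVICE.
fs_before_service() {
    fs=$(resource_pos "$1" Filesystem)
    svc=$(resource_pos "$1" "$2")
    [ -n "$fs" ] && [ -n "$svc" ] && [ "$fs" -lt "$svc" ]
}

line='snn.abc.com IPaddr::192.168.1.7/24/eth0 Filesystem::192.168.1.4:/web/htdocs::/var/www/html::nfs httpd'
if fs_before_service "$line" httpd; then
    echo "order ok: file system is mounted before httpd starts"
else
    echo "order wrong: httpd would start before the file system is mounted"
fi
```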
X. Start heartbeat and check the log
// There are errors; the cause is already explained in part two of this heartbeat write-up.
Attachment: http://down.51cto.com/data/2365804
Reposted from zouqingyun's 51CTO blog. Original link: http://blog.51cto.com/zouqingyun/1661621. To repost, please contact the original author.