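Preface
This article covers three technologies: Heartbeat, DRBD, and MySQL.
Heartbeat overview
See the official site http://linux-ha.org/wiki/Main_Page or blog.51cto.com/lzhnb
DRBD overview
See the official site http://www.drbd.org/ or blog.51cto.com/lzhnb
MySQL overview
See the official site http://www.mysql.com/ or blog.51cto.com/lzhnb
Chapter 1 System Environment and Architecture
1.2 System environment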
[root@MySQL-Master01 ~]# cat /etc/redhat-release
CentOS release 6.9 (Final)
[root@MySQL-Master01 ~]# uname -r
2.6.32-696.el6.x86_64
[root@MySQL-Master01 ~]# uname -m
x86_64
[root@MySQL-Master01 ~]# /etc/init.d/iptables stop ==> stop the firewall
[root@MySQL-Master01 ~]# sed -i "s#SELINUX=enforcing#SELINUX=disabled#g" /etc/selinux/config
[root@MySQL-Master01 ~]# grep "SELINUX=disabled" /etc/selinux/config
[root@MySQL-Master01 ~]# setenforce 0
[root@MySQL-Master01 ~]# getenforce
[root@MySQL-Master01 ~]# echo '#time sync by liuzhonghe at 2018-1-15' >>/var/spool/cron/root ==> set up time synchronization
[root@MySQL-Master01 ~]# echo '*/5 * * * * /usr/sbin/ntpdate ntp1.aliyun.com >/dev/null 2>&1' >>/var/spool/cron/root
[root@MySQL-Master01 ~]# crontab -l
1.3 Software environment
Software | Version |
Heartbeat | heartbeat-3.0.4-2.el6.x86_64 |
DRBD | drbd83-utils-8.3.16 |
MySQL | mysql-5.5.49 |
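1.4 Server and directory planning
1.4.1 Server roles, IP addresses, and hostnames
No. | Role | IP | Hostname |
1 | MySQL master node 1 | 172.16.1.51/24 (internal) | MySQL-Master01 |
  |  | 172.16.4.2/24 (heartbeat) |  |
  |  | 172.168.4.2/24 (DRBD data transfer) |  |
2 | MySQL master node 2 | 172.16.1.52/24 (internal) | MySQL-Master02 |
  |  | 172.16.4.3/24 (heartbeat) |  |
  |  | 172.168.4.3/24 (DRBD data transfer) |  |
3 | MySQL slave node 1 | 172.16.1.71/24 | MySQL-Slave01 |
4 | VIP | 172.16.1.53/24 (serves internal clients) |  |
Note: the slave replicates from the master through the VIP.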
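1.4.2 Directory planning
Directory | Location | Purpose |
/server/scripts | all servers | scripts |
/application/tools | all servers | software packages |
/application | all servers | installation path for compiled software |
/data | all servers | database data |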
Chapter 2 Installation and Deployment
2.1 Heartbeat deployment
2.1.1 Configure the heartbeat routes between the master nodes
###################################### Primary node ###################################
[root@MySQL-Master01 ~]# route add -host 172.16.4.3 dev eth2 ==> route to the peer's heartbeat address
[root@MySQL-Master01 ~]# route add -host 172.168.4.3 dev eth3 ==> route for DRBD data
###################################### Standby node ###################################
[root@MySQL-Master02 ~]# route add -host 172.16.4.2 dev eth2 ==> route to the peer's heartbeat address
[root@MySQL-Master02 ~]# route add -host 172.168.4.2 dev eth3 ==> route for DRBD data
2.1.2 Install Heartbeat (on both nodes)
[root@MySQL-Master01 ~]# yum install -y heartbeat
[root@MySQL-Master02 ~]# yum install -y heartbeat
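2.1.3 Configure Heartbeat (the configuration files are identical on both nodes)
2.1.3.1 /etc/ha.d/ha.cf
[root@MySQL-Master01 ~]# cat /etc/ha.d/ha.cf
#log configure
debugfile /var/log/ha-debug ==> file for Heartbeat debug messages
logfile /var/log/ha-log ==> file for Heartbeat log messages
logfacility local1 ==> syslog facility used for logging (local1)
#options configure
keepalive 2 ==> interval between heartbeats; the default unit is seconds
deadtime 30 ==> declare the peer dead if no heartbeat arrives within this time
warntime 10 ==> log a warning if no heartbeat arrives within this time
initdead 120 ==> time allowed for the network to come up after a reboot or service restart; at least twice deadtime
mcast eth2 225.0.0.7 694 1 0 ==> multicast heartbeat on eth2: group 225.0.0.7, UDP port 694 (the default), TTL 1, loop 0
auto_failback on ==> move resources back automatically once the primary node recovers
node MySQL-Master01 ==> primary node; must match the output of uname -n
node MySQL-Master02 ==> standby node; must match the output of uname -n
crm no ==> whether to enable the cluster resource manager; with no, resources are taken from haresources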
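The timing directives above depend on each other: warntime should be shorter than deadtime, and initdead at least twice deadtime. The following is a minimal sketch of a sanity check for these relations. The function name check_hacf_timing is hypothetical (it is not part of Heartbeat), and the sketch assumes the values are plain integers in seconds.

```shell
#!/bin/bash
# check_hacf_timing FILE
# Hypothetical helper, not shipped with Heartbeat. Reads the timing
# directives from an ha.cf-style file and checks their ordering.
# Assumption: values are plain integers (seconds), not e.g. "500ms".
check_hacf_timing() {
    local file=$1 v keepalive warntime deadtime initdead

    # First occurrence of each directive; text after the value is ignored.
    keepalive=$(awk '$1 == "keepalive" {print $2; exit}' "$file")
    warntime=$(awk '$1 == "warntime" {print $2; exit}' "$file")
    deadtime=$(awk '$1 == "deadtime" {print $2; exit}' "$file")
    initdead=$(awk '$1 == "initdead" {print $2; exit}' "$file")

    # Reject missing or non-integer values.
    for v in "$keepalive" "$warntime" "$deadtime" "$initdead"; do
        case $v in
            ''|*[!0-9]*) echo "missing or non-integer timing directive"; return 2 ;;
        esac
    done

    if [ "$keepalive" -ge "$warntime" ]; then
        echo "keepalive ($keepalive) should be less than warntime ($warntime)"; return 1
    fi
    if [ "$warntime" -ge "$deadtime" ]; then
        echo "warntime ($warntime) should be less than deadtime ($deadtime)"; return 1
    fi
    if [ "$initdead" -lt $((deadtime * 2)) ]; then
        echo "initdead ($initdead) should be at least twice deadtime ($deadtime)"; return 1
    fi
    echo "timing OK"
}
```

It could be run as check_hacf_timing /etc/ha.d/ha.cf on each node before starting the service.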
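2.1.3.2 /etc/ha.d/haresources
[root@MySQL-Master01 ~]# cat /etc/ha.d/haresources
MySQL-Master01 IPaddr::172.16.1.53/24/eth1
#MySQL-Master01 IPaddr::172.16.1.53/24/eth1 drbddisk::data Filesystem::/dev/drbd1::/data::ext4 mysqld
Notes:
The commented-out line is the full resource list. It presumably replaces the VIP-only line once DRBD and MySQL are deployed, since the failover tests in 2.4 rely on those resources.
drbddisk::data <== activates the DRBD resource named data; equivalent to running /etc/ha.d/resource.d/drbddisk data stop/start
Filesystem::/dev/drbd1::/data::ext4 <== mounts the DRBD device on /data; equivalent to running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /data ext4 stop/start, i.e. mount /dev/drbd1 /data
mysqld <== runs the MySQL init script; equivalent to /etc/init.d/mysqld stop/start
2.1.3.3 /etc/ha.d/authkeys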
[root@MySQL-Master01 ~]# cat /etc/ha.d/authkeys
auth 1
1 sha1 liucdlzh
[root@MySQL-Master01 ~]# chmod 600 /etc/ha.d/authkeys
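The authkeys file above uses a fixed, hand-chosen secret. As an alternative, here is a minimal sketch that writes a random key in the same format. The function name make_authkeys is hypothetical, and the sketch assumes GNU coreutils (head, sha1sum) are available. The generated file must then be copied unchanged to the other node, because both nodes need the same key.

```shell
#!/bin/bash
# make_authkeys FILE
# Hypothetical helper: writes an authkeys file with a random sha1 key
# and restricts it to mode 600, as Heartbeat requires.
make_authkeys() {
    local file=$1 key
    # 32 random bytes hashed to a 40-character hex string.
    key=$(head -c 32 /dev/urandom | sha1sum | awk '{print $1}')
    # Create the file with a restrictive umask so it is never world-readable.
    ( umask 077; printf 'auth 1\n1 sha1 %s\n' "$key" > "$file" )
    chmod 600 "$file"
}
```

For example: make_authkeys /etc/ha.d/authkeys, followed by copying the file to MySQL-Master02.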
2.1.4 Start Heartbeat (on both nodes)
[root@MySQL-Master01 ~]# /etc/init.d/heartbeat start
[root@MySQL-Master01 ~]# chkconfig heartbeat off
[root@MySQL-Master02 ~]# /etc/init.d/heartbeat start
[root@MySQL-Master02 ~]# chkconfig heartbeat off
Note: start-on-boot is disabled, so after a server reboot Heartbeat must be started manually.
2.1.5 Test Heartbeat
2.1.5.1 Normal state
[root@MySQL-Master01 ~]# ip addr |grep eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
inet 172.16.1.51/24 brd 172.16.1.255 scope global eth1
inet 172.16.1.53/24 brd 172.16.1.255 scope global secondary eth1
[root@MySQL-Master02 ~]# ip addr |grep eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
inet 172.16.1.52/24 brd 172.16.1.255 scope global eth1
2.1.5.2 Simulate a primary node failure
[root@MySQL-Master01 ~]# /etc/init.d/heartbeat stop
Stopping High-Availability services: Done
[root@MySQL-Master02 ~]# ip addr |grep eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
inet 172.16.1.52/24 brd 172.16.1.255 scope global eth1
inet 172.16.1.53/24 brd 172.16.1.255 scope global secondary eth1
2.1.5.3 Simulate primary node recovery
[root@MySQL-Master01 ~]# /etc/init.d/heartbeat start
[root@MySQL-Master01 ~]# ip addr |grep eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
inet 172.16.1.51/24 brd 172.16.1.255 scope global eth1
inet 172.16.1.53/24 brd 172.16.1.255 scope global secondary eth1
[root@MySQL-Master02 mysql]# ip addr |grep eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
inet 172.16.1.52/24 brd 172.16.1.255 scope global eth1
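The checks above read the ip addr output by eye. A minimal sketch of the same check as a script follows. The function name has_vip is hypothetical; it reads ip addr output on standard input and looks for the VIP as a fixed string.

```shell
#!/bin/bash
# has_vip VIP
# Hypothetical helper: reads `ip addr` output on stdin and succeeds
# if an "inet VIP/<prefix>" entry is present. -F makes the dots literal.
has_vip() {
    grep -qF "inet $1/"
}
```

Usage on either node: ip addr show eth1 | has_vip 172.16.1.53 && echo "VIP is on this node".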
2.2 DRBD deployment (perform the same steps on both nodes)
2.2.1 Add a new disk
[root@MySQL-Master01 ~]# fdisk -l /dev/sdb
Disk /dev/sdb: 10.7 GB, 10737418240 bytes
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xe21fa70d
Device Boot Start End Blocks Id System
/dev/sdb1 1 654 5253223+ 83 Linux
/dev/sdb2 655 1305 5229157+ 83 Linux
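Steps:
[root@MySQL-Master01 ~]# fdisk /dev/sdb ==> create two partitions
[root@MySQL-Master01 ~]# mkfs.ext4 /dev/sdb1 ==> format the data partition
[root@MySQL-Master01 ~]# tune2fs -c -1 /dev/sdb1 ==> set the maximum mount count to -1 (no fsck triggered by mount count)
Note: /dev/sdb2 does not need to be formatted, because it holds the DRBD metadata.
2.2.2 Install DRBD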
[root@MySQL-Master01 ~]# rpm -Uvh http://www.elrepo.org/elrepo-release-6-8.el6.elrepo.noarch.rpm
[root@MySQL-Master01 ~]# yum install -y kmod-drbd83 drbd83-utils
[root@MySQL-Master01 ~]# modprobe drbd
Note: do not add echo "modprobe drbd" >>/etc/rc.local to load the drbd module at boot. With that setup the drbd service starts before the module is loaded, and DRBD fails to start.
2.2.3 Configure DRBD
[root@MySQL-Master01 ~]# cd /etc/drbd.d/
[root@MySQL-Master01 drbd.d]# cp global_common.conf{,.bak}
[root@MySQL-Master01 drbd.d]# cat global_common.conf
global {
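usage-count no; ==> do not report usage statistics to LINBIT; the default is yes
}
common {
protocol C; ==> replication protocol; C is fully synchronous
disk { ==> fine-tune the properties of DRBD's backing storage
on-io-error detach; ==> on a lower-level I/O error, detach the backing disk
no-disk-flushes;
no-md-flushes;
}
net { ==> fine-tune network-related properties
sndbuf-size 512k; ==> TCP send buffer size; 0 means auto-tune, the default is 128k, and it should not exceed 2M
max-buffers 8000; ==> maximum number of buffers DRBD allocates for requests
unplug-watermark 1024;
max-epoch-size 8000;
cram-hmac-alg "sha1"; ==> HMAC algorithm used for peer authentication
shared-secret "liucdlzh";
after-sb-0pri disconnect;
after-sb-1pri disconnect;
after-sb-2pri disconnect;
rr-conflict disconnect;
}
syncer {
rate 120M; ==> resynchronization rate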
al-extents 517;
}
}
2.2.4 Configure the DRBD resource
[root@MySQL-Master01 drbd.d]# cat r0.res
resource data {
on MySQL-Master01 { ==> primary node
device /dev/drbd1;
disk /dev/sdb1;
address 172.16.1.51:7788;
meta-disk /dev/sdb2 [0];
}
on MySQL-Master02 { ==> standby node
device /dev/drbd1;
disk /dev/sdb1;
address 172.16.1.52:7788;
meta-disk /dev/sdb2 [0];
}
}
2.2.5 Initialize the device metadata and start DRBD
[root@MySQL-Master01 ~]# drbdadm create-md data
[root@MySQL-Master01 ~]# /etc/init.d/drbd start
2.2.6 Run the initial synchronization and mount the device
[root@MySQL-Master01 ~]# drbdadm -- --overwrite-data-of-peer primary data ==> run on the primary node only
[root@MySQL-Master01 ~]# drbdadm primary all
[root@MySQL-Master01 ~]# mount /dev/drbd1 /data/
[root@MySQL-Master01 ~]# cat /proc/drbd ==> check the status
2.2.7 Test DRBD
2.2.7.1 Normal state
[root@MySQL-Master01 ~]# cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:2172 nr:2804 dw:4976 dr:35859 al:11 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
[root@MySQL-Master02 ~]# cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37
1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
ns:2804 nr:2196 dw:5000 dr:30544 al:10 bm:2 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
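In the /proc/drbd output above, cs is the connection state, ro shows the local/peer roles, and ds shows the local/peer disk states. A healthy pair reports cs:Connected and ds:UpToDate/UpToDate. Here is a minimal sketch that checks this from a script. The function name drbd_healthy is hypothetical, and the sketch assumes the 8.3-style single-line device format shown above.

```shell
#!/bin/bash
# drbd_healthy [FILE]
# Hypothetical helper: prints the cs/ro/ds fields of every device line
# in FILE (default /proc/drbd) and succeeds only if at least one device
# exists and all devices are Connected with both disks UpToDate.
drbd_healthy() {
    local file=${1:-/proc/drbd}
    awk '
        /^ *[0-9]+: cs:/ {
            devices++
            if ($2 != "cs:Connected" || $4 != "ds:UpToDate/UpToDate") bad++
            print $2, $3, $4
        }
        END { exit (devices == 0 || bad > 0) ? 1 : 0 }
    ' "$file"
}
```

For example: drbd_healthy || echo "DRBD needs attention".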
2.2.7.2 Simulate a DRBD failure
[root@MySQL-Master01 ~]# umount /dev/drbd1
[root@MySQL-Master01 ~]# /etc/init.d/drbd stop
[root@MySQL-Master02 ~]# drbdadm primary all
[root@MySQL-Master02 ~]# mount /dev/drbd1 /data/
[root@MySQL-Master02 ~]# cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:2172 nr:2804 dw:4976 dr:35859 al:11 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
[root@MySQL-Master02 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 8.8G 2.8G 5.7G 33% /
tmpfs 491M 0 491M 0% /dev/shm
/dev/sda1 190M 35M 146M 20% /boot
/dev/drbd1 4.9G 40M 4.6G 1% /data
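2.3 MySQL deployment
[Prerequisites]
- MySQL must be installed on all three database servers
- MySQL-Master02 does not need its database initialized
- The mysqld service does not need to be enabled at boot
2.3.1 Installation
Install MySQL
#### Create the mysql user
[root@MySQL-Master01 ~]# useradd mysql -s /sbin/nologin -M
#### Unpack and install MySQL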
[root@MySQL-Master01 ~]# cd /home/oldboy/tools/
[root@MySQL-Master01 ~]# rz
[root@MySQL-Master01 ~]# tar xf mysql-5.5.49-linux2.6-x86_64.tar.gz
[root@MySQL-Master01 ~]# mv mysql-5.5.49-linux2.6-x86_64 /application/mysql-5.5.49/
[root@MySQL-Master01 ~]# ln -s /application/mysql-5.5.49/ /application/mysql
[root@MySQL-Master01 ~]# ll /application/mysql
#### Initialize the database (skip this step on the standby node)
[root@MySQL-Master01 ~]# /application/mysql/scripts/mysql_install_db --basedir=/application/mysql --datadir=/application/mysql/data/ --user=mysql
#### Set ownership and install the configuration file and init script
[root@MySQL-Master01 ~]# chown -R mysql.mysql /application/mysql/
[root@MySQL-Master01 ~]# cp /application/mysql/support-files/my-small.cnf /etc/my.cnf
[root@MySQL-Master01 ~]# cp /application/mysql/support-files/mysql.server /etc/init.d/mysqld
[root@MySQL-Master01 ~]# chmod +x /etc/init.d/mysqld
[root@MySQL-Master01 ~]# sed -i 's#/usr/local/mysql#/application/mysql#g' /application/mysql/bin/mysqld_safe /etc/init.d/mysqld
[root@MySQL-Master01 ~]# /etc/init.d/mysqld start
#### Copy the MySQL binaries into a directory on PATH
[root@MySQL-Master01 ~]# cp -a /application/mysql/bin/* /usr/local/sbin/
#### Set the root password (skip this step on the standby node)
[root@MySQL-Master01 ~]# mysqladmin -uroot password '123456'
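2.3.2 Configure the slave to replicate through the VIP
2.3.2.1 Master configuration
1. Enable the binlog and set server-id
[root@MySQL-Master01 ~]# cat /etc/my.cnf ==> add the following two lines to this file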
log-bin = /application/mysql/mysql-bin
server-id = 3
[root@MySQL-Master01 ~]# /etc/init.d/mysqld restart ==> the standby node does not need a restart
2. Create the replication account and grant privileges
[root@MySQL-Master01 ~]# mysql -uroot -p
mysql> grant replication slave on *.* to 'rep'@'172.16.1.%' identified by '123456';
2.3.2.2 Slave configuration
1. Set server-id
[root@MySQL-Slave02 ~]# cat /etc/my.cnf
server-id = 4
2. Configure the replication parameters
[root@MySQL-Slave02 ~]# mysql -uroot -p
mysql> change master to
master_host='172.16.1.53',
master_port=3306,
master_user='rep',
master_password='123456',
master_log_file='mysql-bin.000001', ==> taken from show master status; on the master
master_log_pos=257; ==> taken from show master status; on the master
mysql> start slave;
2.3.3 Verify master-slave replication
[root@MySQL-Slave02 ~]# mysql -uroot -p
mysql> show slave status\G
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
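Replication is healthy only when both Slave_IO_Running and Slave_SQL_Running report Yes. A minimal sketch of that check for use in scripts follows. The function name slave_threads_ok is hypothetical; it reads the output of show slave status\G on standard input, so it does not need database credentials itself.

```shell
#!/bin/bash
# slave_threads_ok
# Hypothetical helper: reads "show slave status\G" output on stdin and
# succeeds only if both replication threads report Yes.
slave_threads_ok() {
    awk -F': *' '
        $1 ~ /^ *Slave_IO_Running$/  { io = $2 }
        $1 ~ /^ *Slave_SQL_Running$/ { sql = $2 }
        END { exit (io == "Yes" && sql == "Yes") ? 0 : 1 }
    '
}
```

Usage on the slave: mysql -uroot -p -e 'show slave status\G' | slave_threads_ok && echo "replication is running".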
2.4 Test high availability
2.4.1 Normal state
[root@MySQL-Master01 ~]# mysql -uroot -p
mysql> create database lzh;
Query OK, 1 row affected (0.02 sec)
[root@MySQL-Slave02 ~]# mysql -uroot -p
mysql> show slave status\G
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
[root@MySQL-Slave02 ~]# mysql -uroot -e "show databases like 'lzh';" -p123456
+----------------+
| Database (lzh) |
+----------------+
| lzh |
+----------------+
2.4.2 Simulate failure of the primary HA node
[root@MySQL-Master01 ~]# /etc/init.d/heartbeat stop
Stopping High-Availability services: Done.
[root@MySQL-Master02 ~]# ip addr |grep eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
inet 172.16.1.52/24 brd 172.16.1.255 scope global eth1
inet 172.16.1.53/24 brd 172.16.1.255 scope global secondary eth1
[root@MySQL-Slave02 mysql]# mysql -uroot -p123456 -e "show slave status\G"|egrep "Slave_IO_Running|Slave_SQL_Running"
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
[root@MySQL-Master02 ~]# mysql -uroot -p123456 -e "create database oldboy;"
[root@MySQL-Slave02 mysql]# mysql -uroot -p123456 -e "show databases like 'old%';"
+-----------------+
| Database (old%) |
+-----------------+
| oldboy |
+-----------------+
2.4.3 Simulate recovery of the primary HA node
[root@MySQL-Master01 ~]# /etc/init.d/heartbeat start
[root@MySQL-Master01 ~]# ip addr |grep eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
inet 172.16.1.51/24 brd 172.16.1.255 scope global eth1
inet 172.16.1.53/24 brd 172.16.1.255 scope global secondary eth1
[root@MySQL-Slave02 mysql]# mysql -uroot -p123456 -e "show slave status\G"|egrep "Slave_IO_Running|Slave_SQL_Running"
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
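Chapter 3 Split-Brain in High Availability and How to Prevent It
3.1 Causes of split-brain
1. The heartbeat link between the HA servers fails, so the nodes can no longer detect each other's heartbeat
2. A firewall is enabled between the HA servers and blocks heartbeat traffic
3. A NIC address on an HA server is misconfigured, so heartbeats cannot be sent
4. Software bugs, service misconfiguration, and similar problems
3.2 Preventing split-brain
1. Add redundant heartbeat links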