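Zabbix High-Availability Solution
This walkthrough deploys Zabbix in a highly available configuration using the RHCS high-availability suite: pacemaker + corosync + pcs. Zabbix itself has supported high availability natively since version 6.0 and no longer depends on third-party components for it; this article instead uses Red Hat's official high-availability suite. Compared with a keepalived-based setup, this approach is simpler, because pacemaker manages both the VIP and the services as cluster resources. The same configuration can be adapted to make other services highly available.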
1. Server planning
Server hostname | Address | Software |
---|---|---|
zabbix-server1 | 192.168.59.128 | pacemaker corosync pcs zabbix5.x php72 httpd |
zabbix-server2 | 192.168.59.129 | pacemaker corosync pcs zabbix5.x php72 httpd |
mysql-server | 192.168.59.130 | mariadb |
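VIP: 192.168.59.162
Installation of the database and of Zabbix itself is not covered here.
2. System environment initialization
- Time synchronization
- Disable the system firewall
- Disable SELinux
- Host name resolution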
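The four preparation items above are listed without commands. A minimal sketch for CentOS/RHEL 7 follows. It assumes chrony for time synchronization and /etc/hosts for name resolution, neither of which is specified in this article, and `add_host_entry` and `init_node` are hypothetical helper names. Loading the script only defines the functions; `init_node` has to be called as root on each node.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical; chrony and /etc/hosts are assumptions.

# Append "ip name" to a hosts file unless the name is already present.
add_host_entry() {
    local ip="$1" name="$2" file="${3:-/etc/hosts}"
    if ! grep -qw -- "$name" "$file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Prepare one node. Call this manually as root; it is not run on load.
init_node() {
    # Time synchronization (assumes chrony is the chosen NTP client)
    yum install -y chrony
    systemctl enable chronyd && systemctl start chronyd

    # Stop and disable the system firewall
    systemctl stop firewalld && systemctl disable firewalld

    # Switch SELinux to permissive now and disable it from the next boot
    setenforce 0
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config

    # Host name resolution for the three servers in the planning table
    add_host_entry 192.168.59.128 zabbix-server1
    add_host_entry 192.168.59.129 zabbix-server2
    add_host_entry 192.168.59.130 mysql-server
}
```

Because `add_host_entry` skips names that already exist, `init_node` can be re-run without duplicating lines in /etc/hosts.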
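3. Install the high-availability suite (run on both Zabbix hosts)
Install (corosync is installed as a dependency of pacemaker):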
yum install pacemaker pcs -y
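4. Set the password for the cluster user hacluster (run on both Zabbix hosts, with the same password; replace 123456 with a strong password outside a test environment)
echo 123456 | passwd --stdin hacluster
5. Start pcsd (run on both Zabbix hosts)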
systemctl enable pcsd && systemctl start pcsd
6. Authenticate the nodes (run on either node)
pcs cluster auth zabbix-server1 zabbix-server2
Username: hacluster
Password:
zabbix-server1: Authorized
zabbix-server2: Authorized
7. Create the cluster (run on either node)
pcs cluster setup --name zabbixserver zabbix-server1 zabbix-server2
8. Start the cluster and enable it at boot (run on either node)
pcs cluster start --all
pcs cluster enable --all
9. Check the cluster status
pcs status cluster
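Steps 3 to 9 can be collected into one script. The sketch below is not part of the original procedure: `DRY_RUN`, `HA_PASSWORD`, `run`, `prepare_node` and `create_cluster` are hypothetical names, and by default the script only prints the commands it would run. The pcs commands themselves are the ones used in the steps above, plus the `-u` option of `pcs cluster auth` to preset the user name.

```shell
#!/usr/bin/env bash
# Sketch only: DRY_RUN, HA_PASSWORD and the function names are hypothetical.
DRY_RUN="${DRY_RUN:-1}"                 # 1 = print commands, 0 = execute them
NODES="zabbix-server1 zabbix-server2"   # left unquoted below so it splits into two arguments
CLUSTER_NAME="zabbixserver"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Steps 3 to 5: run on both Zabbix nodes.
prepare_node() {
    run yum install -y pacemaker pcs
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ passwd --stdin hacluster (password read from HA_PASSWORD)"
    else
        printf '%s\n' "${HA_PASSWORD:?set HA_PASSWORD first}" | passwd --stdin hacluster
    fi
    run systemctl enable pcsd
    run systemctl start pcsd
}

# Steps 6 to 9: run on one node only, after prepare_node has run on both.
create_cluster() {
    # The password is still prompted for interactively
    run pcs cluster auth $NODES -u hacluster
    run pcs cluster setup --name "$CLUSTER_NAME" $NODES
    run pcs cluster start --all
    run pcs cluster enable --all
    run pcs status cluster
}
```

Calling `prepare_node` and then `create_cluster` with the default `DRY_RUN=1` lists the nine commands for review; setting `DRY_RUN=0` executes them.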

10. Configure the services
# No fence device is configured, so disable STONITH
pcs property set stonith-enabled=false
# This is a two-node cluster, so tell pacemaker to keep running resources when quorum is lost
pcs property set no-quorum-policy=ignore
# Configure the VIP
pcs resource create cluster_vip ocf:heartbeat:IPaddr2 ip=192.168.59.162 cidr_netmask=24 op monitor interval=20s
# Configure php-fpm
pcs resource create php-fpm systemd:rh-php72-php-fpm op monitor interval=10s
# Configure httpd
pcs resource create httpd systemd:httpd op monitor interval=10s
# Configure zabbix-server
pcs resource create zabbix_server systemd:zabbix-server op monitor interval=10s
# Configure zabbix-agent
pcs resource create zabbix_agent systemd:zabbix-agent op monitor interval=10s
# Configure the resource group
pcs resource group add grp_zabbix_httpd php-fpm zabbix_server httpd zabbix_agent
# Configure colocation (ensures the VIP and the Zabbix services start on the same node)
pcs constraint colocation add grp_zabbix_httpd with cluster_vip INFINITY
# Configure the resource start order
pcs constraint order cluster_vip then grp_zabbix_httpd
# Check the resource status
pcs status
[root@zabbix-server1 web]# pcs status
Cluster name: zabbixserver
Stack: corosync
Current DC: zabbix-server2 (version 1.1.23-1.el7_9.1-9acf116022) - partition with quorum
Last updated: Sat Aug 20 09:43:39 2022
Last change: Fri Aug 19 18:53:24 2022 by root via cibadmin on zabbix-server1
2 nodes configured
5 resource instances configured
Online: [ zabbix-server1 zabbix-server2 ]
Full list of resources:
cluster_vip (ocf::heartbeat:IPaddr2): Started zabbix-server1
Resource Group: grp_zabbix_httpd
php-fpm (systemd:rh-php72-php-fpm): Started zabbix-server1
zabbix_server (systemd:zabbix-server): Started zabbix-server1
httpd (systemd:httpd): Started zabbix-server1
zabbix_agent (systemd:zabbix-agent): Started zabbix-server1
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
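The colocation constraint is working if every resource in the output above is reported as Started on the same node. As a rough sketch, that check can be scripted by parsing the `pcs status` text; `started_nodes` and `all_on_one_node` are hypothetical helper names, and the parsing assumes the "Started <node>" line layout shown above.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical; assumes the output layout shown above.

# Read `pcs status` text on stdin and print each distinct node that has
# at least one resource in the "Started" state.
started_nodes() {
    awk 'NF >= 2 && $(NF-1) == "Started" { print $NF }' | sort -u
}

# Succeed only when all started resources sit on exactly one node.
all_on_one_node() {
    [ "$(started_nodes | grep -c .)" -eq 1 ]
}

# Usage (on a cluster node):
#   pcs status | all_on_one_node && echo "VIP and services are colocated"
```

The check only looks at started resources, so a stopped resource still has to be spotted in the full `pcs status` output.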
11、故障转移测试
# Put zabbix-server1 into standby (or simply power it off), then check how the resources move and run
[root@zabbix-server1 ~]# pcs node standby
[root@zabbix-server1 ~]# pcs status nodes
Pacemaker Nodes:
Online: zabbix-server2
Standby: zabbix-server1
Standby with resource(s) running:
Maintenance:
Offline:
Pacemaker Remote Nodes:
Online:
Standby:
Standby with resource(s) running:
Maintenance:
Offline:
[root@zabbix-server1 ~]# pcs status
Cluster name: zabbixserver
Stack: corosync
Current DC: zabbix-server2 (version 1.1.23-1.el7_9.1-9acf116022) - partition with quorum
Last updated: Sat Aug 20 09:54:10 2022
Last change: Sat Aug 20 09:51:42 2022 by root via cibadmin on zabbix-server1
2 nodes configured
5 resource instances configured
Node zabbix-server1: standby
Online: [ zabbix-server2 ]
Full list of resources:
cluster_vip (ocf::heartbeat:IPaddr2): Started zabbix-server2
Resource Group: grp_zabbix_httpd
php-fpm (systemd:rh-php72-php-fpm): Started zabbix-server2
zabbix_server (systemd:zabbix-server): Started zabbix-server2
httpd (systemd:httpd): Started zabbix-server2
zabbix_agent (systemd:zabbix-agent): Started zabbix-server2
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
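The test above can also be verified mechanically by comparing which node holds the VIP before and after the standby. The following is a minimal sketch with hypothetical helper names (`vip_node`, `vip_moved`); it assumes the resource is named cluster_vip as configured in step 10 and the `pcs status` layout shown above. The 30-second wait in the usage comment is an arbitrary value, not a measured failover time.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical; assumes the output layout shown above.

# Read `pcs status` text on stdin and print the node that runs cluster_vip.
vip_node() {
    awk 'NF >= 2 && $1 == "cluster_vip" && $(NF-1) == "Started" { print $NF }'
}

# Succeed when the VIP moved to a different node.
# $1: `pcs status` text captured before the test, $2: text captured after it.
vip_moved() {
    local before after
    before="$(printf '%s\n' "$1" | vip_node)"
    after="$(printf '%s\n' "$2" | vip_node)"
    [ -n "$before" ] && [ -n "$after" ] && [ "$before" != "$after" ]
}

# Usage (on a cluster node):
#   before="$(pcs status)"
#   pcs node standby zabbix-server1
#   sleep 30
#   after="$(pcs status)"
#   vip_moved "$before" "$after" && echo "failover OK"
#   pcs node unstandby zabbix-server1   # return the node to service afterwards
```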