
Implementing Redis Sentinel and simulating a master failure

1. Redis Sentinel


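1.1 Introduction to Redis clustering

The master-replica (master/slave) replication setup covered in the previous articles cannot switch the master and replica roles automatically. If the master becomes unusable because the Redis service fails, the host loses power, or a disk is damaged, plain replication does not fail over on its own (that is, it does not promote a replica to be the new master); someone has to change the configuration by hand before clients can switch to a replica. In addition, when a single Redis server can no longer keep up with the write load, replication offers no way to scale write throughput horizontally.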

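Two core problems therefore need to be solved:

  • Switching the master and replica roles seamlessly, so that applications do not notice and keep working;
  • Scaling Redis servers out dynamically, so that several servers accept writes in parallel and sustain higher concurrency.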

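Ways to implement a Redis cluster:

  • Client-side sharding: the application decides which Redis server each key is sent to
  • Proxy-based sharding: a proxy such as codis or twemproxy decides which Redis server each key is sent to
  • Redis Cluster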

1.2 How Redis Sentinel works

Sentinel can manage multiple Redis master-replica groups.

[Figure: Sentinel architecture]

[Figure: Sentinel failover]

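The Sentinel process monitors the state of the master in a Redis deployment. When the master fails, Sentinel switches the master and replica roles so that the system stays highly available. Sentinel was introduced in Redis 2.6 and became stable in 2.8, so production environments should run Redis 2.8 or later.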

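Sentinel is a distributed system: several Sentinel processes can run in one deployment. They use a gossip protocol to exchange information about whether the master is down, and an agreement (voting) protocol to decide whether to perform an automatic failover and which replica to promote to master.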

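Each Sentinel process periodically sends messages to the other Sentinels, the master, and the replicas to confirm that they are alive. If a peer does not reply within a configurable time, the Sentinel provisionally treats it as offline. This is a subjective judgment, one made by a single Sentinel that the other Sentinels may or may not share, and the state is called Subjectively Down (SDOWN).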

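The counterpart of subjectively down is objectively down. When enough Sentinels (at least the configured quorum) have marked the master SDOWN and have confirmed this with one another through the SENTINEL is-master-down-by-addr command, the master is considered down as an observed fact rather than as one Sentinel's opinion. This state is called Objectively Down (ODOWN).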

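Through a voting algorithm, the Sentinels elect a leader, which selects one of the remaining replicas, promotes it to master, and rewrites the relevant configuration automatically. This process is the failover.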

Sentinel solves automatic switching of the master and replica roles, much as MHA does for MySQL, but it does not remove the performance bottleneck of a single master.

A Redis Sentinel deployment should have at least three Sentinel nodes, preferably an odd number.
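
The arithmetic behind this recommendation can be sketched in a few lines of shell. The sketch assumes, as the Redis Sentinel documentation describes, that a failover has to be authorized by a majority of all Sentinels; the function names majority and tolerated_failures are made up for illustration.

```shell
#!/bin/sh
# majority N: smallest number of Sentinels that forms a majority of N.
majority() {
    echo $(( $1 / 2 + 1 ))
}

# tolerated_failures N: how many Sentinels can be lost while a majority remains.
tolerated_failures() {
    echo $(( $1 - $1 / 2 - 1 ))
}

for n in 2 3 4 5; do
    printf 'sentinels=%s majority=%s tolerated_failures=%s\n' \
        "$n" "$(majority "$n")" "$(tolerated_failures "$n")"
done
```

Three Sentinels tolerate the loss of one, and a fourth adds no extra tolerance, which is why an odd count is preferred. Two Sentinels tolerate no loss at all.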

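At initialization, clients connect to the set of Sentinel nodes rather than to a specific Redis node. Sentinel acts only as a configuration provider, not as a proxy.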
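
Because Sentinel only hands out addresses, a client (or a shell script) first has to ask it where the master is. What follows is a minimal sketch, assuming the three Sentinel hosts and the group name mymaster from the environment used later in this article. The names current_master, REDIS_CLI, SENTINELS, and MASTER_NAME are illustrative; SENTINEL get-master-addr-by-name is the standard Sentinel command for this lookup.

```shell
#!/bin/sh
# Illustrative defaults; override them from the environment.
REDIS_CLI=${REDIS_CLI:-redis-cli}
SENTINELS=${SENTINELS:-"10.0.0.100 10.0.0.101 10.0.0.102"}
MASTER_NAME=${MASTER_NAME:-mymaster}

# current_master: print "ip port" of the current master, asking each
# Sentinel in turn until one of them answers.
current_master() {
    for host in $SENTINELS; do
        # When not attached to a terminal, redis-cli prints the reply as
        # two lines: the master's IP, then its port.
        reply=$($REDIS_CLI -h "$host" -p 26379 \
            SENTINEL get-master-addr-by-name "$MASTER_NAME" 2>/dev/null) || continue
        set -- $reply
        if [ "$#" -eq 2 ]; then
            echo "$1 $2"
            return 0
        fi
    done
    return 1
}

# Usage: current_master
```

The client then opens an ordinary connection to the returned address; no traffic flows through Sentinel itself.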

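The Redis data nodes in a Sentinel deployment are no different from ordinary Redis instances; read/write splitting has to be implemented in the client.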

Before Redis 3.0, production environments generally used Sentinel. Redis 3.0 introduced Redis Cluster, which supports larger production deployments.

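Sentinel runs three periodic tasks:

  • Every 10 seconds, each Sentinel runs INFO against the master and the replicas

    to discover replica nodes

    to confirm the master-replica relationships

  • Every 2 seconds, each Sentinel exchanges information through a pub/sub channel on the master

    the channel is __sentinel__:hello

    the Sentinels share their view of the nodes and their own details

  • Every 1 second, each Sentinel sends PING to the other Sentinels and to the Redis nodes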
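
To make the second task more concrete, the sketch below splits one hello payload into labelled fields. The field order is an assumption based on the Redis 6.x source and should be checked against the version in use; parse_hello is a hypothetical helper, and the sample payload is assembled by hand from addresses and IDs that appear later in this article.

```shell
#!/bin/sh
# parse_hello MESSAGE: split one hello payload into labelled fields.
# Assumed field order (verify against your Redis version):
#   sentinel_ip,sentinel_port,sentinel_runid,current_epoch,
#   master_name,master_ip,master_port,master_config_epoch
parse_hello() {
    old_ifs=$IFS
    IFS=,
    set -- $1
    IFS=$old_ifs
    [ "$#" -eq 8 ] || return 1
    printf 'sentinel=%s:%s runid=%s epoch=%s master=%s@%s:%s config_epoch=%s\n' \
        "$1" "$2" "$3" "$4" "$5" "$6" "$7" "$8"
}

# Hand-built sample payload, not captured from a live system.
parse_hello '10.0.0.101,26379,c8d2fb34edbf68e35022c915563a4f9693d19410,0,mymaster,10.0.0.100,6379,0'
```

Live messages can be watched on a data node with redis-cli -a wm521314 SUBSCRIBE __sentinel__:hello.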

1.3 Setting up Sentinel


Environment:

Prepare three hosts for the master-replica group, then run a Sentinel on each of them.

IP           Redis version   Hostname
10.0.0.100   redis-6.2.7     node1.stars.org
10.0.0.101   redis-6.2.7     node2.stars.org
10.0.0.102   redis-6.2.7     node3.stars.org

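1.3.1 Set up master-replica replication first

Sentinel requires a working master-replica replication environment. The goal here is a highly available Redis setup with one master and two replicas, managed by Sentinel.

Note: masterauth must also be set in the master's configuration file and must be identical on the master and the replicas, because the old master rejoins as a replica after a failover.

The key settings in redis.conf on all master and replica nodes are as follows: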

#Run on all master and replica nodes
root@node1:~# vim /apps/redis/etc/redis.conf
bind 0.0.0.0
masterauth wm521314
requirepass wm521314
#The same changes can be made with sed
#sed -i -e 's/bind 127.0.0.1/bind 0.0.0.0/' -e 's/^# masterauth .*/masterauth wm521314/' -e 's/^# requirepass .*/requirepass wm521314/' /apps/redis/etc/redis.conf

#Run on all replica nodes
root@node2:~# echo "replicaof 10.0.0.100 6379" >> /apps/redis/etc/redis.conf

#Run on all master and replica nodes
root@node1:~# systemctl restart redis.service
           

Check the status of the master:

root@node1:~# redis-cli -a wm521314
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:6379> INFO replication
# Replication
role:master
connected_slaves:2
slave0:ip=10.0.0.101,port=6379,state=online,offset=126,lag=0
slave1:ip=10.0.0.102,port=6379,state=online,offset=126,lag=0
master_failover_state:no-failover
master_replid:6c38e5bab8f2ebd438426c7c46556eae889b46ea
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:126
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:126
           

Check the status of replica 1:

root@node2:~# redis-cli -a wm521314
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:6379> INFO replication
# Replication
role:slave
master_host:10.0.0.100
master_port:6379
master_link_status:up
master_last_io_seconds_ago:9
master_sync_in_progress:0
slave_read_repl_offset:238
slave_repl_offset:238
slave_priority:100
slave_read_only:1
replica_announced:1
connected_slaves:0
master_failover_state:no-failover
master_replid:6c38e5bab8f2ebd438426c7c46556eae889b46ea
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:238
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:238
           

Check the status of replica 2:

root@node3:~# redis-cli -a wm521314
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:6379> INFO replication
# Replication
role:slave
master_host:10.0.0.100
master_port:6379
master_link_status:up
master_last_io_seconds_ago:8
master_sync_in_progress:0
slave_read_repl_offset:280
slave_repl_offset:280
slave_priority:100
slave_read_only:1
replica_announced:1
connected_slaves:0
master_failover_state:no-failover
master_replid:6c38e5bab8f2ebd438426c7c46556eae889b46ea
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:280
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:113
repl_backlog_histlen:168
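
Rather than reading the full INFO output on every node, a single field can be pulled out with a small filter. This is a sketch only: info_field is a hypothetical helper, and the sample input is a shortened copy of the replica output shown above.

```shell
#!/bin/sh
# info_field NAME: read INFO output on stdin and print the value of NAME.
# INFO lines end in CRLF, so carriage returns are stripped first.
info_field() {
    tr -d '\r' | awk -F: -v key="$1" '$1 == key { sub(/^[^:]*:/, ""); print; exit }'
}

# Shortened sample taken from the replica output in this article.
printf 'role:slave\nmaster_host:10.0.0.100\nmaster_link_status:up\n' | info_field master_host
```

Against a live node the same filter would be used as redis-cli -a wm521314 INFO replication | info_field master_link_status, where a healthy replica is expected to report up.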
           

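1.3.2 Edit the Sentinel configuration file

Sentinel is in fact a special Redis server: it supports some Redis commands but not most of them. By default it listens on port 26379/tcp.

Use the same configuration file, shown below, on all Redis nodes.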

#Redis was built from source here, so find sentinel.conf in the source tree and copy it into the Redis installation directory
root@node1:~# cd /usr/local/src/redis-6.2.7/
root@node1:/usr/local/src/redis-6.2.7# ls
00-RELEASENOTES  CONTRIBUTING  INSTALL    README.md   runtest-cluster    sentinel.conf  TLS.md
BUGS             COPYING       Makefile   redis.conf  runtest-moduleapi  src            utils
CONDUCT          deps          MANIFESTO  runtest     runtest-sentinel   tests
root@node1:/usr/local/src/redis-6.2.7# cp sentinel.conf /apps/redis/etc/redis-sentinel.conf
root@node1:/usr/local/src/redis-6.2.7# cd
root@node1:~# vim /apps/redis/etc/redis-sentinel.conf
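bind 0.0.0.0
port 26379
daemonize no
pidfile /apps/redis/run/redis-sentinel.pid
logfile /apps/redis/log/sentinel.log
dir /tmp
#Keep comments on their own lines; Redis does not accept a trailing comment after a directive
#mymaster is the name of the monitored group; this line gives the address and port of its current master
#The final 2 is the quorum: the number of Sentinels that must agree the master is down before it is marked ODOWN (objectively down)
#It is usually set to more than half of the Sentinels (normally an odd total >= 3, such as 3, 5, 7): with 3 Sentinels, 3/2 = 1.5, rounded up to 2
sentinel monitor mymaster 10.0.0.100 6379 2
#Password of the master in the mymaster group; this line must come after the sentinel monitor line
sentinel auth-pass mymaster wm521314
#Time without a valid reply, in milliseconds, after which a node in the mymaster group is marked SDOWN (subjectively down); 3000 is suggested
sentinel down-after-milliseconds mymaster 3000
#Number of replicas that may resync with the new master at the same time after a failover; a smaller value lengthens the total resync time but reduces the load on the new master
sentinel parallel-syncs mymaster 1
#Failover timeout, in milliseconds, for all replicas to be pointed at the new master
sentinel failover-timeout mymaster 180000
#Disallow changing the script settings at runtime
sentinel deny-scripts-reconfig yes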

root@node1:~# scp /apps/redis/etc/redis-sentinel.conf 10.0.0.101:/apps/redis/etc/
root@node1:~# scp /apps/redis/etc/redis-sentinel.conf 10.0.0.102:/apps/redis/etc/
           

All three Sentinel servers end up with the following configuration:

root@node1:~# egrep -v "^#|^$" /apps/redis/etc/redis-sentinel.conf
bind 0.0.0.0
port 26379
daemonize no
pidfile /apps/redis/run/redis-sentinel.pid
logfile /apps/redis/log/sentinel.log
dir /tmp
sentinel monitor mymaster 10.0.0.100 6379 2
sentinel auth-pass mymaster wm521314
sentinel down-after-milliseconds mymaster 3000
acllog-max-len 128
sentinel parallel-syncs mymaster 1
sentinel failover-timeout mymaster 180000
sentinel deny-scripts-reconfig yes
SENTINEL resolve-hostnames no
SENTINEL announce-hostnames no
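
Since the three files have to be identical, one option is to generate the file from a single template instead of editing it by hand. The sketch below reproduces the directives edited in this section; render_sentinel_conf is a hypothetical helper, and the fixed paths and timeouts are simply the values used in this article.

```shell
#!/bin/sh
# render_sentinel_conf NAME MASTER_IP MASTER_PORT QUORUM PASSWORD
# Print a Sentinel configuration with the settings used in this article.
render_sentinel_conf() {
    cat <<EOF
bind 0.0.0.0
port 26379
daemonize no
pidfile /apps/redis/run/redis-sentinel.pid
logfile /apps/redis/log/sentinel.log
dir /tmp
sentinel monitor $1 $2 $3 $4
sentinel auth-pass $1 $5
sentinel down-after-milliseconds $1 3000
sentinel parallel-syncs $1 1
sentinel failover-timeout $1 180000
sentinel deny-scripts-reconfig yes
EOF
}

render_sentinel_conf mymaster 10.0.0.100 6379 2 wm521314
```

Redirecting the output to /apps/redis/etc/redis-sentinel.conf on each node gives every Sentinel the same starting point. Sentinel rewrites this file at runtime, so it should be generated before the first start only.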
           

1.3.2 Start Sentinel

Start Sentinel on all three servers.

#Redis was built from source here, so create a new service file on every node
root@node1:~# vim /lib/systemd/system/redis-sentinel.service
[Unit]
Description=Redis Sentinel
After=network.target

[Service]
ExecStart=/apps/redis/bin/redis-sentinel /apps/redis/etc/redis-sentinel.conf --supervised systemd
ExecStop=/bin/kill -s QUIT $MAINPID
User=redis
Group=redis
RuntimeDirectory=redis
RuntimeDirectoryMode=0755

[Install]
WantedBy=multi-user.target

root@node1:~# scp /lib/systemd/system/redis-sentinel.service 10.0.0.101:/lib/systemd/system/
root@node1:~# scp /lib/systemd/system/redis-sentinel.service 10.0.0.102:/lib/systemd/system/

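#The configuration file was just copied into the Redis installation directory, so fix the ownership of that directory on every node; otherwise the service will not start
root@node1:~# chown -R redis:redis /apps/redis/
root@node2:~# chown -R redis:redis /apps/redis/
root@node3:~# chown -R redis:redis /apps/redis/

#Reload the systemd unit files on every node, then enable the service at boot and start it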
root@node1:~# systemctl daemon-reload
root@node1:~# systemctl enable --now redis-sentinel.service

#With a source build, Sentinel can also be started directly by running the following on every Sentinel server
root@node1:~# /apps/redis/bin/redis-sentinel /apps/redis/etc/redis-sentinel.conf
           

1.3.3 Verify the Sentinel port

root@node1:~# ss -tln
State       Recv-Q       Send-Q              Local Address:Port              Peer Address:Port       
LISTEN      0            128                       0.0.0.0:42804                  0.0.0.0:*          
LISTEN      0            128                 127.0.0.53%lo:53                     0.0.0.0:*          
LISTEN      0            128                       0.0.0.0:22                     0.0.0.0:*          
LISTEN      0            128                     127.0.0.1:6010                   0.0.0.0:*          
LISTEN      0            64                        0.0.0.0:36864                  0.0.0.0:*          
LISTEN      0            128                       0.0.0.0:24576                  0.0.0.0:*          
LISTEN      0            64                        0.0.0.0:2049                   0.0.0.0:*          
LISTEN      0            128                       0.0.0.0:52456                  0.0.0.0:*          
LISTEN      0            511                       0.0.0.0:26379                  0.0.0.0:*          
LISTEN      0            511                       0.0.0.0:6379                   0.0.0.0:*          
LISTEN      0            128                       0.0.0.0:111                    0.0.0.0:*          
LISTEN      0            128                          [::]:59698                     [::]:*          
LISTEN      0            128                          [::]:22                        [::]:*          
LISTEN      0            128                         [::1]:6010                      [::]:*          
LISTEN      0            128                          [::]:22746                     [::]:*          
LISTEN      0            64                           [::]:2049                      [::]:*          
LISTEN      0            64                           [::]:35426                     [::]:*          
LISTEN      0            128                          [::]:58824                     [::]:*          
LISTEN      0            128                          [::]:111                       [::]:*

root@node2:~# ss -tln
State       Recv-Q       Send-Q              Local Address:Port              Peer Address:Port       
LISTEN      0            511                       0.0.0.0:26379                  0.0.0.0:*          
LISTEN      0            511                       0.0.0.0:6379                   0.0.0.0:*          
LISTEN      0            128                       0.0.0.0:111                    0.0.0.0:*          
LISTEN      0            128                       0.0.0.0:19474                  0.0.0.0:*          
LISTEN      0            64                        0.0.0.0:28020                  0.0.0.0:*          
LISTEN      0            128                 127.0.0.53%lo:53                     0.0.0.0:*          
LISTEN      0            128                       0.0.0.0:22                     0.0.0.0:*          
LISTEN      0            128                       0.0.0.0:29496                  0.0.0.0:*          
LISTEN      0            128                     127.0.0.1:6010                   0.0.0.0:*          
LISTEN      0            64                        0.0.0.0:2049                   0.0.0.0:*          
LISTEN      0            128                       0.0.0.0:53602                  0.0.0.0:*          
LISTEN      0            128                          [::]:57036                     [::]:*          
LISTEN      0            128                          [::]:111                       [::]:*          
LISTEN      0            128                          [::]:53136                     [::]:*          
LISTEN      0            128                          [::]:22                        [::]:*          
LISTEN      0            128                         [::1]:6010                      [::]:*          
LISTEN      0            128                          [::]:11002                     [::]:*          
LISTEN      0            64                           [::]:21372                     [::]:*          
LISTEN      0            64                           [::]:2049                      [::]:*

root@node3:~# ss -tln
State       Recv-Q       Send-Q              Local Address:Port              Peer Address:Port       
LISTEN      0            128                       0.0.0.0:57072                  0.0.0.0:*          
LISTEN      0            128                 127.0.0.53%lo:53                     0.0.0.0:*          
LISTEN      0            128                       0.0.0.0:22                     0.0.0.0:*          
LISTEN      0            128                     127.0.0.1:6010                   0.0.0.0:*          
LISTEN      0            64                        0.0.0.0:2049                   0.0.0.0:*          
LISTEN      0            128                       0.0.0.0:60290                  0.0.0.0:*          
LISTEN      0            64                        0.0.0.0:14852                  0.0.0.0:*          
LISTEN      0            511                       0.0.0.0:26379                  0.0.0.0:*          
LISTEN      0            511                       0.0.0.0:6379                   0.0.0.0:*          
LISTEN      0            128                       0.0.0.0:15822                  0.0.0.0:*          
LISTEN      0            128                       0.0.0.0:111                    0.0.0.0:*          
LISTEN      0            128                          [::]:22                        [::]:*          
LISTEN      0            128                         [::1]:6010                      [::]:*          
LISTEN      0            128                          [::]:32992                     [::]:*          
LISTEN      0            128                          [::]:32096                     [::]:*          
LISTEN      0            64                           [::]:2049                      [::]:*          
LISTEN      0            128                          [::]:42850                     [::]:*          
LISTEN      0            64                           [::]:20716                     [::]:*          
LISTEN      0            128                          [::]:111                       [::]:*                        
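
Scanning the full listing by eye is error-prone, so the check can be narrowed to the one port of interest. A minimal sketch, assuming the ss -tln column layout shown above (state in column 1, local address in column 4); listening_on is a hypothetical helper name.

```shell
#!/bin/sh
# listening_on PORT: read `ss -tln` output on stdin and succeed if some
# socket is in LISTEN state on PORT.
listening_on() {
    awk -v port="$1" '
        $1 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Usage on each node:
#   ss -tln | listening_on 26379 && echo "sentinel port is open"
```

The pattern is anchored on the colon and the end of the field, so a check for 6379 does not match 26379.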
           

1.3.4 Check the Sentinel logs

root@node1:~# tail -f /apps/redis/log/sentinel.log 
1298:X 23 May 2022 22:24:23.240 # Current maximum open files is 4096. maxclients has been reduced to 4064 to compensate for low ulimit. If you need higher maxclients increase 'ulimit -n'.
1298:X 23 May 2022 22:24:23.241 * monotonic clock: POSIX clock_gettime
1298:X 23 May 2022 22:24:23.241 # A key '__redis__compare_helper' was added to Lua globals which is not on the globals allow list nor listed on the deny list.
1298:X 23 May 2022 22:24:23.241 * Running mode=sentinel, port=26379.
1298:X 23 May 2022 22:24:23.251 # Sentinel ID is 919e8fe5e9963e805552e879847c3274ed8a040c
1298:X 23 May 2022 22:24:23.251 # +monitor master mymaster 10.0.0.100 6379 quorum 2
1298:X 23 May 2022 22:24:23.252 * +slave slave 10.0.0.102:6379 10.0.0.102 6379 @ mymaster 10.0.0.100 6379
1298:X 23 May 2022 22:24:23.253 * +slave slave 10.0.0.101:6379 10.0.0.101 6379 @ mymaster 10.0.0.100 6379
1298:X 23 May 2022 22:24:25.253 * +sentinel sentinel c8d2fb34edbf68e35022c915563a4f9693d19410 10.0.0.101 26379 @ mymaster 10.0.0.100 6379
1298:X 23 May 2022 22:24:25.300 * +sentinel sentinel 8a281d471b4d21f123dae1edf58cab8585b95e9c 10.0.0.102 26379 @ mymaster 10.0.0.100 6379
           
root@node2:~# tail -f /apps/redis/log/sentinel.log 
3478:X 23 May 2022 14:24:23.248 # Current maximum open files is 4096. maxclients has been reduced to 4064 to compensate for low ulimit. If you need higher maxclients increase 'ulimit -n'.
3478:X 23 May 2022 14:24:23.248 * monotonic clock: POSIX clock_gettime
3478:X 23 May 2022 14:24:23.248 # A key '__redis__compare_helper' was added to Lua globals which is not on the globals allow list nor listed on the deny list.
3478:X 23 May 2022 14:24:23.248 * Running mode=sentinel, port=26379.
3478:X 23 May 2022 14:24:23.260 # Sentinel ID is c8d2fb34edbf68e35022c915563a4f9693d19410
3478:X 23 May 2022 14:24:23.260 # +monitor master mymaster 10.0.0.100 6379 quorum 2
3478:X 23 May 2022 14:24:23.260 * +slave slave 10.0.0.102:6379 10.0.0.102 6379 @ mymaster 10.0.0.100 6379
3478:X 23 May 2022 14:24:23.261 * +slave slave 10.0.0.101:6379 10.0.0.101 6379 @ mymaster 10.0.0.100 6379
3478:X 23 May 2022 14:24:25.263 * +sentinel sentinel 919e8fe5e9963e805552e879847c3274ed8a040c 10.0.0.100 26379 @ mymaster 10.0.0.100 6379
3478:X 23 May 2022 14:24:25.295 * +sentinel sentinel 8a281d471b4d21f123dae1edf58cab8585b95e9c 10.0.0.102 26379 @ mymaster 10.0.0.100 6379
           
root@node3:~# tail -f /apps/redis/log/sentinel.log 
1466:X 23 May 2022 22:24:23.251 # Current maximum open files is 4096. maxclients has been reduced to 4064 to compensate for low ulimit. If you need higher maxclients increase 'ulimit -n'.
1466:X 23 May 2022 22:24:23.251 * monotonic clock: POSIX clock_gettime
1466:X 23 May 2022 22:24:23.251 # A key '__redis__compare_helper' was added to Lua globals which is not on the globals allow list nor listed on the deny list.
1466:X 23 May 2022 22:24:23.251 * Running mode=sentinel, port=26379.
1466:X 23 May 2022 22:24:23.261 # Sentinel ID is 8a281d471b4d21f123dae1edf58cab8585b95e9c
1466:X 23 May 2022 22:24:23.261 # +monitor master mymaster 10.0.0.100 6379 quorum 2
1466:X 23 May 2022 22:24:23.262 * +slave slave 10.0.0.102:6379 10.0.0.102 6379 @ mymaster 10.0.0.100 6379
1466:X 23 May 2022 22:24:23.262 * +slave slave 10.0.0.101:6379 10.0.0.101 6379 @ mymaster 10.0.0.100 6379
1466:X 23 May 2022 22:24:25.285 * +sentinel sentinel c8d2fb34edbf68e35022c915563a4f9693d19410 10.0.0.101 26379 @ mymaster 10.0.0.100 6379
1466:X 23 May 2022 22:24:25.300 * +sentinel sentinel 919e8fe5e9963e805552e879847c3274ed8a040c 10.0.0.100 26379 @ mymaster 10.0.0.100 6379
           

1.3.5 Current Sentinel status

root@node1:~# redis-cli -p 26379
127.0.0.1:26379> INFO sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=10.0.0.100:6379,slaves=2,sentinels=3
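
The master0 line packs several attributes into one comma-separated value. The sketch below extracts a single attribute from it; master_attr is a hypothetical helper, and it only looks at the first masterN line, which is enough for the single group monitored here.

```shell
#!/bin/sh
# master_attr ATTR: read `INFO sentinel` output on stdin and print the value
# of ATTR (name, status, address, slaves, sentinels) from the first master line.
master_attr() {
    tr -d '\r' | awk -v want="$1" '
        /^master[0-9]+:/ {
            sub(/^master[0-9]+:/, "")
            n = split($0, kv, ",")
            for (i = 1; i <= n; i++) {
                eq = index(kv[i], "=")
                if (substr(kv[i], 1, eq - 1) == want) {
                    print substr(kv[i], eq + 1)
                    exit
                }
            }
        }'
}

# Usage:
#   redis-cli -p 26379 INFO sentinel | master_attr sentinels
```

In this setup a healthy deployment should report status=ok, slaves=2, and sentinels=3, as in the output above.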
           

1.3.6 Verify that data is replicated

root@node1:~# redis-cli -a wm521314
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:6379> set name ZG
OK
127.0.0.1:6379> set age 22
OK

root@node2:~# redis-cli -a wm521314
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:6379> KEYS *
1) "age"
2) "name"
127.0.0.1:6379> get name
"ZG"
127.0.0.1:6379> get age
"22"

root@node3:~# redis-cli -a wm521314
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:6379> KEYS *
1) "age"
2) "name"
127.0.0.1:6379> get name
"ZG"
127.0.0.1:6379> get age
"22"
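
The manual check above can be scripted so that one command compares a key across all nodes. A sketch under stated assumptions: key_in_sync, REDIS_CLI, and REDIS_PASSWORD are illustrative names, the password default is the one used in this article, and --no-auth-warning is the redis-cli option that suppresses the password warning seen in the transcripts.

```shell
#!/bin/sh
REDIS_CLI=${REDIS_CLI:-redis-cli}
REDIS_PASSWORD=${REDIS_PASSWORD:-wm521314}

# key_in_sync KEY HOST...: succeed if GET KEY returns the same value on
# every listed host; report the first host that differs.
key_in_sync() {
    key=$1
    shift
    expected=
    seen=0
    for host in "$@"; do
        value=$($REDIS_CLI -h "$host" -a "$REDIS_PASSWORD" --no-auth-warning GET "$key") || return 2
        if [ "$seen" -eq 0 ]; then
            expected=$value
            seen=1
        elif [ "$value" != "$expected" ]; then
            echo "mismatch on $host: got '$value', expected '$expected'"
            return 1
        fi
    done
    [ "$seen" -eq 1 ]
}

# Usage:
#   key_in_sync name 10.0.0.100 10.0.0.101 10.0.0.102 && echo "in sync"
```

Replication is asynchronous, so a mismatch immediately after a write may simply mean the replica has not caught up yet.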
           

1.3.7 Simulate a master failure and watch the failover

root@node1:~# killall redis-server

#Watch the log: 10.0.0.102 becomes the new master
root@node1:~# tail -f /apps/redis/log/sentinel.log
1298:X 23 May 2022 22:24:23.240 # Current maximum open files is 4096. maxclients has been reduced to 4064 to compensate for low ulimit. If you need higher maxclients increase 'ulimit -n'.
1298:X 23 May 2022 22:24:23.241 * monotonic clock: POSIX clock_gettime
1298:X 23 May 2022 22:24:23.241 # A key '__redis__compare_helper' was added to Lua globals which is not on the globals allow list nor listed on the deny list.
1298:X 23 May 2022 22:24:23.241 * Running mode=sentinel, port=26379.
1298:X 23 May 2022 22:24:23.251 # Sentinel ID is 919e8fe5e9963e805552e879847c3274ed8a040c
1298:X 23 May 2022 22:24:23.251 # +monitor master mymaster 10.0.0.100 6379 quorum 2
1298:X 23 May 2022 22:24:23.252 * +slave slave 10.0.0.102:6379 10.0.0.102 6379 @ mymaster 10.0.0.100 6379
1298:X 23 May 2022 22:24:23.253 * +slave slave 10.0.0.101:6379 10.0.0.101 6379 @ mymaster 10.0.0.100 6379
1298:X 23 May 2022 22:24:25.253 * +sentinel sentinel c8d2fb34edbf68e35022c915563a4f9693d19410 10.0.0.101 26379 @ mymaster 10.0.0.100 6379
1298:X 23 May 2022 22:24:25.300 * +sentinel sentinel 8a281d471b4d21f123dae1edf58cab8585b95e9c 10.0.0.102 26379 @ mymaster 10.0.0.100 6379
1298:X 23 May 2022 22:39:35.874 # +sdown master mymaster 10.0.0.100 6379
1298:X 23 May 2022 22:39:35.876 # +new-epoch 1
1298:X 23 May 2022 22:39:35.878 # +vote-for-leader 8a281d471b4d21f123dae1edf58cab8585b95e9c 1
1298:X 23 May 2022 22:39:35.952 # +odown master mymaster 10.0.0.100 6379 #quorum 3/2
1298:X 23 May 2022 22:39:35.952 # Next failover delay: I will not start a failover before Mon May 23 22:45:36 2022
1298:X 23 May 2022 22:39:36.977 # +config-update-from sentinel 8a281d471b4d21f123dae1edf58cab8585b95e9c 10.0.0.102 26379 @ mymaster 10.0.0.100 6379
1298:X 23 May 2022 22:39:36.977 # +switch-master mymaster 10.0.0.100 6379 10.0.0.102 6379
1298:X 23 May 2022 22:39:36.977 * +slave slave 10.0.0.101:6379 10.0.0.101 6379 @ mymaster 10.0.0.102 6379
1298:X 23 May 2022 22:39:36.977 * +slave slave 10.0.0.100:6379 10.0.0.100 6379 @ mymaster 10.0.0.102 6379
1298:X 23 May 2022 22:39:40.038 # +sdown slave 10.0.0.100:6379 10.0.0.100 6379 @ mymaster 10.0.0.102 6379

#Check the Sentinel information on each node:
root@node1:~# redis-cli -p 26379
127.0.0.1:26379> INFO sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=10.0.0.102:6379,slaves=2,sentinels=3

#After the failover, the Redis configuration file is rewritten automatically
root@node2:~# grep ^replicaof /apps/redis/etc/redis.conf
replicaof 10.0.0.102 6379

#The master address in the sentinel monitor line of the Sentinel configuration file is updated as well
root@node2:~# grep "^[a-zA-Z]" /apps/redis/etc/redis-sentinel.conf
bind 0.0.0.0
port 26379
daemonize no
pidfile "/apps/redis/run/redis-sentinel.pid"
logfile "/apps/redis/log/sentinel.log"
dir "/tmp"
sentinel monitor mymaster 10.0.0.102 6379 2
sentinel auth-pass mymaster wm521314
sentinel down-after-milliseconds mymaster 3000
acllog-max-len 128
sentinel deny-scripts-reconfig yes
sentinel resolve-hostnames no
sentinel announce-hostnames no
protected-mode no
supervised systemd
maxclients 4064
user default on nopass ~* &* +@all
sentinel myid c8d2fb34edbf68e35022c915563a4f9693d19410
sentinel config-epoch mymaster 1
sentinel leader-epoch mymaster 1
sentinel current-epoch 1
sentinel known-replica mymaster 10.0.0.100 6379
sentinel known-replica mymaster 10.0.0.101 6379
sentinel known-sentinel mymaster 10.0.0.102 26379 8a281d471b4d21f123dae1edf58cab8585b95e9c
sentinel known-sentinel mymaster 10.0.0.100 26379 919e8fe5e9963e805552e879847c3274ed8a040c

#Check INFO replication on the Redis nodes now
root@node3:~# redis-cli -a wm521314
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:6379> INFO replication
# Replication
role:master
connected_slaves:1
slave0:ip=10.0.0.101,port=6379,state=online,offset=345542,lag=1
master_failover_state:no-failover
master_replid:9e7e93af69bee27494f2592f2f45ae5475583f8a
master_replid2:9935b9865c7268c1aac772d2ed2a0deea625cdb5
master_repl_offset:345542
second_repl_offset:182386
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:345542

root@node2:~# redis-cli -a wm521314
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:6379> INFO replication
# Replication
role:slave
master_host:10.0.0.102
master_port:6379
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_read_repl_offset:331269
slave_repl_offset:331269
slave_priority:100
slave_read_only:1
replica_announced:1
connected_slaves:0
master_failover_state:no-failover
master_replid:9e7e93af69bee27494f2592f2f45ae5475583f8a
master_replid2:9935b9865c7268c1aac772d2ed2a0deea625cdb5
master_repl_offset:331269
second_repl_offset:182386
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:331269
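
Instead of watching the log by hand, a test script can poll Sentinel until it reports a different master. This is a sketch, not a hardened tool: wait_for_new_master and REDIS_CLI are illustrative names, the group name mymaster and the local Sentinel port 26379 are taken from this article, and the one-second polling interval is an arbitrary choice.

```shell
#!/bin/sh
REDIS_CLI=${REDIS_CLI:-redis-cli}

# wait_for_new_master OLD_IP [TRIES]: poll the local Sentinel once per second
# until it reports a master other than OLD_IP, then print the new IP.
# Fails after TRIES attempts (default 30).
wait_for_new_master() {
    old_ip=$1
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        # The first line of the reply is the master's IP.
        ip=$($REDIS_CLI -p 26379 SENTINEL get-master-addr-by-name mymaster 2>/dev/null | head -n 1)
        if [ -n "$ip" ] && [ "$ip" != "$old_ip" ]; then
            echo "$ip"
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Usage, right after stopping the old master:
#   wait_for_new_master 10.0.0.100
```

With down-after-milliseconds set to 3000, the switch in the run above took only a few seconds after the master was killed, so the default of 30 attempts leaves a wide margin.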
           

1.3.8 Follow the election of the new master in the log

root@node3:~# tail -f /apps/redis/log/sentinel.log
1466:X 23 May 2022 22:24:23.251 # Current maximum open files is 4096. maxclients has been reduced to 4064 to compensate for low ulimit. If you need higher maxclients increase 'ulimit -n'.
1466:X 23 May 2022 22:24:23.251 * monotonic clock: POSIX clock_gettime
1466:X 23 May 2022 22:24:23.251 # A key '__redis__compare_helper' was added to Lua globals which is not on the globals allow list nor listed on the deny list.
1466:X 23 May 2022 22:24:23.251 * Running mode=sentinel, port=26379.
1466:X 23 May 2022 22:24:23.261 # Sentinel ID is 8a281d471b4d21f123dae1edf58cab8585b95e9c
1466:X 23 May 2022 22:24:23.261 # +monitor master mymaster 10.0.0.100 6379 quorum 2
1466:X 23 May 2022 22:24:23.262 * +slave slave 10.0.0.102:6379 10.0.0.102 6379 @ mymaster 10.0.0.100 6379
1466:X 23 May 2022 22:24:23.262 * +slave slave 10.0.0.101:6379 10.0.0.101 6379 @ mymaster 10.0.0.100 6379
1466:X 23 May 2022 22:24:25.285 * +sentinel sentinel c8d2fb34edbf68e35022c915563a4f9693d19410 10.0.0.101 26379 @ mymaster 10.0.0.100 6379
1466:X 23 May 2022 22:24:25.300 * +sentinel sentinel 919e8fe5e9963e805552e879847c3274ed8a040c 10.0.0.100 26379 @ mymaster 10.0.0.100 6379
1466:X 23 May 2022 22:39:35.789 # +sdown master mymaster 10.0.0.100 6379
1466:X 23 May 2022 22:39:35.866 # +odown master mymaster 10.0.0.100 6379 #quorum 2/2
1466:X 23 May 2022 22:39:35.866 # +new-epoch 1
1466:X 23 May 2022 22:39:35.866 # +try-failover master mymaster 10.0.0.100 6379
1466:X 23 May 2022 22:39:35.869 # +vote-for-leader 8a281d471b4d21f123dae1edf58cab8585b95e9c 1
1466:X 23 May 2022 22:39:35.874 # 919e8fe5e9963e805552e879847c3274ed8a040c voted for 8a281d471b4d21f123dae1edf58cab8585b95e9c 1
1466:X 23 May 2022 22:39:35.874 # c8d2fb34edbf68e35022c915563a4f9693d19410 voted for 8a281d471b4d21f123dae1edf58cab8585b95e9c 1
1466:X 23 May 2022 22:39:35.925 # +elected-leader master mymaster 10.0.0.100 6379
1466:X 23 May 2022 22:39:35.925 # +failover-state-select-slave master mymaster 10.0.0.100 6379
1466:X 23 May 2022 22:39:35.992 # +selected-slave slave 10.0.0.102:6379 10.0.0.102 6379 @ mymaster 10.0.0.100 6379
1466:X 23 May 2022 22:39:35.992 * +failover-state-send-slaveof-noone slave 10.0.0.102:6379 10.0.0.102 6379 @ mymaster 10.0.0.100 6379
1466:X 23 May 2022 22:39:36.077 * +failover-state-wait-promotion slave 10.0.0.102:6379 10.0.0.102 6379 @ mymaster 10.0.0.100 6379
1466:X 23 May 2022 22:39:36.889 # +promoted-slave slave 10.0.0.102:6379 10.0.0.102 6379 @ mymaster 10.0.0.100 6379
1466:X 23 May 2022 22:39:36.889 # +failover-state-reconf-slaves master mymaster 10.0.0.100 6379
1466:X 23 May 2022 22:39:36.971 * +slave-reconf-sent slave 10.0.0.101:6379 10.0.0.101 6379 @ mymaster 10.0.0.100 6379
1466:X 23 May 2022 22:39:37.898 * +slave-reconf-inprog slave 10.0.0.101:6379 10.0.0.101 6379 @ mymaster 10.0.0.100 6379
1466:X 23 May 2022 22:39:37.898 * +slave-reconf-done slave 10.0.0.101:6379 10.0.0.101 6379 @ mymaster 10.0.0.100 6379
1466:X 23 May 2022 22:39:37.973 # -odown master mymaster 10.0.0.100 6379
1466:X 23 May 2022 22:39:37.973 # +failover-end master mymaster 10.0.0.100 6379
1466:X 23 May 2022 22:39:37.973 # +switch-master mymaster 10.0.0.100 6379 10.0.0.102 6379
1466:X 23 May 2022 22:39:37.973 * +slave slave 10.0.0.101:6379 10.0.0.101 6379 @ mymaster 10.0.0.102 6379
1466:X 23 May 2022 22:39:37.973 * +slave slave 10.0.0.100:6379 10.0.0.100 6379 @ mymaster 10.0.0.102 6379
1466:X 23 May 2022 22:39:41.041 # +sdown slave 10.0.0.100:6379 10.0.0.100 6379 @ mymaster 10.0.0.102 6379
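
The log above is dense, so it can help to reduce it to the key failover events. The two filters below are a sketch that assumes the log layout shown here (time in the fifth field, event name in the seventh); failover_events and new_master_from_log are hypothetical helper names.

```shell
#!/bin/sh
# failover_events: read a Sentinel log on stdin and print "time event" for
# the main steps of a failover.
failover_events() {
    awk '$7 ~ /^[+-](sdown|odown|try-failover|elected-leader|selected-slave|promoted-slave|failover-end|switch-master)$/ { print $5, $7 }'
}

# new_master_from_log: print "ip port" of the master named in the last
# +switch-master event, if there is one.
new_master_from_log() {
    awk '$7 == "+switch-master" { ip = $11; port = $12 }
         END { if (ip != "") print ip, port }'
}

# Usage:
#   failover_events < /apps/redis/log/sentinel.log
#   new_master_from_log < /apps/redis/log/sentinel.log
```

Read in order, the events follow the sequence described in section 1.2: SDOWN, ODOWN, leader election, replica selection, promotion, and finally the switch to the new master.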
           

1.3.9 Bring the failed former master back into the group

#Restart the Redis service on the former master; it now points at the new master automatically
root@node1:~# systemctl restart redis
root@node1:~# grep ^replicaof /apps/redis/etc/redis.conf
replicaof 10.0.0.102 6379

#Check the status on the former master
root@node1:~# redis-cli -a wm521314
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:6379> INFO replication
# Replication
role:slave
master_host:10.0.0.102
master_port:6379
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_read_repl_offset:427125
slave_repl_offset:427125
slave_priority:100
slave_read_only:1
replica_announced:1
connected_slaves:0
master_failover_state:no-failover
master_replid:9e7e93af69bee27494f2592f2f45ae5475583f8a
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:427125
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:387282
repl_backlog_histlen:39844

root@node1:~# redis-cli -p 26379
127.0.0.1:26379> INFO sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=10.0.0.102:6379,slaves=2,sentinels=3

#Back on the new master, check the status
root@node3:~# redis-cli -a wm521314
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:6379> INFO replication
# Replication
role:master
connected_slaves:2
slave0:ip=10.0.0.101,port=6379,state=online,offset=447650,lag=0
slave1:ip=10.0.0.100,port=6379,state=online,offset=447934,lag=0
master_failover_state:no-failover
master_replid:9e7e93af69bee27494f2592f2f45ae5475583f8a
master_replid2:9935b9865c7268c1aac772d2ed2a0deea625cdb5
master_repl_offset:447934
second_repl_offset:182386
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:447934

#Check the log on the new master
root@node3:~# tail -f /apps/redis/log/sentinel.log
........
1466:X 23 May 2022 22:39:41.041 # +sdown slave 10.0.0.100:6379 10.0.0.100 6379 @ mymaster 10.0.0.102 6379
1466:X 23 May 2022 22:56:29.257 # -sdown slave 10.0.0.100:6379 10.0.0.100 6379 @ mymaster 10.0.0.102 6379
           

One-click Redis installation script

root@node1:~# vim install_redis.sh
#!/bin/bash

SRC_DIR=/usr/local/src
COLOR="echo -e \\033[01;31m"
END='\033[0m'
CPUS=`lscpu | awk '/^CPU\(s\)/{print $2}'`

URL='https://download.redis.io/releases/'
VERSION=redis-6.2.7
PASSWORD=wm521314
INSTALL_DIR=/apps/redis

os() {
    if grep -Eqi "CentOS" /etc/issue || grep -Eq "CentOS" /etc/*-release;then
        rpm -q redhat-lsb-core &> /dev/null || { ${COLOR}"Installing the lsb_release tool"${END};yum -y install redhat-lsb-core &> /dev/null; }
    fi
    OS_ID=`lsb_release -is`
}

install() {
    if [ "${OS_ID}" == "CentOS" ];then
        yum -y install gcc jemalloc-devel || { ${COLOR}"Package installation failed; check the network configuration"${END}; exit 1; }
        rpm -q wget &> /dev/null || yum -y install wget &> /dev/null
    else
        apt -y install make gcc libjemalloc-dev || { ${COLOR}"Package installation failed; check the network configuration"${END}; exit 1; }
    fi

    cd ${SRC_DIR}
    wget ${URL}${VERSION}.tar.gz || { ${COLOR}"Failed to download the Redis source"${END}; exit 1; }
    tar xf ${VERSION}.tar.gz
    cd ${VERSION}
    make -j ${CPUS} PREFIX=${INSTALL_DIR} install && ${COLOR}"Redis was built and installed"${END} || { ${COLOR}"Redis build or installation failed"${END}; exit 1; }
    ln -s ${INSTALL_DIR}/bin/redis-* /usr/bin/
    mkdir -p ${INSTALL_DIR}/{etc,log,data,run}
    cp redis.conf ${INSTALL_DIR}/etc/
    sed -i -e 's/bind 127.0.0.1.*/bind 0.0.0.0/' -e "/# requirepass/a requirepass ${PASSWORD}" -e "/^dir .*/c dir ${INSTALL_DIR}/data/" -e "/logfile .*/c logfile ${INSTALL_DIR}/log/redis-6379.log" -e "/^pidfile .*/c pidfile ${INSTALL_DIR}/run/redis_6379.pid" ${INSTALL_DIR}/etc/redis.conf

    if id redis &> /dev/null ;then
        ${COLOR}"The redis user already exists"${END}
    else
        useradd -r -s /sbin/nologin redis
        ${COLOR}"The redis user was created"${END}
    fi

    chown -R redis:redis ${INSTALL_DIR}

    if [ "${OS_ID}" == "CentOS" ];then
        echo 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' >> /etc/rc.d/rc.local
        chmod +x /etc/rc.d/rc.local
    else
        cat >> /lib/systemd/system/rc-local.service <<-EOF
[Install]
WantedBy=multi-user.target
EOF
        echo '#!/bin/bash' > /etc/rc.local
        echo 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' >> /etc/rc.local
        chmod +x /etc/rc.local
    fi
    
    cat > /lib/systemd/system/redis.service <<-EOF
[Unit]
Description=Redis persistent key-value database
After=network.target

[Service]
ExecStart=${INSTALL_DIR}/bin/redis-server ${INSTALL_DIR}/etc/redis.conf --supervised systemd
ExecStop=/bin/kill -s QUIT \$MAINPID
#Type=notify
User=redis
Group=redis
RuntimeDirectory=redis
RuntimeDirectoryMode=0755

[Install]
WantedBy=multi-user.target
EOF
    systemctl daemon-reload
    systemctl enable --now redis &> /dev/null && ${COLOR}"Redis started; server information follows:"${END} || { ${COLOR}"Redis failed to start"${END};exit 1; }
    sleep 1
    redis-cli -a ${PASSWORD} INFO Server 2> /dev/null
}

main() {
    os
    install
}

main