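I. Concepts
1. Purpose of Sentinel mode
Redis master-slave replication alone does not provide high availability, so Redis implements the Sentinel mechanism to address this.
2. What Sentinel mode is
One or more Sentinels monitor any number of masters along with all the slaves of each master. When a monitored master goes offline, Sentinel automatically promotes one of that master's slaves to be the new master, which then takes over from the offline master. Sentinels also monitor one another.
3. How Sentinel mode works
(1) Each Sentinel process sends a PING command once per second to the master, the slaves, and the other Sentinel processes in the cluster.
(2) If the time since an instance's last valid reply to PING exceeds the value of the down-after-milliseconds option, the Sentinel process marks that instance as subjectively down (SDOWN).
(3) If a master is marked as subjectively down (SDOWN), all Sentinel processes monitoring that master check once per second whether it has indeed entered the subjectively down state.
(4) When enough Sentinel processes (at least the quorum set in the configuration file) confirm within the specified time window that the master is subjectively down (SDOWN), the master is marked as objectively down (ODOWN).
(5) Under normal conditions, each Sentinel process sends an INFO command to all masters and slaves in the cluster once every 10 seconds.
(6) When a master is marked as objectively down (ODOWN) by a Sentinel process, the Sentinel processes send INFO to all slaves of that master once per second instead of once every 10 seconds.
(7) If not enough Sentinel processes agree that the master is down, the master's objectively down state is removed. If the master again returns a valid reply to a PING from a Sentinel process, its subjectively down state is removed.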
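The SDOWN and ODOWN rules in steps (2), (4), and (7) can be sketched in a few lines of Python. This is only an illustrative model, not Sentinel's actual implementation: the `SentinelView` class and the function names are hypothetical, and the sample values are made up.

```python
from dataclasses import dataclass


@dataclass
class SentinelView:
    """One Sentinel's view of the master (hypothetical structure)."""
    name: str
    ms_since_last_valid_reply: int


def is_sdown(view, down_after_ms):
    """Subjectively down: no valid PING reply for longer than down-after-milliseconds."""
    return view.ms_since_last_valid_reply > down_after_ms


def is_odown(views, down_after_ms, quorum):
    """Objectively down: at least `quorum` Sentinels see the master as SDOWN."""
    votes = sum(1 for v in views if is_sdown(v, down_after_ms))
    return votes >= quorum


# Made-up sample: two of three Sentinels have not heard from the master for over 30 s.
views = [
    SentinelView("sentinel-1", 31000),
    SentinelView("sentinel-2", 45000),
    SentinelView("sentinel-3", 200),
]
print(is_odown(views, down_after_ms=30000, quorum=2))
print(is_odown(views, down_after_ms=30000, quorum=3))
```

With a quorum of 2 the master counts as objectively down; with a quorum of 3 it does not, because one Sentinel still receives valid replies.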
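4. Advantages of Sentinel mode
(1) Sentinel mode is built on master-slave replication, so it has all the advantages of replication.
(2) Master and slave switch over automatically, which makes the system more robust and more available.
5. Disadvantages of Sentinel mode
Redis is hard to scale out online in this mode; once the cluster reaches its capacity limit, online expansion becomes very complicated.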
II. Building a single-Sentinel Redis cluster
1. Nodes
master (192.168.230.21): initial master
slaves1 (192.168.230.22): initial slave
slaves2 (192.168.230.23): initial slave
2. Main settings in the master's redis.conf
daemonize yes
port 6379
bind 192.168.230.21
requirepass "123456"
3. Main settings in the master's sentinel.conf
port 26379
sentinel monitor mymaster 192.168.230.21 6379 1
sentinel auth-pass mymaster 123456
sentinel down-after-milliseconds mymaster 30000
sentinel parallel-syncs mymaster 1
sentinel failover-timeout mymaster 180000
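Notes:
sentinel monitor master-group-name hostname port quorum:
quorum is the minimum number of Sentinels that must agree that the master is unreachable before it is marked as objectively down and a failover can be started.
down-after-milliseconds:
If a Redis instance has been unreachable for longer than this many milliseconds, the Sentinel may consider that instance down.
parallel-syncs:
After a new master has been promoted, this is the number of slaves that are reconfigured at the same time to connect to the new master and resynchronize. The lower the number, the longer the whole process takes. Suppose Redis has 1 master and 4 slaves and the master goes down: one of the 4 slaves is promoted to master, and the remaining 3 slaves must attach to the new master. In that case:
If parallel-syncs is 1, the 3 slaves attach to the new master one at a time; each one attaches and finishes syncing data from the new master before the next one starts;
If parallel-syncs is 3, all the slaves attach to the new master at once.
failover-timeout:
The timeout, in milliseconds, for performing a failover.
sentinel auth-pass:
Sets the password that Sentinel uses to authenticate with the master (the master's requirepass value).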
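The parallel-syncs example in the notes above (3 remaining slaves, a value of 1 versus 3) can be approximated with a small Python sketch. It is a simplification under stated assumptions: the function name and slave names are hypothetical, and a real Sentinel keeps up to parallel-syncs slaves reconfiguring at any moment rather than waiting for a whole group to finish.

```python
def resync_batches(slaves, parallel_syncs):
    """Split the remaining slaves into groups of at most `parallel_syncs`.

    Simplified model: each group is assumed to finish syncing from the
    new master before the next group starts.
    """
    if parallel_syncs < 1:
        raise ValueError("parallel_syncs must be at least 1")
    return [slaves[i:i + parallel_syncs]
            for i in range(0, len(slaves), parallel_syncs)]


remaining = ["slave-a", "slave-b", "slave-c"]  # hypothetical names
for n in (1, 3):
    print(n, resync_batches(remaining, n))
```

A value of 1 yields three rounds of one slave each, so fewer slaves are busy resynchronizing at once but the whole switch takes longer; a value of 3 reconfigures all slaves in a single round.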
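4. Main settings in slaves1's redis.conf
daemonize yes
port 6379
bind 192.168.230.22
requirepass 123456
# Host and port of the master to replicate from (the hostname "master" must resolve to 192.168.230.21)
slaveof master 6379
# Password used to authenticate with the master
masterauth 123456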
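5. Main settings in slaves2's redis.conf
daemonize yes
port 6379
bind 192.168.230.23
requirepass 123456
# Host and port of the master to replicate from (the hostname "master" must resolve to 192.168.230.21)
slaveof master 6379
# Password used to authenticate with the master
masterauth 123456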
6. Start Redis and Sentinel on master
[[email protected] bin]# ./redis-server ./redis.conf
[[email protected] bin]# ./redis-cli -h 192.168.230.21 -a 123456
[[email protected] bin]# ./redis-sentinel /opt/softWare/redis3.0/redis-3.0.0/sentinel.conf
7325:X 27 Mar 15:24:00.102 * Increased maximum number of open files to 10032 (it was originally set to 1024).
                _._
           _.-``__ ''-._
      _.-``    `.  `_.  ''-._           Redis 3.0.0 (00000000/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._
 (    '      ,       .-`  | `,    )     Running in sentinel mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 26379
 |    `-._   `._    /     _.-'    |     PID: 7325
  `-._    `-._  `-./  _.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |           http://redis.io
  `-._    `-._`-.__.-'_.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |
  `-._    `-._`-.__.-'_.-'    _.-'
      `-._    `-.__.-'    _.-'
          `-._        _.-'
              `-.__.-'
7325:X 27 Mar 15:24:00.103 # Sentinel runid is 64976baecdc6b9f5d39ee3de67ea124ba8441551
7325:X 27 Mar 15:24:00.103 # +monitor master mymaster 192.168.230.21 6379 quorum 1
7. Start the Redis slave service on slaves1 and slaves2
[[email protected] bin]# ./redis-server ./redis.conf
[[email protected] bin]# ./redis-cli -h 192.168.230.22 -a 123456
[[email protected] bin]# ./redis-server ./redis.conf
[[email protected] bin]# ./redis-cli -h 192.168.230.23 -a 123456
8. Verify the roles
192.168.230.21:6379> info replication
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.230.22,port=6379,state=online,offset=9381,lag=1
slave1:ip=192.168.230.23,port=6379,state=online,offset=9381,lag=1
master_repl_offset:9524
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:9523
192.168.230.22:6379> info replication
# Replication
role:slave
master_host:192.168.230.21
master_port:6379
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:10253
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
192.168.230.23:6379> info replication
# Replication
role:slave
master_host:192.168.230.21
master_port:6379
master_link_status:up
master_last_io_seconds_ago:2
master_sync_in_progress:0
slave_repl_offset:10696
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
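The role check above can also be done programmatically by parsing the text of `info replication`. The following Python sketch assumes the output has already been captured as a string; the helper names `parse_info` and `online_slaves` are hypothetical, and the sample is an excerpt of the master output shown in this step.

```python
def parse_info(text):
    """Turn 'key:value' lines of INFO output into a dict, skipping '# Section' headers."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        info[key] = value
    return info


def online_slaves(info):
    """Return the IPs of slaveN entries whose state is 'online'."""
    ips = []
    for key, value in info.items():
        # Match slave0, slave1, ... but not slave_priority or slave_read_only.
        if key.startswith("slave") and key[5:].isdigit():
            fields = dict(item.split("=", 1) for item in value.split(","))
            if fields.get("state") == "online":
                ips.append(fields["ip"])
    return ips


master_output = """
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.230.22,port=6379,state=online,offset=9381,lag=1
slave1:ip=192.168.230.23,port=6379,state=online,offset=9381,lag=1
"""
info = parse_info(master_output)
print(info["role"], online_slaves(info))
```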
9. Simulate a failure of the Redis service on slaves1
[[email protected] ~]# netstat -lnp | grep 6379
tcp 0 0 192.168.230.22:6379 0.0.0.0:* LISTEN 6658/./redis-server
[[email protected] ~]# kill -9 6658
[[email protected] ~]# netstat -lnp | grep 6379
Sentinel output:
7325:X 27 Mar 15:32:09.916 # +sdown slave 192.168.230.22:6379 192.168.230.22 6379 @ mymaster 192.168.230.21 6379
At this point:
192.168.230.21:6379> info replication
# Replication
role:master
connected_slaves:1
slave0:ip=192.168.230.23,port=6379,state=online,offset=32280,lag=0
master_repl_offset:32280
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:32279
192.168.230.23:6379> info replication
# Replication
role:slave
master_host:192.168.230.21
master_port:6379
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:33881
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
10. Restore the Redis service on slaves1
[[email protected] bin]# ./redis-server ./redis.conf
[[email protected] bin]# ./redis-cli -h 192.168.230.22 -a 123456
Sentinel output:
7325:X 27 Mar 15:34:16.091 * +reboot slave 192.168.230.22:6379 192.168.230.22 6379 @ mymaster 192.168.230.21 6379
7325:X 27 Mar 15:34:16.162 # -sdown slave 192.168.230.22:6379 192.168.230.22 6379 @ mymaster 192.168.230.21 6379
At this point:
192.168.230.21:6379> info replication
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.230.23,port=6379,state=online,offset=40622,lag=0
slave1:ip=192.168.230.22,port=6379,state=online,offset=40622,lag=0
master_repl_offset:40765
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:40764
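The Sentinel log lines from steps 9 and 10 carry timestamps, so the time slaves1 spent flagged as SDOWN can be computed from them. The sketch below is a rough illustration in Python; the helper name `event_time` is hypothetical, and it assumes the log prefix format seen in this output (`pid:role day month time`).

```python
from datetime import datetime


def event_time(line):
    """Extract the timestamp from a Sentinel log line.

    The year is absent from the log, so only differences between
    timestamps from the same day are meaningful here.
    """
    parts = line.split()
    stamp = " ".join(parts[1:4])  # e.g. '27 Mar 15:32:09.916'
    return datetime.strptime(stamp, "%d %b %H:%M:%S.%f")


sdown = "7325:X 27 Mar 15:32:09.916 # +sdown slave 192.168.230.22:6379 192.168.230.22 6379 @ mymaster 192.168.230.21 6379"
recovered = "7325:X 27 Mar 15:34:16.162 # -sdown slave 192.168.230.22:6379 192.168.230.22 6379 @ mymaster 192.168.230.21 6379"

flagged_seconds = (event_time(recovered) - event_time(sdown)).total_seconds()
print(f"flagged SDOWN for {flagged_seconds:.3f} s")
```

The result is roughly 126 seconds. This measures only how long the instance stayed flagged: +sdown is logged after down-after-milliseconds (30000 here) has elapsed, so the real outage began earlier.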
11. Simulate a failure of the Redis master service on master
[[email protected] ~]# netstat -lnp | grep 6379
tcp 0 0 0.0.0.0:26379 0.0.0.0:* LISTEN 7325/./redis-sentin
tcp 0 0 192.168.230.21:6379 0.0.0.0:* LISTEN 7294/./redis-server
tcp6 0 0 :::26379 :::* LISTEN 7325/./redis-sentin
[[email protected] ~]# kill -9 7294
[[email protected] ~]# netstat -lnp | grep 6379
tcp 0 0 0.0.0.0:26379 0.0.0.0:* LISTEN 7325/./redis-sentin
tcp6 0 0 :::26379 :::* LISTEN 7325/./redis-sentin
Sentinel output:
7325:X 27 Mar 15:36:45.697 # +sdown master mymaster 192.168.230.21 6379
7325:X 27 Mar 15:36:45.697 # +odown master mymaster 192.168.230.21 6379 #quorum 1/1
7325:X 27 Mar 15:36:45.697 # +new-epoch 1
7325:X 27 Mar 15:36:45.697 # +try-failover master mymaster 192.168.230.21 6379
7325:X 27 Mar 15:36:45.698 # +vote-for-leader 64976baecdc6b9f5d39ee3de67ea124ba8441551 1
7325:X 27 Mar 15:36:45.698 # +elected-leader master mymaster 192.168.230.21 6379
7325:X 27 Mar 15:36:45.698 # +failover-state-select-slave master mymaster 192.168.230.21 6379
7325:X 27 Mar 15:36:45.789 # +selected-slave slave 192.168.230.22:6379 192.168.230.22 6379 @ mymaster 192.168.230.21 6379
7325:X 27 Mar 15:36:45.789 * +failover-state-send-slaveof-noone slave 192.168.230.22:6379 192.168.230.22 6379 @ mymaster 192.168.230.21 6379
7325:X 27 Mar 15:36:45.866 * +failover-state-wait-promotion slave 192.168.230.22:6379 192.168.230.22 6379 @ mymaster 192.168.230.21 6379
7325:X 27 Mar 15:36:46.735 # +promoted-slave slave 192.168.230.22:6379 192.168.230.22 6379 @ mymaster 192.168.230.21 6379
7325:X 27 Mar 15:36:46.735 # +failover-state-reconf-slaves master mymaster 192.168.230.21 6379
7325:X 27 Mar 15:36:46.794 * +slave-reconf-sent slave 192.168.230.23:6379 192.168.230.23 6379 @ mymaster 192.168.230.21 6379
7325:X 27 Mar 15:36:47.785 * +slave-reconf-inprog slave 192.168.230.23:6379 192.168.230.23 6379 @ mymaster 192.168.230.21 6379
7325:X 27 Mar 15:36:47.785 * +slave-reconf-done slave 192.168.230.23:6379 192.168.230.23 6379 @ mymaster 192.168.230.21 6379
7325:X 27 Mar 15:36:47.884 # +failover-end master mymaster 192.168.230.21 6379
7325:X 27 Mar 15:36:47.884 # +switch-master mymaster 192.168.230.21 6379 192.168.230.22 6379
7325:X 27 Mar 15:36:47.885 * +slave slave 192.168.230.23:6379 192.168.230.23 6379 @ mymaster 192.168.230.22 6379
7325:X 27 Mar 15:36:47.888 * +slave slave 192.168.230.21:6379 192.168.230.21 6379 @ mymaster 192.168.230.22 6379
7325:X 27 Mar 15:37:17.925 # +sdown slave 192.168.230.21:6379 192.168.230.21 6379 @ mymaster 192.168.230.22 6379
The log output shows that the master went down and that the slave at 192.168.230.22 was promoted to be the new master.
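A script watching the Sentinel log could pick up the new master address from the +switch-master line, whose fields are the master name, the old IP and port, and the new IP and port. The parser below is a minimal Python sketch; `parse_switch_master` is a hypothetical helper, not part of Redis.

```python
def parse_switch_master(line):
    """Return the old and new master addresses from a +switch-master log line.

    Returns None when the line is not a well-formed +switch-master event.
    """
    marker = "+switch-master "
    idx = line.find(marker)
    if idx == -1:
        return None
    fields = line[idx + len(marker):].split()
    if len(fields) != 5:
        return None
    name, old_ip, old_port, new_ip, new_port = fields
    return {
        "name": name,
        "old": (old_ip, int(old_port)),
        "new": (new_ip, int(new_port)),
    }


log_line = "7325:X 27 Mar 15:36:47.884 # +switch-master mymaster 192.168.230.21 6379 192.168.230.22 6379"
event = parse_switch_master(log_line)
print(event)
```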
At this point:
192.168.230.22:6379> info replication
# Replication
role:master
connected_slaves:1
slave0:ip=192.168.230.23,port=6379,state=online,offset=2068,lag=1
master_repl_offset:2068
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:2067
192.168.230.23:6379> info replication
# Replication
role:slave
master_host:192.168.230.22
master_port:6379
master_link_status:up
master_last_io_seconds_ago:2
master_sync_in_progress:0
slave_repl_offset:8929
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
12. Start the Redis service on master again
[[email protected] bin]# ./redis-server ./redis.conf
[[email protected] bin]# ./redis-cli -h 192.168.230.21 -a 123456
Sentinel output:
7325:X 27 Mar 15:40:01.285 * +convert-to-slave slave 192.168.230.21:6379 192.168.230.21 6379 @ mymaster 192.168.230.22 6379
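The Redis service on master has now become a slave. Note that master_link_status is down below and that 192.168.230.22 still reports only one connected slave; the most likely cause is that the master's redis.conf in step 2 sets no masterauth, so this node cannot authenticate against the new master: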
192.168.230.21:6379> info replication
# Replication
role:slave
master_host:192.168.230.22
master_port:6379
master_link_status:down
master_last_io_seconds_ago:-1
master_sync_in_progress:0
slave_repl_offset:1
master_link_down_since_seconds:1585294864
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
192.168.230.22:6379> info replication
# Replication
role:master
connected_slaves:1
slave0:ip=192.168.230.23,port=6379,state=online,offset=21765,lag=0
master_repl_offset:21908
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:21907
192.168.230.23:6379> info replication
# Replication
role:slave
master_host:192.168.230.22
master_port:6379
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:23523
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
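The three outputs above can be cross-checked automatically: there should be exactly one master, and every slave should report master_link_status:up. The following Python sketch assumes the `info replication` text of each node has been collected into a dict keyed by address; `read_fields` and `check_topology` are hypothetical helpers, and the sample strings are trimmed from the outputs in this step.

```python
def read_fields(info_text):
    """Parse 'key:value' lines of INFO output, skipping '# Section' headers."""
    fields = {}
    for line in info_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and ":" in line:
            key, value = line.split(":", 1)
            fields[key] = value
    return fields


def check_topology(outputs):
    """Return (addresses reporting role master, slaves whose link is not up)."""
    parsed = {addr: read_fields(text) for addr, text in outputs.items()}
    masters = [a for a, f in parsed.items() if f.get("role") == "master"]
    broken = [a for a, f in parsed.items()
              if f.get("role") == "slave" and f.get("master_link_status") != "up"]
    return masters, broken


outputs = {
    "192.168.230.21": "role:slave\nmaster_host:192.168.230.22\nmaster_link_status:down",
    "192.168.230.22": "role:master\nconnected_slaves:1",
    "192.168.230.23": "role:slave\nmaster_host:192.168.230.22\nmaster_link_status:up",
}
masters, broken = check_topology(outputs)
print("masters:", masters)
print("slaves with link down:", broken)
```

For the state shown in step 12, this reports 192.168.230.22 as the only master and flags 192.168.230.21 as a slave whose replication link is down.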