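Limitations of master-slave replication
Manual failover
- When the master crashes, the Redis service becomes unavailable
- Data synchronization to the slaves is interrupted
- Failover has to be done by hand:
  - Pick one slave and run slaveof no one on it so that it becomes the master
  - On every other slave, run slaveof <new-master-ip> <new-master-port> so that it replicates from the new master
- For clients that call the Redis service, it is hard to make them notice that the master has changed and react accordingly.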
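The manual procedure above can be written down as a list of redis-cli invocations. The following is only a sketch: the class name ManualFailoverPlan and its method are hypothetical helpers introduced for illustration, and the code only builds the command strings, it does not execute them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: lists the commands an operator would type during a manual failover.
class ManualFailoverPlan {
    static List<String> commands(String host, int promotedPort, List<Integer> otherSlavePorts) {
        List<String> cmds = new ArrayList<>();
        // Step 1: promote the chosen slave to master.
        cmds.add("redis-cli -h " + host + " -p " + promotedPort + " slaveof no one");
        // Step 2: point every remaining slave at the new master.
        for (int port : otherSlavePorts) {
            cmds.add("redis-cli -h " + host + " -p " + port
                    + " slaveof " + host + " " + promotedPort);
        }
        return cmds;
    }

    public static void main(String[] args) {
        // Example: 7000 has died, promote 7001 and repoint 7002.
        for (String cmd : commands("127.0.0.1", 7001, Arrays.asList(7002))) {
            System.out.println(cmd);
        }
    }
}
```

Even with the commands scripted, clients still have to be told about the new master address, which is the gap Sentinel fills.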
Limited write and storage capacity
- Only the master node can accept writes, and the data set is limited to what a single node can store.
Redis Sentinel
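Unless stated otherwise, everything in this article assumes Redis 5.0.

One set of sentinels can monitor several master-slave deployments; each deployment is identified by its configured master name.
Installation and configuration
Installing and configuring redis-server
- Create a Redis configuration file redis-7000.conf with a minimal configuration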
port 7000
daemonize yes
pidfile /usr/local/software/redis/data/redis-7000.pid
logfile "/usr/local/software/redis/data/7000.log"
dir "/usr/local/software/redis/data"
Use sed to quickly generate the configuration files for the slave nodes
sed "s/7000/7001/g" redis-7000.conf > redis-7001.conf
sed "s/7000/7002/g" redis-7000.conf > redis-7002.conf
# Configure the master-slave relationship
echo "slaveof 127.0.0.1 7000" >> redis-7001.conf
echo "slaveof 127.0.0.1 7000" >> redis-7002.conf
# Start the Redis nodes on ports 7000, 7001 and 7002
./redis-server redis-7000.conf
./redis-server redis-7001.conf
./redis-server redis-7002.conf
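- Connect to the redis-7000 service with a client
The output shows that the Redis instance on port 7000 is currently the master. It has two slaves, on ports 7001 and 7002, and both are online.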
redis-cli -p 7000
127.0.0.1:7000> info replication
# Replication
role:master
connected_slaves:2
slave0:ip=ip,port=7001,state=online,offset=789019,lag=0
slave1:ip=ip,port=7002,state=online,offset=789019,lag=0
master_replid:e27b673924f62d27605f5d095924ec5c287ced02
master_replid2:bc30e94bea8b34c0b5ba9f815f316edd8a05aa33
master_repl_offset:789019
second_repl_offset:265200
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:789019
- Connect to the redis-7001 service with a client
Here we can clearly see that 7001 is a slave, and that its master's host and port are those of our Redis node on port 7000.
./redis-cli -p 7001
127.0.0.1:7001> info replication
# Replication
role:slave
master_host:ip
master_port:7000
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:919032
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:e27b673924f62d27605f5d095924ec5c287ced02
master_replid2:bc30e94bea8b34c0b5ba9f815f316edd8a05aa33
master_repl_offset:919032
second_repl_offset:265200
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:919032
127.0.0.1:7001>
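Checking the role and link status by eye works for three nodes, but the same check is easy to script. The sketch below parses the key:value lines of info replication output; the class ReplicationInfo and the method isHealthySlave are hypothetical names introduced here, and the health rule (role is slave and master_link_status is up) is an assumption made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for reading the text returned by "info replication".
class ReplicationInfo {
    // Parses "key:value" lines; blank lines and "# Section" headers are skipped.
    static Map<String, String> parse(String infoOutput) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String rawLine : infoOutput.split("\\r?\\n")) {
            String line = rawLine.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            // Split on the first colon only, so values such as
            // "ip=ip,port=7001,state=online" stay intact.
            int colon = line.indexOf(':');
            if (colon > 0) {
                fields.put(line.substring(0, colon), line.substring(colon + 1));
            }
        }
        return fields;
    }

    // Assumed rule: a slave is healthy when its link to the master is up.
    static boolean isHealthySlave(Map<String, String> fields) {
        return "slave".equals(fields.get("role"))
                && "up".equals(fields.get("master_link_status"));
    }
}
```

Feeding it the 7001 output above would yield role=slave and master_port=7000, matching what we read off the console.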
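Installing and configuring redis-sentinel
- Configure sentinel to monitor the master node (a sentinel is a special kind of Redis process)
Filter out comments and blank lines to obtain a minimal sentinel configuration
cat sentinel.conf | grep -v "#" | grep -v "^$" > redis-sentinel-26379.conf
Run it in the background, and set the log file and working directory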
port 26379
daemonize yes
pidfile /var/run/redis-sentinel-26379.pid
logfile "26379.log"
dir /usr/local/software/redis/data/
sentinel monitor mymaster 127.0.0.1 7000 2
sentinel down-after-milliseconds mymaster 30000
sentinel parallel-syncs mymaster 1
sentinel failover-timeout mymaster 180000
sentinel deny-scripts-reconfig yes
# In the same way, create the configuration files for ports 26380 and 26381, then start all three with the redis-sentinel command.
./redis-sentinel redis-sentinel-26379.conf
./redis-sentinel redis-sentinel-26380.conf
./redis-sentinel redis-sentinel-26381.conf
Connect to the sentinel on port 26379
127.0.0.1:26379> info Sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=ip:7002,slaves=2,sentinels=3
127.0.0.1:26379>
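- Deploy across multiple machines to ensure high availability
Do not put the master and its slaves on the same machine, and do not put redis-sentinel on the same machine as the Redis service it monitors; otherwise one machine failure can take out both.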
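Failover
Failover (automatic)
- Several sentinels detect and confirm that the master has failed
- One sentinel is elected leader
- The leader selects one slave to become the new master
- The remaining slaves are told to become slaves of the new master
- Clients are notified of the master-slave change
- Sentinel waits for the old master to come back and makes it a slave of the new master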
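The "clients are notified" step relies on clients asking a sentinel, rather than a fixed address, where the master is. A sentinel answers the command SENTINEL get-master-addr-by-name <master-name> with the current host and port. The sketch below only shows how that command and its reply look on the wire (RESP); the class SentinelMasterLookup is a hypothetical name, and a real client would send these bytes over a TCP connection to a sentinel port such as 26379.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper showing the wire format of a master lookup.
class SentinelMasterLookup {
    // Encodes a command as a RESP array of bulk strings.
    static String encode(String... parts) {
        StringBuilder sb = new StringBuilder("*").append(parts.length).append("\r\n");
        for (String part : parts) {
            int byteLength = part.getBytes(StandardCharsets.UTF_8).length;
            sb.append('$').append(byteLength).append("\r\n").append(part).append("\r\n");
        }
        return sb.toString();
    }

    // Parses a two-element array reply (host, port) into "host:port".
    // Returns null for anything else, e.g. a null reply for an unknown master name.
    static String parseAddress(String reply) {
        String[] lines = reply.split("\r\n");
        if (lines.length < 5 || !lines[0].equals("*2")) {
            return null;
        }
        return lines[2] + ":" + lines[4];
    }

    public static void main(String[] args) {
        System.out.print(encode("SENTINEL", "get-master-addr-by-name", "mymaster"));
    }
}
```

Client libraries with sentinel support perform this lookup for you, so application code normally never builds these frames by hand.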
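A small failover experiment
After starting the three Redis services on 7000 (master), 7001 (slave) and 7002 (slave) and the three sentinels on 26379, 26380 and 26381, create a Java project named JedisTest.
(1) Test code: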
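import java.util.HashSet;
import java.util.Random;
import java.util.Set;
import java.util.concurrent.TimeUnit;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisSentinelPool;

public class RedisSentinelFailOverTest {
    // Assumes SLF4J is on the classpath; the original snippet only shows a "log" variable.
    private static final Logger log = LoggerFactory.getLogger(RedisSentinelFailOverTest.class);

    public static void main(String[] args) {
        String masterName = "mymaster";
        // Addresses of the three sentinels; replace "ip" with the real host.
        Set<String> sentinelSet = new HashSet<String>();
        sentinelSet.add("ip:26379");
        sentinelSet.add("ip:26380");
        sentinelSet.add("ip:26381");
        JedisSentinelPool jedisSentinelPool = new JedisSentinelPool(masterName, sentinelSet);
        Random random = new Random();
        int counter = 0;
        while (true) {
            Jedis jedis = null;
            try {
                counter++;
                // The pool hands out connections to whichever node is currently the master.
                jedis = jedisSentinelPool.getResource();
                int index = random.nextInt(100000);
                String key = "k-" + index;
                String value = "v-" + index;
                jedis.set(key, value);
                // Log only every 100th write to keep the console readable.
                if (counter % 100 == 0) {
                    log.info("info,key={},value={}", key, value);
                }
                TimeUnit.MILLISECONDS.sleep(10);
            } catch (Exception e) {
                // Connection errors are expected while the failover is in progress; keep looping.
                e.printStackTrace();
            } finally {
                if (jedis != null) {
                    jedis.close(); // return the connection to the pool
                }
            }
        }
    }
}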
(2) The test output looks like this:
18:56:43.431 [main] INFO com.gy.redisTest.RedisSentinelFailOverTest - info,key=k-22950,value=v-22950
18:56:44.919 [main] INFO com.gy.redisTest.RedisSentinelFailOverTest - info,key=k-98952,value=v-98952
18:56:46.423 [main] INFO com.gy.redisTest.RedisSentinelFailOverTest - info,key=k-43036,value=v-43036
18:56:47.928 [main] INFO com.gy.redisTest.RedisSentinelFailOverTest - info,key=k-92698,value=v-92698
18:56:49.401 [main] INFO com.gy.redisTest.RedisSentinelFailOverTest - info,key=k-32185,value=v-32185
18:56:50.887 [main] INFO com.gy.redisTest.RedisSentinelFailOverTest - info,key=k-66828,value=v-66828
18:56:52.370 [main] INFO com.gy.redisTest.RedisSentinelFailOverTest - info,key=k-55874,value=v-55874
18:56:53.848 [main] INFO com.gy.redisTest.RedisSentinelFailOverTest - info,key=k-5782,value=v-5782
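(3) Simulate a master crash with kill -9 <pid>, where <pid> is the process ID of the redis-server listening on port 7000
The console then shows: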
redis.clients.jedis.exceptions.JedisConnectionException: Unexpected end of stream.
at redis.clients.util.RedisInputStream.ensureFill(RedisInputStream.java:199)
at redis.clients.util.RedisInputStream.readByte(RedisInputStream.java:40)
at redis.clients.jedis.Protocol.process(Protocol.java:151)
at redis.clients.jedis.Protocol.read(Protocol.java:215)
at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:340)
at redis.clients.jedis.Connection.getStatusCodeReply(Connection.java:239)
at redis.clients.jedis.Jedis.set(Jedis.java:121)
at com.gy.redisTest.RedisSentinelFailOverTest.main(RedisSentinelFailOverTest.java:36)
redis.clients.jedis.exceptions.JedisConnectionException: Could not get a resource from the pool
at redis.clients.util.Pool.getResource(Pool.java:53)
at redis.clients.jedis.JedisSentinelPool.getResource(JedisSentinelPool.java:209)
at com.gy.redisTest.RedisSentinelFailOverTest.main(RedisSentinelFailOverTest.java:32)
Caused by: redis.clients.jedis.exceptions.JedisConnectionException: java.net.ConnectException: Connection refused: connect
at redis.clients.jedis.Connection.connect(Connection.java:207)
at redis.clients.jedis.BinaryClient.connect(BinaryClient.java:93)
at redis.clients.jedis.BinaryJedis.connect(BinaryJedis.java:1767)
at redis.clients.jedis.JedisFactory.makeObject(JedisFactory.java:106)
at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:868)
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:435)
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:363)
at redis.clients.util.Pool.getResource(Pool.java:49)
... 2 more
Caused by: java.net.ConnectException: Connection refused: connect
at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:85)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at redis.clients.jedis.Connection.connect(Connection.java:184)
... 9 more
(4) After a while the console returns to normal: the automatic failover has completed
Caused by: java.net.ConnectException: Connection refused: connect
at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:85)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at redis.clients.jedis.Connection.connect(Connection.java:184)
... 9 more
18:57:27.035 [main] INFO com.gy.redisTest.RedisSentinelFailOverTest - info,key=k-99501,value=v-99501
18:57:28.604 [main] INFO com.gy.redisTest.RedisSentinelFailOverTest - info,key=k-14578,value=v-14578
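Examining the logs
What is subjective down? What is objective down?
- Subjective down (sdown) is a single sentinel's own "opinion" that a Redis node has failed.
# How long a node must fail to answer PING before this sentinel considers it down (30000 ms by default)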
sentinel down-after-milliseconds <master-name> <milliseconds>
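- Objective down (odown) means that enough sentinels agree that the Redis node (master) has failed, that is, the number of sentinels reporting it down has reached the quorum.
# quorum: the number of sentinels that must agree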
sentinel monitor <master-name> <ip> <redis-port> <quorum>
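The two definitions above boil down to two small predicates. This is a simplified sketch, not Sentinel's actual code: the class DownDetection and its parameter names are hypothetical, and the real implementation tracks more state (for example, what counts as a valid reply).

```java
// Hypothetical model of the sdown/odown decisions.
class DownDetection {
    // Subjective down: this sentinel has had no valid reply for longer than
    // down-after-milliseconds.
    static boolean isSubjectivelyDown(long nowMillis, long lastValidReplyMillis,
                                      long downAfterMillis) {
        return nowMillis - lastValidReplyMillis > downAfterMillis;
    }

    // Objective down: the number of sentinels reporting the master as down
    // (this sentinel included) has reached the configured quorum.
    static boolean isObjectivelyDown(int sentinelsReportingDown, int quorum) {
        return sentinelsReportingDown >= quorum;
    }

    public static void main(String[] args) {
        // With down-after-milliseconds 30000 and quorum 2, as configured in this article.
        System.out.println(isSubjectivelyDown(31_000, 0, 30_000));
        System.out.println(isObjectivelyDown(2, 2));
    }
}
```

With quorum set to 2, two out of the three sentinels reporting sdown are enough to mark the master objectively down.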
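Three periodic tasks
- Every 10 seconds, each sentinel runs info against the master and the slaves
  - to discover slave nodes
  - to confirm the master-slave relationship
- Every 2 seconds, each sentinel exchanges information through a channel on the master node (pub/sub)
  - the exchange happens on the __sentinel__:hello channel
  - sentinels share their "view" of the nodes and information about themselves
- Every second, each sentinel sends ping to the other sentinels and to the Redis nodes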
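To make the three intervals concrete, here is a toy model that answers "which tasks fire at second N?". It is only an illustration: the class SentinelTimers and the task labels are hypothetical, and the real Sentinel uses its own internal timer loop rather than anything like this.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the three periodic tasks and their intervals.
class SentinelTimers {
    static final int INFO_PERIOD_SECONDS = 10;  // info to master and slaves
    static final int HELLO_PERIOD_SECONDS = 2;  // publish to the hello channel
    static final int PING_PERIOD_SECONDS = 1;   // ping sentinels and Redis nodes

    // Returns the labels of the tasks that are due at the given second.
    static List<String> tasksDueAt(int second) {
        List<String> due = new ArrayList<>();
        if (second % PING_PERIOD_SECONDS == 0) {
            due.add("PING");
        }
        if (second % HELLO_PERIOD_SECONDS == 0) {
            due.add("HELLO");
        }
        if (second % INFO_PERIOD_SECONDS == 0) {
            due.add("INFO");
        }
        return due;
    }

    public static void main(String[] args) {
        for (int second = 1; second <= 10; second++) {
            System.out.println(second + "s: " + tasksDueAt(second));
        }
    }
}
```

The one-second ping is what feeds subjective-down detection, which is why it runs far more often than info.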
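Leader election
- Purpose of the election: exactly one sentinel node should carry out the failover
- The election process:
(1) Every sentinel that has marked the master as subjectively down sends a command (sentinel is-master-down-by-addr) to the other sentinels, asking to become leader.
(2) A sentinel that receives the command grants the request if it has not yet voted for another sentinel; otherwise it refuses.
(3) When a sentinel finds that its votes exceed half of the sentinels and also reach the quorum, it becomes leader and performs the failover.
(4) If the round does not produce exactly one leader, the sentinels wait a while and hold a new election.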
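The voting rule can be sketched as a tally. This is a simplified model under stated assumptions: each sentinel grants exactly one vote per round, to the first request it sees (a candidate votes for itself), and a candidate wins only with at least max(quorum, majority) votes. The class LeaderElection and its input shape are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of one election round.
class LeaderElection {
    // votes maps each voter ID to the candidate ID it voted for.
    // Returns the winner, or null when nobody has enough votes.
    static String elect(Map<String, String> votes, int quorum) {
        int totalSentinels = votes.size();
        int majority = totalSentinels / 2 + 1;
        int needed = Math.max(quorum, majority);

        // Count votes per candidate.
        Map<String, Integer> tally = new HashMap<>();
        for (String candidate : votes.values()) {
            tally.merge(candidate, 1, Integer::sum);
        }
        // At most one candidate can reach a majority, so the first hit is the winner.
        for (Map.Entry<String, Integer> entry : tally.entrySet()) {
            if (entry.getValue() >= needed) {
                return entry.getKey();
            }
        }
        return null; // split vote: wait and hold a new election
    }
}
```

For example, if sentinels 26379, 26380 and 26381 all vote for 26380 with quorum 2, elect returns 26380; if each votes for itself, it returns null and a new round is needed.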
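Failover (performed by the sentinel leader)
- Select a "suitable" slave node to become the new master.
- Run slaveof no one on that slave so that it becomes the new master.
- Tell the remaining slaves to become slaves of the new master and start replicating from it.
- Keep watching the node that went down; when it recovers, make it a slave of the new master.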
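Choosing a "suitable" new master
- Pick the slave with the highest priority, which in Redis means the lowest non-zero slave-priority value (a value of 0 means the slave is never promoted). If there is a single best slave, return it; otherwise continue.
- Pick the slave with the largest replication offset (the most complete copy of the data). If there is a single best slave, return it; otherwise continue.
- Pick the slave with the smallest runId.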
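The three selection rules form a single ordering, which a chained comparator expresses directly. This sketch assumes Redis's convention that a lower slave-priority value is preferred and that 0 means "never promote"; the classes SlaveSelection and Slave are hypothetical, and the real Sentinel also filters out slaves that are down or have been disconnected for too long.

```java
import java.util.Comparator;
import java.util.List;

// Sketch of the ordering used to pick the slave to promote.
class SlaveSelection {
    static final class Slave {
        final String runId;
        final int priority;     // slave-priority; lower is preferred, 0 = never promote
        final long replOffset;  // how much of the replication stream it has processed

        Slave(String runId, int priority, long replOffset) {
            this.runId = runId;
            this.priority = priority;
            this.replOffset = replOffset;
        }
    }

    // Returns the best candidate, or null when no slave is eligible.
    static Slave pick(List<Slave> candidates) {
        Comparator<Slave> order = Comparator
                .comparingInt((Slave s) -> s.priority)                   // 1. lowest priority value
                .thenComparing(Comparator
                        .comparingLong((Slave s) -> s.replOffset)
                        .reversed())                                      // 2. largest offset
                .thenComparing(Comparator.comparing((Slave s) -> s.runId)); // 3. smallest runId
        return candidates.stream()
                .filter(s -> s.priority != 0)
                .min(order)
                .orElse(null);
    }
}
```

With the default priority of 100 on every slave, as in this article's setup, the decision effectively falls to the replication offset and then the runId.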
Log analysis
(1) Check the log of the master node on port 7000
30712:C 15 Jul 2019 18:44:03.552 * RDB: 4 MB of memory used by copy-on-write
30703:M 15 Jul 2019 18:44:03.566 * Background saving terminated with success
30703:M 15 Jul 2019 18:44:03.566 * Synchronization with replica ip:7001 succeeded
30703:M 15 Jul 2019 18:44:05.656 * Replica ip:7002 asks for synchronization
30703:M 15 Jul 2019 18:44:05.656 * Full resync requested by replica 39.107.69.86:7002
30703:M 15 Jul 2019 18:44:05.656 * Starting BGSAVE for SYNC with target: disk
30703:M 15 Jul 2019 18:44:05.657 * Background saving started by pid 30719
30719:C 15 Jul 2019 18:44:05.661 * DB saved on disk
30719:C 15 Jul 2019 18:44:05.661 * RDB: 4 MB of memory used by copy-on-write
30703:M 15 Jul 2019 18:44:05.672 * Background saving terminated with success
30703:M 15 Jul 2019 18:44:05.672 * Synchronization with replica ip:7002 succeeded
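Because we simulated the crash with the kill command, the master's log contains little useful information; right up to the crash it was still synchronizing data.
(2) Check the logs of the slave nodes on ports 7001 and 7002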
30713:C 15 Jul 2019 18:44:03.596 * SYNC append only file rewrite performed
30713:C 15 Jul 2019 18:44:03.596 * AOF rewrite: 4 MB of memory used by copy-on-write
30708:S 15 Jul 2019 18:44:03.633 * Background AOF rewrite terminated with success
30708:S 15 Jul 2019 18:44:03.633 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
30708:S 15 Jul 2019 18:44:03.634 * Background AOF rewrite finished successfully
30708:S 15 Jul 2019 18:58:28.241 # Connection with master lost.
30708:S 15 Jul 2019 18:58:28.241 * Caching the disconnected master state.
30708:S 15 Jul 2019 18:58:28.846 * Connecting to MASTER ip:7000
30708:S 15 Jul 2019 18:58:28.847 * MASTER <-> REPLICA sync started
30708:S 15 Jul 2019 18:58:28.849 # Error condition on socket for SYNC: Connection refused
30708:S 15 Jul 2019 18:58:29.850 * Connecting to MASTER ip:7000
30708:S 15 Jul 2019 18:58:29.850 * MASTER <-> REPLICA sync started
30708:S 15 Jul 2019 18:58:29.853 # Error condition on socket for SYNC: Connection refused
30708:S 15 Jul 2019 18:58:30.851 * Connecting to MASTER ip:7000
30708:S 15 Jul 2019 18:58:30.852 * MASTER <-> REPLICA sync started
30708:S 15 Jul 2019 18:58:30.854 # Error condition on socket for SYNC: Connection refused
30708:S 15 Jul 2019 18:58:31.855 * Connecting to MASTER ip:7000
30708:S 15 Jul 2019 18:58:31.855 * MASTER <-> REPLICA sync started
...
30708:S 15 Jul 2019 18:58:56.919 * MASTER <-> REPLICA sync started
30708:S 15 Jul 2019 18:58:56.922 # Error condition on socket for SYNC: Connection refused
30708:S 15 Jul 2019 18:58:57.919 * Connecting to MASTER ip:7000
30708:S 15 Jul 2019 18:58:57.920 * MASTER <-> REPLICA sync started
30708:S 15 Jul 2019 18:58:57.922 # Error condition on socket for SYNC: Connection refused
30708:M 15 Jul 2019 18:58:58.628 # Setting secondary replication ID to d1c40b043f9fbc70ef8435d26897219c71ab97d7, valid up to offset: 379160. New replication ID is fdf1866660032a8cfd27167bf52a899d4e0dd7a5
30708:M 15 Jul 2019 18:58:58.628 * Discarding previously cached master state.
30708:M 15 Jul 2019 18:58:58.628 * MASTER MODE enabled (user request from 'id=7 addr=ip:50284 fd=11 name=sentinel-dbea5725-cmd age=285 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=154 qbuf-free=32614 obl=36 oll=0 omem=0 events=r cmd=exec')
30708:M 15 Jul 2019 18:58:58.629 # CONFIG REWRITE executed with success.
30708:M 15 Jul 2019 18:58:59.040 * Replica ip:7002 asks for synchronization
30708:M 15 Jul 2019 18:58:59.040 * Partial resynchronization request from ip:7002 accepted. Sending 663 bytes of backlog starting from offset 379160.
30715:S 15 Jul 2019 18:58:57.026 # Error condition on socket for SYNC: Connection refused
30715:S 15 Jul 2019 18:58:58.024 * Connecting to MASTER ip:7000
30715:S 15 Jul 2019 18:58:58.024 * MASTER <-> REPLICA sync started
30715:S 15 Jul 2019 18:58:58.026 # Error condition on socket for SYNC: Connection refused
30715:S 15 Jul 2019 18:58:58.719 * REPLICAOF ip:7001 enabled (user request from 'id=7 addr=ip:57200 fd=12 name=sentinel-dbea5725-cmd age=285 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=291 qbuf-free=32477 obl=36 oll=0 omem=0 events=r cmd=exec')
30715:S 15 Jul 2019 18:58:58.721 # CONFIG REWRITE executed with success.
30715:S 15 Jul 2019 18:58:59.027 * Connecting to MASTER ip:7001
30715:S 15 Jul 2019 18:58:59.027 * MASTER <-> REPLICA sync started
30715:S 15 Jul 2019 18:58:59.029 * Non blocking connect for SYNC fired the event.
30715:S 15 Jul 2019 18:58:59.032 * Master replied to PING, replication can continue...
30715:S 15 Jul 2019 18:58:59.039 * Trying a partial resynchronization (request d1c40b043f9fbc70ef8435d26897219c71ab97d7:379160).
30715:S 15 Jul 2019 18:58:59.042 * Successful partial resynchronization with master.
30715:S 15 Jul 2019 18:58:59.042 # Master replication ID changed to fdf1866660032a8cfd27167bf52a899d4e0dd7a5
30715:S 15 Jul 2019 18:58:59.042 * MASTER <-> REPLICA sync: Master accepted a Partial Resynchronization.
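- The logs show that at 18:58 the slaves lost contact with the master and kept trying to reconnect to it.
- At 18:58:58, node 7001 received a request to become the new master and rewrote its configuration; node 7002 then asked 7001 for synchronization.
- At 18:58:58, node 7002 received a request to become a slave of 7001 and rewrote its configuration. 7002 attempted a partial resynchronization, the master ID it had recorded was changed, and the new master accepted the partial resynchronization. (This also shows an improvement over versions before 4.0: after a master switch, newer Redis can attempt a partial resync instead of always falling back to a full resync.)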
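Why was the partial resynchronization accepted? The slave asked to continue from the old master's replication ID at offset 379160, and the promoted node had kept that ID as its secondary replication ID, "valid up to offset: 379160". The check can be modelled roughly as below. This is a simplified sketch of my understanding of the rule in Redis 4.0 and later, not the actual implementation; the class PsyncDecision and its parameter names are hypothetical.

```java
// Simplified model of the "partial or full resync?" decision on the master side.
class PsyncDecision {
    static boolean canPartialResync(String requestedReplId, long requestedOffset,
                                    String replId, String replId2, long secondReplOffset,
                                    long backlogFirstOffset, long backlogLength) {
        // The requested history must be the master's current one, or its previous
        // one up to the point where the two histories diverged.
        boolean sameHistory = requestedReplId.equals(replId)
                || (requestedReplId.equals(replId2) && requestedOffset <= secondReplOffset);
        // The requested offset must still be covered by the replication backlog.
        boolean inBacklog = requestedOffset >= backlogFirstOffset
                && requestedOffset <= backlogFirstOffset + backlogLength;
        return sameHistory && inBacklog;
    }
}
```

If either condition fails, for instance because the slave has fallen behind further than repl_backlog_size allows, the master falls back to a full resync.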
(3) Verify the conclusions drawn from the logs
- Node 7001 now has the master role, and its slave is on port 7002
127.0.0.1:7001> info replication
# Replication
role:master
connected_slaves:1
slave0:ip=ip,port=7002,state=online,offset=1339423,lag=1
master_replid:fdf1866660032a8cfd27167bf52a899d4e0dd7a5
master_replid2:d1c40b043f9fbc70ef8435d26897219c71ab97d7
master_repl_offset:1339423
second_repl_offset:379160
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:290848
repl_backlog_histlen:1048576
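- The sentinel reports that the master is now on port 7001 and that 7002 has become its slave. Why does it still show slaves=2? Because sentinel keeps tracking node 7000 and waits for it to restart; as soon as 7000 comes back, sentinel tells it to become a slave of 7001.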
127.0.0.1:26379> info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=ip:7001,slaves=2,sentinels=3
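- The logs of the three sentinels show that 26379, 26380 and 26381 marked node 7000 as subjectively down (+sdown) one after another and incremented the epoch (+new-epoch 1). Sentinel 26380 found that the number of subjective-down reports had reached the configured quorum and marked master 7000 as objectively down (+odown). Voting then started: 26380 asked to become leader, 26379 and 26381 both voted for it, and it was elected. The leader told 7001 to run slaveof no one and become master, and told 7002 to become a slave of 7001; 7000 is recorded as a slave of 7001 as well and will be reconfigured once it comes back.
# Log of sentinel 26379
# Marks node 7000 as subjectively down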
30790:X 15 Jul 2019 18:58:58.291 # +sdown master mymaster ip 7000
30790:X 15 Jul 2019 18:58:58.420 # +new-epoch 1
# Votes for 26380 as leader
30790:X 15 Jul 2019 18:58:58.422 # +vote-for-leader dbea572500678a7e3523f5c2d30aee38c771982c 1
30790:X 15 Jul 2019 18:58:58.718 # +config-update-from sentinel dbea572500678a7e3523f5c2d30aee38c771982c ip 26380 @ mymaster ip 7000
# Switches the master from 7000 to 7001
30790:X 15 Jul 2019 18:58:58.718 # +switch-master mymaster ip 7000 ip 7001
30790:X 15 Jul 2019 18:58:58.718 * +slave slave ip:7002 ip 7002 @ mymaster ip 7001
30790:X 15 Jul 2019 18:58:58.718 * +slave slave ip:7000 ip 7000 @ mymaster ip 7001
30790:X 15 Jul 2019 18:59:28.728 # +sdown slave ip:7000 ip 7000 @ mymaster ip 7001
# Log of sentinel 26380
30795:X 15 Jul 2019 18:54:13.809 # Sentinel ID is dbea572500678a7e3523f5c2d30aee38c771982c
# Marks node 7000 as subjectively down
30795:X 15 Jul 2019 18:58:58.335 # +sdown master mymaster ip 7000
30795:X 15 Jul 2019 18:58:58.411 # +odown master mymaster ip 7000 #quorum 2/2
30795:X 15 Jul 2019 18:58:58.411 # +new-epoch 1
30795:X 15 Jul 2019 18:58:58.411 # +try-failover master mymaster ip 7000
# Votes for itself, asking to become leader
30795:X 15 Jul 2019 18:58:58.416 # +vote-for-leader dbea572500678a7e3523f5c2d30aee38c771982c 1
# 26379 votes for this sentinel as leader
30795:X 15 Jul 2019 18:58:58.422 # f67443294644bcaca75a83bd9aeb0baade1d6ecc voted for dbea572500678a7e3523f5c2d30aee38c771982c 1
# 26381 votes for this sentinel as leader
30795:X 15 Jul 2019 18:58:58.422 # ad712f71928204dc55033dd391968a99388fcd98 voted for dbea572500678a7e3523f5c2d30aee38c771982c 1
30795:X 15 Jul 2019 18:58:58.471 # +elected-leader master mymaster ip 7000
# Failover: select a slave of the failed master to promote
30795:X 15 Jul 2019 18:58:58.471 # +failover-state-select-slave master mymaster ip 7000
# 7001 is chosen to become the master
30795:X 15 Jul 2019 18:58:58.571 # +selected-slave slave ip:7001 ip 7001 @ mymaster ip 7000
# Sends slaveof no one to 7001
30795:X 15 Jul 2019 18:58:58.571 * +failover-state-send-slaveof-noone slave ip:7001 ip 7001 @ mymaster ip 7000
30795:X 15 Jul 2019 18:58:58.627 * +failover-state-wait-promotion slave ip:7001 ip 7001 @ mymaster ip 7000
30795:X 15 Jul 2019 18:58:58.633 # +promoted-slave slave ip:7001 ip 7001 @ mymaster ip 7000
30795:X 15 Jul 2019 18:58:58.633 # +failover-state-reconf-slaves master mymaster ip 7000
30795:X 15 Jul 2019 18:58:58.717 * +slave-reconf-sent slave ip:7002 ip 7002 @ mymaster ip 7000
# The objective-down state of the old master is cleared (-odown)
30795:X 15 Jul 2019 18:58:59.555 # -odown master mymaster ip 7000
30795:X 15 Jul 2019 18:58:59.671 * +slave-reconf-inprog slave ip:7002 ip 7002 @ mymaster ip 7000
30795:X 15 Jul 2019 18:58:59.671 * +slave-reconf-done slave ip:7002 ip 7002 @ mymaster ip 7000
30795:X 15 Jul 2019 18:58:59.732 # +failover-end master mymaster ip 7000
# Switches the master from 7000 to 7001
30795:X 15 Jul 2019 18:58:59.732 # +switch-master mymaster ip 7000 ip 7001
30795:X 15 Jul 2019 18:58:59.732 * +slave slave ip:7002 ip 7002 @ mymaster ip 7001
30795:X 15 Jul 2019 18:58:59.732 * +slave slave ip:7000 ip 7000 @ mymaster ip 7001
30795:X 15 Jul 2019 18:59:29.752 # +sdown slave ip:7000 ip 7000 @ mymaster ip 7001
# Log of sentinel 26381
# Marks node 7000 as subjectively down
30800:X 15 Jul 2019 18:58:58.342 # +sdown master mymaster ip 7000
30800:X 15 Jul 2019 18:58:58.420 # +new-epoch 1
30800:X 15 Jul 2019 18:58:58.422 # +vote-for-leader dbea572500678a7e3523f5c2d30aee38c771982c 1
30800:X 15 Jul 2019 18:58:58.433 # +odown master mymaster ip 7000 #quorum 3/2
30800:X 15 Jul 2019 18:58:58.433 # Next failover delay: I will not start a failover before Mon Jul 15 19:04:59 2019
30800:X 15 Jul 2019 18:58:58.718 # +config-update-from sentinel dbea572500678a7e3523f5c2d30aee38c771982c ip 26380 @ mymaster ip 7000
# Switches the master from 7000 to 7001
30800:X 15 Jul 2019 18:58:58.718 # +switch-master mymaster ip 7000 ip 7001
30800:X 15 Jul 2019 18:58:58.718 * +slave slave ip:7002 ip 7002 @ mymaster ip 7001
30800:X 15 Jul 2019 18:58:58.718 * +slave slave ip:7000 ip 7000 @ mymaster ip 7001
30800:X 15 Jul 2019 18:59:28.769 # +sdown slave ip:7000 ip 7000 @ mymaster ip 7001
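The line that matters most to clients and monitoring scripts is +switch-master, whose payload is the master name followed by the old and new address. A minimal parser for it might look like the sketch below; the class SwitchMasterEvent is a hypothetical name, and the host fields are kept as plain strings because the logs in this article show the placeholder "ip".

```java
// Hypothetical parser for "+switch-master <name> <old-host> <old-port> <new-host> <new-port>".
class SwitchMasterEvent {
    final String masterName;
    final String oldHost;
    final int oldPort;
    final String newHost;
    final int newPort;

    SwitchMasterEvent(String masterName, String oldHost, int oldPort,
                      String newHost, int newPort) {
        this.masterName = masterName;
        this.oldHost = oldHost;
        this.oldPort = oldPort;
        this.newHost = newHost;
        this.newPort = newPort;
    }

    // Returns null when the log line is not a well-formed +switch-master event.
    static SwitchMasterEvent parse(String logLine) {
        int start = logLine.indexOf("+switch-master ");
        if (start < 0) {
            return null;
        }
        String[] tokens = logLine.substring(start).trim().split("\\s+");
        if (tokens.length != 6) {
            return null;
        }
        return new SwitchMasterEvent(tokens[1], tokens[2], Integer.parseInt(tokens[3]),
                tokens[4], Integer.parseInt(tokens[5]));
    }
}
```

Applied to the 26379 log above, this yields master name mymaster, old port 7000 and new port 7001.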
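Day-to-day operations
Taking a master node offline
Run a manual failover (sentinel failover <master-name>). It skips the subjective-down and objective-down steps and goes straight to failover, after which the old master can be taken offline.
Taking a slave node offline
Decide whether the node is going offline temporarily or permanently, for example whether its data should be cleaned up.
Bringing a master node online
Use sentinel failover to swap it in. To steer which slave gets promoted, first raise the priority of the slave we want to become master (that is, give it a lower slave-priority value).
Bringing a slave node online
Simply use the slaveof command to start replicating from the master.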