
keepalived+redis: high availability with automatic failover

 

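 Install redis and keepalived on both server A (10.0.11.2) and server B (10.0.12.2) (installation steps omitted).

 A is the default master and B is the slave (just add SLAVEOF 10.0.11.2 6379 to B's Redis configuration file).

 Enable persistence in Redis on both A and B: appendonly yes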


Configuration on server A
keepalived configuration file
 -------begin------

! Configuration File for keepalived

global_defs {
    lvs_id LVS_redis
}

vrrp_script chk_redis {
    script "/opt/redis/sh/redis_check.sh"
    weight -20
    interval 2
}

vrrp_instance VI_1 {
    state BACKUP
    #state MASTER
    interface bond0
    virtual_router_id 51
    nopreempt
    priority 200
    advert_int 5
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
        chk_redis
    }
    virtual_ipaddress {
        10.0.11.0
    }
    notify_master /opt/redis/sh/redis_master.sh
    notify_backup /opt/redis/sh/redis_backup.sh
    notify_fault /opt/redis/sh/redis_fault.sh
    notify_stop /opt/redis/sh/redis_stop.sh
}

 -----end-----

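Notes:
 The mail settings in global_defs can be filled in with anything; to actually get mail notifications they must be real values.

 script "/opt/redis/sh/redis_check.sh"   # path of the check script
 weight -20     # when the Redis check fails, priority drops by 20; if that puts it below the peer's priority, keepalived changes state and the VIP moves with it
 interval 2     # how often the check runs, in seconds

 state MASTER   # default state
 state BACKUP   # backup state; here both nodes are set to BACKUP
 nopreempt      # no preemption (requires state BACKUP); priority decides which node is master
 interface bond0          # network interface name

 virtual_router_id 51     # must be the same on A and B

 priority 200             # priority; just needs to be higher than B's

 advert_int 5             # interval between VRRP advertisements, in seconds

 authentication {         # must be the same on A and B
     auth_type PASS
     auth_pass 1111
 }

 track_script {           # name of the check defined above
     chk_redis
 }

 virtual_ipaddress {      # virtual IP; clients use this IP to reach Redis
     10.0.11.0
 }

 # The following are the scripts run on each state change

 notify_master /opt/redis/sh/redis_master.sh   # run on becoming master

 notify_backup /opt/redis/sh/redis_backup.sh   # run on becoming backup

 notify_fault /opt/redis/sh/redis_fault.sh     # run on entering the FAULT state (the author's note: when the check script exits 1)

 notify_stop /opt/redis/sh/redis_stop.sh       # run when the keepalived service stops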
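
To make the priority arithmetic concrete, here is a small sketch of how a track_script weight changes the effective priority. It only does the math; the function name effective_priority and its ok/fail argument are made up for illustration and are not part of keepalived.

```shell
#!/bin/bash
# Effective VRRP priority after applying a track_script weight.
# Usage: effective_priority BASE WEIGHT ok|fail
# (hypothetical helper, for illustration only)
effective_priority() {
    local base=$1 weight=$2 result=$3
    if [ "$weight" -lt 0 ] && [ "$result" = "fail" ]; then
        echo $(( base + weight ))   # negative weight: subtracted while the check fails
    elif [ "$weight" -gt 0 ] && [ "$result" = "ok" ]; then
        echo $(( base + weight ))   # positive weight: added while the check succeeds
    else
        echo "$base"                # otherwise the base priority is unchanged
    fi
}
```

With the values from this article, `effective_priority 200 -20 fail` gives A's priority while its Redis check is failing, which can then be compared with B's base priority of 190.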

/opt/redis/sh/redis_check.sh
 --begin--



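#!/bin/bash
# Health check run by keepalived:
# exit 0 if the local Redis answers PING with PONG, exit 1 otherwise.
ALIVE=$(/usr/local/bin/redis-cli PING)
LOGFILE="/opt/redis/logs/keepalived-redis-state.log"
if [ "$ALIVE" == "PONG" ]; then
    echo "$ALIVE"
    #echo "check master  pong" >> $LOGFILE
    exit 0
else
    echo "$ALIVE"
    exit 1
fi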

 --end--

/opt/redis/sh/redis_master.sh
 --begin--


#!/bin/bash
REDISCLI="/usr/local/bin/redis-cli"
LOGFILE="/opt/redis/logs/keepalived-redis-state.log"
echo "[master]" >> $LOGFILE
date >> $LOGFILE
echo "Being master...." >> $LOGFILE 2>&1
# First replicate from B to pick up the latest data
echo "Run SLAVEOF cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF 10.0.12.2 6379 >> $LOGFILE 2>&1
# Leave time for the sync to finish
sleep 15
# Then stop replicating and act as the Redis master
echo "Run SLAVEOF NO ONE cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF NO ONE >> $LOGFILE 2>&1

 --end--
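
The fixed sleep 15 in the script above is a guess at how long the sync takes. As a rough alternative, and assuming the INFO replication output reports master_link_status:up once the link to the master is established, one could poll for that line instead. The function name wait_for_sync and its max-tries argument are made up for this sketch and are not part of the original scripts.

```shell
#!/bin/bash
# REDISCLI may be overridden; the default is the path used in this article.
REDISCLI="${REDISCLI:-/usr/local/bin/redis-cli}"

# wait_for_sync [MAX_TRIES]   (hypothetical helper)
# Polls INFO replication once per second.
# Returns 0 as soon as master_link_status:up is seen, 1 after MAX_TRIES polls.
wait_for_sync() {
    local max_tries=${1:-15} i
    for (( i = 0; i < max_tries; i++ )); do
        # redis-cli output uses CRLF line endings, so strip the \r first
        if $REDISCLI INFO replication | tr -d '\r' | grep -q '^master_link_status:up$'; then
            return 0
        fi
        sleep 1
    done
    return 1
}
```

In redis_master.sh this could stand in for the sleep, for example `wait_for_sync 15` between the SLAVEOF 10.0.12.2 6379 and SLAVEOF NO ONE calls. If the peer is down the link never comes up, so the caller still has to decide what to do when it returns 1.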

/opt/redis/sh/redis_backup.sh
 --begin--

#!/bin/bash
REDISCLI="/usr/local/bin/redis-cli"
LOGFILE="/opt/redis/logs/keepalived-redis-state.log"
echo "[backup]" >> $LOGFILE
date >> $LOGFILE
echo "Being slave...." >> $LOGFILE 2>&1
sleep 15
# Become a slave of B
echo "Run SLAVEOF cmd..." >> $LOGFILE
$REDISCLI SLAVEOF 10.0.12.2 6379 >> $LOGFILE 2>&1

 --end--

/opt/redis/sh/redis_fault.sh
 --begin--

#!/bin/bash
LOGFILE="/opt/redis/logs/keepalived-redis-state.log"
echo "[fault]" >> $LOGFILE
date >> $LOGFILE
sh /opt/redis/sh/redis_backup.sh

 --end--

/opt/redis/sh/redis_stop.sh
 --begin--

#!/bin/bash
LOGFILE="/opt/redis/logs/keepalived-redis-state.log"
echo "[stop]" >> $LOGFILE
date >> $LOGFILE
sh /opt/redis/sh/redis_backup.sh
 --end--

Configuration on server B

keepalived configuration file
 -------begin------

! Configuration File for keepalived

global_defs {
    lvs_id LVS_redis
}

vrrp_script chk_redis {
    script "/opt/redis/sh/redis_check.sh"
    weight -20
    interval 2
}

vrrp_instance VI_1 {
    state BACKUP
    interface bond0
    virtual_router_id 51
    priority 190
    # Note: advert_int is not set here, so keepalived's default of 1 second applies,
    # while A uses advert_int 5.
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
        chk_redis
    }
    virtual_ipaddress {
        10.0.11.0
    }
    notify_master /opt/redis/sh/redis_master.sh
    notify_backup /opt/redis/sh/redis_backup.sh
    notify_fault /opt/redis/sh/redis_fault.sh
    notify_stop /opt/redis/sh/redis_stop.sh
}

 -----end-----


/opt/redis/sh/redis_check.sh
 --begin--

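#!/bin/bash
# Note: on B this check pings A's Redis (10.0.11.2) and inverts the result.
# It succeeds (exit 0) when A's Redis does NOT answer PONG, and fails
# (exit 1, so weight -20 applies to B) while A's Redis is healthy.
ALIVE=$(/usr/local/bin/redis-cli -h 10.0.11.2 -p 6379 PING)
LOGFILE="/opt/redis/logs/keepalived-redis-state.log"
if [ "$ALIVE" != "PONG" ]; then
    echo "$ALIVE"
    exit 0
else
    echo "$ALIVE"
    exit 1
fi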

 --end--

/opt/redis/sh/redis_master.sh
 --begin--

#!/bin/bash
REDISCLI="/usr/local/bin/redis-cli"
LOGFILE="/opt/redis/logs/keepalived-redis-state.log"
echo "[master]" >> $LOGFILE
date >> $LOGFILE
echo "Being master...." >> $LOGFILE 2>&1
##echo "master Run SLAVEOF 10.0.11.2 cmd ..." >> $LOGFILE
##$REDISCLI SLAVEOF 10.0.11.2 6379 >> $LOGFILE 2>&1
#sleep 10
# Stop replicating and act as the Redis master
echo "Run SLAVEOF NO ONE cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF NO ONE >> $LOGFILE 2>&1



 --end--

/opt/redis/sh/redis_backup.sh
 --begin--

#!/bin/bash
REDISCLI="/usr/local/bin/redis-cli"
LOGFILE="/opt/redis/logs/keepalived-redis-state.log"
echo "[backup]" >> $LOGFILE
date >> $LOGFILE
echo "Being slave...." >> $LOGFILE 2>&1
#sleep 10
# Become a slave of A again
echo "backup Run SLAVEOF 10.0.11.2 cmd..." >> $LOGFILE
$REDISCLI SLAVEOF 10.0.11.2 6379 >> $LOGFILE 2>&1

 --end--

/opt/redis/sh/redis_fault.sh
 --begin--

#!/bin/bash
LOGFILE="/opt/redis/logs/keepalived-redis-state.log"
echo "[fault]" >> $LOGFILE
date >> $LOGFILE
sh /opt/redis/sh/redis_backup.sh
 --end--



 /opt/redis/sh/redis_stop.sh

 --begin--

#!/bin/bash
LOGFILE="/opt/redis/logs/keepalived-redis-state.log"
echo "[stop]" >> $LOGFILE
date >> $LOGFILE
sh /opt/redis/sh/redis_backup.sh
--end--
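Script notes:
 The logic of the scripts is: when Redis is healthy on both A and B, A is the master and B is the slave.

 If A's service is detected as unhealthy, B becomes the master. The command "/usr/local/bin/redis-cli SLAVEOF NO ONE" stops replication and turns the node into a Redis master.

 When A's service comes back, A switches back to master, and before becoming master it syncs the latest data from B. At the same time, B has to run "/usr/local/bin/redis-cli SLAVEOF 10.0.11.2 6379"

 so that B becomes A's slave again; otherwise B stays a master.

 In a Redis master/slave setup, "/usr/local/bin/redis-cli -h 10.0.11.2 INFO" shows the current state of each node, including whether that server is currently a master or a slave.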
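
For a quick role check, the role line can be pulled out of the INFO output. This is a minimal sketch that assumes the output contains a line of the form role:master or role:slave; the function name redis_role is made up for illustration.

```shell
#!/bin/bash
# redis_role: read `INFO replication` output on stdin and print the role field.
# (hypothetical helper, for illustration only)
redis_role() {
    # redis-cli output uses CRLF line endings, so strip the \r first
    tr -d '\r' | awk -F: '$1 == "role" { print $2; exit }'
}
```

Usage: `/usr/local/bin/redis-cli -h 10.0.11.2 INFO replication | redis_role`, and the same against 10.0.12.2 to compare the two nodes.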



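 On the command line:

 "tail -30 /opt/redis/logs/keepalived-redis-state.log" shows the keepalived state transitions.



 "tail -30 /var/log/messages" shows changes to the keepalived virtual IP.



 The above has been tested in a real production environment. Some parts may not be ideal, but failover works and no data was lost. Before this I followed a tutorial from the web: the VIP switched over, but the data handling had problems, so I changed it to the setup above.

 If you know a better way, please let me know. Thanks!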


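How keepalived works
By default, keepalived only monitors network failures and keepalived itself, that is, it fails over when the network breaks or keepalived itself has a problem. What we care about more is the service running on the machine: if the service fails and the VIP does not move, the setup as a whole has still failed. So the master/backup switch has to be driven by the state of the service process. Fortunately, keepalived provides custom script monitoring, which is what we use here to control the service.
Overall idea of the solution:
     Use keepalived's custom script feature to monitor the local Redis service. When the check script detects that Redis is unhealthy, it changes the local keepalived priority, which in turn changes the master/backup roles. keepalived also runs the configured scripts on a role change, and that gives us the chance to change the Redis master/slave state. The goal is to keep the data on the Redis master and slave consistent.


     There are four cases when running keepalived+redis:


     1 keepalived is down and Redis is down too. The VIP simply moves away and no Redis data sync is needed: since Redis is down, there is nothing to sync from on the master. Any data already written to the master but not yet replicated to the slave is lost.


     2 keepalived is down but Redis is not. After the VIP moves, the Redis master/slave relationship is still the old one. If nothing changes, data gets written to the Redis slave and is never synced to the master, so the scripts have to reverse the Redis master/slave relationship. Some time must be left for the data to sync first, and then master and slave are swapped.


     3 keepalived is up but Redis is down. The check script detects that Redis is down and lowers the priority of the keepalived master, which also makes the VIP move. This is the same as the second case: sync the data, then reverse the current Redis master/slave relationship.


     4 The last case is that both keepalived and Redis are up. All is well and nothing needs to be done.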
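
The four cases above can be summarized as a small lookup. This is only a sketch of the decision table as described in the text, not something keepalived runs; the function name failover_action and the wording of the actions are made up for illustration.

```shell
#!/bin/bash
# failover_action KEEPALIVED_STATE REDIS_STATE   (each "up" or "down")
# Prints the action the article describes for that combination.
# (hypothetical helper, for illustration only)
failover_action() {
    case "$1/$2" in
        down/down) echo "VIP moves; no data sync possible" ;;
        down/up)   echo "VIP moves; sync data, then swap master/slave" ;;
        up/down)   echo "priority lowered, VIP moves; sync data, then swap master/slave" ;;
        up/up)     echo "nothing to do" ;;
        *)         echo "unknown state: $1/$2" >&2; return 1 ;;
    esac
}
```

For example, `failover_action up down` corresponds to case 3, where the check script is what triggers the move.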
