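MySQL Read/Write Splitting with the MaxScale Middleware

Overview

MaxScale is made highly available with keepalived and serves clients through a VIP.

MaxScale website: https://mariadb.com/downloads/mariadb-tx/maxscale

MaxScale documentation (easier to browse than the website): https://github.com/mariadb-corporation/MaxScale/tree/2.2/Documentation

MySQL runs GTID-based master-slave replication, which allows the replication state to be repaired quickly after a master failure.

Failover diagram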

Environment

NAME	VERSION	IP	PORT	COMMENT
maxscale	2.2.6	172.16.10.114	4306,6603	4306 is the read/write-split port, 6603 is the admin port
master	5.6.39	172.16.10.114	3308	GTID replication
slave	5.6.39	172.16.10.114	3309	GTID replication

MaxScale installation and configuration

MySQL GTID configuration

Enabling GTID requires the following parameters in the configuration file:

gtid_mode = ON

enforce_gtid_consistency = ON

log_slave_updates = ON

Replication setup command:

CHANGE MASTER TO MASTER_HOST=host, MASTER_PORT=port, MASTER_USER=user, MASTER_PASSWORD=password, MASTER_AUTO_POSITION=1;

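Notes:

With GTID replication enabled, backups (for example those taken with mysqldump) automatically include GTID information. Restoring from such a backup can fail because the GTIDs recorded in the backup conflict with the GTIDs already executed on the instance; the check also prevents events from being applied twice. There are two ways around the error:

Option 1: take the backup with --set-gtid-purged=OFF so that no GTID information is written to it.

Option 2: before restoring, connect to the instance and run reset master, then restore the data.

Once GTID is enabled, sql_slave_skip_counter can no longer be used to skip a transaction. To skip a failing transaction instead:

stop slave;
set gtid_next='xxxxxxxx'; # the GTID to skip
begin;
commit; # an empty transaction that occupies this GTID
set gtid_next='AUTOMATIC'; # return to automatic GTID assignment
change master to master_auto_position=1;
start slave;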
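When several GTIDs have to be skipped, the empty-transaction statements can be generated rather than typed by hand. The following is a minimal sketch under stated assumptions: the helper name skip_gtid_statements and the sample GTID are made up for illustration, and the code only builds SQL text; it does not connect to MySQL.

```python
import re

# Hypothetical helper, not part of MySQL or MaxScale.
# A single GTID has the form source_uuid:transaction_number.
GTID_RE = re.compile(r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}:\d+$")


def skip_gtid_statements(gtid):
    """Return the SQL statements that inject an empty transaction for one GTID."""
    if not GTID_RE.match(gtid):
        raise ValueError("expected a single GTID of the form uuid:number")
    return [
        "STOP SLAVE;",
        "SET gtid_next='%s';" % gtid,
        "BEGIN;",
        "COMMIT;",
        "SET gtid_next='AUTOMATIC';",
        "START SLAVE;",
    ]


if __name__ == "__main__":
    # Sample GTID for illustration only.
    for stmt in skip_gtid_statements("3e11fa47-71ca-11e1-9e33-c80aa9429562:23"):
        print(stmt)
```

The output can be reviewed and then pasted into a mysql session on the slave; validating the GTID format first avoids sending a malformed value to the server.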

Installing and configuring MaxScale

wget -c https://downloads.mariadb.com/MaxScale/2.2.6/rhel/6/x86_64/maxscale-2.2.6-1.rhel.6.x86_64.rpm

yum install maxscale-2.2.6-1.rhel.6.x86_64.rpm

cat /etc/maxscale.cnf

[maxscale]
# Number of worker threads; the default is 1
threads=8
auth_connect_timeout=3600
auth_read_timeout=3600
auth_write_timeout=3600

# Backend servers
[server1]
type=server
address=172.16.10.114
port=3308
protocol=MySQLBackend
server_weight=1

[server2]
type=server
address=172.16.10.114
port=3309
protocol=MySQLBackend
server_weight=1

#[server3]
#type=server
#address=172.16.10.114
#port=3310
#protocol=MySQLBackend
#server_weight=1

# Read/write-split service
[readwritesplit]
type=service
router=readwritesplit
servers=server1,server2
user=connect
passwd=connect
weightby=server_weight
# Maximum allowed replication lag in seconds; a slave lagging by more than this no longer receives reads
max_slave_replication_lag=10

# Read service. Despite the name it can also run DML, DDL and so on, depending on
# the privileges granted to the user, so it is better thought of as a connection service
[Read Service]
type=service
router=readconnroute
router_options=master
servers=server1,server2
user=connect
passwd=connect
weightby=server_weight

# Monitor configuration
[MySQL Monitor]
type=monitor
module=mariadbmon
servers=server1,server2
user=monitor
passwd=monitor
# Whether to fail over automatically
auto_failover=true
# Automatically rejoin a failed instance to the cluster once it recovers
auto_rejoin=true
# Detect a standalone master: whether the last remaining instance in the cluster may become the master
detect_standalone_master=true
# Whether the cluster is allowed to recover automatically
allow_cluster_recovery=false
# Number of checks that the other slaves are down before the last remaining instance becomes the master; default 5
#failcount=3
# Monitoring interval in milliseconds; default 2000
#monitor_interval=10000
# Whether the master may keep serving when it is the only node left or when replication is broken on every slave
detect_stale_master=true
#detect_stale_slave=false
# Script to run when one of the events below occurs
script=/tmp/reset_slave.sh
# Events that trigger the script above
events=master_down

# Listener for the read/write-split service
[Splitter-Service]
type=listener
service=readwritesplit
protocol=MySQLClient
port=4306

# Listener for the read service
[Read Listener]
type=listener
service=Read Service
protocol=MySQLClient
port=4307

# Administration service
[MaxAdmin Service]
type=service
router=cli

# Listener for the administration service
[MaxAdmin Listener]
type=listener
service=MaxAdmin Service
protocol=maxscaled
port=6603
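A configuration of this size is easy to break with a mistyped section name, for example a listener that points at a service that does not exist. As a rough sketch, the cross-references can be checked before starting MaxScale. The code below uses Python's standard configparser on a shortened sample; the function name check_references and the sample text are assumptions for illustration, and the sample deliberately leaves out [Read Service] so that one problem is reported. This is not a MaxScale tool, and MaxScale performs its own validation at startup.

```python
import configparser

# Shortened sample in the style of the maxscale.cnf above.
# [Read Service] is left out on purpose to trigger a finding.
SAMPLE = """
[server1]
type=server
address=172.16.10.114
port=3308

[server2]
type=server
address=172.16.10.114
port=3309

[readwritesplit]
type=service
router=readwritesplit
servers=server1,server2

[Splitter-Service]
type=listener
service=readwritesplit
port=4306

[Read Listener]
type=listener
service=Read Service
port=4307
"""


def check_references(text):
    """Return a list of listeners/services/monitors that reference undefined sections."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.read_string(text)

    # Group section names by their "type" value.
    by_type = {}
    for name in cfg.sections():
        by_type.setdefault(cfg[name].get("type", ""), set()).add(name)
    servers = by_type.get("server", set())
    services = by_type.get("service", set())

    problems = []
    for name in cfg.sections():
        section = cfg[name]
        kind = section.get("type", "")
        if kind == "listener" and section.get("service") not in services:
            problems.append("%s: unknown service %r" % (name, section.get("service")))
        if kind in ("service", "monitor"):
            for srv in section.get("servers", "").split(","):
                srv = srv.strip()
                if srv and srv not in servers:
                    problems.append("%s: unknown server %r" % (name, srv))
    return problems


if __name__ == "__main__":
    for problem in check_references(SAMPLE):
        print(problem)
```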

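The configuration above uses two accounts: one that MaxScale uses to connect to the database and one for monitoring. Their grants are as follows: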

connect

CREATE USER 'connect'@'172.16.10.114' IDENTIFIED BY 'connect';

GRANT SELECT ON mysql.user TO 'connect'@'172.16.10.114';

GRANT SELECT ON mysql.db TO 'connect'@'172.16.10.114';

GRANT SELECT ON mysql.tables_priv TO 'connect'@'172.16.10.114';

GRANT SHOW DATABASES ON *.* TO 'connect'@'172.16.10.114';

monitor

CREATE USER 'monitor'@'172.16.10.114' IDENTIFIED BY 'monitor';

GRANT RELOAD, SUPER, REPLICATION CLIENT ON *.* to 'monitor'@'172.16.10.114';

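Notes:

If MaxScale itself is deployed for high availability, these accounts also need the same privileges when connecting to MySQL from the other MaxScale host.

Because MaxScale now sits between the application and the database, each application account must be allowed to connect from both the application IP and the MaxScale IP, with the same password. For example:

With MaxScale at 172.16.10.114 and the application at 172.16.10.238 accessing the database test, the grants are: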

grant select,update,delete,insert on test.* to test_rw@'172.16.10.114' identified by 'test_rw';

grant select,update,delete,insert on test.* to test_rw@'172.16.10.238' identified by 'test_rw';

grant select on test.* to test_r@'172.16.10.114' identified by 'test_r';

grant select on test.* to test_r@'172.16.10.238' identified by 'test_r';
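Since every application account has to be granted once per source host, the statements can be generated from a host list to keep the passwords and privileges identical. This is only a sketch: grant_statements is a hypothetical helper, and it emits the same GRANT ... IDENTIFIED BY form used above, which works on the MySQL 5.6 servers in this setup but was removed in MySQL 8.0.

```python
def grant_statements(user, password, database, privileges, hosts):
    """Build one GRANT statement per host for the same account (MySQL 5.6 syntax)."""
    privs = ",".join(privileges)
    return [
        "grant %s on %s.* to %s@'%s' identified by '%s';"
        % (privs, database, user, host, password)
        for host in hosts
    ]


if __name__ == "__main__":
    # MaxScale IP and application IP from the example above.
    hosts = ["172.16.10.114", "172.16.10.238"]
    for stmt in grant_statements(
        "test_rw", "test_rw", "test", ["select", "update", "delete", "insert"], hosts
    ):
        print(stmt)
    for stmt in grant_statements("test_r", "test_r", "test", ["select"], hosts):
        print(stmt)
```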

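The reset_slave.sh script is run after the master goes down. It clears the replication settings on the slave so that the slave can be promoted to master; a slave that still carries replication settings cannot be promoted to serve writes.

After a master/slave switchover, the connection address in this script has to be changed to that of the new slave.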

cat /tmp/reset_slave.sh

mysql -h127.0.0.1 -P3309 -uthunder -pthunder -Nse 'stop slave;reset slave all;'

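Starting MaxScale

/etc/init.d/maxscale start

Managing MaxScale

Use the maxadmin command; the default username and password are admin/mariadb.

maxadmin -h127.0.0.1 -P6603 -uadmin -p

MaxScale can also be managed with the maxctrl command, which goes through the REST API. The commands are broadly the same; maxadmin is interactive, while maxctrl is non-interactive.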

MaxScale> help

Available commands:

add:

add user - Add an administrative account for using maxadmin over the network

add readonly-user - Add a read-only account for using maxadmin over the network

add server - Add a new server to a service

remove:

remove user - Remove account for using maxadmin over the network

remove server - Remove a server from a service or a monitor

create:

create server - Create a new server

create listener - Create a new listener for a service

create monitor - Create a new monitor

destroy:

destroy server - Destroy a server

destroy listener - Destroy a listener

destroy monitor - Destroy a monitor

alter:

alter server - Alter server parameters

alter monitor - Alter monitor parameters

alter service - Alter service parameters

alter maxscale - Alter maxscale parameters

set:

set server - Set the status of a server

set pollsleep - Set poll sleep period

set nbpolls - Set non-blocking polls

set log_throttling - Set the log throttling configuration

clear:

clear server - Clear server status

disable:

disable log-priority - Disable a logging priority

disable sessionlog-priority - [Deprecated] Disable a logging priority for a particular session

disable root - Disable root access

disable syslog - Disable syslog logging

disable maxlog - Disable MaxScale logging

disable account - Disable Linux user

enable:

enable log-priority - Enable a logging priority

enable sessionlog-priority - [Deprecated] Enable a logging priority for a session

enable root - Enable root user access to a service

enable syslog - Enable syslog logging

enable maxlog - Enable MaxScale logging

enable account - Activate a Linux user account for administrative MaxAdmin use

enable readonly-account - Activate a Linux user account for read-only MaxAdmin use

flush:

flush log - Flush the content of a log file and reopen it

flush logs - Flush the content of a log file and reopen it

list:

list clients - List all the client connections to MaxScale

list dcbs - List all active connections within MaxScale

list filters - List all filters

list listeners - List all listeners

list modules - List all currently loaded modules

list monitors - List all monitors

list services - List all services

list servers - List all servers

list sessions - List all the active sessions within MaxScale

list threads - List the status of the polling threads in MaxScale

list commands - List registered commands

reload:

reload config - [Deprecated] Reload the configuration

reload dbusers - Reload the database users for a service

restart:

restart monitor - Restart a monitor

restart service - Restart a service

restart listener - Restart a listener

shutdown:

shutdown maxscale - Initiate a controlled shutdown of MaxScale

shutdown monitor - Stop a monitor

shutdown service - Stop a service

shutdown listener - Stop a listener

show:

show dcbs - Show all DCBs

show dbusers - [deprecated] Show user statistics

show authenticators - Show authenticator diagnostics for a service

show epoll - Show the polling system statistics

show eventstats - Show event queue statistics

show filter - Show filter details

show filters - Show all filters

show log_throttling - Show the current log throttling setting (count, window (ms), suppression (ms))

show modules - Show all currently loaded modules

show monitor - Show monitor details

show monitors - Show all monitors

show persistent - Show the persistent connection pool of a server

show server - Show server details

show servers - Show all servers

show serversjson - Show all servers in JSON

show services - Show all configured services in MaxScale

show service - Show a single service in MaxScale

show session - Show session details

show sessions - Show all active sessions in MaxScale

show tasks - Show all active housekeeper tasks in MaxScale

show threads - Show the status of the worker threads in MaxScale

show users - Show enabled Linux accounts

show version - Show the MaxScale version number

sync:

sync logs - Flush log files to disk

call:

call command - Call module command

ping:

ping workers - Ping Workers

Type `help COMMAND` to see details of each command.

Where commands require names as arguments and these names contain

whitespace either the \ character may be used to escape the whitespace

or the name may be enclosed in double quotes ".

Checking the current state of the MySQL instances

MaxScale> list servers

Servers.

-------------------+-----------------+-------+-------------+--------------------

Server | Address | Port | Connections | Status

-------------------+-----------------+-------+-------------+--------------------

server1 | 172.16.10.114 | 3308 | 0 | Master, Running

server2 | 172.16.10.114 | 3309 | 0 | Slave, Running

-------------------+-----------------+-------+-------------+--------------------

# maxctrl list servers
┌─────────┬───────────────┬──────┬─────────────┬─────────────────┬──────┐
│ Server  │ Address       │ Port │ Connections │ State           │ GTID │
├─────────┼───────────────┼──────┼─────────────┼─────────────────┼──────┤
│ server1 │ 172.16.10.114 │ 3308 │ 0           │ Master, Running │      │
├─────────┼───────────────┼──────┼─────────────┼─────────────────┼──────┤
│ server2 │ 172.16.10.114 │ 3309 │ 0           │ Slave, Running  │      │
└─────────┴───────────────┴──────┴─────────────┴─────────────────┴──────┘

State after a replication failure:

MaxScale> list servers

Servers.

-------------------+-----------------+-------+-------------+--------------------

Server | Address | Port | Connections | Status

-------------------+-----------------+-------+-------------+--------------------

server1 | 172.16.10.114 | 3308 | 0 | Master, Running

server2 | 172.16.10.114 | 3309 | 0 | Running

-------------------+-----------------+-------+-------------+--------------------
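A broken slave shows up in this listing as a server that is Running but has neither the Master nor the Slave state. As an illustration, the text output of list servers can be parsed to spot such servers automatically. The sketch below assumes the pipe-separated layout shown above; parse_list_servers and servers_without_role are hypothetical names, and the sample text is a copy of the listing rather than live output.

```python
SAMPLE = """\
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server             | Address         | Port  | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
server1            | 172.16.10.114   |  3308 |           0 | Master, Running
server2            | 172.16.10.114   |  3309 |           0 | Running
-------------------+-----------------+-------+-------------+--------------------
"""


def parse_list_servers(output):
    """Parse `list servers` text into {name: {address, port, connections, status}}."""
    servers = {}
    for line in output.splitlines():
        parts = [p.strip() for p in line.split("|")]
        # Skip borders, the title line and the header row.
        if len(parts) != 5 or parts[0] == "Server":
            continue
        name, address, port, connections, status = parts
        servers[name] = {
            "address": address,
            "port": int(port),
            "connections": int(connections),
            "status": [s.strip() for s in status.split(",")],
        }
    return servers


def servers_without_role(servers):
    """Return names of servers that are neither Master nor Slave."""
    return [
        name
        for name, info in servers.items()
        if "Master" not in info["status"] and "Slave" not in info["status"]
    ]


if __name__ == "__main__":
    parsed = parse_list_servers(SAMPLE)
    print(servers_without_role(parsed))
```

In practice the text would come from running maxadmin with the list servers command, for example through subprocess; that part is left out here so the sketch stays runnable without a MaxScale instance.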

Resolving replication problems

Recovery steps after a slave goes down:

MaxScale> list servers

Servers.

-------------------+-----------------+-------+-------------+--------------------

Server | Address | Port | Connections | Status

-------------------+-----------------+-------+-------------+--------------------

server1 | 172.16.10.114 | 3308 | 0 | Master, Running

server2 | 172.16.10.114 | 3309 | 0 | Maintenance, Down

-------------------+-----------------+-------+-------------+--------------------

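Re-adding the slave to the read/write-split cluster

Start the slave instance.

Start replication: start slave;

Once the slave has caught up with the master, run the following in MaxScale: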

MaxScale> clear server server2 maintenance

MaxScale> list servers

Servers.

-------------------+-----------------+-------+-------------+--------------------

Server | Address | Port | Connections | Status

-------------------+-----------------+-------+-------------+--------------------

server1 | 172.16.10.114 | 3308 | 0 | Master, Running

server2 | 172.16.10.114 | 3309 | 0 | Slave, Running

-------------------+-----------------+-------+-------------+--------------------

Recovery steps after the master goes down:

MaxScale> list servers

Servers.

-------------------+-----------------+-------+-------------+--------------------

Server | Address | Port | Connections | Status

-------------------+-----------------+-------+-------------+--------------------

server1 | 172.16.10.114 | 3308 | 0 | Down

server2 | 172.16.10.114 | 3309 | 0 | Slave, Running

-------------------+-----------------+-------+-------------+--------------------

MaxScale> list servers

Servers.

-------------------+-----------------+-------+-------------+--------------------

Server | Address | Port | Connections | Status

-------------------+-----------------+-------+-------------+--------------------

server1 | 172.16.10.114 | 3308 | 0 | Maintenance, Down

server2 | 172.16.10.114 | 3309 | 0 | Master, Running

-------------------+-----------------+-------+-------------+--------------------

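The failover takes about 10 s, which is the product of the two parameters below at their defaults (failcount 5 x monitor_interval 2000 ms); it never takes longer than monitor_interval*(failcount+1).

#failcount=3 # number of checks that the other slaves are down before the last remaining instance becomes the master; default 5

#monitor_interval=10000 # monitoring interval in milliseconds; default 2000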
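The arithmetic behind these figures is simple enough to check with a few lines of code. The function below is a sketch, and its name failover_window_ms is made up; it just evaluates the two expressions quoted above for a given pair of settings.

```python
def failover_window_ms(monitor_interval_ms=2000, failcount=5):
    """Return (typical, upper_bound) failover time in milliseconds.

    typical     = monitor_interval * failcount
    upper_bound = monitor_interval * (failcount + 1)
    """
    typical = monitor_interval_ms * failcount
    upper_bound = monitor_interval_ms * (failcount + 1)
    return typical, upper_bound


if __name__ == "__main__":
    # Defaults used in this setup: 2000 ms interval, failcount 5.
    print(failover_window_ms())
    # The commented-out values from the configuration: 10000 ms interval, failcount 3.
    print(failover_window_ms(10000, 3))
```

With the defaults this gives 10000 ms and 12000 ms. Enabling the commented-out values (monitor_interval=10000, failcount=3) would stretch it to 30000 ms and 40000 ms, so those settings trade faster failover for fewer probes.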

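The slave has now taken over read and write requests from the former master. To restore the master-slave setup, the former master becomes the new slave: start the instance and set up replication:

change master to master_host='172.16.10.114', master_port=3309, master_user='repl', master_password='repl4slave', master_auto_position=1;

Check the replication status:

show slave status;

Add the new slave to the read/write-split cluster: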

MaxScale> list servers

Servers.

-------------------+-----------------+-------+-------------+--------------------

Server | Address | Port | Connections | Status

-------------------+-----------------+-------+-------------+--------------------

server1 | 172.16.10.114 | 3308 | 0 | Maintenance, Down

server2 | 172.16.10.114 | 3309 | 0 | Master, Running

-------------------+-----------------+-------+-------------+--------------------

MaxScale> clear server server1 maintenance

MaxScale> list servers

Servers.

-------------------+-----------------+-------+-------------+--------------------

Server | Address | Port | Connections | Status

-------------------+-----------------+-------+-------------+--------------------

server1 | 172.16.10.114 | 3308 | 0 | Slave, Running

server2 | 172.16.10.114 | 3309 | 0 | Master, Running

-------------------+-----------------+-------+-------------+--------------------

keepalived installation and configuration

The rest of this article can be skipped if high availability for MaxScale itself is not required.

INSTALL

#yum install keepalived

CONFIG ON MASTER

#cat /etc/keepalived/keepalived.conf

vrrp_script chk_myscript {
    script "/opt/soft/is_maxscale_running.sh"
    interval 2 # check every 2 seconds
    fall 2 # require 2 failures for KO
    rise 2 # require 2 successes for OK
}

vrrp_instance VI_1 {
    state MASTER
    interface em1
    virtual_router_id 51
    priority 150
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass mypass
    }
    virtual_ipaddress {
        192.168.1.13 #VIP
    }
    track_script {
        chk_myscript
    }
    notify "/opt/soft/notify_script.sh"
}

CONFIG ON STANDBY

#cat /etc/keepalived/keepalived.conf

vrrp_script chk_myscript {
    script "/opt/soft/is_maxscale_running.sh"
    interval 2 # check every 2 seconds
    fall 2 # require 2 failures for KO
    rise 2 # require 2 successes for OK
}

vrrp_instance VI_1 {
    state MASTER
    interface em1
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass mypass
    }
    virtual_ipaddress {
        192.168.1.13
    }
    track_script {
        chk_myscript
    }
    notify "/opt/soft/notify_script.sh"
}

#cat is_maxscale_running.sh

#!/bin/bash
# Health check run by keepalived: succeeds only if maxadmin answers within
# the timeout and its output lists both backend servers.
fileName="maxadmin_output.txt"
rm -f "$fileName"
timeout 2s maxadmin -h127.0.0.1 -uadmin -pmariadb list servers > "$fileName"
to_result=$?
if [ $to_result -ge 1 ]
then
    echo "Timed out or error, timeout returned $to_result"
    exit 3
else
    echo "MaxAdmin success, rval is $to_result"
    echo "Checking maxadmin output sanity"
    grep1=$(grep server1 "$fileName")
    grep2=$(grep server2 "$fileName")
    if [ "$grep1" ] && [ "$grep2" ]
    then
        echo "All is fine"
        exit 0
    else
        echo "Something is wrong"
        exit 3
    fi
fi

#cat notify_script.sh

#!/bin/bash
TYPE=$1
NAME=$2
STATE=$3
OUTFILE=./state.txt
touch $OUTFILE
case $STATE in
    "MASTER") echo "Setting this MaxScale node to active mode" > $OUTFILE
        maxctrl alter maxscale passive false
        exit 0
        ;;
    "BACKUP") echo "Setting this MaxScale node to passive mode" > $OUTFILE
        maxctrl alter maxscale passive true
        exit 0
        ;;
    "FAULT") echo "MaxScale failed the status check." > $OUTFILE
        maxctrl alter maxscale passive true
        exit 0
        ;;
    *) echo "Unknown state" > $OUTFILE
        exit 1
        ;;
esac

Starting keepalived

/etc/init.d/keepalived start

Stop keepalived and MaxScale in turn and watch the VIP move to the other node.
