天天看点

AntDB读写分离探索AntDB读写分离探索

AntDB读写分离探索

本次测试过程将AntDB集群的datanode slave备机节点作为可读节点来实现。这样做的主要目的是合理利用资源,提高系统吞吐量。

适用场景: 针对同一时间有大量的读操作和较少的写操作,并且读操作对数据的实时性要求并没有那么高。在此前提下,可以这么设计解决方案。

一、环境准备

本次测试过程只考虑一主一从的场景,datanode 一主两从的场景原理类似,感兴趣的朋友可以动手试试。

搭建一个2C2D(datanode 一主一从同步)的antdb集群:

AntDB读写分离探索AntDB读写分离探索
postgres=# monitor all;
 nodename |    nodetype     | status | description |     host     | port  
----------+-----------------+--------+-------------+--------------+-------
 coord1   | coordinator     | t      | running     | 10.21.20.175 |  6603
 coord2   | coordinator     | t      | running     | 10.21.20.176 |  6603
 db1      | datanode master | t      | running     | 10.21.20.175 | 14000
 db1      | datanode slave  | t      | running     | 10.21.20.176 | 14001
 db2      | datanode master | t      | running     | 10.21.20.176 | 15000
 db2      | datanode slave  | t      | running     | 10.21.20.175 | 15001
 gtm      | gtm master      | t      | running     | 10.21.20.175 |  7663
(7 rows)

postgres=# list node;
  name  |    host    |      type       | mastername | port  | sync_state |            path            | initialized | incluster 
--------+------------+-----------------+------------+-------+------------+----------------------------+-------------+-----------
 coord1 | localhost1 | coordinator     |            |  6603 |            | /data/zgy/pgxc_data/coord1 | t           | t
 coord2 | localhost2 | coordinator     |            |  6603 |            | /data/zgy/pgxc_data/coord2 | t           | t
 db1    | localhost1 | datanode master |            | 14000 |            | /data/zgy/pgxc_data/db1    | t           | t
 db1    | localhost2 | datanode slave  | db1        | 14001 | sync       | /data/zgy/pgxc_data/db1s   | t           | t
 db2    | localhost2 | datanode master |            | 15000 |            | /data/zgy/pgxc_data/db2    | t           | t
 db2    | localhost1 | datanode slave  | db2        | 15001 | sync       | /data/zgy/pgxc_data/db2s   | t           | t
 gtm    | localhost1 | gtm master      |            |  7663 |            | /data/zgy/pgxc_data/gtm    | t           | t
(7 rows)
           

二、数据准备

连接coord1,创建table并写入数据:

postgres=# create table a (id int ,name varchar(),score int) distribute by hash(id);                         
CREATE TABLE
postgres=# insert into a select generate_series(,),'jay'||generate_series(,),ceil(random()*);  
INSERT  
postgres=# select count(*) from a;                
 count 
-------
   100
(1 row)
           

三、修改coord2节点的pgxc_node关于datanode节点信息

这里也可以通过adbmgr,添加一个coord3,部署在datanode slave节点所在的服务器。本次实验由于服务器只有2台,就只基于原集群进行改造。

psql (PGXC 1.2devel, based on PG 9.3.13 ADB 2.2devel 69c858e353)
Type "help" for help.

postgres=# select * from pgxc_node;
 node_name | node_type | node_port |  node_host   | nodeis_primary | nodeis_preferred |   node_id   
-----------+-----------+-----------+--------------+----------------+------------------+-------------
 coord1    | C         |      6603 | 10.21.20.175 | f              | f                |  1885696643
 coord2    | C         |      6603 | 10.21.20.176 | f              | f                | -1197102633
 db1       | D         |     14000 | 10.21.20.175 | t              | f                |    -2885965
 db2       | D         |     15000 | 10.21.20.176 | f              | t                |   -79866771
(4 rows)

postgres=# update pgxc_node set node_port=,node_host='10.21.20.176' where node_name='db1';
UPDATE 
postgres=# update pgxc_node set node_port=,node_host='10.21.20.175' where node_name='db2';  
UPDATE 
postgres=# select pgxc_pool_reload();
 pgxc_pool_reload 
------------------
 t
(1 row)
           

到此,读写分离配置完成。此刻,coord1提供读写datanode master节点,coord2提供读datanode slave节点。

我们来看下效果:

连接coord2,执行简单查询:

psql (PGXC 1.2devel, based on PG 9.3.13 ADB 2.2devel 69c858e353)
Type "help" for help.

postgres=# select count(*) from a;
 count 
-------
   100
(1 row)
postgres=# explain verbose select count(*) from a;
                                      QUERY PLAN                                       
---------------------------------------------------------------------------------------
 Aggregate  (cost=2.50..2.51 rows=1 width=0)
   Output: pg_catalog.count(*)
   ->  Data Node Scan on "__REMOTE_GROUP_QUERY__"  (cost=0.00..0.00 rows=1000 width=0)
         Output: (count(*))
         Node/s: db1, db2
         Remote query: SELECT count(*) FROM ONLY public.a a WHERE true
( rows)
           

此时查看进程,发现dbslave节点产生backend进程:

424707 ?        S      0:00 /data/zgy/app/bin/postgres --datanode -D /data/zgy/pgxc_data/db1s -i
424709 ?        Ss     0:00  \_ postgres: logger process   
424710 ?        Ss     0:00  \_ postgres: startup process   recovering 000000010000000000000003
424711 ?        Ss     0:00  \_ postgres: checkpointer process   
424712 ?        Ss     0:00  \_ postgres: writer process   
424713 ?        Ss     0:00  \_ postgres: stats collector process   
424715 ?        Ss     0:00  \_ postgres: wal receiver process   streaming 0/3045C90
425072 ?        Ss     0:00  \_ postgres: zgy postgres 10.21.20.176(20063) idle
           

连接coord1进行读写操作:

[[email protected] highavailable]$ coord1
psql (PGXC 1.2devel, based on PG 9.3.13 ADB 2.2devel 69c858e353)
Type "help" for help.

postgres=# explain verbose select count(*) from a; 
                                      QUERY PLAN                                       
---------------------------------------------------------------------------------------
 Aggregate  (cost=2.50..2.51 rows=1 width=0)
   Output: pg_catalog.count(*)
   ->  Data Node Scan on "__REMOTE_GROUP_QUERY__"  (cost=0.00..0.00 rows=1000 width=0)
         Output: (count(*))
         Node/s: db1, db2
         Remote query: SELECT count(*) FROM ONLY public.a a WHERE true
( rows)

postgres=# insert into a select generate_series(,),'jay'||generate_series(,),ceil(random()*);           
INSERT  
postgres=# select count(*) from a;
 count 
-------
   110
(1 row)
           

由于antdb 2.2版本,datanode 主备nodename必须相同,读写分离看起来不是很明显,我们可以这样操作:

- 通过adbmgr模块,将datanode master 节点都手动stop,来验证coord2是否仍可读:

postgres=# monitor all ;
 nodename |    nodetype     | status | description |     host     | port  
----------+-----------------+--------+-------------+--------------+-------
 coord1   | coordinator     | t      | running     | 10.21.20.175 |  6603
 coord2   | coordinator     | t      | running     | 10.21.20.176 |  6603
 db1      | datanode master | t      | running     | 10.21.20.175 | 14000
 db1      | datanode slave  | t      | running     | 10.21.20.176 | 14001
 db2      | datanode master | t      | running     | 10.21.20.176 | 15000
 db2      | datanode slave  | t      | running     | 10.21.20.175 | 15001
 gtm      | gtm master      | t      | running     | 10.21.20.175 |  7663
(7 rows)

postgres=# stop datanode master db1 MODE F;
NOTICE:  10.21.20.175, pg_ctl  stop -D /data/zgy/pgxc_data/db1 -Z datanode -m fast -o -i -w -c
 nodename | status | description 
----------+--------+-------------
 db1      | t      | success
(1 row)

postgres=# stop datanode master db2 MODE F; 
NOTICE:  10.21.20.176, pg_ctl  stop -D /data/zgy/pgxc_data/db2 -Z datanode -m fast -o -i -w -c
 nodename | status | description 
----------+--------+-------------
 db2      | t      | success
(1 row)

postgres=# monitor all ;
 nodename |    nodetype     | status | description |     host     | port  
----------+-----------------+--------+-------------+--------------+-------
 coord1   | coordinator     | t      | running     | 10.21.20.175 |  6603
 coord2   | coordinator     | t      | running     | 10.21.20.176 |  6603
 db1      | datanode master | f      | not running | 10.21.20.175 | 14000
 db1      | datanode slave  | t      | running     | 10.21.20.176 | 14001
 db2      | datanode master | f      | not running | 10.21.20.176 | 15000
 db2      | datanode slave  | t      | running     | 10.21.20.175 | 15001
 gtm      | gtm master      | t      | running     | 10.21.20.175 |  7663
(7 rows)
           

分别连接coord1 和coord2 读表a的数据:

[[email protected] highavailable]$ coord1
psql (PGXC 1.2devel, based on PG 9.3.13 ADB 2.2devel 69c858e353)
Type "help" for help.

postgres=# select count(*) from a;
ERROR:  error message from poolmgr:reconnect three thimes , could not connect to server: Connection refused
        Is the server running on host "10.21.20.175" and accepting
        TCP/IP connections on port 14000?


[[email protected] ~]$ coord2
psql (PGXC 1.2devel, based on PG 9.3.13 ADB 2.2devel 69c858e353)
Type "help" for help.

postgres=# select count(*) from a;
 count 
-------
   110
(1 row)
           

四、异常场景

有以下几个场景需要考虑:

- 1、datanode 备机节点异常(宕机,网络异常等等),业务查询将出错

- 2、datanode matser发生一次切换后,需要人工介入添加备机,再将备机的信息更新到读的coord列表

- 3、datanode 备机流复制延时(异步wal日志文件延时&同步节点recover数据延时),都将会导致查询的数据不准确

- 4、提供读的coord节点 ,不能作为写操作的提供者

4.1 datanode master 节点发生异常,触发failover 操作后:

AntDB读写分离探索AntDB读写分离探索

模拟过程:

postgres=# monitor all ;
 nodename |    nodetype     | status | description |     host     | port  
----------+-----------------+--------+-------------+--------------+-------
 coord1   | coordinator     | t      | running     | 10.21.20.175 |  6603
 coord2   | coordinator     | t      | running     | 10.21.20.176 |  6603
 db1      | datanode master | t      | running     | 10.21.20.175 | 14000
 db1      | datanode slave  | t      | running     | 10.21.20.176 | 14001
 db2      | datanode master | t      | running     | 10.21.20.176 | 15000
 db2      | datanode slave  | t      | running     | 10.21.20.175 | 15001
 gtm      | gtm master      | t      | running     | 10.21.20.175 |  7663
(7 rows)

postgres=# failover datanode db1;
NOTICE:  lock cluster: SELECT PG_PAUSE_CLUSTER();
NOTICE:  10.21.20.175, pg_ctl  stop -D /data/zgy/pgxc_data/db1 -Z datanode -m immediate -o -i -w -c
NOTICE:  10.21.20.176, pg_ctl  promote -w -D /data/zgy/pgxc_data/db1s
NOTICE:  unlock cluster: SELECT PG_UNPAUSE_CLUSTER();
WARNING:  the datanode master "db1" has no slave or extra, it is better to append a new datanode slave node
 nodename | status | description 
----------+--------+-------------
 db1      | t      | success
(1 row)

postgres=# monitor all ;
 nodename |    nodetype     | status | description |     host     | port  
----------+-----------------+--------+-------------+--------------+-------
 coord1   | coordinator     | t      | running     | 10.21.20.175 |  6603
 coord2   | coordinator     | t      | running     | 10.21.20.176 |  6603
 db1      | datanode master | t      | running     | 10.21.20.176 | 14001
 db2      | datanode master | t      | running     | 10.21.20.176 | 15000
 db2      | datanode slave  | t      | running     | 10.21.20.175 | 15001
 gtm      | gtm master      | t      | running     | 10.21.20.175 |  7663
(6 rows)
           

分别连接coord1和coord2,执行操作:

[[email protected] highavailable]$ coord1
psql (PGXC 1.2devel, based on PG 9.3.13 ADB 2.2devel 69c858e353)
Type "help" for help.

postgres=# select * from pgxc_node;
 node_name | node_type | node_port |  node_host   | nodeis_primary | nodeis_preferred |   node_id   
-----------+-----------+-----------+--------------+----------------+------------------+-------------
 coord1    | C         |      6603 | 10.21.20.175 | f              | f                |  1885696643
 coord2    | C         |      6603 | 10.21.20.176 | f              | f                | -1197102633
 db1       | D         |     14001 | 10.21.20.176 | t              | f                |    -2885965
 db2       | D         |     15000 | 10.21.20.176 | f              | t                |   -79866771
(4 rows)

postgres=# select count(*) from a;
 count 
-------
   110
(1 row)

postgres=# insert into a select generate_series(,),'jay'||generate_series(,),ceil(random()*)              
postgres-# ;
INSERT  
postgres=# explain verbose select * from a;
                                 QUERY PLAN                                 
----------------------------------------------------------------------------
 Data Node Scan on "__REMOTE_FQS_QUERY__"  (cost=0.00..0.00 rows=0 width=0)
   Output: a.id, a.name, a.score
   Node/s: db1, db2
   Remote query: SELECT id, name, score FROM public.a a
( rows)

postgres=# 

[[email protected] highavailable]$ coord2
psql (PGXC devel, based on PG  ADB devel c858e353)
Type "help" for help.

postgres=# select * from pgxc_node;
 node_name | node_type | node_port |  node_host   | nodeis_primary | nodeis_preferred |   node_id   
-----------+-----------+-----------+--------------+----------------+------------------+-------------
 coord1    | C         |      6603 | 10.21.20.175 | f              | f                |  1885696643
 coord2    | C         |      6603 | 10.21.20.176 | f              | f                | -1197102633
 db1       | D         |     14001 | 10.21.20.176 | t              | t                |    -2885965
 db2       | D         |     15000 | 10.21.20.176 | f              | f                |   -79866771
(4 rows)

postgres=#  select count(*) from a;
 count 
-------
   120
(1 row)
postgres=# explain verbose select * from a;
                                 QUERY PLAN                                 
----------------------------------------------------------------------------
 Data Node Scan on "__REMOTE_FQS_QUERY__"  (cost=0.00..0.00 rows=0 width=0)
   Output: a.id, a.name, a.score
   Node/s: db1, db2
   Remote query: SELECT id, name, score FROM public.a a
( rows)
           

通过adbmgr rewind将宕掉的原master节点作为db1备节点再次加到集群中来:

备注:antdb2.2版本使用rewind的前提是在初始化集群时带上参数 –datachecksums , 否则会报错:ERROR: rewind datanode slave “db1” fail, target master need to use data checksums. 如果初始化没带参数,可以通过append datanode slave nodename的方式添加备机。

[zgy@INTEL175 highavailable]$ adbmgr
psql (PGXC devel, based on PG  ADB devel c858e353)
Type "help" for help.

postgres=# list node;
  name  |    host    |      type       | mastername | port  | sync_state |            path            | initialized | incluster 
--------+------------+-----------------+------------+-------+------------+----------------------------+-------------+-----------
 coord1 | localhost1 | coordinator     |            |   |            | /data/zgy/pgxc_data/coord1 | t           | t
 coord2 | localhost2 | coordinator     |            |   |            | /data/zgy/pgxc_data/coord2 | t           | t
 db1    | localhost2 | datanode master |            |  |            | /data/zgy/pgxc_data/db1s   | t           | t
 db2    | localhost2 | datanode master |            |  |            | /data/zgy/pgxc_data/db2    | t           | t
 db2    | localhost1 | datanode slave  | db2        |  | sync       | /data/zgy/pgxc_data/db2s   | t           | t
 gtm    | localhost1 | gtm master      |            |   |            | /data/zgy/pgxc_data/gtm    | t           | t
( rows)

postgres=# add datanode slave db1(host='localhost1',port=,path='/data/zgy/pgxc_data/db1');       
ADD NODE
postgres=# rewind datanode slave db1;
NOTICE:  pg_ctl restart datanode slave "db1"
NOTICE:  , pg_ctl  restart -D /data/zgy/pgxc_data/db1 -Z datanode -m fast -o -i -w -c -l /data/zgy/pgxc_data/db1/logfile
NOTICE:  wait max  seconds to check datanode slave "db1" running normal
NOTICE:  pg_ctl stop datanode slave "db1" with fast mode
NOTICE:  , pg_ctl  stop -D /data/zgy/pgxc_data/db1 -Z datanode -m fast -o -i -w -c
NOTICE:  wait max  seconds to check datanode slave "db1" stop complete
NOTICE:  update datanode master "db1" pg_hba.conf for the rewind node db1
NOTICE:  update gtm master "gtm" pg_hba.conf for the rewind node db1
NOTICE:  on datanode master "db1" execute "checkpoint"
NOTICE:  ,  /data/zgy/app/bin/pg_controldata '/data/zgy/pgxc_data/db1s' | grep 'Minimum recovery ending location:' |awk '{print $5}'
NOTICE:  receive msg: {"result":"0/0"}
NOTICE:  ,  /data/zgy/app/bin/pg_controldata '/data/zgy/pgxc_data/db1s' |grep 'Min recovery ending loc' |awk '{print $6}'
NOTICE:  receive msg: {"result":"0"}
NOTICE:  , pg_rewind  --target-pgdata /data/zgy/pgxc_data/db1 --source-server='host=10.21.20.176 port=14001 user=zgy dbname=postgres' -N db1

postgres=# monitor all ;
 nodename |    nodetype     | status | description |     host     | port  
----------+-----------------+--------+-------------+--------------+-------
 coord1   | coordinator     | t      | running     |  |  
 coord2   | coordinator     | t      | running     |  |  
 db1      | datanode master | t      | running     |  | 
 db1      | datanode slave  | t      | running     |  | 
 db2      | datanode master | t      | running     |  | 
 db2      | datanode slave  | t      | running     |  | 
 gtm      | gtm master      | t      | running     |  |  
( rows)
           

添加db1后,需要手动修改coord2的pgxc_node 关于datanode 节点信息:

[[email protected] highavailable]$ coord2
psql (PGXC 1.2devel, based on PG 9.3.13 ADB 2.2devel 69c858e353)
Type "help" for help.

postgres=# 
postgres=# select * from pgxc_node;
 node_name | node_type | node_port |  node_host   | nodeis_primary | nodeis_preferred |   node_id   
-----------+-----------+-----------+--------------+----------------+------------------+-------------
 coord1    | C         |      6603 | 10.21.20.175 | f              | f                |  1885696643
 coord2    | C         |      6603 | 10.21.20.176 | f              | f                | -1197102633
 db1       | D         |     14001 | 10.21.20.176 | t              | t                |    -2885965
 db2       | D         |     15000 | 10.21.20.176 | f              | f                |   -79866771
(4 rows)

postgres=# update pgxc_node set node_port=,node_host='10.21.20.175'  where node_name='db1';                                 
UPDATE 
postgres=# update pgxc_node set node_port=,node_host='10.21.20.175'  where node_name='db2';  
UPDATE 
postgres=# select pgxc_pool_reload();
 pgxc_pool_reload 
------------------
 t
(1 row)
           

此时,coord1 读写datanode master节点,而coord2又重新读取datanode slave节点。

4.2 datanode slave 节点发生异常:

antdb监控脚本,针对datanode slave节点状态异常后,主动尝试重新拉起,重试间隔和次数均无法拉起。此时就需要将slave节点移出集群。提供读的coord的pgxc_node 的datanode节点信息需要同步更新。

AntDB读写分离探索AntDB读写分离探索

模拟过程:

连接adbmgr,手动停止datanode slave db1:

postgres=# stop datanode slave db1 mode f;
NOTICE:  10.21.20.175, pg_ctl  stop -D /data/zgy/pgxc_data/db1 -Z datanode -m fast -o -i -w -c
 nodename | status | description 
----------+--------+-------------
 db1      | t      | success
(1 row)

postgres=# monitor all ;
 nodename |    nodetype     | status | description |     host     | port  
----------+-----------------+--------+-------------+--------------+-------
 coord1   | coordinator     | t      | running     | 10.21.20.175 |  6603
 coord2   | coordinator     | t      | running     | 10.21.20.176 |  6603
 db1      | datanode master | t      | running     | 10.21.20.176 | 14001
 db1      | datanode slave  | f      | not running | 10.21.20.175 | 14000
 db2      | datanode master | t      | running     | 10.21.20.176 | 15000
 db2      | datanode slave  | t      | running     | 10.21.20.175 | 15001
 gtm      | gtm master      | t      | running     | 10.21.20.175 |  7663
(7 rows)
           

此时连接coord2,查询table a,报错:

[[email protected] highavailable]$ coord2
psql (PGXC 1.2devel, based on PG 9.3.13 ADB 2.2devel 69c858e353)
Type "help" for help.

postgres=# select count(*) from a;
ERROR:  error message from poolmgr:reconnect three thimes , could not connect to server: Connection refused
        Is the server running on host "10.21.20.175" and accepting
        TCP/IP connections on port 14000?

postgres=# 
           

更新coord2的pgxc_node datanode db1的信息:

postgres=# select * from pgxc_node;
 node_name | node_type | node_port |  node_host   | nodeis_primary | nodeis_preferred |   node_id   
-----------+-----------+-----------+--------------+----------------+------------------+-------------
 coord1    | C         |      6603 | 10.21.20.175 | f              | f                |  1885696643
 coord2    | C         |      6603 | 10.21.20.176 | f              | f                | -1197102633
 db1       | D         |     14000 | 10.21.20.175 | t              | t                |    -2885965
 db2       | D         |     15001 | 10.21.20.175 | f              | f                |   -79866771
(4 rows)

postgres=# update pgxc_node set node_port=,node_host='10.21.20.176'  where node_name='db1';                 
UPDATE 
postgres=# select pgxc_pool_reload();
 pgxc_pool_reload 
------------------
 t
(1 row)

postgres=# select count(*) from a; 
 count 
-------
   120
(1 row)
           

由于之前添加一主一从为同步流复制方式,因此此时coord1写会hang住,一直等待db1的同步从节点返回结果,如下:

[[email protected] ~]$ coord1
psql (PGXC 1.2devel, based on PG 9.3.13 ADB 2.2devel 69c858e353)
Type "help" for help.

postgres=# select count(*) from a;
 count 
-------
   120
(1 row)

postgres=# insert into a select generate_series(,),'jay'||generate_series(,),ceil(random()*)              
;




           

对于这样的场景,可以连接adbmgr 修改db1的流复制方式:

[[email protected] highavailable]$ adbmgr
psql (PGXC 1.2devel, based on PG 9.3.13 ADB 2.2devel 69c858e353)
Type "help" for help.

postgres=# alter datanode slave db1  (sync_state='async');                
ALTER NODE
postgres=# monitor all ;
 nodename |    nodetype     | status | description |     host     | port  
----------+-----------------+--------+-------------+--------------+-------
 coord1   | coordinator     | t      | running     | 10.21.20.175 |  6603
 coord2   | coordinator     | t      | running     | 10.21.20.176 |  6603
 db1      | datanode master | t      | running     | 10.21.20.176 | 14001
 db1      | datanode slave  | f      | not running | 10.21.20.175 | 14000
 db2      | datanode master | t      | running     | 10.21.20.176 | 15000
 db2      | datanode slave  | t      | running     | 10.21.20.175 | 15001
 gtm      | gtm master      | t      | running     | 10.21.20.175 |  7663
(7 rows)
           

宕掉的datanode slave节点,可以通过adbmgr remove集群后drop掉。

postgres=# remove datanode slave db1;
WARNING:  the datanode master "db1" has no synchronous slave or extra node
REMOVE NODE
postgres=# drop datanode slave db1;
DROP NODE
postgres=# monitor all ;
 nodename |    nodetype     | status | description |     host     | port  
----------+-----------------+--------+-------------+--------------+-------
 coord1   | coordinator     | t      | running     | 10.21.20.175 |  6603
 coord2   | coordinator     | t      | running     | 10.21.20.176 |  6603
 db1      | datanode master | t      | running     | 10.21.20.176 | 14001
 db2      | datanode master | t      | running     | 10.21.20.176 | 15000
 db2      | datanode slave  | t      | running     | 10.21.20.175 | 15001
 gtm      | gtm master      | t      | running     | 10.21.20.175 |  7663
(6 rows)
           

4.3 读slave节点的数据可能不准确

antdb datanode节点也是基于postgresql 流复制的方式实现的。异步wal日志文件延时&同步节点recover数据延时,极端情况下都将会导致查询的数据不准确。这一点暂时没有想到什么好的解决方案,期待社区大牛们有好的建议。

AntDB:
开源url:https://github.com/ADBSQL/AntDB
QQ交流群: