概述
greenplum擴容segment可以分為2類:
- 單機上的seg增加,我們稱之為“縱向擴充”
- 單機不變,通過增加機器,我們稱之為“橫向擴充”
下面詳細介紹下這兩種擴容方式的具體步驟和方法
gpexpand指令
首先我們來學習一下 gpexpand 指令,基本使用方法如下:
gpexpand
[-f <hosts_file>]
| -i <input_file> [-B <batch_size>] [-V]
| {-d <hh:mm:ss> | -e '<YYYY-MM-DD hh:mm:ss>'}
[-analyze] [-n <parallel_processes>]
| --rollback
| --clean
[-D <database_name>][--verbose] [--silent]
指令參數:
-B <batch_size>
并發數可以指定同時運作ssh指令的數量,預設時16,參數範圍是1-128
一般的預設值即可
-D <database_name>
指定要在其中建立擴充模式的資料庫和表。如果未提供此選項,則為使用環境變量PGDATABASE。資料庫模闆不能使用template1和template0。
-i | --input <input_file>
指定擴充配置檔案的名稱
<hostname>:<address>:<port>:<fselocation>:<dbid>:<content>:<preferred_role>:<replication_port>
主機名:位址:端口:segment data目錄,注意是全路徑:dbid号:角色
-n <parallel_processes>
同時進行重分布的表的個數,有效的參數為1-16,每一個表重分布必須要有兩個資料庫連接配接,設定之前,要確定最大連接配接數max_connections這個參數可以滿足這個值
-S | --simple_progress
簡單模式,不會顯示提示
-t | --tardir <directory>
指定要放置tar檔案的段主機上的臨時目錄。
-v | --verbose
調試模式,列印出詳細的執行計劃
-V | --novacuum
建立schema副本時,不對資料字典進行vacuum
縱向拓展
原Greenplum叢集狀态
原叢集的一個master節點和2個資料節點,每一個資料節點上有4個primary計算節點(為了顯示清晰,就沒有增加mirror節點了)具體資訊如下:
postgres=# SELECT * from gp_segment_configuration ;
dbid | content | role | preferred_role | mode | status | port | hostname | address | replication_port | san_mounts
------+---------+------+----------------+------+--------+-------+----------+---------+------------------+------------
1 | -1 | p | p | s | u | 5432 | gw_mdw1 | gw_mdw1 | |
2 | 0 | p | p | s | u | 40000 | gw_sdw1 | gw_sdw1 | |
6 | 4 | p | p | s | u | 40000 | gw_sdw2 | gw_sdw2 | |
3 | 1 | p | p | s | u | 40001 | gw_sdw1 | gw_sdw1 | |
7 | 5 | p | p | s | u | 40001 | gw_sdw2 | gw_sdw2 | |
4 | 2 | p | p | s | u | 40002 | gw_sdw1 | gw_sdw1 | |
8 | 6 | p | p | s | u | 40002 | gw_sdw2 | gw_sdw2 | |
5 | 3 | p | p | s | u | 40003 | gw_sdw1 | gw_sdw1 | |
9 | 7 | p | p | s | u | 40003 | gw_sdw2 | gw_sdw2 | |
(9 rows)
現在準備通過在每台機器上再增加1個節點,來擴容segment,簡單的來說就是縱向擴容
生成參數檔案
指定一個資料庫,在此資料庫下生成schema
再建立一個seg_hosts的檔案,即所有資料節點的主機名
[gpadmin@gw_mdw1 ~]$ cat seg_hosts
gw_sdw1
gw_sdw2
執行指令,生成參數檔案
[gpadmin@gw_mdw1 ~]$ gpexpand -f seg_hosts -D test
20190327:23:18:01:007122 gpexpand:gw_mdw1:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.1.0 build 6'
20190327:23:18:01:007122 gpexpand:gw_mdw1:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.1.0 build 6) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Jun 11 2014 17:23:40'
20190327:23:18:01:007122 gpexpand:gw_mdw1:gpadmin-[INFO]:-Querying gpexpand schema for current expansion state
System Expansion is used to add segments to an existing GPDB array.
gpexpand did not detect a System Expansion that is in progress.
Before initiating a System Expansion, you need to provision and burn-in
the new hardware. Please be sure to run gpcheckperf/gpcheckos to make
sure the new hardware is working properly.
Please refer to the Admin Guide for more information.
Would you like to initiate a new System Expansion Yy|Nn (default=N):
> y <------确認添加
How many new primary segments per host do you want to add? (default=0):
> 1 <------每台機器上添加1個計算節點
Enter new primary data directory 1:
> /data/primary <------增加的計算節點存放的目錄
Generating configuration file...
20190327:23:19:03:007122 gpexpand:gw_mdw1:gpadmin-[INFO]:-Generating input file...
Input configuration files were written to 'gpexpand_inputfile_20190327_231903' and 'None'.
Please review the file and make sure that it is correct then re-run
with: gpexpand -i gpexpand_inputfile_20190327_231903 -D test
20190327:23:19:03:007122 gpexpand:gw_mdw1:gpadmin-[INFO]:-Exiting...
參數檔案的内容如下:
[gpadmin@gw_mdw1 ~]$ cat gpexpand_inputfile_20190327_231903
gw_sdw1:gw_sdw1:40004:/data/primary/gpseg8:10:8:p
gw_sdw2:gw_sdw2:40004:/data/primary/gpseg9:11:9:p
如果存放的目錄有所改變,可以手動去修改此檔案,将該計算節點存放在自己想要放在的位置
利用參數檔案執行拓展指令
[gpadmin@gw_mdw1 ~]$ gpexpand -i gpexpand_inputfile_20190327_231903 -D test
20190327:23:20:19:007227 gpexpand:gw_mdw1:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.1.0 build 6'
20190327:23:20:19:007227 gpexpand:gw_mdw1:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.1.0 build 6) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Jun 11 2014 17:23:40'
.......(省略資訊)
20190327:23:22:19:007227 gpexpand:gw_mdw1:gpadmin-[INFO]:-************************************************
20190327:23:22:19:007227 gpexpand:gw_mdw1:gpadmin-[INFO]:-Initialization of the system expansion complete.
20190327:23:22:19:007227 gpexpand:gw_mdw1:gpadmin-[INFO]:-To begin table expansion onto the new segments
20190327:23:22:19:007227 gpexpand:gw_mdw1:gpadmin-[INFO]:-rerun gpexpand
20190327:23:22:19:007227 gpexpand:gw_mdw1:gpadmin-[INFO]:-************************************************
20190327:23:22:19:007227 gpexpand:gw_mdw1:gpadmin-[INFO]:-Exiting...
檢視新添加的狀态
此時去資料庫裡檢視相關節點資訊,可以看到已經增加了兩個節點了。
postgres=# SELECT * from gp_segment_configuration ;
dbid | content | role | preferred_role | mode | status | port | hostname | address | replication_port | san_mounts
------+---------+------+----------------+------+--------+-------+----------+---------+------------------+------------
1 | -1 | p | p | s | u | 5432 | gw_mdw1 | gw_mdw1 | |
2 | 0 | p | p | s | u | 40000 | gw_sdw1 | gw_sdw1 | |
6 | 4 | p | p | s | u | 40000 | gw_sdw2 | gw_sdw2 | |
3 | 1 | p | p | s | u | 40001 | gw_sdw1 | gw_sdw1 | |
7 | 5 | p | p | s | u | 40001 | gw_sdw2 | gw_sdw2 | |
4 | 2 | p | p | s | u | 40002 | gw_sdw1 | gw_sdw1 | |
8 | 6 | p | p | s | u | 40002 | gw_sdw2 | gw_sdw2 | |
5 | 3 | p | p | s | u | 40003 | gw_sdw1 | gw_sdw1 | |
9 | 7 | p | p | s | u | 40003 | gw_sdw2 | gw_sdw2 | |
10 | 8 | p | p | s | u | 40004 | gw_sdw1 | gw_sdw1 | |
11 | 9 | p | p | s | u | 40004 | gw_sdw2 | gw_sdw2 | |
(11 rows)
資料重分布
執行重分布指令将資料重分布
[gpadmin@gw_mdw1 ~]$ gpexpand -a -d 1:00:00 -D test
此時還沒有完成,去指定的資料庫下面檢視重分布狀态
test=# select * from gpexpand.status;
status | updated
------------+----------------------------
SETUP | 2019-03-27 23:21:49.299565
SETUP DONE | 2019-03-27 23:21:53.737479
(2 rows)
還有多少個表需要重分布,這裡我的已經同步完成,因為是測試環境沒有什麼資料量
test=# select * from gpexpand.expansion_progress ;
name | value
------------------------------+-------
Estimated Expansion Rate |
Estimated Time to Completion |
(2 rows)
完成重分布之後,将schema删除
[gpadmin@gw_mdw1 ~]$ gpexpand -c -D test
橫向拓展
這次我們不僅在每台機器上添加1個計算節點,還添加一個資料節點(也就是一個新機器)
生成配置檔案
這次我們在主機的名單中再增加一個新的機器gw_sdw3,此機器是新添加進來的一個節點。隻是安裝了軟體,沒有任何資料。
[gpadmin@gw_mdw1 ~]$ cat seg_hosts
gw_sdw1
gw_sdw2
gw_sdw3
使用同樣的方法執行,每一節點也選擇增加1個計算節點
[gpadmin@gw_mdw1 ~]$ gpexpand -f seg_hosts -D test
生成檔案如下:
可以看到,雖然隻是添加一個,greenplum檢視到gw_sdw3上并沒有一個節點,就自動為其一次添加6個,是它與其他兩個機器數量一緻
[gpadmin@gw_mdw1 ~]$ cat gpexpand_inputfile_20190328_014748
gw_sdw3:gw_sdw3:40000:/data/primary/gpseg10:12:10:p
gw_sdw3:gw_sdw3:40001:/data/primary/gpseg11:13:11:p
gw_sdw3:gw_sdw3:40002:/data/primary/gpseg12:14:12:p
gw_sdw3:gw_sdw3:40003:/data/primary/gpseg13:15:13:p
gw_sdw3:gw_sdw3:40004:/data/primary/gpseg14:16:14:p
gw_sdw1:gw_sdw1:40005:/data/primary/gpseg15:17:15:p
gw_sdw2:gw_sdw2:40005:/data/primary/gpseg16:18:16:p
gw_sdw3:gw_sdw3:40005:/data/primary/gpseg17:19:17:p
執行擴容指令
執行擴容指令,和上面一樣
[gpadmin@gw_mdw1 ~]$ gpexpand -i gpexpand_inputfile_20190328_014748 -D test
20190328:23:50:44:001498 gpexpand:gw_mdw1:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.1.0 build 6'
20190328:23:50:44:001498 gpexpand:gw_mdw1:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.1.0 build 6) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Jun 11 2014 17:23:40'
....(忽略資訊)
20190328:23:54:59:001498 gpexpand:gw_mdw1:gpadmin-[INFO]:-Starting Greenplum Database
20190328:23:55:32:001498 gpexpand:gw_mdw1:gpadmin-[INFO]:-************************************************
20190328:23:55:32:001498 gpexpand:gw_mdw1:gpadmin-[INFO]:-Initialization of the system expansion complete.
20190328:23:55:32:001498 gpexpand:gw_mdw1:gpadmin-[INFO]:-To begin table expansion onto the new segments
20190328:23:55:32:001498 gpexpand:gw_mdw1:gpadmin-[INFO]:-rerun gpexpand
20190328:23:55:32:001498 gpexpand:gw_mdw1:gpadmin-[INFO]:-************************************************
20190328:23:55:32:001498 gpexpand:gw_mdw1:gpadmin-[INFO]:-Exiting...
檢視資料庫的節點狀态,發現已經增加了8個節點,其中2個是縱向添加,6個是橫向添加
postgres=# SELECT * from gp_segment_configuration ;
dbid | content | role | preferred_role | mode | status | port | hostname | address | replication_port | san_mounts
------+---------+------+----------------+------+--------+-------+----------+---------+------------------+------------
1 | -1 | p | p | s | u | 5432 | gw_mdw1 | gw_mdw1 | |
2 | 0 | p | p | s | u | 40000 | gw_sdw1 | gw_sdw1 | |
6 | 4 | p | p | s | u | 40000 | gw_sdw2 | gw_sdw2 | |
3 | 1 | p | p | s | u | 40001 | gw_sdw1 | gw_sdw1 | |
7 | 5 | p | p | s | u | 40001 | gw_sdw2 | gw_sdw2 | |
4 | 2 | p | p | s | u | 40002 | gw_sdw1 | gw_sdw1 | |
8 | 6 | p | p | s | u | 40002 | gw_sdw2 | gw_sdw2 | |
5 | 3 | p | p | s | u | 40003 | gw_sdw1 | gw_sdw1 | |
9 | 7 | p | p | s | u | 40003 | gw_sdw2 | gw_sdw2 | |
10 | 8 | p | p | s | u | 40004 | gw_sdw1 | gw_sdw1 | |
12 | 10 | p | p | s | u | 40000 | gw_sdw3 | gw_sdw3 | |
11 | 9 | p | p | s | u | 40004 | gw_sdw2 | gw_sdw2 | |
13 | 11 | p | p | s | u | 40001 | gw_sdw3 | gw_sdw3 | |
14 | 12 | p | p | s | u | 40002 | gw_sdw3 | gw_sdw3 | |
15 | 13 | p | p | s | u | 40003 | gw_sdw3 | gw_sdw3 | |
16 | 14 | p | p | s | u | 40004 | gw_sdw3 | gw_sdw3 | |
17 | 15 | p | p | s | u | 40005 | gw_sdw1 | gw_sdw1 | |
18 | 16 | p | p | s | u | 40005 | gw_sdw2 | gw_sdw2 | |
19 | 17 | p | p | s | u | 40005 | gw_sdw3 | gw_sdw3 | |
(19 rows)
重分布表
執行表的重分布指令
[gpadmin@gw_mdw1 ~]$ gpexpand -a -d 1:00:00 -D test
20190329:00:00:56:002723 gpexpand:gw_mdw1:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.1.0 build 6'
20190329:00:00:56:002723 gpexpand:gw_mdw1:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.1.0 build 6) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Jun 11 2014 17:23:40'
20190329:00:00:56:002723 gpexpand:gw_mdw1:gpadmin-[INFO]:-Querying gpexpand schema for current expansion state
20190329:00:01:03:002723 gpexpand:gw_mdw1:gpadmin-[INFO]:-EXPANSION COMPLETED SUCCESSFULLY
20190329:00:01:03:002723 gpexpand:gw_mdw1:gpadmin-[INFO]:-Exiting...
去指定資料庫下檢視,重分布狀态
test=# select * from gpexpand.status;
status | updated
--------------------+----------------------------
SETUP | 2019-03-28 23:54:39.325544
SETUP DONE | 2019-03-28 23:54:48.213527
EXPANSION STARTED | 2019-03-29 00:00:58.467156
EXPANSION COMPLETE | 2019-03-29 00:00:58.979106
(4 rows)
[gpadmin@gw_mdw1 ~]$ gpexpand -c -D test
至此,兩種方式的擴容測試都示範完畢