天天看點

Greenplum擴容節點步驟

概述

greenplum擴容segment可以分為2類:

  • 單機上的seg增加,我們稱之為“縱向擴充”
  • 單機不變,通過增加機器,我們稱之為“橫向擴充”

下面詳細介紹下這兩種擴容方式的具體步驟和方法

gpexpand指令

首先我們來學習一下 gpexpand 指令,基本使用方法如下:

gpexpand 
      [-f <hosts_file>]
      | -i <input_file> [-B <batch_size>] [-V] 
      | {-d <hh:mm:ss> | -e '<YYYY-MM-DD hh:mm:ss>'} 
        [-analyze] [-n <parallel_processes>]
      | --rollback
      | --clean
[-D <database_name>][--verbose] [--silent]
           

指令參數:

-B <batch_size>
并發數可以指定同時運作ssh指令的數量,預設時16,參數範圍是1-128
一般的預設值即可

-D <database_name>
指定要在其中建立擴充模式的資料庫和表。如果未提供此選項,則為使用環境變量PGDATABASE。資料庫模闆不能使用template1和template0。


-i | --input <input_file>
指定擴充配置檔案的名稱
  <hostname>:<address>:<port>:<fselocation>:<dbid>:<content>:<preferred_role>:<replication_port>
主機名:位址:端口:segment data目錄,注意是全路徑:dbid号:角色


-n <parallel_processes>
同時進行重分布的表的個數,有效的參數為1-16,每一個表重分布必須要有兩個資料庫連接配接,設定之前,要確定最大連接配接數max_connections這個參數可以滿足這個值

-S | --simple_progress
 簡單模式,不會顯示提示

-t | --tardir <directory>
指定要放置tar檔案的段主機上的臨時目錄。

-v | --verbose
調試模式,列印出詳細的執行計劃

-V | --novacuum
建立schema副本時,不對資料字典進行vacuum           

縱向拓展

原Greenplum叢集狀态

原叢集的一個master節點和2個資料節點,每一個資料節點上有4個primary計算節點(為了顯示清晰,就沒有增加mirror節點了)具體資訊如下:

postgres=# SELECT * from gp_segment_configuration ;
 dbid | content | role | preferred_role | mode | status | port  | hostname | address | replication_port | san_mounts 
------+---------+------+----------------+------+--------+-------+----------+---------+------------------+------------
    1 |      -1 | p    | p              | s    | u      |  5432 | gw_mdw1  | gw_mdw1 |                  | 
    2 |       0 | p    | p              | s    | u      | 40000 | gw_sdw1  | gw_sdw1 |                  | 
    6 |       4 | p    | p              | s    | u      | 40000 | gw_sdw2  | gw_sdw2 |                  | 
    3 |       1 | p    | p              | s    | u      | 40001 | gw_sdw1  | gw_sdw1 |                  | 
    7 |       5 | p    | p              | s    | u      | 40001 | gw_sdw2  | gw_sdw2 |                  | 
    4 |       2 | p    | p              | s    | u      | 40002 | gw_sdw1  | gw_sdw1 |                  | 
    8 |       6 | p    | p              | s    | u      | 40002 | gw_sdw2  | gw_sdw2 |                  | 
    5 |       3 | p    | p              | s    | u      | 40003 | gw_sdw1  | gw_sdw1 |                  | 
    9 |       7 | p    | p              | s    | u      | 40003 | gw_sdw2  | gw_sdw2 |                  | 
(9 rows)           

現在準備通過在每台機器上再增加1個節點,來擴容segment,簡單的來說就是縱向擴容

生成參數檔案

指定一個資料庫,在此資料庫下生成schema

再建立一個seg_hosts的檔案,即所有資料節點的主機名

[gpadmin@gw_mdw1 ~]$ cat seg_hosts 
gw_sdw1
gw_sdw2           

執行指令,生成參數檔案

[gpadmin@gw_mdw1 ~]$ gpexpand -f seg_hosts -D test
20190327:23:18:01:007122 gpexpand:gw_mdw1:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.1.0 build 6'
20190327:23:18:01:007122 gpexpand:gw_mdw1:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.1.0 build 6) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Jun 11 2014 17:23:40'
20190327:23:18:01:007122 gpexpand:gw_mdw1:gpadmin-[INFO]:-Querying gpexpand schema for current expansion state

System Expansion is used to add segments to an existing GPDB array.
gpexpand did not detect a System Expansion that is in progress.

Before initiating a System Expansion, you need to provision and burn-in
the new hardware.  Please be sure to run gpcheckperf/gpcheckos to make
sure the new hardware is working properly.

Please refer to the Admin Guide for more information.

Would you like to initiate a new System Expansion Yy|Nn (default=N):
> y            <------确認添加

How many new primary segments per host do you want to add? (default=0):
> 1            <------每台機器上添加1個計算節點
Enter new primary data directory 1:
> /data/primary    <------增加的計算節點存放的目錄

Generating configuration file...

20190327:23:19:03:007122 gpexpand:gw_mdw1:gpadmin-[INFO]:-Generating input file...

Input configuration files were written to 'gpexpand_inputfile_20190327_231903' and 'None'.
Please review the file and make sure that it is correct then re-run
with: gpexpand -i gpexpand_inputfile_20190327_231903 -D test
                
20190327:23:19:03:007122 gpexpand:gw_mdw1:gpadmin-[INFO]:-Exiting...           

參數檔案的内容如下:

[gpadmin@gw_mdw1 ~]$ cat gpexpand_inputfile_20190327_231903 
gw_sdw1:gw_sdw1:40004:/data/primary/gpseg8:10:8:p
gw_sdw2:gw_sdw2:40004:/data/primary/gpseg9:11:9:p           

如果存放的目錄有所改變,可以手動去修改此檔案,将該計算節點存放在自己想要放在的位置

利用參數檔案執行拓展指令

[gpadmin@gw_mdw1 ~]$ gpexpand -i gpexpand_inputfile_20190327_231903 -D test
20190327:23:20:19:007227 gpexpand:gw_mdw1:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.1.0 build 6'
20190327:23:20:19:007227 gpexpand:gw_mdw1:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.1.0 build 6) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Jun 11 2014 17:23:40'
.......(省略資訊)
20190327:23:22:19:007227 gpexpand:gw_mdw1:gpadmin-[INFO]:-************************************************
20190327:23:22:19:007227 gpexpand:gw_mdw1:gpadmin-[INFO]:-Initialization of the system expansion complete.
20190327:23:22:19:007227 gpexpand:gw_mdw1:gpadmin-[INFO]:-To begin table expansion onto the new segments
20190327:23:22:19:007227 gpexpand:gw_mdw1:gpadmin-[INFO]:-rerun gpexpand
20190327:23:22:19:007227 gpexpand:gw_mdw1:gpadmin-[INFO]:-************************************************
20190327:23:22:19:007227 gpexpand:gw_mdw1:gpadmin-[INFO]:-Exiting...           

檢視新添加的狀态

此時去資料庫裡檢視相關節點資訊,可以看到已經增加了兩個節點了。

postgres=# SELECT * from gp_segment_configuration ;
 dbid | content | role | preferred_role | mode | status | port  | hostname | address | replication_port | san_mounts 
------+---------+------+----------------+------+--------+-------+----------+---------+------------------+------------
    1 |      -1 | p    | p              | s    | u      |  5432 | gw_mdw1  | gw_mdw1 |                  | 
    2 |       0 | p    | p              | s    | u      | 40000 | gw_sdw1  | gw_sdw1 |                  | 
    6 |       4 | p    | p              | s    | u      | 40000 | gw_sdw2  | gw_sdw2 |                  | 
    3 |       1 | p    | p              | s    | u      | 40001 | gw_sdw1  | gw_sdw1 |                  | 
    7 |       5 | p    | p              | s    | u      | 40001 | gw_sdw2  | gw_sdw2 |                  | 
    4 |       2 | p    | p              | s    | u      | 40002 | gw_sdw1  | gw_sdw1 |                  | 
    8 |       6 | p    | p              | s    | u      | 40002 | gw_sdw2  | gw_sdw2 |                  | 
    5 |       3 | p    | p              | s    | u      | 40003 | gw_sdw1  | gw_sdw1 |                  | 
    9 |       7 | p    | p              | s    | u      | 40003 | gw_sdw2  | gw_sdw2 |                  | 
   10 |       8 | p    | p              | s    | u      | 40004 | gw_sdw1  | gw_sdw1 |                  | 
   11 |       9 | p    | p              | s    | u      | 40004 | gw_sdw2  | gw_sdw2 |                  | 
(11 rows)           

資料重分布

執行重分布指令将資料重分布

[gpadmin@gw_mdw1 ~]$ gpexpand -a -d 1:00:00 -D test            

此時還沒有完成,去指定的資料庫下面檢視重分布狀态

test=# select * from gpexpand.status;
   status   |          updated           
------------+----------------------------
 SETUP      | 2019-03-27 23:21:49.299565
 SETUP DONE | 2019-03-27 23:21:53.737479
(2 rows)           

還有多少個表需要重分布,這裡我的已經同步完成,因為是測試環境沒有什麼資料量

test=# select * from gpexpand.expansion_progress ;
             name             | value 
------------------------------+-------
 Estimated Expansion Rate     | 
 Estimated Time to Completion | 
(2 rows)           

完成重分布之後,将schema删除

[gpadmin@gw_mdw1 ~]$ gpexpand -c -D test           

橫向拓展

這次我們不僅在每台機器上添加1個計算節點,還添加一個資料節點(也就是一個新機器)

生成配置檔案

這次我們在主機的名單中再增加一個新的機器gw_sdw3,此機器是新添加進來的一個節點。隻是安裝了軟體,沒有任何資料。

[gpadmin@gw_mdw1 ~]$ cat seg_hosts 
gw_sdw1
gw_sdw2
gw_sdw3           

使用同樣的方法執行,每一節點也選擇增加1個計算節點

[gpadmin@gw_mdw1 ~]$ gpexpand -f seg_hosts -D test           

生成檔案如下:

可以看到,雖然隻是添加一個,greenplum檢視到gw_sdw3上并沒有一個節點,就自動為其一次添加6個,是它與其他兩個機器數量一緻

[gpadmin@gw_mdw1 ~]$ cat gpexpand_inputfile_20190328_014748 
gw_sdw3:gw_sdw3:40000:/data/primary/gpseg10:12:10:p
gw_sdw3:gw_sdw3:40001:/data/primary/gpseg11:13:11:p
gw_sdw3:gw_sdw3:40002:/data/primary/gpseg12:14:12:p
gw_sdw3:gw_sdw3:40003:/data/primary/gpseg13:15:13:p
gw_sdw3:gw_sdw3:40004:/data/primary/gpseg14:16:14:p
gw_sdw1:gw_sdw1:40005:/data/primary/gpseg15:17:15:p
gw_sdw2:gw_sdw2:40005:/data/primary/gpseg16:18:16:p
gw_sdw3:gw_sdw3:40005:/data/primary/gpseg17:19:17:p           

執行擴容指令

執行擴容指令,和上面一樣

[gpadmin@gw_mdw1 ~]$ gpexpand -i gpexpand_inputfile_20190328_014748 -D test
20190328:23:50:44:001498 gpexpand:gw_mdw1:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.1.0 build 6'
20190328:23:50:44:001498 gpexpand:gw_mdw1:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.1.0 build 6) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Jun 11 2014 17:23:40'
....(忽略資訊)
20190328:23:54:59:001498 gpexpand:gw_mdw1:gpadmin-[INFO]:-Starting Greenplum Database
20190328:23:55:32:001498 gpexpand:gw_mdw1:gpadmin-[INFO]:-************************************************
20190328:23:55:32:001498 gpexpand:gw_mdw1:gpadmin-[INFO]:-Initialization of the system expansion complete.
20190328:23:55:32:001498 gpexpand:gw_mdw1:gpadmin-[INFO]:-To begin table expansion onto the new segments
20190328:23:55:32:001498 gpexpand:gw_mdw1:gpadmin-[INFO]:-rerun gpexpand
20190328:23:55:32:001498 gpexpand:gw_mdw1:gpadmin-[INFO]:-************************************************
20190328:23:55:32:001498 gpexpand:gw_mdw1:gpadmin-[INFO]:-Exiting...           

檢視資料庫的節點狀态,發現已經增加了8個節點,其中2個是縱向添加,6個是橫向添加

postgres=# SELECT * from gp_segment_configuration ;
 dbid | content | role | preferred_role | mode | status | port  | hostname | address | replication_port | san_mounts 
------+---------+------+----------------+------+--------+-------+----------+---------+------------------+------------
    1 |      -1 | p    | p              | s    | u      |  5432 | gw_mdw1  | gw_mdw1 |                  | 
    2 |       0 | p    | p              | s    | u      | 40000 | gw_sdw1  | gw_sdw1 |                  | 
    6 |       4 | p    | p              | s    | u      | 40000 | gw_sdw2  | gw_sdw2 |                  | 
    3 |       1 | p    | p              | s    | u      | 40001 | gw_sdw1  | gw_sdw1 |                  | 
    7 |       5 | p    | p              | s    | u      | 40001 | gw_sdw2  | gw_sdw2 |                  | 
    4 |       2 | p    | p              | s    | u      | 40002 | gw_sdw1  | gw_sdw1 |                  | 
    8 |       6 | p    | p              | s    | u      | 40002 | gw_sdw2  | gw_sdw2 |                  | 
    5 |       3 | p    | p              | s    | u      | 40003 | gw_sdw1  | gw_sdw1 |                  | 
    9 |       7 | p    | p              | s    | u      | 40003 | gw_sdw2  | gw_sdw2 |                  | 
   10 |       8 | p    | p              | s    | u      | 40004 | gw_sdw1  | gw_sdw1 |                  | 
   12 |      10 | p    | p              | s    | u      | 40000 | gw_sdw3  | gw_sdw3 |                  | 
   11 |       9 | p    | p              | s    | u      | 40004 | gw_sdw2  | gw_sdw2 |                  | 
   13 |      11 | p    | p              | s    | u      | 40001 | gw_sdw3  | gw_sdw3 |                  | 
   14 |      12 | p    | p              | s    | u      | 40002 | gw_sdw3  | gw_sdw3 |                  | 
   15 |      13 | p    | p              | s    | u      | 40003 | gw_sdw3  | gw_sdw3 |                  | 
   16 |      14 | p    | p              | s    | u      | 40004 | gw_sdw3  | gw_sdw3 |                  | 
   17 |      15 | p    | p              | s    | u      | 40005 | gw_sdw1  | gw_sdw1 |                  | 
   18 |      16 | p    | p              | s    | u      | 40005 | gw_sdw2  | gw_sdw2 |                  | 
   19 |      17 | p    | p              | s    | u      | 40005 | gw_sdw3  | gw_sdw3 |                  | 
(19 rows)
           

重分布表

執行表的重分布指令

[gpadmin@gw_mdw1 ~]$ gpexpand -a -d 1:00:00 -D test 
20190329:00:00:56:002723 gpexpand:gw_mdw1:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.1.0 build 6'
20190329:00:00:56:002723 gpexpand:gw_mdw1:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.1.0 build 6) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Jun 11 2014 17:23:40'
20190329:00:00:56:002723 gpexpand:gw_mdw1:gpadmin-[INFO]:-Querying gpexpand schema for current expansion state
20190329:00:01:03:002723 gpexpand:gw_mdw1:gpadmin-[INFO]:-EXPANSION COMPLETED SUCCESSFULLY
20190329:00:01:03:002723 gpexpand:gw_mdw1:gpadmin-[INFO]:-Exiting...
           

去指定資料庫下檢視,重分布狀态

test=# select * from gpexpand.status;
       status       |          updated           
--------------------+----------------------------
 SETUP              | 2019-03-28 23:54:39.325544
 SETUP DONE         | 2019-03-28 23:54:48.213527
 EXPANSION STARTED  | 2019-03-29 00:00:58.467156
 EXPANSION COMPLETE | 2019-03-29 00:00:58.979106
(4 rows)           
[gpadmin@gw_mdw1 ~]$ gpexpand -c -D test           

至此,兩種方式的擴容測試都示範完畢