天天看點

DataWorks資料內建任務切分鍵妙用一、資料內建任務切分鍵探索二、切分鍵使用性能測試三、三種方案對比

一、資料內建任務切分鍵探索

對于資料內建任務,這些任務的時間消耗一般都主要花費在資料同步上,當查詢表資料量較大時,其SQL本身在資料庫中查詢就是很慢的,那麼對于這種情況有說明好的優化方法呢?

DataWorks資料內建任務切分鍵妙用一、資料內建任務切分鍵探索二、切分鍵使用性能測試三、三種方案對比

資料內建任務上提供了一個切分鍵的設定,那麼該切分鍵是否可以對源庫SQL查詢有一定的提升,進而提高資料同步任務的整體效率呢?

切分鍵:可以将源資料表中某一列作為切分鍵

建議使用主鍵或有索引的列作為切分鍵

1、如何探究任務究竟怎麼入庫查詢拉取資料呢?

這裡主要講案例中使用到的MySQL資料庫時可以通過什麼方案探究任務如何入庫查詢,我們可以打開MySQL的general_log,general log記錄連接配接到資料庫的所有操作。值得注意的一點是,開啟開操作對資料庫性能有極大影響,是以一般情況下我們僅僅會在分析問題的時候才會在自己測試環境開啟該日志。

general_log參數預設關閉,若我們需要開啟,可在資料庫中動态設定。general_log_file指定路徑即為日志路徑,可在作業系統中tail -100f ${general_log_file}實時檢視日志記錄情況。

mysql> show variables like '%general%';
+------------------+--------------------------------------------+
| Variable_name    | Value                                      |
+------------------+--------------------------------------------+
| general_log      | OFF                                        |
| general_log_file | /var/lib/mysql/izt4niarmwaooao2egw7wiz.log |
+------------------+--------------------------------------------+
2 rows in set (0.01 sec)

mysql> set global general_log=on;
Query OK, 0 rows affected (0.11 sec)

mysql> show variables like '%general%';
+------------------+--------------------------------------------+
| Variable_name    | Value                                      |
+------------------+--------------------------------------------+
| general_log      | ON                                         |
| general_log_file | /var/lib/mysql/izt4niarmwaooao2egw7wiz.log |
+------------------+--------------------------------------------+
2 rows in set (0.01 sec)           

2、不設定切分鍵時,SQL如何執行?

直接使用元SQL入庫查詢

select xxx,xxx from `shardkey`           

3、設定切分鍵時,SQL如何執行?

1、查詢切分鍵的最大最小值範圍

SELECT MIN(id),MAX(id) FROM `shardkey`           

2、根據切分鍵範圍進行切分,範圍割接左閉右開,max值時左閉右閉,除範圍查詢外增加切分鍵is null的查詢,避免遺漏資料

1)根據min/max進行範圍切分
select xxx,xxx from `shardkey`  where  (1 <= id AND id < 19923)
select xxx,xxx from `shardkey`  where  (19923 <= id AND id < 39845)
...
...
select xxx,xxx from `shardkey`  where  (179299 <= id AND id <= 199221)
2)查詢切分鍵 is null的情況
select xxx,xxx from `shardkey`  where  id IS NULL           

3、按照最大并發數進行并發查詢(實際并發數<=任務最大期望并發數)

在任務執行時可通過在資料庫執行show processlist進行監控檢視

當任務最大期望并發數為2時:

DataWorks資料內建任務切分鍵妙用一、資料內建任務切分鍵探索二、切分鍵使用性能測試三、三種方案對比

當任務最大期望并發數為4時:

DataWorks資料內建任務切分鍵妙用一、資料內建任務切分鍵探索二、切分鍵使用性能測試三、三種方案對比

4、切分鍵使用注意

1、推薦使用主鍵作為切分鍵,因為為表主鍵通常情況下比較均勻,是以切分出來的分片也不容易出現資料熱點

2、目前splitPk僅支援整型資料切分,不支援字元串、浮點、日期等其他類型。

二、切分鍵使用性能測試

資料源資訊設定

資料源使用rds for mysql,配置資訊如下:

DataWorks資料內建任務切分鍵妙用一、資料內建任務切分鍵探索二、切分鍵使用性能測試三、三種方案對比

資料源接入,使用阿裡雲執行個體模式接入

DataWorks資料內建任務切分鍵妙用一、資料內建任務切分鍵探索二、切分鍵使用性能測試三、三種方案對比

測試表為shardkey,并向表内插入11407872行記錄。

CREATE TABLE `shardkey` (
  `id` int(40) NOT NULL AUTO_INCREMENT,
  `ref_data_id` int(40) NOT NULL,
  `ref_meta_id` int(40) NOT NULL,
  `ref_attribute_id` int(40) NOT NULL,
  `value` text NOT NULL,
  `creator` varchar(40) DEFAULT NULL,
  `create_time` datetime DEFAULT NULL,
  `update_time` datetime DEFAULT NULL,
  `tenant` varchar(40) DEFAULT NULL,
  `model` varchar(40) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=11602312 DEFAULT CHARSET=utf8 ROW_FORMAT=DYNAMIC

mysql>select count(*) from shardkey;
+--------------------+
| count(*)           |
+--------------------+
| 11407872           |
+--------------------+
傳回行數:[1],耗時:15684 ms.           

方案一:不設定切分鍵,預設最大并發數為2

資料內建任務配置如下:

Reader: mysql
                           column=[["id","ref_data_id","ref_meta_id","ref_attribute_id","value","creator","create_time","update_time","tenant","model"]]
                       connection=[[{"datasource":"rds_for_mysql","table":["`shardkey`"]}]]
Writer: odps
                        partition=[ds=20200203                   ]
                         truncate=[true                          ]
                       datasource=[odps_first                    ]
                           column=[["id","ref_data_id","ref_meta_id","ref_attribute_id","value","creator","create_time","update_time","tenant","model"]]
                      emptyAsNull=[false                         ]
                            table=[shardkey_odps                 ]
Setting:
                       errorLimit=[{"record":""}                 ]
                            speed=[{"concurrent":2,"throttle":false}]           

将資料內建任務儲存并送出到運維中心,對周期任務進行測試,日志分析如下:

=== total summarize info ===
   1. all phase average time info and max time task info:
PHASE                |  AVERAGE USED TIME |       ALL TASK NUM |      MAX USED TIME |        MAX TASK ID | MAX TASK INFO
TASK_TOTAL           |            87.066s |                  1 |            87.066s |              0-0-0 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
READ_TASK_INIT       |             0.006s |                  1 |             0.006s |              0-0-0 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
READ_TASK_PREPARE    |             0.000s |                  1 |             0.000s |              0-0-0 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
READ_TASK_DATA       |            84.567s |                  1 |            84.567s |              0-0-0 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
READ_TASK_POST       |             0.000s |                  1 |             0.000s |              0-0-0 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
READ_TASK_DESTROY    |             0.000s |                  1 |             0.000s |              0-0-0 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
WRITE_TASK_INIT      |             0.001s |                  1 |             0.001s |              0-0-0 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
WRITE_TASK_PREPARE   |             0.233s |                  1 |             0.233s |              0-0-0 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
WRITE_TASK_DATA      |            86.304s |                  1 |            86.304s |              0-0-0 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
WRITE_TASK_POST      |             0.285s |                  1 |             0.285s |              0-0-0 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
WRITE_TASK_DESTROY   |             0.000s |                  1 |             0.000s |              0-0-0 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
SQL_QUERY            |             0.055s |                  1 |             0.055s |              0-0-0 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
RESULT_NEXT_ALL      |             5.124s |                  1 |             5.124s |              0-0-0 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
ODPS_BLOCK_CLOSE     |            42.770s |                  1 |            42.770s |              0-0-0 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
WAIT_READ_TIME       |             2.124s |                  1 |             2.124s |              0-0-0 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
WAIT_WRITE_TIME      |            48.318s |                  1 |            48.318s |              0-0-0 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
 2. record average count and max count task info :
PHASE                |    AVERAGE RECORDS |      AVERAGE BYTES |        MAX RECORDS | MAX RECORD`S BYTES |        MAX TASK ID | MAX TASK INFO
READ_TASK_DATA       |           11407872 |            786.62M |           11407872 |            786.62M |              0-0-0 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
2020-02-04 16:18:43.464 [job-128859081] INFO  MetricReportUtil - reportJobMetric is turn off
2020-02-04 16:18:43.464 [job-128859081] INFO  LocalJobContainerCommunicator - Total 11407872 records, 786618535 bytes | Speed 8.34MB/s, 126754 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 48.318s |  All Task WaitReaderTime 2.124s | Percentage 100.00%
2020-02-04 16:18:43.465 [job-128859081] INFO  LogReportUtil - report datax log is turn off
2020-02-04 16:18:43.465 [job-128859081] INFO  JobContainer -
任務啟動時刻                    : 2020-02-04 16:17:09
任務結束時刻                    : 2020-02-04 16:18:43
任務總計耗時                    :                 93s
任務平均流量                    :            8.34MB/s
記錄寫入速度                    :         126754rec/s
讀出記錄總數                    :            11407872
讀寫失敗總數                    :                   0
2020-02-04 16:18:43 INFO =================================================================
2020-02-04 16:18:43 INFO Exit code of the Shell command 0
2020-02-04 16:18:43 INFO --- Invocation of Shell command completed ---
2020-02-04 16:18:43 INFO Shell run successfully!
2020-02-04 16:18:43 INFO Current task status: FINISH
2020-02-04 16:18:43 INFO Cost time is: 95.217s           

方案二:設定主鍵為切分鍵,預設最大并發數為2

Reader: mysql
                           column=[["id","ref_data_id","ref_meta_id","ref_attribute_id","value","creator","create_time","update_time","tenant","model"]]
                       connection=[[{"datasource":"rds_for_mysql","table":["`shardkey`"]}]]
                          splitPk=[id                            ]
Writer: odps
                        partition=[ds=20200203                   ]
                         truncate=[true                          ]
                       datasource=[odps_first                    ]
                           column=[["id","ref_data_id","ref_meta_id","ref_attribute_id","value","creator","create_time","update_time","tenant","model"]]
                      emptyAsNull=[false                         ]
                            table=[shardkey_odps                 ]
Setting:
                       errorLimit=[{"record":""}                 ]
                            speed=[{"concurrent":2,"throttle":false}]           

日志分析如下:

=== total summarize info ===
   1. all phase average time info and max time task info:
PHASE                |  AVERAGE USED TIME |       ALL TASK NUM |      MAX USED TIME |        MAX TASK ID | MAX TASK INFO
TASK_TOTAL           |             9.649s |                 11 |            12.512s |              0-0-9 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
READ_TASK_INIT       |             0.002s |                 11 |             0.009s |              0-0-8 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
READ_TASK_PREPARE    |             0.000s |                 11 |             0.002s |              0-0-8 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
READ_TASK_DATA       |             7.704s |                 11 |            10.269s |              0-0-9 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
READ_TASK_POST       |             0.000s |                 11 |             0.000s |              0-0-4 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
READ_TASK_DESTROY    |             0.000s |                 11 |             0.000s |              0-0-5 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
WRITE_TASK_INIT      |             0.001s |                 11 |             0.002s |              0-0-6 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
WRITE_TASK_PREPARE   |             0.154s |                 11 |             0.248s |              0-0-0 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
WRITE_TASK_DATA      |             8.894s |                 11 |            11.674s |              0-0-9 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
WRITE_TASK_POST      |             0.346s |                 11 |             0.612s |              0-0-1 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
WRITE_TASK_DESTROY   |             0.000s |                 11 |             0.000s |              0-0-5 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
SQL_QUERY            |             0.062s |                 11 |             0.089s |              0-0-0 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
RESULT_NEXT_ALL      |             0.626s |                 11 |             0.823s |              0-0-4 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
ODPS_BLOCK_CLOSE     |             4.172s |                 11 |             6.661s |              0-0-9 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
WAIT_READ_TIME       |             0.508s |                 11 |             0.891s |              0-0-4 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
WAIT_WRITE_TIME      |             3.549s |                 11 |             5.992s |              0-0-9 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
 2. record average count and max count task info :
PHASE                |    AVERAGE RECORDS |      AVERAGE BYTES |        MAX RECORDS | MAX RECORD`S BYTES |        MAX TASK ID | MAX TASK INFO
READ_TASK_DATA       |            1037079 |             71.51M |            1153917 |             80.41M |              0-0-9 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
2020-02-04 16:20:54.876 [job-128859276] INFO  MetricReportUtil - reportJobMetric is turn off
2020-02-04 16:20:54.876 [job-128859276] INFO  LocalJobContainerCommunicator - Total 11407872 records, 786618535 bytes | Speed 12.50MB/s, 190131 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 39.038s |  All Task WaitReaderTime 5.589s | Percentage 100.00%
2020-02-04 16:20:54.877 [job-128859276] INFO  LogReportUtil - report datax log is turn off
2020-02-04 16:20:54.877 [job-128859276] INFO  JobContainer -
任務啟動時刻                    : 2020-02-04 16:19:51
任務結束時刻                    : 2020-02-04 16:20:54
任務總計耗時                    :                 63s
任務平均流量                    :           12.50MB/s
記錄寫入速度                    :         190131rec/s
讀出記錄總數                    :            11407872
讀寫失敗總數                    :                   0
2020-02-04 16:20:54 INFO =================================================================
2020-02-04 16:20:54 INFO Exit code of the Shell command 0
2020-02-04 16:20:54 INFO --- Invocation of Shell command completed ---
2020-02-04 16:20:54 INFO Shell run successfully!
2020-02-04 16:20:54 INFO Current task status: FINISH
2020-02-04 16:20:54 INFO Cost time is: 64.674s
/home/admin/alisatasknode/taskinfo//20200204/diide/16/19/49/fr8l5g0pvu449c494iy9g8vn/T3_0413124736.log-END-EOF
2020-02-04 16:20:59 : Detail log url: https://di-cn-hangzhou.data.aliyun.com/web/di/instanceLog?id=128859276&resourceGroup=group_253861156446274&requestId=9157dea3-e3cf-42e9-9ee4-c8bc3858ee51&projectId=6043
Exit with SUCCESS.
2020-02-04 16:20:59 [INFO] Sandbox context cleanup temp file success.
2020-02-04 16:20:59 [INFO] Data synchronization ended with return code: [0].
2020-02-04 16:21:02 INFO =================================================================
2020-02-04 16:21:02 INFO Exit code of the Shell command 0
2020-02-04 16:21:02 INFO --- Invocation of Shell command completed ---
2020-02-04 16:21:02 INFO Shell run successfully!
2020-02-04 16:21:02 INFO Current task status: FINISH
2020-02-04 16:21:02 INFO Cost time is: 75.403s           

方案三:設定主鍵為切分鍵,設定最大并發數為4

Reader: mysql
                           column=[["id","ref_data_id","ref_meta_id","ref_attribute_id","value","creator","create_time","update_time","tenant","model"]]
                       connection=[[{"datasource":"rds_for_mysql","table":["`shardkey`"]}]]
                          splitPk=[id                            ]
Writer: odps
                        partition=[ds=20200203                   ]
                         truncate=[true                          ]
                       datasource=[odps_first                    ]
                           column=[["id","ref_data_id","ref_meta_id","ref_attribute_id","value","creator","create_time","update_time","tenant","model"]]
                      emptyAsNull=[false                         ]
                            table=[shardkey_odps                 ]
Setting:
                       errorLimit=[{"record":""}                 ]
                            speed=[{"concurrent":4,"throttle":false}]           
=== total summarize info ===
   1. all phase average time info and max time task info:
PHASE                |  AVERAGE USED TIME |       ALL TASK NUM |      MAX USED TIME |        MAX TASK ID | MAX TASK INFO
TASK_TOTAL           |             5.676s |                 21 |             6.515s |              0-0-2 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
READ_TASK_INIT       |             0.002s |                 21 |             0.009s |              0-0-2 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
READ_TASK_PREPARE    |             0.000s |                 21 |             0.001s |              0-0-3 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
READ_TASK_DATA       |             2.949s |                 21 |             3.566s |              0-0-9 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
READ_TASK_POST       |             0.000s |                 21 |             0.000s |              0-0-3 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
READ_TASK_DESTROY    |             0.000s |                 21 |             0.001s |              0-0-1 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
WRITE_TASK_INIT      |             0.000s |                 21 |             0.001s |              0-0-0 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
WRITE_TASK_PREPARE   |             0.126s |                 21 |             0.297s |              0-0-1 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
WRITE_TASK_DATA      |             4.928s |                 21 |             5.679s |              0-0-9 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
WRITE_TASK_POST      |             0.369s |                 21 |             0.626s |             0-0-17 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
WRITE_TASK_DESTROY   |             0.000s |                 21 |             0.000s |              0-0-3 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
SQL_QUERY            |             0.071s |                 21 |             0.152s |              0-0-3 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
RESULT_NEXT_ALL      |             0.665s |                 21 |             1.155s |              0-0-9 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
ODPS_BLOCK_CLOSE     |             2.090s |                 21 |             2.715s |             0-0-12 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
WAIT_READ_TIME       |             0.613s |                 21 |             1.105s |              0-0-5 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
WAIT_WRITE_TIME      |             0.375s |                 21 |             0.872s |             0-0-14 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
 2. record average count and max count task info :
PHASE                |    AVERAGE RECORDS |      AVERAGE BYTES |        MAX RECORDS | MAX RECORD`S BYTES |        MAX TASK ID | MAX TASK INFO
READ_TASK_DATA       |             543232 |             37.46M |             576959 |             40.21M |             0-0-19 | `shardkey`,jdbcUrl:[jdbc:mysql://100.100.64.101:39001/sansi_test]
2020-02-04 16:23:11.979 [job-128859728] INFO  MetricReportUtil - reportJobMetric is turn off
2020-02-04 16:23:11.979 [job-128859728] INFO  LocalJobContainerCommunicator - Total 11407872 records, 786618535 bytes | Speed 25.01MB/s, 380262 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 7.881s |  All Task WaitReaderTime 12.881s | Percentage 100.00%
2020-02-04 16:23:11.979 [job-128859728] INFO  LogReportUtil - report datax log is turn off
2020-02-04 16:23:11.979 [job-128859728] INFO  JobContainer -
任務啟動時刻                    : 2020-02-04 16:22:38
任務結束時刻                    : 2020-02-04 16:23:11
任務總計耗時                    :                 32s
任務平均流量                    :           25.01MB/s
記錄寫入速度                    :         380262rec/s
讀出記錄總數                    :            11407872
讀寫失敗總數                    :                   0
2020-02-04 16:23:12 INFO =================================================================
2020-02-04 16:23:12 INFO Exit code of the Shell command 0
2020-02-04 16:23:12 INFO --- Invocation of Shell command completed ---
2020-02-04 16:23:12 INFO Shell run successfully!
2020-02-04 16:23:12 INFO Current task status: FINISH
2020-02-04 16:23:12 INFO Cost time is: 34.068s
/home/admin/alisatasknode/taskinfo//20200204/diide/16/22/34/3bjpuktukrcheai76obp1a5q/T3_0413126136.log-END-EOF
2020-02-04 16:23:14 : Detail log url: https://di-cn-hangzhou.data.aliyun.com/web/di/instanceLog?id=128859728&resourceGroup=group_253861156446274&requestId=0b04ad7b-c3ae-4b3c-a27b-6eaadcd69789&projectId=6043
Exit with SUCCESS.
2020-02-04 16:23:14 [INFO] Sandbox context cleanup temp file success.
2020-02-04 16:23:14 [INFO] Data synchronization ended with return code: [0].
2020-02-04 16:23:14 INFO =================================================================
2020-02-04 16:23:14 INFO Exit code of the Shell command 0
2020-02-04 16:23:14 INFO --- Invocation of Shell command completed ---
2020-02-04 16:23:14 INFO Shell run successfully!
2020-02-04 16:23:14 INFO Current task status: FINISH
2020-02-04 16:23:14 INFO Cost time is: 43.964s           

三、三種方案對比

方案 資料量 資料同步耗時 任務整體耗時 任務平均流量 WaitWriterTime WaitReaderTime
方案一 11407872 93s 95.217s 8.34MB/s 48.318s 2.124s
方案二 63s 64.674s 12.50MB/s 39.038s 5.589s
方案三 32s 34.068s 25.01MB/s 7.881s 12.881s
2851968 43s 44.94s 4.66MB/s 16.035s 0.526s
23s 25.177s 9.32MB/s 2.714s 1.705s
25.435s 8.551s 18.145s

從方案一與方案二的對比來看,說明通過設定切分鍵是可以提高資料同步的執行效率,查詢表資料量越大,提升效果越明顯。使用切分鍵來進行查詢,對資料同步效率提升的同時,對于源庫負載也有一定優化效果。

從方案二于方案上的對比來看,說明提高任務期望最大并發數一定程度上也是有益于資料同步的執行效率,查詢表資料量越大,提升效果越明顯。但是設定任務期望最大并發數時需要考慮表資料量大小以及源庫負載可承受最大并發數。