I. Submitting a job
1. Command
./bin/flink run [options] <job-jar> <arguments>
Run flink run --help to see the full list of options.
2. Examples
2.1 Without arguments:
./bin/flink run -c com.yclxiao.cdc.Cdc2DorisSQLDemoWithCheckpoint ./flinkdemo-1.0-SNAPSHOT.jar
2.2 With arguments:
Each token that starts with - is a parameter key, and the token after it is the value.
./bin/flink run -c com.yclxiao.cdc.Cdc2DorisSQLDemoWithCheckpoint ./flinkdemo-1.0-SNAPSHOT.jar -name ycl -age 11
To parse them, use the utility class that ships with Flink:
ParameterTool params = ParameterTool.fromArgs(args);
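To make the key/value convention concrete, here is a minimal stand-alone sketch of how "-key value" pairs map to a lookup table. It is not Flink's implementation: ArgPairsSketch and parse are hypothetical names, and the real ParameterTool handles more cases (for example "--key" prefixes and keys without a value). In the actual job you would read the values from params, for example with params.get("name").

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified illustration only, not Flink's ParameterTool.
class ArgPairsSketch {

    // Turns {"-name", "ycl", "-age", "11"} into {name=ycl, age=11}.
    // A trailing key without a value is ignored in this sketch.
    static Map<String, String> parse(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        for (int i = 0; i + 1 < args.length; i += 2) {
            String key = args[i];
            if (!key.startsWith("-")) {
                throw new IllegalArgumentException("Expected a key starting with '-': " + key);
            }
            params.put(key.substring(1), args[i + 1]);
        }
        return params;
    }

    public static void main(String[] args) {
        Map<String, String> params = parse(new String[] {"-name", "ycl", "-age", "11"});
        String name = params.get("name");
        int age = Integer.parseInt(params.getOrDefault("age", "0"));
        System.out.println(name + " " + age); // prints "ycl 11"
    }
}
```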
2.3 Submitting from a checkpoint
Add the following option:
-s /Users/yclxiao/Project/bigdata/flinkdemo/checkpoints/143ac6febfce4274d24bdff6ec83d1c8/chk-170
Full command:
./bin/flink run -c com.yclxiao.cdc.Cdc2DorisSQLDemoWithCheckpoint -s /Users/yclxiao/Project/bigdata/flinkdemo/checkpoints/143ac6febfce4274d24bdff6ec83d1c8/chk-170 ./flinkdemo-1.0-SNAPSHOT.jar -name ycl -age 11
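The path passed to -s points at one specific chk-<n> directory. Finding the newest one by hand is error-prone because chk-9 sorts after chk-170 alphabetically. The sketch below picks the directory with the highest number. LatestCheckpointSketch is a hypothetical helper, and it assumes the usual layout in which a completed checkpoint directory contains a _metadata file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical helper, not part of Flink.
class LatestCheckpointSketch {

    // Returns the chk-<n> directory with the highest n that contains a _metadata file.
    static Optional<Path> latest(Path jobCheckpointDir) throws IOException {
        try (Stream<Path> children = Files.list(jobCheckpointDir)) {
            return children
                    .filter(Files::isDirectory)
                    .filter(p -> p.getFileName().toString().matches("chk-\\d+"))
                    // Assumption: only completed checkpoints have a _metadata file.
                    .filter(p -> Files.isRegularFile(p.resolve("_metadata")))
                    // Compare numerically so that chk-170 beats chk-9.
                    .max(Comparator.comparingLong(
                            (Path p) -> Long.parseLong(p.getFileName().toString().substring(4))));
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0]: the per-job checkpoint directory (the one named after the job ID).
        Path dir = Paths.get(args[0]);
        System.out.println(latest(dir).map(Path::toString).orElse("no completed checkpoint found"));
    }
}
```

The printed path can then be used as the value of -s.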
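II. Problems encountered when submitting jobs
A summary of the problems comes first, followed by a detailed walkthrough.
1. Summary
The problems were:
- Insufficient resources. Fix: adjust the cluster configuration file.
- SPI files under META-INF missing from the packaged jar. Fix: add a Maven plugin to pom.xml.
- Dependency configuration in pom.xml: in Flink SQL jobs, bundled dependencies can clash with jars already in the cluster's lib directory. Fix: do not bundle into the job jar any dependency that already exists in the Flink cluster's lib directory.
- Issues specific to public cloud environments.
2. Details
2.1 Insufficient resources
Error output: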
2023-06-19 15:42:24,452 WARN org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager [] - Could not fulfill resource requirements of job 032dc3ac91b6128aaedae625b36e0575. Free slots: 0
2023-06-19 15:42:24,452 WARN org.apache.flink.runtime.jobmaster.slotpool.DeclarativeSlotPoolBridge [] - Could not acquire the minimum required resources, failing slot requests. Acquired: [ResourceRequirement{resourceProfile=ResourceProfile{cpuCores=1, taskHeapMemory=2.283gb (2451151214 bytes), taskOffHeapMemory=0 bytes, managedMemory=2.026gb (2175669399 bytes), networkMemory=518.720mb (543917349 bytes)}, numberOfRequiredSlots=1}]. Current slot pool status: Registered TMs: 1, registered slots: 1 free slots: 0
2023-06-19 15:42:24,454 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: acct_profit[1] -> (Calc[2], Calc[9]) (1/1) (2ff961f809129e63bb6b9b164dd56ca4) switched from SCHEDULED to FAILED on [unassigned resource].
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Could not acquire the minimum required resources.
2023-06-19 15:42:23,440 WARN org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager [] - Could not fulfill resource requirements of job 032dc3ac91b6128aaedae625b36e0575. Free slots: 0
2023-06-19 15:42:23,440 WARN org.apache.flink.runtime.jobmaster.slotpool.DeclarativeSlotPoolBridge [] - Could not acquire the minimum required resources, failing slot requests. Acquired: [ResourceRequirement{resourceProfile=ResourceProfile{cpuCores=1, taskHeapMemory=2.283gb (2451151214 bytes), taskOffHeapMemory=0 bytes, managedMemory=2.026gb (2175669399 bytes), networkMemory=518.720mb (543917349 bytes)}, numberOfRequiredSlots=1}]. Current slot pool status: Registered TMs: 1, registered slots: 1 free slots: 0
2023-06-19 15:42:23,441 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: acct_profit[1] -> (Calc[2], Calc[9]) (1/1) (f35db0e3c29aef8507f6d6f7d19e4e90) switched from SCHEDULED to FAILED on [unassigned resource].
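The memory may be set too small, the parallelism may be misconfigured, or there may be too few slots (the log above shows one registered slot and zero free slots). Reference settings in flink-conf.yaml: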
jobmanager.memory.process.size: 2600m
taskmanager.memory.process.size: 2728m
taskmanager.memory.flink.size: 2280m
taskmanager.numberOfTaskSlots: 10
parallelism.default: 4
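The slot arithmetic behind this error can be checked in a few lines. With Flink's default slot sharing, a job needs as many free slots as its highest operator parallelism. The sketch below is only an illustration: SlotBudgetSketch and its methods are hypothetical names, and the single TaskManager in the second call is an assumption, not something the configuration above states.

```java
// Hypothetical back-of-the-envelope check, not a Flink API.
class SlotBudgetSketch {

    // Total slots offered by the cluster.
    static int totalSlots(int taskManagers, int slotsPerTaskManager) {
        return taskManagers * slotsPerTaskManager;
    }

    // Assumes default slot sharing: the job needs one free slot per unit of parallelism.
    static boolean fits(int totalSlots, int slotsInUse, int jobParallelism) {
        return totalSlots - slotsInUse >= jobParallelism;
    }

    public static void main(String[] args) {
        // Situation in the log: 1 TaskManager, 1 slot, already occupied, 1 slot required.
        System.out.println(fits(totalSlots(1, 1), 1, 1)); // prints "false"
        // Reference settings: 10 slots per TaskManager, parallelism 4 (1 TaskManager assumed).
        System.out.println(fits(totalSlots(1, 10), 0, 4)); // prints "true"
    }
}
```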
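2.2 mysql-cdc not found
The cause is that the META-INF contents of the dependencies were not merged when the jar was built, so the connector's SPI service files were lost. Add the following configuration to pom.xml: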
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>3.2.4</version>
    <configuration>
        <filters>
            <filter>
                <artifact>*:*</artifact>
                <excludes>
                    <exclude>META-INF/*.SF</exclude>
                    <exclude>META-INF/*.DSA</exclude>
                    <exclude>META-INF/*.RSA</exclude>
                </excludes>
            </filter>
        </filters>
    </configuration>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <transformers>
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                        <mainClass>com.bm001.datacompute.cdc.api.CloudAcctProfit2DwsHdjProfitRecordAPI</mainClass>
                    </transformer>
                </transformers>
            </configuration>
        </execution>
    </executions>
</plugin>
For more background, see:
https://wii.pub/2021/08/23/tools/maven/problems/merge-meta-info/
https://blog.csdn.net/RL_LEEE/article/details/128134800
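One way to confirm that the service files survived packaging is to read them back out of the built jar. The sketch below lists the implementations registered under a META-INF/services entry. ServiceEntriesSketch is a hypothetical helper. It assumes the connector is discovered through the org.apache.flink.table.factories.Factory service file, which is how Flink table connectors are normally registered, and comment handling is simplified to whole-line # comments.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;
import java.util.zip.ZipEntry;

// Hypothetical diagnostic helper, not part of Flink or Maven.
class ServiceEntriesSketch {

    // Lists the implementation classes registered for serviceInterface inside the jar.
    static List<String> read(Path jar, String serviceInterface) throws IOException {
        List<String> entries = new ArrayList<>();
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            ZipEntry entry = jarFile.getEntry("META-INF/services/" + serviceInterface);
            if (entry == null) {
                return entries; // the jar has no service file for this interface
            }
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(jarFile.getInputStream(entry), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    line = line.trim();
                    // Skip blank lines and whole-line comments.
                    if (!line.isEmpty() && !line.startsWith("#")) {
                        entries.add(line);
                    }
                }
            }
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the shaded job jar, e.g. ./flinkdemo-1.0-SNAPSHOT.jar
        for (String impl : read(Paths.get(args[0]), "org.apache.flink.table.factories.Factory")) {
            System.out.println(impl);
        }
    }
}
```

If the output does not include a factory from the mysql-cdc connector, the service files were probably overwritten rather than merged during packaging.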
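2.3 Duplicate jars
Some jars are needed for local development and local runs but must not be bundled when the job runs on the cluster, because the cluster's lib directory already contains them. If they are bundled anyway, an error like the following is reported: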
Caused by: org.apache.flink.table.api.ValidationException: Multiple factories for identifier 'default' that implement 'org.apache.flink.table.delegation.ExecutorFactory' found in the classpath.
Fix: set the dependency's scope to provided in pom.xml before packaging:
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-runtime</artifactId>
    <version>${flink.version}</version>
    <scope>provided</scope>
</dependency>
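To find out which dependencies should be marked provided, it can help to compare the job jar against a jar from the cluster's lib directory and look for classes present in both. The following is a minimal sketch of that comparison. JarOverlapSketch is a hypothetical helper, and matching by .class entry name is an assumption that ignores version differences.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Enumeration;
import java.util.Set;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical diagnostic helper, not part of Flink.
class JarOverlapSketch {

    // Collects the names of all .class entries in a jar.
    static Set<String> classEntries(Path jar) throws IOException {
        Set<String> names = new TreeSet<>();
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.endsWith(".class")) {
                    names.add(name);
                }
            }
        }
        return names;
    }

    // Returns the class entries that appear in both jars.
    static Set<String> overlap(Path jobJar, Path libJar) throws IOException {
        Set<String> shared = classEntries(jobJar);
        shared.retainAll(classEntries(libJar));
        return shared;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: job jar, args[1]: one jar from the cluster's lib directory.
        Set<String> shared = overlap(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println(shared.size() + " classes are present in both jars");
        shared.stream().limit(10).forEach(System.out::println);
    }
}
```

A non-empty result suggests that the dependency providing those classes is a candidate for the provided scope.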
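3. Problems when submitting on a public cloud
The job ran fine locally and on the test server, but problems may appear on an ECS machine in a public cloud.
3.1 Invalid argument 0.0.0.0:8081
The error reported 0.0.0.0 as an invalid argument. The likely cause is that netty's access to 0.0.0.0 was restricted, probably by the cloud provider itself, since the test environment did not have this problem. Configuring the machine's own IP address fixed it: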
rest.address: xx.xx.xx.xx
rest.bind-address: xx.xx.xx.xx
3.2 Change the temporary file directory, otherwise it will consume space on the system disk
io.tmp.dirs: /data/software/flink-15.4/tmp
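3.3 Insufficient privileges for the cloud database user
Change the user's privileges in the cloud console. The error looks like this: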
Caused by: java.sql.SQLSyntaxErrorException: Access denied; you need (at least one of) the RELOAD privilege(s) for this operation
Original article: http://www.mangod.top/articles/2023/06/25/1687664663136.html, https://mp.weixin.qq.com/s?__biz=MzI3OTA2MDQyOQ==&mid=2247483777&idx=1&sn=2986cc9a37247bd04054c8c786e2b3e1&chksm=eb4ccb23dc3b4235ca91dec21d7a7e568e467ba80736912b1cb820e49b201a1b722adb7e11d2&token=672230179&lang=zh_CN#rd