DolphinScheduler Pseudo-Cluster Deployment

Author: 極目館主

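Use Cases

If you want to try a more complete feature set or run a larger volume of tasks, pseudo-cluster deployment is recommended.

Pseudo-cluster deployment runs the DolphinScheduler services on a single machine. In this mode the master, worker, and API server all run on the same machine.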

Version Requirements

Name              Version
JDK               1.8+
DolphinScheduler  3.1.5
ZooKeeper         3.1.4
MySQL driver

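Prerequisites

Pseudo-distributed deployment of DolphinScheduler requires the following external software:

  • JDK: download JDK (1.8+), install it, set the JAVA_HOME environment variable, and append its bin directory to the PATH environment variable. Skip this step if a JDK is already present in your environment.
  • Binary package: download the DolphinScheduler binary package from the download page.
  • Database: PostgreSQL (8.2.15+) or MySQL (5.7+); either one works. MySQL additionally requires JDBC Driver 8.0.16.
  • Registry center: ZooKeeper (3.4.6+), available from the ZooKeeper download page.
  • Process tree analysis: on macOS, install pstree; on Fedora/Red Hat/CentOS/Ubuntu/Debian, install psmisc.
Note: DolphinScheduler itself does not depend on Hadoop, Hive, or Spark, but if the tasks you run depend on them, the corresponding environment must be available.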
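
Before continuing, it can help to confirm that the required tools are actually on PATH. The following is a minimal sketch, not part of the official installer; the function name missing_commands and the list of tools in the usage comment are assumptions made for illustration.

```shell
#!/bin/sh
# Print every given command that cannot be found on PATH.
# Prints nothing when all commands are available.
missing_commands() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
  done
}

# Example call (the tool list is an assumption; adjust it to your setup):
#   missing_commands java pstree mysql
```

An empty result means every listed tool was found; any name that is printed still needs to be installed.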

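Prepare the DolphinScheduler Startup Environment

Configure Passwordless sudo and Permissions for the User

Create a deployment user and make sure passwordless sudo is configured for it. The example below creates a user named dolphinscheduler.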

# Creating a user requires logging in as root
useradd dolphinscheduler

# Set the password
echo "dolphinscheduler" | passwd --stdin dolphinscheduler

# Configure passwordless sudo
sed -i '$adolphinscheduler  ALL=(ALL)  NOPASSWD: ALL' /etc/sudoers
sed -i 's/Defaults    requiretty/#Defaults    requiretty/g' /etc/sudoers

# Change ownership so that the deployment user can operate on the extracted apache-dolphinscheduler-*-bin directory
chown -R dolphinscheduler:dolphinscheduler apache-dolphinscheduler-*-bin

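Note:

  • The task execution service uses sudo -u {linux-user} to switch between Linux users so that jobs can run under multiple tenants. The deployment user therefore needs sudo privileges, and sudo must not prompt for a password. Beginners who do not yet understand this can ignore it for now.
  • If /etc/sudoers contains the line "Defaults requiretty", comment it out as well.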
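
To confirm that the sudoers change took effect, one option is to look for the rule directly. This is a rough sketch under the assumption that the rule sits in a single file in the one-line form used above; has_nopasswd_rule is a hypothetical helper name.

```shell
#!/bin/sh
# Return 0 if FILE contains an uncommented NOPASSWD rule for USER.
# usage: has_nopasswd_rule FILE USER
has_nopasswd_rule() {
  grep -Eq "^[[:space:]]*${2}[[:space:]]+.*NOPASSWD:" "$1"
}

# Example (run as root; /etc/sudoers is the file edited above):
#   has_nopasswd_rule /etc/sudoers dolphinscheduler && echo "rule present"
```

As the deployment user, running sudo -n true is a complementary check: it exits with status 0 only when sudo does not need to ask for a password.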

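Configure Passwordless SSH Login

Because resources have to be sent to different machines during installation, every machine must be able to log in to the others over SSH without a password. The steps are as follows: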

su dolphinscheduler
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys           
Note: after the configuration is done, run ssh localhost to verify it. If you can log in over SSH without entering a password, the setup succeeded.
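
The same check can be scripted so that it never hangs on a password prompt. The sketch below relies on the OpenSSH option BatchMode=yes, which makes ssh fail instead of prompting; the function name and the SSH_BIN override are hypothetical and exist only so the helper can be exercised without a real SSH server.

```shell
#!/bin/sh
# Return 0 if HOST accepts key-based SSH login without prompting.
# BatchMode=yes makes ssh fail rather than ask for a password.
# SSH_BIN is a hypothetical override; it defaults to the real ssh client.
can_ssh_without_password() {
  "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Example:
#   can_ssh_without_password localhost && echo "passwordless SSH works"
```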

Start ZooKeeper

Go to the ZooKeeper installation directory, copy conf/zoo_sample.cfg to conf/zoo.cfg, and change the dataDir entry in conf/zoo.cfg to dataDir=./tmp/zookeeper

# Start ZooKeeper
./bin/zkServer.sh start           
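
The copy-and-edit step for zoo.cfg can also be scripted. This is a minimal sketch, assuming the standard layout where zoo_sample.cfg lives in the conf directory; init_zoo_cfg is a hypothetical helper name.

```shell
#!/bin/sh
# Create conf/zoo.cfg from the sample file and set its dataDir entry.
# A temporary file is used instead of `sed -i` to stay portable.
# usage: init_zoo_cfg ZK_HOME DATA_DIR
init_zoo_cfg() {
  cp "$1/conf/zoo_sample.cfg" "$1/conf/zoo.cfg" &&
    sed "s|^dataDir=.*|dataDir=$2|" "$1/conf/zoo.cfg" > "$1/conf/zoo.cfg.tmp" &&
    mv "$1/conf/zoo.cfg.tmp" "$1/conf/zoo.cfg"
}

# Example, run from the ZooKeeper installation directory:
#   init_zoo_cfg . ./tmp/zookeeper
```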

Switch User

su dolphinscheduler           

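Modify the Configuration

After the basic environment is ready, modify the configuration files to match your machine. The configuration files are located in the bin/env directory: install_env.sh and dolphinscheduler_env.sh.

Modify install_env.sh

The install_env.sh file describes which machines DolphinScheduler will be installed on and which services are installed on each machine. You can find the file at bin/env/install_env.sh and change its variables in the form export <ENV_NAME>=<value>. The configuration used here is as follows.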

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# ---------------------------------------------------------
# INSTALL MACHINE
# ---------------------------------------------------------
# A comma separated list of machine hostname or IP would be installed DolphinScheduler,
# including master, worker, api, alert. If you want to deploy in pseudo-distributed
# mode, just write a pseudo-distributed hostname
# Example for hostnames: ips="ds1,ds2,ds3,ds4,ds5", Example for IPs: ips="192.168.8.1,192.168.8.2,192.168.8.3,192.168.8.4,192.168.8.5"
#ips=${ips:-"ds1,ds2,ds3,ds4,ds5"}
ips="bigdata"
# Port of SSH protocol, default value is 22. For now we only support same port in all `ips` machine
# modify it if you use different ssh port
#sshPort=${sshPort:-"22"}
sshPort="22"
# A comma separated list of machine hostname or IP would be installed Master server, it
# must be a subset of configuration `ips`.
# Example for hostnames: masters="ds1,ds2", Example for IPs: masters="192.168.8.1,192.168.8.2"
#masters=${masters:-"ds1,ds2"}
masters="bigdata"
# A comma separated list of machine <hostname>:<workerGroup> or <IP>:<workerGroup>.All hostname or IP must be a
# subset of configuration `ips`, And workerGroup have default value as `default`, but we recommend you declare behind the hosts
# Example for hostnames: workers="ds1:default,ds2:default,ds3:default", Example for IPs: workers="192.168.8.1:default,192.168.8.2:default,192.168.8.3:default"
#workers=${workers:-"ds1:default,ds2:default,ds3:default,ds4:default,ds5:default"}
workers="bigdata:default"
# A comma separated list of machine hostname or IP would be installed Alert server, it
# must be a subset of configuration `ips`.
# Example for hostname: alertServer="ds3", Example for IP: alertServer="192.168.8.3"
#alertServer=${alertServer:-"ds3"}
alertServer="bigdata"
# A comma separated list of machine hostname or IP would be installed API server, it
# must be a subset of configuration `ips`.
# Example for hostname: apiServers="ds1", Example for IP: apiServers="192.168.8.1"
#apiServers=${apiServers:-"ds1"}
apiServers="bigdata"
# The directory to install DolphinScheduler for all machine we config above. It will automatically be created by `install.sh` script if not exists.
# Do not set this configuration same as the current path (pwd). Do not add quotes to it if you using related path.
#installPath=${installPath:-"/tmp/dolphinscheduler"}
installPath="/opt/module/dolphinscheduler-3.1.5-bin"
# The user to deploy DolphinScheduler for all machine we config above. For now user must create by yourself before running `install.sh`
# script. The user needs to have sudo privileges and permissions to operate hdfs. If hdfs is enabled than the root directory needs
# to be created by this user
#deployUser=${deployUser:-"dolphinscheduler"}
deployUser="root"
# The root of zookeeper, for now DolphinScheduler default registry server is zookeeper.
#zkRoot=${zkRoot:-"/dolphinscheduler"}
zkRoot="/dolphinscheduler"           
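
The comments in install_env.sh require masters, workers, alertServer, and apiServers to be subsets of ips. A small sketch of that check follows; hosts_not_in_ips is a hypothetical helper, not something the installer provides.

```shell
#!/bin/sh
# Print every host in LIST that does not appear in IPS.
# Both arguments are comma-separated; a ":workerGroup" suffix is ignored.
# usage: hosts_not_in_ips LIST IPS
hosts_not_in_ips() {
  for entry in $(echo "$1" | tr ',' ' '); do
    host=${entry%%:*}
    case ",$2," in
      *",$host,"*) ;;
      *) echo "$host" ;;
    esac
  done
}

# Example, after sourcing bin/env/install_env.sh:
#   hosts_not_in_ips "$workers" "$ips"
```

With the single-host configuration shown above, every call should print nothing, because bigdata is the only entry in each list.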

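Modify dolphinscheduler_env.sh

The file ./bin/env/dolphinscheduler_env.sh describes the following configuration:

  • The DolphinScheduler database configuration; see Initialize the Database for details
  • External dependency paths or libraries for certain task types, such as JAVA_HOME and SPARK_HOME, which are defined here
  • The ZooKeeper registry center
  • Server-side settings such as cache and time zone

If you do not use certain task types, you can ignore their external dependencies, but you must change JAVA_HOME, the registry center, and the database settings to match your environment.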

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# JAVA_HOME, will use it to start DolphinScheduler server
#export JAVA_HOME=${JAVA_HOME:-/opt/java/openjdk}
export JAVA_HOME=/opt/module/jdk1.8.0_212
# Database related configuration, set database type, username and password
#export DATABASE=${DATABASE:-postgresql}
export DATABASE="mysql"
#export SPRING_PROFILES_ACTIVE=${DATABASE}
export SPRING_PROFILES_ACTIVE=${DATABASE}
#export SPRING_DATASOURCE_URL
export SPRING_DATASOURCE_URL="jdbc:mysql://bigdata:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8"
#export SPRING_DATASOURCE_USERNAME
export SPRING_DATASOURCE_USERNAME="dolphinscheduler"
#export SPRING_DATASOURCE_PASSWORD
export SPRING_DATASOURCE_PASSWORD="dolphinscheduler"
# DolphinScheduler server related configuration
#export SPRING_CACHE_TYPE=${SPRING_CACHE_TYPE:-none}
export SPRING_CACHE_TYPE="none"
#export SPRING_JACKSON_TIME_ZONE=${SPRING_JACKSON_TIME_ZONE:-UTC}
export SPRING_JACKSON_TIME_ZONE="Asia/Shanghai"
#export MASTER_FETCH_COMMAND_NUM=${MASTER_FETCH_COMMAND_NUM:-10}
export MASTER_FETCH_COMMAND_NUM="10"
# Registry center configuration, determines the type and link of the registry center
#export REGISTRY_TYPE=${REGISTRY_TYPE:-zookeeper}
export REGISTRY_TYPE="zookeeper"
#export REGISTRY_ZOOKEEPER_CONNECT_STRING=${REGISTRY_ZOOKEEPER_CONNECT_STRING:-localhost:2181}
export REGISTRY_ZOOKEEPER_CONNECT_STRING="bigdata:2181"
# Tasks related configurations, need to change the configuration if you use the related tasks.
#export HADOOP_HOME=${HADOOP_HOME:-/opt/soft/hadoop}
export HADOOP_HOME=/opt/module/hadoop-3.1.3
#export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/opt/soft/hadoop/etc/hadoop}
export HADOOP_CONF_DIR=/opt/module/hadoop-3.1.3/etc/hadoop
#export SPARK_HOME1=${SPARK_HOME1:-/opt/soft/spark1}
export SPARK_HOME1=/opt/module/spark-3.0.1
#export SPARK_HOME2=${SPARK_HOME2:-/opt/soft/spark2}
#export PYTHON_HOME=${PYTHON_HOME:-/opt/soft/python}
export PYTHON_HOME=/root/anaconda3/bin/python3
#export HIVE_HOME=${HIVE_HOME:-/opt/soft/hive}
export HIVE_HOME=/opt/module/hive
#export FLINK_HOME=${FLINK_HOME:-/opt/soft/flink}
export FLINK_HOME=/opt/module/flink-1.14.4
#export DATAX_HOME=${DATAX_HOME:-/opt/soft/datax}
#export SEATUNNEL_HOME=${SEATUNNEL_HOME:-/opt/soft/seatunnel}
#export CHUNJUN_HOME=${CHUNJUN_HOME:-/opt/soft/chunjun}
export CHUNJUN_HOME=/opt/module/flinkx-1.11_release/
#export PATH=$HADOOP_HOME/bin:$SPARK_HOME1/bin:$SPARK_HOME2/bin:$PYTHON_HOME/bin:$JAVA_HOME/bin:$HIVE_HOME/bin:$FLINK_HOME/bin:$DATAX_HOME/bin:$SEATUNNEL_HOME/bin:$CHUNJUN_HOME/bin:$PATH
export PATH=$HADOOP_HOME/bin:$SPARK_HOME1/bin:$PYTHON_HOME/bin:$JAVA_HOME/bin:$HIVE_HOME/bin:$FLINK_HOME/bin:$CHUNJUN_HOME/bin:$PATH           
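
Since every path in this file is specific to one machine, a quick existence check can catch typos before the services start. The helper below is a hypothetical sketch; which variables you pass to it depends on the task types you actually use.

```shell
#!/bin/sh
# Print each given path that does not exist on this machine.
missing_paths() {
  for p in "$@"; do
    [ -e "$p" ] || echo "$p"
  done
}

# Example, after sourcing bin/env/dolphinscheduler_env.sh:
#   missing_paths "$JAVA_HOME" "$HADOOP_HOME" "$SPARK_HOME1" "$HIVE_HOME" "$FLINK_HOME"
```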

Initialize the Database

# Log in with an account that can create databases and users, such as root
mysql -uroot -p
(1) Create the database
mysql> CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
(2) Create the user
mysql> CREATE USER 'dolphinscheduler'@'%' IDENTIFIED BY 'dolphinscheduler';
Note:
If the following error appears, the password chosen for the user is too simple.
ERROR 1819 (HY000): Your password does not satisfy the current policy requirements
Either use a more complex password or run the following commands to lower the MySQL password policy level.
mysql> set global validate_password_policy=0;
mysql> set global validate_password_length=4;
(3) Grant privileges to the user
mysql> GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'dolphinscheduler'@'%';
mysql> flush privileges;
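
The statements above can also be generated and piped into the mysql client in one go. The following is a sketch only: render_init_sql is a hypothetical helper, and the IF NOT EXISTS clauses are an addition that assumes MySQL 5.7.6 or later.

```shell
#!/bin/sh
# Print the database initialization statements for the given names.
# usage: render_init_sql DB_NAME DB_USER DB_PASSWORD
render_init_sql() {
  cat <<EOF
CREATE DATABASE IF NOT EXISTS $1 DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE USER IF NOT EXISTS '$2'@'%' IDENTIFIED BY '$3';
GRANT ALL PRIVILEGES ON $1.* TO '$2'@'%';
FLUSH PRIVILEGES;
EOF
}

# Example (prompts for the root password):
#   render_init_sql dolphinscheduler dolphinscheduler dolphinscheduler | mysql -uroot -p
```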

Switch to the MySQL Driver and Upgrade the Schema

bash  /opt/module/dolphinscheduler-3.1.5-bin/tools/bin/upgrade-schema.sh           

Start DolphinScheduler

As the deployment user created above, run the following command to complete the deployment. Runtime logs are written to the logs directory.

bash ./bin/install.sh           
Note: on the first deployment, the message sh: bin/dolphinscheduler-daemon.sh: No such file or directory may appear five times. It is harmless and can be ignored.

Use jps to check that the servers are running:
[dolphinscheduler@bigdata bin]$ jps
21713 ApiApplicationServer
21666 AlertServer
21586 MasterServer
21629 WorkerServer
22095 Jps           
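
The jps listing can be checked automatically. This sketch reads jps output on standard input and reports which of the four server processes shown above are absent; missing_servers is a hypothetical helper name.

```shell
#!/bin/sh
# Read `jps` output on stdin and print each expected server that is absent.
missing_servers() {
  jps_output=$(cat)
  for name in MasterServer WorkerServer ApiApplicationServer AlertServer; do
    echo "$jps_output" | grep -qw "$name" || echo "$name"
  done
}

# Example:
#   jps | missing_servers
```

No output means all four servers were found in the listing.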

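Log In to DolphinScheduler

By default the service cannot be reached from outside the machine because the firewall blocks it, so the required ports have to be opened in the firewall.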

Method 1: firewalld

1. Run firewall-cmd --get-active-zones. When it finishes, it prints the names of the active zones.

2. Run the following command (6379 is only an example port; the DolphinScheduler UI listens on 12345):

firewall-cmd --zone=public --add-port=6379/tcp --permanent

3. Reload the firewall: firewall-cmd --reload

4. Check whether the port is open: firewall-cmd --query-port=6379/tcp

Method 2: iptables

1. First run the following command:

/sbin/iptables -I INPUT -p tcp --dport xxx -j ACCEPT

2. Then run:

/etc/rc.d/init.d/iptables save

Alternatively, add the following rules:

1. -A INPUT -m state --state NEW -m tcp -p tcp --dport xxx -j ACCEPT

2. Followed by:

-A INPUT -j REJECT --reject-with icmp-host-prohibited

Note: xxx is the port number you want to open, for example 6379.

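Test: on Windows, press Win+R, type cmd, and run the following command (the Telnet client must be enabled). A blank screen means the connection succeeded:

telnet 192.168.xx.xx 6379

Note: to enable Telnet:

1. Open Control Panel and select Programs.

2. Click Programs, then open Programs and Features.

3. Check the Telnet feature and click OK.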

Open a Specific Port

# As root, open the port
firewall-cmd --zone=public --add-port=12345/tcp --permanent
firewall-cmd --reload
firewall-cmd --query-port=12345/tcp
# As root, close the port
firewall-cmd --remove-port=12345/tcp  --permanent
firewall-cmd --reload
firewall-cmd --query-port=12345/tcp
firewall-cmd --list-ports
# Check whether a process is listening on the port
lsof -i:12345

Open http://bigdata:12345/dolphinscheduler/ui in a browser to log in to the system UI. The default username and password are admin/dolphinscheduler123.

Start and Stop Services

# Stop all services in the cluster
bash ./bin/stop-all.sh

# Start all services in the cluster
bash ./bin/start-all.sh

# Stop and start the Master
bash ./bin/dolphinscheduler-daemon.sh stop master-server
bash ./bin/dolphinscheduler-daemon.sh start master-server

# Start and stop the Worker
bash ./bin/dolphinscheduler-daemon.sh start worker-server
bash ./bin/dolphinscheduler-daemon.sh stop worker-server

# Start and stop the API server
bash ./bin/dolphinscheduler-daemon.sh start api-server
bash ./bin/dolphinscheduler-daemon.sh stop api-server

# Start and stop the Alert server
bash ./bin/dolphinscheduler-daemon.sh start alert-server
bash ./bin/dolphinscheduler-daemon.sh stop alert-server

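Note 1: Each service has its own dolphinscheduler_env.sh file at <service>/conf/dolphinscheduler_env.sh, which makes microservice-style deployments easier. You can start each service with different environment variables: configure <service>/conf/dolphinscheduler_env.sh for that service and start it with <service>/bin/start.sh. However, if you start a server with bin/dolphinscheduler-daemon.sh start <service>, the script overwrites <service>/conf/dolphinscheduler_env.sh with bin/env/dolphinscheduler_env.sh before starting the service, which spares users from editing several configuration files.

Note 2: For what each service does, see the "System Architecture Design" section. The Python gateway service starts together with api-server by default. To disable it, set python-gateway.enabled : false in the api-server configuration file api-server/conf/application.yaml.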
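
When the same action has to be applied to every server in turn, the per-service commands can be wrapped in a loop. This is a minimal sketch: ds_each_server and the DRY_RUN switch are hypothetical names introduced here, and only the start and stop actions shown earlier are assumed.

```shell
#!/bin/sh
# Run one action (start or stop) for every server in turn.
# With DRY_RUN=1 the commands are printed instead of executed.
# usage: ds_each_server ACTION [DS_HOME]
ds_each_server() {
  action=$1
  ds_home=${2:-.}
  for server in master-server worker-server api-server alert-server; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "bash $ds_home/bin/dolphinscheduler-daemon.sh $action $server"
    else
      bash "$ds_home/bin/dolphinscheduler-daemon.sh" "$action" "$server"
    fi
  done
}

# Example: preview what a full stop would run
#   DRY_RUN=1 ds_each_server stop /opt/module/dolphinscheduler-3.1.5-bin
```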

Cluster Control Script

#!/bin/bash
# Require one argument: start, restart, stop or status
if [ $# -lt 1 ]
then
  echo "Input Args Error....."
  exit 1
fi
# Run the requested action on every host in the list
for i in bigdata
do
case $1 in
start)
  echo "==================START $i dolphinscheduler cluster==================="
  ssh $i sh /opt/module/dolphinscheduler-3.1.5-bin/bin/start-all.sh
;;
restart)
  echo "==================RESTART $i dolphinscheduler cluster==================="
  ssh $i sh /opt/module/dolphinscheduler-3.1.5-bin/bin/stop-all.sh
  ssh $i sh /opt/module/dolphinscheduler-3.1.5-bin/bin/start-all.sh
;;
stop)
  echo "==================STOP $i dolphinscheduler cluster==================="
  ssh $i sh /opt/module/dolphinscheduler-3.1.5-bin/bin/stop-all.sh
;;
status)
  echo "==================STATUS $i dolphinscheduler cluster==================="
  ssh $i sh /opt/module/dolphinscheduler-3.1.5-bin/bin/status-all.sh
;;
*)
  echo "Input Args Error....."
  exit 1
;;
esac
done

sh dolphinscheduler.sh start      # start
sh dolphinscheduler.sh restart    # restart
sh dolphinscheduler.sh stop       # stop

Login Page

http://bigdata:12345/dolphinscheduler/ui/login
Username: admin
Password: dolphinscheduler123

Note: Downloading the MySQL Driver

Steps

Go to the MySQL website. The driver download address is:

https://downloads.mysql.com/archives/c-j/

This example uses the files needed for Java development. Choose the install-free package (an archive) to download.

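The download is an archive such as mysql-connector-java-8.0.16.zip. Extract it; the essential file is the .jar inside, such as mysql-connector-java-8.0.16.jar. The other files serve no real purpose, but it is advisable to keep them. For convenience, put the extracted folder somewhere you will not delete it later.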
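
DolphinScheduler can only use the driver once the jar is on the classpath of each module. The sketch below copies it into every module's libs directory; the module list is an assumption based on the layout of the 3.1.x binary package, and install_mysql_driver is a hypothetical helper name, so check the directories in your own installation first.

```shell
#!/bin/sh
# Copy the MySQL driver jar into the libs directory of each module.
# Stops at the first copy that fails.
# usage: install_mysql_driver JAR DS_HOME
install_mysql_driver() {
  for module in api-server alert-server master-server worker-server tools; do
    cp "$1" "$2/$module/libs/" || return 1
  done
}

# Example:
#   install_mysql_driver mysql-connector-java-8.0.16.jar /opt/module/dolphinscheduler-3.1.5-bin
```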
