Installation Notes
OS: **CentOS 7**
Hadoop: **hadoop-2.7.4**
Tooling: **Xshell** (can send commands to multiple machines at once)
Basic Environment Setup
- All nodes: edit hosts

```shell
[root@hadoop-01 ~]# vi /etc/hosts
# append
192.168.74.139 hadoop-01
192.168.74.140 hadoop-02
192.168.74.141 hadoop-03
```
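When repeating the hosts edit on several nodes, it can also be scripted. A minimal sketch, using the same three addresses, that appends each entry only if it is not already present; it writes to a local `hosts.demo` file here so it can be tried safely, so point `HOSTS` at `/etc/hosts` on the real nodes:

```shell
# Idempotently append the cluster's host entries.
# HOSTS points at a local demo file here; use /etc/hosts on real nodes.
HOSTS=hosts.demo
touch "$HOSTS"
for entry in "192.168.74.139 hadoop-01" \
             "192.168.74.140 hadoop-02" \
             "192.168.74.141 hadoop-03"; do
    # -qxF: quiet, whole-line, fixed-string match
    grep -qxF "$entry" "$HOSTS" || printf '%s\n' "$entry" >> "$HOSTS"
done
```

Because of the `grep` guard, running the loop twice leaves the file unchanged.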
- All nodes: firewall configuration

Note: on a local VM you can simply disable the firewall; on Alibaba Cloud, configure firewall (security-group) rules instead.

```shell
[root@hadoop-01 ~]# systemctl status firewalld.service
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2021-07-21 19:44:31 +08; 19min ago
     Docs: man:firewalld(1)
 Main PID: 717 (firewalld)
   CGroup: /system.slice/firewalld.service
           └─717 /usr/bin/python2 -Es /usr/sbin/firewalld --nofork --nopid

Jul 21 19:44:29 hadoop-01 systemd[1]: Starting firewalld - dynamic firewall daemon...
Jul 21 19:44:31 hadoop-01 systemd[1]: Started firewalld - dynamic firewall daemon.
Jul 21 19:44:31 hadoop-01 firewalld[717]: WARNING: AllowZoneDrifting is enabled. This is considered an insecure configuration option. It will be removed in a future release. Please consider disabling it now.
[root@hadoop-01 ~]# systemctl stop firewalld.service
[root@hadoop-01 ~]# systemctl disable firewalld.service
Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
```
- All nodes: generate an ssh key

Note: if a key already exists, delete it first.

```shell
[root@hadoop-01 ~]# cd
[root@hadoop-01 ~]# ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/root/.ssh/id_dsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
SHA256:C4IRep0I9fGDxkMFgcDS/450hjrSsA6xp4rDu23oZLw root@hadoop-01
The key's randomart image is:
+---[DSA 1024]----+
|++o.=+.          |
|.+oB =           |
|o +.O o          |
| . +.. .         |
|. . .o. S        |
|oo o.+. .        |
|+Boo = .         |
|O=*.. .          |
|BE+o             |
+----[SHA256]-----+
[root@hadoop-01 ~]# cd /root/.ssh/
[root@hadoop-01 .ssh]# cat id_dsa.pub >> authorized_keys
```
- Copy keys between nodes

```shell
# run on hadoop-01
[root@hadoop-01 .ssh]# ssh-copy-id -i /root/.ssh/id_dsa.pub hadoop-03
# run on hadoop-02
[root@hadoop-02 .ssh]# ssh-copy-id -i /root/.ssh/id_dsa.pub hadoop-03
# run on hadoop-03
[root@hadoop-03 .ssh]# scp /root/.ssh/authorized_keys hadoop-01:/root/.ssh/
[root@hadoop-03 .ssh]# scp /root/.ssh/authorized_keys hadoop-02:/root/.ssh/
```
- Test ssh login

```shell
[root@hadoop-01 ~]# ssh hadoop-03
Last login: Wed Jul 21 23:13:22 2021 from 192.168.74.1
[root@hadoop-03 ~]#
```
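To confirm passwordless ssh works to every node without logging in to each one by hand, a small loop like the following can help. This is a sketch, not part of the original setup; `BatchMode=yes` makes ssh fail immediately instead of prompting for a password:

```shell
# Probe each node; collect results instead of stopping at the first failure.
status=""
for host in hadoop-01 hadoop-02 hadoop-03; do
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
        status="$status $host:ok"
    else
        status="$status $host:unreachable"
    fi
done
echo "$status"
```

If any node reports `unreachable`, redo the key copy step for that node before continuing; `start-dfs.sh` and `start-yarn.sh` rely on passwordless ssh.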
JDK Setup
- All nodes: check for a preinstalled Java

Note: this guide uses CentOS 7 minimal, which ships without any Java packages. If the query below finds any, remove them with `rpm -e --nodeps <package-name>`.

```shell
[root@hadoop-01 ~]# rpm -qa | grep java
[root@hadoop-01 ~]#
```
- All nodes: create a /software directory at the filesystem root for installed software, then copy the JDK into it

```shell
[root@hadoop-01 ~]# mkdir /software
[root@hadoop-01 ~]# cd /software/
[root@hadoop-01 software]# scp jdk-8u181-linux-x64.tar.gz root@192.168.74.140:/software
[root@hadoop-01 software]# scp jdk-8u181-linux-x64.tar.gz root@192.168.74.141:/software
[root@hadoop-01 software]#
```
- All nodes: extract and rename

```shell
[root@hadoop-01 software]# tar -zxvf jdk-8u181-linux-x64.tar.gz
[root@hadoop-01 software]# mv jdk1.8.0_181/ jdk
```
- All nodes: update the environment

```shell
# back up the file first
[root@hadoop-01 software]# cp /etc/profile /etc/profile_back
[root@hadoop-01 ~]# vi /etc/profile
# append
export JAVA_HOME=/software/jdk
export PATH=.:$PATH:$JAVA_HOME/bin
[root@hadoop-01 ~]# source /etc/profile
[root@hadoop-01 ~]# java -version
java version "1.8.0_181"
Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
```
Hadoop Installation
Note: install on one machine first, then copy the result to the other machines.
- hadoop-01: extract and rename Hadoop

```shell
[root@hadoop-01 software]# tar -xzvf hadoop-2.7.4.tar.gz
[root@hadoop-01 software]# mv hadoop-2.7.4 hadoop
```
- All nodes: configure the path

```shell
[root@hadoop-01 ~]# vi /etc/profile
# append
export HADOOP_HOME=/software/hadoop
export PATH=.:$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
[root@hadoop-01 ~]# source /etc/profile
```
- hadoop-01: hadoop-env.sh configuration

```shell
[root@hadoop-01 hadoop]# cd /software/hadoop/etc/hadoop/
[root@hadoop-01 hadoop]# vi hadoop-env.sh
# append
export JAVA_HOME=/software/jdk
export HADOOP_CLASSPATH=.:$CLASSPATH:$HADOOP_CLASSPATH:$HADOOP_HOME/bin
export HADOOP_PID_DIR=/software/hadoop/pids
```
- hadoop-01: hdfs-site.xml configuration

Add the following inside `<configuration></configuration>`:
```xml
<property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///software/hadoop/data/datanode</value>
</property>
<property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///software/hadoop/data/namenode</value>
</property>
<property>
    <name>dfs.namenode.http-address</name>
    <value>hadoop-01:50070</value>
</property>
<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop-02:50090</value>
</property>
<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>
```
- hadoop-01: yarn-site.xml configuration

Add the following inside `<configuration></configuration>`:
```xml
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>hadoop-01:8025</value>
</property>
<property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>hadoop-01:8030</value>
</property>
<property>
    <name>yarn.resourcemanager.address</name>
    <value>hadoop-01:8050</value>
</property>
```
- hadoop-01: core-site.xml configuration

Add the following inside `<configuration></configuration>`:
```xml
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop-01/</value>
</property>
<property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop-01:2181,hadoop-02:2181,hadoop-03:2181</value>
</property>
```
- hadoop-01: slaves configuration

```shell
[root@hadoop-01 hadoop]# vi slaves
# replace the contents with
hadoop-02
hadoop-03
```
- hadoop-01: yarn-env.sh configuration

```shell
[root@hadoop-01 hadoop]# vi yarn-env.sh
# append
export YARN_PID_DIR=/software/hadoop/pids
```
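One file this walkthrough does not touch is mapred-site.xml; without it, MapReduce jobs run with the local runner rather than on YARN. If you want jobs scheduled by the ResourceManager, a commonly used minimal fragment is the following (first copy `mapred-site.xml.template` to `mapred-site.xml` in the same directory, then add this inside `<configuration></configuration>`):

```xml
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
```

Doing this before the copy step below propagates it to the other nodes automatically.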
- hadoop-01: copy Hadoop to the other nodes

```shell
[root@hadoop-01 software]# scp -r /software/hadoop hadoop-02:/software/
[root@hadoop-01 software]# scp -r /software/hadoop hadoop-03:/software/
```
Starting/Stopping the Hadoop Cluster
- hadoop-01: format the filesystem

Run `hdfs namenode -format` on hadoop-01; if the output contains "successfully", initialization succeeded.
- hadoop-01: start-dfs.sh reports an error on startup

Fix: set JAVA_HOME in /software/hadoop/etc/hadoop/hadoop-env.sh on every node.

```shell
[root@hadoop-01 sbin]# cd /software/hadoop/sbin/
[root@hadoop-01 sbin]# start-dfs.sh
Starting namenodes on [hadoop-01]
The authenticity of host 'hadoop-01 (192.168.74.139)' can't be established.
ECDSA key fingerprint is SHA256:GYHhhWIAfdzHvTH54Vq36wY0IBckonbF6oPFb4k0ALc.
ECDSA key fingerprint is MD5:9d:0f:94:a1:f8:98:ab:1c:c9:54:0f:87:88:91:57:ec.
Are you sure you want to continue connecting (yes/no)? yes
hadoop-01: Warning: Permanently added 'hadoop-01,192.168.74.139' (ECDSA) to the list of known hosts.
hadoop-01: Error: JAVA_HOME is not set and could not be found.
hadoop-03: Error: JAVA_HOME is not set and could not be found.
hadoop-02: Error: JAVA_HOME is not set and could not be found.
Starting secondary namenodes [hadoop-02]
hadoop-02: Error: JAVA_HOME is not set and could not be found.
```
```shell
[root@hadoop-01 sbin]# vi /software/hadoop/etc/hadoop/hadoop-env.sh
# append
export JAVA_HOME=/software/jdk
# start again
[root@hadoop-01 sbin]# start-dfs.sh
Starting namenodes on [hadoop-01]
hadoop-01: starting namenode, logging to /software/hadoop/logs/hadoop-root-namenode-hadoop-01.out
hadoop-03: starting datanode, logging to /software/hadoop/logs/hadoop-root-datanode-hadoop-03.out
hadoop-02: datanode running as process 2078. Stop it first.
Starting secondary namenodes [hadoop-02]
hadoop-02: starting secondarynamenode, logging to /software/hadoop/logs/hadoop-root-secondarynamenode-hadoop-02.out
```
- hadoop-01: start-yarn.sh

```shell
[root@hadoop-01 sbin]# start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /software/hadoop/logs/yarn-root-resourcemanager-hadoop-01.out
hadoop-03: starting nodemanager, logging to /software/hadoop/logs/yarn-root-nodemanager-hadoop-03.out
hadoop-02: starting nodemanager, logging to /software/hadoop/logs/yarn-root-nodemanager-hadoop-02.out
```
- Processes on each node

```shell
# hadoop-01
[root@hadoop-01 ~]# jps
1889 ResourceManager
1611 NameNode
2142 Jps
```

```shell
# hadoop-02
[root@hadoop-02 ~]# jps
1712 NodeManager
1534 DataNode
1631 SecondaryNameNode
1839 Jps
```

```shell
# hadoop-03
[root@hadoop-03 ~]# jps
1640 NodeManager
1531 DataNode
1772 Jps
```
- In a browser, open http://192.168.74.139:50070 (HDFS NameNode web UI)
- In a browser, open http://192.168.74.139:8088 (YARN ResourceManager web UI)
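As a quick check without a browser, the two web UIs can also be probed from any machine that can reach hadoop-01. A sketch, using the same IP as above; a code of `000` means no response:

```shell
# Fetch only the HTTP status code from each web UI endpoint.
results=""
for url in http://192.168.74.139:50070 http://192.168.74.139:8088; do
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url") || code=000
    results="$results $url=$code"
done
echo "$results"
```

Both endpoints should return `200` once `start-dfs.sh` and `start-yarn.sh` have completed.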