
001 Hadoop Distributed Cluster Setup

Installation Notes

OS: **CentOS 7** 
Hadoop: **hadoop-2.7.4**
Tool: **Xshell** (can send commands to several machines at once)
           

Basic Environment Configuration

  1. All nodes: edit hosts
    [root@hadoop-01 ~]# vi /etc/hosts
    
    # add
    192.168.74.139 hadoop-01
    192.168.74.140 hadoop-02
    192.168.74.141 hadoop-03
               
  2. All nodes: firewall configuration
    [root@hadoop-01 ~]# systemctl status firewalld.service
    
    
    ● firewalld.service - firewalld - dynamic firewall daemon
       Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
       Active: active (running) since Wed 2021-07-21 19:44:31 +08; 19min ago
         Docs: man:firewalld(1)
     Main PID: 717 (firewalld)
       CGroup: /system.slice/firewalld.service
               └─717 /usr/bin/python2 -Es /usr/sbin/firewalld --nofork --nopid
    
    Jul 21 19:44:29 hadoop-01 systemd[1]: Starting firewalld - dynamic firewall daemon...
    Jul 21 19:44:31 hadoop-01 systemd[1]: Started firewalld - dynamic firewall daemon.
    Jul 21 19:44:31 hadoop-01 firewalld[717]: WARNING: AllowZoneDrifting is enabled. This is considered an insecure configuration option. It will be removed in a future release. Please consider disabling it now.
    
    [root@hadoop-01 ~]# systemctl stop firewalld.service
    [root@hadoop-01 ~]# systemctl disable firewalld.service
    
    Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service.
    Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service. 
               
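    Note: on a virtual machine the firewall can simply be turned off; on Alibaba Cloud, configure firewall rules instead.
  3. All nodes: generate an SSH key
    [root@hadoop-01 ~]# cd
    [root@hadoop-01 ~]# ssh-keygen -t dsa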
    
    Generating public/private dsa key pair.
    Enter file in which to save the key (/root/.ssh/id_dsa): 
    Created directory '/root/.ssh'.
    Enter passphrase (empty for no passphrase): 
    Enter same passphrase again: 
    Your identification has been saved in /root/.ssh/id_dsa.
    Your public key has been saved in /root/.ssh/id_dsa.pub.
    The key fingerprint is:
    SHA256:C4IRep0I9fGDxkMFgcDS/450hjrSsA6xp4rDu23oZLw root@hadoop-01
    The key's randomart image is:
    +---[DSA 1024]----+
    |++o.=+.          |
    |.+oB =           |
    |o +.O o          |
    | . +.. .         |
    |. . .o. S        |
    |oo  o.+. .       |
    |+Boo =  .        |
    |O=*.. .          |
    |BE+o             |
    +----[SHA256]-----+
    
    [root@hadoop-01 ~]# cd /root/.ssh/
    [root@hadoop-01 .ssh]# cat id_dsa.pub >> authorized_keys
               
    Note: if a key pair already exists, delete it first.
  4. Copy the keys
    # run on hadoop-01
    [root@hadoop-01 .ssh]# ssh-copy-id -i /root/.ssh/id_dsa.pub hadoop-03
    
    # run on hadoop-02
    [root@hadoop-02 .ssh]# ssh-copy-id -i /root/.ssh/id_dsa.pub hadoop-03
    
    # run on hadoop-03 (its authorized_keys now holds all three public keys)
    [root@hadoop-03 .ssh]# scp /root/.ssh/authorized_keys hadoop-01:/root/.ssh/
    [root@hadoop-03 .ssh]# scp /root/.ssh/authorized_keys hadoop-02:/root/.ssh/
               
  5. Test SSH login
    [root@hadoop-01 ~]# ssh hadoop-03
    Last login: Wed Jul 21 23:13:22 2021 from 192.168.74.1
    [root@hadoop-03 ~]#
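
The hosts entries from step 1 can also be added by a small script, which helps when the same change has to be repeated on every node. The following is a minimal sketch, not the author's method: add_host_entry is a hypothetical helper, and the demo writes to a scratch file rather than /etc/hosts. It appends an entry only when the hostname is not already mapped, so running it twice does no harm.

```shell
#!/usr/bin/env bash
# Sketch: append cluster host entries to a hosts file only if they are missing.
# add_host_entry is a hypothetical helper, not part of any tool.

add_host_entry() {
    local hosts_file="$1" ip="$2" name="$3"
    # Skip when a non-comment line already maps this hostname.
    if awk -v n="$name" '
        $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$hosts_file"; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
}

# Demo on a scratch file so nothing on the real system is touched.
demo_hosts="$(mktemp)"
printf '127.0.0.1 localhost\n' > "$demo_hosts"
add_host_entry "$demo_hosts" 192.168.74.139 hadoop-01
add_host_entry "$demo_hosts" 192.168.74.140 hadoop-02
add_host_entry "$demo_hosts" 192.168.74.141 hadoop-03
add_host_entry "$demo_hosts" 192.168.74.139 hadoop-01   # repeated call is a no-op
cat "$demo_hosts"
```

To apply it for real, pass /etc/hosts as the first argument (as root) on each node.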
               

JDK Configuration

  1. All nodes: check for a system-bundled Java
    [root@hadoop-01 ~]# rpm -qa | grep java
    [root@hadoop-01 ~]#
               
    Note: this system is a CentOS 7 minimal install, so no Java packages are present. If any are found, remove them with 'rpm -e --nodeps <package name>'.
  2. All nodes: create a /software directory for installed software and transfer the JDK into it
    [root@hadoop-01 ~]# mkdir /software
    [root@hadoop-01 ~]# cd /software/
    [root@hadoop-01 software]# scp jdk-8u181-linux-x64.tar.gz root@hadoop-02:/software
    [root@hadoop-01 software]# scp jdk-8u181-linux-x64.tar.gz root@hadoop-03:/software
    [root@hadoop-01 software]#
               
  3. All nodes: extract the archive and rename the directory
    [root@hadoop-01 software]# tar -zxvf jdk-8u181-linux-x64.tar.gz
    [root@hadoop-01 software]# mv jdk1.8.0_181/ jdk
               
  4. All nodes: edit the configuration
    # back up the file
    [root@hadoop-01 software]# cp /etc/profile /etc/profile_back
    [root@hadoop-01 ~]# vi /etc/profile
    
    # add
    export JAVA_HOME=/software/jdk
    export PATH=.:$PATH:$JAVA_HOME/bin
    
    [root@hadoop-01 ~]# source /etc/profile
    [root@hadoop-01 ~]# java -version
    
    java version "1.8.0_181"
    Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
    Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
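
Besides running java -version, it can help to confirm that JAVA_HOME itself points at a real JDK, since a mistyped path in /etc/profile only shows up later when Hadoop starts. The following is a minimal sketch under that assumption: check_java_home is a hypothetical helper, and /software/jdk is the path used in this guide.

```shell
#!/usr/bin/env bash
# Sketch: confirm that a JDK directory looks usable before relying on it.
# check_java_home is a hypothetical helper name.

check_java_home() {
    local java_home="$1"
    if [ -z "$java_home" ]; then
        echo "JAVA_HOME is empty" >&2
        return 1
    fi
    if [ ! -x "$java_home/bin/java" ]; then
        echo "no executable found at $java_home/bin/java" >&2
        return 1
    fi
    echo "JAVA_HOME looks valid: $java_home"
}

# Fall back to the path used in this guide when JAVA_HOME is unset.
check_java_home "${JAVA_HOME:-/software/jdk}" \
    || echo "fix /etc/profile and run 'source /etc/profile' again"
```

The function returns a non-zero status on failure, so it can also be used as a guard at the top of a longer setup script.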
               

Hadoop Installation

Note: install on one machine first, then copy it to the other machines.

  1. hadoop-01: extract and rename hadoop
    [root@hadoop-01 software]# tar -xzvf hadoop-2.7.4.tar.gz
    [root@hadoop-01 software]# mv hadoop-2.7.4 hadoop
               
  2. All nodes: configure paths
    [root@hadoop-01 ~]# vi /etc/profile
    
    # add
    export HADOOP_HOME=/software/hadoop
    export PATH=.:$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
    
    [root@hadoop-01 ~]# source /etc/profile
               
  3. hadoop-01: configure hadoop-env.sh
    [root@hadoop-01 hadoop]# cd /software/hadoop/etc/hadoop/
    [root@hadoop-01 hadoop]# vi hadoop-env.sh
    
    # add
    export JAVA_HOME=/software/jdk
    export HADOOP_CLASSPATH=.:$CLASSPATH:$HADOOP_CLASSPATH:$HADOOP_HOME/bin
    export HADOOP_PID_DIR=/software/hadoop/pids
               
  4. hadoop-01: configure hdfs-site.xml

    Add inside <configuration></configuration>:

    <property>
     <name>dfs.datanode.data.dir</name>
     <value>file:///software/hadoop/data/datanode</value>
    </property>
    <property>
     <name>dfs.namenode.name.dir</name>
     <value>file:///software/hadoop/data/namenode</value>
    </property>
    <property>
     <name>dfs.namenode.http-address</name>
     <value>hadoop-01:50070</value>
    </property>
    <property>
     <name>dfs.namenode.secondary.http-address</name>
     <value>hadoop-02:50090</value>
    </property>
    <property>
     <name>dfs.replication</name>
     <value>1</value>
    </property>
               
  5. hadoop-01: configure yarn-site.xml

    Add inside <configuration></configuration>:

    <property>
     <name>yarn.nodemanager.aux-services</name>
     <value>mapreduce_shuffle</value>
    </property>
    <property>
     <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
     <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
     <name>yarn.resourcemanager.resource-tracker.address</name>
     <value>hadoop-01:8025</value>
    </property>
    <property>
     <name>yarn.resourcemanager.scheduler.address</name>
     <value>hadoop-01:8030</value>
    </property>
    <property>
     <name>yarn.resourcemanager.address</name>
     <value>hadoop-01:8050</value>
    </property>
               
  6. hadoop-01: configure core-site.xml

    Add inside <configuration></configuration>:

    <property>
     <name>fs.defaultFS</name>
     <value>hdfs://hadoop-01/</value>
    </property>
    <property>
     <name>ha.zookeeper.quorum</name>
     <value>hadoop-01:2181,hadoop-02:2181,hadoop-03:2181</value>
    </property>
               
  7. hadoop-01: configure slaves
    [root@hadoop-01 hadoop]# vi slaves
    
    # replace the contents with
    hadoop-02
    hadoop-03
               
  8. hadoop-01: configure yarn-env.sh
    [root@hadoop-01 hadoop]# vi yarn-env.sh
    
    # add
    export YARN_PID_DIR=/software/hadoop/pids
               
  9. hadoop-01: copy hadoop to the other nodes
    [root@hadoop-01 software]# scp -r /software/hadoop hadoop-02:/software/
    [root@hadoop-01 software]# scp -r /software/hadoop hadoop-03:/software/
               
  10. All nodes
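
Step 3 above adds an export JAVA_HOME line to hadoop-env.sh by hand. The stock hadoop-env.sh shipped with Hadoop 2.7.x normally already contains the line export JAVA_HOME=${JAVA_HOME}; if that line is left in place below the added one, daemons launched over ssh may see an empty value, because non-interactive shells generally do not read /etc/profile. The following is a minimal sketch of rewriting the existing line instead of adding a second one. set_java_home is a hypothetical helper, and the demo runs against a scratch file, not the real hadoop-env.sh.

```shell
#!/usr/bin/env bash
# Sketch: make hadoop-env.sh export a fixed JAVA_HOME.
# set_java_home is a hypothetical helper name.

set_java_home() {
    local env_file="$1" java_home="$2" tmp
    if grep -q '^[[:space:]]*export[[:space:]]\{1,\}JAVA_HOME=' "$env_file"; then
        # Rewrite every existing export so a later line cannot override the value.
        tmp="$(mktemp)"
        sed "s|^[[:space:]]*export[[:space:]]\{1,\}JAVA_HOME=.*|export JAVA_HOME=${java_home}|" \
            "$env_file" > "$tmp" && cat "$tmp" > "$env_file"
        rm -f "$tmp"
    else
        # No export present yet: append one.
        printf 'export JAVA_HOME=%s\n' "$java_home" >> "$env_file"
    fi
}

# Demo on a scratch file that mimics the stock line.
demo_env="$(mktemp)"
printf '%s\n' '# The java implementation to use.' 'export JAVA_HOME=${JAVA_HOME}' > "$demo_env"
set_java_home "$demo_env" /software/jdk
cat "$demo_env"
```

On the cluster, the first argument would be /software/hadoop/etc/hadoop/hadoop-env.sh, run on hadoop-01 before the directory is copied to the other nodes.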

Starting/Stopping the Hadoop Cluster

  1. hadoop-01: format the file system

    If the output contains "successfully", the format succeeded.

  2. hadoop-01: start-dfs.sh fails on startup
    [root@hadoop-01 sbin]# cd /software/hadoop/sbin/
    [root@hadoop-01 sbin]# start-dfs.sh
    
    Starting namenodes on [hadoop-01]
    The authenticity of host 'hadoop-01 (192.168.74.139)' can't be established.
    ECDSA key fingerprint is SHA256:GYHhhWIAfdzHvTH54Vq36wY0IBckonbF6oPFb4k0ALc.
    ECDSA key fingerprint is MD5:9d:0f:94:a1:f8:98:ab:1c:c9:54:0f:87:88:91:57:ec.
    Are you sure you want to continue connecting (yes/no)? yes
    hadoop-01: Warning: Permanently added 'hadoop-01,192.168.74.139' (ECDSA) to the list of known hosts.
    hadoop-01: Error: JAVA_HOME is not set and could not be found.
    hadoop-03: Error: JAVA_HOME is not set and could not be found.
    hadoop-02: Error: JAVA_HOME is not set and could not be found.
    Starting secondary namenodes [hadoop-02]
    hadoop-02: Error: JAVA_HOME is not set and could not be found.
    
               
    Fix: on every node, set JAVA_HOME in /software/hadoop/etc/hadoop/hadoop-env.sh
    [root@hadoop-01 sbin]# vi /software/hadoop/etc/hadoop/hadoop-env.sh
    
    # add
    export JAVA_HOME=/software/jdk
    
    # start again
    [root@hadoop-01 sbin]# start-dfs.sh
    
    Starting namenodes on [hadoop-01]
    hadoop-01: starting namenode, logging to /software/hadoop/logs/hadoop-root-namenode-hadoop-01.out
    hadoop-03: starting datanode, logging to /software/hadoop/logs/hadoop-root-datanode-hadoop-03.out
    hadoop-02: datanode running as process 2078. Stop it first.
    Starting secondary namenodes [hadoop-02]
    hadoop-02: starting secondarynamenode, logging to /software/hadoop/logs/hadoop-root-secondarynamenode-hadoop-02.out
               
  3. hadoop-01: run start-yarn.sh
    [root@hadoop-01 sbin]# start-yarn.sh
    
    starting yarn daemons
    starting resourcemanager, logging to /software/hadoop/logs/yarn-root-resourcemanager-hadoop-01.out
    hadoop-03: starting nodemanager, logging to /software/hadoop/logs/yarn-root-nodemanager-hadoop-03.out
    hadoop-02: starting nodemanager, logging to /software/hadoop/logs/yarn-root-nodemanager-hadoop-02.out
               
  4. Processes on each node
    [root@hadoop-01 ~]# jps
    
    1889 ResourceManager
    1611 NameNode
    2142 Jps
               
    [root@hadoop-02 ~]# jps
    
    1712 NodeManager
    1534 DataNode
    1631 SecondaryNameNode
    1839 Jps
               
    [root@hadoop-03 ~]# jps
    
    1640 NodeManager
    1531 DataNode
    1772 Jps
               
  5. Open the NameNode web UI in a browser: http://192.168.74.139:50070
  6. Open the ResourceManager web UI in a browser: http://192.168.74.139:8088
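
The jps listings in step 4 can be checked mechanically instead of by eye. The following is a minimal sketch, assuming jps prints one "pid name" pair per line as in the listings above. missing_daemons is a hypothetical helper that prints every expected daemon name not found in the given output; names are compared exactly, so NameNode does not match SecondaryNameNode.

```shell
#!/usr/bin/env bash
# Sketch: report which expected Hadoop daemons are absent from jps output.
# missing_daemons is a hypothetical helper name.

missing_daemons() {
    local jps_output="$1"; shift
    local daemon
    for daemon in "$@"; do
        # jps prints "<pid> <name>"; compare the second column exactly.
        if ! printf '%s\n' "$jps_output" \
            | awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}

# Sample output copied from the hadoop-02 listing in step 4.
sample='1712 NodeManager
1534 DataNode
1631 SecondaryNameNode
1839 Jps'

# All three are present on hadoop-02, so this prints nothing.
missing_daemons "$sample" DataNode NodeManager SecondaryNameNode
# Neither runs on hadoop-02, so both names are printed.
missing_daemons "$sample" NameNode ResourceManager
```

On a live node the first argument would be "$(jps)", with the expected names chosen per node: NameNode and ResourceManager on hadoop-01, DataNode, NodeManager and SecondaryNameNode on hadoop-02, DataNode and NodeManager on hadoop-03.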

Cluster installation complete!!!
