
Building an HDFS High Availability (HA) Cluster

1. Cluster Architecture

(Figure: HDFS HA cluster architecture)

2. Cluster Plan

            namenode    datanode    journalnode    zkfc    zookeeper
bigdata01   yes                     yes            yes     yes
bigdata02   yes         yes         yes            yes     yes
bigdata03               yes         yes                    yes
           

For an HDFS HA cluster, only the HDFS-related processes need to be started; the YARN processes can be left stopped, since the two sets of processes are independent of each other.

In an HDFS HA cluster, the SecondaryNameNode process is not needed.

① namenode: the HDFS master node

② datanode: the HDFS worker node

③ journalnode: the JournalNode process, which synchronizes the edits log between the NameNodes

④ zkfc (DFSZKFailoverController): monitors the NameNode state and handles switching the NameNode state on failure

⑤ zookeeper (QuorumPeerMain): stores the state information of the HA cluster nodes; a rough per-node process sketch is shown below
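
Once everything in the plan is running, jps on each node should show roughly the following processes (a sketch based on the plan above; process IDs are omitted and will differ in practice):

# bigdata01
NameNode  JournalNode  DFSZKFailoverController  QuorumPeerMain

# bigdata02
NameNode  DataNode  JournalNode  DFSZKFailoverController  QuorumPeerMain

# bigdata03
DataNode  JournalNode  QuorumPeerMain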

Environment preparation, three nodes:

bigdata01 192.168.182.100
bigdata02 192.168.182.101
bigdata03 192.168.182.102
           

需提前将:IP、hostname、firewalld、JDK、SSH免密登入等基礎環境設定好。

Note: when a NameNode failover happens, the two NameNode nodes connect to each other over SSH, so passwordless SSH login between them is required.
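
A minimal sketch of setting up passwordless SSH, assuming the root user and the hostnames above (adjust to your own environment):

# on every node, generate a key pair (press Enter to accept the defaults)
ssh-keygen -t rsa
# on every node, copy the public key to all three machines, including itself
ssh-copy-id root@bigdata01
ssh-copy-id root@bigdata02
ssh-copy-id root@bigdata03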

2.1 Node Plan

Use three nodes to build a ZooKeeper cluster:

bigdata01
bigdata02
bigdata03
           

2.2 Configuring ZooKeeper

1. Unpack the installation package

[root@bigdata01 soft]# tar -zxvf apache-zookeeper-3.5.8-bin.tar.gz
           

2. Modify the configuration

[root@bigdata01 soft]# cd apache-zookeeper-3.5.8-bin/conf/
[root@bigdata01 conf]# mv zoo_sample.cfg zoo.cfg
# edit zoo.cfg and set the following:
dataDir=/data/soft/apache-zookeeper-3.5.8-bin/data
server.0=bigdata01:2888:3888
server.1=bigdata02:2888:3888
server.2=bigdata03:2888:3888
           

Create the directory that holds the myid file, and write the value into myid.

The value in myid corresponds one-to-one with the ID after server. in zoo.cfg.

編号0對應的是bigdata01這台機器,是以這裡指定0

[root@bigdata01 conf]# cd /data/soft/apache-zookeeper-3.5.8-bin
[root@bigdata01 apache-zookeeper-3.5.8-bin]# mkdir data
[root@bigdata01 apache-zookeeper-3.5.8-bin]# cd data
# write 0 into myid
[root@bigdata01 data]# echo 0 > myid
           

3. 将修改好配置的zookeeper拷貝到其它兩個節點

[root@bigdata01 soft]# scp -rq apache-zookeeper-3.5.8-bin bigdata02:/data/soft/
[root@bigdata01 soft]# scp -rq apache-zookeeper-3.5.8-bin bigdata03:/data/soft/
           

4. Modify the content of the myid file on bigdata02 and bigdata03

# bigdata02
[root@bigdata02 ~]# cd /data/soft/apache-zookeeper-3.5.8-bin/data/
[root@bigdata02 data]# echo 1 > myid

# bigdata03
[root@bigdata03 ~]# cd /data/soft/apache-zookeeper-3.5.8-bin/data/
[root@bigdata03 data]# echo 2 > myid
           

5. Start the ZooKeeper cluster

Start the ZooKeeper process on bigdata01, bigdata02, and bigdata03 respectively.

# bigdata01
[root@bigdata01 apache-zookeeper-3.5.8-bin]# bin/zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /data/soft/apache-zookeeper-3.5.8-bin/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

# bigdata02
[root@bigdata02 apache-zookeeper-3.5.8-bin]# bin/zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /data/soft/apache-zookeeper-3.5.8-bin/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

# bigdata03
[root@bigdata03 apache-zookeeper-3.5.8-bin]# bin/zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /data/soft/apache-zookeeper-3.5.8-bin/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

           

6. Verification

Run the jps command on bigdata01, bigdata02, and bigdata03 respectively and verify that the QuorumPeerMain process is present.

If it is not, check the zookeeper*-*.out log file in the logs directory on the corresponding node.

Running bin/zkServer.sh status shows that one node is in leader mode and the other nodes are followers.

[root@bigdata01 apache-zookeeper-3.5.8-bin]# bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /data/soft/apache-zookeeper-3.5.8-bin/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Mode: follower

[root@bigdata02 apache-zookeeper-3.5.8-bin]# bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /data/soft/apache-zookeeper-3.5.8-bin/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Mode: leader

[root@bigdata03 apache-zookeeper-3.5.8-bin]# bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /data/soft/apache-zookeeper-3.5.8-bin/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Mode: follower

           

7. Stopping the ZooKeeper cluster

To stop the ZooKeeper cluster, run bin/zkServer.sh stop on each node.
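
For example (run on each of the three nodes):

[root@bigdata01 apache-zookeeper-3.5.8-bin]# bin/zkServer.sh stop
[root@bigdata02 apache-zookeeper-3.5.8-bin]# bin/zkServer.sh stop
[root@bigdata03 apache-zookeeper-3.5.8-bin]# bin/zkServer.sh stop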

2.3 Configuring the Hadoop Cluster

1. Unpack the Hadoop installation package

[root@bigdata01 soft]# tar -zxvf hadoop-3.2.0.tar.gz
           

2. Modify the Hadoop configuration files

[root@bigdata01 soft]# cd hadoop-3.2.0/etc/hadoop/
[root@bigdata01 hadoop]#
           

① hadoop-env.sh

Append the environment variable settings at the end of the file:

[root@bigdata01 hadoop]# vi hadoop-env.sh
export JAVA_HOME=/data/soft/jdk1.8
export HADOOP_LOG_DIR=/data/hadoop_repo/logs/hadoop
           

② core-site.xml

[root@bigdata01 hadoop]# vi core-site.xml
<configuration>
    <!-- mycluster is the logical name of the cluster; it must match dfs.nameservices in hdfs-site.xml -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/data/hadoop_repo</value>
    </property>
    <!-- static user for the web UI; without this setting the web pages report errors -->
    <property>
        <name>hadoop.http.staticuser.user</name>
        <value>root</value>
    </property>
    <!-- ZooKeeper cluster address -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>bigdata01:2181,bigdata02:2181,bigdata03:2181</value>
    </property>
</configuration>
           

③  hdfs-site.xml

[root@bigdata01 hadoop]# vi hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <!-- the custom (logical) cluster name -->
    <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
    </property>
    <!-- list of all NameNodes: logical names, not the hostnames the NameNodes run on -->
    <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>nn1,nn2</value>
    </property>
    <!-- RPC addresses used between the NameNodes; the value is the host the NameNode runs on -->
    <!-- default port 8020; note that mycluster and nn1 must match the settings above -->
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn1</name>
        <value>bigdata01:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn2</name>
        <value>bigdata02:8020</value>
    </property>
    <!-- NameNode web UI addresses, default port 9870 -->
    <property>
        <name>dfs.namenode.http-address.mycluster.nn1</name>
        <value>bigdata01:9870</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.nn2</name>
        <value>bigdata02:9870</value>
    </property>
    <!-- JournalNode hosts, at least three, default port 8485 -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://bigdata01:8485;bigdata02:8485;bigdata03:8485/mycluster</value>
    </property>
    <!-- class that implements client-side failover -->
    <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!-- fencing method used during failover (switching active/standby), here ssh -->
    <property>
      <name>dfs.ha.fencing.methods</name>
      <value>sshfence</value>
    </property>
    <!-- change to the location of your own SSH private key -->
    <property>
      <name>dfs.ha.fencing.ssh.private-key-files</name>
      <value>/root/.ssh/id_rsa</value>
    </property>
    <!-- where the JournalNodes store the edits written by the NameNode -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/data/hadoop_repo/journalnode</value>
    </property>
    <!-- enable automatic failover -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
</configuration>
           

mapred-site.xml and yarn-site.xml can be modified as needed; they are left unchanged here because only the HDFS-related services will be started.

④  workers

[root@bigdata01 hadoop]# vi workers
bigdata02
bigdata03
           

⑤  start-dfs.sh

[root@bigdata01 hadoop]# cd /data/soft/hadoop-3.2.0/sbin
[root@bigdata01 sbin]# vi start-dfs.sh
HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_ZKFC_USER=root
HDFS_JOURNALNODE_USER=root

           

⑥ stop-dfs.sh

[root@bigdata01 sbin]# vi stop-dfs.sh
HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_ZKFC_USER=root
HDFS_JOURNALNODE_USER=root
           

start-yarn.sh and stop-yarn.sh can be configured as needed; the YARN processes are not started here.

3. 将修改好配置的安裝包拷貝到其他節點

[root@bigdata01 sbin]# cd /data/soft/
[root@bigdata01 soft]# scp -rq hadoop-3.2.0 bigdata02:/data/soft/
[root@bigdata01 soft]# scp -rq hadoop-3.2.0 bigdata03:/data/soft/
           

4. Format HDFS

This step only needs to be performed once, the first time HA is configured.

Note: all JournalNodes must be started before formatting HDFS.

[root@bigdata01 hadoop-3.2.0]# bin/hdfs --daemon start journalnode
[root@bigdata02 hadoop-3.2.0]# bin/hdfs --daemon start journalnode
[root@bigdata03 hadoop-3.2.0]# bin/hdfs --daemon start journalnode
           

Pick either one of the NameNode nodes and run the format:

[root@bigdata01 hadoop-3.2.0]# bin/hdfs namenode -format
....
....
2026-02-07 00:35:06,212 INFO common.Storage: Storage directory /data/hadoop_repo/dfs/name has been successfully formatted.
2026-02-07 00:35:06,311 INFO namenode.FSImageFormatProtobuf: Saving image file /data/hadoop_repo/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
2026-02-07 00:35:06,399 INFO namenode.FSImageFormatProtobuf: Image file /data/hadoop_repo/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 399 bytes saved in 0 seconds .
2026-02-07 00:35:06,405 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
2026-02-07 00:35:06,432 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at bigdata01/192.168.182.100
************************************************************/
           

Seeing "has been successfully formatted" means HDFS was formatted successfully.

5. Start the NameNode

[root@bigdata01 hadoop-3.2.0]# bin/hdfs --daemon start namenode
           

Then synchronize the metadata on the other NameNode node (bigdata02); output like the following means the synchronization succeeded.

[root@bigdata02 hadoop-3.2.0]# bin/hdfs namenode -bootstrapStandby
....
....
=====================================================
About to bootstrap Standby ID nn2 from:
           Nameservice ID: mycluster
        Other Namenode ID: nn1
  Other NN's HTTP address: http://bigdata01:9870
  Other NN's IPC  address: bigdata01/192.168.182.100:8020
             Namespace ID: 1820763709
            Block pool ID: BP-1332041116-192.168.182.100-1770395706205
               Cluster ID: CID-c12130ca-3a7d-4722-93b0-a79b0df3ed84
           Layout version: -65
       isUpgradeFinalized: true
=====================================================
2026-02-07 00:39:38,594 INFO common.Storage: Storage directory /data/hadoop_repo/dfs/name has been successfully formatted.
2026-02-07 00:39:38,654 INFO namenode.FSEditLog: Edit logging is async:true
2026-02-07 00:39:38,767 INFO namenode.TransferFsImage: Opening connection to http://bigdata01:9870/imagetransfer?getimage=1&txid=0&storageInfo=-65:1820763709:1770395706205:CID-c12130ca-3a7d-4722-93b0-a79b0df3ed84&bootstrapstandby=true
2026-02-07 00:39:38,854 INFO common.Util: Combined time for file download and fsync to all disks took 0.00s. The file download took 0.00s at 0.00 KB/s. Synchronous (fsync) write to disk of /data/hadoop_repo/dfs/name/current/fsimage.ckpt_0000000000000000000 took 0.00s.
2026-02-07 00:39:38,855 INFO namenode.TransferFsImage: Downloaded file fsimage.ckpt_0000000000000000000 size 399 bytes.
2026-02-07 00:39:38,894 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at bigdata02/192.168.182.101
************************************************************/
           

6. Format the HA state in ZooKeeper

This step only needs to be performed once.

It can be run on any one of the nodes.

[root@bigdata01 hadoop-3.2.0]# bin/hdfs zkfc -formatZK
....
....
2026-02-07 00:42:17,212 INFO zookeeper.ClientCnxn: Socket connection established to bigdata02/192.168.182.101:2181, initiating session
2026-02-07 00:42:17,220 INFO zookeeper.ClientCnxn: Session establishment complete on server bigdata02/192.168.182.101:2181, sessionid = 0x100001104b00098, negotiated timeout = 10000
2026-02-07 00:42:17,244 INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/mycluster in ZK.
2026-02-07 00:42:17,249 INFO zookeeper.ZooKeeper: Session: 0x100001104b00098 closed
2026-02-07 00:42:17,251 WARN ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x100001104b00098
2026-02-07 00:42:17,251 INFO zookeeper.ClientCnxn: EventThread shut down for session: 0x100001104b00098
2026-02-07 00:42:17,254 INFO tools.DFSZKFailoverController: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down DFSZKFailoverController at bigdata01/192.168.182.100
************************************************************/
           

Seeing "Successfully created /hadoop-ha/mycluster in ZK." means the command succeeded.
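
As an optional check, the znode can also be inspected from the ZooKeeper CLI (assuming the ZooKeeper installation path used above); it should list the mycluster entry:

[root@bigdata01 apache-zookeeper-3.5.8-bin]# bin/zkCli.sh -server bigdata01:2181
[zk: bigdata01:2181(CONNECTED) 0] ls /hadoop-ha
[mycluster]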

7. Start the HDFS HA cluster

[root@bigdata01 hadoop-3.2.0]# sbin/start-dfs.sh
Starting namenodes on [bigdata01 bigdata02]
Last login: Sat Feb  7 00:02:27 CST 2026 on pts/0
bigdata01: namenode is running as process 6424.  Stop it first.
Starting datanodes
Last login: Sat Feb  7 00:47:13 CST 2026 on pts/0
Starting journal nodes [bigdata01 bigdata03 bigdata02]
Last login: Sat Feb  7 00:47:13 CST 2026 on pts/0
bigdata02: journalnode is running as process 4864.  Stop it first.
bigdata01: journalnode is running as process 6276.  Stop it first.
bigdata03: journalnode is running as process 2479.  Stop it first.
Starting ZK Failover Controllers on NN hosts [bigdata01 bigdata02]
Last login: Sat Feb  7 00:47:18 CST 2026 on pts/0
           

From now on, to start the HA cluster just run sbin/start-dfs.sh; there is no need to repeat the operations in steps 5 and 6 (the formatting).

8. Verify the HA cluster

Now visit port 9870 on the two NameNode nodes; one shows active and the other shows standby.

http://bigdata01:9870/dfshealth.html

(Screenshot: NameNode web UI on bigdata01)

http://bigdata02:9870/dfshealth.html

(Screenshot: NameNode web UI on bigdata02)
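
The NameNode states can also be checked from the command line with hdfs haadmin; nn1 and nn2 are the logical NameNode IDs configured in hdfs-site.xml, and which one is active may differ in your environment:

[root@bigdata01 hadoop-3.2.0]# bin/hdfs haadmin -getServiceState nn1
active
[root@bigdata01 hadoop-3.2.0]# bin/hdfs haadmin -getServiceState nn2
standby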

9. Simulate a failover

Manually kill the NameNode that is in the active state and verify that the standby NameNode automatically switches to active.

[root@bigdata01 hadoop-3.2.0]# jps
8758 DFSZKFailoverController
8267 NameNode
1581 QuorumPeerMain
8541 JournalNode
8814 Jps
[root@bigdata01 hadoop-3.2.0]# kill 8267
[root@bigdata01 hadoop-3.2.0]# jps
8758 DFSZKFailoverController
1581 QuorumPeerMain
8541 JournalNode
8845 Jps
           

Now check bigdata02 again: its state has changed to active.

Then start the NameNode on bigdata01 again; its state becomes standby.

[root@bigdata01 hadoop-3.2.0]# bin/hdfs --daemon start namenode
[root@bigdata01 hadoop-3.2.0]# jps
8898 NameNode
8758 DFSZKFailoverController
8967 Jps
1581 QuorumPeerMain
8541 JournalNode
           

This verifies that HDFS high availability has been achieved.

From now on, HDFS should be accessed like this:

The mycluster here is the value of the dfs.nameservices property configured in hdfs-site.xml.

[root@bigdata01 hadoop-3.2.0]# bin/hdfs dfs -ls hdfs://mycluster/
[root@bigdata01 hadoop-3.2.0]# bin/hdfs dfs -put README.txt hdfs://mycluster/
[root@bigdata01 hadoop-3.2.0]# bin/hdfs dfs -ls hdfs://mycluster/
Found 1 items
-rw-r--r--   2 root supergroup       1361 2026-02-07 00:58 hdfs://mycluster/README.txt
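
Since fs.defaultFS in core-site.xml already points to hdfs://mycluster, the prefix can normally be omitted when running commands on these nodes, for example:

[root@bigdata01 hadoop-3.2.0]# bin/hdfs dfs -ls /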
           

10. Stop the HDFS cluster

[root@bigdata01 hadoop-3.2.0]# sbin/stop-dfs.sh
Stopping namenodes on [bigdata01 bigdata02]
Last login: Sat Feb  7 00:52:01 CST 2026 on pts/0
Stopping datanodes
Last login: Sat Feb  7 01:03:23 CST 2026 on pts/0
Stopping journal nodes [bigdata01 bigdata03 bigdata02]
Last login: Sat Feb  7 01:03:25 CST 2026 on pts/0
Stopping ZK Failover Controllers on NN hosts [bigdata01 bigdata02]
Last login: Sat Feb  7 01:03:29 CST 2026 on pts/0
           

3. HDFS High Scalability (Federation)

HDFS Federation solves the problems of a single namespace by using multiple NameNodes, each responsible for one namespace.

This design provides the following features:

1: HDFS cluster scalability. Multiple NameNodes each manage part of the directory tree, so a cluster can scale to more nodes, and the amount of file metadata is no longer constrained by the memory of a single NameNode.

2: Higher performance. Multiple NameNodes manage different data and serve clients at the same time, giving users higher read and write throughput.

3: Good isolation. Data for different businesses can be assigned to different NameNodes as needed, so different workloads have very little impact on each other.

Federation is generally used together with HA:

(Figure: HDFS Federation combined with HA)

This layout uses 4 NameNodes and 6 DataNodes:

NN-1、NN-2、NN-3、NN-4

DN-1、DN-2、DN-3、DN-4、DN-5、DN-6

NN-1 and NN-3 are configured as an HA pair and provide one namespace, /share

NN-2 and NN-4 are configured as an HA pair and provide another namespace, /user

Later, when storing data, the business type of the data determines whether it is stored under the share directory or the user directory, and the total storage capacity of HDFS is then the sum of the /share and /user namespaces.
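
On the client side this kind of layout is typically exposed through a ViewFs mount table that maps paths such as /share and /user to the corresponding nameservices. A minimal core-site.xml sketch, assuming the two nameservices are called ns1 and ns2 (these names and the mount-table name ClusterX are only illustrative, not part of the setup above):

<property>
    <name>fs.defaultFS</name>
    <value>viewfs://ClusterX</value>
</property>
<!-- map /share to the namespace served by NN-1/NN-3 -->
<property>
    <name>fs.viewfs.mounttable.ClusterX.link./share</name>
    <value>hdfs://ns1/share</value>
</property>
<!-- map /user to the namespace served by NN-2/NN-4 -->
<property>
    <name>fs.viewfs.mounttable.ClusterX.link./user</name>
    <value>hdfs://ns2/user</value>
</property>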
