Install JDK-7u79-64.rpm
Environment: CentOS-6.3, Hadoop-2.5.1, zookeeper-3.4.6, jdk-1.7
Preparation: disable the firewall and synchronize the clocks on every node (mandatory):
Check the HBase / Hadoop / JDK version compatibility table before picking versions.

service iptables stop
ntpdate 0.asia.pool.ntp.org
1. Host mapping on hadoop1, hadoop2, hadoop3, hadoop4
vi /etc/hosts
192.168.25.151 hadoop1
192.168.25.152 hadoop2
192.168.25.153 hadoop3
192.168.25.154 hadoop4
2. Passwordless SSH login
On hadoop1, hadoop2, hadoop3, hadoop4:
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
Verify each host can SSH to itself, e.g.:
ssh hadoop1
On hadoop1:
cd ~/.ssh
Copy hadoop1's public key to hadoop2, hadoop3, hadoop4:
scp ./id_dsa.pub root@hadoop2:/opt/
scp ./id_dsa.pub root@hadoop3:/opt/
scp ./id_dsa.pub root@hadoop4:/opt/
On hadoop2:
cat /opt/id_dsa.pub >> ~/.ssh/authorized_keys
On hadoop3:
cat /opt/id_dsa.pub >> ~/.ssh/authorized_keys
On hadoop4:
cat /opt/id_dsa.pub >> ~/.ssh/authorized_keys
Verify that hadoop1 can log in to hadoop2, hadoop3, hadoop4:
ssh hadoop2 exit
ssh hadoop3 exit
ssh hadoop4 exit
Copy hadoop2's public key to hadoop1 as well (needed so the standby NameNode can fence the active one over SSH):
scp ./id_dsa.pub root@hadoop1:/opt/
On hadoop1:
cat /opt/id_dsa.pub >> ~/.ssh/authorized_keys
ssh hadoop1
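Where the openssh-clients package provides it, ssh-copy-id can replace the manual scp-and-append steps above. A sketch that only prints one command per target node (hostnames assumed from the /etc/hosts mapping in step 1):

```shell
# Print one ssh-copy-id command per node; run the printed lines from hadoop1.
# ssh-copy-id appends the public key to the remote authorized_keys,
# which is exactly what the scp + cat steps do by hand.
for h in hadoop2 hadoop3 hadoop4; do
  echo "ssh-copy-id -i ~/.ssh/id_dsa.pub root@$h"
done
```

After running the printed commands, `ssh hadoop2 hostname` should succeed without a password prompt.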
3. Environment variables on hadoop1, hadoop2, hadoop3, hadoop4
vi ~/.bash_profile
export HADOOP_HOME=/home/hadoop-2.5.1
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export ZOOKEEPER_HOME=/home/zookeeper-3.4.6
export PATH=$PATH:$ZOOKEEPER_HOME/bin
source ~/.bash_profile
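A quick sanity check that the exports resolved as intended, re-declaring them in the current shell (install paths assumed from the setup above):

```shell
# Re-declare the exports from ~/.bash_profile and confirm that both
# the bin and sbin directories of hadoop landed on PATH.
export HADOOP_HOME=/home/hadoop-2.5.1
export ZOOKEEPER_HOME=/home/zookeeper-3.4.6
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$ZOOKEEPER_HOME/bin
echo "$PATH" | tr ':' '\n' | grep -c "$HADOOP_HOME"   # 2 in a fresh shell (bin and sbin)
```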
Roles across the four machines hadoop1, hadoop2, hadoop3, hadoop4 (placement matches the configs below):

|         | NN | DN | ZK | ZKFC | JN | RM | NM (task mgmt) |
|---------|----|----|----|------|----|----|----------------|
| hadoop1 | Y  |    | Y  | Y    |    | Y  |                |
| hadoop2 | Y  | Y  | Y  | Y    | Y  | Y  | Y              |
| hadoop3 |    | Y  | Y  |      | Y  |    | Y              |
| hadoop4 |    | Y  |    |      | Y  |    | Y              |
Extract hadoop-2.5.1.tar.gz on hadoop1, then edit the configuration files under etc/hadoop:
1. vim core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://bjsxt</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop</value>
</property>
</configuration>
2. vim hdfs-site.xml
<configuration>
<property>
<name>dfs.nameservices</name>
<value>bjsxt</value>
</property>
<property>
<name>dfs.ha.namenodes.bjsxt</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.bjsxt.nn1</name>
<value>hadoop1:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.bjsxt.nn2</name>
<value>hadoop2:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.bjsxt.nn1</name>
<value>hadoop1:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.bjsxt.nn2</name>
<value>hadoop2:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop2:8485;hadoop3:8485;hadoop4:8485/bjsxt</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.bjsxt</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_dsa</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/opt/hadoop/data</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
</configuration>
3. vim hadoop-env.sh
Change the JAVA_HOME line in it to:
export JAVA_HOME=/usr/java/jdk1.7.0_79
Copy the configured hadoop1 installation to hadoop2, hadoop3, hadoop4:
[root@hadoop1 home]# scp -r hadoop-2.5.1/ root@hadoop2:/home/
[root@hadoop1 home]# scp -r hadoop-2.5.1/ root@hadoop3:/home/
[root@hadoop1 home]# scp -r hadoop-2.5.1/ root@hadoop4:/home/
4. Prepare ZooKeeper
a) Three ZooKeeper nodes: hadoop1, hadoop2, hadoop3
b) Create or edit the zoo.cfg configuration file:
vi zoo.cfg
dataDir=/opt/zookeeper
clientPort=2181
tickTime=2000
initLimit=5
syncLimit=2
server.1=hadoop1:2888:3888
server.2=hadoop2:2888:3888
server.3=hadoop3:2888:3888
c) Create the dataDir directory on all nodes:
mkdir /opt/zookeeper
In /opt/zookeeper, create a file named myid whose content is 1, 2, or 3 on the corresponding node.
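The three myid files can be created in one pass from hadoop1. This sketch only prints the per-node commands (root SSH access from the earlier step is assumed); each id must match the server.N number for that host in zoo.cfg:

```shell
# Print one command per ZooKeeper node; the number after "server." in
# zoo.cfg must equal the myid on that host (1=hadoop1, 2=hadoop2, 3=hadoop3).
i=1
for h in hadoop1 hadoop2 hadoop3; do
  echo "ssh root@$h 'mkdir -p /opt/zookeeper && echo $i > /opt/zookeeper/myid'"
  i=$((i+1))
done
```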
5. Configure the slaves file in hadoop with:
hadoop2
hadoop3
hadoop4
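A sketch that generates the slaves file locally before copying it into place; the target path assumes the tarball layout used above:

```shell
# Generate the slaves file (the DataNode / NodeManager hosts), then
# copy it to the assumed config directory on hadoop1.
printf '%s\n' hadoop2 hadoop3 hadoop4 > slaves
cat slaves
# cp slaves /home/hadoop-2.5.1/etc/hadoop/slaves
```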
6. Start the three ZooKeepers: zkServer.sh start
7. Start the three JournalNodes: hadoop-daemon.sh start journalnode
8. Format HDFS on one of the NameNodes (hadoop1): hdfs namenode -format
9. Copy the freshly formatted metadata to the other NameNode (hadoop2): scp -r /opt/hadoop/ root@hadoop2:/opt/
a) Start the NameNode that was just formatted (hadoop1): hadoop-daemon.sh start namenode
b) On the NameNode that was not formatted (hadoop2), run: hdfs namenode -bootstrapStandby
c) Start the second NameNode (hadoop2): hadoop-daemon.sh start namenode
10. Initialize ZKFC on one of the NameNodes (hadoop1 here): hdfs zkfc -formatZK
11. Stop the daemons started above: stop-dfs.sh
12. Start everything: start-dfs.sh
Only after completing all the steps above will the two NameNodes show the active and standby states.
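Besides the web UI, the NameNode states can be queried from the command line once start-dfs.sh finishes; nn1/nn2 are the IDs declared in hdfs-site.xml. This sketch only prints the check commands:

```shell
# Print the haadmin queries; run them on a cluster node.
# One NameNode should report "active", the other "standby".
for nn in nn1 nn2; do
  echo "hdfs haadmin -getServiceState $nn"
done
```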
12.1 Cluster YARN configuration: set up distributed computation on all nodes.
vim mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
vim yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>coolfxl</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>hadoop1</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>hadoop2</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
</property>
</configuration>
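As with the NameNodes, the two ResourceManager states can be checked after startup; rm1/rm2 are the IDs from yarn-site.xml above. A sketch that only prints the check commands:

```shell
# Print the rmadmin queries; run them on a cluster node after YARN starts.
# One ResourceManager should report "active", the other "standby".
for rm in rm1 rm2; do
  echo "yarn rmadmin -getServiceState $rm"
done
```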
13. After a server reboot, run:
service iptables stop
ntpdate 0.asia.pool.ntp.org
zkServer.sh start
【
on hadoop1: start-dfs.sh
on hadoop1: start-yarn.sh
on hadoop2: yarn-daemon.sh start resourcemanager
】
Or simply:
【on hadoop1: start-all.sh】
After a successful start, open: http://192.168.25.151:50070
Directories can be created from any node that has hadoop installed.
Command: hdfs dfs -mkdir /test (directory name)
Read a file's contents:
hdfs dfs -text /usr/nginx/html/index.html (file name)
Upload a file:
hadoop fs -put srcFile destFile
hadoop fs -put spark.txt /
Run a MapReduce job:
hadoop jar /root/wc.jar com.dkjhl.mr.wc.RunJob
Open http://192.168.25.151:8088/cluster in a browser to see the YARN cluster UI.
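If the custom wc.jar is not at hand, the examples jar bundled with the distribution runs an equivalent word count. The jar path below assumes the 2.5.1 tarball layout, and the input/output paths are illustrative; the sketch only prints the commands:

```shell
# Print a word-count invocation using the bundled examples jar,
# plus the command that reads the result from HDFS.
JAR=/home/hadoop-2.5.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar
echo "hadoop jar $JAR wordcount /spark.txt /wc-out"
echo "hdfs dfs -cat /wc-out/part-r-00000"
```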