
Compiling the Hadoop 2.2.0 Source Code

I. Environment

Virtualization software: VMware Workstation 10

Virtual machine configuration:

RHEL Server release 6.5 (Santiago), kernel 2.6.32-431.el6.x86_64

CPU: 4 cores; memory: 4 GB; disk: 50 GB

II. Prerequisites

1: Use the RHEL 6.5 ISO as a local yum repository.

2: Download hadoop-2.2.0-src.tar.gz.

3: Install JDK 1.6.0_43.

4: Install and configure Apache Maven 3.0.5 (apache-maven-3.0.5-bin.tar.gz).

(BUILDING.txt in the source tree calls for Maven 3.0. Hadoop 2.0 and later are built with Maven; earlier releases used Ant.)

Extract it and configure the environment variables (a sketch of this step follows the version check), then verify:

mvn -version
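
A minimal sketch of the extract-and-configure step, assuming the tarball is unpacked under /home/soft and symlinked to /home/soft/maven to match M2_HOME in Appendix 1:

  # unpack the binary distribution and point M2_HOME at it
  tar -zxvf apache-maven-3.0.5-bin.tar.gz -C /home/soft
  ln -s /home/soft/apache-maven-3.0.5 /home/soft/maven
  export M2_HOME=/home/soft/maven
  export PATH=$PATH:$M2_HOME/bin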

5: Install and configure apache-ant-1.9.3-bin.zip (download the binary distribution; Ant is needed to build and install FindBugs). A setup sketch follows the version check:

ant -version
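
A similar sketch for Ant, assuming the zip is extracted under /home/soft to match ANT_HOME in Appendix 1:

  # unpack the binary distribution and put ant on the PATH
  unzip apache-ant-1.9.3-bin.zip -d /home/soft
  export ANT_HOME=/home/soft/apache-ant-1.9.3
  export PATH=$PATH:$ANT_HOME/bin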

6: Download and install CMake 2.8.12.1 with the following commands:

  tar -zxvf cmake-2.8.12.1.tar.gz

  cd cmake-2.8.12.1

  ./bootstrap

  make

  make install

Verify the installation:

cmake --version (if the version number prints correctly, the installation succeeded)

7: Download, build, and configure findbugs-2.0.2-source.zip.

http://sourceforge.jp/projects/sfnet_findbugs/releases/

Build and install it with Ant. If you skip this step, the Hadoop build fails with:

hadoop-common-project/hadoop-common/${env.FINDBUGS_HOME}/src/xsl/default.xsl doesn't exist. -> [Help 1]

Enter the extracted directory and simply run the ant command (see the sketch below).

If FindBugs is not installed at all, the build fails with the following error:

Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.6:run (site) on project hadoop-common: An Ant BuildException has occured
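
A minimal sketch of the FindBugs step, assuming the source zip is extracted under /home/soft to match FINDBUGS_HOME in Appendix 1:

  # extract the source and build it in place with ant
  unzip findbugs-2.0.2-source.zip -d /home/soft
  cd /home/soft/findbugs-2.0.2
  ant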

8: Install zlib-devel.

zlib-devel is not installed by default. Without it, the build fails with:

Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.6:run (make) on project hadoop-common

Install it with:

yum install zlib-devel

9: Build and install protobuf-2.5.0.

yum install gcc-c++ (without gcc-c++, the configure step fails)

tar -zxvf protobuf-2.5.0.tar.gz

cd protobuf-2.5.0

./configure

make

make check

make install

Verify the installation:

protoc --version (if the version number prints correctly, the installation succeeded)

III. Compiling the Hadoop 2.2 Source

1: Enter the extracted hadoop-2.2.0 source directory.

2: Run the mvn command to compile. This step requires network access, and the build speed depends on your connection:

mvn clean package -Pdist,native -DskipTests -Dtar

2.1. Create a binary distribution without native code and without documentation:

$ mvn package -Pdist -DskipTests -Dtar

2.2. Create a binary distribution with native code and with documentation:

$ mvn package -Pdist,native,docs -DskipTests -Dtar

2.3. Create a source distribution:

$ mvn package -Psrc -DskipTests

2.4. Create source and binary distributions with native code and documentation:

$ mvn package -Pdist,native,docs,src -DskipTests -Dtar

2.5. Create a local staging version of the website (in /tmp/hadoop-site):

$ mvn clean site; mvn site:stage -DstagingDirectory=/tmp/hadoop-site

3: The compiled distribution is placed under hadoop-2.2.0-src/hadoop-dist/target/:

hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/

IV. Installing Hadoop in Single-Node Pseudo-Distributed Mode

1: Set up passwordless SSH.

ssh-keygen -t dsa (or run ssh-keygen -t rsa -P ""; with the -P "" option a single Enter completes the command, without it you must press Enter three times)

cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys 

If you are still prompted for a password, run:

chmod 600 ~/.ssh/authorized_keys
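
A quick check that the trust relationship works (this should print the date without prompting for a password):

  ssh localhost date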

2: Copy the compiled hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0 to /data/hadoop/.

3: Create a symlink: ln -s hadoop-2.2.0 hadoop2

4: Add the following variables to the user's .bash_profile:

export HADOOP_HOME=/data/hadoop/hadoop2

export PATH=$PATH:$HADOOP_HOME/bin

export PATH=$PATH:$HADOOP_HOME/sbin

export HADOOP_MAPRED_HOME=${HADOOP_HOME}

export HADOOP_COMMON_HOME=${HADOOP_HOME}

export HADOOP_HDFS_HOME=${HADOOP_HOME}

export YARN_HOME=${HADOOP_HOME}
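
Then reload the profile so the variables take effect in the current shell:

  source ~/.bash_profile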

5: In /data/hadoop, create the data.dir and namenode.dir directories (matching the paths used in hdfs-site.xml, Appendix 2):

mkdir hdfs

mkdir namenode

chmod -R 755 hdfs

6: Edit hadoop-env.sh:

export JAVA_HOME=/usr/java/jdk1.6.0_43

7: Edit core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml; their contents are given in Appendix 2.

8: Format the HDFS filesystem:

hadoop namenode -format

9: Start HDFS and YARN:

start-dfs.sh

start-yarn.sh

10: Verify that the daemons started:

jps

If the following six processes appear, startup succeeded:

53244 ResourceManager

53083 SecondaryNameNode

52928 DataNode

53640 Jps

52810 NameNode

53348 NodeManager

V. Running the Bundled WordCount Example

hadoop fs -mkdir /tmp

hadoop fs -mkdir /tmp/input

hadoop fs -put /usr/hadoop/test.txt /tmp/input

cd /data/hadoop/hadoop2/share/hadoop/mapreduce

hadoop jar hadoop-mapreduce-examples-2.2.0.jar wordcount /tmp/input /tmp/output

If the job runs successfully, Hadoop is installed and configured correctly.
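
To inspect the word counts, you can print the job output; part-r-00000 is the standard name of the first reducer's output file:

  hadoop fs -cat /tmp/output/part-r-00000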

VI. Appendix 1

設定的環境變量(/etc/profile,編輯後運作source /etc/profile/ 使配置生效)

#java set

export JAVA_HOME=/usr/java/jdk1.6.0_43

export JRE_HOME=/usr/java/jdk1.6.0_43/jre

export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

export PATH=$PATH:$JAVA_HOME/bin

#maven set

export M2_HOME=/home/soft/maven

export PATH=$PATH:$M2_HOME/bin

#ant

export ANT_HOME=/home/soft/apache-ant-1.9.3

export PATH=$PATH:$ANT_HOME/bin

#findbugs

export FINDBUGS_HOME=/home/soft/findbugs-2.0.2

export PATH=$PATH:$FINDBUGS_HOME/bin

VII. Appendix 2: contents of core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml

core-site.xml:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://vdata.kt:8020</value>
  </property>
</configuration>
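
The hostname in fs.defaultFS must resolve to this machine. A hypothetical /etc/hosts entry, assuming vdata.kt is this VM (substitute the VM's real IP address):

  192.168.1.100 vdata.kt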

hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/data/hadoop/namenode</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/data/hadoop/hdfs</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>

mapred-site.xml:

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

yarn-site.xml:

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>