
Building the Hadoop Native Library from Source

When starting Hadoop, I saw the following warning:

[[email protected] ~]$ hadoop/2.6.1/sbin/start-dfs.sh 
15/12/14 10:19:21 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [master]
master: starting namenode, logging to /home/hadoop/hadoop/2.6.1/logs/hadoop-hadoop-namenode-Yarn-Master.out
master: starting datanode, logging to /home/hadoop/hadoop/2.6.1/logs/hadoop-hadoop-datanode-Yarn-Master.out
slave_2: ssh: connect to host slave_2 port 22: No route to host
slave_1: ssh: connect to host slave_1 port 22: No route to host
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/hadoop/hadoop/2.6.1/logs/hadoop-hadoop-secondarynamenode-Yarn-Master.out
15/12/14 10:19:44 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[[email protected] ~]$ 
           

A Google search turned up the official documentation at http://hadoop.apache.org/docs/r2.6.1/hadoop-project-dist/hadoop-common/NativeLibraries.html, which explains that you need to build the native Hadoop library yourself. I checked my local library and, sure enough, it was 64-bit, while my machine is 32-bit:

[[email protected] ~]$ file hadoop/2.6.1/lib/native/libhadoop.so.1.0.0 
hadoop/2.6.1/lib/native/libhadoop.so.1.0.0: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped
[[email protected] ~]$ 
           

I downloaded the 2.6.1 source package and tried building it directly, which failed with all sorts of errors. After much Googling I found two good blog posts: one for CentOS (http://blog.pengduncun.com/?p=1208) and one for Ubuntu (http://my.oschina.net/laigous/blog/356552). Although the two distributions use different package releases, the required software is the same. The examples below all use CentOS.

Software required for the build

First, the prerequisites spelled out in the official documentation:

The packages you need to install on the target platform are:

  • C compiler (e.g. GNU C Compiler)
  • GNU Autools Chain: autoconf, automake, libtool
  • zlib-development package (stable version >= 1.2.0)
  • openssl-development package(e.g. libssl-dev)

Once you installed the prerequisite packages use the standard hadoop pom.xml file and pass along the native flag to build the native hadoop library:

1. C compiler and related tools: these can be installed directly with yum groupinstall "Development Tools". Some people say gcc 4.4 is required; my default gcc was already 4.4, so I could not verify that claim.
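The install step above, as a sketch for CentOS 6 (run as root; the explicit Autotools install is there in case the package group does not pull those in):

```shell
# Install the base C toolchain (gcc, make, etc.)
yum groupinstall -y "Development Tools"
# Make sure the Autotools chain from the official prerequisite list is present
yum install -y autoconf automake libtool
```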

[[email protected] ~]$ gcc --version
gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16)
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

[[email protected] ~]$ 
           
[[email protected] ~]$ libtool --version
ltmain.sh (GNU libtool) 2.2.6b
Written by Gordon Matzigkeit <[email protected]>, 1996

Copyright (C) 2008 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
[[email protected] ~]$ automake --version
automake (GNU automake) 1.11.1
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl-2.0.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Tom Tromey <[email protected]>
       and Alexandre Duret-Lutz <[email protected]>.
[[email protected] ~]$ autoconf --version
autoconf (GNU Autoconf) 2.63
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv2+: GNU GPL version 2 or later
<http://gnu.org/licenses/old-licenses/gpl-2.0.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by David J. MacKenzie and Akim Demaille.
[[email protected] ~]$ 
           

2. zlib, zlib-devel, openssl, and openssl-devel: all four can be installed directly with yum:

[[email protected] hadoop]# yum install zlib zlib-devel openssl openssl-devel
Loaded plugins: fastestmirror
Setting up Install Process
Loading mirror speeds from cached hostfile
 * base: mirrors.zju.edu.cn
 * extras: mirrors.zju.edu.cn
 * updates: mirrors.zju.edu.cn
Package zlib-1.2.3-29.el6.i686 already installed and latest version
Package zlib-devel-1.2.3-29.el6.i686 already installed and latest version
Package openssl-1.0.1e-42.el6.i686 already installed and latest version
Package openssl-devel-1.0.1e-42.el6.i686 already installed and latest version
Nothing to do
[[email protected] hadoop]# 
           

Software needed for a regular Java build

Maven: http://maven.apache.org/download.cgi; Ant: http://ant.apache.org/bindownload.cgi. After downloading the packages, unpack them into /usr/local:

[[email protected] hadoop]# ls -al /usr/local/
total 64
drwxr-xr-x. 16 root root 4096 Dec 13 16:28 .
drwxr-xr-x. 13 root root 4096 Oct 28 09:55 ..
drwxr-xr-x.  6 root root 4096 Dec 13 16:15 apache-ant-1.9.6
drwxr-xr-x.  6 root root 4096 Dec 11 17:49 apache-maven-3.3.9
drwxr-xr-x.  2 root root 4096 Sep 23  2011 bin
drwxr-xr-x.  2 root root 4096 Sep 23  2011 etc
drwxr-xr-x.  8 root root 4096 Dec 13 16:28 findbugs-3.0.1
drwxr-xr-x.  2 root root 4096 Sep 23  2011 games
drwxr-xr-x.  2 root root 4096 Sep 23  2011 include
drwxr-xr-x.  2 root root 4096 Sep 23  2011 lib
drwxr-xr-x.  2 root root 4096 Sep 23  2011 libexec
drwxr-xr-x.  5 root root 4096 Dec 11 19:56 protobuf-2.5.0
drwxr-xr-x.  5 root root 4096 Dec 11 19:06 protobuf-2.6.1
drwxr-xr-x.  2 root root 4096 Sep 23  2011 sbin
drwxr-xr-x.  5 root root 4096 Oct 28 09:21 share
drwxr-xr-x.  2 root root 4096 Sep 23  2011 src
[[email protected] hadoop]# 
           

Then create two shell scripts under /etc/profile.d/: ant.sh and maven.sh. The Maven one looks like this:

[[email protected] hadoop]# cat /etc/profile.d/maven.sh 
#
# set the maven environment
#
export MAVEN_HOME=/usr/local/apache-maven-3.3.9
export PATH=$MAVEN_HOME/bin:$PATH
[[email protected] hadoop]# 
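
ant.sh is analogous; a sketch, assuming Ant was unpacked to /usr/local/apache-ant-1.9.6 as shown above:

```shell
#
# set the ant environment
#
export ANT_HOME=/usr/local/apache-ant-1.9.6
export PATH=$ANT_HOME/bin:$PATH
```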
           

Then run the following (it only takes effect for the current shell; if you switch users you need to run it again, but scripts in /etc/profile.d/ are sourced automatically at the next login):

[[email protected] hadoop]# source /etc/profile
           

Verify the installation:

[[email protected] hadoop]# which ant
/usr/local/apache-ant-1.9.6/bin/ant
[[email protected] hadoop]# which mvn
/usr/local/apache-maven-3.3.9/bin/mvn
[[email protected] hadoop]# 
           

Other Hadoop build dependencies

1. cmake: can be installed directly with yum:

[[email protected] hadoop]# yum install cmake
Loaded plugins: fastestmirror
Setting up Install Process
Loading mirror speeds from cached hostfile
 * base: mirrors.zju.edu.cn
 * extras: mirrors.zju.edu.cn
 * updates: mirrors.zju.edu.cn
Package cmake-2.8.12.2-4.el6.i686 already installed and latest version
Nothing to do
[[email protected] hadoop]# 
           

2. protobuf: first check which protoc version your Hadoop release was compiled with:

[[email protected] ~]$ hadoop/2.6.1/bin/hadoop version
Hadoop 2.6.1
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r b4d876d837b830405ccdb6af94742f99d49f9c04
Compiled by jenkins on 2015-09-16T21:07Z
Compiled with protoc 2.5.0
From source with checksum ba9a9397365e3ec2f1b3691b52627f
This command was run using /home/hadoop/hadoop/2.6.1/share/hadoop/common/hadoop-common-2.6.1.jar
[[email protected] ~]$ 
           

Then find the matching version at https://github.com/google/protobuf/tags, download it, and build and install it.
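
The protobuf build follows the standard Autotools flow; a sketch, assuming the 2.5.0 release tarball and the /usr/local/protobuf-2.5.0 install prefix used elsewhere in this post:

```shell
tar -xzf protobuf-2.5.0.tar.gz
cd protobuf-2.5.0
./configure --prefix=/usr/local/protobuf-2.5.0
make
make install    # as root

# protoc will only be on PATH once its bin directory is added,
# e.g. via a script under /etc/profile.d/
/usr/local/protobuf-2.5.0/bin/protoc --version
```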

3. findbugs: http://findbugs.sourceforge.net/downloads.html — same as above (download and unpack into /usr/local):

[[email protected] ~]$ which protoc
/usr/local/protobuf-2.5.0/bin/protoc
[[email protected] ~]$ which fb
/usr/local/findbugs-3.0.1/bin/fb
[[email protected] ~]$ 
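
For `which` to resolve protoc and fb as shown above, their bin directories must be on PATH. A sketch of a /etc/profile.d/ script for that (the file name is hypothetical; any name ending in .sh works):

```shell
# /etc/profile.d/hadoop-build-tools.sh (hypothetical name)
# Put the protobuf and findbugs tools on PATH for all login shells
export PATH=/usr/local/protobuf-2.5.0/bin:/usr/local/findbugs-3.0.1/bin:$PATH
```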
           

Build

With all the prerequisites in place, the build can be run with the command from the official documentation:

Once you installed the prerequisite packages use the standard hadoop pom.xml file and pass along the native flag to build the native hadoop library:

$ mvn package -Pdist,native -DskipTests -Dtar      

You should see the newly-built library in:

$ hadoop-dist/target/hadoop-2.6.1/lib/native      
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop Main ................................. SUCCESS [  2.740 s]
[INFO] Apache Hadoop Project POM .......................... SUCCESS [  2.573 s]
...
[INFO] Apache Hadoop Tools ................................ SUCCESS [  0.524 s]
[INFO] Apache Hadoop Distribution ......................... SUCCESS [01:16 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 18:54 min
[INFO] Finished at: 2015-12-13T18:52:17+08:00
[INFO] Final Memory: 86M/247M
[INFO] ------------------------------------------------------------------------
[[email protected] hadoop-2.6.1-src]$ 
           

Check that the build produced the correct (32-bit) library:

[[email protected] ~]$ file downloads/hadoop-2.6.1-src/hadoop-dist/target/hadoop-2.6.1/lib/native/libhadoop.so.1.0.0 
downloads/hadoop-2.6.1-src/hadoop-dist/target/hadoop-2.6.1/lib/native/libhadoop.so.1.0.0: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, not stripped
[[email protected] ~]$ 
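
Besides `file`, Hadoop ships its own diagnostic command that reports which native components it can actually load:

```shell
# Reports whether native hadoop, zlib, snappy, lz4, bzip2 and openssl
# support was detected by the running Hadoop installation
hadoop checknative -a
```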
           

Using the compiled native library

First copy the compiled libraries into the Hadoop directory:

[[email protected] ~]$ cp -r downloads/hadoop-2.6.1-src/hadoop-dist/target/hadoop-2.6.1/lib hadoop/src_2.6.1/
[[email protected] ~]$ ls hadoop/src_2.6.1/
lib
[[email protected] ~]$ ls hadoop/src_2.6.1/lib/native/
libhadoop.a       libhadoop.so        libhadooputils.a  libhdfs.so
libhadooppipes.a  libhadoop.so.1.0.0  libhdfs.a         libhdfs.so.0.0.0
[[email protected] ~]$ 
           

Then update the environment variables to point at the compiled native library:

export HADOOP_SRC_HOME=$HOME/hadoop/src_2.6.1

export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_SRC_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_SRC_HOME/lib/native"
           

Start HDFS again and the warning is gone; with DEBUG logging enabled you can see the native library being loaded:

[[email protected] ~]$ export HADOOP_ROOT_LOGGER=DEBUG,console
[[email protected] ~]$ hadoop/2.6.1/sbin/start-dfs.sh 
15/12/14 11:25:44 DEBUG util.Shell: setsid exited with exit code 0
...
15/12/14 11:25:45 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group related metrics
15/12/14 11:25:45 DEBUG security.Groups:  Creating new Groups object
15/12/14 11:25:45 DEBUG util.NativeCodeLoader: Trying to load the custom-built native-hadoop library...
15/12/14 11:25:45 DEBUG util.NativeCodeLoader: Loaded the native-hadoop library
15/12/14 11:25:45 DEBUG security.JniBasedUnixGroupsMapping: Using JniBasedUnixGroupsMapping for Group resolution
15/12/14 11:25:45 DEBUG security.JniBasedUnixGroupsMappingWithFallback: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMapping

           
