
Setting Up and Fixing a Hadoop 1.2.1 MapReduce Pipes C++ Development Environment

Hadoop is hugely popular right now; its back-to-basics key-value philosophy invites us to step back and rethink some problems from a fresh angle.

But with fewer and fewer people writing C/C++ these days, the tutorials online and on the official wiki rarely work out of the box; you will run into one problem or another.

Here I tidy up the details of my setup process to share with fellow enthusiasts; corrections and pointers are welcome.

1. Prerequisites:

VirtualBox 4.3 on Windows 7 x64

CentOS 6.4 x86_64 (from a domestic mirror site)

Hadoop-1.2.1.tar.gz

Install openssl, zlib, and glib as prerequisites (mentioned in my earlier Cassandra article)

2. Cluster setup (abbreviated here; plenty of references online)

2.1 SSH key trust

Master & slave: ssh-keygen -t rsa (press Enter at every prompt)

Master & slave: chmod 755 .ssh

Master: cd .ssh

Master: cp id_rsa.pub authorized_keys

Master: chmod 644 authorized_keys

Master: scp authorized_keys 192.168.137.102:/root/.ssh

Slave: scp id_rsa.pub 192.168.137.101:/root/.ssh/192.168.137.102.id_rsa.pub

Master:

cat 192.168.137.102.id_rsa.pub >> authorized_keys

Master & slave:

vim /etc/ssh/sshd_config

Set RSAAuthentication yes

and PubkeyAuthentication yes

service sshd restart

2.2 Add at the top of hadoop-env.sh

export JAVA_HOME=/opt/java1.6

export HADOOP_HOME=/opt/hadoop

export HADOOP_CONF_DIR=/opt/hadoop/conf

2.3 The three main XML configs (details omitted; they are widely documented online, or see the *-default.xml files from older releases)
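For reference, a minimal sketch of the three files. The addresses and ports below are my assumptions matching the IPs used in this walkthrough, not values from any particular cluster; adjust them to yours:

```xml
<!-- conf/core-site.xml -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.137.101:9000</value>
  </property>
</configuration>

<!-- conf/hdfs-site.xml -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

<!-- conf/mapred-site.xml -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>192.168.137.101:9001</value>
  </property>
</configuration>
```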

2.4 masters file

192.168.137.101

2.5 slaves file

192.168.137.102

2.6 Sync to the slave

scp -r hadoop 192.168.137.102:/opt

2.7 Format the namenode

hadoop namenode -format (enter an uppercase Y when prompted)

2.8 Start it up

start-all.sh

2.9 Initial checks

jps (the master should show NameNode, SecondaryNameNode, and JobTracker; the slave, TaskTracker and DataNode)

hadoop dfsadmin -report

Or point a browser at http://cent1:50070 to check the logs and browse HDFS

3. Building the C++ Pipes libraries

cd /opt/hadoop/src/c++/pipes   ->  chmod 777 configure -> ./configure -> make -> make install

cd /opt/hadoop/src/c++/utils     ->  chmod 777 configure -> ./configure -> make -> make install

cd /opt/hadoop/src/c++/libhdfs ->  chmod 777 configure -> ./configure -> make -> make install

Copy the resulting static and shared library files (3-4x the size of the bundled versions) into the following directories (for convenience later):

/opt/hadoop/c++/Linux-amd64-64/lib

/usr/lib64

/usr/lib

/usr/local/lib

as well as your own development directory.

Copy Hadoop's bundled headers from /opt/hadoop/c++/Linux-amd64-64/include into

/usr/include

/usr/local/include

Restart Hadoop. If you skip step 3, you will hit a server-authentication failure as soon as the reduce phase starts.

4. Development environment

4.1 Use the NCDC weather sample data from the web

[root@cent3 tt]# more sample.txt

0067011990999991950051507004+68750+023550FM-12+038299999V0203301N00671220001CN9999999N9+00001+99999999999

0043011990999991950051512004+68750+023550FM-12+038299999V0203201N00671220001CN9999999N9+00221+99999999999

0043011990999991950051518004+68750+023550FM-12+038299999V0203201N00261220001CN9999999N9-00111+99999999999

0043012650999991949032412004+62300+010750FM-12+048599999V0202701N00461220001CN0500001N9+01111+99999999999

0043012650999991949032418004+62300+010750FM-12+048599999V0202701N00461220001CN0500001N9+00781+99999999999

4.2 Use the max_temperature sample from the web

#include "hadoop/Pipes.hh"
#include "hadoop/TemplateFactory.hh"
#include "hadoop/StringUtils.hh"

#include <algorithm>
#include <limits>
#include <stdint.h>
#include <string>
#include <stdio.h>

class MaxTemperatureMapper : public HadoopPipes::Mapper {
public:
    MaxTemperatureMapper(HadoopPipes::TaskContext& context) {}
    void map(HadoopPipes::MapContext& context) {
        std::string line = context.getInputValue();
        std::string year = line.substr(15, 4);
        std::string airTemperature = line.substr(87, 5);
        std::string q = line.substr(92, 1);
        if (airTemperature != "+9999" &&
            (q == "0" || q == "1" || q == "4" || q == "5" || q == "9"))
            context.emit(year, airTemperature);
    }
};

class MapTemperatureReducer : public HadoopPipes::Reducer {
public:
    MapTemperatureReducer(HadoopPipes::TaskContext& context) {}
    void reduce(HadoopPipes::ReduceContext& context) {
        // Seed with INT_MIN, not 0: temperatures can be negative.
        int maxValue = std::numeric_limits<int>::min();
        while (context.nextValue())
            maxValue = std::max(maxValue, HadoopUtils::toInt(context.getInputValue()));
        context.emit(context.getInputKey(), HadoopUtils::toString(maxValue));
    }
};

int main(int argc, char *argv[]) {
    return HadoopPipes::runTask(
        HadoopPipes::TemplateFactory<MaxTemperatureMapper, MapTemperatureReducer>());
}

4.3 Set up a Makefile, or a vim mapping

CC = g++
PLATFORM = Linux-amd64-64
HADOOP_INSTALL = /opt/hadoop
CPPFLAGS = -m64 -I/usr/local/include

max_temperature: maxtemperature.cpp
	$(CC) $(CPPFLAGS) $< -Wall -L/usr/local/lib -lhadooppipes -lcrypto -lhadooputils -lpthread -g -O2 -o $@

(note that the recipe line must start with a tab, not spaces)

Or via ~/.vimrc (F5 saves, compiles, and runs the current file):

"======================
"F5 Compile c++
"======================
map <F5> :call Compilepp()<CR>
func! Compilepp()
    if &filetype == 'cpp'
        exec "w"
        exec "! clear;
            \ echo Compiling: ./% ...;
            \ echo ;
            \ g++ % -g -lstdc++ -L/usr/local/lib -lhadooppipes -lcrypto -lhadooputils -lpthread -o %<.o;
            \ echo Compile Done;
            \ echo Start Testing;
            \ echo ;
            \ echo ;
            \ echo ;
            \ ./%<.o;"
    endif
endfunc

4.4 Run it

hadoop dfs -rmr output

hadoop dfs -rm bin/max_temperature

hadoop dfs -put max_temperature bin/max_temperature

hadoop dfs -put sample.txt sample.txt

hadoop pipes -D hadoop.pipes.java.recordreader=true -D hadoop.pipes.java.recordwriter=true -input sample.txt -output output -program bin/max_temperature


That's about it. The wiki says little about the need to recompile the Pipes libraries; I pieced that together from other people's notes, for which I thank one predecessor in particular.

Finally, here is a diagram of the MapReduce flow as I understand it, for reference.


http://www.z30.name