一、編寫好map和reduce方法。
二、下載下傳叢集上的core-site.xml、hdfs-site.xml、mapred-site.xml、yarn-site.xml四個檔案并放到src根目錄下。
三、編寫驅動程式,然後在擷取job對象之前,添加以下代碼:
conf.set("mapreduce.app-submission.cross-platform", "true");
也可以在mapred-site.xml中将mapreduce.app-submission.cross-platform屬性設定成true。
四、在擷取job對象後,添加以下代碼:
job.setJar("wc.jar"); // wc.jar 為接下來要打包成 jar 包的檔案名
五、将此工程導出jar包,并且jar包名為wc.jar,必須和第四步中的名稱一緻。然後将該jar包放到工程根目錄下。
六、直接在java程式中,右鍵選擇Run on Hadoop即可。
我的java程式代碼如下:
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
public class wordcount {
public static class mymapper extends mapper<object, text, text, intwritable> {
public intwritable one = new intwritable(1);
@override
protected void map(object key, text value, context context) throws ioexception, interruptedexception {
stringtokenizer st = new stringtokenizer(value.tostring());
text text = new text();
while (st.hasmoretokens()) {
text.set(st.nexttoken());
context.write(text, one);
}
public static class myreducer extends reducer<text, intwritable, text, intwritable> {
protected void reduce(text key, iterable<intwritable> values, context context)
throws ioexception, interruptedexception {
int sum = 0;
for (intwritable val : values) {
sum += val.get();
context.write(key, new intwritable(sum));
public static void main(string[] args) throws exception {
configuration conf = new configuration();
filesystem fs = filesystem.get(conf);
job job = job.getinstance(conf, "word count");
job.setjar("wc.jar");
job.setjarbyclass(wordcount.class);
job.setmapperclass(mymapper.class);
job.setcombinerclass(myreducer.class);
job.setreducerclass(myreducer.class);
job.setoutputkeyclass(text.class);
job.setoutputvalueclass(intwritable.class);
path outpath = new path("/output");
path inpath = new path("/sweeney");
if (fs.exists(outpath)) {//避免出現檔案已存在的情況
fs.delete(outpath, true);
fileinputformat.addinputpath(job, inpath);
fileoutputformat.setoutputpath(job, outpath);
system.exit(job.waitforcompletion(true) ? 0 : 1);
————————————————