Look not at your reflection in water but in other people, and good and ill fortune can be discerned.
One trips not over a mountain but over an anthill; it is the small things that must be guarded against.
Related links
HDFS background
- Hadoop Distributed File System (HDFS): quick start
- Hadoop Distributed File System (HDFS): knowledge overview (detailed)
Connecting to a Hadoop cluster
- Connecting Eclipse to a Hadoop cluster
- Connecting IntelliJ IDEA to a Hadoop cluster
HDFS Java API
Hadoop Distributed File System (HDFS) Java interface (HDFS Java API), detailed edition
WordCount program analysis
Writing the WordCount program with the Java API
Running WordCount in Eclipse
File downloads
- WordCount.java (extraction code: 2kwo)
- log4j.properties (extraction code: tpz9)
- data.txt (extraction code: zefp)
Steps
Note: complete all of the steps in "Connecting Eclipse to a Hadoop cluster" before continuing.
- Open Eclipse, click "File" → "New" → "Map/Reduce Project", then click "Next".
- In the dialog that appears, enter a project name, choose the project location, and click "Finish".
- In the src directory of the MapReduce project, create the package cn.neu and click "Finish".
- Copy the downloaded WordCount.java into the cn.neu package (you can simply drag and drop it).
- Using Xftp or another file-transfer tool, copy core-site.xml and hdfs-site.xml from the hadoop/hadoop-2.6.0/etc/hadoop directory under the Hadoop installation on the remote cluster to your local machine.
Copy those two XML files, together with the downloaded log4j.properties, into src.
Note: if you are unsure how these XML files should be configured, see "Installing and Deploying a Hadoop Cluster on Multiple Linux Virtual Machines (detailed edition)".
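For reference, a minimal core-site.xml usually looks like the sketch below. The host name master and port 9000 are assumptions for illustration; use the NameNode address from your own cluster:

```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <!-- replace with your own NameNode address -->
    <value>hdfs://master:9000</value>
  </property>
</configuration>
```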
If the two XML files are missing, the run fails with errors like the following:
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil (file:/G:/hadoop-2.6.0/share/hadoop/common/lib/hadoop-auth-2.6.0.jar) to method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/test/input/data.txt
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:321)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:264)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:385)
at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:597)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:614)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:492)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/javax.security.auth.Subject.doAs(Unknown Source)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1314)
at cn.neu.WordCount.main(WordCount.java:60)
- Right-click the HDFS root directory, click "Create new directory", enter test, and click "OK".
Right-click inside the Project Explorer pane and click Refresh; the new directory then appears.
Right-click the test folder and create a directory named input under it; after refreshing it looks as follows.
- Right-click the input directory and choose "Upload files to DFS" (HDFS was formerly also called DFS). Select the downloaded data.txt, click "Open", then refresh the Project Explorer again, as shown below.
- WordCount.java reads two command-line parameters, so run arguments must be configured:
FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
Right-click in the code editor and click "Run As" → "Run Configurations".
Click "Arguments", enter the data.txt path from the previous step and the program's final output path, click "Apply", then "Run" to start the program.
Note: do not create an output directory under test before running the program. The output directory must not already exist, or the job fails with a "directory already exists" error.
- The following error may occur (if it does not, skip this step):
Exception in thread "main" java.lang.ExceptionInInitializerError
at org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows(GenericOptionsParser.java:438)
at org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:484)
at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:170)
at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:153)
at cn.neu.WordCount.main(WordCount.java:45)
Caused by: java.lang.StringIndexOutOfBoundsException: begin 0, end 3, length 2
at java.base/java.lang.String.checkBoundsBeginEnd(Unknown Source)
at java.base/java.lang.String.substring(Unknown Source)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:49)
... 5 more
Click (Shell.java:49) in the stack trace to open the view shown below, then click "Attach Source".
In the next dialog, click "External location" → "External file", navigate along the path shown above to the sources folder, open it, select hadoop-common-2.6.0-sources.jar, click "Open", and finally click "OK".
Click (Shell.java:49) again to view its source and go to line 49, which reads:
private static boolean IS_JAVA7_OR_ABOVE =
    System.getProperty("java.version").substring(0, 3).compareTo("1.7") >= 0;
Taken together with these lines of the error:
at java.base/java.lang.String.checkBoundsBeginEnd(Unknown Source)
at java.base/java.lang.String.substring(Unknown Source)
it is clear that the version string is too short for the substring(0, 3) call: on Java 9 and later, java.version can be as short as "9". The fix is to add the following line at the beginning of the main method:
System.setProperty("java.version", "1.8");
Any value of "1.7" or higher works here.
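The failure is easy to reproduce in plain Java. The sketch below (the class name and the illustrative version string "9" are my own, not from the tutorial) shows why substring(0, 3) throws on a short version string and how the property override avoids it:

```java
// Why Hadoop 2.6.0's Shell class fails on Java 9+: it calls
// System.getProperty("java.version").substring(0, 3), but newer
// JVMs may report a version string shorter than 3 characters.
public class VersionStringDemo {
    public static void main(String[] args) {
        String shortVersion = "9"; // illustrative value for a Java 9+ JVM
        try {
            shortVersion.substring(0, 3); // throws: end index 3 > length 1
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("substring(0, 3) fails on \"" + shortVersion + "\"");
        }

        // The tutorial's workaround: override the property (to "1.7" or
        // higher) before Hadoop's Shell class is loaded.
        System.setProperty("java.version", "1.8");
        System.out.println(System.getProperty("java.version"));
    }
}
```

This only changes the value of the java.version system property as seen by the running process; it does not change the JVM itself, which is why it is enough to satisfy Shell's version check.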
- If the program runs normally, wait for it to finish, then right-click the test directory under Hadoop in the Project Explorer and click Refresh; an output directory appears inside it. Double-click the part-r-00000 file to view the program's result.
- To run the job again, either change the output directory in the run configuration or delete the files under the existing output path.
A once-and-for-all approach is a small change to the main method: before each run, check whether the output path exists and delete it if it does.
Before:
System.setProperty("HADOOP_USER_NAME", "root");
System.setProperty("java.version", "1.8");
Configuration conf = new Configuration();
String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
if (otherArgs.length != 2) {
    System.err.println("Usage: WordCount <in> <out>");
    System.exit(2);
}
After:
System.setProperty("HADOOP_USER_NAME", "root");
System.setProperty("java.version", "1.8");
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
if (otherArgs.length != 2) {
    System.err.println("Usage: WordCount <in> <out>");
    System.exit(2);
}
Path outPath = new Path(otherArgs[1]);
if (fs.exists(outPath)) {
    fs.delete(outPath, true); // recursively remove a stale output directory
}
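For intuition, the computation the job performs can be sketched in plain Java without any Hadoop machinery: the map phase tokenizes each line into words, and the reduce phase sums the counts per word. The class name and sample input below are illustrative stand-ins, not the contents of data.txt:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java sketch of what the WordCount MapReduce job computes.
public class WordCountCore {
    public static Map<String, Integer> count(String[] lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {                    // "map": tokenize each line
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum); // "reduce": sum 1s per word
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] data = {"hello world", "hello hadoop"}; // stand-in input
        // Print in the same tab-separated form as part-r-00000.
        count(data).forEach((w, c) -> System.out.println(w + "\t" + c));
    }
}
```

In the real job, the tokenizing happens in the Mapper and the summing in the Reducer, with Hadoop shuffling equal keys to the same Reducer between the two phases.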