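Take not water as your mirror but other people, and good and ill fortune can be discerned.

One stumbles not over a mountain but over an anthill, so guard against the small things.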

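Running WordCount in Eclipse

File downloads

WordCount.java (extraction code: 2kwo)
log4j.properties (extraction code: tpz9)
data.txt (extraction code: zefp)

Steps

Note: complete every step of connecting Eclipse to the Hadoop cluster before continuing with the steps below.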

Open Eclipse and click “File” → “New” → “Map/Reduce Project”, then click “Next”.

In the dialog that opens, enter a project name, choose the project location, and click “Finish”.

In the src directory of the MapReduce project, create a new package named cn.neu and click “Finish”.

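Copy the downloaded WordCount.java file into the cn.neu package (dragging it in works).
Use Xftp or a similar file-transfer tool to copy core-site.xml and hdfs-site.xml from hadoop/hadoop-2.6.0/etc/hadoop under the Hadoop installation directory on the remote cluster to the local machine.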

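Copy these two XML files, together with the downloaded log4j.properties file, into src.

Note: if you are unsure how the XML files above should be configured, see “Installing and Deploying a Hadoop Cluster on Multiple Linux Virtual Machines (Very Detailed Edition)”.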

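If the two XML files are not added, the following error occurs. Note the file: scheme in the exception: without core-site.xml the input path is resolved against the local file system instead of HDFS.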

log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil (file:/G:/hadoop-2.6.0/share/hadoop/common/lib/hadoop-auth-2.6.0.jar) to method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/test/input/data.txt
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:321)
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:264)
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:385)
	at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:597)
	at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:614)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:492)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
	at java.base/java.security.AccessController.doPrivileged(Native Method)
	at java.base/javax.security.auth.Subject.doAs(Unknown Source)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1314)
	at cn.neu.WordCount.main(WordCount.java:60)
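
Hadoop's Configuration loads core-site.xml from the classpath, and log4j 1.2 looks for log4j.properties there as well, which is why the files go into src (Eclipse copies non-Java files from src to the build output). One way to check whether the files are visible at run time is a small stand-alone lookup like the sketch below. The class name ClasspathCheck is made up for illustration, and the code uses only the JDK, not Hadoop.

```java
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

public class ClasspathCheck {

    // Returns the resource names that cannot be found on the classpath.
    public static List<String> missing(String... names) {
        ClassLoader loader = ClasspathCheck.class.getClassLoader();
        List<String> notFound = new ArrayList<>();
        for (String name : names) {
            URL location = loader.getResource(name);
            if (location == null) {
                notFound.add(name);
            } else {
                System.out.println(name + " found at " + location);
            }
        }
        return notFound;
    }

    public static void main(String[] args) {
        // The three files this article copies into src.
        List<String> notFound =
                missing("core-site.xml", "hdfs-site.xml", "log4j.properties");
        System.out.println("Missing from classpath: " + notFound);
    }
}
```

If this class is placed in the same project and any of the three names is reported as missing, the corresponding file was probably not copied into src or the project has not been rebuilt.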

Right-click the HDFS root directory and click “Create new directory”.

Enter test and click “OK”.

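Right-click inside the Project Explorer pane and click Refresh; the new directory then appears.

Right-click the test folder and create a directory named input inside it, then refresh again.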

Right-click the input directory and choose Upload files to DFS (DFS here refers to HDFS).

Select the downloaded data.txt file and click “Open”, then refresh Project Explorer once more; data.txt should now be listed under input.

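WordCount.java reads two values from its arguments, the input path and the output path, so the program arguments must be configured before it runs.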

FileInputFormat.addInputPath(job, new Path(otherArgs[0]));

FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));

Right-click in the code editor and choose “Run As” → “Run Configurations”.

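Open the Arguments tab and enter the path of the data.txt file uploaded in the previous step, followed by the output path for the program's results (with the directories created above, for example /test/input/data.txt /test/output). Click “Apply”, then click “Run” to start the program.

Note: do not create the output directory under test before running the program. The output directory must not exist, otherwise the job fails with a “directory already exists” error.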
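
The two values entered on the Arguments tab reach main as positional arguments: the first becomes the input path and the second the output path. The following is a minimal stand-alone sketch of that mapping without any Hadoop classes. The class name ProgramArgs and its fields are illustrative and not part of WordCount.java, and the real program first passes args through GenericOptionsParser, which this sketch skips.

```java
public class ProgramArgs {

    public final String inputPath;   // plays the role of otherArgs[0]
    public final String outputPath;  // plays the role of otherArgs[1]

    private ProgramArgs(String inputPath, String outputPath) {
        this.inputPath = inputPath;
        this.outputPath = outputPath;
    }

    // Accepts exactly two arguments: <in> <out>.
    public static ProgramArgs parse(String[] args) {
        if (args.length != 2) {
            throw new IllegalArgumentException("Usage: WordCount <in> <out>");
        }
        return new ProgramArgs(args[0], args[1]);
    }

    public static void main(String[] args) {
        // Example values matching the directories created in the steps above.
        ProgramArgs parsed =
                parse(new String[] {"/test/input/data.txt", "/test/output"});
        System.out.println("input  = " + parsed.inputPath);
        System.out.println("output = " + parsed.outputPath);
    }
}
```

The arguments are separated by whitespace in the Arguments tab, so a path that contains spaces would need to be quoted.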

The following error may appear (if it does not, skip this step).

Exception in thread "main" java.lang.ExceptionInInitializerError
	at org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows(GenericOptionsParser.java:438)
	at org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:484)
	at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:170)
	at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:153)
	at cn.neu.WordCount.main(WordCount.java:45)
Caused by: java.lang.StringIndexOutOfBoundsException: begin 0, end 3, length 2
	at java.base/java.lang.String.checkBoundsBeginEnd(Unknown Source)
	at java.base/java.lang.String.substring(Unknown Source)
	at org.apache.hadoop.util.Shell.<clinit>(Shell.java:49)
	... 5 more

Click (Shell.java:49) in the stack trace, then click Attach Source in the editor that opens.

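In the dialog that opens, select “External location” and click “External File”, navigate to the sources folder under the path shown on the previous screen, select hadoop-common-2.6.0-sources.jar, click “Open”, and finally click “OK”.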

Click (Shell.java:49) again to view the source. Line 49 reads as follows

private static boolean IS_JAVA7_OR_ABOVE =
    System.getProperty("java.version").substring(0, 3).compareTo("1.7") >= 0;

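Compare this with the following lines of the error output

Caused by: java.lang.StringIndexOutOfBoundsException: begin 0, end 3, length 2
at java.base/java.lang.String.checkBoundsBeginEnd(Unknown Source)
at java.base/java.lang.String.substring(Unknown Source)

The java.version string reported by the running JVM is only two characters long (for example "11"), so substring(0, 3) runs past its end. A workaround is to add the following line at the start of the main function, before GenericOptionsParser is created

System.setProperty("java.version", "1.8");

Any value that is at least three characters long and compares greater than or equal to "1.7" will do.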
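
The failure can be reproduced without Hadoop. The sketch below copies the expression from line 49 of Shell.java into a hypothetical helper, isJava7OrAbove, and applies it to a few sample version strings (chosen for illustration, not taken from any log).

```java
public class JavaVersionCheck {

    // Same expression as line 49 of Shell.java in hadoop-common 2.6.0,
    // with the version string passed in instead of read from System properties.
    public static boolean isJava7OrAbove(String javaVersion) {
        return javaVersion.substring(0, 3).compareTo("1.7") >= 0;
    }

    public static void main(String[] args) {
        String[] samples = {"1.8.0_202", "1.8", "11"};
        for (String version : samples) {
            try {
                System.out.println(version + " -> " + isJava7OrAbove(version));
            } catch (StringIndexOutOfBoundsException e) {
                // A two-character version string is too short for substring(0, 3).
                System.out.println(version + " -> " + e.getClass().getSimpleName());
            }
        }
    }
}
```

The first two samples evaluate to true, while "11" raises StringIndexOutOfBoundsException, which matches the stack trace above and shows why overriding java.version with "1.8" avoids the error.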

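If the program runs normally, wait for it to finish, then right-click the test directory created earlier under the Hadoop location in Project Explorer and click Refresh; the output directory now appears inside it.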

Double-click the part-r-00000 file to view the program's results.
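
If the results need to be processed further, the output file can be read back as plain text. The sketch below assumes that WordCount.java uses Hadoop's default text output, which writes one key and value per line separated by a tab; WordCount.java is not shown in this article, so this is an assumption. The class name WordCountOutput and the sample lines are made up for illustration and are not the contents of data.txt.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class WordCountOutput {

    // Parses lines of the form "word<TAB>count" into a map that keeps file order.
    public static Map<String, Integer> parse(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            int tab = line.lastIndexOf('\t');
            if (tab < 0) {
                throw new IllegalArgumentException("No tab separator in line: " + line);
            }
            String word = line.substring(0, tab);
            int count = Integer.parseInt(line.substring(tab + 1).trim());
            counts.put(word, count);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample lines in the assumed output format.
        List<String> sample = Arrays.asList("hadoop\t3", "hello\t2", "world\t1");
        parse(sample).forEach((word, count) -> System.out.println(word + " = " + count));
    }
}
```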

To run the program again, either change the output directory in the run arguments or delete the existing output directory.

A once-and-for-all alternative is a small change to the main function: before each run, check whether the output path exists and delete it if it does.

Before

// Run as the root user on the cluster and work around the java.version check.
System.setProperty("HADOOP_USER_NAME", "root");
System.setProperty("java.version", "1.8");
Configuration conf = new Configuration();
String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
if (otherArgs.length != 2) {
	System.err.println("Usage: WordCount <in> <out>");
	System.exit(2);
}

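After

// Also requires: import org.apache.hadoop.fs.FileSystem;
System.setProperty("HADOOP_USER_NAME", "root");
System.setProperty("java.version", "1.8");
Configuration conf = new Configuration();
String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
if (otherArgs.length != 2) {
	System.err.println("Usage: WordCount <in> <out>");
	System.exit(2);
}
// Obtain the file system after argument parsing so that any generic
// options applied to conf are honored.
FileSystem fs = FileSystem.get(conf);
// Delete the output path (recursively) if it already exists, so the job can be rerun.
Path outPath = new Path(otherArgs[1]);
if (fs.exists(outPath)) {
	fs.delete(outPath, true);
}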
