The Hadoop release can be downloaded from Apache at http://archive.apache.org/dist/hadoop/core/hadoop-0.19.0/. The Linux machine I used runs Ubuntu 12.10, with Java 1.7.0_51 installed and JAVA_HOME=/usr/java/jdk1.7.0_51.
Walkthrough
1、Passwordless ssh login to localhost
Make sure the ssh service on the Linux system is running and that you can log in to the local machine over ssh without a password. If you cannot, follow these steps:
(1) Open a terminal and run:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
(2) ssh to localhost:
$ ssh localhost
On the first login you will be told that the authenticity of 127.0.0.1 cannot be established and asked whether to continue connecting; type yes. A successful passwordless login looks like this:
[root@localhost hadoop-0.19.0]# ssh localhost
Last login: Sun Aug 1 18:35:37 2010 from 192.168.0.104
[root@localhost ~]#
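If passwordless login still fails after generating the key, the most common cause is overly permissive file modes, which sshd silently rejects under its default StrictModes setting. A minimal sketch of the usual fix (paths are the standard OpenSSH defaults):

```shell
# sshd refuses key-based auth if ~/.ssh or authorized_keys is group/world writable.
mkdir -p ~/.ssh
chmod 700 ~/.ssh
touch ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
stat -c '%a %n' ~/.ssh ~/.ssh/authorized_keys
```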
2、Hadoop-0.19.0 configuration
Download hadoop-0.19.0.tar.gz (about 40.3 MB) and extract it to a directory on the Linux system; here I use /root/hadoop-0.19.0.
The configuration steps, in order:
(1) Edit hadoop-env.sh
Uncomment the JAVA_HOME line (remove the leading "#") and point it at the local JDK; after editing, the line reads:
export JAVA_HOME=/usr/java/jdk1.7.0_51
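One way to make that edit non-interactively is with sed. The sketch below runs against a scratch file so it is safe to try anywhere; on a real install, apply the same sed line to conf/hadoop-env.sh (the commented-out default shown here is a stand-in, and the JDK path is the one from my machine — adjust to yours):

```shell
# Scratch file standing in for conf/hadoop-env.sh, which ships with JAVA_HOME commented out.
f=$(mktemp)
echo '# export JAVA_HOME=/path/to/jdk' > "$f"
# Uncomment the line and point it at the installed JDK.
sed -i 's|^# *export JAVA_HOME=.*|export JAVA_HOME=/usr/java/jdk1.7.0_51|' "$f"
cat "$f"
```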
(2) Add the following three properties (one per file) between <configuration> and </configuration>. The edited configuration files are shown below.
1、The core-site.xml configuration file
Its contents:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
2、The hdfs-site.xml configuration file
Its contents:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
3、The mapred-site.xml configuration file
Its contents:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
3、運作wordcount執行個體
wordcount例子是hadoop發行包中自帶的執行個體,通過運作執行個體可以感受并嘗試了解hadoop在執行MapReduce任務時的執行過程。按照官方的“HadoopQuick Start”教程基本可以容易地實作,下面簡單說一下我的練習過程。
導航到hadoop目錄下面,我的是/root/hadoop-0.19.0。
(1)格式化HDFS
執行格式化HDFS的指令行:
[root@localhost hadoop-0.19.0]# bin/hadoop namenode -format
The format output looks like this. Note that the confirmation prompt expects an uppercase Y; as the log shows, answering with a lowercase y aborts the format:
10/08/01 19:04:02 INFO namenode.NameNode: STARTUP_MSG:
...
Re-format filesystem in /tmp/hadoop-root/dfs/name ? (Y or N) y
Format aborted in /tmp/hadoop-root/dfs/name
10/08/01 19:04:05 INFO namenode.NameNode: SHUTDOWN_MSG:
...
(2) Start the Hadoop daemons
Run:
[root@localhost hadoop-0.19.0]# bin/start-all.sh
The startup output looks like this:
starting namenode, logging to /root/hadoop-0.19.0/bin/../logs/hadoop-root-namenode-localhost.out
localhost: starting datanode, logging to /root/hadoop-0.19.0/bin/../logs/hadoop-root-datanode-localhost.out
localhost: starting secondarynamenode, logging to /root/hadoop-0.19.0/bin/../logs/hadoop-root-secondarynamenode-localhost.out
starting jobtracker, logging to /root/hadoop-0.19.0/bin/../logs/hadoop-root-jobtracker-localhost.out
localhost: starting tasktracker, logging to /root/hadoop-0.19.0/bin/../logs/hadoop-root-tasktracker-localhost.out
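Once start-all.sh returns, a quick way to confirm that all five daemons actually came up is to inspect the JVM process list printed by jps (shipped with the JDK). Below is a small sketch; check_daemons is a hypothetical helper written for this post, not part of Hadoop:

```shell
# Verify that all five pseudo-distributed daemons appear in a `jps` listing.
check_daemons() {
  listing="$1"
  for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
    echo "$listing" | grep -qw "$d" || { echo "missing: $d"; return 1; }
  done
  echo "all 5 daemons running"
}
# On a live node you would run: check_daemons "$(jps)"
```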
(3) Prepare input data for the wordcount job
First, create a local data directory named input and copy some files into it:
[root@localhost hadoop-0.19.0]# mkdir input
[root@localhost hadoop-0.19.0]# cp CHANGES.txt LICENSE.txt NOTICE.txt README.txt input/
Then upload the local input directory to HDFS:
[root@localhost hadoop-0.19.0]# bin/hadoop fs -put input/ input
(4) Start the wordcount job
Run:
[root@localhost hadoop-0.19.0]# bin/hadoop jar hadoop-0.19.0-examples.jar wordcount input output
The input directory is input and the output directory is output.
The job execution output looks like this:
10/08/01 19:06:15 INFO mapred.FileInputFormat: Total input paths to process : 4
10/08/01 19:06:15 INFO mapred.JobClient: Running job: job_201008011904_0002
10/08/01 19:06:16 INFO mapred.JobClient: map 0% reduce 0%
10/08/01 19:06:22 INFO mapred.JobClient: map 20% reduce 0%
10/08/01 19:06:24 INFO mapred.JobClient: map 40% reduce 0%
10/08/01 19:06:25 INFO mapred.JobClient: map 60% reduce 0%
10/08/01 19:06:27 INFO mapred.JobClient: map 80% reduce 0%
10/08/01 19:06:28 INFO mapred.JobClient: map 100% reduce 0%
10/08/01 19:06:38 INFO mapred.JobClient: map 100% reduce 26%
10/08/01 19:06:40 INFO mapred.JobClient: map 100% reduce 100%
10/08/01 19:06:41 INFO mapred.JobClient: Job complete: job_201008011904_0002
10/08/01 19:06:41 INFO mapred.JobClient: Counters: 16
10/08/01 19:06:41 INFO mapred.JobClient:   File Systems
10/08/01 19:06:41 INFO mapred.JobClient:     HDFS bytes read=301489
10/08/01 19:06:41 INFO mapred.JobClient:     HDFS bytes written=113098
10/08/01 19:06:41 INFO mapred.JobClient:     Local bytes read=174004
10/08/01 19:06:41 INFO mapred.JobClient:     Local bytes written=348172
10/08/01 19:06:41 INFO mapred.JobClient:   Job Counters
10/08/01 19:06:41 INFO mapred.JobClient:     Launched reduce tasks=1
10/08/01 19:06:41 INFO mapred.JobClient:     Launched map tasks=5
10/08/01 19:06:41 INFO mapred.JobClient:     Data-local map tasks=5
10/08/01 19:06:41 INFO mapred.JobClient:   Map-Reduce Framework
10/08/01 19:06:41 INFO mapred.JobClient:     Reduce input groups=8997
10/08/01 19:06:41 INFO mapred.JobClient:     Combine output records=10860
10/08/01 19:06:41 INFO mapred.JobClient:     Map input records=7363
10/08/01 19:06:41 INFO mapred.JobClient:     Reduce output records=8997
10/08/01 19:06:41 INFO mapred.JobClient:     Map output bytes=434077
10/08/01 19:06:41 INFO mapred.JobClient:     Map input bytes=299871
10/08/01 19:06:41 INFO mapred.JobClient:     Combine input records=39193
10/08/01 19:06:41 INFO mapred.JobClient:     Map output records=39193
10/08/01 19:06:41 INFO mapred.JobClient:     Reduce input records=10860
(5) Check the job results
Use the following command:
bin/hadoop fs -cat output/*
A truncated portion of the results:
vijayarenu 20
violations. 1
virtual 3
vis-a-vis 1
visible 1
visit 1
volume 1
volume, 1
volumes 2
volumes. 1
w.r.t 2
wait 9
waiting 6
waiting. 1
waits 3
want 1
warning 7
warning, 1
warnings 12
warnings. 3
warranties 1
warranty 1
warranty, 1
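Conceptually the job computes exactly what the classic shell pipeline below computes — tokenize, group, count — just distributed across map and reduce tasks. A toy illustration with coreutils:

```shell
# Tokenize, then count occurrences per word: the essence of wordcount.
printf 'hadoop runs wordcount\nhadoop counts words\n' \
  | tr -s ' ' '\n' \
  | sort \
  | uniq -c \
  | awk '{print $2, $1}' \
  | sort
```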
(6) Stop the Hadoop daemons
Run:
[root@localhost hadoop-0.19.0]# bin/stop-all.sh
The output looks like this:
stopping jobtracker
localhost: stopping tasktracker
stopping namenode
localhost: stopping datanode
localhost: stopping secondarynamenode
This stops the five processes listed above: jobtracker, tasktracker, namenode, datanode, and secondarynamenode.
Troubleshooting
You may run into exceptions while working through the steps above; a rough analysis of two common ones follows.
1、"Call to localhost/127.0.0.1:9000 failed on local exception"
(1) Description
This may appear when you run:
[root@localhost hadoop-0.19.0]# bin/hadoop jar hadoop-0.19.0-examples.jar wordcount input output
The error output is:
10/08/01 19:50:55 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 0 time(s).
10/08/01 19:50:56 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 1 time(s).
10/08/01 19:50:57 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 2 time(s).
10/08/01 19:50:58 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 3 time(s).
10/08/01 19:50:59 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 4 time(s).
10/08/01 19:51:00 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 5 time(s).
10/08/01 19:51:01 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 6 time(s).
10/08/01 19:51:02 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 7 time(s).
10/08/01 19:51:03 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 8 time(s).
10/08/01 19:51:04 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
java.lang.RuntimeException: java.io.IOException: Call to localhost/127.0.0.1:9000 failed on local exception: Connection refused
    at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:323)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:295)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:268)
    at org.apache.hadoop.examples.WordCount.run(WordCount.java:146)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.examples.WordCount.main(WordCount.java:155)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:141)
    at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:61)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
    at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
Caused by: java.io.IOException: Call to localhost/127.0.0.1:9000 failed on local exception: Connection refused
    at org.apache.hadoop.ipc.Client.call(Client.java:699)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
    at $Proxy0.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319)
    at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:177)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:74)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1367)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:56)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1379)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:215)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:120)
    at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:319)
    ... 21 more
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
    at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:299)
    at org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:772)
    at org.apache.hadoop.ipc.Client.call(Client.java:685)
    ... 33 more
(2) Analysis
The key line in the output above is:
Retrying connect to server: localhost/127.0.0.1:9000.
It says that ten connection attempts to the server all failed, which means the communication path to the server is broken. We did configure the namenode address in core-site.xml:
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
</property>
So the client clearly cannot connect to the server; most likely the namenode process was never started, in which case running a job is out of the question.
I reproduced this exception as follows:
I formatted HDFS but did not run bin/start-all.sh, and launched the wordcount job directly, which produced the exception above.
The fix: run bin/start-all.sh first, then start the wordcount job.
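A fast way to tell whether the namenode is actually listening, before submitting a job, is to probe port 9000 directly. The sketch below uses bash's /dev/tcp pseudo-device (a bash-specific feature), with the port taken from the fs.default.name setting above:

```shell
# Probe the NameNode RPC port; "closed" here usually means start-all.sh has not been run.
if (exec 3<>/dev/tcp/localhost/9000) 2>/dev/null; then
  echo "namenode port 9000 is open"
else
  echo "namenode port 9000 is closed: run bin/start-all.sh first"
fi
```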
2、"Input path does not exist"
(1) Description
Suppose you create an input directory under the current Hadoop directory, cp some files into it, and then run:
[root@localhost hadoop-0.19.0]# bin/hadoop namenode -format
[root@localhost hadoop-0.19.0]# bin/start-all.sh
At this point you assume input exists, so the wordcount job should run:
[root@localhost hadoop-0.19.0]# bin/hadoop jar hadoop-0.19.0-examples.jar wordcount input output
Instead it throws a pile of exceptions:
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://localhost:9000/user/root/input
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:179)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:190)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:782)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1127)
    at org.apache.hadoop.examples.WordCount.run(WordCount.java:149)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.examples.WordCount.main(WordCount.java:155)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:141)
    at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:61)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
    at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
I reproduced this exception as follows:
[root@localhost hadoop-0.19.0]# bin/hadoop fs -rmr input
Deleted hdfs://localhost:9000/user/root/input
[root@localhost hadoop-0.19.0]# bin/hadoop fs -rmr output
Deleted hdfs://localhost:9000/user/root/output
(I could delete these because I had already run the job successfully once before.)
(2) Analysis
Not much needs saying here: the local input directory was never uploaded to HDFS, hence org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://localhost:9000/user/root/input.
As I recall, with hadoop-0.16.4 the job would run as long as a local input directory existed, with no explicit upload; later versions no longer allow that.
The fix is simply to upload it:
[root@localhost hadoop-0.19.0]# bin/hadoop fs -put input/ input
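The hdfs://localhost:9000/user/root/input in the message also shows how the relative path input was resolved: HDFS paths that do not start with / are taken relative to the user's HDFS home directory, /user/<username>. A tiny sketch of that resolution rule (hdfs_abs is a hypothetical helper for illustration, not a Hadoop command):

```shell
# Mimic how the HDFS client turns a path argument into an absolute HDFS path.
hdfs_abs() {
  path=$1; user=$2
  case "$path" in
    /*) echo "$path" ;;              # already absolute: used as-is
    *)  echo "/user/$user/$path" ;;  # relative: resolved under the user's home dir
  esac
}
hdfs_abs input root     # -> /user/root/input
hdfs_abs /tmp/in root   # -> /tmp/in
```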