
[Problem] Hadoop fails to read files

Environment

Hadoop 3.2.1

Goal and code

Read files stored on HDFS.

Configuration conf = new Configuration();
try (FileSystem fs = FileSystem.get(conf)) {
	Path basePath = new Path("hdfs://hdfs-cluster/");
	FileStatus[] files = fs.listStatus(basePath);

	// addPaths, newestFile and results are defined elsewhere (not shown here).
	for (FileStatus file : files) {
		addPaths(newestFile, results, fs, file.getPath());
	}
}
           

The problem

[WARN ] 2021-05-07 14:58:44.146 - java.lang.NoClassDefFoundError: org/apache/hadoop/fs/PathHandle
[WARN ] 2021-05-07 14:58:44.146 - java.lang.ClassNotFoundException: org.apache.hadoop.fs.PathHandle
[ERROR] 2021-05-07 14:58:44.156 - class=self.HDFSLoader	
org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "hdfs"
	at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3220) ~[hadoop-common-2.10.0.jar:?]
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3240) ~[hadoop-common-2.10.0.jar:?]
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:121) ~[hadoop-common-2.10.0.jar:?]
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3291) ~[hadoop-common-2.10.0.jar:?]
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3259) ~[hadoop-common-2.10.0.jar:?]
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:470) ~[hadoop-common-2.10.0.jar:?]
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:223) ~[hadoop-common-2.10.0.jar:?]
	at self.HDFSLoader.getPath(HDFSLoader.java:118) ~[websocket_service-1.0.jar:?]
	at self.HDFSLoader.run(HDFSLoader.java:57) [websocket_service-1.0.jar:?]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_65]

           

Solution

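After the project's dependencies were upgraded to Hadoop 3.2.1, hadoop-common was still resolving to a 2.x version (the stack trace shows hadoop-common-2.10.0.jar), so some classes the newer code needs, such as org.apache.hadoop.fs.PathHandle, could not be found.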
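
One way to confirm this kind of version mismatch is to print which jar each class is actually loaded from at runtime. The following is a minimal sketch, not part of the original fix: the class name JarLocator and its helper methods are hypothetical, and it relies only on standard JDK calls (Class.forName, getProtectionDomain, getCodeSource). The Hadoop class names are taken from the log above, apart from org.apache.hadoop.hdfs.DistributedFileSystem, which is assumed here to be the implementation behind the "hdfs" scheme.

```java
import java.security.CodeSource;

public class JarLocator {

    /** Returns the jar or directory a class was loaded from, or a placeholder for JDK classes. */
    public static String locationOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    /** Looks a class up by name without initializing it and reports where it came from. */
    public static String describe(String className) {
        try {
            Class<?> cls = Class.forName(className, false, JarLocator.class.getClassLoader());
            return className + " <- " + locationOf(cls);
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError raised by a missing dependency class.
            return className + " <- not loadable (" + e.getClass().getSimpleName() + ")";
        }
    }

    public static void main(String[] args) {
        // Class names from the log above; DistributedFileSystem is an assumption (see lead-in).
        System.out.println(describe("org.apache.hadoop.fs.FileSystem"));
        System.out.println(describe("org.apache.hadoop.fs.PathHandle"));
        System.out.println(describe("org.apache.hadoop.hdfs.DistributedFileSystem"));
    }
}
```

Run with the same classpath as the failing application. If FileSystem resolves to a 2.x hadoop-common jar while other Hadoop classes come from 3.2.1 jars, or PathHandle is reported as not loadable, the dependency versions are mixed.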

Adding both of the following dependencies to the pom file resolved the problem:

<dependency>
	<groupId>org.apache.hadoop</groupId>
	<artifactId>hadoop-client</artifactId>
	<version>3.2.1</version>
</dependency>

<dependency>
	<groupId>org.apache.hadoop</groupId>
	<artifactId>hadoop-common</artifactId>
	<version>3.2.1</version>
</dependency>