Custom Data Types in Hadoop

2023-03-02 02:05:46

Hadoop's basic data types are wrappers around Java's primitive types: for example, int corresponds to IntWritable and long corresponds to LongWritable.
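To make "wrapper" concrete, here is a minimal plain-Java sketch of what such a type does: it holds one int, can be reset and reused, and knows how to write itself to and read itself from a binary stream. IntBox and IntBoxDemo are hypothetical names invented for this illustration; this is not Hadoop's source code, and it runs without Hadoop on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

class IntBoxDemo {
	// Serializes one int through the wrapper into a byte array.
	static byte[] toBytes(int value) throws IOException {
		IntBox box = new IntBox();
		box.set(value);
		ByteArrayOutputStream bytes = new ByteArrayOutputStream();
		try (DataOutputStream out = new DataOutputStream(bytes)) {
			box.write(out);
		}
		return bytes.toByteArray();
	}

	// Fills a fresh wrapper from the bytes and returns the int it holds.
	static int fromBytes(byte[] data) throws IOException {
		IntBox box = new IntBox();
		try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
			box.readFields(in);
		}
		return box.get();
	}

	public static void main(String[] args) throws IOException {
		byte[] data = toBytes(7);
		System.out.println(data.length + " bytes, value " + fromBytes(data));
	}
}

// Hypothetical stand-in for a type such as IntWritable (an assumption for illustration).
class IntBox implements Comparable<IntBox> {
	private int value;

	public void set(int value) { this.value = value; }

	public int get() { return value; }

	// Writes the wrapped int in binary form.
	public void write(DataOutput out) throws IOException { out.writeInt(value); }

	// Overwrites the wrapped int with the next int from the stream.
	public void readFields(DataInput in) throws IOException { value = in.readInt(); }

	@Override
	public int compareTo(IntBox other) { return Integer.compare(value, other.value); }
}
```

The wrapper is mutable on purpose: one instance can be refilled by readFields() for every record instead of allocating a new object each time.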

As in plain Java, we sometimes need to define our own data types in Hadoop.

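A custom data type that is used as a key in Hadoop must implement the WritableComparable interface. A type that is used only as a value needs to implement only Writable.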

Example:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.WritableComparable;

public class LastOrder implements WritableComparable<LastOrder>{

	private int cust_id;
	private String cust_type;
	private String cust_email;
	
	// Hadoop creates instances by reflection, so a public no-argument constructor is required.
	public LastOrder(){
		
	}
	
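	// Deserialization: read the fields in the same order in which write() emits them.
	@Override
	public void readFields(DataInput in) throws IOException {
		
		this.cust_id = in.readInt();
		this.cust_type = in.readUTF();
		this.cust_email = in.readUTF();
	}

	// Serialization: writeUTF() throws NullPointerException for a null String,
	// so cust_type and cust_email must be set before the object is written.
	@Override
	public void write(DataOutput out) throws IOException {
		
		out.writeInt(this.cust_id);
		out.writeUTF(this.cust_type);
		out.writeUTF(this.cust_email);
		
	}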


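	@Override
	public int compareTo(LastOrder o) {
		// Integer.compare avoids the overflow that "this.cust_id - o.cust_id" can produce
		return Integer.compare(this.cust_id, o.cust_id);
	}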

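	// Based on cust_id only, to stay consistent with compareTo().
	// The default HashPartitioner calls hashCode(), so it must not depend on object identity.
	@Override
	public int hashCode(){
		return cust_id;
	}
	
	// Takes Object so that it overrides Object.equals() instead of overloading it.
	@Override
	public boolean equals(Object o){
		if (this == o) return true;
		if (!(o instanceof LastOrder)) return false;
		return this.cust_id == ((LastOrder) o).cust_id;
	}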
	
	
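	// Text form of the record; fields are separated by the control character \001 (Ctrl-A).
	@Override
	public String toString(){
		StringBuilder sb = new StringBuilder();
		sb.append(cust_id);
		sb.append("\001");
		sb.append(cust_type);
		sb.append("\001");
		sb.append(cust_email);
		
		return sb.toString();
	}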
	
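	public int getCust_id() {
		return cust_id;
	}

	public void setCust_id(int cust_id) {
		this.cust_id = cust_id;
	}

	public String getCust_type() {
		return cust_type;
	}

	public void setCust_type(String cust_type) {
		this.cust_type = cust_type;
	}

	public String getCust_email() {
		return cust_email;
	}

	public void setCust_email(String cust_email) {
		this.cust_email = cust_email;
	}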

}
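When a type like LastOrder is used as a key, compareTo() decides the sort order and hashCode() decides which reducer receives the key under the default partitioner. The plain-Java sketch below illustrates both points without Hadoop on the classpath. CustomerKey, KeyContractDemo, partitionFor and bySubtraction are hypothetical names for this illustration; the expression in partitionFor is the one Hadoop's default HashPartitioner applies.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class KeyContractDemo {
	// Same expression as Hadoop's default HashPartitioner.
	static int partitionFor(Object key, int numReduceTasks) {
		return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
	}

	// Comparison by subtraction: the result overflows when the operands are far apart.
	static int bySubtraction(int a, int b) {
		return a - b;
	}

	// Sorts the given ids through CustomerKey.compareTo() and returns the smallest.
	static int smallestId(int... ids) {
		List<CustomerKey> keys = new ArrayList<>();
		for (int id : ids) {
			keys.add(new CustomerKey(id));
		}
		Collections.sort(keys);
		return keys.get(0).custId;
	}

	public static void main(String[] args) {
		CustomerKey a = new CustomerKey(7);
		CustomerKey b = new CustomerKey(7);
		// Two distinct objects with the same id must land in the same partition.
		System.out.println(partitionFor(a, 4) == partitionFor(b, 4));
		// Subtraction claims Integer.MIN_VALUE is greater than 1.
		System.out.println(bySubtraction(Integer.MIN_VALUE, 1) > 0);
		System.out.println(smallestId(1, Integer.MIN_VALUE, 5));
	}
}

// Hypothetical key type: identity, ordering and hashing all use custId.
class CustomerKey implements Comparable<CustomerKey> {
	final int custId;

	CustomerKey(int custId) {
		this.custId = custId;
	}

	@Override
	public int compareTo(CustomerKey other) {
		return Integer.compare(custId, other.custId);
	}

	@Override
	public boolean equals(Object o) {
		return o instanceof CustomerKey && custId == ((CustomerKey) o).custId;
	}

	@Override
	public int hashCode() {
		return custId;
	}
}
```

If hashCode() fell back to Object.hashCode(), two keys with the same customer id could be sent to different reducers, because the identity hash differs from object to object and from JVM to JVM.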

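Note: readFields() must read the fields in exactly the same order in which write() writes them. If the order differs, the program either fails at run time or silently fills the fields with the wrong values.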
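The effect of field order can be tried out without a cluster. The following plain-Java sketch serializes a record to a byte array and reads it back twice, once in the matching order and once with the two strings swapped. OrderRecord and RoundTripDemo are hypothetical stand-ins that mirror LastOrder's write() and readFields(); they deliberately do not implement Hadoop's interfaces so that the example runs with the JDK alone.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

class RoundTripDemo {
	static byte[] serialize(OrderRecord record) throws IOException {
		ByteArrayOutputStream bytes = new ByteArrayOutputStream();
		try (DataOutputStream out = new DataOutputStream(bytes)) {
			record.write(out);
		}
		return bytes.toByteArray();
	}

	// Reads the fields in the same order as write().
	static OrderRecord deserialize(byte[] data) throws IOException {
		OrderRecord record = new OrderRecord();
		try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
			record.readFields(in);
		}
		return record;
	}

	// Deliberately wrong: reads the two strings in swapped order.
	static OrderRecord deserializeSwapped(byte[] data) throws IOException {
		OrderRecord record = new OrderRecord();
		try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
			record.custId = in.readInt();
			record.custEmail = in.readUTF();
			record.custType = in.readUTF();
		}
		return record;
	}

	public static void main(String[] args) throws IOException {
		byte[] data = serialize(new OrderRecord(42, "vip", "a@example.com"));
		OrderRecord good = deserialize(data);
		OrderRecord bad = deserializeSwapped(data);
		System.out.println(good.custType + " / " + good.custEmail);
		System.out.println(bad.custType + " / " + bad.custEmail);
	}
}

// Hypothetical stand-in for LastOrder (an assumption for illustration).
class OrderRecord {
	int custId;
	String custType;
	String custEmail;

	OrderRecord() {
	}

	OrderRecord(int custId, String custType, String custEmail) {
		this.custId = custId;
		this.custType = custType;
		this.custEmail = custEmail;
	}

	void write(DataOutput out) throws IOException {
		out.writeInt(custId);
		out.writeUTF(custType);
		out.writeUTF(custEmail);
	}

	void readFields(DataInput in) throws IOException {
		custId = in.readInt();
		custType = in.readUTF();
		custEmail = in.readUTF();
	}
}
```

In the swapped case no exception is thrown because both fields are strings; the values simply end up in the wrong fields. Mixing up fields of different types is more likely to surface as an EOFException or as garbage values.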
