Writing a Hive UDF

Example: converting uppercase to lowercase

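package com.afan;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Simple Hive UDF that converts a string value to lowercase.
public class UDFLower extends UDF {

    // Hive locates evaluate() by reflection and calls it for each row.
    public Text evaluate(final Text s) {
        // Pass NULL input through as NULL.
        if (s == null) {
            return null;
        }
        return new Text(s.toString().toLowerCase());
    }
}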
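
The null check and lowercase conversion in `evaluate` can be tried outside Hive. The following is a minimal sketch, not the author's code: `LowerSketch` and `lower` are hypothetical names, plain `String` stands in for Hadoop's `Text` so no Hive jars are needed, and the use of `Locale.ROOT` is an added assumption. The no-argument `String.toLowerCase()` uses the JVM default locale, so under a Turkish locale, for example, `I` lowercases to a dotless `ı`.

```java
import java.util.Locale;

// Standalone sketch of the UDF's logic without Hive dependencies.
// LowerSketch and lower() are hypothetical names for illustration only.
final class LowerSketch {

    // Mirrors UDFLower.evaluate: null in, null out; otherwise lowercase.
    // Locale.ROOT is an assumption added here to avoid locale-dependent results.
    static String lower(String s) {
        if (s == null) {
            return null;
        }
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Sample values taken from the test table used later in this post.
        String[] rows = {"WHO", "AM", "I", "HELLO", "worLd"};
        for (String row : rows) {
            System.out.println(lower(row));
        }
    }
}
```

Whether to pin the locale in the real UDF depends on the data; the original class uses the default-locale form.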

1. Load the UDF jar

afan@ubuntu:/usr/local/hadoop/hive$ bin/hive

Hive history file=/tmp/afan/hive_job_log_afan_201105150623_175667077.txt

hive> add jar udf_hive.jar;

Added udf_hive.jar to class path

Added resource: udf_hive.jar

2. Create the UDF function

hive> create temporary function my_lower as 'com.afan.UDFLower';

OK

Time taken: 0.253 seconds

3. Create the test data

hive> create table dual (info string);

Time taken: 0.178 seconds

hive> load data local inpath 'data.txt' into table dual;

Copying data from file:/usr/local/hadoop/hive/data.txt

Copying file: file:/usr/local/hadoop/hive/data.txt

Loading data to table default.dual

Time taken: 0.377 seconds
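
The post does not show the contents of data.txt. Judging from the rows returned by the query in this step, it likely holds one value per line. As a rough sketch under that assumption, a file like it could be generated as follows; `MakeTestData` and `content` are hypothetical names, and the UTF-8 encoding and trailing newline are guesses.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that writes a data.txt resembling the one loaded above.
final class MakeTestData {

    // Values inferred from the query output in this post, one per line.
    static final List<String> ROWS = Arrays.asList("WHO", "AM", "I", "HELLO", "worLd");

    // Builds the file body: each row followed by a newline.
    static String content() {
        return String.join("\n", ROWS) + "\n";
    }

    public static void main(String[] args) throws IOException {
        // Writes data.txt into the current working directory.
        Files.write(Paths.get("data.txt"), content().getBytes(StandardCharsets.UTF_8));
    }
}
```

Because the table has a single string column, each line of the file becomes one row of `info`.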

hive> select info from dual;

Total MapReduce jobs = 1

Launching Job 1 out of 1

Number of reduce tasks is set to 0 since there's no reduce operator

Starting Job = job_201105150525_0003, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201105150525_0003

Kill Command = /usr/local/hadoop/bin/../bin/hadoop job  -Dmapred.job.tracker=localhost:9001 -kill job_201105150525_0003

2011-05-15 06:46:05,459 Stage-1 map = 0%,  reduce = 0%

2011-05-15 06:46:10,905 Stage-1 map = 100%,  reduce = 0%

2011-05-15 06:46:13,963 Stage-1 map = 100%,  reduce = 100%

Ended Job = job_201105150525_0003

WHO

AM

I

HELLO

worLd

Time taken: 14.874 seconds

4. Use the UDF

hive> select my_lower(info) from dual;

Starting Job = job_201105150525_0002, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201105150525_0002

Kill Command = /usr/local/hadoop/bin/../bin/hadoop job  -Dmapred.job.tracker=localhost:9001 -kill job_201105150525_0002

2011-05-15 06:43:26,100 Stage-1 map = 0%,  reduce = 0%

2011-05-15 06:43:34,364 Stage-1 map = 100%,  reduce = 0%

2011-05-15 06:43:37,484 Stage-1 map = 100%,  reduce = 100%

Ended Job = job_201105150525_0002

who

am

i

hello

world

Time taken: 20.834 seconds

This article is reposted from the SimplePoint blog on 51CTO. Original link: http://blog.51cto.com/2226894115/1897838. To repost it, please contact the original author.