Installing and Configuring Pig 0.12.0
1. Download Pig 0.12.0.
2. Unpack it directly, then configure the environment variables:
export JAVA_HOME=/usr/java/jdk1.7.0_45
export HADOOP_HOME=/home/hdpuser/hadoop-2.2.0
export PIG_CLASSPATH=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$JAVA_HOME/bin:$JAVA_HOME/jre/bin:/home/hdpuser/pig-0.12.0/bin:/home/hdpuser/apache-ant-1.9.2/bin
export PIG_HADOOP_VERSION=23
3. Run pig -x local.
4. Tried a simple load of a file and hit the following error: WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
5. After digging through various resources, took the following actions (some of them may have been unnecessary):
a. Following RELEASE_NOTES.txt, executed these steps:
1. Download pig-0.12.0.tar.gz
2. Unpack the file: tar -xzvf pig-0.12.0.tar.gz
3. Move into the installation directory: cd pig-0.12.0
4. To run pig without Hadoop cluster, execute the command below. This will
take you into an interactive shell called grunt that allows you to navigate
the local file system and execute Pig commands against the local files
bin/pig -x local
5. To run on your Hadoop cluster, you need to set PIG_CLASSPATH environment
variable to point to the directory with your hadoop-site.xml file and then run
pig. The commands below will take you into an interactive shell called grunt
that allows you to navigate Hadoop DFS and execute Pig commands against it
export PIG_CLASSPATH=/hadoop/conf
bin/pig
6. To build your own version of pig.jar run
ant
7. To run unit tests run
ant test
8. To build jar file with available user defined functions run commands below.
cd contrib/piggybank/java
ant
9. To build the tutorial:
cd tutorial
ant
10. To run tutorial follow instructions in http://wiki.apache.org/pig/PigTutorial
b. The problem persisted afterwards. Online I came across this suggestion: ant clean jar-withouthadoop -Dhadoopversion=23. This was still puzzling: my environment is Hadoop 2.2.0, and it was unclear what value to set, since I had seen 0.18 use 18 and 0.20 use 20. I ran it anyway, just to see, and the problem was solved. What I still don't know is whether step a above was actually required.
6. Copy the /etc/passwd file to the current directory and run pig -x local (local mode).
7. Run the example from the official docs (http://pig.apache.org/docs/r0.12.0/start.html):
grunt> A = load 'passwd' using PigStorage(':');
grunt> B = foreach A generate $0 as id;
grunt> dump B;
The result was returned correctly.
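For comparison, here is a rough local equivalent of that Pig script in plain Python. This is only an illustration of what the three Pig statements do; the two sample lines are hypothetical stand-ins for real /etc/passwd entries.

```python
# Rough Python equivalent of the Pig script above, for illustration only:
#   A = load 'passwd' using PigStorage(':');
#   B = foreach A generate $0 as id;
#   dump B;
# The sample lines are hypothetical stand-ins for real /etc/passwd entries.
sample = [
    "root:x:0:0:root:/root:/bin/bash",
    "hdpuser:x:1000:1000::/home/hdpuser:/bin/bash",
]

A = [line.split(":") for line in sample]  # PigStorage(':') splits each line on ':'
B = [(row[0],) for row in A]              # generate $0 as id -> one-field tuples

for t in B:
    print(t)                              # dump B
```

The same projection works regardless of how many fields each passwd line has, since only field $0 is kept.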
Verifying That Pig Works
Next, tested whether Pig works correctly in Hadoop mode. For this, the cluster was switched back to pseudo-distributed mode; in fully distributed mode there were still errors, not yet resolved, tentatively attributed to the cluster configuration.
1. Copy the file shipped with the Pig tutorial to HDFS: hadoop fs -copyFromLocal /home/hdpuser/pig-0.12.0/tutorial/data/excite-small.log /user/hdpuser/excite-small.log
2. Run pig -x mapreduce to enter Pig's Hadoop (MapReduce) mode.
3. grunt> cd /user
4. grunt> cd /hdpuser
5. Run ls to check that the copied file is there.
6. grunt> log = LOAD '/user/hdpuser/excite-small.log' AS (user:chararray, time:long, query:chararray);
7. grunt> lmt = LIMIT log 4;
8. grunt> DUMP lmt;
9. The result was returned correctly:
Job Stats (time in seconds):
JobId Maps Reduces MaxMapTime MinMapTIme AvgMapTime MedianMapTime MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias Feature Outputs
job_local1555930682_0002 1 1 0 0 0 0 0 0 0 0 lmt,log
job_local1844986210_0003 1 1 0 0 0 0 0 0 0 0 log hdfs://localhost:9000/tmp/temp1201738014/tmp-313826598,
Input(s):
Successfully read 0 records from: "/user/hdpuser/excite-small.log"
Output(s):
Successfully stored 0 records in: "hdfs://localhost:9000/tmp/temp1201738014/tmp-313826598"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_local1555930682_0002 -> job_local1844986210_0003,
job_local1844986210_0003
2013-12-08 04:45:37,849 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2013-12-08 04:45:37,852 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2013-12-08 04:45:37,852 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2013-12-08 04:45:37,852 [main] WARN org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2013-12-08 04:45:37,860 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2013-12-08 04:45:37,860 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(2A9EABFB35F5B954,970916105432,+md foods +proteins)
(BED75271605EBD0C,970916001949,yahoo chat)
(BED75271605EBD0C,970916001954,yahoo chat)
(BED75271605EBD0C,970916003523,yahoo chat)
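The MapReduce steps above can also be sketched locally. Below is a minimal Python illustration of steps 6-8 (LOAD with a schema, then LIMIT 4), assuming excite-small.log is tab-separated as (user, time, query); the first four sample rows are taken from the DUMP output above, and the fifth is hypothetical, added so that LIMIT has something to cut off.

```python
# Python illustration of the Pig steps above, assuming excite-small.log is
# tab-separated (user, time, query). The first four rows come from the DUMP
# output; the fifth is a hypothetical extra row so LIMIT actually truncates.
sample = [
    "2A9EABFB35F5B954\t970916105432\t+md foods +proteins",
    "BED75271605EBD0C\t970916001949\tyahoo chat",
    "BED75271605EBD0C\t970916001954\tyahoo chat",
    "BED75271605EBD0C\t970916003523\tyahoo chat",
    "BED75271605EBD0C\t970916011322\tyahoo search",  # hypothetical fifth row
]

def load(lines):
    # LOAD ... AS (user:chararray, time:long, query:chararray)
    for line in lines:
        user, time, query = line.split("\t")
        yield (user, int(time), query)

log = list(load(sample))
lmt = log[:4]             # LIMIT log 4
for t in lmt:
    print(t)              # DUMP lmt
```

Declaring time as long in the schema is what makes the Pig tuples print the timestamp unquoted, which the int() conversion mirrors here.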