
CDH5: Configuring LZO with Parcels

Contents: I. Parcel deployment steps; II. Localizing the LZO parcel; III. Modifying the configuration; IV. Verification

I. Parcel deployment steps

    1. Download: first download the parcel. Once the download completes, the parcel resides in a local directory on the Cloudera Manager host.

    2. Distribute: after the parcel is downloaded, it is distributed to every host in the cluster and unpacked.

    3. Activate: after distribution, activate the parcel so it is ready for use when the cluster restarts. An upgrade may be required before activation.
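These three steps are normally clicked through in the Cloudera Manager UI, but they can also be driven from a script against the Cloudera Manager REST API. The sketch below only prints the requests it would issue; the host, cluster, product, and version values are placeholders, and the /api/v6 parcel command paths are an assumption to check against your CM version's API docs.

```shell
# Hypothetical values -- substitute your own CM host, cluster and parcel.
CM_HOST=cm.example.com
CLUSTER=cluster1
PRODUCT=HADOOP_LZO
VERSION=0.4.15-1.gplextras.p0.64

# Parcel command endpoints (assumed v6 API layout).
BASE="http://${CM_HOST}:7180/api/v6/clusters/${CLUSTER}/parcels/products/${PRODUCT}/versions/${VERSION}"

# Download -> distribute -> activate, in order.
for cmd in startDownload startDistribution activate; do
  echo "POST ${BASE}/commands/${cmd}"
  # curl -u admin:admin -X POST "${BASE}/commands/${cmd}"
done
```

Each command is asynchronous in CM, so a real script would poll the parcel's stage between steps rather than firing all three at once.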

II. Localizing the LZO parcel

    2. Also download manifest.json, and create a .sha file from the hash value recorded in manifest.json (note: the .sha file must carry the same name as the parcel package).
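One way to produce the .sha file is to pull your parcel's hash out of manifest.json. A minimal sketch, assuming python3 is available, manifest.json sits in the current directory, and the manifest uses the usual {"parcels": [{"parcelName": ..., "hash": ...}]} layout (the parcel file name in the usage line is a placeholder):

```shell
# make_sha PARCEL: write PARCEL.sha containing the hash that
# manifest.json records for that parcel name.
make_sha() {
  local parcel="$1"
  python3 - "$parcel" <<'EOF' > "${parcel}.sha"
import json, sys
name = sys.argv[1]
manifest = json.load(open("manifest.json"))
# Emit the hash of the matching parcel entry.
print(next(p["hash"] for p in manifest["parcels"] if p["parcelName"] == name))
EOF
}

# Example (hypothetical parcel file name):
# make_sha HADOOP_LZO-0.4.15-1.gplextras.p0.64-el6.parcel
```

Naming the .sha file exactly after the parcel is what lets Cloudera Manager pair the two when it scans the repository directory.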

    3. On the command line, change to the Apache web root (install Apache first if it is missing), /var/www/html by default; create an lzo directory there and place the three files in it.

    4. Start the httpd service and open http://ip/lzo in a browser; the directory listing should show the three files.


    5. Add the published local parcel URL to the Remote Parcel Repository URLs setting in Cloudera Manager.


    6. On the Cloudera Manager Parcels page, the lzo parcel now appears among the downloadable parcels; click it to start the download.

    7. Following the deployment steps above, distribute and activate the parcel.


III. Modifying the configuration

    Modify the HDFS configuration:

    Append org.apache.hadoop.io.compress.Lz4Codec and com.hadoop.compression.lzo.LzopCodec to the io.compression.codecs property.

    Modify the YARN configuration:

    Set mapreduce.application.classpath to: $HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,$MR2_CLASSPATH,/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/*

    Set mapreduce.admin.user.env to: LD_LIBRARY_PATH=$HADOOP_COMMON_HOME/lib/native:$JAVA_LIBRARY_PATH:/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/native


IV. Verification

    Create a data.txt file with the following content:

    Then compress the file with the lzop command and upload the result to the /test directory in HDFS.
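The compress-and-upload step can be wrapped in a small helper; a sketch assuming lzop and the hdfs client are on the PATH (the /test target matches the external table location used below):

```shell
# stage_lzo FILE: compress FILE with lzop and push the .lzo
# result into HDFS under /test (the external table's location).
stage_lzo() {
  local f="$1"
  lzop -f "$f"                 # writes $f.lzo next to the original
  hdfs dfs -mkdir -p /test
  hdfs dfs -put -f "$f.lzo" /test/
}

# Usage: stage_lzo data.txt
```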

    Start Hive, create the table, and query the data; the results follow:

hive> create external table lzo(id int, name string) row format delimited fields terminated by '#' stored as inputformat 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' outputformat 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' location '/test';
OK
Time taken: 0.108 seconds
hive> select * from lzo where id>2;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1404206497656_0002, Tracking URL = http://hadoop01.kt:8088/proxy/application_1404206497656_0002/
Kill Command = /opt/cloudera/parcels/CDH-5.0.1-1.cdh5.0.1.p0.47/lib/hadoop/bin/hadoop job -kill job_1404206497656_0002
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2014-07-01 17:30:27,547 Stage-1 map = 0%, reduce = 0%
2014-07-01 17:30:37,403 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.84 sec
2014-07-01 17:30:38,469 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.84 sec
2014-07-01 17:30:39,527 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.84 sec
MapReduce Total cumulative CPU time: 2 seconds 840 msec
Ended Job = job_1404206497656_0002
MapReduce Jobs Launched:
Job 0: Map: 1   Cumulative CPU: 2.84 sec   HDFS Read: 295 HDFS Write: 15 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 840 msec
3       sz
4       sz
5       bx
Time taken: 32.803 seconds, Fetched: 3 row(s)

hive> set hive.exec.compress.output=true;
hive> set mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;
hive> create external table lzo2(id int, name string) row format delimited fields terminated by '#' stored as inputformat 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' outputformat 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' location '/test';
Time taken: 0.092 seconds
hive> insert into table lzo2 select * from lzo;
Total MapReduce jobs = 3
Launching Job 1 out of 3
Starting Job = job_1404206497656_0003, Tracking URL = http://hadoop01.kt:8088/proxy/application_1404206497656_0003/
Kill Command = /opt/cloudera/parcels/CDH-5.0.1-1.cdh5.0.1.p0.47/lib/hadoop/bin/hadoop job -kill job_1404206497656_0003
2014-07-01 17:33:47,351 Stage-1 map = 0%, reduce = 0%
2014-07-01 17:33:57,114 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.96 sec
2014-07-01 17:33:58,170 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.96 sec
MapReduce Total cumulative CPU time: 1 seconds 960 msec
Ended Job = job_1404206497656_0003
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://hadoop01.kt:8020/tmp/hive-hdfs/hive_2014-07-01_17-33-22_504_966970548620625440-1/-ext-10000
Loading data to table default.lzo2
Table default.lzo2 stats: [num_partitions: 0, num_files: 2, num_rows: 0, total_size: 171, raw_data_size: 0]
Job 0: Map: 1   Cumulative CPU: 1.96 sec   HDFS Read: 295 HDFS Write: 79 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 960 msec
Time taken: 36.625 seconds