
Hive Extended Features (8): Table Indexes

Software environment:

Linux: CentOS 6.7
Hadoop version: 2.6.5
ZooKeeper version: 3.4.8
           

Host configuration:

There are three hosts, m1, m2, and m3; the user name on each host is centos.

192.168.179.201: m1 
192.168.179.202: m2 
192.168.179.203: m3 

m1: Zookeeper, Namenode, DataNode, ResourceManager, NodeManager, Master, Worker
m2: Zookeeper, Namenode, DataNode, ResourceManager, NodeManager, Worker
m3: Zookeeper, DataNode, NodeManager, Worker
           

References

Official documentation:
    https://cwiki.apache.org/confluence/display/Hive/IndexDev
    https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Create/Drop/AlterIndex
           

I. Edit the hive-site.xml file

<property>
    <name>hive.optimize.index.filter</name>
    <value>true</value>
</property>
<property>
    <name>hive.optimize.index.groupby</name>
    <value>true</value>
</property>
<property>
    <name>hive.index.compact.file.ignore.hdfs</name>
    <value>true</value>
</property>
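
The same three properties can also be tried out for a single session before they are written to hive-site.xml. A minimal sketch, assuming a Hive release that still includes index support (indexing was removed in Hive 3.0); the comments summarize each property as described in the Hive configuration documentation:

```sql
-- Allow the optimizer to use indexes automatically when filtering (default: false)
set hive.optimize.index.filter=true;
-- Allow group-by queries to be optimized with aggregate indexes (default: false)
set hive.optimize.index.groupby=true;
-- Ignore the HDFS location stored in the index file, so the index stays
-- usable if the data is moved or the cluster name changes (default: false)
set hive.index.compact.file.ignore.hdfs=true;
```

Settings made with `set` last only for the current session; the hive-site.xml entries above make them permanent.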
           

II. Create Hive table indexes

Official documentation:
    https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Indexing
           
1. Create/build, show, and drop an index:
create index table01_index on table table01 (column2) as 'compact';
show index on table01;
drop index table01_index on table01;
           
2. Create then build, show formatted (with column names), and drop an index:
create index table02_index on table table02 (column3) as 'compact' with deferred rebuild;
alter index table02_index on table02 rebuild;
show formatted index on table02;
drop index table02_index on table02;
           
3. Create a bitmap index, build, show, and drop it:
create index table03_index on table table03 (column4) as 'bitmap' with deferred rebuild;
alter index table03_index on table03 rebuild;
show formatted index on table03;
drop index table03_index on table03;
           
4. Create an index in a new table:
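A sketch modeled on the Hive LanguageManual Indexing page. The names table04, column5, table04_index, and table04_index_table are placeholders, and the `in table` clause names the table that holds the index data:

```sql
-- Store the index data in an explicitly named index table
create index table04_index on table table04 (column5) as 'compact' with deferred rebuild in table table04_index_table;
-- A deferred index is empty until it is rebuilt
alter index table04_index on table04 rebuild;
```
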
5. Create an index stored as RCFile:
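A sketch with placeholder names (table05, column6, table05_index). `with deferred rebuild` is included as an assumption that the Hive release in use rejects index creation without it:

```sql
-- Keep the index table in RCFile format
create index table05_index on table table05 (column6) as 'compact' with deferred rebuild stored as rcfile;
```
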
6. Create an index stored as TextFile:
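A sketch with placeholder names (table06, column7, table06_index); the tab delimiter is only an example choice of row format:

```sql
-- Keep the index table as tab-delimited text
create index table06_index on table table06 (column7) as 'compact' with deferred rebuild row format delimited fields terminated by '\t' stored as textfile;
```
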
7. Create an index with index properties:
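A sketch in which the index and table names, as well as the property keys and values (prop1, value1, and so on), are all hypothetical. `idxproperties` attaches key/value pairs to the index itself:

```sql
-- Attach arbitrary key/value properties to the index
create index table07_index on table table07 (column8) as 'compact' with deferred rebuild idxproperties ("prop1"="value1", "prop2"="value2");
```
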
8. Create an index with table properties:
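A sketch with hypothetical names and property values. Unlike `idxproperties`, `tblproperties` applies to the table that stores the index data:

```sql
-- Attach key/value properties to the index table
create index table08_index on table table08 (column9) as 'compact' with deferred rebuild tblproperties ("prop3"="value3", "prop4"="value4");
```
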
9. Drop an index if it exists:
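A sketch with placeholder names (table09, table09_index). With `if exists`, the statement does not fail when the index is missing:

```sql
-- No error is raised if table09_index does not exist
drop index if exists table09_index on table09;
```
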
10. Rebuild the index for one partition:
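A sketch assuming a hypothetical table10 partitioned by columnx and columny; the partition values are placeholders as well:

```sql
-- Rebuild only the index data for the named partition
alter index table10_index on table10 partition (columnx='valueq', columny='valuer') rebuild;
```
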
