
Hive installation and usage: problems and how to solve them

On Ubuntu:

(1) Install Java

Set JAVA_HOME

Set it in /etc/profile:

JAVA_HOME=/usr/

export JAVA_HOME

~/apache/trunk$ echo $JAVA_HOME

/usr/

(2)ssh

~$ ssh-keygen -t rsa -P ""

~$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys

ssh localhost

If ssh is not installed, install it first:

~$ sudo apt-get install openssh-server

~$ ssh localhost

(3) Hadoop configuration: hadoop-env.sh, mapred-site.xml, hdfs-site.xml, core-site.xml

With the official release you need to add the following to hadoop-env.sh:

export JAVA_HOME=/usr/

~/hadoop-0.20.2-cdh3u0/conf$ cat core-site.xml

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>

  <name>fs.default.name</name>

  <value>hdfs://localhost:54310</value>

  <description>The name of the default file system. A URI whose

  scheme and authority determine the FileSystem implementation. The

  uri's scheme determines the config property (fs.SCHEME.impl) naming

  the FileSystem implementation class. The uri's authority is used to

  determine the host, port, etc. for a filesystem.</description>

</property>

<property>

  <name>hadoop.tmp.dir</name>

  <value>/home/tianzhao/hadoop-0.20.2-cdh3u0/tmp</value>

</property>

</configuration>

~/hadoop-0.20.2-cdh3u0/conf$ cat hdfs-site.xml

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>

<name>dfs.name.dir</name>

<value>/home/tianzhao/hadoop-0.20.2-cdh3u0/hdfs/name</value>

</property>

<property>

<name>dfs.data.dir</name>

<value>/home/tianzhao/hadoop-0.20.2-cdh3u0/hdfs/data</value>

</property>

</configuration>

~/hadoop-0.20.2-cdh3u0/conf$ cat mapred-site.xml

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>

<name>mapred.job.tracker</name>

<value>localhost:54311</value>

<description>The host and port that the MapReduce job tracker runs

at. If "local", then jobs are run in-process as a single map

and reduce task.

</description>

</property>

</configuration>

bin/hadoop namenode -format

bin/start-all.sh

bin/stop-all.sh
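To confirm the daemons came up after bin/start-all.sh, the JDK's jps tool can be used (a verification step, not part of the original notes); it should list NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker:

~$ jps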

(3.1) HBase configuration: hbase-site.xml

<configuration>

  <property>

    <name>hbase.rootdir</name>

    <value>hdfs://localhost:54310/hbase</value>

  </property>

<property> 

  <name>hbase.cluster.distributed</name> 

  <value>true</value> 

</property> 

  <property>

    <name>dfs.replication</name>

    <value>1</value>

  </property>

</configuration>
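HBase is started with its own script once Hadoop is running (standard HBase command; the install directory below is only a placeholder):

~/hbase$ bin/start-hbase.sh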

// master

http://localhost:60010

// region server

http://localhost:60030

MySQL 4 is not supported; MySQL 5 or later is required (a foreign key issue).
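A quick way to confirm that a 5.x server will be (or has been) installed; these checks are a sketch, not part of the original notes:

~$ apt-cache policy mysql-server
~$ mysql -V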

(4) Install MySQL

~/hive-0.7.0-cdh3u0$ sudo apt-get install mysql-server

root user password: tianzhao1234

~/hive-0.7.0-cdh3u0$ mysql -uroot -ptianzhao1234

mysql> create database hive;

Query OK, 1 row affected (0.00 sec)

mysql> create user 'hive'@'localhost' identified by '123456';

Query OK, 0 rows affected (0.00 sec)

mysql> grant all privileges on hive.* to 'hive'@'localhost' with grant option;

Query OK, 0 rows affected (0.00 sec)

flush privileges;

sudo /etc/init.d/mysql restart

http://www.pantz.org/software/mysql/mysqlcommands.html

Using the mysqldump command line

Command format:

mysqldump -u<username> -p <database> > <database>.sql

Example:

mysqldump -uroot -p abc > abc.sql

(exports database abc to the file abc.sql)

When prompted for a password, enter the password of that database user.

Using the mysql command line

Command format:

mysql -u<username> -p <database> < <database>.sql

Example:

mysql -uabc_f -p abc < abc.sql

(imports the file abc.sql into database abc)

When prompted for a password, enter the password of that database user.

drop database hive;

show databases;

(5) Hive: hive-config.sh and hive-site.xml

1. Edit bin/hive-config.sh:

export HADOOP_HOME=/home/tianzhao/hadoop-0.20.2-cdh3u0/

2. Edit conf/hive-site.xml:

~/hive-0.7.0-cdh3u0/conf$ cat hive-site.xml

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>

  <name>hive.metastore.local</name>

  <value>true</value>

</property>

<property>

        <name>javax.jdo.option.ConnectionURL</name>

        <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>

        <description>JDBC connect string for a JDBC metastore</description>

</property>

<property>

        <name>javax.jdo.option.ConnectionDriverName</name>

        <value>com.mysql.jdbc.Driver</value>

        <description>Driver class name for a JDBC metastore</description>

</property>

<property>

        <name>javax.jdo.option.ConnectionUserName</name>

        <value>hive</value>

        <description>username to use against metastore database</description>

</property>

<property>

        <name>javax.jdo.option.ConnectionPassword</name>

        <value>123456</value>

        <description>password to use against metastore database</description>

</property>

</configuration>
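Note that com.mysql.jdbc.Driver is not bundled with Hive: the MySQL JDBC connector jar has to be downloaded separately and placed on Hive's classpath (the command below is a sketch; the exact jar name depends on the connector version):

~$ cp mysql-connector-java-*.jar /home/tianzhao/hive-0.7.0-cdh3u0/lib/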

To be modified:

export HADOOP_HOME=/home/tianzhao/hadoop-0.19.2

export HADOOP_CONF_DIR=$HOME/config

echo $HOME 

/home/tianzhao

export HIVE_HOME=/home/tianzhao/hive

export HIVE_CONF_DIR=/home/tianzhao/hive-config

export HIVE_LIB=${HIVE_HOME}/lib

export HIVE_AUX_JARS_PATH=/home/tianzhao/jar

If HiveQL statements such as show tables or alter table become very slow in the Hive CLI, the cause is usually MySQL, because a Hive operation has several parts: table metadata operations go through the connected database, directory operations act on directories in HDFS, and jobs are submitted to the JobTracker.

show tables mainly reads the table information from MySQL and then displays it. If it becomes very slow, Hive is fetching the table information from MySQL slowly; you can restart MySQL or move it to another machine.

http://forge.mysql.com/wiki/MySQL_Internals_Optimizer#The_Optimizer
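When show tables hangs, it also helps to look at what the metastore queries are doing inside MySQL. The diagnostic commands below are a sketch, not part of the original notes (credentials taken from the setup above):

~$ mysql -uhive -p123456 -e "SHOW FULL PROCESSLIST;"
~$ mysql -uhive -p123456 -e "SHOW GLOBAL STATUS LIKE 'Slow_queries';"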

hive> show tables;

FAILED: Error in metadata: javax.jdo.JDODataStoreException: Exception thrown obtaining schema column information from datastore

NestedThrowables:

java.sql.SQLException: Got error 28 from storage engine

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

java.sql.SQLException: Got error 28 from storage engine (this is caused by the disk being full)
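Error 28 comes from the operating system (no space left on the device MySQL writes to). A quick check, assuming the default Ubuntu locations for the MySQL data directory and tmpdir:

~$ df -h /var/lib/mysql /tmp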

http://bugs.mysql.com/bug.php?id=3611

This issue still exists in MySQL server version 5.1.41.

I did a sub-query in the FROM clause and used a WHERE clause to filter results.

The following query works fine without WHERE clause. If you add WHERE clause you will

receive an error code: "1267 Illegal mix of collations (latin1_swedish_ci, IMPLICIT) and

(utf8_general_ci, COERCIBLE) for operation '='"

This example is taken from the book I was reading:

SELECT PLAYERNO, NAME, PLAYERS.TOWN, NUMBER * 1000

FROM PLAYERS,

(SELECT 'Stratford' AS TOWN, 4 AS NUMBER

UNION

SELECT 'Plymouth', 6

UNION

SELECT 'Inglewood', 1

UNION

SELECT 'Douglas', 2) AS TOWNS

WHERE PLAYERS.TOWN = TOWNS.TOWN

ORDER BY PLAYERNO

A dirty value got into the partition field, which then caused the following error.

Failed with exception javax.jdo.JDODataStoreException: Error executing JDOQL query "SELECT 'org.apache.hadoop.hive.metastore.model.MPartition' AS NUCLEUS_TYPE,`THIS`.`CREATE_TIME`,`THIS`.`LAST_ACCESS_TIME`,`THIS`.`PART_NAME`,`THIS`.`PART_ID` FROM `PARTITIONS` `THIS` LEFT OUTER JOIN `TBLS` `THIS_TABLE_TABLE_NAME` ON `THIS`.`TBL_ID` = `THIS_TABLE_TABLE_NAME`.`TBL_ID` LEFT OUTER JOIN `TBLS` `THIS_TABLE_DATABASE` ON `THIS`.`TBL_ID` = `THIS_TABLE_DATABASE`.`TBL_ID` LEFT OUTER JOIN `DBS` `THIS_TABLE_DATABASE_DATABASE_NAME` ON `THIS_TABLE_DATABASE`.`DB_ID` = `THIS_TABLE_DATABASE_DATABASE_NAME`.`DB_ID` WHERE `THIS_TABLE_TABLE_NAME`.`TBL_NAME` = ? AND `THIS_TABLE_DATABASE_DATABASE_NAME`.`NAME` = ? AND `THIS`.`PART_NAME` = ?" : Illegal mix of collations (latin1_bin,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='.

NestedThrowables:

java.sql.SQLException: Illegal mix of collations (latin1_bin,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
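The comparison fails because the metastore columns in the query (such as PART_NAME) use latin1_bin while the dirty partition value is sent as utf8. The commands below are only a diagnostic sketch (the metastore database name hive comes from the setup above):

~$ mysql -uhive -p123456 hive
mysql> SHOW FULL COLUMNS FROM PARTITIONS LIKE 'PART_NAME';
mysql> SHOW VARIABLES LIKE 'collation%';

Since the root cause here is a dirty non-ASCII value in the partition column, cleaning that value out of the source data is safer than changing the metastore collation.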

Failed with exception javax.jdo.JDODataStoreException: Add request failed : INSERT INTO `COLUMNS` (`SD_ID`,`COMMENT`,`COLUMN_NAME`,`TYPE_NAME`,`INTEGER_IDX`) VALUES (?,?,?,?,?)

NestedThrowables:

java.sql.BatchUpdateException: The table 'columns' is full

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask

Error message:

Caused by: java.sql.SQLException: The table 'xxx' is full

        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1073)

        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3597)

        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3529)

Solution:

See the following links:

1. http://forums.dcm4che.org/jiveforums/message.jspa?messageID=2751

2. http://jeremy.zawodny.com/blog/archives/000796.html

3. http://dev.mysql.com/doc/refman/5.0/en/full-table.html
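According to the MySQL documentation linked above, "The table 'xxx' is full" usually means either the filesystem holding the MySQL data (or tmpdir) has run out of space, or the table has hit its configured size limit. The checks below are a sketch; the paths are Ubuntu defaults and the database name hive comes from the setup above:

~$ df -h /var/lib/mysql /tmp
mysql> -- compare Data_length with Max_data_length for the metastore table:
mysql> SHOW TABLE STATUS FROM hive LIKE 'COLUMNS'\G
mysql> -- for a MyISAM table that has hit its cap, the limit can be raised, e.g.:
mysql> ALTER TABLE COLUMNS MAX_ROWS=1000000000 AVG_ROW_LENGTH=100;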