On Ubuntu:
(1) Install Java
Set JAVA_HOME in /etc/profile:
JAVA_HOME=/usr/
export JAVA_HOME
tianzhao@ubuntu:~/apache/trunk$ echo $JAVA_HOME
/usr/
(2) Passwordless SSH
tianzhao@ubuntu:~$ ssh-keygen -t rsa -P ""
tianzhao@ubuntu:~$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
ssh localhost
If ssh is not installed yet, install it first:
tianzhao@ubuntu:~$ sudo apt-get install openssh-server
tianzhao@ubuntu:~$ ssh localhost
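The key-setup steps above can be sketched as one idempotent script (a sketch assuming `ssh-keygen` from openssh-client is available; the `SSH_DIR` variable is an illustrative convenience, normally this is just `$HOME/.ssh`):

```shell
# Sketch of the passwordless-SSH setup above; idempotent, safe to re-run.
# SSH_DIR is configurable here for illustration; normally it is $HOME/.ssh.
SSH_DIR="${SSH_DIR:-$HOME/.ssh}"
mkdir -p "$SSH_DIR" && chmod 700 "$SSH_DIR"
# Generate an empty-passphrase RSA key only if one does not exist yet.
[ -f "$SSH_DIR/id_rsa" ] || ssh-keygen -t rsa -P "" -f "$SSH_DIR/id_rsa" -q
# Authorize the key for logins to this machine, skipping duplicate entries.
grep -qxF "$(cat "$SSH_DIR/id_rsa.pub")" "$SSH_DIR/authorized_keys" 2>/dev/null \
  || cat "$SSH_DIR/id_rsa.pub" >> "$SSH_DIR/authorized_keys"
chmod 600 "$SSH_DIR/authorized_keys"
```

After this, `ssh localhost` should log in without prompting for a password.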
(3) Hadoop configuration: hadoop-env.sh, mapred-site.xml, hdfs-site.xml, core-site.xml
In the official release you need to add to hadoop-env.sh:
export JAVA_HOME=/usr/
tianzhao@ubuntu:~/hadoop-0.20.2-cdh3u0/conf$ cat core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/tianzhao/hadoop-0.20.2-cdh3u0/tmp</value>
</property>
</configuration>
tianzhao@ubuntu:~/hadoop-0.20.2-cdh3u0/conf$ cat hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/home/tianzhao/hadoop-0.20.2-cdh3u0/hdfs/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/tianzhao/hadoop-0.20.2-cdh3u0/hdfs/data</value>
</property>
</configuration>
tianzhao@ubuntu:~/hadoop-0.20.2-cdh3u0/conf$ cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
</configuration>
bin/hadoop namenode -format
bin/start-all.sh
bin/stop-all.sh
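Re-running `namenode -format` against an already-formatted name directory wipes the HDFS metadata, so it is worth guarding the format step. A hypothetical sketch (`safe_format` is illustrative, not part of Hadoop; the `mkdir` stands in for what `-format` actually creates):

```shell
# Hypothetical guard around `bin/hadoop namenode -format`: skip the format
# if the dfs.name.dir already holds a formatted image (its current/ directory).
safe_format() {
  name_dir="$1"
  if [ -d "$name_dir/current" ]; then
    echo "already formatted, skipping"
    return 1
  fi
  # bin/hadoop namenode -format        # run this line on a real installation
  mkdir -p "$name_dir/current"         # stand-in for what -format creates
  echo "formatted"
}

# Example: safe_format /home/tianzhao/hadoop-0.20.2-cdh3u0/hdfs/name
```

The first call formats; any later call on the same directory refuses and returns non-zero.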
(3.1) HBase: hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:54310/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
// master
http://localhost:60010
// region server
http://localhost:60030
Note: MySQL 4 is not supported; MySQL 5 or later is required (foreign-key issue).
(4) MySQL installation
tianzhao@ubuntu:~/hive-0.7.0-cdh3u0$ sudo apt-get install mysql-server
root user password: tianzhao1234
tianzhao@ubuntu:~/hive-0.7.0-cdh3u0$ mysql -uroot -ptianzhao1234
mysql> create database hive;
Query OK, 1 row affected (0.00 sec)
mysql> create user 'hive'@'localhost' identified by '123456';
Query OK, 0 rows affected (0.00 sec)
mysql> grant all privileges on hive.* to 'hive'@'localhost' with grant option;
Query OK, 0 rows affected (0.00 sec)
flush privileges;
sudo /etc/init.d/mysql restart
http://www.pantz.org/software/mysql/mysqlcommands.html
Using the mysqldump command line
Command format:
mysqldump -u<username> -p <dbname> > <dbname>.sql
Example:
mysqldump -uroot -p abc > abc.sql
(exports database abc to the file abc.sql)
When prompted, enter the password of that database user.
Using the mysql command line
Command format:
mysql -u<username> -p <dbname> < <dbname>.sql
Example:
mysql -uabc_f -p abc < abc.sql
(imports database abc from the file abc.sql)
When prompted, enter the password of that database user.
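For routine backups it can help to build the command string first and inspect it before running. A small sketch (`backup_cmd` is a hypothetical helper, not a MySQL tool) that prints a date-stamped, gzip-compressed variant of the mysqldump invocation above:

```shell
# Hypothetical helper: print (not run) a mysqldump backup command whose
# output file name is date-stamped and gzip-compressed.
backup_cmd() {
  db="$1"
  user="$2"
  stamp=$(date +%Y%m%d)
  echo "mysqldump -u${user} -p ${db} | gzip > ${db}-${stamp}.sql.gz"
}

backup_cmd hive root
```

Pipe the printed command through `sh` (or paste it into a terminal) once it looks right; the `-p` still prompts for the password interactively.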
drop database hive;
show databases;
(5) Hive: hive-config.sh, hive-site.xml
1. Edit bin/hive-config.sh:
export HADOOP_HOME=/home/tianzhao/hadoop-0.20.2-cdh3u0/
2. Edit conf/hive-site.xml:
tianzhao@ubuntu:~/hive-0.7.0-cdh3u0/conf$ cat hive-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>hive.metastore.local</name>
<value>true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
<description>JDBC connect string FOR a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name FOR a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
<description>password to use against metastore database</description>
</property>
</configuration>
To be modified:
export HADOOP_HOME=/home/tianzhao/hadoop-0.19.2
export HADOOP_CONF_DIR=$HOME/config
echo $HOME
/home/tianzhao
export HIVE_HOME=/home/tianzhao/hive
export HIVE_CONF_DIR=/home/tianzhao/hive-config
export HIVE_LIB=${HIVE_HOME}/lib
export HIVE_AUX_JARS_PATH=/home/tianzhao/jar
When show tables, alter table, and other HiveQL statements become very slow in the hive CLI, the cause is usually MySQL, because Hive operations fall into several parts: table metadata operations go through the connected database, directory operations act on directories in HDFS, and jobs are submitted to the JobTracker.
show tables mainly reads table information from MySQL and displays it. If it becomes very slow, Hive is fetching table information from MySQL slowly; you can restart MySQL or migrate it to another machine.
http://forge.mysql.com/wiki/MySQL_Internals_Optimizer#The_Optimizer
hive> show tables;
FAILED: Error in metadata: javax.jdo.JDODataStoreException: Exception thrown obtaining schema column information from datastore
NestedThrowables:
java.sql.SQLException: Got error 28 from storage engine
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
java.sql.SQLException: Got error 28 from storage engine. This is caused by the disk being full.
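Error 28 is ENOSPC, so the quickest confirmation is checking free space on the filesystem holding the MySQL data directory. A sketch (the `/var/lib/mysql` default is an assumption matching the Debian/Ubuntu packages):

```shell
# Check free space where MySQL stores its data; "Got error 28" means ENOSPC.
# /var/lib/mysql is the Debian/Ubuntu default datadir (an assumption here).
DATADIR="${DATADIR:-/var/lib/mysql}"
if [ -d "$DATADIR" ]; then
  df -Pk "$DATADIR"
else
  df -Pk /          # fall back to the root filesystem
fi
```

Freeing space on that filesystem (or on the one holding tmpdir/binlogs) and restarting mysqld usually clears the error.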
http://bugs.mysql.com/bug.php?id=3611
This issue still exist in MySQL server version 5.1.41
I did a sub-query in FROM clause and used WHERE clause to filter results.
The following query works fine without WHERE clause. If you add WHERE clause you will
receive an error code: "1267 Illegal mix of collations (latin1_swedish_ci, IMPLICIT) and
(utf8_general_ci, COERCIBLE) for operation '='"
This example is taken from the book I was reading:
SELECT PLAYERNO, NAME, PLAYERS.TOWN, NUMBER * 1000
FROM PLAYERS,
(SELECT 'Stratford' AS TOWN, 4 AS NUMBER
UNION
SELECT 'Plymouth', 6
UNION
SELECT 'Inglewood', 1
UNION
SELECT 'Douglas', 2) AS TOWNS
WHERE PLAYERS.TOWN = TOWNS.TOWN
ORDER BY PLAYERNO
Dirty data produced bad partition field values, which then caused an error:
Failed with exception javax.jdo.JDODataStoreException: Error executing JDOQL query "SELECT 'org.apache.hadoop.hive.metastore.model.MPartition' AS NUCLEUS_TYPE,`THIS`.`CREATE_TIME`,`THIS`.`LAST_ACCESS_TIME`,`THIS`.`PART_NAME`,`THIS`.`PART_ID` FROM `PARTITIONS` `THIS` LEFT OUTER JOIN `TBLS` `THIS_TABLE_TABLE_NAME` ON `THIS`.`TBL_ID` = `THIS_TABLE_TABLE_NAME`.`TBL_ID` LEFT OUTER JOIN `TBLS` `THIS_TABLE_DATABASE` ON `THIS`.`TBL_ID` = `THIS_TABLE_DATABASE`.`TBL_ID` LEFT OUTER JOIN `DBS` `THIS_TABLE_DATABASE_DATABASE_NAME` ON `THIS_TABLE_DATABASE`.`DB_ID` = `THIS_TABLE_DATABASE_DATABASE_NAME`.`DB_ID` WHERE `THIS_TABLE_TABLE_NAME`.`TBL_NAME` = ? AND `THIS_TABLE_DATABASE_DATABASE_NAME`.`NAME` = ? AND `THIS`.`PART_NAME` = ?" : Illegal mix of collations (latin1_bin,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='.
NestedThrowables:
java.sql.SQLException: Illegal mix of collations (latin1_bin,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
Failed with exception javax.jdo.JDODataStoreException: Add request failed : INSERT INTO `COLUMNS` (`SD_ID`,`COMMENT`,`COLUMN_NAME`,`TYPE_NAME`,`INTEGER_IDX`) VALUES (?,?,?,?,?)
NestedThrowables:
java.sql.BatchUpdateException: The table 'columns' is full
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
Error message:
Caused by: java.sql.SQLException: The table 'xxx' is full
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1073)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3597)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3529)
Solution:
See the following links:
1. http://forums.dcm4che.org/jiveforums/message.jspa?messageID=2751
2. http://jeremy.zawodny.com/blog/archives/000796.html
3. http://dev.mysql.com/doc/refman/5.0/en/full-table.html