I. Data warehouse layout
hadoop11 acts as the Hive client.
hadoop12 acts as the Hive server (metastore).
hadoop13 acts as the MySQL server.
II. Install MySQL on hadoop13
1. Download the MySQL repo package
wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm
2. Install the mysql-community-release-el7-5.noarch.rpm package
rpm -ivh mysql-community-release-el7-5.noarch.rpm
3. Install MySQL
yum -y install mysql-community-server
4. Start the service
Reload systemd so it picks up the new unit files:
systemctl daemon-reload
5. Start the service:
systemctl start mysqld
Enable it at boot:
systemctl enable mysqld
6. Allow login without a password
Check whether MySQL is running; if it is, stop it:
ps -ef | grep -i mysql
systemctl stop mysqld
Edit the MySQL configuration file my.cnf and add the following line under the [mysqld] section to skip password authentication:
[mysqld]
skip-grant-tables
Restart the MySQL service and log in:
systemctl start mysqld
mysql -u root
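The my.cnf edit above (and its later removal in step 8) can be scripted with sed; a minimal sketch, demonstrated on a temporary stand-in file so the real /etc/my.cnf is untouched:

```shell
# Demonstrate the edit on a temporary copy so the real /etc/my.cnf is untouched.
CNF=$(mktemp)
printf '[mysqld]\ndatadir=/var/lib/mysql\n' > "$CNF"   # stand-in for the real file

# Step 6: insert skip-grant-tables right after the [mysqld] section header.
sed -i '/^\[mysqld\]/a skip-grant-tables' "$CNF"
grep -n 'skip-grant-tables' "$CNF"   # → 2:skip-grant-tables

# Step 8 later removes the same line again before the final restart.
sed -i '/^skip-grant-tables$/d' "$CNF"
```

On the real server, point CNF at /etc/my.cnf instead and restart mysqld after each change.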
7. Change the password
use mysql;
update user set password=password("your-new-password") where user="root";
flush privileges;
(This works for the MySQL 5.6 shipped by this repo; on MySQL 5.7+ the column is authentication_string instead of password.)
8. Restart the MySQL service
Remove the skip-grant-tables line added earlier (and the [mysqld] header, if you added it yourself), then restart the service.
9. Allow remote connections
grant all privileges on *.* to 'root'@'%' identified by 'your-new-password' with grant option;
flush privileges;
10. Create the hive user and database, and grant privileges for connections from hadoop12.
create database hive;
grant all on *.* to 'hive'@'hadoop12' identified by 'zyl990708';
flush privileges;
exit
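The SQL from step 10 can also be collected into a script and fed to the mysql client in one shot; a sketch of that pattern (the file name init_hive.sql is an arbitrary choice, and it assumes the grant target is the hive user connecting from hadoop12, as above):

```shell
# Collect the metastore bootstrap SQL into one file;
# run it later with: mysql -u root -p < init_hive.sql
cat > init_hive.sql <<'EOF'
create database if not exists hive;
grant all on *.* to 'hive'@'hadoop12' identified by 'zyl990708';
flush privileges;
EOF
```

Using `if not exists` makes the script safe to re-run.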
III. Install and configure Hive (required on every machine)
1. Extract Hive 3.1.2
tar -zxvf apache-hive-3.1.2-bin.tar.gz -C /opt/moudle/
Rename the extracted directory to hive:
mv /opt/moudle/apache-hive-3.1.2-bin /opt/moudle/hive
Distribute the hive directory to the other machines:
xsync /opt/moudle/hive
2. Configure environment variables (e.g. in /etc/profile)
##HIVE_HOME
export HIVE_HOME=/opt/moudle/hive
export PATH=$PATH:$HIVE_HOME/bin
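After appending these lines, run `source /etc/profile` and confirm the shell picks them up; the check itself can be sketched like this:

```shell
# Simulate the profile snippet in the current shell and verify PATH picks it up.
export HIVE_HOME=/opt/moudle/hive
export PATH=$PATH:$HIVE_HOME/bin

case ":$PATH:" in
  *":$HIVE_HOME/bin:"*) echo "HIVE_HOME on PATH" ;;
  *) echo "PATH not updated" >&2 ;;
esac
```

On a fully installed node, `hive --version` is the definitive check.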
3. Resolve the log4j conflict with Hadoop
mv lib/log4j-slf4j-impl-2.10.0.jar lib/log4j-slf4j-impl-2.10.0.jar.bak
4. Rename the template files in the conf directory
cp hive-env.sh.template hive-env.sh
cp hive-default.xml.template hive-site.xml
5. In hive-env.sh, find HADOOP_HOME and set it to
HADOOP_HOME=/opt/moudle/hadoop
6. Copy the MySQL JDBC driver jar into Hive's lib directory
cp /opt/sofeware/mysql-connector-java-5.1.46-bin.jar ./lib/
7. Configure hive-site.xml on hadoop13
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://hadoop13:3306/hive?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>zyl990708</value>
<description>password to use against metastore database</description>
</property>
8. Configure hadoop11
Create a tmp directory under the hive directory.
hive.metastore.local (default true) tells Hive to use a local metastore service; set it to false and point hive.metastore.uris at the metastore on hadoop12:
<property>
<name>hive.metastore.local</name>
<value>false</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://hadoop12:9083</value>
</property>
<property>
<name>hive.querylog.location</name>
<value>/opt/moudle/hive/tmp/${system:user.name}</value>
<description>Location of Hive run time structured log file</description>
</property>
<property>
<name>hive.exec.local.scratchdir</name>
<value>/opt/moudle/hive/tmp/${user.name}</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/opt/moudle/hive/tmp/${hive.session.id}_resources</value>
<description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
<name>hive.server2.logging.operation.log.location</name>
<value>/opt/moudle/hive/tmp/${system:user.name}/operation_logs</value>
<description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>
9. Configure hadoop12
Before configuring, create a tmp directory under the hive directory, then add the following to hive-site.xml:
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://hadoop13:3306/hive?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>zyl990708</value>
<description>password to use against metastore database</description>
</property>
<property>
<name>datanucleus.schema.autoCreateAll</name>
<value>true</value>
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
<description>
Enforce metastore schema version consistency.
True: Verify that version information stored in is compatible with one from Hive jars. Also disable automatic
schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
proper metastore schema migration. (Default)
False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
</description>
</property>
<property>
<name>hive.metastore.event.db.notification.api.auth</name>
<value>false</value>
<description>
Should metastore do authorization against database notification related APIs such as get_next_notification.
If set to true, then only the superusers in proxy settings have the permission
</description>
</property>
<property>
<name>hive.querylog.location</name>
<value>/opt/moudle/hive/tmp/${system:user.name}</value>
<description>Location of Hive run time structured log file</description>
</property>
<property>
<name>hive.exec.local.scratchdir</name>
<value>/opt/moudle/hive/tmp/${user.name}</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/opt/moudle/hive/tmp/${hive.session.id}_resources</value>
<description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
<name>hive.server2.logging.operation.log.location</name>
<value>/opt/moudle/hive/tmp/${system:user.name}/operation_logs</value>
<description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>
Then copy Hadoop's guava jar into Hive's lib directory and delete Hive's own older guava jar (from the hive directory):
cp /opt/moudle/hadoop/share/hadoop/common/lib/guava-27.0-jre.jar lib/
rm -f lib/guava-19.0.jar
10. Suppress INFO messages
set hive.server2.logging.operation.level=NONE
11. Initialize the metastore schema (on hadoop12)
schematool -dbType mysql -initSchema
12. Start the metastore service on hadoop12
hive --service metastore &
The command will appear to hang; this is not an error. Press Enter to get the prompt back.
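To avoid the apparent hang and keep the metastore running after logout, it is common to background it with nohup. A sketch that generates a small helper script (the script name and log path are arbitrary choices, not from the original guide):

```shell
# Generate a helper script that starts the metastore in the background,
# logging to a file instead of tying up the terminal.
cat > start-metastore.sh <<'EOF'
#!/bin/bash
nohup hive --service metastore > /tmp/metastore.log 2>&1 &
echo "metastore started, pid $!"
EOF
chmod +x start-metastore.sh
```

Run ./start-metastore.sh on hadoop12 and check /tmp/metastore.log if clients on hadoop11 cannot connect.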
13. Start Hive on hadoop11
hive
Reference: National College Student Big Data Skills Competition