
Building a Data Warehouse

Contents:
I. Data Warehouse Layout
II. Installing MySQL on hadoop13
III. Installing and Configuring Hive (required on every machine)

I. Data Warehouse Layout

hadoop11 acts as the Hive client.

hadoop12 acts as the Hive server (it will run the metastore service).

hadoop13 hosts the MySQL server that backs the Hive metastore.

II. Installing MySQL on hadoop13

1. Download the MySQL repo package

wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm 
           

2. Install the mysql-community-release-el7-5.noarch.rpm package

rpm -ivh mysql-community-release-el7-5.noarch.rpm
           

3. Install MySQL

yum -y install mysql-community-server

(yum install mysql-server pulls in the same package, since the community package provides mysql-server.)
           

4. Start the service

First, reload the systemd configuration:

systemctl daemon-reload 
           

5. Start MySQL:

systemctl start mysqld
           

Enable it to start on boot:

systemctl enable mysqld
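
Optionally, confirm the service is active (a quick sanity check, not part of the original steps):

systemctl status mysqld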
           

6. Bypass password authentication

Check whether MySQL is running, and stop it if so:

ps -ef | grep -i mysql 
           
systemctl stop mysqld
           

Edit the MySQL configuration file my.cnf (typically /etc/my.cnf) and add the following lines to skip password verification:

[mysqld]
skip-grant-tables
           

Start the MySQL service and log in:

systemctl start mysqld
mysql -u root
           

7. Change the password

use mysql;
update user set password=password("your-new-password") where user="root";
flush privileges;
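
Note: this repo package typically installs MySQL 5.6, where the statement above works as written. On MySQL 5.7 the column was renamed to authentication_string, so the equivalent statement there would be:

update user set authentication_string=password("your-new-password") where user="root";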
           

8. Restart MySQL

Remove the two skip-grant-tables lines added to my.cnf earlier, then restart:
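
systemctl restart mysqld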


9. Allow remote connections

Log back in with the new password (mysql -u root -p) and run:

grant all privileges on *.* to 'root'@'%' with grant option;

flush privileges; 
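
Note: if the 'root'@'%' account does not already exist, MySQL 5.6 will usually create it here with an empty password; to set one in the same statement you can write grant all privileges on *.* to 'root'@'%' identified by 'your-new-password' with grant option; instead.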
           

10. Create the hive database and user, and grant access from hadoop12.

create database hive;
grant all on *.* to 'hive'@'hadoop12' identified by 'zyl990708';
flush privileges;
exit
           

III. Installing and Configuring Hive (required on every machine)

1. Unpack Hive 3.1.2

tar -zxvf apache-hive-3.1.2-bin.tar.gz -C /opt/moudle/
           

Rename the extracted directory to hive (the tarball unpacks to apache-hive-3.1.2-bin):
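
mv /opt/moudle/apache-hive-3.1.2-bin /opt/moudle/hive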

Distribute the hive directory to the other machines:

xsync /opt/moudle/hive
           

2. Configure environment variables

##HIVE_HOME
export HIVE_HOME=/opt/moudle/hive
export PATH=$PATH:$HIVE_HOME/bin
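
Assuming the lines above were appended to /etc/profile (as is typical in this kind of setup), reload it so they take effect:

source /etc/profile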
           

3. Resolve the log4j/SLF4J binding conflict

Hive bundles its own SLF4J binding, which clashes with Hadoop's; rename it out of the way (run from the hive directory):

mv lib/log4j-slf4j-impl-2.10.0.jar lib/log4j-slf4j-impl-2.10.0.jar.bak
           

4. Create the config files from the templates (run inside the conf directory)

cp hive-env.sh.template hive-env.sh
cp hive-default.xml.template hive-site.xml
           

5. In hive-env.sh, set HADOOP_HOME:

HADOOP_HOME=/opt/moudle/hadoop
           

6. Copy the MySQL JDBC driver into Hive's lib directory

cp /opt/sofeware/mysql-connector-java-5.1.46-bin.jar ./lib/
           

7. On hadoop13, add the following to hive-site.xml (these metastore JDBC properties belong in hive-site.xml, not core-site.xml):

<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hadoop13:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
</property>

<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
</property>

<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
</property>

<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>zyl990708</value>
    <description>password to use against metastore database</description>
</property>

8. Configure hadoop11

First create a tmp directory under the hive directory:
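
mkdir -p /opt/moudle/hive/tmp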

Then add the following to hive-site.xml. hive.metastore.local controls whether Hive connects through a local metastore (default true); it is set to false here so the client uses the remote metastore on hadoop12:

<property>
    <name>hive.metastore.local</name>
    <value>false</value>
</property>

<property>
    <name>hive.metastore.uris</name>
    <value>thrift://hadoop12:9083</value>
</property>

<property>
    <name>hive.querylog.location</name>
    <value>/opt/moudle/hive/tmp/${system:user.name}</value>
    <description>Location of Hive run time structured log file</description>
</property>

<property>
    <name>hive.exec.local.scratchdir</name>
    <value>/opt/moudle/hive/tmp/${user.name}</value>
    <description>Local scratch space for Hive jobs</description>
</property>

<property>
    <name>hive.downloaded.resources.dir</name>
    <value>/opt/moudle/hive/tmp/${hive.session.id}_resources</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
</property>

<property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>/opt/moudle/hive/tmp/${system:user.name}/operation_logs</value>
    <description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>
           

9. Configure hadoop12

As in step 8, create a tmp directory under the hive directory first, then add the following to hive-site.xml:

<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hadoop13:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
</property>

<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
</property>

<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
</property>

<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>zyl990708</value>
    <description>password to use against metastore database</description>
</property>

<property>
    <name>datanucleus.schema.autoCreateAll</name>
    <value>true</value>
</property>

<property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
    <description>
      Enforce metastore schema version consistency.
      True: Verify that version information stored in the metastore is compatible with the version from the Hive jars. Also disable automatic
            schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
            proper metastore schema migration. (Default)
      False: Warn if the version information stored in the metastore doesn't match the version from the Hive jars.
    </description>
</property>

<property>
    <name>hive.metastore.event.db.notification.api.auth</name>
    <value>false</value>
    <description>
      Should metastore do authorization against database notification related APIs such as get_next_notification.
      If set to true, then only the superusers in proxy settings have the permission
    </description>
</property>

<property>
    <name>hive.querylog.location</name>
    <value>/opt/moudle/hive/tmp/${system:user.name}</value>
    <description>Location of Hive run time structured log file</description>
</property>

<property>
    <name>hive.exec.local.scratchdir</name>
    <value>/opt/moudle/hive/tmp/${user.name}</value>
    <description>Local scratch space for Hive jobs</description>
</property>

<property>
    <name>hive.downloaded.resources.dir</name>
    <value>/opt/moudle/hive/tmp/${hive.session.id}_resources</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
</property>

<property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>/opt/moudle/hive/tmp/${system:user.name}/operation_logs</value>
    <description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>

10. Replace Hive's guava jar with Hadoop's

Copy Hadoop's guava jar into Hive's lib directory and delete Hive's older one (run from the hive directory):

cp /opt/moudle/hadoop/share/hadoop/common/lib/guava-27.0-jre.jar lib/
rm -f lib/guava-19.0.jar
           

11. Suppress INFO messages

From within the Hive CLI:

set hive.server2.logging.operation.level=NONE;
           

12. Initialize the metastore schema (on hadoop12)

schematool -dbType mysql -initSchema
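
As an optional check (not part of the original steps), schematool can report the schema it just created:

schematool -dbType mysql -info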
           

13. Start the metastore service on hadoop12

hive --service metastore &
           

The command may appear to hang; this is not an error. Press Enter to get the shell prompt back.
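
To keep the metastore running after the shell exits, a common alternative (not from the original guide; the log path is just an example) is:

nohup hive --service metastore > /opt/moudle/hive/metastore.log 2>&1 &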


14. Start Hive on hadoop11

hive
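
As a quick smoke test, you can run the following at the hive> prompt; if the remote metastore connection works, it should list at least the default database:

show databases;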
           

Reference: the National College Student Big Data Skills Competition (全国大学生大数据技能竞赛)