
Deploying Hue on CentOS 7

Contents

  • Preparation
  • Installing Hue's third-party dependencies
  • Downloading the Hue source
  • Building
  • Configuration
    • Basic configuration
    • Database configuration
    • Integrating Hue with Hadoop 3.1.0
      • Configuring the Hadoop cluster
      • Configuring Hue
    • Integrating Hue with Hive
    • Integrating Hue with Spark
      • Installing Livy
      • Configuring Hue
    • MySQL initialization
  • Starting Hue
  • Verification
    • Verifying the Spark integration
  • Problems encountered

Preparation

1. Install Python.

2. Install Maven.

3. Application services are usually run under a dedicated account, so create a hue user in a hadoop group:

groupadd hadoop
useradd -g hadoop hue
           

Installing Hue's third-party dependencies

yum -y install ant asciidoc cyrus-sasl-devel cyrus-sasl-gssapi gcc gcc-c++ krb5-devel libtidy libxml2-devel libxslt-devel openldap-devel python-devel sqlite-devel openssl-devel mysql-devel gmp-devel  
           

Downloading the Hue source

Option 1: download a release directly from the official site

http://gethue.com

Option 2: clone the branch-4.4 branch with git

git clone -b branch-4.4 https://github.com/cloudera/hue.git branch-4.4
           

This guide uses the git method; rename the clone directory:

mv branch-4.4 hue
           

Change the owner of the hue directory and everything in it to the hue user and the hadoop group, then switch to the hue user:

chown -R  hue:hadoop hue
su hue
           

Building

cd hue
make apps
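If the build succeeds, the virtualenv wrapper script used later for the database commands should now exist; listing its management commands is a quick smoke test (an assumption based on the standard from-source layout, not a step from the original post):

build/env/bin/hue help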
           

Configuration

All of the following settings go into /usr/local/hue/desktop/conf/pseudo-distributed.ini (the paths below assume the source tree was placed at /usr/local/hue).

Basic configuration

[desktop]
       # Secret key used to encrypt session data
       secret_key=dfsahjfhflsajdhfljahl
       # Time zone name
       time_zone=Asia/Shanghai
       # Enable or disable debug mode.
       django_debug_mode=false
       # Enable or disable backtrace for server error
       http_500_debug_mode=false
       # This should be the hadoop cluster admin
       ## default_hdfs_superuser=hdfs
       default_hdfs_superuser=root
       # Apps to disable
       # app_blacklist=impala,security,rdbms,jobsub,pig,hbase,sqoop,zookeeper,metastore,indexer
           

Database configuration

[[database]]
       # Database engine
       engine=mysql
       # Database host
       host=10.62.124.43
       # Database port
       port=3306
       # Database user
       user=root
       # Database password
       password=xhw888
       # Database name
       name=hue
           

Integrating Hue with Hadoop 3.1.0

Configuring the Hadoop cluster

Edit etc/hadoop/core-site.xml and add:

<!-- Hue proxy user. Start -->
    <property>
        <name>hadoop.proxyuser.hue.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hue.groups</name>
        <value>*</value>
    </property>
    <!-- Hue proxy user. End -->
           

After making this change, restart HDFS so that it takes effect.
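A minimal way to bounce HDFS with the stock Hadoop scripts (assuming $HADOOP_HOME points at the 3.1.0 installation used above):

$HADOOP_HOME/sbin/stop-dfs.sh
$HADOOP_HOME/sbin/start-dfs.sh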

Configuring Hue

Edit desktop/conf/pseudo-distributed.ini, find the [[hdfs_clusters]] section, and change it as follows:

[hadoop]

  # Configuration for HDFS NameNode
  # ------------------------------------------------------------------------
  [[hdfs_clusters]]
    # HA support by using HttpFs

    [[[default]]]
      # Enter the filesystem uri
      fs_defaultfs=hdfs://hadoopSvr1:8020

      # NameNode logical name.
      logical_name=hadoopSvr1

      # Use WebHdfs/HttpFs as the communication mechanism.
      # Domain should be the NameNode or HttpFs host.
      # Default port is 14000 for HttpFs.
      ## webhdfs_url=http://localhost:50070/webhdfs/v1
      webhdfs_url=http://hadoopSvr1:9870/webhdfs/v1

      # Change this if your HDFS cluster is Kerberos-secured
      ## security_enabled=false

      # In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
      # have to be verified against certificate authority
      ## ssl_cert_ca_verify=True

      # Directory of the Hadoop configuration
      ## hadoop_conf_dir=$HADOOP_CONF_DIR when set or '/etc/hadoop/conf'
      hadoop_conf_dir=$HADOOP_CONF_DIR
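Before moving on, you can confirm the webhdfs_url above is reachable by querying the WebHDFS REST API directly (hostname, port, and user taken from this walkthrough); a JSON listing of / indicates WebHDFS is up:

curl "http://hadoopSvr1:9870/webhdfs/v1/?op=LISTSTATUS&user.name=hue"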
           

Find the [[yarn_clusters]] section and change it as follows:

# Configuration for YARN (MR2)
  # ------------------------------------------------------------------------
  [[yarn_clusters]]

    [[[default]]]
      # Enter the host on which you are running the ResourceManager
      ## resourcemanager_host=localhost
      resourcemanager_host=hadoopSvr3

      # The port where the ResourceManager IPC listens on
      ## resourcemanager_port=8032

      # Whether to submit jobs to this cluster
      submit_to=True

      # Resource Manager logical name (required for HA)
      ## logical_name=

      # Change this if your YARN cluster is Kerberos-secured
      ## security_enabled=false

      # URL of the ResourceManager API
      ## resourcemanager_api_url=http://localhost:8088
      resourcemanager_api_url=http://hadoopSvr3:8088

      # URL of the ProxyServer API
      ## proxy_api_url=http://localhost:8088
      proxy_api_url=http://hadoopSvr3:8088

      # URL of the HistoryServer API
      ## history_server_api_url=http://localhost:19888
      history_server_api_url=http://hadoopSvr4:19888

      # URL of the Spark History Server
      ## spark_history_server_url=http://localhost:18088
      spark_history_server_url=http://hadoopSvr1:18080

      # Change this if your Spark History Server is Kerberos-secured
      ## spark_history_server_security_enabled=false

      # In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
      # have to be verified against certificate authority
      ## ssl_cert_ca_verify=True
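To sanity-check resourcemanager_api_url, the standard YARN REST API exposes cluster information at:

curl http://hadoopSvr3:8088/ws/v1/cluster/info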
           

Integrating Hue with Hive

Hue is used here mainly for interactive Hive queries. On the Hue host, create the directory /usr/local/hue/hive/conf and copy the HiveServer2 hive-site.xml into it.

Then edit desktop/conf/pseudo-distributed.ini, find the [beeswax] section, and change it as follows:

[beeswax]
    # HiveServer2 host (use the hostname; required for Kerberos)
    hive_server_host=hadoopSvr3
    # HiveServer2 port
    hive_server_port=10000
    # Directory containing the HiveServer2 hive-site.xml
    hive_conf_dir=/usr/local/hue/hive/conf
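It is worth checking that HiveServer2 accepts connections before restarting Hue; a minimal test with beeline (host and port from the config above, hue as an illustrative username):

beeline -u jdbc:hive2://hadoopSvr3:10000 -n hue -e "show databases;"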
           

Integrating Hue with Spark

Start Spark's Thrift Server. By default it listens on port 10000, which is what the sql_server_host/sql_server_port settings in the [spark] section below point at:

cd /usr/local/spark/sbin/
./start-thriftserver.sh --master yarn --deploy-mode client
           

Installing Livy

  1. Download the Livy package

    Download page: http://livy.incubator.apache.org/download/

  2. Unzip it

unzip apache-livy-0.6.0-incubating-bin.zip
mv apache-livy-0.6.0-incubating-bin livy-0.6.0

  3. Configure Livy

cd livy-0.6.0/conf/
           

Create livy-env.sh from the template:

cp livy-env.sh.template livy-env.sh
           

Create Livy's log and PID directories (commands shown after the block), and add the following to livy-env.sh:

export HADOOP_CONF_DIR=/usr/local/hadoop-3.1.0/etc/hadoop
export SPARK_HOME=/usr/local/spark
export LIVY_LOG_DIR=/data/livy/logs
export LIVY_PID_DIR=/data/livy/pid
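These directories must exist before Livy starts; create them as the user that will run Livy:

mkdir -p /data/livy/logs /data/livy/pid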
           

Create livy.conf from the template:

cp livy.conf.template livy.conf
           

Add the following to livy.conf:

# What port to start the server on.
livy.server.port = 8998
# What spark master Livy sessions should use.
livy.spark.master = yarn
# What spark deploy mode Livy sessions should use.
livy.spark.deploy-mode = client
           
  4. Start Livy
/usr/local/livy-0.6.0/bin/livy-server start
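Once started, Livy's REST API should respond on the configured port (hadoopSvr3 here, matching the livy_server_url used below); a fresh server returns an empty session list:

curl http://hadoopSvr3:8998/sessions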
           

Configuring Hue

Edit desktop/conf/pseudo-distributed.ini, find the [spark] section, and change it as follows:

###########################################################################
# Settings to configure the Spark application.
###########################################################################

[spark]
  # The Livy Server URL.
  ## livy_server_url=http://localhost:8998
  livy_server_url=http://hadoopSvr3:8998

  # Configure Livy to start in local 'process' mode, or 'yarn' workers.
  ## livy_server_session_kind=yarn
  livy_server_session_kind=yarn

  # Whether Livy requires client to perform Kerberos authentication.
  ## security_enabled=false

  # Whether Livy requires client to use csrf protection.
  ## csrf_enabled=false

  # Host of the Sql Server
  ## sql_server_host=localhost
  sql_server_host=hadoopSvr1

  # Port of the Sql Server
  ## sql_server_port=10000
  sql_server_port=10000

  # Choose whether Hue should validate certificates received from the server.
  ## ssl_cert_ca_verify=true

###########################################################################
           

MySQL initialization

Create a database named hue on the MySQL server:

# Log in to MySQL
mysql -u root -p
# Create the hue database
create database hue;
# Create a dedicated user
create user 'hue'@'%' identified by 'xhw888';
# Grant privileges
grant all privileges on hue.* to 'hue'@'%';
flush privileges;
           

Note that the [[database]] section above connects as root; to use the dedicated hue account created here instead, set user=hue and its password there. Then run the following from the Hue root directory:

build/env/bin/hue syncdb
build/env/bin/hue migrate
           

Once these finish, you can see in MySQL that Hue's tables have been created.
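For example, connecting with the hue account created above:

mysql -u hue -p hue -e "show tables;"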

Starting Hue
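A common way to start a from-source Hue build is the supervisor script; the Django development server is an alternative that listens on port 8000, matching the verification URL below (both commands are a sketch assuming you are in the Hue root directory):

# long-running server managed by supervisor
build/env/bin/supervisor
# or: development server bound to all interfaces on port 8000
build/env/bin/hue runserver 0.0.0.0:8000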

Stopping Hue

  • Normally, Ctrl + C in the terminal running Hue stops the service.
  • If Hue runs in the background, use kill:
ps -ef | grep hue | grep -v grep | awk '{print $2}' | xargs kill -9
           

Verification

Once the service is up, it listens on port 8000 by default:

http://10.62.124.44:8000
           

Verifying the Spark integration

Log in to Hue, open a Scala snippet in the notebook editor, and run the following Scala code:

var counter = 0
val data = Array(1, 2, 3, 4, 5)
var rdd = sc.parallelize(data)

// increment every element and collect the results back to the driver
rdd.map(x => x + 1).collect()
           

If you get results like the following, the integration works:

counter: Int = 0
data: Array[Int] = Array(1, 2, 3, 4, 5)
rdd: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize at <console>:27
res2: Array[Int] = Array(2, 3, 4, 5, 6)
           

hue管理賬号

使用者名:hue

密碼:xhw888

Problems encountered

  1. An error occurs during the MySQL initialization step

Fix

Log in to MySQL and change the account's password authentication plugin.
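The original error message is not shown; assuming it is the common authentication failure seen with MySQL 8, whose default caching_sha2_password plugin is rejected by older MySQL client libraries, switching the account back to mysql_native_password is the usual fix (account name and password from the initialization step above; apply the same to root if Hue connects as root):

mysql -u root -p -e "ALTER USER 'hue'@'%' IDENTIFIED WITH mysql_native_password BY 'xhw888'; FLUSH PRIVILEGES;"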
