
A First Look at Zeppelin: Installation and Hive on Zeppelin

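Introduction

  Zeppelin is a web-based notebook for interactive data analysis and visualization. Its backend can plug into multiple data processing engines such as Spark and Hive, and it supports several languages: Scala (Apache Spark), Python (Apache Spark), SparkSQL, Hive, Markdown, Shell, and others. This article covers installing Zeppelin and running Hive queries from it through the JDBC interpreter.

  • Official site: http://zeppelin.apache.org/
  • Zeppelin download: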

    wget https://mirror.bit.edu.cn/apache/zeppelin/zeppelin-0.8.2/zeppelin-0.8.2-bin-all.tgz

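  • Extract and install Zeppelin
# Extract the archive
[hadoop@hadoop001 software]$ tar -zxvf zeppelin-0.8.2-bin-all.tgz -C ~/app/
# Configuration files
[hadoop@hadoop001 ~]$ cd app/zeppelin-0.8.2-bin-all/conf

# If zeppelin-site.xml does not exist yet, copy it from zeppelin-site.xml.template first
[hadoop@hadoop001 conf]$ vi zeppelin-site.xml 
# Change these two properties; leave the rest at their defaults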
<property>
  <name>zeppelin.server.addr</name>
  <value>hadoop001</value>  <!-- your host's IP address, or 0.0.0.0 -->
  <description>Server binding address</description>
</property>

<property>
  <name>zeppelin.server.port</name>
  <value>8084</value>  <!-- default is 8080; make sure the chosen port is not already in use -->
  <description>Server port.</description>
</property>

[hadoop@hadoop001 conf]$ vi zeppelin-env.sh
export JAVA_HOME=/root/apps/jdk1.8.0_221
export SPARK_HOME=/home/hadoop/app/spark-2.4.4-bin-2.6.0-cdh5.15.1
export SPARK_APP_NAME="ZeppelinAaron"
export HADOOP_CONF_DIR=/root/apps/hadoop/etc/hadoop
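
With the port now set in zeppelin-site.xml, it can be handy to read it back from the file before starting the server, for example to check that nothing else is listening on it. The following is a minimal sketch and not part of Zeppelin: site_property is a hypothetical helper name, and it assumes each <name> and <value> sits on its own line, as in the listing above.

```shell
# Print the <value> of a named property from a Hadoop-style site XML file.
# Hypothetical helper; assumes <name> and <value> are each on their own line.
site_property() {
  local file="$1"
  local name="$2"
  awk -v want="$name" '
    /<name>/ {
      line = $0
      gsub(/.*<name>|<\/name>.*/, "", line)   # keep only the text between the tags
      current = line
    }
    /<value>/ && current == want {
      line = $0
      gsub(/.*<value>|<\/value>.*/, "", line) # also drops any trailing comment
      print line
      exit
    }
  ' "$file"
}
```

Run from the Zeppelin home directory, `site_property conf/zeppelin-site.xml zeppelin.server.port` would print 8084 for the configuration shown above.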

           
  • Start Zeppelin (from the bin directory)

    ./zeppelin-daemon.sh start

    ------------------------------ Zeppelin is now installed ------------------------------

hive on Zeppelin

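  • Configure the Hive interpreter

      The interpreter is the most important concept in Zeppelin: each interpreter corresponds to one engine. Hive is served by the JDBC interpreter, because Zeppelin runs Hive SQL through Hive's JDBC interface.

       You can enable Hive by configuring the JDBC interpreter on Zeppelin's Interpreter page. Note that the JDBC interpreter works with any database that provides a JDBC driver, and that by default it is set up to connect to PostgreSQL.

There are two ways to enable Hive:

  • Edit the default jdbc interpreter's settings (with this setup, Hive paragraphs in a note start with %jdbc)
  • Create a new JDBC interpreter named hive (with this setup, Hive paragraphs in a note start with %hive)

I use the second approach here: create a new hive interpreter and set the following basic properties (adjust them to your own environment).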

Property          Value
default.driver    org.apache.hive.jdbc.HiveDriver (note: Zeppelin does not bundle this jar; add it as a dependency yourself)
default.url       jdbc:hive2://hadoop001:10000 (10000 is the default HiveServer2 port)
default.user      hadoop
default.password  ***********
  • Add the dependencies (note: the jar versions must match your Hive version)
       The URL (default.url above) takes the form jdbc:hive2://host:port/<db_name>. Here host is the machine running HiveServer2 and port is its Thrift port: in binary mode this is the Hive setting hive.server2.thrift.port (default 10000); in HTTP mode it is hive.server2.thrift.http.port (default 10001). db_name is the Hive database to connect to; if omitted, it is default.
  • Start HiveServer2
[hadoop@hadoop001 bin]$ ./hiveserver2 start
[hadoop@hadoop001 lib]$ jps -m
9984 RunJar /home/hadoop/app/hive/lib/hive-service-1.1.0-cdh5.15.1.jar org.apache.hive.service.server.HiveServer2 --hiveconf hive.aux.jars.path=file:///home/hadoop/app/hive/auxlib/hive-exec-1.1.0-cdh5.15.1-core.jar start
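
Once HiveServer2 is running, the JDBC URL format described above can be assembled mechanically. The sketch below is only an illustration: hive_jdbc_url is a hypothetical helper name, the ports are the Hive defaults quoted earlier, and the HTTP-mode suffix assumes HiveServer2 keeps its default HTTP path of cliservice.

```shell
# Build a HiveServer2 JDBC URL.
# Usage: hive_jdbc_url HOST [DB] [MODE] [PORT]
#   MODE is "binary" (default port 10000) or "http" (default port 10001).
hive_jdbc_url() {
  local host="$1"
  local db="${2:-default}"
  local mode="${3:-binary}"
  local port="$4"
  case "$mode" in
    binary)
      echo "jdbc:hive2://${host}:${port:-10000}/${db}"
      ;;
    http)
      # Assumes hive.server2.thrift.http.path is left at its default, cliservice.
      echo "jdbc:hive2://${host}:${port:-10001}/${db};transportMode=http;httpPath=cliservice"
      ;;
    *)
      echo "unknown transport mode: $mode" >&2
      return 1
      ;;
  esac
}
```

The result can be pasted into default.url, or tried from the command line first with something like `beeline -u "$(hive_jdbc_url hadoop001)" -n hadoop` to confirm that HiveServer2 accepts connections before involving Zeppelin.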

           
  • Query Hive tables from Zeppelin

Hive on Zeppelin troubleshooting:

  • java.lang.ClassNotFoundException: org.apache.hive.service.rpc.thrift.TCLIService$Iface
  • Fix: add hive-service-1.1.0.jar
java.lang.ClassNotFoundException: org.apache.hive.service.rpc.thrift.TCLIService$Iface
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107)
	at java.sql.DriverManager.getConnection(DriverManager.java:664)
	at java.sql.DriverManager.getConnection(DriverManager.java:208)
	at org.apache.commons.dbcp2.DriverManagerConnectionFactory.createConnection(DriverManagerConnectionFactory.java:79)
	at org.apache.commons.dbcp2.PoolableConnectionFactory.makeObject(PoolableConnectionFactory.java:205)
	at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:861)
	at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:435)
	at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:363)
	at org.apache.commons.dbcp2.PoolingDriver.connect(PoolingDriver.java:129)

           
  • java.lang.ClassNotFoundException: org.apache.hadoop.hive.common.auth.HiveAuthUtils
  • Fix: add hive-common-1.1.0.jar
java.lang.ClassNotFoundException: org.apache.hadoop.hive.common.auth.HiveAuthUtils
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at org.apache.hive.jdbc.HiveConnection.createUnderlyingTransport(HiveConnection.java:376)
	at org.apache.hive.jdbc.HiveConnection.createBinaryTransport(HiveConnection.java:396)
	at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:201)
	at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:168)
	at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
	at java.sql.DriverManager.getConnection(DriverManager.java:664)
	at java.sql.DriverManager.getConnection(DriverManager.java:208)
	at org.apache.commons.dbcp2.DriverManagerConnectionFactory.createConnection(DriverManagerConnectionFactory.java:79)
	at org.apache.commons.dbcp2.PoolableConnectionFactory.makeObject(PoolableConnectionFactory.java:205)
	at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:861)
	at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:435)
	at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:363)
	at org.apache.commons.dbcp2.PoolingDriver.connect(PoolingDriver.java:129)
	at java.sql.DriverManager.getConnection(DriverManager.java:664)
	at java.sql.DriverManager.getConnection(DriverManager.java:270)
	at org.apache.zeppelin.jdbc.JDBCInterpreter.getConnectionFromPool(JDBCInterpreter.java:425)
	at org.apache.zeppelin.jdbc.JDBCInterpreter.getConnection(JDBCInterpreter.java:443)
	at org.apache.zeppelin.jdbc.JDBCInterpreter.executeSql(JDBCInterpreter.java:692)
	at org.apache.zeppelin.jdbc.JDBCInterpreter.interpret(JDBCInterpreter.java:820)
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:103)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:632)
	at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
	at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
           
  • java.lang.NoClassDefFoundError: com/google/common/primitives/Ints
  • Fix: add guava-14.0.1.jar
java.lang.NoClassDefFoundError: com/google/common/primitives/Ints
	at org.apache.hive.service.cli.Column.<init>(Column.java:150)
	at org.apache.hive.service.cli.ColumnBasedSet.<init>(ColumnBasedSet.java:51)
	at org.apache.hive.service.cli.RowSetFactory.create(RowSetFactory.java:37)
	at org.apache.hive.jdbc.HiveQueryResultSet.next(HiveQueryResultSet.java:367)
	at org.apache.commons.dbcp2.DelegatingResultSet.next(DelegatingResultSet.java:191)
	at org.apache.commons.dbcp2.DelegatingResultSet.next(DelegatingResultSet.java:191)
	at org.apache.zeppelin.jdbc.JDBCInterpreter.getResults(JDBCInterpreter.java:567)
	at org.apache.zeppelin.jdbc.JDBCInterpreter.executeSql(JDBCInterpreter.java:749)
	at org.apache.zeppelin.jdbc.JDBCInterpreter.interpret(JDBCInterpreter.java:820)
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:103)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:632)
	at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
	at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: com.google.common.primitives.Ints
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 20 more
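
The three class-loading errors above all come from jars missing on the JDBC interpreter's classpath. Besides adding them one at a time as interpreter dependencies, they could be gathered in one pass. The sketch below is an assumption-laden illustration: copy_hive_client_jars is a hypothetical helper, the jar name patterns are taken from the fixes above, and the target directory you pass is whichever location your Zeppelin setup loads JDBC interpreter jars from.

```shell
# Copy the Hive client jars named in the errors above into a target directory.
# Prints one line per pattern; returns non-zero if any pattern matched nothing.
copy_hive_client_jars() {
  local src="$1"
  local dest="$2"
  local missing=0
  local pattern jar found
  mkdir -p "$dest"
  for pattern in 'hive-jdbc-*.jar' 'hive-service-*.jar' 'hive-common-*.jar' 'guava-*.jar'; do
    found=0
    # $pattern is left unquoted on purpose so the shell expands the glob.
    for jar in "$src"/$pattern; do
      [ -e "$jar" ] || continue          # glob matched nothing
      cp "$jar" "$dest"/ && found=1
    done
    if [ "$found" -eq 1 ]; then
      echo "copied:  $pattern"
    else
      echo "missing: $pattern" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A call might look like `copy_hive_client_jars ~/app/hive/lib ~/app/zeppelin-0.8.2-bin-all/interpreter/jdbc`, where the interpreter/jdbc target is an assumption about the Zeppelin layout rather than something this article verified. Restart the hive interpreter afterwards so it picks up the new jars.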
           
  • java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
  • Fix: check the HiveServer2 log, which shows that this is a permissions problem
  • Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=hive, access=EXECUTE, inode="/tmp/hadoop-yarn/staging":hadoop:supergroup:drwx------
java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
	at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:295)
	at org.apache.commons.dbcp2.DelegatingStatement.execute(DelegatingStatement.java:291)
	at org.apache.commons.dbcp2.DelegatingStatement.execute(DelegatingStatement.java:291)
	at org.apache.zeppelin.jdbc.JDBCInterpreter.executeSql(JDBCInterpreter.java:737)
	at org.apache.zeppelin.jdbc.JDBCInterpreter.interpret(JDBCInterpreter.java:820)
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:103)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:632)
	at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
	at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
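
The article stops at identifying the permissions problem, so the following is only a sketch of how one might act on that log line. parse_permission_denied is a hypothetical helper that pulls the user, access type, and HDFS path out of the message; the hdfs commands in the comments are one possible follow-up, not the fix the author applied.

```shell
# Extract "user access path" from an HDFS "Permission denied" log message.
# Prints nothing if the message does not match the expected shape.
parse_permission_denied() {
  printf '%s\n' "$1" | sed -n \
    's/.*Permission denied: user=\([^,]*\), access=\([^,]*\), inode="\([^"]*\)".*/\1 \2 \3/p'
}

# Possible follow-ups once the path is known (run as a user with rights on HDFS):
#   hdfs dfs -ls -d /tmp/hadoop-yarn/staging            # inspect owner and mode
#   hdfs dfs -chmod -R 777 /tmp/hadoop-yarn/staging     # coarse; test clusters only
```

For the log line quoted above, the helper prints `hive EXECUTE /tmp/hadoop-yarn/staging`: the query ran as user hive, while the staging directory is owned by hadoop with mode drwx------. Running the query as the directory's owner is an alternative to loosening the permissions.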
           
