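First, before installing Spark you need to install and configure the following software: JDK, Scala, ssh, and Hadoop. My earlier blog posts cover the installation and configuration of each of these in detail; have a look if you need them.
Hadoop installation and configuration
One more reminder: Hadoop, HBase, Hive, and Spark all need mutually compatible versions, otherwise you end up with a lot of unnecessary extra steps. The compatible versions are listed on the official sites. Here I use jdk1.7 + hadoop2.6.4 + scala2.11.8 + spark1.6.1.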
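Spark's core is written in Scala, so I install Scala before using Spark. Note that the prebuilt spark-1.6.1-bin-hadoop2.6 package ships its own Scala runtime (the spark-shell banner in step 5 reports Scala 2.10.5), so the standalone Scala 2.11.8 installed here is separate from the one Spark runs on. Without further ado, let's start.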
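Before going further, it can help to confirm that the prerequisites are reachable from the shell. The following is only a minimal sketch and not part of my original setup: the helper names `require_cmd` and `report_prereqs` are made up for illustration, and the check only tells you whether a command is on PATH, not whether the versions are compatible.

```shell
# require_cmd NAME: succeed if NAME resolves to a command in this shell.
# (hypothetical helper; checks presence only, not version compatibility)
require_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# report_prereqs: print one line for each prerequisite named in this post.
report_prereqs() {
    for tool in java scala ssh hadoop; do
        if require_cmd "$tool"; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
        fi
    done
}

report_prereqs
```

Anything reported as missing either is not installed yet or its bin directory has not been added to PATH.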
Installing and configuring Scala
1. Extract scala-2.11.8.tgz
tg@master:/software/spark-1.6.1-bin-hadoop2.6$ cd ~
tg@master:~$ cd Downloads/
tg@master:~/Downloads$ ls
apache-hive-2.0.0-bin.tar.gz scala-2.11.8.tgz hadoop-2.6.4.tar.gz
spark-1.6.1-bin-hadoop2.6.tgz hbase-1.2.1-bin.tar.gz
zookeeper-3.5.0-alpha.tar.gz jdk-7u80-linux-x64.tar.gz
tg@master:~/Downloads$ cd /software/
tg@master:/software$ tar -zxvf ~/Downloads/scala-2.11.8.tgz
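The extraction step can also be wrapped in a small function so that the archive and the target directory are explicit. This is a sketch under my own assumptions: `extract_into` is a hypothetical name, and it relies only on the standard `tar` options `-x`, `-z`, `-f`, and `-C`.

```shell
# extract_into ARCHIVE DEST: unpack a .tgz/.tar.gz archive into DEST.
# (hypothetical helper name; uses only standard tar options)
extract_into() {
    archive=$1
    dest=$2
    # Refuse to continue if the archive is not a regular file.
    [ -f "$archive" ] || { echo "no such archive: $archive" >&2; return 1; }
    mkdir -p "$dest" || return 1
    # -C makes tar change into DEST before extracting.
    tar -xzf "$archive" -C "$dest"
}

# Example call with the paths used in this post (commented out so that
# pasting the sketch into a shell has no side effects):
# extract_into ~/Downloads/scala-2.11.8.tgz /software
```

Using `-C` avoids having to `cd` into the target directory first, which is the only real difference from the commands above.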
2. Configure the environment variables
tg@master:/software$ sudo gedit /etc/profile
Add the following lines:
export SCALA_HOME=/software/scala-2.11.8
export PATH=$SCALA_HOME/bin:$PATH
tg@master:/software$ source /etc/profile
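If you prefer not to open an editor, the same lines can be appended from the shell. The sketch below is one possible way to do it, not what I actually ran: `append_once` is a hypothetical helper that adds a line only if an identical line is not already in the file, so running it twice does not duplicate the exports.

```shell
# append_once FILE LINE: append LINE to FILE unless an identical line exists.
# (hypothetical helper; writing to /etc/profile itself requires root)
append_once() {
    file=$1
    line=$2
    touch "$file" || return 1
    # -q: quiet, -x: match the whole line, -F: treat LINE as a fixed string
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example with the values from this post (commented out; run as root, or try
# it against a scratch file first):
# append_once /etc/profile 'export SCALA_HOME=/software/scala-2.11.8'
# append_once /etc/profile 'export PATH=$SCALA_HOME/bin:$PATH'
```

The single quotes matter: they keep `$SCALA_HOME` and `$PATH` unexpanded so that the literal text is written to the file and evaluated only when the profile is sourced.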
3. Start and verify
tg@master:/software$ scala
Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_80).
Type in expressions for evaluation. Or try :help.
scala> 8*8
res0: Int = 64
scala> :quit
Installing Spark
1. Extract spark-1.6.1-bin-hadoop2.6.tgz
tg@master:/software$ tar -zxvf ~/Downloads/spark-1.6.1-bin-hadoop2.6.tgz
2. Configure the environment variables
tg@master:/software$ sudo gedit /etc/profile
Add the following lines:
export SPARK_HOME=/software/spark-1.6.1-bin-hadoop2.6
export PATH=$SPARK_HOME/bin:$PATH
tg@master:/software$ source /etc/profile
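After sourcing the profile, a quick way to confirm that the new bin directory really ended up on PATH is a string match against the colon-separated entries. This is a small sketch with a hypothetical helper name, `path_has`; it compares entries literally, so a trailing slash or a symlinked path would not match.

```shell
# path_has DIR: succeed if DIR is one of the colon-separated PATH entries.
# (hypothetical helper; literal comparison, no path normalization)
path_has() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example (commented out; assumes SPARK_HOME is set as in this post):
# path_has "$SPARK_HOME/bin" && echo "Spark bin directory is on PATH"
```

Wrapping PATH in extra colons lets the first and last entries be matched the same way as the ones in the middle.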
3. Edit spark-env.sh
tg@master:~$ cd /software/spark-1.6.1-bin-hadoop2.6/conf/
tg@master:/software/spark-1.6.1-bin-hadoop2.6/conf$ ls
docker.properties.template metrics.properties.template
spark-env.sh.template fairscheduler.xml.template slaves.template
log4j.properties.template spark-defaults.conf.template
tg@master:/software/spark-1.6.1-bin-hadoop2.6/conf$ cp spark-env.sh.template spark-env.sh
tg@master:/software/spark-1.6.1-bin-hadoop2.6/conf$ sudo gedit spark-env.sh
Add the following lines:
export SCALA_HOME=/software/scala-2.11.8
export JAVA_HOME=/software/jdk1.7.0_80
export SPARK_MASTER_IP=master
export SPARK_WORKER_MEMORY=512m
export MASTER=spark://master:7077
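A standalone master URL has the form spark://HOST:PORT, and Spark's default standalone master port is 7077. Since a typo in this value is easy to miss, here is a minimal sketch that checks the shape of such a URL and splits it into host and port. The function name `parse_master_url` is my own invention for illustration; it validates the format only and does not contact the master.

```shell
# parse_master_url URL: print "HOST PORT" for a spark://HOST:PORT URL,
# or fail if the URL does not have that shape.
# (hypothetical helper; format check only, no network access)
parse_master_url() {
    url=$1
    case $url in
        spark://*:*) ;;
        *) return 1 ;;
    esac
    hostport=${url#spark://}   # drop the scheme
    host=${hostport%:*}        # everything before the last colon
    port=${hostport##*:}       # everything after the last colon
    # The port must be a non-empty string of digits.
    case $port in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ -n "$host" ] || return 1
    echo "$host $port"
}

# Example (commented out):
# parse_master_url spark://master:7077
```

A URL such as `http://master:7077` or `spark://master` makes the function return a non-zero status and print nothing.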
Edit slaves
tg@master:/software/spark-1.6.1-bin-hadoop2.6/conf$ cp slaves.template slaves
tg@master:/software/spark-1.6.1-bin-hadoop2.6/conf$ sudo gedit slaves
Change localhost to master.
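The slaves file is a plain list of worker host names, one per line, and the template also contains comment lines. To see which hosts are effectively listed after editing, something like the following sketch can be used. `list_workers` is a hypothetical helper of mine, not a Spark script; it simply strips `#` comments, surrounding whitespace, and blank lines.

```shell
# list_workers FILE: print the host names in a slaves file, one per line,
# skipping blank lines and '#' comments.
# (hypothetical helper, not part of Spark)
list_workers() {
    sed -e 's/#.*$//' \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e '/^$/d' "$1"
}

# Example (commented out; path as used in this post):
# list_workers /software/spark-1.6.1-bin-hadoop2.6/conf/slaves
```

With the single-node setup in this post, the only line printed should be `master`.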
4. Start Spark
tg@master:/software/spark-1.6.1-bin-hadoop2.6$ sbin/start-all.sh
starting org.apache.spark.deploy.master.Master, logging to /software/spark-1.6.1-bin-hadoop2.6/logs/spark-tg-org.apache.spark.deploy.master.Master-1-master.out
master: starting org.apache.spark.deploy.worker.Worker, logging to /software/spark-1.6.1-bin-hadoop2.6/logs/spark-tg-org.apache.spark.deploy.worker.Worker-1-master.out
Run jps to check the processes: Worker and Master now appear in the list.
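The jps check can be scripted as well. The sketch below assumes the usual `jps` output of one "PID ClassName" pair per line and reads it from standard input; `has_daemon` is a hypothetical helper name, and matching on the second field is my own choice so that a name like Master does not also match other process names that merely contain it.

```shell
# has_daemon NAME: read jps-style lines ("PID ClassName") on stdin and
# succeed if the second field of some line is exactly NAME.
# (hypothetical helper; assumes the default jps output format)
has_daemon() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Example (commented out; requires the JDK's jps on PATH):
# jps | has_daemon Master && jps | has_daemon Worker && echo "Spark daemons are up"
```

If either check fails, the log files named in the start-all.sh output are the first place to look.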
5. Enter spark-shell
tg@master:/software/spark-1.6.1-bin-hadoop2.6$ bin/spark-shell
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.properties
To adjust logging level use sc.setLogLevel("INFO")
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.1
      /_/

Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_80)
Type in expressions to have them evaluated.
Type :help for more information.
Spark context available as sc.
16/05/31 05:58:36 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/05/31 05:58:37 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/05/31 05:58:45 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/05/31 05:58:46 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
16/05/31 05:58:50 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/05/31 05:58:51 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/05/31 05:58:57 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/05/31 05:58:58 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
SQL context available as sqlContext.
scala>
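You can also check the result in a browser. While spark-shell is running, its application UI is at http://<master IP>:4040, and the standalone master's web UI is at http://<master IP>:8080 by default. Port 7077 is the port that applications use to connect to the master (spark://master:7077); it does not serve a web page.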