
Linux + Scala + Hadoop + YARN + HBase + Spark environment setup

Contents:
1. Scala environment setup
2. Maven environment variable configuration
3. Hadoop installation (default port 50070)
4. YARN setup (default port 8088)
5. Command-line practice
6. HBase installation
7. Spark setup

1. Scala environment setup

1.1 Download: https://downloads.lightbend.com/scala/2.11.8/scala-2.11.8.tgz

1.2 Extract it and configure the environment variables in ~/.bash_profile

1.3 Type scala at the command line to test; a minimal sketch follows.
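For reference, a minimal sketch of 1.2 and 1.3, assuming Scala was extracted to /home/peng/peng/app (the path is an example; adjust to your layout):

# ~/.bash_profile (path is an example)
export SCALA_HOME=/home/peng/peng/app/scala-2.11.8
export PATH=$SCALA_HOME/bin:$PATH

# reload the profile and verify
source ~/.bash_profile
scala -version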

2. Maven environment variable configuration

2.1 Download (3.3.9): https://archive.apache.org/dist/maven/maven-3/3.3.9/binaries/apache-maven-3.3.9-bin.tar.gz

2.2 Extract it and configure the environment variables

2.3 Edit conf/settings.xml to relocate the local repository, so the default partition does not run out of space (see the sketch below).
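A minimal sketch of 2.2 and 2.3, assuming the archive is extracted under /home/peng/peng/app; the localRepository path is an example pointing at a partition with enough space:

# ~/.bash_profile (path is an example)
export MAVEN_HOME=/home/peng/peng/app/apache-maven-3.3.9
export PATH=$MAVEN_HOME/bin:$PATH
source ~/.bash_profile
mvn -v

# In $MAVEN_HOME/conf/settings.xml, move the local repository off the default ~/.m2:
#   <localRepository>/home/peng/peng/maven_repo</localRepository>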

3. Hadoop installation (default port 50070)

3.1 Download and install the JDK and Hadoop, and configure the environment variables (see the sketch after 3.3)

3.2 Configure passwordless SSH login - just press Enter through the prompts

// 1. Generate an SSH key pair (press Enter at every prompt)
ssh-keygen -t rsa

// 2. Go into the .ssh directory
cd ~/.ssh/

// 3. List the generated files
ls
// id_rsa  id_rsa.pub

// 4. Copy the public key into authorized_keys
cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
ls
// authorized_keys  id_rsa  id_rsa.pub

// Passwordless SSH now works (verify with: ssh localhost)
           

3.3 Then extract Hadoop and configure its environment variables (sketch below)

tar -zxvf hadoop-2.6.0-cdh5.7.0.tar.gz -C /home/peng/peng/app/
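A minimal sketch of the environment variables for 3.1 and 3.3, using the paths from this tutorial (adjust to your own layout):

# ~/.bash_profile
export JAVA_HOME=/home/peng/peng/app/jdk1.8.0_201
export HADOOP_HOME=/home/peng/peng/app/hadoop-2.6.0-cdh5.7.0
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

# reload and verify
source ~/.bash_profile
java -version
hadoop version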
           

3.4 Edit the Hadoop configuration files

// Files to edit (all under etc/hadoop):
vi hadoop-env.sh
vi core-site.xml
vi hdfs-site.xml
vi slaves

// Change to the configuration directory
cd hadoop-2.6.0-cdh5.7.0/etc/hadoop

// List all the configuration files: ls

capacity-scheduler.xml      httpfs-env.sh            mapred-env.sh
configuration.xsl           httpfs-log4j.properties  mapred-queues.xml.template
container-executor.cfg      httpfs-signature.secret  mapred-site.xml.template
core-site.xml               httpfs-site.xml          slaves
hadoop-env.cmd              kms-acls.xml             ssl-client.xml.example
hadoop-env.sh               kms-env.sh               ssl-server.xml.example
hadoop-metrics2.properties  kms-log4j.properties     yarn-env.cmd
hadoop-metrics.properties   kms-site.xml             yarn-env.sh
hadoop-policy.xml           log4j.properties         yarn-site.xml
hdfs-site.xml               mapred-env.cmd

// ------------------ Only a few key files need changes -------------------
// 1. First file - hadoop-env.sh: only the Java installation directory needs to be set
vi hadoop-env.sh

# export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=/home/peng/peng/app/jdk1.8.0_201
// -----------------------------------------------------


// 2. Second file - core-site.xml: set the default filesystem URI and the data/tmp directory
vi core-site.xml

<configuration>
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://localhost:8020</value>
</property>

<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/peng/peng/hadoop/tmp</value>
</property>
</configuration>

// -----------------------------------------------------


// 3. Third file - hdfs-site.xml: set the replication factor (1 for a single node)
vi hdfs-site.xml  

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
// -----------------------------------------------------


 
// 4. Fourth file - slaves: lists the worker hostnames; optional for a single node
vi slaves
// replace localhost with the hostname, e.g. hadoop000

// -----------------------------------------------------


           

(Screenshot: the instructor's core-site.xml configuration)

3.5 With the Hadoop environment variables configured (details omitted), format HDFS

pwd
//  /home/peng/peng/app/hadoop-2.6.0-cdh5.7.0/bin

ls
// hadoop  hadoop.cmd  hdfs  hdfs.cmd  mapred  mapred.cmd  rcc  yarn  yarn.cmd

// Format the NameNode
./hdfs namenode -format   // or: ./hadoop namenode -format
// On success: Storage directory /home/peng/peng/hadoop/tmp/dfs/name has been successfully formatted.

// Start HDFS
cd sbin/

// On first start you will be asked to confirm host keys (type yes); with passwordless SSH configured you will not be prompted again
./start-dfs.sh

// Startup succeeded - check the processes with jps
jps
18807 QuorumPeerMain
24651 NameNode
24747 DataNode
24879 SecondaryNameNode
25119 Jps

           

Then open localhost:50070 in the VM's browser to verify the NameNode web UI.


4. YARN setup (default port 8088)

4.1 Configure under Hadoop's etc/hadoop directory

// 1. Enter the directory
cd etc/hadoop

// 2. Copy the template configuration file
cp mapred-site.xml.template mapred-site.xml

// 3. Edit mapred-site.xml - tell MapReduce to run on the YARN framework
vi mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>


// 4. Edit yarn-site.xml
<configuration>

<!-- Site specific YARN configuration properties -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
</configuration>

           

4.2 Start YARN

./start-yarn.sh

// Check on the command line with jps that the processes started
25939 ResourceManager
26035 NodeManager
           

4.3 Verify startup: open localhost:8088 in the browser (the ResourceManager web UI); a command-line check is sketched below.

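Besides the web UI, a quick command-line check (a sketch; assumes $HADOOP_HOME/bin is on PATH):

# list the NodeManagers registered with the ResourceManager
yarn node -list

# or confirm the ResourceManager web UI responds
curl -s http://localhost:8088/cluster | head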

5. Command-line practice

// Test HDFS (run from the sbin directory, where slaves.sh lives)

hadoop fs -mkdir /haData

hadoop fs -ls /

hadoop fs -put slaves.sh /haData

hadoop fs -ls /haData

hadoop fs -text /haData/slaves.sh

// Test YARN - /home/peng/peng/app/hadoop-2.6.0-cdh5.7.0/share/hadoop/mapreduce contains many example jobs


hadoop jar hadoop-mapreduce-examples-2.6.0-cdh5.7.0.jar pi 2 3

// If the pi job completes, YARN is working.
           

6. HBase installation

6.1 Download: http://archive.apache.org/dist/hbase/1.2.0/hbase-1.2.0-bin.tar.gz

Extract it and configure the environment variables (a sketch follows).
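A minimal sketch of the HBase environment variables, assuming the tarball is extracted into the same app directory as above:

# ~/.bash_profile (path is an example)
export HBASE_HOME=/home/peng/peng/app/hbase-1.2.0
export PATH=$HBASE_HOME/bin:$PATH
source ~/.bash_profile
hbase version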

6.2 Edit the conf files

/**
* 1.  hbase-env.sh - change two settings
*/
vi hbase-env.sh
-------------------------------------------

// Point at the JDK installation (same JAVA_HOME as in hadoop-env.sh)
export JAVA_HOME=/home/peng/peng/app/jdk1.8.0_201

// Use an external ZooKeeper instead of the one bundled with HBase
export HBASE_MANAGES_ZK=false
-------------------------------------------

/**
* 2.  hbase-site.xml - add three properties
*/
vi hbase-site.xml
-------------------------------------------

<configuration>

<!-- Store HBase data on HDFS; this URI MUST match fs.defaultFS in Hadoop's core-site.xml -->
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://localhost:8020/hbase</value>
</property>

<!-- Pseudo-distributed mode on a single node -->
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>

<!-- ZooKeeper quorum address -->
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>localhost:2181</value>
</property>

</configuration>
-------------------------------------------

/**
* 3.  regionservers - lists the hostnames/IPs of the region server nodes
*/
vi regionservers
-------------------------------------------
localhost    (or the hostname, e.g. hadoop000)
-------------------------------------------
           

6.3 Start ZooKeeper:    ./zkServer.sh start

6.4 Start HBase (web UI at localhost:60010)

./start-hbase.sh
jps 
20257 Jps
19635 QuorumPeerMain
// the two processes below must be running
20060 HMaster
20172 HRegionServer
           

6.5 Test the hbase shell

// Launch the HBase shell
./hbase shell

// Sanity-check commands
version
status

// Create a table with two column families
create 'member', 'info', 'address'

list

describe 'member'

           

[Error 1]

When running ./hbase shell and then status, the following error appeared:

【ERROR: Can't get master address from ZooKeeper; znode data == null】

This usually means the HMaster did not start; check that ZooKeeper is running and that hbase.rootdir matches fs.defaultFS from core-site.xml, as noted above.

7. Spark setup

7.1 First, download the source and compile it yourself; building from source matters. A build tutorial was meant to be linked here (a sketch of the usual build command follows):
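For reference only, a sketch of the source build using the script bundled with Spark 2.2.0; the profile names match the Hadoop 2.6 / YARN setup above and should be checked against the official build docs:

# run from the root of the Spark 2.2.0 source tree
./dev/make-distribution.sh --name hadoop2.6 --tgz -Phadoop-2.6 -Pyarn -Phive -Phive-thriftserver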

7.2 Install and configure the environment variables. Here I downloaded the prebuilt binary directly; building from source can come later.

Download: https://archive.apache.org/dist/spark/spark-2.2.0/spark-2.2.0-bin-hadoop2.6.tgz

7.3 Use it directly:

./spark-shell --master local[2]
           

[Error 1]

ERROR SparkContext: Error initializing SparkContext.
java.net.BindException: Cannot assign requested address: Service 'sparkDriver' failed after 16 retries (on a random free port)! Consider explicitly setting the appropriate binding address for the service 'sparkDriver' (for example spark.driver.bindAddress for SparkDriver) to the correct binding address.
	at sun.nio.ch.Net.bind0(Native Method)
	at sun.nio.ch.Net.bind(Net.java:433)
	at sun.nio.ch.Net.bind(Net.java:425)
	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
	at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:127)
	at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:501)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1218)
	at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:496)
	at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:481)
	at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:965)
	at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:210)
	at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:353)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:446)
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
	at java.lang.Thread.run(Thread.java:748)
<console>:14: error: not found: value spark
       import spark.implicits._
              ^
<console>:14: error: not found: value spark
       import spark.sql
              ^
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.2.0
      /_/
         
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_201)
Type in expressions to have them evaluated.
Type :help for more information.
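This bind error usually means the driver cannot resolve or bind the machine's hostname. A sketch of common workarounds (the addresses are examples):

# Option 1: tell Spark which local address to use for this session
export SPARK_LOCAL_IP=127.0.0.1
./spark-shell --master local[2]

# Option 2: set the bind address explicitly, as the error message suggests
./spark-shell --master local[2] --conf spark.driver.bindAddress=127.0.0.1

# Option 3: make sure the hostname resolves, e.g. map it in /etc/hosts
# 192.168.x.x   hadoop000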
           
