
Hadoop Pseudo-Distributed Deployment

1. Package Preparation

Hadoop package: http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.16.2.tar.gz

JDK: https://pan.baidu.com/s/11dt1k_hK17wv8ptCkeGfVg (extraction code: dair)

2. Environment Preparation

2.1 Create the user and directories

[root@hadoop001 ~]# useradd wzj

[root@hadoop001 ~]# su - wzj

Last login: Mon Dec 2 10:03:41 CST 2019 on pts/0

[wzj@hadoop001 ~]$

[wzj@hadoop001 ~]$ mkdir app software sourcecode log tmp data lib
[wzj@hadoop001 ~]$ ll
total 0
drwxrwxr-x 2 wzj wzj 6 Nov 27 21:32 app         extracted archives and symlinks
drwxrwxr-x 2 wzj wzj 6 Nov 27 21:32 data        data
drwxrwxr-x 2 wzj wzj 6 Nov 27 21:32 lib         third-party jars
drwxrwxr-x 2 wzj wzj 6 Nov 27 21:32 log         log files
drwxrwxr-x 2 wzj wzj 6 Nov 27 21:32 software    downloaded archives
drwxrwxr-x 2 wzj wzj 6 Nov 27 21:32 sourcecode  source code builds
drwxrwxr-x 2 wzj wzj 6 Nov 27 21:32 tmp         temporary files
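The layout above can also be recreated with one idempotent loop — a sketch, where `demo-home` is a hypothetical scratch directory standing in for `/home/wzj`:

```shell
# Sketch: recreate the per-user directory layout from this section.
# demo-home is hypothetical; in this doc the base directory is /home/wzj.
for d in app software sourcecode log tmp data lib; do
  mkdir -p "demo-home/$d"
done
ls demo-home
```

`mkdir -p` makes the loop safe to re-run: existing directories are left untouched.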
           

2.2 Upload and extract the packages

Upload and extract the Hadoop package

[wzj@hadoop001 ~]$ cd software/    ### upload the archive into the software directory with rz
[wzj@hadoop001 software]$ ll
total 424176
-rw-r--r--. 1 wzj wzj 434354462 Dec  2 11:06 hadoop-2.6.0-cdh5.16.2.tar.gz
[wzj@hadoop001 software]$ tar -xzvf hadoop-2.6.0-cdh5.16.2.tar.gz -C ../app/  ### extract into the app directory
[wzj@hadoop001 software]$ cd ../app
[wzj@hadoop001 app]$ ll
total 0
drwxr-xr-x. 14 wzj wzj 241 Jun  3 19:11 hadoop-2.6.0-cdh5.16.2
[wzj@hadoop001 app]$ ln -s hadoop-2.6.0-cdh5.16.2/ hadoop ### symlink
[wzj@hadoop001 app]$ ll
total 0
lrwxrwxrwx.  1 wzj wzj  23 Dec  2 11:23 hadoop -> hadoop-2.6.0-cdh5.16.2/
drwxr-xr-x. 14 wzj wzj 241 Jun  3 19:11 hadoop-2.6.0-cdh5.16.2
           

Upload and extract the JDK

Pay special attention to the ownership of the extracted files.

mkdir /usr/java
cd /usr/java
### upload the jdk-8u45-linux-x64.gz package with rz
[root@hadoop001 java]# tar -xzvf jdk-8u45-linux-x64.gz
[root@hadoop001 java]# ll
total 169212
drwxr-xr-x. 8   10  143       255 Apr 11  2015 jdk1.8.0_45
-rw-r--r--. 1 root root 173271626 Dec  2 10:16 jdk-8u45-linux-x64.gz
[root@hadoop001 java]# chown -R root:root jdk1.8.0_45  ### be sure to fix the ownership
[root@hadoop001 java]# ll
total 169212
drwxr-xr-x. 8 root root       255 Apr 11  2015 jdk1.8.0_45
-rw-r--r--. 1 root root 173271626 Dec  2 10:16 jdk-8u45-linux-x64.gz
           

2.3 Configure the Java environment variables

[root@hadoop001 usr]# vi /etc/profile ## edit the file

Add the following variables:

###java env
export JAVA_HOME=/usr/java/jdk1.8.0_45
export PATH=$JAVA_HOME/bin:$PATH
           

[root@hadoop001 usr]# source /etc/profile ## apply the environment variables

[root@hadoop001 usr]# which java

/usr/java/jdk1.8.0_45/bin/java

3. Hadoop Installation and Deployment

3.1 Set JAVA_HOME explicitly

[wzj@hadoop001 hadoop]$ vi hadoop-env.sh  # change JAVA_HOME to the concrete path
# The java implementation to use.
export JAVA_HOME=/usr/java/jdk1.8.0_45
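The same edit can be scripted instead of done in vi — a sketch, where `demo-env.sh` is a hypothetical stand-in for `$HADOOP_HOME/etc/hadoop/hadoop-env.sh`:

```shell
# Sketch: replace the JAVA_HOME line non-interactively.
# demo-env.sh stands in for $HADOOP_HOME/etc/hadoop/hadoop-env.sh.
echo 'export JAVA_HOME=${JAVA_HOME}' > demo-env.sh
sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/java/jdk1.8.0_45|' demo-env.sh
grep '^export JAVA_HOME' demo-env.sh
```

This matters because the stock file sets `JAVA_HOME=${JAVA_HOME}`, which is not resolved when daemons are started over ssh — hence the doc's advice to hard-code the path.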
           

3.2 Pseudo-distributed configuration files

All of the files below live in /home/wzj/app/hadoop/etc/hadoop.

core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop001:9000</value>
    </property>
</configuration>
### Replace hadoop001 with your IP or your machine name. If you use a machine name as shown here, first map it to its IP in /etc/hosts.
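The mapping step can be sketched as follows. The IP is this doc's example VM address, and `hosts.demo` is a scratch file standing in for `/etc/hosts` (which needs root to edit):

```shell
# Sketch: an /etc/hosts entry maps the hostname in fs.defaultFS to an IP.
# 192.168.116.132 is this doc's example IP; hosts.demo stands in for /etc/hosts.
printf '192.168.116.132 hadoop001\n' > hosts.demo
# For the real file you would run: printf '...' | sudo tee -a /etc/hosts
awk '{print $2}' hosts.demo   # the hostname field
```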
           

hdfs-site.xml

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
           

mapred-site.xml

Copy the template file first:

[wzj@hadoop001 hadoop]$ cp mapred-site.xml.template mapred-site.xml
[wzj@hadoop001 hadoop]$ vi mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
           

yarn-site.xml

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>hadoop001:38088</value>
        <!-- change the default port 8088 to deter viruses and cryptominers -->
    </property>
</configuration>
           

3.3 Passwordless SSH trust

[wzj@hadoop001 ~]$ ssh-keygen ## generate the key pair
Generating public/private rsa key pair.
Enter file in which to save the key (/home/wzj/.ssh/id_rsa): 
Created directory '/home/wzj/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/wzj/.ssh/id_rsa.
Your public key has been saved in /home/wzj/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:gUjtPp1Y48YtMpXXrd6VWYlvNwDhIoQvqZiW+kHts38 wzj@hadoop001
The key's randomart image is:
+---[RSA 2048]----+
|    ....   ..    |
|   . oo.  ..     |
|    ..+..o o... .|
|   . o..*.o .o...|
|  = o..BS=   .o +|
| * o  * B . .  *o|
|o . o  = . . ...o|
|.  . o  E   . .  |
| .. ....         |
+----[SHA256]-----+
           
[wzj@hadoop001 .ssh]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[wzj@hadoop001 .ssh]$ ll
total 12
-rw-rw-r--. 1 wzj wzj  395 Dec  2 11:38 authorized_keys
-rw-------. 1 wzj wzj 1679 Dec  2 11:36 id_rsa
-rw-r--r--. 1 wzj wzj  395 Dec  2 11:36 id_rsa.pub
[wzj@hadoop001 .ssh]$ chmod 0600 ~/.ssh/authorized_keys
# For root this permission change makes no difference, but for any other user authorized_keys must be set to 600.
[wzj@hadoop001 .ssh]$ ll
total 12
-rw-------. 1 wzj wzj  395 Dec  2 11:38 authorized_keys
-rw-------. 1 wzj wzj 1679 Dec  2 11:36 id_rsa
-rw-r--r--. 1 wzj wzj  395 Dec  2 11:36 id_rsa.pub
           

3.4 Configure the environment variables

[wzj@hadoop001 ~]$ vi .bashrc
# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
	. /etc/bashrc
fi

# Uncomment the following line if you don't like systemctl's auto-paging feature:
# export SYSTEMD_PAGER=
export HADOOP_HOME=/home/wzj/app/hadoop
export PATH=${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH
# User specific aliases and functions
[wzj@hadoop001 ~]$ source .bashrc
[wzj@hadoop001 ~]$ which hadoop
~/app/hadoop/bin/hadoop
           

3.5 Format HDFS

[wzj@hadoop001 ~]$ hdfs namenode -format
           

Start HDFS for the first time:

[wzj@hadoop001 ~]$ start-dfs.sh
19/12/02 11:43:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [hadoop001]
The authenticity of host 'hadoop001 (192.168.116.132)' can't be established.
ECDSA key fingerprint is SHA256:+bDm6V9qn91ZZUjoaoIOuO0GgY2diaDWh+UANbw/CXM.
ECDSA key fingerprint is MD5:76:0f:37:32:3f:9a:7d:b2:b7:51:a6:7f:61:b2:fa:b8.
Are you sure you want to continue connecting (yes/no)? yes
hadoop001: Warning: Permanently added 'hadoop001,192.168.116.132' (ECDSA) to the list of known hosts.
hadoop001: starting namenode, logging to /home/wzj/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-wzj-namenode-hadoop001.out
The authenticity of host 'localhost (::1)' can't be established.
ECDSA key fingerprint is SHA256:+bDm6V9qn91ZZUjoaoIOuO0GgY2diaDWh+UANbw/CXM.
ECDSA key fingerprint is MD5:76:0f:37:32:3f:9a:7d:b2:b7:51:a6:7f:61:b2:fa:b8.
Are you sure you want to continue connecting (yes/no)? yes
localhost: Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
localhost: starting datanode, logging to /home/wzj/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-wzj-datanode-hadoop001.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is SHA256:+bDm6V9qn91ZZUjoaoIOuO0GgY2diaDWh+UANbw/CXM.
ECDSA key fingerprint is MD5:76:0f:37:32:3f:9a:7d:b2:b7:51:a6:7f:61:b2:fa:b8.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /home/wzj/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-wzj-secondarynamenode-hadoop001.out
19/12/02 11:44:14 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[wzj@hadoop001 ~]$ jps
10131 SecondaryNameNode
11047 Jps
9963 DataNode
9836 NameNode
           

The first startup asks for yes several times. After a stop-dfs.sh, the next startup no longer asks, because the known_hosts file under .ssh now records an entry for each address the start scripts connect to — and in fact the host key is identical for all of them:

[wzj@hadoop001 .ssh]$ cat known_hosts

hadoop001,192.168.116.132 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBAkpAKYqKhH6OCDBuao4+8iHbym3LXPcBY3eVB5U2kuT3ce2EPb44KvksvE0ss45ps1iWUnWvs6+FAHa7YRP6Qs=

localhost ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBAkpAKYqKhH6OCDBuao4+8iHbym3LXPcBY3eVB5U2kuT3ce2EPb44KvksvE0ss45ps1iWUnWvs6+FAHa7YRP6Qs=

0.0.0.0 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBAkpAKYqKhH6OCDBuao4+8iHbym3LXPcBY3eVB5U2kuT3ce2EPb44KvksvE0ss45ps1iWUnWvs6+FAHa7YRP6Qs=

If you hit errors that clearly point at known_hosts, delete the offending entries and let them be regenerated.
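One way to delete such an entry is `ssh-keygen -R`, which removes all lines for a given host. A sketch, reusing the host key shown above; `known_hosts.demo` is a scratch copy standing in for `~/.ssh/known_hosts` (on the real file you would simply run `ssh-keygen -R hadoop001`):

```shell
# Sketch: remove a stale host entry so its key is re-learned on the next connect.
# known_hosts.demo stands in for ~/.ssh/known_hosts.
key='AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBAkpAKYqKhH6OCDBuao4+8iHbym3LXPcBY3eVB5U2kuT3ce2EPb44KvksvE0ss45ps1iWUnWvs6+FAHa7YRP6Qs='
printf 'hadoop001 ecdsa-sha2-nistp256 %s\n' "$key"  > known_hosts.demo
printf 'localhost ecdsa-sha2-nistp256 %s\n' "$key" >> known_hosts.demo
ssh-keygen -R hadoop001 -f known_hosts.demo   # drops only the hadoop001 line
```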

3.6 Start YARN

[wzj@hadoop001 hadoop]$ which start-yarn.sh
~/app/hadoop/sbin/start-yarn.sh
[wzj@hadoop001 hadoop]$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /home/wzj/app/hadoop-2.6.0-cdh5.16.2/logs/yarn-wzj-resourcemanager-hadoop001.out
hadoop001: starting nodemanager, logging to /home/wzj/app/hadoop-2.6.0-cdh5.16.2/logs/yarn-wzj-nodemanager-hadoop001.out
[wzj@hadoop001 hadoop]$ jps
21746 Jps
21594 NodeManager
21499 ResourceManager
           

3.7 Optimization

[wzj@hadoop001 ~]$ start-dfs.sh

19/12/02 12:00:44 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform… using builtin-java classes where applicable

Starting namenodes on [hadoop001]

hadoop001: starting namenode, logging to /home/wzj/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-wzj-namenode-hadoop001.out

localhost: starting datanode, logging to /home/wzj/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-wzj-datanode-hadoop001.out

Starting secondary namenodes [0.0.0.0]

0.0.0.0: starting secondarynamenode, logging to /home/wzj/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-wzj-secondarynamenode-hadoop001.out

Notice that the daemons start against different addresses (hadoop001, localhost, 0.0.0.0). To start them all uniformly via the machine name, which files need to change?

namenode          ===> the fs.defaultFS property in core-site.xml

datanode          ===> put your machine name in the slaves file

secondarynamenode ===> add the following to hdfs-site.xml

<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop001:50090</value>  <!-- port numbers differ between old and new versions -->
</property>
<property>
    <name>dfs.namenode.secondary.https-address</name>
    <value>hadoop001:50091</value>  <!-- port numbers differ between old and new versions -->
</property>
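The slaves edit from the list above can be sketched like this, where `demo-conf` is a hypothetical stand-in for `$HADOOP_HOME/etc/hadoop`:

```shell
# Sketch: the slaves file lists DataNode hosts, one per line.
# demo-conf stands in for $HADOOP_HOME/etc/hadoop.
mkdir -p demo-conf
echo hadoop001 > demo-conf/slaves
cat demo-conf/slaves
```

With the hostname here instead of the default `localhost`, start-dfs.sh connects to the DataNode by machine name like the other daemons.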
           

Stop the HDFS daemons once more and start them again:

[wzj@hadoop001 hadoop]$ start-dfs.sh

19/12/02 12:56:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform… using builtin-java classes where applicable

Starting namenodes on [hadoop001]

hadoop001: starting namenode, logging to /home/wzj/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-wzj-namenode-hadoop001.out

hadoop001: starting datanode, logging to /home/wzj/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-wzj-datanode-hadoop001.out

Starting secondary namenodes [hadoop001]

hadoop001: starting secondarynamenode, logging to /home/wzj/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-wzj-secondarynamenode-hadoop001.out

4. Deployment Complete
[wzj@hadoop001 ~]$ jps
26227 SecondaryNameNode
26037 DataNode
26344 Jps
21594 NodeManager
21499 ResourceManager
25934 NameNode
           

Look up the web ports for the NameNode and ResourceManager UIs.

YARN scheduler page (screenshot)

HDFS page (screenshot)
