Big Data: Hadoop Troubleshooting Roundup, Ultimate Edition (continuously updated)

Environment: RHEL6, jdk-8u45, hadoop-2.8.1.tar.gz, ssh

IP address     Role   Hostname
xx.xx.xx.xx    NN     hadoop1
xx.xx.xx.xx    DN     hadoop2
xx.xx.xx.xx    DN     hadoop3
xx.xx.xx.xx    DN     hadoop4
xx.xx.xx.xx    DN     hadoop5

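This walkthrough covers a pseudo-distributed deployment, so only the host hadoop1 is needed (it appears as hadoop01 in the output below).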

Starting HDFS

[hadoop@hadoop01 hadoop]$ ./sbin/start-dfs.sh

Starting namenodes on [hadoop01]

The authenticity of host 'hadoop01 (172.16.18.133)' can't be established.

RSA key fingerprint is 8f:e7:6c:ca:6e:40:78:b8:df:6a:b4:ca:52:c7:01:4b.

Are you sure you want to continue connecting (yes/no)? yes

hadoop01: Warning: Permanently added 'hadoop01' (RSA) to the list of known hosts.

hadoop01: chown: changing ownership of `/opt/software/hadoop-2.8.1/logs': Operation not permitted

hadoop01: starting namenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-namenode-hadoop01.out

hadoop01: /opt/software/hadoop-2.8.1/sbin/hadoop-daemon.sh: line 159:

/opt/software/hadoop-2.8.1/logs/hadoop-hadoop-namenode-hadoop01.out: Permission denied

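If startup prompts interactively for a password, and reports a permission error when none is entered, it is because SSH trust has not been configured.

Even in a pseudo-distributed setup on a single machine, passwordless SSH trust must be configured. The chown and Permission denied lines in the output above also point to the logs directory not being writable by the hadoop user, which is worth checking separately.

For a non-root user, the authorized_keys file that holds the public key must have permissions 600 (root is the exception).

Configure passwordless SSH login for the hadoop user: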

[hadoop@hadoop01 .ssh]$ cat id_rsa.pub  > authorized_keys

[hadoop@hadoop01 .ssh]$ chmod 600 authorized_keys

[hadoop@hadoop01 hadoop]$ ssh hadoop01 date

[hadoop@hadoop01 .ssh]$
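
The 600 requirement on authorized_keys can be checked before HDFS is started. The following is a minimal sketch, not part of the original procedure: the function name check_ssh_perms is hypothetical, and it assumes GNU stat (stat -c %a), as shipped with RHEL.

```shell
# Hypothetical helper: verify that authorized_keys has mode 600.
# Assumes GNU coreutils stat; the default directory is the caller's ~/.ssh.
check_ssh_perms() {
    local ssh_dir="${1:-$HOME/.ssh}"
    local keys="$ssh_dir/authorized_keys"
    local mode

    if [ ! -f "$keys" ]; then
        echo "missing: $keys"
        return 1
    fi

    # stat -c %a prints the permission bits in octal, e.g. 600 or 664.
    mode=$(stat -c %a "$keys")
    if [ "$mode" = "600" ]; then
        echo "ok: $keys is 600"
    else
        echo "fix: chmod 600 $keys (currently $mode)"
        return 1
    fi
}
```

Run as the hadoop user, check_ssh_perms with no argument inspects ~/.ssh/authorized_keys and returns non-zero when the mode needs fixing.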

hadoop01: starting datanode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-datanode-hadoop01.out

Starting secondary namenodes [hadoop01]

hadoop01: starting secondarynamenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-secondarynamenode-hadoop01.out

[hadoop@hadoop01 hadoop]$ jps

1761 Jps

1622 SecondaryNameNode

1388 DataNode

1276 NameNode

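When jps reports "process information unavailable", there are two cases:

1. The process does not exist, yet jps reports process information unavailable.

2. The process exists, and jps reports process information unavailable.

For the first case: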

[hadoop@hadoop01 sbin]$ jps

3108 DataNode

4315 Jps

4156 SecondaryNameNode

2990 NameNode

[hadoop@hadoop01 hsperfdata_hadoop]$ ls

5295  5415  5640

[hadoop@hadoop01 hsperfdata_hadoop]$ ll

total 96

-rw------- 1 hadoop hadoop 32768 Apr 27 09:35 5295

-rw------- 1 hadoop hadoop 32768 Apr 27 09:35 5415

-rw------- 1 hadoop hadoop 32768 Apr 27 09:35 5640

[hadoop@hadoop01 hsperfdata_hadoop]$ pwd

/tmp/hsperfdata_hadoop

This directory holds one file per process, named after the process ID that jps displays. Suppose jps now reports errors:

[hadoop@hadoop01 tmp]$ jps

3330 SecondaryNameNode -- process information unavailable

3108 DataNode                         -- process information unavailable

3525 Jps

2990 NameNode                      -- process information unavailable

Check whether the abnormal process still exists:

[hadoop@hadoop01 tmp]$ ps -ef |grep 3330

hadoop    3845  2776  0 09:29 pts/6    00:00:00 grep 3330

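If the process no longer exists, delete its file under /tmp/hsperfdata_xxx and simply restart the process.

jps looks up the files under the current user's hsperfdata_<current user>/ directory.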
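
The cleanup for the first case can be sketched as a small function that lists the files in an hsperfdata directory whose process no longer exists. This is only an illustration under assumptions: the name list_stale_hsperf is hypothetical, and the /proc check assumes Linux. It only prints candidates and deletes nothing.

```shell
# Hypothetical helper: print hsperfdata files whose PID is no longer alive.
# A file counts as stale when kill -0 fails AND /proc/<pid> is absent
# (the /proc fallback covers processes owned by another user on Linux).
list_stale_hsperf() {
    local dir="$1"
    local f pid
    for f in "$dir"/*; do
        [ -e "$f" ] || continue          # the directory is empty
        pid=$(basename "$f")
        case "$pid" in
            ''|*[!0-9]*) continue ;;     # skip names that are not PIDs
        esac
        if ! kill -0 "$pid" 2>/dev/null && [ ! -d "/proc/$pid" ]; then
            echo "$f"
        fi
    done
}
```

For example, list_stale_hsperf /tmp/hsperfdata_hadoop prints the files that are safe to remove; after reviewing the list, delete them and restart the corresponding daemon.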

[root@hadoop01 ~]# jps

7153 -- process information unavailable

8133 -- process information unavailable

7495 -- process information unavailable

8489 Jps

[root@hadoop01 ~]# ps -ef |grep 7153   --- check: the abnormal process does exist

hadoop    7153     1  2 09:47 ?        00:00:17 /usr/java/jdk1.8.0_45/bin/java -Dproc_namenode -Xmx1000m -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/opt/software/hadoop-2.8.1/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/software/hadoop-2.8.1 -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,console -Djava.library.path=/opt/software/hadoop-2.8.1/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/opt/software/hadoop-2.8.1/logs -Dhadoop.log.file=hadoop-hadoop-namenode-hadoop01.log -Dhadoop.home.dir=/opt/software/hadoop-2.8.1 -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,RFA -Djava.library.path=/opt/software/hadoop-2.8.1/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dhadoop.security.logger=INFO,RFAS -Dhdfs.audit.logger=INFO,NullAppender -Dhadoop.security.logger=INFO,RFAS -Dhdfs.audit.logger=INFO,NullAppender -Dhadoop.security.logger=INFO,RFAS -Dhdfs.audit.logger=INFO,NullAppender -Dhadoop.security.logger=INFO,RFAS org.apache.hadoop.hdfs.server.namenode.NameNode

root      8505  2752  0 09:58 pts/6    00:00:00 grep 7153

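If the process exists but the current user sees process information unavailable, confirm that the process exists by running ps -ef | grep <pid> as the current user, then look at which user the process runs as. If it is not the current user, switch to that user.

[hadoop@hadoop01 hadoop]$ jps             ----- switch to the hadoop user and check the processes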

7153 NameNode

8516 Jps

8133 DataNode

7495 SecondaryNameNode

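After switching users, all processes show up normally.

In this case jps was simply run as the wrong user, not the user that runs the Hadoop processes. No action is needed; the services are running normally.

Summary: how to handle the process information unavailable error:

1. Check whether the process exists. If it does not, delete its file under /tmp/hsperfdata_xxx and restart the process.

2. If the process exists, check which user it runs as. If that is not the current user, switch to that user and run jps again.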
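
The two checks in the summary can be combined into one decision, sketched below. The function name diagnose_pid and its output keywords are hypothetical, and the sketch assumes Linux (/proc) with GNU stat.

```shell
# Hypothetical helper: classify a PID that jps marks as unavailable.
#   stale             -> no such process; remove its hsperfdata file and restart
#   other-user:<name> -> process runs as <name>; run jps as that user instead
#   same-user         -> process exists and belongs to the current user
diagnose_pid() {
    local pid="$1"
    local owner_uid owner_name

    if [ ! -d "/proc/$pid" ]; then
        echo "stale"
        return 0
    fi

    # The owner of /proc/<pid> is the user the process runs as.
    owner_uid=$(stat -c %u "/proc/$pid")
    if [ "$owner_uid" = "$(id -u)" ]; then
        echo "same-user"
    else
        owner_name=$(stat -c %U "/proc/$pid")
        echo "other-user:$owner_name"
    fi
}
```

In the root session shown earlier, diagnose_pid 7153 would be expected to report other-user:hadoop, which matches the conclusion that jps should be run as hadoop.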
