
Cloudera Manager recovery: restoring an accidentally deleted cloudera-scm-agent


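Some time ago I was experimenting with Cloudera Manager on a test cluster and accidentally removed cloudera-scm-agent. The cause: when I uninstalled httpd, I did not notice that cloudera-scm-agent depends on the http service, so the uninstall removed cloudera-scm-agent along with it. I reinstalled cloudera-manager-agent and tried again and again, but CM simply could not discover the host. Since it was only a test cluster, I gave up and reinstalled Cloudera Manager from scratch.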

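Thinking it over afterwards: in a distributed cluster, when one worker node goes down, it should be able to reconnect to the master node by IP or hostname once it is restored. A full reinstall should not be necessary. Could the problem lie in the IP or hostname the agent uses to connect?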

Later I read this node's cloudera-scm-agent.log carefully and found that it really was an IP problem:

[13/Sep/2020 05:01:33 +0800] 22503 MainThread agent        ERROR    Heartbeating to localhost:7182 failed.
Traceback (most recent call last):
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/agent.py", line 1390, in _send_heartbeat
    self.cfg.master_port)
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/avro/ipc.py", line 469, in __init__
    self.conn.connect()
  File "/usr/lib64/python2.7/httplib.py", line 833, in connect
    self.timeout, self.source_address)
  File "/usr/lib64/python2.7/socket.py", line 571, in create_connection
    raise err
error: [Errno 111] Connection refused
[13/Sep/2020 05:01:55 +0800] 22503 MainThread heartbeat_tracker INFO     HB stats (seconds): num:1 LIFE_MIN:0.00 min:0.00 mean:0.00 max:0.00 LIFE_MAX:0.00
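The useful part of this log is the address in the "Heartbeating to ... failed" line. As a rough sketch, the helper below pulls every distinct heartbeat target out of an agent log. The function name `heartbeat_targets` is made up for illustration, and the pattern assumes the message format shown in the log above.

```shell
#!/bin/sh
# Hypothetical helper: print each distinct host:port that the agent
# failed to heartbeat to, based on "Heartbeating to <target> failed" lines.
# Usage: heartbeat_targets <agent-log-file>
heartbeat_targets() {
    # -n with the p flag prints only lines where the substitution matched
    sed -n 's/.*Heartbeating to \([^ ]*\) failed.*/\1/p' "$1" | sort -u
}
```

Run against the log excerpt above, this prints localhost:7182. On a typical install the log probably lives at /var/log/cloudera-scm-agent/cloudera-scm-agent.log, but that path is an assumption and not stated in this post.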
           

When cloudera-scm-agent is started on its own, it connects to localhost:7182 rather than to the server's IP.
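Before editing anything, it can help to confirm which address the agent is configured to use. The sketch below reads `server_host` and `server_port` from the agent's config.ini and prints them as host:port. The function name `show_server_target` is hypothetical, and the sketch assumes both keys start at the beginning of a line with no surrounding spaces, as in the file shown further down.

```shell
#!/bin/sh
# Hypothetical helper: print the CM server address the agent is configured
# to heartbeat to, in host:port form.
# Usage: show_server_target <config-file>
show_server_target() {
    cfg=$1
    # Keys must start at column 0; config.ini treats indented lines as
    # continuations of the previous value.
    host=$(sed -n 's/^server_host=//p' "$cfg")
    port=$(sed -n 's/^server_port=//p' "$cfg")
    echo "$host:$port"
}
```

For example, `show_server_target /etc/cloudera-scm-agent/config.ini` on the broken node would be expected to print the same localhost:7182 that appears in the log, presumably because the reinstall left a default value in place.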

So we need to change the cloudera-scm-server address that cloudera-scm-agent is configured to connect to:

[[email protected] cloudera-scm-agent]# vim /etc/cloudera-scm-agent/config.ini
# Configuration file for cloudera-scm-agent.
# Please note that this file supports multi-line values.  Multi-line
# values are indicated by indenting following lines with a space.
#
# If you have whitespace in front of a parameter name, it will be
# read as a continuation of the previous parameter value.  Please
# be careful not to leave spaces in front of parameter names.
#
# To check if this file has spaces in front of parameters names
# you can do a grep like this:
#  grep '^[[:blank:]]' /etc/cloudera-scm-agent/config.ini
[General]
# Hostname of the CM server.
server_host=192.168.0.171
# Port that the CM server is listening on.
server_port=7182
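The same change can be made without an interactive editor. This is a minimal sketch, not the method used above: the function name `set_server_host` is invented for illustration, and it assumes the `server_host=` line starts at column 0. It keeps a `.bak` copy of the original file and rewrites only that one line.

```shell
#!/bin/sh
# Hypothetical helper: point the agent at a given CM server host.
# Usage: set_server_host <config-file> <host-or-ip>
set_server_host() {
    cfg=$1
    host=$2
    # Back up first, then regenerate the config from the backup so the
    # original file keeps its permissions and every other line is untouched.
    cp "$cfg" "$cfg.bak" &&
    sed "s/^server_host=.*/server_host=$host/" "$cfg.bak" > "$cfg"
}
```

For the node in this post the call would be `set_server_host /etc/cloudera-scm-agent/config.ini 192.168.0.171`, run as root. The agent still has to be restarted afterwards for the new value to take effect.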
           

Then restart cloudera-scm-agent, and that is all it takes:

systemctl restart cloudera-scm-agent