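Storm fails on restart with java.io.FileNotFoundException

Problem:

After Storm stopped abnormally, I restarted it, but the nimbus process disappeared within a few seconds. The log showed this error: nimbus [ERROR] Error when processing event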

java.io.FileNotFoundException: File '/data/storm/nimbus/stormdist/risk_topo-1-1574555563/stormconf.ser' does not exist

at org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:299) ~[analyzer-storm-dependency.jar:na]

at org.apache.commons.io.FileUtils.readFileToByteArray(FileUtils.java:1763) ~[analyzer-storm-dependency.jar:na]

at backtype.storm.daemon.nimbus$read_storm_conf.invoke(nimbus.clj:89) ~[storm-core-0.9.5.jar:0.9.5]

at backtype.storm.daemon.nimbus$compute_executors.invoke(nimbus.clj:419) ~[storm-core-0.9.5.jar:0.9.5]

at backtype.storm.daemon.nimbus$compute_topology__GT_executors$iter__3324__3328$fn__3329.invoke(ni…

…seq.invoke(core.clj:133) ~[clojure-1.5.1.jar:na]

at clojure.core.protocols$seq_reduce.invoke(protocols.clj:30) ~[clojure-1.5.1.jar:na]

at clojure.core.protocols$fn__6026.invoke(protocols.clj:54) ~[clojure-1.5.1.jar:na]

at clojure.core.protocols$fn__5979$G__5974__5992.invoke(protocols.clj:13) ~[clojure-1.5.1.jar:na]

at clojure.core$reduce.invoke(core.clj:6177) ~[clojure-1.5.1.jar:na]

at clojure.core$into.invoke(core.clj:6229) ~[clojure-1.5.1.jar:na]

at backtype.storm.daemon.nimbus$compute_topology__GT_executors.i…

…compute_new_topology__GT_executor__GT_node_PLUS_port.invoke(nimbus.clj:550) ~[storm-core-0.9.5.jar:0.9.5]

at backtype.storm.daemon.nimbus$mk_assignments.doInvoke(nimbus.clj:662) ~[storm-core-0.9.5.jar:0.9.5]

at clojure.lang.RestFn.invoke(RestFn.java:410) ~[clojure-1.5.1.jar:na]

at backtype.storm.daemon.nimbus$fn__3724$exec_fn__1103__auto____…fn__3730$fn__3731.invoke(ni…

…fn__3724$exec_fn__1103__auto____…fn__3730.invoke(nimbus.clj:908) ~[storm-core-0.9.5.jar:0.9.5]

at backtype.storm.timer$schedule_recurring$this__1807.invoke(timer.clj:99) ~[storm-core-0.9.5.jar:0.9.5]

at backtype.storm.timer$mk_timer$fn__1790$fn__1791.invoke(ti…

…mk_timer$fn__1790.invoke(timer.clj:42) ~[storm-core-0.9.5.jar:0.9.5]

at clojure.lang.AFn.run(AFn.java:24) ~[clojure-1.5.1.jar:na]

at java.lang.Thread.run(Thread.java:724) ~[na:1.7.0_25]

2019-11-24T10:43:46.368+0800 b.s.util [ERROR] Halting process: ("Error when processing an event")

java.lang.RuntimeException: ("Error when processing an event")

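Analysis:

Nimbus goes down right after the restart because ZooKeeper still holds the state of the topologies that were running when Storm crashed. On every restart, nimbus reads that topology information from ZooKeeper and then tries to load the topology's files from disk. If the files under {storm.local.dir} were already deleted before the restart, nimbus fails with the file-not-found error shown above. The Storm data in ZooKeeper therefore has to be deleted as well, so that the two stay in sync.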
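The mismatch can be confirmed before restarting. The following is a minimal sketch, not part of the original fix. The function name missing_stormconf is hypothetical. It assumes you pass it the stormdist directory (here taken from the path in the error message) followed by the topology IDs that ZooKeeper still lists, for example copied from the output of `ls /storm/storms` in zkCli.sh, assuming storm.zookeeper.root is left at /storm.

```shell
#!/bin/sh
# Hypothetical helper: print every topology ID whose stormconf.ser is
# missing from the given stormdist directory.
# Usage: missing_stormconf STORMDIST_DIR TOPOLOGY_ID...
missing_stormconf() {
    stormdist_dir=$1
    shift
    for topo_id in "$@"; do
        if [ ! -f "$stormdist_dir/$topo_id/stormconf.ser" ]; then
            printf '%s\n' "$topo_id"
        fi
    done
}

# Example (the ID is the one from the error message):
# missing_stormconf /data/storm/nimbus/stormdist risk_topo-1-1574555563
```

Any ID this prints is a topology that ZooKeeper knows about but that nimbus can no longer load from disk, which is the situation behind the exception.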

Solution:

If you do not know where ZooKeeper is installed, search for its client script:

find / -name zkCli.sh

On my machine ZooKeeper is installed at:

/opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/zookeeper/bin

The fix then goes as follows:

1. Change into that directory:

cd /opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/zookeeper/bin

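2. Start the client: ./zkCli.sh

3. List the top-level znodes: ls /

4. Delete Storm's znode: rmr /storm

5. Restart Storm.

Also remember to delete the supervisor and nimbus directories under the path configured as {storm.local.dir}.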
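The ZooKeeper cleanup and the local cleanup can be gathered into one script. This is only a sketch under assumptions that do not come from the original post: ZooKeeper is reachable at localhost:2181, storm.local.dir is /data/storm (inferred from the path in the error message), and the names reset_storm_state, run and DRY_RUN are hypothetical. It relies on zkCli.sh accepting a command after the -server option, which the 3.4-series client does. On ZooKeeper 3.5 and later, rmr is deprecated in favour of deleteall. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: remove Storm's state from ZooKeeper and from the local disk.
# All defaults below are assumptions; adjust them to your cluster.
ZK_BIN=${ZK_BIN:-/opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/zookeeper/bin}
ZK_SERVER=${ZK_SERVER:-localhost:2181}
STORM_LOCAL_DIR=${STORM_LOCAL_DIR:-/data/storm}
DRY_RUN=${DRY_RUN:-1}

# Print the command; execute it as well unless DRY_RUN is 1.
run() {
    printf '+ %s\n' "$*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

reset_storm_state() {
    # Delete Storm's znode without opening the interactive client.
    run "$ZK_BIN/zkCli.sh" -server "$ZK_SERVER" rmr /storm
    # Delete the local nimbus and supervisor directories.
    run rm -rf "$STORM_LOCAL_DIR/nimbus" "$STORM_LOCAL_DIR/supervisor"
}

# reset_storm_state              # dry run: only prints the two commands
# DRY_RUN=0; reset_storm_state   # really delete
```

Run it only after every Storm daemon has been stopped, and check the dry-run output first: deleting /storm discards all topology state, so the topologies have to be submitted again afterwards.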

PS: command to kill the Storm processes

ps -ef | grep storm | grep -v 'grep' | awk '{print $2}' | xargs kill -9
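A slightly more cautious variant, as a sketch: the function name storm_pids is hypothetical. It reads `ps -ef` output on standard input and prints the PID column of every line that mentions storm. The pattern is written as [s]torm so that a grep process carrying that pattern on its own command line does not match. Sending a plain TERM signal first and keeping kill -9 as a last resort is common practice, not something the original post prescribes.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -ef` output on stdin and print the PID
# (second column) of every line that mentions "storm".
storm_pids() {
    grep '[s]torm' | awk '{print $2}'
}

# Example:
# ps -ef | storm_pids                   # review the PIDs first
# ps -ef | storm_pids | xargs kill      # polite TERM
# ps -ef | storm_pids | xargs kill -9   # force, as in the command above
```

Note that the match is on the whole line, so any unrelated process with "storm" in its path or arguments is listed too; reviewing the PIDs before killing avoids surprises.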