After Spark has been running in standalone mode for a while, the Spark Master eventually throws a "GC overhead limit exceeded" error:
16/09/20 05:42:24 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-6] shutting down ActorSystem [sparkMaster]
java.lang.OutOfMemoryError: GC overhead limit exceeded
The Spark Master's heap cannot be reclaimed by GC because, as time goes on, the Master accumulates a large cache of completed-Application objects that the Web UI needs to display. The default number of retained applications is 50, while the Master process starts with a default heap of only 1 GB (-Xms1g -Xmx1g), so the cache eventually exhausts it.
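Besides trimming the cache (below), the Master's default 1 GB heap can be raised. A minimal sketch in conf/spark-env.sh; the 2g value is an illustrative assumption, not a recommendation from this post:

```shell
# conf/spark-env.sh
# Raise the heap for Spark daemons (Master and Worker); default is 1g.
# 2g here is only an example value -- size it for your cluster.
export SPARK_DAEMON_MEMORY=2g

# Alternatively, target the Master process alone with explicit JVM flags:
export SPARK_MASTER_OPTS="-Xms2g -Xmx2g"
```

Restart the Master after editing spark-env.sh so the new heap size takes effect.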
The relevant setting and its default:

spark.history.retainedApplications | 50
Change it to a suitably smaller value in the spark-defaults.conf configuration file. (Note that spark.history.retainedApplications limits the History Server's in-memory UI cache; for the standalone Master itself, spark.deploy.retainedApplications similarly caps how many completed applications it keeps, and may be worth lowering as well.)
spark.eventLog.enabled true
spark.eventLog.dir hdfs://namespace/tmp/spark/events
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.history.fs.logDirectory hdfs://namespace/tmp/spark/events
spark.history.ui.port 18080
spark.history.retainedApplications 20
spark.kafka.metadata.broker.list kafka1:9092,kafka2:9092,kafka3:9092
spark.flume.listener.port 44445
spark.executor.extraJavaOptions -XX:HeapDumpPath=/data0/spark/temp/dump
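For the new settings to take effect, the Master (and the History Server, if you run one) must be restarted. A sketch using the standard Spark control scripts, assuming a default $SPARK_HOME layout:

```shell
# Restart the standalone Master so it picks up the updated configuration
$SPARK_HOME/sbin/stop-master.sh
$SPARK_HOME/sbin/start-master.sh

# Restart the History Server, which reads the spark.history.* settings
$SPARK_HOME/sbin/stop-history-server.sh
$SPARK_HOME/sbin/start-history-server.sh
```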