After running in standalone mode for a while, the Spark Master will eventually fail with a GC overhead limit exceeded error:
16/09/20 05:42:24 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-6] shutting down ActorSystem [sparkMaster]
java.lang.OutOfMemoryError: GC overhead limit exceeded
The Spark Master's memory cannot be reclaimed by GC because, over time, it accumulates a large number of cached objects for the completed Applications displayed in the Web UI, while the Master process starts with a default heap of only 1 GB (-Xms1g -Xmx1g). Note that the number of completed applications the standalone Master itself retains is governed by spark.deploy.retainedApplications, whereas spark.history.retainedApplications (default 50) caps the application UIs cached by the History Server; both caches grow memory usage if set too high.
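Besides shrinking the cache, the daemon heap itself can be raised. A minimal sketch in conf/spark-env.sh, assuming 2g is appropriate for your cluster (the value is illustrative, not a recommendation from this article):

```shell
# conf/spark-env.sh
# SPARK_DAEMON_MEMORY sets the heap for the standalone Master and Worker
# daemons (default: 1g). 2g here is an assumed example value.
export SPARK_DAEMON_MEMORY=2g
```

The daemons read this file on startup, so the Master must be restarted for the new heap size to take effect.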
spark.history.retainedApplications (default: 50)

Set it to a suitable value in the configuration file spark-defaults.conf:
spark.eventLog.enabled true
spark.eventLog.dir hdfs://namespace/tmp/spark/events
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.history.fs.logDirectory hdfs://namespace/tmp/spark/events
spark.history.ui.port 18080
spark.history.retainedApplications 20
spark.kafka.metadata.broker.list kafka1:9092,kafka2:9092,kafka3:9092
spark.flume.listener.port 44445
spark.executor.extraJavaOptions -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/data0/spark/temp/dump
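Why capping retainedApplications bounds memory can be sketched with a toy bounded cache. This is an illustration only, not Spark's actual data structure:

```python
from collections import deque

# A bounded FIFO cache: once full, appending a new entry evicts the
# oldest, so memory use stays constant no matter how many applications
# complete over the lifetime of the process.
RETAINED = 20  # mirrors spark.history.retainedApplications above
completed_apps = deque(maxlen=RETAINED)

# Simulate 1000 applications finishing over time.
for app_id in range(1000):
    completed_apps.append(f"app-{app_id}")

print(len(completed_apps))  # stays at RETAINED, never grows
print(completed_apps[0])    # the oldest entry still cached
```

Without the `maxlen` bound, the cache would hold all 1000 entries, which is exactly how the Master's heap fills up when long-running clusters accumulate completed applications.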