Table of Contents
State Transition Diagram
RMAppState
RMAppEventType
Yarn Client Commands: yarn app|application
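RMApp is the data structure in the ResourceManager that maintains the life cycle of an Application. It is implemented by RMAppImpl, which maintains an Application state machine recording every state an Application can be in (RMAppState) and the events that trigger transitions between those states (RMAppEvent).
State Transition Diagram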
RMAppState
public enum RMAppState {
    // Initial state.
    NEW,

    // After the RM receives an app submission from the client, it creates
    // an RMAppImpl object to maintain the app's state, then immediately
    // persists the app's basic information for failure recovery.
    // The default RMStateStore is FileSystemRMStateStore,
    // controlled by yarn.resourcemanager.store.class.
    //
    // 1
    // preState: RMAppState.NEW;
    // eventType: RMAppEventType.START;
    // txnHook: RMAppNewlySavingTransition;
    // 2
    // preState: RMAppState.NEW_SAVING;
    // eventType: RMAppEventType.NODE_UPDATE;
    // txnHook: RMAppNodeUpdateTransition;
    NEW_SAVING,

    // The app has passed validation and its basic information has been
    // persisted. The RM will create an RMAppAttemptImpl to make one run attempt.
    //
    // 1
    // preState: RMAppState.NEW;
    // eventType: RMAppEventType.RECOVER;
    // txnHook: RMAppRecoveredTransition;
    // 2
    // preState: RMAppState.NEW_SAVING;
    // eventType: RMAppEventType.APP_NEW_SAVED;
    // txnHook: AddApplicationToSchedulerTransition;
    SUBMITTED,

    // After validation by the ResourceScheduler, the app is submitted
    // to the scheduler queue.
    // e.g. CapacityScheduler:
    //   yarn.scheduler.capacity.maximum-applications:
    //     Maximum number of applications that can be pending and running.
    //   Checks related to hierarchical queues
    //   Submit to the queue
    //   Update the metrics
    //   Log: "Accepted application: a1 for user: u1 in queue: q1"
    //
    // preState: RMAppState.SUBMITTED;
    // eventType: RMAppEventType.APP_ACCEPTED;
    // txnHook: StartAppAttemptTransition;
    ACCEPTED,

    // The appMaster is running on some node and the
    // RMAppAttemptImpl is already in the running state.
    RUNNING,

    // When an RMAppEventType.ATTEMPT_FAILED event fires, the RM first checks
    // the failure count against yarn.resourcemanager.am.max-attempts.
    // If the limit has not been reached, the state machine returns to ACCEPTED.
    // Otherwise it enters FINAL_SAVING for cleanup such as resource reclamation.
    FINAL_SAVING,

    // The appMaster has notified the RM over RPC that the app
    // has finished running and is about to exit.
    FINISHING,

    // The NM has reported via heartbeat that the container
    // hosting the appMaster has finished.
    FINISHED,

    // The appMaster failed.
    FAILED,

    // 1
    // preState: RMAppState.ACCEPTED;
    // eventType: RMAppEventType.KILL;
    // txnHook: KillAttemptTransition;
    // 2
    // preState: RMAppState.RUNNING;
    // eventType: RMAppEventType.KILL;
    // txnHook: KillAttemptTransition;
    KILLING,

    // The RM actively killed the app after receiving a kill command from the client.
    KILLED
}
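The preState / eventType annotations in the comments above can be read as rows of a transition table. The following is a minimal sketch of that reading, not the RMAppImpl source: the class name RMAppTransitionSketch, the helper methods, and the trimmed-down State and Event enums are all hypothetical, and only the arcs annotated above are included. The RECOVER arc is left out because it is not modeled here as a single fixed target.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Illustrative sketch only; names are hypothetical and this is not RMAppImpl.
class RMAppTransitionSketch {
    // Subsets of RMAppState and RMAppEventType, limited to the annotated arcs.
    enum State { NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, KILLING }
    enum Event { START, NODE_UPDATE, APP_NEW_SAVED, APP_ACCEPTED, KILL }

    // Key is "preState/event"; value is the state reached after the transition.
    private static final Map<String, State> TABLE = new HashMap<>();

    private static void arc(State pre, Event event, State post) {
        TABLE.put(pre + "/" + event, post);
    }

    static {
        arc(State.NEW, Event.START, State.NEW_SAVING);
        arc(State.NEW_SAVING, Event.NODE_UPDATE, State.NEW_SAVING);
        arc(State.NEW_SAVING, Event.APP_NEW_SAVED, State.SUBMITTED);
        arc(State.SUBMITTED, Event.APP_ACCEPTED, State.ACCEPTED);
        arc(State.ACCEPTED, Event.KILL, State.KILLING);
        arc(State.RUNNING, Event.KILL, State.KILLING);
    }

    // Returns the target state, or empty if this sketch has no such arc.
    static Optional<State> next(State current, Event event) {
        return Optional.ofNullable(TABLE.get(current + "/" + event));
    }

    // Like next(), but rejects events the sketch does not know about.
    static State apply(State current, Event event) {
        return next(current, event).orElseThrow(() ->
            new IllegalStateException("No arc for " + current + " on " + event));
    }

    public static void main(String[] args) {
        // Walk one documented path: submit, save, accept, then kill.
        Event[] events = {Event.START, Event.APP_NEW_SAVED, Event.APP_ACCEPTED, Event.KILL};
        State state = State.NEW;
        for (Event event : events) {
            State target = apply(state, event);
            System.out.println(state + " --" + event + "--> " + target);
            state = target;
        }
    }
}
```

A lookup table keeps the sketch short. It deliberately ignores the transition hooks (txnHook), which in the annotations above name the code that runs when an arc is taken.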
RMAppEventType
public enum RMAppEventType {
    // Source: ClientRMService
    START,
    RECOVER,
    KILL,

    // Source: Scheduler and RMAppManager
    APP_REJECTED,

    // Source: Scheduler
    APP_ACCEPTED,

    // Source: RMAppAttempt
    ATTEMPT_REGISTERED,
    ATTEMPT_UNREGISTERED,
    ATTEMPT_FINISHED, // Will send the final state
    ATTEMPT_FAILED,
    ATTEMPT_KILLED,
    NODE_UPDATE,
    ATTEMPT_LAUNCHED,

    // Source: Container and ResourceTracker
    APP_RUNNING_ON_NODE,

    // Source: RMStateStore
    APP_NEW_SAVED,
    APP_UPDATE_SAVED,
    APP_SAVE_FAILED,
}
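The ATTEMPT_FAILED event is the one tied to the retry decision noted in the FINAL_SAVING comment of RMAppState: compare the failure count with yarn.resourcemanager.am.max-attempts, then either go back to ACCEPTED or move on to FINAL_SAVING. A rough sketch of that decision follows. The class and method names are hypothetical, and the exact boundary (the `>=` comparison here) is an assumption rather than something taken from the RMAppImpl source.

```java
// Illustrative sketch only; the class and method names are hypothetical.
class AttemptFailedSketch {
    enum State { ACCEPTED, FINAL_SAVING }

    // failedAttempts: attempts that have failed so far, including the latest one.
    // maxAttempts: the value of yarn.resourcemanager.am.max-attempts.
    // Assumption: the limit counts as hit once failedAttempts reaches maxAttempts.
    static State onAttemptFailed(int failedAttempts, int maxAttempts) {
        if (failedAttempts >= maxAttempts) {
            return State.FINAL_SAVING; // give up and start the wrap-up work
        }
        return State.ACCEPTED; // retry with a fresh attempt
    }

    public static void main(String[] args) {
        int maxAttempts = 2; // example value for yarn.resourcemanager.am.max-attempts
        for (int failed = 1; failed <= 3; failed++) {
            System.out.println(failed + " failed attempt(s) -> "
                + onAttemptFailed(failed, maxAttempts));
        }
    }
}
```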
Yarn Client Commands: yarn app|application
-appStates: Works with -list to filter applications based on an input comma-separated list of application states. The valid application state can be one of the following: ALL, NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED
yarn app -list -appStates 'RUNNING,FINISHED' | grep distcp | head | awk '{print $1}'
application_1620823068070_0283
application_1620823068070_0282
application_1620823068070_0281
application_1620823068070_0280
application_1620823068070_0279
application_1620823068070_0278
application_1620823068070_0277
application_1620823068070_0276
application_1620823068070_0287
application_1620823068070_0286
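Note that the states accepted by the filter are a subset of RMAppState: FINAL_SAVING, FINISHING and KILLING do not appear in the list. The sketch below shows one way a comma-separated filter such as 'RUNNING,FINISHED' could be parsed and validated against that list. It is not the yarn CLI's own code: the ClientState enum and the parse method are hypothetical, and treating ALL as a wildcard that selects every state is an assumption.

```java
import java.util.EnumSet;
import java.util.Locale;

// Illustrative sketch only; this is not how the yarn CLI is implemented.
class AppStateFilterSketch {
    // The states named in the help text, minus the ALL wildcard.
    enum ClientState { NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED }

    // Parses a comma-separated list of state names into a set.
    // Throws IllegalArgumentException if a name is not a known state.
    static EnumSet<ClientState> parse(String csv) {
        EnumSet<ClientState> result = EnumSet.noneOf(ClientState.class);
        for (String token : csv.split(",")) {
            String name = token.trim().toUpperCase(Locale.ROOT);
            if (name.isEmpty()) {
                continue; // tolerate stray commas
            }
            if (name.equals("ALL")) {
                return EnumSet.allOf(ClientState.class); // assumed wildcard
            }
            result.add(ClientState.valueOf(name));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("RUNNING,FINISHED"));
        System.out.println(parse("ALL").size());
    }
}
```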