Source Code Study: The YARN Application State Machine

Contents

State Transition Diagram

RMAppState

RMAppEventType

Yarn Client Commands: yarn app|application

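RMApp is the data structure the ResourceManager uses to track the life cycle of a single Application. It is implemented by RMAppImpl, which maintains an Application state machine: the set of states an Application can be in (RMAppState) and the events (RMAppEvent) that cause transitions between them.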

State Transition Diagram

RMAppState

public enum RMAppState {
  // Initial state
  NEW,
  // After the RM receives an app submission from the client,
  // it creates an RMAppImpl object to track the app's state,
  // then immediately persists the app's basic information for failure recovery.
  // The default RMStateStore is FileSystemRMStateStore,
  // configured by yarn.resourcemanager.store.class
  //
  // 1
  // preState: RMAppState.NEW;
  // eventType: RMAppEventType.START;
  // txnHook: RMAppNewlySavingTransition
  // 2
  // preState: RMAppState.NEW_SAVING;
  // eventType: RMAppEventType.NODE_UPDATE;
  // txnHook: RMAppNodeUpdateTransition
  NEW_SAVING,
  // The app has passed validation and its basic information has been persisted;
  // the RM creates an RMAppAttemptImpl to make one run attempt
  //
  // 1
  // preState: RMAppState.NEW;
  // eventType: RMAppEventType.RECOVER;
  // txnHook: RMAppRecoveredTransition;
  // 2
  // preState: RMAppState.NEW_SAVING;
  // eventType: RMAppEventType.APP_NEW_SAVED;
  // txnHook: AddApplicationToSchedulerTransition;
  SUBMITTED,
  // The app has been validated by the ResourceScheduler and submitted to a scheduler queue
  // e.g: CapacityScheduler
  // yarn.scheduler.capacity.maximum-applications:
  // Maximum number of applications that can be pending and running.
  // Validation against the hierarchical queues
  // Submit to the queue
  // Update the metrics
  // Accepted application: a1 for user: u1 in queue: q1
  //
  // preState: RMAppState.SUBMITTED;
  // eventType: RMAppEventType.APP_ACCEPTED;
  // txnHook: StartAppAttemptTransition;
  ACCEPTED,
  // The appMaster is running on some node
  // and the RMAppAttemptImpl is already in the running state
  RUNNING,
  // After an RMAppEventType.ATTEMPT_FAILED event fires,
  // the failure count is first checked against yarn.resourcemanager.am.max-attempts.
  // If the limit has not been exceeded, the state machine returns to ACCEPTED;
  // otherwise it enters FINAL_SAVING for cleanup such as resource reclamation
  FINAL_SAVING,
  // The appMaster has notified the RM over RPC that the app is done and about to exit
  FINISHING,
  // The NM has reported via heartbeat that the container hosting the appMaster has finished
  FINISHED,
  // The appMaster failed
  FAILED,
  // 1
  // preState: RMAppState.ACCEPTED;
  // eventType: RMAppEventType.KILL;
  // txnHook: KillAttemptTransition;
  // 2
  // preState: RMAppState.RUNNING;
  // eventType: RMAppEventType.KILL;
  // txnHook: KillAttemptTransition;
  KILLING,
  // The RM kills the app on receiving a kill command from the client
  KILLED
}
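The preState / eventType pairs in the comments above amount to a lookup table from (current state, event) to next state. The following is a minimal sketch of that idea using plain EnumMaps. It is not Hadoop's own state machine implementation: the class name RMAppTransitionSketch and the helpers arc and next are hypothetical, and only the arcs spelled out in the comments above are modelled (transition hooks are omitted).

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified model of the arcs listed in the RMAppState comments.
public class RMAppTransitionSketch {
  enum State { NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, KILLING }
  enum Event { START, RECOVER, NODE_UPDATE, APP_NEW_SAVED, APP_ACCEPTED, KILL }

  // (current state, event) -> next state
  private static final Map<State, Map<Event, State>> TABLE = new EnumMap<>(State.class);

  // Registers one arc in the table.
  private static void arc(State from, Event on, State to) {
    TABLE.computeIfAbsent(from, s -> new EnumMap<>(Event.class)).put(on, to);
  }

  static {
    arc(State.NEW, Event.START, State.NEW_SAVING);
    arc(State.NEW_SAVING, Event.NODE_UPDATE, State.NEW_SAVING);
    arc(State.NEW, Event.RECOVER, State.SUBMITTED);
    arc(State.NEW_SAVING, Event.APP_NEW_SAVED, State.SUBMITTED);
    arc(State.SUBMITTED, Event.APP_ACCEPTED, State.ACCEPTED);
    arc(State.ACCEPTED, Event.KILL, State.KILLING);
    arc(State.RUNNING, Event.KILL, State.KILLING);
  }

  // Returns the next state, or empty if no arc is registered for this pair.
  static Optional<State> next(State current, Event event) {
    Map<Event, State> arcs = TABLE.getOrDefault(current, Collections.emptyMap());
    return Optional.ofNullable(arcs.get(event));
  }

  public static void main(String[] args) {
    State s = State.NEW;
    Event[] events = {Event.START, Event.APP_NEW_SAVED, Event.APP_ACCEPTED, Event.KILL};
    for (Event e : events) {
      State from = s;
      s = next(from, e).orElseThrow(
          () -> new IllegalStateException("no arc for " + e + " in " + from));
      System.out.println(from + " --" + e + "--> " + s);
    }
  }
}
```

Running main walks one app from NEW through NEW_SAVING, SUBMITTED and ACCEPTED to KILLING. An unregistered pair such as (RUNNING, START) yields an empty Optional rather than a silent no-op, which makes invalid events easy to spot in the sketch.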

RMAppEventType

public enum RMAppEventType {
  // Source: ClientRMService
  START,
  RECOVER,
  KILL,

  // Source: Scheduler and RMAppManager
  APP_REJECTED,

  // Source: Scheduler
  APP_ACCEPTED,

  // Source: RMAppAttempt
  ATTEMPT_REGISTERED,
  ATTEMPT_UNREGISTERED,
  ATTEMPT_FINISHED, // Will send the final state
  ATTEMPT_FAILED,
  ATTEMPT_KILLED,
  NODE_UPDATE,
  ATTEMPT_LAUNCHED,
  
  // Source: Container and ResourceTracker
  APP_RUNNING_ON_NODE,

  // Source: RMStateStore
  APP_NEW_SAVED,
  APP_UPDATE_SAVED,
  APP_SAVE_FAILED,
}
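Among the events sourced from RMAppAttempt, ATTEMPT_FAILED is the one the RMAppState comments describe as branching: the app either goes back to ACCEPTED for another attempt or moves on to FINAL_SAVING. A minimal sketch of that decision follows. The class AttemptFailedSketch and the method onAttemptFailed are hypothetical, the boundary condition (retry while the failure count is below the limit) is an assumption, and any further conditions the real transition may check are left out.

```java
// Hypothetical sketch of the ATTEMPT_FAILED branch described in the comments.
public class AttemptFailedSketch {
  enum State { ACCEPTED, FINAL_SAVING }

  // Assumption: retry while fewer attempts have failed than the limit allows;
  // once the limit is reached, move to FINAL_SAVING.
  static State onAttemptFailed(int failedAttempts, int maxAttempts) {
    return failedAttempts < maxAttempts ? State.ACCEPTED : State.FINAL_SAVING;
  }

  public static void main(String[] args) {
    // Stand-in for yarn.resourcemanager.am.max-attempts; 2 is used here for illustration.
    int maxAttempts = 2;
    for (int failed = 1; failed <= 3; failed++) {
      System.out.println(failed + " failed attempt(s) -> " + onAttemptFailed(failed, maxAttempts));
    }
  }
}
```

Because one event can lead to two different target states, this arc cannot be expressed as a single fixed (state, event) -> state entry; the target is computed when the event is handled.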

Yarn Client Commands: yarn app|application

-appStates <States>    Works with -list to filter applications based on
                       input comma-separated list of application states.
                       The valid application state can be one of the
                       following: ALL, NEW, NEW_SAVING, SUBMITTED,
                       ACCEPTED, RUNNING, FINISHED, FAILED, KILLED

yarn app -list -appStates 'RUNNING,FINISHED' | grep distcp | head | awk '{print $1}'
application_1620823068070_0283
application_1620823068070_0282
application_1620823068070_0281
application_1620823068070_0280
application_1620823068070_0279
application_1620823068070_0278
application_1620823068070_0277
application_1620823068070_0276
application_1620823068070_0287
application_1620823068070_0286      
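The -appStates value is a comma-separated list drawn from the states named in the help text, with ALL acting as a wildcard. The sketch below shows one way such a filter string could be parsed and validated. It is an illustration only, not the YARN CLI's own parsing code: the class AppStatesFilterSketch, the enum AppState and the method parse are hypothetical, and the trimming and upper-casing of tokens are assumptions of this sketch.

```java
import java.util.EnumSet;
import java.util.Locale;

// Hypothetical parser for a comma-separated application-state filter.
public class AppStatesFilterSketch {
  // The concrete states listed in the help text (ALL is handled separately).
  enum AppState { NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED }

  static EnumSet<AppState> parse(String csv) {
    EnumSet<AppState> result = EnumSet.noneOf(AppState.class);
    for (String token : csv.split(",")) {
      String name = token.trim().toUpperCase(Locale.ROOT);
      if (name.isEmpty()) {
        continue; // ignore empty entries such as a trailing comma
      }
      if (name.equals("ALL")) {
        return EnumSet.allOf(AppState.class); // wildcard: every state matches
      }
      // valueOf throws IllegalArgumentException for names outside the valid list
      result.add(AppState.valueOf(name));
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(parse("RUNNING,FINISHED")); // prints [RUNNING, FINISHED]
  }
}
```

Note that the list of valid filter values is shorter than RMAppState: FINAL_SAVING, FINISHING and KILLING do not appear in the help text, so this sketch rejects them.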
