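This article covers ART exception handling: how ART intercepts the SIGSEGV signal, how the implicit suspend check is implemented, and how ordinary Java exceptions are detected and thrown in ART. The detection and throwing of StackOverflowError and NullPointerException, and the implementation of throw-catch, are fairly involved. They were originally part of this article, but it grew too long, so those three topics were split out into separate articles: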
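ART Exception Handling Mechanism (2) - StackOverflowError implementation
ART Exception Handling Mechanism (3) - NullPointerException implementation
ART Exception Handling Mechanism (4) - throw & catch & finally implementation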
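In fact, the two main kinds of exception in ART are both implemented by generating a SIGSEGV signal and then intercepting it. So we first need to understand how ART intercepts and processes SIGSEGV.
1. FaultManager initialization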
ART handles Linux signals through FaultManager. A given signal first goes through ART's signal handler, art_fault_handler; when ART recognizes the situation, it converts the signal into a Java exception that a Java developer can recognize and handle. Let's look at how FaultManager is implemented. FaultManager is initialized while the virtual machine starts, so Java exceptions can be handled as soon as startup completes. In Runtime::Init in runtime.cc:

    bool Runtime::Init(RuntimeArgumentMap&& runtime_options_in) {
      ...
      fault_manager.Init();
      if (implicit_suspend_checks_) {
        new SuspensionHandler(&fault_manager);
      }
      if (implicit_so_checks_) {
        new StackOverflowHandler(&fault_manager);
      }
      if (implicit_null_checks_) {
        new NullPointerHandler(&fault_manager);
      }
      if (kEnableJavaStackTraceHandler) {
        new JavaStackTraceHandler(&fault_manager);
      }
      ...
    }

Here fault_manager.Init() installs a signal handler that intercepts specific signals; the special processing of those signals is what implements Java exceptions. Each of the handlers created after it adds itself, in its constructor, to fault_manager's collection of signal handlers so that it can process those signals later.
    void FaultManager::Init() {
      CHECK(!initialized_);
      sigset_t mask;
      sigfillset(&mask);
      sigdelset(&mask, SIGABRT);
      sigdelset(&mask, SIGBUS);
      sigdelset(&mask, SIGFPE);
      sigdelset(&mask, SIGILL);
      sigdelset(&mask, SIGSEGV);
      SigchainAction sa = {
        .sc_sigaction = art_fault_handler,
        .sc_mask = mask,
        .sc_flags = 0UL,
      };
      AddSpecialSignalHandlerFn(SIGSEGV, &sa);
      initialized_ = true;
    }

The resulting mask contains every signal except SIGABRT, SIGBUS, SIGFPE, SIGILL and SIGSEGV, and only SIGSEGV is passed to AddSpecialSignalHandlerFn.
    extern "C" void AddSpecialSignalHandlerFn(int signal, SigchainAction* sa) {
      InitializeSignalChain();
      if (signal <= 0 || signal >= _NSIG) {
        fatal("Invalid signal %d", signal);
      }
      // Set the managed_handler.
      chains[signal].AddSpecialHandler(sa);
      chains[signal].Claim(signal);
    }

Here chains is an array with one entry per Linux signal: static SignalChain chains[_NSIG]; InitializeSignalChain looks up the sigaction and sigprocmask functions that sigchainlib wraps, for later use:
    __attribute__((constructor)) static void InitializeSignalChain() {
      ...
      void* linked_sigaction = dlsym(RTLD_NEXT, "sigaction");
      void* linked_sigprocmask = dlsym(RTLD_NEXT, "sigprocmask");
      ...
    }

As I recall, earlier Android versions added libsigchain.so through LD_PRELOAD in init.rc; this seems to have changed later, and I have not studied the new approach. sigchainlib implements its own sigaction and sigprocmask. By a mechanism similar to LD_PRELOAD, libsigchain.so is placed ahead of libc, so a call to sigaction or sigprocmask reaches libsigchain's version rather than libc's. dlsym(RTLD_NEXT, "***") is used here to obtain pointers to libc's versions for later use. Put simply, this hooks these two libc functions: every caller ends up in the sigaction and sigprocmask implemented by libsigchain.
    extern "C" int sigaction(int signal, const struct sigaction* new_action, struct sigaction* old_action) {
      InitializeSignalChain();
      if (signal < 0 || signal >= _NSIG) {
        errno = EINVAL;
        return -1;
      }
      if (chains[signal].IsClaimed()) {
        struct sigaction saved_action = chains[signal].GetAction();
        if (new_action != nullptr) {
          chains[signal].SetAction(new_action);
        }
        if (old_action != nullptr) {
          *old_action = saved_action;
        }
        return 0;
      }
      // Will only get here if the signal chain has not been claimed. We want
      // to pass the sigaction on to the kernel via the real sigaction in libc.
      return linked_sigaction(signal, new_action, old_action);
    }
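The code shows that a sigaction() call for any signal goes through libsigchain's sigaction. If the signal is one we care about (it has been claimed), the new action is not installed in the kernel; it is stored in the action_ member of that signal's SignalChain, and the previously saved action is returned through old_action. If the signal is not one we care about, the call falls through to libc's sigaction, which does install the new action in the kernel. signal() is hooked for the same purpose as sigaction(); the only difference is that its default path calls libc's sigaction() rather than libc's signal(). The sigprocmask implemented in sigchainlib is similar: when a program calls sigprocmask with SIG_BLOCK to block a signal we care about, that signal is removed from the mask first, so that our handling is not affected.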
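The claimed/unclaimed decision described above can be sketched in isolation. This is a minimal illustration rather than libsigchain's code: ChainSketch and InterceptSigaction are hypothetical names, and the function only reports whether the request would be absorbed by the chain or forwarded to libc.

```cpp
#include <cassert>
#include <signal.h>

// Hypothetical, simplified stand-in for sigchain's per-signal bookkeeping.
struct ChainSketch {
  bool claimed = false;                 // true once the runtime owns this signal
  struct sigaction user_action = {};    // what the application believes is installed
};

// Mirrors the decision made by the interposed sigaction():
// returns true if the request was absorbed by the chain (kernel untouched),
// false if it would be forwarded to the real libc sigaction().
bool InterceptSigaction(ChainSketch& chain,
                        const struct sigaction* new_action,
                        struct sigaction* old_action) {
  if (!chain.claimed) {
    return false;  // not our signal: the caller would forward to libc
  }
  struct sigaction saved = chain.user_action;
  if (new_action != nullptr) {
    chain.user_action = *new_action;  // remember it, but do not install it
  }
  if (old_action != nullptr) {
    *old_action = saved;              // report the previously remembered action
  }
  return true;
}
```

The point of the design is that the application sees consistent old/new actions while the kernel-level handler stays the runtime's own.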
Next come chains[signal].AddSpecialHandler(sa) and Claim(signal):

    SigchainAction special_handlers_[2];

    void AddSpecialHandler(SigchainAction* sa) {
      for (SigchainAction& slot : special_handlers_) {
        if (slot.sc_sigaction == nullptr) {
          slot = *sa;
          return;
        }
      }
      fatal("too many special signal handlers");
    }

This stores the SigchainAction initialized in FaultManager::Init (with sc_sigaction = art_fault_handler) in the first free slot of special_handlers_. Then Claim(SIGSEGV):

    void Claim(int signo) {
      if (!claimed_) {
        Register(signo);
        claimed_ = true;
      }
    }

    void Register(int signo) {
      struct sigaction handler_action = {};
      handler_action.sa_sigaction = SignalChain::Handler;
      handler_action.sa_flags = SA_RESTART | SA_SIGINFO | SA_ONSTACK;
      sigfillset(&handler_action.sa_mask);
      linked_sigaction(signo, &handler_action, &action_);
    }
As shown, when Claim calls Register to register the signal, it invokes the linked_sigaction obtained earlier, which is libc's sigaction(). This installs void SignalChain::Handler(int signo, siginfo_t* siginfo, void* ucontext_raw) as the handler for SIGSEGV.
Overall, fault_manager.Init() uses the sigchain functions to install SignalChain::Handler as the SIGSEGV handler, and sigchain hooks four libc functions, sigaction(), sigprocmask(), signal() and (on 32-bit) bsd_signal(), so that other code cannot undo this setup.
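For readers who have not installed an SA_SIGINFO handler before, here is a minimal sketch of the same registration pattern that Register uses. It is an illustration under assumptions: SIGUSR1 is used instead of SIGSEGV so that it can be raised safely, SA_ONSTACK is omitted because no alternate stack is set up, and the names SketchHandler, InstallSketchHandler and RaiseAndReport are hypothetical.

```cpp
#include <cassert>
#include <signal.h>

// Last signal number seen by the handler (async-signal-safe type).
static volatile sig_atomic_t g_last_signal = 0;

// Three-argument handler, the form selected by SA_SIGINFO.
static void SketchHandler(int signo, siginfo_t* /*info*/, void* /*ucontext*/) {
  g_last_signal = signo;
}

// Installs SketchHandler for `signo`, blocking all signals while it runs.
bool InstallSketchHandler(int signo) {
  struct sigaction action = {};
  action.sa_sigaction = SketchHandler;
  action.sa_flags = SA_RESTART | SA_SIGINFO;
  sigfillset(&action.sa_mask);
  return sigaction(signo, &action, nullptr) == 0;
}

// Raises `signo` in the calling thread and reports what the handler recorded.
int RaiseAndReport(int signo) {
  g_last_signal = 0;
  raise(signo);
  return g_last_signal;
}
```

In a single-threaded program, raise() returns only after the handler has run, so RaiseAndReport(SIGUSR1) reports SIGUSR1 once the handler is installed.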
SignalChain::Handler() first calls art_fault_handler to try to handle the SIGSEGV; if that fails, the saved sigaction (the default one, or one set by the application) handles the signal. Back in Runtime::Init(), the statements after fault_manager.Init(), such as new SuspensionHandler(&fault_manager), add each handler to FaultManager in its constructor; the implicit-check handlers go into the generated_code_handlers_ collection, which art_fault_handler later uses to try to handle SIGSEGV. For example:

    NullPointerHandler::NullPointerHandler(FaultManager* manager) : FaultHandler(manager) {
      manager_->AddHandler(this, true);
    }

The second argument passed is true:

    void FaultManager::AddHandler(FaultHandler* handler, bool generated_code) {
      DCHECK(initialized_);
      if (generated_code) {
        generated_code_handlers_.push_back(handler);
      } else {
        other_handlers_.push_back(handler);
      }
    }

At this point NullPointerHandler has been added to generated_code_handlers_. Now look at art_fault_handler:

    static bool art_fault_handler(int sig, siginfo_t* info, void* context) {
      return fault_manager.HandleFault(sig, info, context);
    }

    bool FaultManager::HandleFault(int sig, siginfo_t* info, void* context) {
      ...
      if (IsInGeneratedCode(info, context, true)) {
        for (const auto& handler : generated_code_handlers_) {
          VLOG(signals) << "invoking Action on handler " << handler;
          if (handler->Action(sig, info, context)) {
            return true;
          }
        }
        ...
    }

Summary: FaultManager initialization accomplishes two things:
- It arranges for SIGSEGV to go through ART first
  - When ART handles a SIGSEGV, art_fault_handler mainly tries generated_code_handlers_ first
- It adds NullPointerHandler and the other handlers to generated_code_handlers_
In short, the exceptions that ART detects implicitly (such as NullPointerException and StackOverflowError) are built on the SIGSEGV signal; the explicitly checked exceptions in section 7 do not involve signals.
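The "first handler that returns true consumes the fault" dispatch can be sketched as follows. This is a simplified model, not ART's classes: HandlerSketch, FaultKind, Dispatch and MakeDefaultHandlers are hypothetical names, and a real Action inspects the signal context rather than an enum.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical classification of a fault, standing in for the signal context.
enum class FaultKind { kSuspend, kStackOverflow, kNullPointer, kUnknown };

// Hypothetical handler: a name plus an Action-like predicate.
struct HandlerSketch {
  std::string name;
  std::function<bool(FaultKind)> action;  // returns true if it handled the fault
};

// Tries each handler in registration order; returns the name of the first
// handler that consumed the fault, or an empty string if none did.
std::string Dispatch(const std::vector<HandlerSketch>& handlers, FaultKind kind) {
  for (const HandlerSketch& handler : handlers) {
    if (handler.action(kind)) {
      return handler.name;
    }
  }
  return "";
}

// Registration order mirrors the order of the `new ...Handler` statements.
std::vector<HandlerSketch> MakeDefaultHandlers() {
  return {
      {"SuspensionHandler", [](FaultKind k) { return k == FaultKind::kSuspend; }},
      {"StackOverflowHandler", [](FaultKind k) { return k == FaultKind::kStackOverflow; }},
      {"NullPointerHandler", [](FaultKind k) { return k == FaultKind::kNullPointer; }},
  };
}
```

An empty result corresponds to HandleFault returning false, after which the signal is passed on to the previously saved action.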
2. How ART handles the SIGSEGV signal
As seen above, SIGSEGV is handled by SignalChain::Handler:

    void SignalChain::Handler(int signo, siginfo_t* siginfo, void* ucontext_raw) {
      if (!GetHandlingSignal()) {
        for (const auto& handler : chains[signo].special_handlers_) {
          if (handler.sc_sigaction == nullptr) {
            break;
          }
          bool handler_noreturn = (handler.sc_flags & SIGCHAIN_ALLOW_NORETURN);
          sigset_t previous_mask;
          linked_sigprocmask(SIG_SETMASK, &handler.sc_mask, &previous_mask);
          ScopedHandlingSignal restorer;
          if (!handler_noreturn) {
            SetHandlingSignal(true);
          }
          if (handler.sc_sigaction(signo, siginfo, ucontext_raw)) {
            return;
          }
          linked_sigprocmask(SIG_SETMASK, &previous_mask, nullptr);
        }
      }
      // Forward to the user's signal handler.
      int handler_flags = chains[signo].action_.sa_flags;
      ucontext_t* ucontext = static_cast<ucontext_t*>(ucontext_raw);
      sigset_t mask;
      sigorset(&mask, &ucontext->uc_sigmask, &chains[signo].action_.sa_mask);
      if (!(handler_flags & SA_NODEFER)) {
        sigaddset(&mask, signo);
      }
      linked_sigprocmask(SIG_SETMASK, &mask, nullptr);
      if ((handler_flags & SA_SIGINFO)) {
        chains[signo].action_.sa_sigaction(signo, siginfo, ucontext_raw);
      } else {
        auto handler = chains[signo].action_.sa_handler;
        if (handler == SIG_IGN) {
          return;
        } else if (handler == SIG_DFL) {
          fatal("exiting due to SIG_DFL handler for signal %d", signo);
        } else {
          handler(signo);
        }
      }
    }

What this function does:
- If the current thread is not already handling a signal, it tries the sc_sigaction of each special handler, that is, it tries art_fault_handler on the SIGSEGV
- If art_fault_handler can handle the signal, the function returns once handling is done
- If it cannot, the saved action (action_) in the SignalChain for SIGSEGV handles the signal: if the application installed a handler for this signal, that handler is called; otherwise the signal should fall through to the sigaction installed by the linker, and debuggerd ultimately handles it
So in the normal case a SIGSEGV first reaches this function and then art_fault_handler:
    // Signal handler called on SIGSEGV.
    static bool art_fault_handler(int sig, siginfo_t* info, void* context) {
      return fault_manager.HandleFault(sig, info, context);
    }

    bool FaultManager::HandleFault(int sig, siginfo_t* info, void* context) {
      VLOG(signals) << "Handling fault";
    #ifdef TEST_NESTED_SIGNAL
      // Simulate a crash in a handler.
      raise(SIGSEGV);
    #endif
      if (IsInGeneratedCode(info, context, true)) {
        VLOG(signals) << "in generated code, looking for handler";
        for (const auto& handler : generated_code_handlers_) {
          VLOG(signals) << "invoking Action on handler " << handler;
          if (handler->Action(sig, info, context)) {
            // We have handled a signal so it's time to return from the
            // signal handler to the appropriate place.
            return true;
          }
        }
        // We hit a signal we didn't handle. This might be something for which
        // we can give more information about so call all registered handlers to
        // see if it is.
        if (HandleFaultByOtherHandlers(sig, info, context)) {
          return true;
        }
      }
      // Set a breakpoint in this function to catch unhandled signals.
      art_sigsegv_fault();
      return false;
    }

As shown, HandleFault first calls IsInGeneratedCode() to decide whether the SIGSEGV occurred in generated code, that is, in native code compiled from Java code. Only if it did are generated_code_handlers_ and then the other handlers tried in turn.
Here is the key code of that check, which decides whether the fault is in generated code:

    bool FaultManager::IsInGeneratedCode(siginfo_t* siginfo, void* context, bool check_dex_pc) {
      ...
      ThreadState state = thread->GetState();
      if (state != kRunnable) {
        return false;
      }
      if (!Locks::mutator_lock_->IsSharedHeld(thread)) {
        return false;
      }
      GetMethodAndReturnPcAndSp(siginfo, context, &method_obj, &return_pc, &sp);
      const OatQuickMethodHeader* method_header = method_obj->GetOatQuickMethodHeader(return_pc);
      uint32_t dexpc = method_header->ToDexPc(method_obj, return_pc, false);
      return !check_dex_pc || dexpc != DexFile::kDexNoIndex;
    }

- If the SIGSEGV occurred in generated code, the current thread must be in the kRunnable state and must hold mutator_lock_
- The corresponding ArtMethod is looked up from the current context; for a fault in generated code this must succeed
- The dex_pc for the faulting location is looked up from the ArtMethod; for a fault in generated code this should succeed as well
Assume the SIGSEGV being handled did occur in generated code. The Action functions of generated_code_handlers_ and then of the other handlers try to handle it in turn. generated_code_handlers_ holds the following handlers, in this order:

    if (implicit_suspend_checks_) {
      new SuspensionHandler(&fault_manager);
    }
    if (implicit_so_checks_) {
      new StackOverflowHandler(&fault_manager);
    }
    if (implicit_null_checks_) {
      new NullPointerHandler(&fault_manager);
    }

and other_handlers_ holds only one:

    if (kEnableJavaStackTraceHandler) {
      new JavaStackTraceHandler(&fault_manager);
    }

So when a SIGSEGV arrives, these four handlers are tried in priority order, which is their registration order. Each handler's Action function extracts some information from the current context and checks whether it matches what that handler expects. If it matches, the handler can handle this SIGSEGV and returns true, and the remaining handlers are skipped; otherwise the next handler is tried. If none of them can handle it, the signal goes to the default handler and ends up in debuggerd. Note also that each handler is added conditionally. On an Android 7.0 phone the switches had these values:

    (gdb) p 'art::Runtime'::instance_->implicit_suspend_checks_
    $2 = false
    (gdb) p 'art::Runtime'::instance_->implicit_so_checks_
    $3 = true
    (gdb) p 'art::Runtime'::instance_->implicit_null_checks_
    $4 = true
    static constexpr bool kEnableJavaStackTraceHandler = false;

So in a real runtime environment, a SIGSEGV only needs to be tried by StackOverflowHandler and NullPointerHandler; if neither can handle it, it goes to the handler installed by the linker.
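One detail of the forwarding step in SignalChain::Handler is the signal mask it builds before calling the user's handler: the union of the interrupted context's mask and the user action's sa_mask, plus the signal itself unless SA_NODEFER is set. The sketch below reproduces that computation with portable POSIX calls; sigorset is replaced by an explicit loop, and ForwardingMask is a hypothetical name.

```cpp
#include <cassert>
#include <signal.h>

// Builds the mask that would be in effect while the user's handler runs:
// interrupted_mask OR user_action.sa_mask, plus signo unless SA_NODEFER is set.
sigset_t ForwardingMask(const sigset_t& interrupted_mask,
                        const struct sigaction& user_action,
                        int signo) {
  sigset_t mask;
  sigemptyset(&mask);
  for (int s = 1; s < NSIG; ++s) {
    // sigismember returns 1 for members, 0 for non-members, -1 on error.
    if (sigismember(&interrupted_mask, s) == 1 ||
        sigismember(&user_action.sa_mask, s) == 1) {
      sigaddset(&mask, s);  // failures for reserved signals are ignored here
    }
  }
  if (!(user_action.sa_flags & SA_NODEFER)) {
    sigaddset(&mask, signo);
  }
  return mask;
}
```

This matches what the kernel would have done had the user's handler been installed directly, which is why the chain can stay invisible to the application.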
3. SuspensionHandler implementation
Each handler is implemented mainly in its Action() function. Because Action has to read the current context, these functions are platform-specific: x86 handles the context differently from arm, and on arm the 32-bit and 64-bit versions differ as well. Here we mainly analyze the arm 32-bit implementation.

    // A suspend check is done using the following instruction sequence:
    // 0xf723c0b2: f8d902c0  ldr.w r0, [r9, #704]  ; suspend_trigger_
    // .. some intervening instruction
    // 0xf723c0b6: 6800      ldr r0, [r0, #0]

These comment lines state the principle behind SuspensionHandler.
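To make a thread perform a suspend check while it executes generated code, the thread's suspend_trigger_ is set to nullptr. With the sequence above, generated code first loads the suspend_trigger_ member with ldr.w r0, [r9, #704] (r9 holds the Thread pointer, and 704 is the offset of suspend_trigger_ within Thread), then executes ldr r0, [r0, #0] to load through r0. When the trigger is armed, suspend_trigger_ is 0, so this second load raises a SIGSEGV. SuspensionHandler::Action gets the first chance to handle it, recognizes the pattern, and redirects execution into the suspend check.
Conversely, when no suspend check is needed, suspend_trigger_ is set to its own address, so the second load succeeds and no SIGSEGV is raised.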
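The self-pointer trick can be modelled without actually faulting. The sketch below is a toy model, not ART's Thread class: ThreadSketch and PollWouldFault are hypothetical names, and PollWouldFault only reports whether the second load of the sequence would dereference a null pointer.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of the per-thread suspend trigger slot.
struct ThreadSketch {
  uintptr_t* suspend_trigger;

  ThreadSketch() { RemoveSuspendTrigger(); }
  // The slot stores its own address, so copying the object would leave a
  // stale pointer; copying is disabled to keep the model honest.
  ThreadSketch(const ThreadSketch&) = delete;
  ThreadSketch& operator=(const ThreadSketch&) = delete;

  // Arm the trigger: the next load through the slot would hit address 0.
  void TriggerSuspend() { suspend_trigger = nullptr; }

  // Disarm the trigger: point the slot at itself, a valid readable address.
  void RemoveSuspendTrigger() {
    suspend_trigger = reinterpret_cast<uintptr_t*>(&suspend_trigger);
  }

  // Models "ldr r0, [r0, #0]": true if that load would fault.
  bool PollWouldFault() const { return suspend_trigger == nullptr; }
};
```

The appeal of the scheme is that the common, disarmed case costs two loads and no branch; the price is paid only when a suspend is actually requested.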
Enabling and disabling the suspend trigger:

    void TriggerSuspend() {
      tlsPtr_.suspend_trigger = nullptr;
    }

    void RemoveSuspendTrigger() {
      tlsPtr_.suspend_trigger = reinterpret_cast<uintptr_t*>(&tlsPtr_.suspend_trigger);
    }

The trigger is enabled in three cases:
- At the end of bool Thread::ModifySuspendCountInternal(): if the updated suspend_count is greater than 0, the thread has been asked to suspend, and the sooner the better. TriggerSuspend() is called so that the thread performs a suspend check while executing generated code and then enters the suspended state
- After bool Thread::RequestCheckpoint(Closure* function) successfully sets a checkpoint function on a thread: TriggerSuspend() is called because the checkpoint function should also run as soon as possible. Once triggered, the suspend check looks for a checkpoint function first and runs it immediately if one exists
- After bool Thread::RequestEmptyCheckpoint() succeeds: TriggerSuspend() is called as well; I have not looked into what EmptyCheckpoint is for
With this background, the implementation of SuspensionHandler::Action is easy to follow:
    bool SuspensionHandler::Action(int sig ATTRIBUTE_UNUSED, siginfo_t* info ATTRIBUTE_UNUSED, void* context) {
      // These are the instructions to check for. The first one is the ldr r0,[r9,#xxx]
      // where xxx is the offset of the suspend trigger.
      uint32_t checkinst1 = 0xf8d90000 + Thread::ThreadSuspendTriggerOffset<PointerSize::k32>().Int32Value();
      uint16_t checkinst2 = 0x6800;
      struct ucontext* uc = reinterpret_cast<struct ucontext*>(context);
      struct sigcontext *sc = reinterpret_cast<struct sigcontext*>(&uc->uc_mcontext);
      uint8_t* ptr2 = reinterpret_cast<uint8_t*>(sc->arm_pc);
      uint8_t* ptr1 = ptr2 - 4;
      VLOG(signals) << "checking suspend";
      uint16_t inst2 = ptr2[0] | ptr2[1] << 8;
      VLOG(signals) << "inst2: " << std::hex << inst2 << " checkinst2: " << checkinst2;
      if (inst2 != checkinst2) {
        // Second instruction is not good, not ours.
        return false;
      }
      uint8_t* limit = ptr1 - 40;  // Compiler will hoist to a max of 20 instructions.
      bool found = false;
      while (ptr1 > limit) {
        uint32_t inst1 = ((ptr1[0] | ptr1[1] << 8) << 16) | (ptr1[2] | ptr1[3] << 8);
        VLOG(signals) << "inst1: " << std::hex << inst1 << " checkinst1: " << checkinst1;
        if (inst1 == checkinst1) {
          found = true;
          break;
        }
        ptr1 -= 2;  // Min instruction size is 2 bytes.
      }
      if (found) {
        sc->arm_lr = sc->arm_pc + 3;  // +2 + 1 (for thumb)
        sc->arm_pc = reinterpret_cast<uintptr_t>(art_quick_implicit_suspend);
        // Now remove the suspend trigger that caused this fault.
        Thread::Current()->RemoveSuspendTrigger();
        VLOG(signals) << "removed suspend trigger invoking test suspend";
        return true;
      }
      return false;
    }
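Action first checks whether the instruction that raised the SIGSEGV is 0x6800 (ldr r0, [r0, #0]); only then can it be a suspend check. It then scans the 40 bytes before that instruction for 0xf8d902c0 (ldr.w r0, [r9, #704]). As for why 40 bytes: the comment in the code says the compiler hoists the first load by at most 20 instructions, and the minimum Thumb instruction size is 2 bytes. How ART's code generator guarantees that bound has not been investigated here.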
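The matching logic can be tried on a plain byte buffer. The sketch below is an illustration that follows the scan in Action under assumptions: it works on a std::vector instead of the faulting pc, LooksLikeSuspendCheck and ReadHalfword are hypothetical names, and the trigger offset is passed in rather than taken from Thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reads one little-endian 16-bit Thumb halfword at byte offset `at`.
static uint32_t ReadHalfword(const std::vector<uint8_t>& code, size_t at) {
  return static_cast<uint32_t>(code[at]) |
         (static_cast<uint32_t>(code[at + 1]) << 8);
}

// Returns true if the instruction at byte offset `pc` is "ldr r0, [r0, #0]"
// and "ldr.w r0, [r9, #trigger_offset]" appears within the scan window
// before it (20 candidate positions, 2 bytes apart, starting at pc - 4).
bool LooksLikeSuspendCheck(const std::vector<uint8_t>& code, size_t pc,
                           uint32_t trigger_offset) {
  const uint32_t check_inst1 = 0xf8d90000u + trigger_offset;
  const uint32_t check_inst2 = 0x6800u;
  if (pc + 2 > code.size() || ReadHalfword(code, pc) != check_inst2) {
    return false;  // faulting instruction is not the second load
  }
  for (size_t back = 4; back < 44 && back <= pc; back += 2) {
    const size_t at = pc - back;
    // A 32-bit Thumb-2 instruction is stored as two halfwords, high one first.
    const uint32_t inst1 = (ReadHalfword(code, at) << 16) | ReadHalfword(code, at + 2);
    if (inst1 == check_inst1) {
      return true;
    }
  }
  return false;
}
```

For example, the bytes d9 f8 c0 02 (ldr.w r0, [r9, #704]) followed by 00 bf (a nop) and 00 68 (ldr r0, [r0, #0]) match when pc is 6 and the offset is 704.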
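Setting aside the question of the 40-byte window, suppose the check matches and the SIGSEGV really was caused by TriggerSuspend(). The handler then has to deal with the signal, and it does so by setting arm_pc to jump to the implicit suspend check entry point. Before the jump, lr is set to pc + 2 + 1 (+2 because the instruction at the current pc is 2 bytes long, +1 because execution must continue in Thumb mode after returning from the suspend check):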
    sc->arm_lr = sc->arm_pc + 3;  // +2 + 1 (for thumb)
    sc->arm_pc = reinterpret_cast<uintptr_t>(art_quick_implicit_suspend);

Execution then enters the suspend check function, which is also platform-specific:
    ENTRY art_quick_implicit_suspend
        mov r0, rSELF
        SETUP_SAVE_REFS_ONLY_FRAME r1          @ save callee saves for stack crawl
        bl artTestSuspendFromCode              @ (Thread*)
        RESTORE_SAVE_REFS_ONLY_FRAME_AND_RETURN
    END art_quick_implicit_suspend

As we can see, this actually jumps into artTestSuspendFromCode:
    extern "C" void artTestSuspendFromCode(Thread* self) REQUIRES_SHARED(Locks::mutator_lock_) {
      // Called when suspend count check value is 0 and thread->suspend_count_ != 0
      ScopedQuickEntrypointChecks sqec(self);
      self->CheckSuspend();
    }

which then reaches Thread::CheckSuspend():
    inline void Thread::CheckSuspend() {
      DCHECK_EQ(Thread::Current(), this);
      for (;;) {
        if (ReadFlag(kCheckpointRequest)) {
          RunCheckpointFunction();
        } else if (ReadFlag(kSuspendRequest)) {
          FullSuspendCheck();
        } else if (ReadFlag(kEmptyCheckpointRequest)) {
          RunEmptyCheckpoint();
        } else {
          break;
        }
      }
    }
This function runs the checkpoint function, the suspend check and the empty checkpoint in that priority order, which corresponds to the three TriggerSuspend() cases described above.
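The priority loop can be modelled with a plain bit mask. This is a sketch under assumptions: the flag values and the name DrainRequests are hypothetical, and each "handler" simply records a letter and clears its flag, whereas the real functions do the actual work and clear the flags themselves.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical flag bits standing in for the thread's request flags.
enum FlagSketch : uint32_t {
  kSuspendRequest = 1,
  kCheckpointRequest = 2,
  kEmptyCheckpointRequest = 4,
};

// Services pending requests until none remain, always re-checking from the
// highest priority: checkpoint (C), then suspend (S), then empty checkpoint (E).
// Returns the order in which requests were serviced.
std::string DrainRequests(uint32_t flags) {
  std::string order;
  for (;;) {
    if (flags & kCheckpointRequest) {
      order += 'C';
      flags &= ~static_cast<uint32_t>(kCheckpointRequest);
    } else if (flags & kSuspendRequest) {
      order += 'S';
      flags &= ~static_cast<uint32_t>(kSuspendRequest);
    } else if (flags & kEmptyCheckpointRequest) {
      order += 'E';
      flags &= ~static_cast<uint32_t>(kEmptyCheckpointRequest);
    } else {
      break;
    }
  }
  return order;
}
```

Looping until no flag remains matters because a new request can arrive while an earlier one is being serviced.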
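That covers the overall flow of SuspensionHandler, but one question remains:
Where in generated code does this implicit suspend check sit, and when does it execute? Answering this requires studying the design requirements of the implicit suspend check and how the code generator emits this code. Since implicit suspend checks are not enabled, I have not pursued this.
Because generated code contains no implicit suspend checks of this kind, the suspend checks actually in use must be explicit ones. A brief note on where suspend checks occur:
A suspend check is needed when a Java method returns, when a thread's state changes to kRunnable, and when a kRunnable thread executes a loop (goto), a compare (if-ge) or a switch (packed-switch). In short: 1. a thread switching from another state to kRunnable must check; 2. a kRunnable thread must check when it executes a branch.
1. Suspend check when running in interpreter mode:
At each suspend check point, MterpSuspendCheck is executed to check whether the thread needs to enter the suspended state.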
    extern "C" size_t MterpSuspendCheck(Thread* self) REQUIRES_SHARED(Locks::mutator_lock_) {
      self->AllowThreadSuspension();
      return MterpShouldSwitchInterpreters();
    }

    inline void Thread::AllowThreadSuspension() {
      DCHECK_EQ(Thread::Current(), this);
      if (UNLIKELY(TestAllFlags())) {
        CheckSuspend();
      }
      // Invalidate the current thread's object pointers (ObjPtr) to catch possible moving GC bugs due
      // to missing handles.
      PoisonObjectPointers();
    }
CheckSuspend() was covered above.
2. Explicit suspend check in generated code:
The insertion points serve the same purpose but differ slightly. In generated code the checking code is inserted by the compiler when it compiles a Java method:
    4: void java.lang.ThreadLocal$ThreadLocalMap.<init>(java.lang.ThreadLocal$ThreadLocalMap, java.lang.ThreadLocal$ThreadLocalMap) (dex_method_idx=3662)
      DEX CODE:
        0x0000: 7020 4d0e 1000 | invoke-direct {v0, v1}, void java.lang.ThreadLocal$ThreadLocalMap.<init>(java.lang.ThreadLocal$ThreadLocalMap) // method@3661
        0x0003: 0e00 | return-void
      CODE: (code_offset=0x0060aae4 size_offset=0x0060aae0 size=100)...
        0x0060aae4: d1400bf0 sub x16, sp, #0x2000 (8192)
        0x0060aae8: b940021f ldr wzr, [x16]
          StackMap [native_pc=0x60aaec] (dex_pc=0x0, native_pc_offset=0x8, dex_register_map_offset=0xffffffff, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b00000000000000)
        0x0060aaec: f81b0fe0 str x0, [sp, #-80]!
        0x0060aaf0: a90357f4 stp x20, x21, [sp, #48]
        0x0060aaf4: a9047bf6 stp x22, lr, [sp, #64]
        0x0060aaf8: 79400270 ldrh w16, [tr] ; state_and_flags
        0x0060aafc: 35000190 cbnz w16, #+0x30 (addr 0x60ab2c)
        0x0060ab00: aa0303f4 mov x20, x3
        0x0060ab04: aa0103f5 mov x21, x1
        0x0060ab08: aa0203f6 mov x22, x2
        0x0060ab0c: d0ff6ac0 adrp x0, #-0x12a6000 (addr -0xc9c000)
        0x0060ab10: f9428c00 ldr x0, [x0, #1304]
        0x0060ab14: f940181e ldr lr, [x0, #48]
        0x0060ab18: d63f03c0 blr lr
          StackMap [native_pc=0x60ab1c] (dex_pc=0x0, native_pc_offset=0x38, dex_register_map_offset=0x0, inline_info_offset=0xffffffff, register_mask=0x700000, stack_mask=0b00000000000000) v0: in register (21) [entry 0] v1: in register (22) [entry 1] v2: in register (20) [entry 2]
        0x0060ab1c: a94357f4 ldp x20, x21, [sp, #48]
        0x0060ab20: a9447bf6 ldp x22, lr, [sp, #64]
        0x0060ab24: 910143ff add sp, sp, #0x50 (80)
        0x0060ab28: d65f03c0 ret
        0x0060ab2c: a9010be1 stp x1, x2, [sp, #16]
        0x0060ab30: f90013e3 str x3, [sp, #32]
        0x0060ab34: f9426a7e ldr lr, [tr, #1232] ; pTestSuspend
        0x0060ab38: d63f03c0 blr lr
          StackMap [native_pc=0x60ab3c] (dex_pc=0x0, native_pc_offset=0x58, dex_register_map_offset=0x3, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b00000101010000) v0: in stack (16) [entry 3] v1: in stack (24) [entry 4] v2: in stack (32) [entry 5]
        0x0060ab3c: a9410be1 ldp x1, x2, [sp, #16]
        0x0060ab40: f94013e3 ldr x3, [sp, #32]
        0x0060ab44: 17ffffef b #-0x44 (addr 0x60ab00)

In this method's generated code, the instruction at 0x0060aaf8 near the entry loads the thread-private state_and_flags. If it is non-zero, control branches to 0x0060ab2c to perform the suspend check, which calls through [tr, #1232]:
    (gdb) p (('art::Thread'*)0x7f87e4ea00)->tlsPtr_.quick_entrypoints->pTestSuspend
    $2 = (void (*)(void)) 0x7f878bab10 <art_quick_test_suspend>
    (gdb) p &(('art::Thread'*)0x7f87e4ea00)->tlsPtr_.quick_entrypoints->pTestSuspend
    $3 = (void (**)(void)) 0x7f87e4eed0
    (gdb) p 0x7f87e4eed0-0x7f87e4ea00
    $4 = 1232

So the call really goes to the thread's tlsPtr_.quick_entrypoints->pTestSuspend, whose value is the entry of art_quick_test_suspend:
    ENTRY art_quick_test_suspend
    #ifdef ARM_R4_SUSPEND_FLAG
        ldrh   rSUSPEND, [rSELF, #THREAD_FLAGS_OFFSET]
        cbnz   rSUSPEND, 1f                         @ check Thread::Current()->suspend_count_ == 0
        mov    rSUSPEND, #SUSPEND_CHECK_INTERVAL    @ reset rSUSPEND to SUSPEND_CHECK_INTERVAL
        bx     lr                                   @ return if suspend_count_ == 0
    1:
        mov    rSUSPEND, #SUSPEND_CHECK_INTERVAL    @ reset rSUSPEND to SUSPEND_CHECK_INTERVAL
    #endif
        SETUP_SAVE_EVERYTHING_FRAME r0              @ save everything for GC stack crawl
        mov    r0, rSELF
        bl     artTestSuspendFromCode               @ (Thread*)
        RESTORE_SAVE_EVERYTHING_FRAME
        bx     lr
    END art_quick_test_suspend

This finally jumps to artTestSuspendFromCode; from there the flow is essentially the same as for art_quick_implicit_suspend.
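The offset used in the ldr lr, [tr, #1232] instruction can be cross-checked from the two addresses in the gdb session. The sketch below only redoes that subtraction; the constant and function names are hypothetical, and the addresses are the ones printed above for that particular process.

```cpp
#include <cassert>
#include <cstdint>

// Addresses taken from the gdb session shown above.
constexpr uint64_t kThreadBase = 0x7f87e4ea00ULL;        // the art::Thread object
constexpr uint64_t kTestSuspendSlot = 0x7f87e4eed0ULL;   // &...->pTestSuspend

// Byte offset of a slot relative to the start of the Thread object.
constexpr uint64_t SlotOffset(uint64_t base, uint64_t slot) {
  return slot - base;
}

// 0x4d0 == 1232, the immediate in "ldr lr, [tr, #1232]".
static_assert(SlotOffset(kThreadBase, kTestSuspendSlot) == 0x4d0, "offset mismatch");
```

Because tr always holds the current Thread pointer, generated code can reach any quick entrypoint with a single load at a fixed offset.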
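4. StackOverflowHandler implementation
The existence of this handler tells us that on Android, Java stack overflow is also detected through SIGSEGV.
For the detailed analysis, see: ART Exception Handling Mechanism (2) - StackOverflowError implementation
5. NullPointerHandler implementation
If StackOverflowHandler cannot handle the SIGSEGV, NullPointerHandler tries next.
For the detailed analysis, see: ART Exception Handling Mechanism (3) - NullPointerException implementation
6. JavaStackTraceHandler implementation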
Here is its code:

    bool JavaStackTraceHandler::Action(int sig ATTRIBUTE_UNUSED, siginfo_t* siginfo, void* context) {
      // Make sure that we are in the generated code, but we may not have a dex pc.
      bool in_generated_code = manager_->IsInGeneratedCode(siginfo, context, false);
      if (in_generated_code) {
        LOG(ERROR) << "Dumping java stack trace for crash in generated code";
        ArtMethod* method = nullptr;
        uintptr_t return_pc = 0;
        uintptr_t sp = 0;
        Thread* self = Thread::Current();
        manager_->GetMethodAndReturnPcAndSp(siginfo, context, &method, &return_pc, &sp);
        // Inside of generated code, sp[0] is the method, so sp is the frame.
        self->SetTopOfStack(reinterpret_cast<ArtMethod**>(sp));
        self->DumpJavaStack(LOG_STREAM(ERROR));
      }
      return false;  // Return false since we want to propagate the fault to the main signal handler.
    }

From the implementation:
- Whenever the SIGSEGV occurred in generated code, the Java stack is dumped to help with analysis
- Whether or not the Java stack was dumped, the function returns false, which means it does not consume the SIGSEGV; the signal is still passed on to the main signal handler
This handler is fairly simple. Its purpose: when a SIGSEGV occurs in generated code and none of the preceding handlers could handle it, print the Java stack trace to provide more direct information.
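7. Implementation of other kinds of Java exceptions
7.1 ArrayIndexOutOfBoundsException
Here is a piece of code that performs the IndexOutOfBounds check on an array access, used to analyze how this exception is implemented.
Java code: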
    public void setPropertyName(@NonNull String propertyName) {
        // mValues could be null if this is being constructed piecemeal. Just record the
        // propertyName to be used later when setValues() is called if so.
        if (mValues != null) {
            PropertyValuesHolder valuesHolder = mValues[0];
            String oldName = valuesHolder.getPropertyName();
            valuesHolder.setPropertyName(propertyName);
            mValuesMap.remove(oldName);
            mValuesMap.put(propertyName, valuesHolder);
        }
        mPropertyName = propertyName;
        // New property/values/target should cause re-initialization prior to starting
        mInitialized = false;
    }

DEX CODE:
    40: void android.animation.ObjectAnimator.setPropertyName(java.lang.String) (dex_method_idx=1461)
      DEX CODE:
        0x0000: 1203 | const/4 v3, #+0
        0x0001: 5442 5316 | iget-object v2, v4, [Landroid/animation/PropertyValuesHolder; android.animation.ObjectAnimator.mValues // field@5715
        0x0003: 3802 1700 | if-eqz v2, +23
        0x0005: 5442 5316 | iget-object v2, v4, [Landroid/animation/PropertyValuesHolder; android.animation.ObjectAnimator.mValues // field@5715
        0x0007: 4601 0203 | aget-object v1, v2, v3
        0x0009: 6e10 4406 0100 | invoke-virtual {v1}, java.lang.String android.animation.PropertyValuesHolder.getPropertyName() // method@1604
        0x000c: 0c00 | move-result-object v0
        0x000d: 6e20 7106 5100 | invoke-virtual {v1, v5}, void android.animation.PropertyValuesHolder.setPropertyName(java.lang.String) // method@1649
        0x0010: 5442 5416 | iget-object v2, v4, Ljava/util/HashMap; android.animation.ObjectAnimator.mValuesMap // field@5716
        0x0012: 6e20 82fc 0200 | invoke-virtual {v2, v0}, java.lang.Object java.util.HashMap.remove(java.lang.Object) // method@64642
        0x0015: 5442 5416 | iget-object v2, v4, Ljava/util/HashMap; android.animation.ObjectAnimator.mValuesMap // field@5716
        0x0017: 6e30 80fc 5201 | invoke-virtual {v2, v5, v1}, java.lang.Object java.util.HashMap.put(java.lang.Object, java.lang.Object) // method@64640
        0x001a: 5b45 5116 | iput-object v5, v4, Ljava/lang/String; android.animation.ObjectAnimator.mPropertyName // field@5713
        0x001c: 5c43 4f16 | iput-boolean v3, v4, Z android.animation.ObjectAnimator.mInitialized // field@5711
        0x001e: 0e00 | return-void

QUICK CODE:
      CODE: (code_offset=0x01aef425 size_offset=0x01aef420 size=176)...
        0x01aef424: f5ad5c00 sub r12, sp, #8192
        0x01aef428: f8dcc000 ldr.w r12, [r12, #0]
          StackMap [native_pc=0x1aef42d] (dex_pc=0x0, native_pc_offset=0x8, dex_register_map_offset=0xffffffff, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b0000000000)
        0x01aef42c: e92d4de0 push {r5, r6, r7, r8, r10, r11, lr}
        0x01aef430: b089 sub sp, sp, #36
        0x01aef432: 9000 str r0, [sp, #0]
        0x01aef434: f8b9c000 ldrh.w r12, [r9, #0] ; state_and_flags
        0x01aef438: f1bc0f00 cmp.w r12, #0
        0x01aef43c: d13d bne +122 (0x01aef4ba)
        0x01aef43e: 6a4d ldr r5, [r1, #36] ; r1 is this; r5 is mValues
        0x01aef440: 2d00 cmp r5, #0
        0x01aef442: d02a beq +84 (0x01aef49a)
        0x01aef444: 460f mov r7, r1
        0x01aef446: 4690 mov r8, r2
        0x01aef448: 2600 movs r6, #0 ; this 0 is the index in mValues[0]
        0x01aef44a: 68a8 ldr r0, [r5, #8] ; presumably loads the length of mValues
          StackMap [native_pc=0x1aef44d] (dex_pc=0x7, native_pc_offset=0x28, dex_register_map_offset=0x0, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b0000000000) v2: in register (5) [entry 0] v3: in register (6) [entry 1] v4: in register (7) [entry 2] v5: in register (8) [entry 3]
        0x01aef44c: 4286 cmp r6, r0 ; compare index (0) with the length of mValues
        0x01aef44e: d23c bcs +120 (0x01aef4ca) ; if index (0) >= length of mValues, branch to 0x01aef4ca to throw
        0x01aef450: 68e9 ldr r1, [r5, #12]
        0x01aef452: 468a mov r10, r1
        0x01aef454: 6808 ldr r0, [r1, #0]
          StackMap [native_pc=0x1aef457] (dex_pc=0x9, native_pc_offset=0x32, dex_register_map_offset=0x3, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b0000000000) v1: in register (1) [entry 4] v2: in register (5) [entry 0] v3: in register (6) [entry 1] v4: in register (7) [entry 2] v5: in register (8) [entry 3]
        0x01aef456: f8d000c0 ldr.w r0, [r0, #192]
        0x01aef45a: f8d0e020 ldr.w lr, [r0, #32]
        0x01aef45e: 47f0 blx lr
          StackMap [native_pc=0x1aef461] (dex_pc=0x9, native_pc_offset=0x3c, dex_register_map_offset=0x7, inline_info_offset=0xffffffff, register_mask=0x5a0, stack_mask=0b0000000000) v1: in register (10) [entry 5] v2: in register (5) [entry 0] v3: in register (6) [entry 1] v4: in register (7) [entry 2] v5: in register (8) [entry 3]
        0x01aef460: 4642 mov r2, r8
        0x01aef462: 4651 mov r1, r10
        0x01aef464: 4683 mov r11, r0
        0x01aef466: 6808 ldr r0, [r1, #0]
        0x01aef468: f8d000f0 ldr.w r0, [r0, #240]
        0x01aef46c: f8d0e020 ldr.w lr, [r0, #32]
        0x01aef470: 47f0 blx lr
          StackMap [native_pc=0x1aef473] (dex_pc=0xd, native_pc_offset=0x4e, dex_register_map_offset=0xb, inline_info_offset=0xffffffff, register_mask=0xda0, stack_mask=0b0000000000) v0: in register (11) [entry 6] v1: in register (10) [entry 5] v2: in register (5) [entry 0] v3: in register (6) [entry 1] v4: in register (7) [entry 2] v5: in register (8) [entry 3]
        0x01aef472: 6ab9 ldr r1, [r7, #40]
        0x01aef474: 465a mov r2, r11
        0x01aef476: 460d mov r5, r1
        0x01aef478: 6808 ldr r0, [r1, #0]
          StackMap [native_pc=0x1aef47b] (dex_pc=0x12, native_pc_offset=0x56, dex_register_map_offset=0xf, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b0000000000) v0: in register (11) [entry 6] v1: in register (10) [entry 5] v2: in register (1) [entry 4] v3: in register (6) [entry 1] v4: in register (7) [entry 2] v5: in register (8) [entry 3]
        0x01aef47a: f8d000d8 ldr.w r0, [r0, #216]
        0x01aef47e: f8d0e020 ldr.w lr, [r0, #32]
        0x01aef482: 47f0 blx lr
          StackMap [native_pc=0x1aef485] (dex_pc=0x12, native_pc_offset=0x60, dex_register_map_offset=0xb, inline_info_offset=0xffffffff, register_mask=0xda0, stack_mask=0b0000000000) v0: in register (11) [entry 6] v1: in register (10) [entry 5] v2: in register (5) [entry 0] v3: in register (6) [entry 1] v4: in register (7) [entry 2] v5: in register (8) [entry 3]
        0x01aef484: 6ab9 ldr r1, [r7, #40]
        0x01aef486: 4642 mov r2, r8
        0x01aef488: 4653 mov r3, r10
        0x01aef48a: 460d mov r5, r1
        0x01aef48c: 6808 ldr r0, [r1, #0]
          StackMap [native_pc=0x1aef48f] (dex_pc=0x17, native_pc_offset=0x6a, dex_register_map_offset=0xf, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b0000000000) v0: in register (11) [entry 6] v1: in register (10) [entry 5] v2: in register (1) [entry 4] v3: in register (6) [entry 1] v4: in register (7) [entry 2] v5: in register (8) [entry 3]
        0x01aef48e: f8d000d0 ldr.w r0, [r0, #208]
        0x01aef492: f8d0e020 ldr.w lr, [r0, #32]
        0x01aef496: 47f0 blx lr
          StackMap [native_pc=0x1aef499] (dex_pc=0x17, native_pc_offset=0x74, dex_register_map_offset=0xb, inline_info_offset=0xffffffff, register_mask=0xda0, stack_mask=0b0000000000) v0: in register (11) [entry 6] v1: in register (10) [entry 5] v2: in register (5) [entry 0] v3: in register (6) [entry 1] v4: in register (7) [entry 2] v5: in register (8) [entry 3]
        0x01aef498: e002 b +4 (0x01aef4a0)
        0x01aef49a: 460f mov r7, r1
        0x01aef49c: 4690 mov r8, r2
        0x01aef49e: 2600 movs r6, #0
        0x01aef4a0: f8c78074 str.w r8, [r7, #116]
        0x01aef4a4: f1b80f00 cmp.w r8, #0
        0x01aef4a8: d003 beq +6 (0x01aef4b2)
        0x01aef4aa: f8d90080 ldr.w r0, [r9, #128] ; card_table
        0x01aef4ae: 09f9 lsrs r1, r7, #7
        0x01aef4b0: 5440 strb r0, [r0, r1]
        0x01aef4b2: 767e strb r6, [r7, #25]
        0x01aef4b4: b009 add sp, sp, #36
        0x01aef4b6: e8bd8de0 pop {r5, r6, r7, r8, r10, r11, pc}
        0x01aef4ba: 9104 str r1, [sp, #16]
        0x01aef4bc: 9205 str r2, [sp, #20]
        0x01aef4be: f8d9e2a8 ldr.w lr, [r9, #680] ; pTestSuspend
        0x01aef4c2: 47f0 blx lr
          StackMap [native_pc=0x1aef4c5] (dex_pc=0x0, native_pc_offset=0xa0, dex_register_map_offset=0x13, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b0000110000) v4: in stack (16) [entry 7] v5: in stack (20) [entry 8]
        0x01aef4c4: 9904 ldr r1, [sp, #16]
        0x01aef4c6: 9a05 ldr r2, [sp, #20]
        0x01aef4c8: e7b9 b -142 (0x01aef43e)
        0x01aef4ca: 4601 mov r1, r0 ; the length of mValues becomes the second argument
        0x01aef4cc: 4630 mov r0, r6 ; index (0) becomes the first argument
        0x01aef4ce: f8d9e2b0 ldr.w lr, [r9, #688] ; pThrowArrayBounds
        0x01aef4d2: 47f0 blx lr ; call pThrowArrayBounds (artThrowArrayBoundsFromCode) to throw ArrayIndexOutOfBoundsException
          StackMap [native_pc=0x1aef4d5] (dex_pc=0x7, native_pc_offset=0xb0, dex_register_map_offset=0x0, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b0000000000) v2: in register (5) [entry 0] v3: in register (6) [entry 1] v4: in register (7) [entry 2] v5: in register (8) [entry 3]

Before jumping to pThrowArrayBounds, two arguments are prepared: r0 (index) and r1 (array length). The entrypoint is set up as:
qpoints->pThrowArrayBounds = art_quick_throw_array_bounds;
    /*
     * Called by managed code to create and deliver an ArrayIndexOutOfBoundsException. Arg1 holds
     * index, arg2 holds limit.
     */
    TWO_ARG_RUNTIME_EXCEPTION_SAVE_EVERYTHING art_quick_throw_array_bounds, artThrowArrayBoundsFromCode

Look at this macro:
    .macro TWO_ARG_RUNTIME_EXCEPTION_SAVE_EVERYTHING c_name, cxx_name
        .extern \cxx_name
    ENTRY \c_name
        SETUP_SAVE_EVERYTHING_FRAME r2  @ save all registers as basis for long jump context
        mov r2, r9                      @ pass Thread::Current
        bl  \cxx_name                   @ \cxx_name(Thread*)
    END \c_name
    .endm

On top of the two existing arguments, a third one is added in r2, the Thread* self; the macro then jumps to artThrowArrayBoundsFromCode:
    // Called by generated code to throw an array index out of bounds exception.
    extern "C" NO_RETURN void artThrowArrayBoundsFromCode(int index, int length, Thread* self)
        REQUIRES_SHARED(Locks::mutator_lock_) {
      ScopedQuickEntrypointChecks sqec(self);
      ThrowArrayIndexOutOfBoundsException(index, length);
      self->QuickDeliverException();
    }
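The cmp / bcs pair in the quick code is an unsigned comparison, which lets a single branch reject both negative indexes and indexes past the end. The sketch below shows the same idea in portable C++; IndexOutOfBounds and CheckedGet are hypothetical names, and std::out_of_range stands in for ArrayIndexOutOfBoundsException.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// One unsigned compare covers both failure cases: a negative index becomes a
// huge unsigned value, so it is always >= any valid (non-negative) length.
bool IndexOutOfBounds(int32_t index, int32_t length) {
  return static_cast<uint32_t>(index) >= static_cast<uint32_t>(length);
}

// Explicit-check array read: the check is ordinary code emitted before the
// access, and the slow path receives (index, length) like the entrypoint does.
int32_t CheckedGet(const std::vector<int32_t>& array, int32_t index) {
  const int32_t length = static_cast<int32_t>(array.size());
  if (IndexOutOfBounds(index, length)) {
    throw std::out_of_range("length=" + std::to_string(length) +
                            "; index=" + std::to_string(index));
  }
  return array[static_cast<size_t>(index)];
}
```

Unlike the null check, this test cannot be turned into a hardware fault, so it stays an explicit compare-and-branch in the generated code.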
7.2 ArithmeticException
Look at the comment:

    /*
     * Called by managed code to create and deliver an ArithmeticException.
     */
    NO_ARG_RUNTIME_EXCEPTION_SAVE_EVERYTHING art_quick_throw_div_zero, artThrowDivZeroFromCode

Let's look at an example. Java code:
    public static int floorDiv(int x, int y) {
        int r = x / y;
        // if the signs are different and modulo not zero, round down
        if ((x ^ y) < 0 && (r * y != x)) {
            r--;
        }
        return r;
    }

DEX CODE:
    24: int java.lang.Math.floorDiv(int, int) (dex_method_idx=2574)
      DEX CODE:
        0x0000: 9300 0203 | div-int v0, v2, v3
        0x0002: 9701 0203 | xor-int v1, v2, v3
        0x0004: 3b01 0800 | if-gez v1, +8
        0x0006: 9201 0003 | mul-int v1, v0, v3
        0x0008: 3221 0400 | if-eq v1, v2, +4
        0x000a: d800 00ff | add-int/lit8 v0, v0, #-1
        0x000c: 0f00 | return v0

QUICK CODE:
      CODE: (code_offset=0x005d0024 size_offset=0x005d0020 size=72)...
        0x005d0024: f81f0fe0 str x0, [sp, #-16]!
        0x005d0028: f90007fe str lr, [sp, #8]
        0x005d002c: 340001c2 cbz w2, #+0x38 (addr 0x5d0064)
        0x005d0030: 1ac20c20 sdiv w0, w1, w2
        0x005d0034: 4a020023 eor w3, w1, w2
        0x005d0038: 36f80103 tbz w3, #31, #+0x20 (addr 0x5d0058)
        0x005d003c: 1b007c42 mul w2, w2, w0
        0x005d0040: 6b02003f cmp w1, w2
        0x005d0044: 1a9f17e1 cset w1, eq
        0x005d0048: 51000402 sub w2, w0, #0x1 (1)
        0x005d004c: 7100003f cmp w1, #0x0 (0)
        0x005d0050: 1a821003 csel w3, w0, w2, ne
        0x005d0054: aa0303e0 mov x0, x3
        0x005d0058: f94007fe ldr lr, [sp, #8]
        0x005d005c: 910043ff add sp, sp, #0x10 (16)
        0x005d0060: d65f03c0 ret
        0x005d0064: f942767e ldr lr, [tr, #1256] ; pThrowDivZero
        0x005d0068: d63f03c0 blr lr
          StackMap [native_pc=0x5d006c] (dex_pc=0x0, native_pc_offset=0x48, dex_register_map_offset=0x0, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b000000) v2: in register (1) [entry 0] v3: in register (2) [entry 1]

As shown, a check for a zero divisor (the cbz at 0x005d002c) is inserted before the division. The check in interpreter mode is not covered here.
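The shape of that generated code, a zero test in front of the divide with a slow path that throws, can be written out in C++. This is a sketch under assumptions: FloorDivSketch and ThrowsOnZero are hypothetical names, std::domain_error stands in for ArithmeticException, and the INT32_MIN / -1 case is special-cased because it is undefined in C++ whereas Java defines it to wrap.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// floorDiv with the explicit divide-by-zero check placed before the division,
// mirroring the cbz that precedes sdiv in the generated code.
int32_t FloorDivSketch(int32_t x, int32_t y) {
  if (y == 0) {
    throw std::domain_error("divide by zero");  // slow path (pThrowDivZero)
  }
  if (x == INT32_MIN && y == -1) {
    return INT32_MIN;  // Java wraps here; plain x / y would be undefined in C++
  }
  int32_t r = x / y;   // truncates toward zero, like Java's int division
  // If the signs differ and the division was inexact, round down.
  if ((x ^ y) < 0 && r * y != x) {
    --r;
  }
  return r;
}

// Helper for checking the slow path without terminating the program.
bool ThrowsOnZero(int32_t x) {
  try {
    FloorDivSketch(x, 0);
  } catch (const std::domain_error&) {
    return true;
  }
  return false;
}
```

The explicit test is required on arm64 because sdiv with a zero divisor does not trap; it simply produces 0.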
7.3 StringIndexOutOfBoundsException
This is similar to ArrayIndexOutOfBoundsException above:

    /*
     * Called by managed code to create and deliver a StringIndexOutOfBoundsException
     * as if thrown from a call to String.charAt(). Arg1 holds index, arg2 holds limit.
     */
    TWO_ARG_RUNTIME_EXCEPTION_SAVE_EVERYTHING art_quick_throw_string_bounds, artThrowStringBoundsFromCode

8. Throw & Catch implementation
throw-catch-finally: ART Exception Handling Mechanism (4) - throw & catch & finally implementation