
Linux: __get_thread fails to read the TLS and crashes the application

Background

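The Android emulator runs on a PC, and Android apps run inside the emulator. When hardware virtualization (VT-x on Intel CPUs, AMD-V on AMD CPUs) is not enabled in the PC's BIOS, games that ship ARM libraries crash inside the emulator, either immediately or after running for a while. The crash is at __get_thread+6, inside the inlined __get_tls() code. Take the game 當樂.apk as an example: delete the x86 libraries under libs so that only the ARM libraries remain, then install and run it. The full crash log is: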

- :: E/ZKOPCountUtil( ): find Name = 當樂
- :: D/dalvikvm( ): GC_CONCURRENT freed K, % free K/K, paused ms+ms, total ms
- :: D/dalvikvm( ): WAIT_FOR_CONCURRENT_GC blocked ms
- :: D/dalvikvm( ): WAIT_FOR_CONCURRENT_GC blocked ms
- :: D/dalvikvm( ): WAIT_FOR_CONCURRENT_GC blocked ms
- :: W/View    ( ): requestLayout() improperly called by android.support.v7.widget.AppCompatTextView{52831f4c V.ED.... ......I. 20,0-148,91 #7f0d0438 app:id/expand_title} during layout: running second layout pass
- :: D/Volley  ( ): [] b.a: HTTP response for request=<[ ] http://res5.d.cn/cp/img/502487/o_1bbl6epie170sbec184qs9i1ggou.png 0x22e400ee LOW 2> [lifetime=4156], [size=67], [rc=200], [retryCount=0]
- :: F/libc    ( ): Fatal signal  (SIGSEGV) at x24244c8d (code=), thread  (Thread-)
- :: I/DEBUG   (  ): *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
- :: I/DEBUG   (  ): Build fingerprint: 'SAMSUNG/hlteatt/hlteuc:4.4.4/tt/eng.jenkins.20170306.140753:userdebug/test-keys'
- :: I/DEBUG   (  ): Revision: '0'
- :: I/DEBUG   (  ): pid: , tid: , name: Thread-  >>> com.diguayouxi <<<
- :: I/DEBUG   (  ): signal  (SIGSEGV), code  (SEGV_MAPERR), fault addr c8d
- :: D/dalvikvm(  ): GC_CONCURRENT freed K, % free K/K, paused ms+ms, total ms
- :: I/GAv4-SVC( ): Google Analytics . is starting up.
- :: I/DEBUG   (  ):     eax c89  ebx b76b7fcc  ecx   edx 
- :: I/DEBUG   (  ):     esi b76c694c  edi 
- :: I/DEBUG   (  ):     xcs   xds b  xes b  xfs b  xss b
- :: I/DEBUG   (  ):     eip b76343c6  ebp   esp cc  flags 
- :: D/dalvikvm(  ): GC_CONCURRENT freed K, % free K/K, paused ms+ms, total ms
- :: I/DEBUG   (  ):
- :: I/DEBUG   (  ): backtrace:
- :: I/DEBUG   (  ):     #00  pc c6  /system/lib/libc.so (__get_thread+)
- :: I/DEBUG   (  ):     #01  pc de2d  /system/lib/libc.so (pthread_mutex_lock+)
- :: I/DEBUG   (  ):     #02  pc a745  /system/lib/libc.so (flockfile+)
- :: I/DEBUG   (  ):     #03  pc f  /system/lib/libc.so (fread+)
- :: I/DEBUG   (  ):     #04  pc f6a  /system/lib/libc.so (android_getaddrinfo_proxy+)
- :: I/DEBUG   (  ):     #05  pc c30  /system/lib/libc.so (android_getaddrinfoforiface+)
- :: I/DEBUG   (  ):     #06  pc e97  /system/lib/libc.so (getaddrinfo+)
- :: I/DEBUG   (  ):     #07  pc   /system/lib/libjavacore.so (Posix_getaddrinfo(_JNIEnv*, _jobject*, _jstring*, _jobject*)+)
- :: I/DEBUG   (  ):     #08  pc a4ab  /system/lib/libdvm.so (dvmPlatformInvoke+)
- :: I/DEBUG   (  ):     #09  pc a27  [heap]
- :: I/DEBUG   (  ):     #10  pc da2  /system/lib/libdvm.so (dvmCallJNIMethod(unsigned int const*, JValue*, Method const*, Thread*)+434)
03-27 15:: I/DEBUG   (  ):     #11  pc b8  /system/lib/libdvm.so
- :: I/DEBUG   (  ):     #12  pc cf7  <unknown>
- :: I/DEBUG   (  ):     #13  pc b962  /system/lib/libdvm.so (dvmMterpStd(Thread*)+)
- :: I/DEBUG   (  ):     #14  pc   /system/lib/libdvm.so (dvmInterpret(Thread*, Method const*, JValue*)+217)
03-27 15:: I/DEBUG   (  ):     #15  pc bd027  /system/lib/libdvm.so (dvmCallMethodV(Thread*, Method const*, Object*, bool, JValue*, char*)+759)
03-27 15:: I/DEBUG   (  ):     #16  pc bd437  /system/lib/libdvm.so (dvmCallMethod(Thread*, Method const*, Object*, JValue*, ...)+55)
03-27 15:: I/DEBUG   (  ):     #17  pc c3  /system/lib/libdvm.so (interpThreadStart(void*)+)
- :: I/DEBUG   (  ):     #18  pc bc3c  /system/lib/libc.so (__thread_entry+)
- :: I/DEBUG   (  ):     #19  pc e1b5  /system/lib/libc.so (__pthread_clone+)
- :: I/DEBUG   (  ):     #20  pc fdf  /system/lib/libdvm.so (internalThreadStart(void*)+)
- :: I/DEBUG   (  ):
- :: I/DEBUG   (  ): stack:
- :: I/DEBUG   (  ):          c  b4db080e  /system/lib/libdvm.so (dvmMterp_OP_RETURN_VOID_BARRIER+)
- :: I/DEBUG   (  ):            b8cadbc0  [heap]
- :: I/DEBUG   (  ):            
- :: I/DEBUG   (  ):            
- :: I/DEBUG   (  ):          c  b7629f39  /system/lib/libc.so (pthread_mutex_unlock+)
- :: I/DEBUG   (  ):          a0  
- :: I/DEBUG   (  ):          a4  db6fdee  /data/dalvik-cache/[email protected]@[email protected]
- :: I/DEBUG   (  ):          a8  dce4
- :: I/DEBUG   (  ):          ac  b7629fba  /system/lib/libc.so (pthread_mutex_unlock+)
- :: I/DEBUG   (  ):          b0  
- :: I/DEBUG   (  ):          b4  b8cadbd0  [heap]
- :: I/DEBUG   (  ):          b8  dd30518  /dev/ashmem/dalvik-LinearAlloc (deleted)
- :: I/DEBUG   (  ):          bc  b7629fba  /system/lib/libc.so (pthread_mutex_unlock+)
- :: I/DEBUG   (  ):          c0  
- :: I/DEBUG   (  ):          c4  b8cae030  [heap]
- :: I/DEBUG   (  ):          c8  b7629d69  /system/lib/libc.so (pthread_mutex_lock+)
- :: I/DEBUG   (  ):     #00  cc  b7629e2e  /system/lib/libc.so (pthread_mutex_lock+)
- :: I/DEBUG   (  ):     #01  d0  a59e7eec  /dev/ashmem/dalvik-heap (deleted)
- :: I/DEBUG   (  ):          d4  b8ea6808  [heap]
- :: I/DEBUG   (  ):          d8  b76bc718
- :: I/DEBUG   (  ):          dc  b762ed4f  /system/lib/libc.so (dlmalloc+)
- :: I/DEBUG   (  ):          e0  b76bc800
- :: I/DEBUG   (  ):          e4  b8cae030  [heap]
- :: I/DEBUG   (  ):          e8  
- :: I/DEBUG   (  ):          ec  
- :: I/DEBUG   (  ):          f0  
- :: I/DEBUG   (  ):          f4  b8e2bee8  [heap]
- :: I/DEBUG   (  ):          f8  b7629d69  /system/lib/libc.so (pthread_mutex_lock+)
- :: I/DEBUG   (  ):          fc  b76b7fcc  /system/lib/libc.so
- :: I/DEBUG   (  ):            b8ea6808  [heap]
- :: I/DEBUG   (  ):            
- :: I/DEBUG   (  ):            b76c63a0
- :: I/DEBUG   (  ):          c  b7676746  /system/lib/libc.so (flockfile+)
- :: I/DEBUG   (  ):     #02    b76c694c
- :: I/DEBUG   (  ):            b8e2bee8  [heap]
- :: I/DEBUG   (  ):            
- :: I/DEBUG   (  ):          c  b76b7fcc  /system/lib/libc.so
- :: I/DEBUG   (  ):            da  [stack:]
- :: I/DEBUG   (  ):            b7676726  /system/lib/libc.so (flockfile+)
- :: I/DEBUG   (  ):            b76b7fcc  /system/lib/libc.so
- :: I/DEBUG   (  ):          c  b7662520  /system/lib/libc.so (fread+)
- :: I/DEBUG   (  ):
- :: I/DEBUG   (  ): memory map around fault addr c8d:
- :: I/DEBUG   (  ):     c142000-c145000 rw-
- :: I/DEBUG   (  ):     (no map for address)
- :: I/DEBUG   (  ):     d000-e000 ---
- :: I/PhenotypeConfigurator(  ): Scheduling Phenotype for one-off execution  seconds from now ()
- :: D/dalvikvm( ): GC_CONCURRENT freed K, % free K/K, paused ms+ms, total ms
           

Locating the problem

Following the crash log leads to the function __get_tls(), which is implemented in the source as follows:

//android-4.4.4\bionic\libc\arch-x86\bionic\__get_tls.c

/* see the implementation of __set_tls and pthread.c to understand this
 * code. Basically, the content of gs:[0] always is a pointer to the base
 * address of the tls region
 */
void*   __get_tls(void)
{
  void*  tls;
  asm ( "   movl  %%gs:0, %0" : "=r"(tls) );
  return tls;
}
           

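As the code comment says, gs:[0] holds a pointer to the base address of the TLS (Thread Local Storage) region. IDA shows the crash point more directly. Below is libc.so opened in IDA at __get_thread, which contains the inlined __get_tls() code. The crash at offset +6 is the instruction mov eax, [eax+4], an indirect load that faults.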

.text:C0
.text:C0 ; =============== S U B R O U T I N E =======================================
.text:C0
.text:C0
.text:C0                 public __get_thread
.text:C0 __get_thread    proc near               ; CODE XREF: __pthread_cleanup_push+Bp
.text:C0                                         ; __pthread_cleanup_pop+Bp ...
.text:C0                 mov     eax, large gs:0
.text:C6                 mov     eax, [eax+4]
.text:C9                 nop
.text:CA                 nop
.text:CB                 nop
.text:CC                 nop
.text:CD                 retn
.text:CD __get_thread    endp
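
To make the two instructions above easier to follow, here is a minimal portable sketch of what they compute. It is not bionic's code: the TLS base is passed in as a parameter instead of being read from gs:0, and the slot names and indices are assumptions chosen to match the [eax+4] offset seen in the disassembly (slot 1 with 4-byte pointers).

```c
#include <assert.h>
#include <stddef.h>

/* Assumed slot layout: slot 0 points back at the TLS area itself,
 * slot 1 holds the pointer to the thread descriptor. */
enum { TLS_SLOT_SELF = 0, TLS_SLOT_THREAD_ID = 1 };

/* Stand-in for "mov eax, gs:0" followed by "mov eax, [eax+4]".
 * The TLS base is an argument here; on the device it comes from gs:0. */
static void *thread_from_tls(void **tls)
{
    /* On the device there is no such check: a garbage base simply faults
     * on the next line, which is the SIGSEGV in the crash log. */
    assert(tls != NULL);
    return tls[TLS_SLOT_THREAD_ID];
}
```

If the base pointer obtained through gs is wrong, the second load dereferences an unmapped address, which fits the SEGV_MAPERR reported in the log.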
           

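So eax is loaded from gs:0, and the indirect load at eax+4 fails. In the crash log, eax is exactly the value read from gs:0, and that address is invalid, so something is wrong with what gs refers to. What we need to find out next is where the gs register is set and what it is for.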

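The code comment says gs is used to reach the TLS base pointer, and the TLS segment descriptors live in the kernel's GDT, so gs must ultimately be managed by the kernel. Taking the x86 segment layout as an example, the segments are defined in asm\Segment.h: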

// genymotion_kernel_3.10\arch\x86\include\asm\Segment.h

/*
 * The layout of the per-CPU GDT under Linux:
 *
 *   0 - null
 *   1 - reserved
 *   2 - reserved
 *   3 - reserved
 *
 *   4 - unused         <==== new cacheline
 *   5 - unused
 *
 *  ------- start of TLS (Thread-Local Storage) segments:
 *
 *   6 - TLS segment #1         [ glibc's TLS segment ]
 *   7 - TLS segment #2         [ Wine's %fs Win32 segment ]
 *   8 - TLS segment #3
 *   9 - reserved
 *  10 - reserved
 *  11 - reserved
 *
 *  ------- start of kernel segments:
 *
 *  12 - kernel code segment        <==== new cacheline
 *  13 - kernel data segment
 *  14 - default user CS
 *  15 - default user DS
 *  16 - TSS
 *  17 - LDT
 *  18 - PNPBIOS support (16->32 gate)
 *  19 - PNPBIOS support
 *  20 - PNPBIOS support
 *  21 - PNPBIOS support
 *  22 - PNPBIOS support
 *  23 - APM BIOS support
 *  24 - APM BIOS support
 *  25 - APM BIOS support
 *
 *  26 - ESPFIX small SS
 *  27 - per-cpu            [ offset to per-cpu data area ]
 *  28 - stack_canary-20        [ for stack protector ]
 *  29 - unused
 *  30 - unused
 *  31 - TSS for double fault handler
 */

 ... ...
 // part of the code omitted


 /*
 * Save a segment register away
 */
#define savesegment(seg, value)             \
    asm("mov %%" #seg ",%0":"=r" (value) : : "memory")

/*
 * x86_32 user gs accessors.
 */
#ifdef CONFIG_X86_32
#ifdef CONFIG_X86_32_LAZY_GS
#define get_user_gs(regs)   (u16)({unsigned long v; savesegment(gs, v); v;})
#define set_user_gs(regs, v)    loadsegment(gs, (unsigned long)(v))
#define task_user_gs(tsk)   ((tsk)->thread.gs)
#define lazy_save_gs(v)     savesegment(gs, (v))
#define lazy_load_gs(v)     loadsegment(gs, (v))
#else   /* X86_32_LAZY_GS */
#define get_user_gs(regs)   (u16)((regs)->gs)
#define set_user_gs(regs, v)    do { (regs)->gs = (v); } while (0)
#define task_user_gs(tsk)   (task_pt_regs(tsk)->gs)
#define lazy_save_gs(v)     do { } while (0)
#define lazy_load_gs(v)     do { } while (0)
#endif  /* X86_32_LAZY_GS */
#endif  /* X86_32 */
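
As a side note on how the GDT layout above relates to the value held in gs, here is a small sketch of the x86 selector encoding. The constant names are assumptions that mirror the layout comment (first TLS entry at index 6, three TLS entries); the helper functions are hypothetical and exist only for illustration.

```c
#include <assert.h>

/* Taken from the layout comment above: TLS entries are GDT indices 6..8. */
enum { GDT_ENTRY_TLS_MIN = 6, GDT_ENTRY_TLS_ENTRIES = 3 };

/* x86 segment selector: bits 15..3 = descriptor index,
 * bit 2 = table indicator (0 = GDT), bits 1..0 = requested privilege level. */
static unsigned short gdt_selector(unsigned index, unsigned rpl)
{
    assert(index < 8192 && rpl <= 3);
    return (unsigned short)((index << 3) | rpl);
}

/* Selector a user-space thread would load into gs for the n-th TLS entry. */
static unsigned short tls_selector(unsigned n)
{
    assert(n < GDT_ENTRY_TLS_ENTRIES);
    return gdt_selector(GDT_ENTRY_TLS_MIN + n, 3);   /* 3 = user mode */
}
```

Under these assumptions the first TLS entry corresponds to selector 0x33, so a gs value that does not decode to one of the TLS entries is a hint that the register was not set up or restored as expected.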
           

Solution

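The table above shows the full GDT layout, including the TLS segments. The key part is at the end: the accessors for the user gs value. With CONFIG_X86_32 set, there are two ways to obtain gs, depending on whether the kernel option CONFIG_X86_32_LAZY_GS is defined.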

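Checking the kernel configuration shows that CONFIG_X86_32_LAZY_GS is enabled. In that case get_user_gs() ignores regs and reads the live gs register: savesegment() writes it into the local variable v, so v is not uninitialized, but the result is whatever the CPU's gs holds at that moment. With CONFIG_X86_32_LAZY_GS disabled, the value comes from the saved register frame, regs->gs; where regs->gs is set is left for later. This may still not make clear what role gs plays in the kernel or how it flows through it; that needs a deeper look later. As for the fix: since CONFIG_X86_32_LAZY_GS changes how gs is obtained, I reconfigured the kernel without CONFIG_X86_32_LAZY_GS, rebuilt and retested, and 當樂.apk ran normally. So this option does affect the gs value the application ends up with, although the exact mechanism is not yet confirmed.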
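
To illustrate the difference between the two get_user_gs() variants, here is a user-space simulation. It is a sketch only: live_gs stands in for the real gs register, struct fake_regs stands in for pt_regs, and all names below are hypothetical. In the lazy variant, savesegment() assigns v before it is returned, so the result tracks the live register rather than an uninitialized value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins: the real code uses the CPU's gs and pt_regs. */
struct fake_regs { unsigned long gs; };
static unsigned long live_gs;

/* Simulated savesegment(gs, value): copies the "register" into value. */
#define savesegment_sim(value) ((value) = live_gs)

/* LAZY_GS variant: ignores regs and reads the live register. */
static unsigned short get_user_gs_lazy(const struct fake_regs *regs)
{
    unsigned long v;
    (void)regs;
    savesegment_sim(v);        /* v is written here, before it is used */
    return (unsigned short)v;
}

/* Non-lazy variant: returns the value saved in the register frame. */
static unsigned short get_user_gs_saved(const struct fake_regs *regs)
{
    assert(regs != NULL);
    return (unsigned short)regs->gs;
}
```

The two variants only agree while the live register and the saved frame hold the same value; once they diverge, the lazy variant follows the register and the other follows the frame.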

The patch is below; merge it into the x86 defconfig file:

@@ -37,7 +37,6 @@ CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
 CONFIG_HAVE_INTEL_TXT=y
 CONFIG_X86_32_SMP=y
 CONFIG_X86_HT=y
-CONFIG_X86_32_LAZY_GS=y
 CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-ecx -fcall-saved-edx"
 CONFIG_ARCH_CPU_PROBE_RELEASE=y
 CONFIG_ARCH_SUPPORTS_UPROBES=y
@@ -452,7 +451,7 @@ CONFIG_ARCH_RANDOM=y
 CONFIG_X86_SMAP=y
 # CONFIG_EFI is not set
 # CONFIG_SECCOMP is not set
-# CONFIG_CC_STACKPROTECTOR is not set
+CONFIG_CC_STACKPROTECTOR=y
 # CONFIG_HZ_100 is not set
 CONFIG_HZ_250=y
 # CONFIG_HZ_300 is not set
           
  • CONFIG_X86_32_LAZY_GS and CONFIG_CC_STACKPROTECTOR above depend on each other: removing CONFIG_X86_32_LAZY_GS requires setting CONFIG_CC_STACKPROTECTOR=y.

  • If enabling this option makes the kernel build fail with error: undefined reference to '__stack_chk_guard', see my other article: "Linux: _stack_chk_guard undefined when building an x86 kernel".
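
After rebuilding, it can help to confirm that the generated configuration really ended up in the intended state. The following is a minimal sketch of such a check over the text of a config file; the helper names are hypothetical, and it only recognizes lines of the exact form NAME=y.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if `config` contains a line that is exactly "<name>=y". */
static int config_is_enabled(const char *config, const char *name)
{
    const char *line = config;
    size_t name_len;

    assert(config != NULL && name != NULL);
    name_len = strlen(name);
    while (*line != '\0') {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);

        if (len == name_len + 2 &&
            strncmp(line, name, name_len) == 0 &&
            strncmp(line + name_len, "=y", 2) == 0)
            return 1;
        if (end == NULL)
            break;
        line = end + 1;
    }
    return 0;
}

/* State the patch aims for: stack protector on, lazy gs off. */
static int matches_patched_state(const char *config)
{
    return config_is_enabled(config, "CONFIG_CC_STACKPROTECTOR") &&
           !config_is_enabled(config, "CONFIG_X86_32_LAZY_GS");
}
```

The caller is expected to read the config file into a string first; commented-out entries such as "# CONFIG_CC_STACKPROTECTOR is not set" are treated as disabled because they never match the NAME=y form.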

Summary

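That solves the problem, but many questions remain unanswered, which is the worst part: as a developer, not understanding the whole flow leaves me uneasy. Still, one step at a time. The next step is to study the GDT and memory management as a whole.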

Thanks

2017 …… roll up your trouser legs and run, roll up your sleeves and get to work!

yanxiangyfg's column: "Stay true to practice, record every bit"
