Using AddressSanitizer to Check for Out-of-Bounds Memory Accesses

1. Introduction to AddressSanitizer

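    I turned to AddressSanitizer because a program I maintain at work had an out-of-bounds memory access that corrupted the internal data of a third-party memory-management library, causing occasional core dumps. Running the program under valgrind consistently failed with the error below, and I could not find a fix online, so after comparing the alternatives I chose AddressSanitizer.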

valgrind: mmap(0xf10000, 1027244032) failed in UME with error 22 (Invalid argument).
valgrind: this can be caused by executables with very large text, data or bss segments.
           

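    AddressSanitizer is a memory-error detector developed by Google. It is reported to be considerably faster than valgrind, and it works with either clang or GCC; GCC needs version 4.8 or later. Support in GCC 4.8 is limited: some features are missing and the error reports are harder to read, so 4.9 or later is recommended. That said, I used GCC 4.8.3 this time. For details, see the project's GitHub page: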

https://github.com/google/sanitizers/wiki/AddressSanitizer

2. Usage

    Environment: CentOS 7.1, GCC 4.8.3

    Libraries to install: libasan.x86_64; newer GCC versions may also need libubsan. Although AddressSanitizer is part of GCC, these two libraries are not installed by default.

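    Usage is simple: add the two options -fsanitize=address -fno-omit-frame-pointer when compiling the program. Note that the program must use the system memory allocator rather than a third-party one, because AddressSanitizer works by intercepting standard functions such as malloc and free. Some commonly used GCC options: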

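-fsanitize=address    # enable out-of-bounds address checking

-fno-omit-frame-pointer  # keep the frame pointer so that reports contain more detailed stack information

-fsanitize=leak   # enable memory-leak checking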
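
To see what these options catch, here is a minimal sketch. The function name copy_prefix_len and the 8-byte buffer are hypothetical and are not taken from the program discussed in this post. When n equals the buffer size, the terminating write lands one byte past a stack array; a build with the two options above should stop at that write with an AddressSanitizer report, whereas an ordinary build may silently corrupt a neighbouring variable.

```c
/* Build sketch (assumed file name demo.c, plus a main() of your own):
 *   gcc -g -fsanitize=address -fno-omit-frame-pointer demo.c -o demo
 */
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { BUF_SIZE = 8 };

/* Copies at most n characters of src into a local 8-byte buffer,
 * terminates the copy, and returns its length.
 * The caller is expected to keep n < BUF_SIZE; nothing here enforces it. */
size_t copy_prefix_len(const char *src, size_t n)
{
    char buf[BUF_SIZE];

    assert(src != NULL);
    strncpy(buf, src, n);   /* stays in bounds while n <= BUF_SIZE */
    buf[n] = '\0';          /* one byte out of bounds when n == BUF_SIZE */
    return strlen(buf);
}
```

Calling copy_prefix_len("hello", 3) is fine; calling it with n set to 8 is the kind of off-by-one write the sanitizer is meant to flag.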

For details on GCC's compile options, see https://gcc.gnu.org/onlinedocs/
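
The requirement to use the system allocator can be handled at build time. The sketch below is only an illustration under assumptions: pool_alloc and pool_free are a made-up stand-in for a third-party allocator (here a tiny bump arena), and app_alloc and app_free are hypothetical wrapper names. It relies on GCC defining the macro __SANITIZE_ADDRESS__ when -fsanitize=address is in use, so that sanitized builds route every allocation through malloc and free, which AddressSanitizer intercepts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a third-party allocator: a tiny bump arena. */
static unsigned char pool_arena[4096];
static size_t pool_used = 0;

static void *pool_alloc(size_t size)
{
    /* Round the request up to a multiple of 16 bytes. */
    size_t rounded = (size + 15u) & ~(size_t)15u;
    void *p;

    if (rounded < size || rounded > sizeof(pool_arena) - pool_used)
        return NULL;        /* overflow or arena exhausted */
    p = pool_arena + pool_used;
    pool_used += rounded;
    return p;
}

static void pool_free(void *p)
{
    (void)p;                /* a bump arena does not free individual blocks */
}

/* Select the system allocator whenever the build is sanitized. */
#if defined(__SANITIZE_ADDRESS__)
#  define APP_USES_SYSTEM_MALLOC 1
#else
#  define APP_USES_SYSTEM_MALLOC 0
#endif

void *app_alloc(size_t size)
{
    assert(size > 0);
#if APP_USES_SYSTEM_MALLOC
    return malloc(size);
#else
    return pool_alloc(size);
#endif
}

void app_free(void *p)
{
#if APP_USES_SYSTEM_MALLOC
    free(p);
#else
    pool_free(p);
#endif
}
```

With a wrapper like this, the rest of the program calls app_alloc and app_free only, and no source change is needed to switch between a normal build and a sanitized one.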

3. A bug record

An sscanf call with an explicit input field width triggered the error

20180418 19:44:47.979.036 root txn_checkpoint waste time 0 second or 3 millions[svdb.c:499]
=================================================================
==1597== ERROR: AddressSanitizer: unknown-crash on address 0x7ffbc1a3db30 at pc 0x7ffff4e5619f bp 0x7ffbc1a3d8c0 sp 0x7ffbc1a3d868
20180418 19:44:47.980.555 mng NODE(1-1-1-2, deviceid=111111222) didn't report itself status timeout,now set it status is down, and alarm[state_mng_svr.c:4607]
WRITE of size 21 at 0x7ffbc1a3db30 thread T18
20180418 19:44:47.980.831 mng mng client select nic(0) bond0 ip(172.16.0.58) to join multicast(239.73.220.45)[state_mng_clnt.c:188]
20180418 19:44:47.985.171 mng notify NODE 1-1-1-2 UP, and clear alarm[state_mng_svr.c:382]
    #0 0x7ffff4e5619e (/usr/lib64/libasan.so.0.0.0+0xb19e)
    #1 0x7ffff4e568b6 (/usr/lib64/libasan.so.0.0.0+0xb8b6)
    #2 0x7ffff4e569e9 (/usr/lib64/libasan.so.0.0.0+0xb9e9)
    #3 0x80b83c (/opt/fonsview/NE/ss/bin/ss+0x80b83c)
    #4 0x7977aa (/opt/fonsview/NE/ss/bin/ss+0x7977aa)
    #5 0x79b964 (/opt/fonsview/NE/ss/bin/ss+0x79b964)
    #6 0x75bbf6 (/opt/fonsview/NE/ss/bin/ss+0x75bbf6)
    #7 0x41716c (/opt/fonsview/NE/ss/bin/ss+0x41716c)
    #8 0x7ffff4e64a97 (/usr/lib64/libasan.so.0.0.0+0x19a97)
    #9 0x7ffff4116df4 (/usr/lib64/libpthread-2.17.so+0x7df4)
    #10 0x7ffff256b1ac (/usr/lib64/libc-2.17.so+0xf61ac)
Address 0x7ffbc1a3db30 is located at offset 160 in frame <hem_http_decode_request> of T18's stack:
  This frame has 5 object(s):
    [32, 40) 'range_start'
    [96, 104) 'range_end'
    [160, 180) 'ctype'
    [224, 256) 'fmt'
    [288, 352) 'range_str'
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
Thread T18 created by T0 here:
    #0 0x7ffff4e55c3a (/usr/lib64/libasan.so.0.0.0+0xac3a)
    #1 0x418107 (/opt/fonsview/NE/ss/bin/ss+0x418107)
    #2 0x41101f (/opt/fonsview/NE/ss/bin/ss+0x41101f)
    #3 0x406107 (/opt/fonsview/NE/ss/bin/ss+0x406107)
    #4 0x7ffff2496af4 (/usr/lib64/libc-2.17.so+0x21af4)
Shadow bytes around the buggy address:
  0x0ffff833fb10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ffff833fb20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ffff833fb30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ffff833fb40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ffff833fb50: 00 00 f1 f1 f1 f1 00 f4 f4 f4 f2 f2 f2 f2 00 f4
=>0x0ffff833fb60: f4 f4 f2 f2 f2 f2[00]00 04 f4 f2 f2 f2 f2 00 00
  0x0ffff833fb70: 00 00 f2 f2 f2 f2 00 00 00 00 00 00 00 00 f3 f3
  0x0ffff833fb80: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ffff833fb90: 00 00 00 00 f1 f1 f1 f1 04 f4 f4 f4 f2 f2 f2 f2
  0x0ffff833fba0: 04 f4 f4 f4 f2 f2 f2 f2 00 00 f4 f4 f3 f3 f3 f3
  0x0ffff833fbb0: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:     fa
  Heap righ redzone:     fb
  Freed Heap region:     fd
  Stack left redzone:    f1
  Stack mid redzone:     f2
  Stack right redzone:   f3
  Stack partial redzone: f4
  Stack after return:    f5
  Stack use after scope: f8
  Global redzone:        f9
  Global init order:     f6
  Poisoned by user:      f7
  ASan internal:         fe
==1597== ABORTING
           

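The call stack in the report contains only instruction addresses and no function names, even though I compiled with -ggdb. The report does, however, name one function, hem_http_decode_request, whose stack frame contains the faulting address. To locate the exact line, disassemble the program with objdump and map the instruction address to a file name and line number. The relevant code, given here as a sample, is: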

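char ctype[32]={0};
char fmt[32]={0};
/* builds "%32[^&? ]": the width equals sizeof(ctype), leaving no room for the terminating NUL */
snprintf(fmt,sizeof(fmt),"%%%ld[^&? ]",(long)sizeof(ctype));
sscanf(p_str,fmt,ctype);// the program aborts here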
           

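The fix is simply to change (long)sizeof(ctype) to (long)sizeof(ctype)-1. The field width in the format counts only the characters that are stored; it does not include the terminating null character, which sscanf appends in addition. With a width equal to the buffer size, a maximum-length match writes one byte past the end. This agrees with the report: in the actual build 'ctype' occupies [160, 180), i.e. 20 bytes, and the write is 21 bytes, 20 characters plus the terminator. Debugging with gdb confirms that 0x7ffbc1a3db30 is the address of the variable ctype.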
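
A minimal sketch of the corrected pattern, wrapped in a helper so the width is always derived from the destination size. The name scan_token and its return convention are my own choices for illustration and are not from the original program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Reads one token from p_str into out, stopping at '&', '?' or a space.
 * At most out_size bytes are written, including the terminating NUL.
 * Returns 1 if a token was read and 0 otherwise. */
int scan_token(const char *p_str, char *out, size_t out_size)
{
    char fmt[32] = {0};

    assert(p_str != NULL && out != NULL && out_size >= 2);

    /* The width in "%N[...]" counts stored characters only; sscanf adds
     * the NUL on top of that, so N must be out_size - 1. */
    snprintf(fmt, sizeof(fmt), "%%%lu[^&? ]", (unsigned long)(out_size - 1));

    out[0] = '\0';          /* leave a valid empty string if nothing matches */
    return sscanf(p_str, fmt, out) == 1;
}
```

With a 32-byte ctype, the helper builds the format "%31[^&? ]", so the longest possible match is 31 characters plus the terminator and fits exactly. The call site becomes scan_token(p_str, ctype, sizeof(ctype)).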

4. Summary

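    AddressSanitizer quickly exposed an out-of-bounds write to a stack buffer in the code, and after the fix the program no longer core dumped. It is a convenient tool for tracking down out-of-bounds accesses; when time permits I would like to try its memory-leak checking as well.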