Original article: http://blog.csdn.net/jinzhuojun/article/details/46659155
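Low-level languages such as C and C++ offer power and performance, but their flexible memory access also causes all kinds of thorny problems. If the program crashes exactly where memory is misused, you are lucky. If memory at the crash site is clearly inconsistent, or the allocator's bookkeeping has already been corrupted, and the crash only shows up at random, things get much harder. Reading the code and adding logs is one approach, but it is inefficient, especially when each run is expensive or the bug rarely reproduces. Static checking is another approach, with many tools available (lint, cppcheck, Klocwork, splint, etc.), but these tools produce many false positives and are poorly suited to tracking down one specific problem, and the better ones usually cost money. That leaves dynamic checking tools. Below are the main runtime memory checking tools on Linux. Most of them are open source, free, and support both x86 and ARM.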
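First, the most common memory problems are:
• memory overrun: writing past the bounds of a buffer
• double free: freeing the same block twice
• use after free: using memory after it has been freed
• wild free: passing an invalid pointer to free
• access uninitialized memory: reading memory that was never initialized
• read invalid memory: reading an invalid address, which is essentially also an out-of-bounds access
• memory leak: allocated memory is never released
• use after return: the caller dereferences a pointer to memory on the callee's stack
• stack overflow: the stack grows past its limit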
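The main techniques for catching these problems are:
1. To detect illegal use of allocated memory, hook the allocation and memory-manipulation functions. The hook can be installed with the C preprocessor, by defining the functions directly in a linked library (malloc/free and friends are weak symbols in Glibc), or with LD_PRELOAD. Hooking functions such as strcpy() and memmove() also makes it possible to detect buffer overflows they cause.
2. To detect illegal memory accesses, the tool keeps bookkeeping for the program's memory, intercepts every memory access, and checks whether it is legal. The bookkeeping schemes are broadly similar: the main idea is to use shadow memory to validate a piece of memory. Instrumentation methods vary. Some work at run time, for example by running the program in a virtual machine or through a binary translator; others work at compile time, inserting a check at each memory access instruction. Another option is to place inaccessible guard pages before and after each allocation, so that the hardware (MMU) raises SIGSEGV, which is faster.
3. To detect stack problems, a canary is usually placed on the stack: a magic number or random value is written to the stack on function entry and checked on return to see whether it was overwritten. In addition, mprotect() can place a guard page at the top of the stack, so that a stack overflow triggers SIGSEGV instead of corrupting data.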
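The bookkeeping idea in item 2 can be sketched as follows. This is a toy model under loud assumptions, not how any of the tools below is implemented: `ToyShadow`, `ByteState`, and the method names are hypothetical, and the class tracks one state byte per byte of a small simulated arena instead of real process memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy shadow memory: one state byte per byte of a simulated arena.
// All names here are made up for illustration.
enum class ByteState : std::uint8_t { Unallocated, Allocated, Freed };

class ToyShadow {
public:
    explicit ToyShadow(std::size_t arena_size)
        : shadow_(arena_size, ByteState::Unallocated) {}

    // Record that [offset, offset + size) was handed out by the allocator.
    void on_alloc(std::size_t offset, std::size_t size) {
        mark(offset, size, ByteState::Allocated);
    }

    // Record that the block was released; later accesses are use-after-free.
    void on_free(std::size_t offset, std::size_t size) {
        mark(offset, size, ByteState::Freed);
    }

    // An access is legal only if every byte it touches is currently allocated.
    bool access_ok(std::size_t offset, std::size_t size) const {
        if (offset > shadow_.size() || size > shadow_.size() - offset) return false;
        for (std::size_t i = offset; i < offset + size; ++i) {
            if (shadow_[i] != ByteState::Allocated) return false;
        }
        return true;
    }

private:
    void mark(std::size_t offset, std::size_t size, ByteState s) {
        assert(offset <= shadow_.size() && size <= shadow_.size() - offset);
        for (std::size_t i = offset; i < offset + size; ++i) shadow_[i] = s;
    }

    std::vector<ByteState> shadow_;
};
```

A real tool would call the equivalent of `on_alloc`/`on_free` from its malloc/free hooks and the equivalent of `access_ok` from the instrumentation inserted at each load and store.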
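Some of these approaches win on features, some on performance, and some on ease of use; each has its strengths. The table below shows experimental results for several common tools on Linux x86_64; results may differ on other platforms. Some capabilities may also have been missed because of old tool versions, differences in the build environment, or incorrect usage, so please bear with any omissions.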
Tool\Problem | memory overrun | double free | use after free | wild free | access uninited | read invalid memory | memory leak | use after return | stack overflow |
---|---|---|---|---|---|---|---|---|---|
Memory checking tools in Glibc | Yes | Yes | Yes | Yes(if use memcpy, strcpy, etc) | |||||
TCMalloc(Gperftools) | Yes | ||||||||
Valgrind | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
Address Sanitizer(ASan) | Yes | Yes | Yes | Yes | (Memory Sanitizer) | Yes | Yes | Yes | Yes |
Memwatch | Yes | Yes | Yes | ||||||
Dr.Memory | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | |
Electric Fence | Yes | Yes | Yes | Yes | |||||
Dmalloc | Yes | Yes | Yes | Yes | Yes |
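Below is a brief introduction to these tools and their basic usage. For details, see each tool's manual.
Glibc ships with several heap consistency checking mechanisms.
The M_CHECK_ACTION parameter of mallopt() sets the checking behavior; setting the MALLOC_CHECK_ environment variable has the same effect. Since Glibc 2.3.4 the default is 3, which prints an error message, a stack trace, and the memory map, then aborts the program. Setting LIBC_FATAL_STDERR_=1 sends this information to stderr. For example, run a program containing a double free as follows: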
$ MALLOC_CHECK_=3 ./bug
It prints the following and then exits:
```
*** Error in `./bug': free(): invalid pointer: 0x00000000010d6010 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x7338f)[0x7f367073238f]
/lib/x86_64-linux-gnu/libc.so.6(+0x81fb6)[0x7f3670740fb6]
./bug[0x400845]
./bug[0x400c36]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f36706e0ec5]
./bug[0x400729]
======= Memory map: ========
00400000-00402000 r-xp 00000000 08:01 2893041 /home/jzj/code/bug
00601000-00602000 r--p 00001000 08:01 2893041 /home/jzj/code/bug
00602000-00603000 rw-p 00002000 08:01 2893041 /home/jzj/code/bug
010d6000-010f7000 rw-p 00000000 00:00 0 [heap]
7f36704a8000-7f36704be000 r-xp 00000000 08:01 4203676 /lib/x86_64-linux-gnu/libgcc_s.so.1
7f36704be000-7f36706bd000 ---p 00016000 08:01 4203676 /lib/x86_64-linux-gnu/libgcc_s.so.1
7f36706bd000-7f36706be000 r--p 00015000 08:01 4203676 /lib/x86_64-linux-gnu/libgcc_s.so.1
7f36706be000-7f36706bf000 rw-p 00016000 08:01 4203676 /lib/x86_64-linux-gnu/libgcc_s.so.1
…
Aborted (core dumped)
```
mcheck is Glibc's heap consistency checking mechanism. To use it, include the header:
```
#include <mcheck.h>
```
Then, at the point where checking should start, add:
```
if (mcheck(NULL) != 0) {
    fprintf(stderr, "mcheck() failed\n");
    exit(EXIT_FAILURE);
}
…
```
Compile with -lmcheck and run:
$ g++ -Wall -g problem.cpp -o bug -lmcheck
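The _FORTIFY_SOURCE macro provides lightweight buffer overflow detection. When it is set, calls are routed to Glibc functions with the _chk suffix, which perform some run-time checks. It mainly covers string buffer overflows and memory operations, for example memmove, memcpy, memset, strcpy, strcat, and vsprintf. Note that on some platforms the code must be compiled with -O1 or higher. Buffer overflows caused by those memory functions are then detected: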
$ g++ -Wall -g -O2 -D_FORTIFY_SOURCE=2 problem.cpp -o bug
```
*** buffer overflow detected ***: ./bug terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x7338f)[0x7f9976e1638f]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x5c)[0x7f9976eadc9c]
/lib/x86_64-linux-gnu/libc.so.6(+0x109b60)[0x7f9976eacb60]
```
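The kind of run-time check a `_chk` variant performs can be pictured with the sketch below. It is an illustration under stated assumptions rather than Glibc's implementation: `checked_copy` is a hypothetical helper, the destination size is passed in explicitly (the real headers typically obtain it from the compiler through `__builtin_object_size`), and it returns `false` where the real functions abort the program.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper illustrating a fortify-style bounds check.
// dst_size plays the role of the destination size known to the compiler.
bool checked_copy(void* dst, std::size_t dst_size,
                  const void* src, std::size_t n) {
    assert(dst != nullptr && src != nullptr);
    if (n > dst_size) {
        return false;  // the real _chk functions abort the program here
    }
    std::memcpy(dst, src, n);
    return true;
}
```

Because the check needs the destination size, it only helps when the compiler can work that size out, which is one reason optimization has to be enabled.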
mtrace checks whether malloc/free calls are correctly paired. Call mtrace() and muntrace() to start and stop tracing memory allocation (if tracing should continue until the program exits, muntrace() can be omitted). Note that it simply records each malloc without a matching free, so it may raise some false alarms.
```
#include <mcheck.h>

mtrace();
// …
muntrace();
```
Then compile:
$ g++ -Wall -g problem.cpp -o bug
Before running, set the output log file:
$ export MALLOC_TRACE=output.log
After running the program, use the mtrace command to make the output file readable:
$ mtrace ./bug $MALLOC_TRACE
This shows which allocations have not been freed yet:
```
Memory not freed:
           Address     Size     Caller
0x00000000008d4520    0x400  at /home/jzj/code/problem.cpp:73
```
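Gperftools (Google Performance Tools) is a tool suite that includes the thread-caching malloc (TCMalloc), a CPU profiler, and other components. TCMalloc is faster than ptmalloc in Glibc and reduces contention between threads, because it gives each thread its own thread-local cache. Only its memory-related components are covered here. TCMalloc supports heap checking and heap profiling.
To skip building it from source, install it on Ubuntu as follows:
$ sudo apt-get install libgoogle-perftools-dev google-perftools
Then add -ltcmalloc when compiling; make sure it is linked last, for example:
$ g++ -Wall -g problem.cpp -o bug -ltcmalloc
If the library is not linked at build time, LD_PRELOAD also works:
$ export LD_PRELOAD="/usr/lib/libtcmalloc.so"
Then run: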
$ HEAPCHECK=normal ./bug
The memory leak is then reported:
```
Have memory regions w/o callers: might report false leaks
Leak check _main_ detected leaks of 1024 bytes in 1 objects
The 1 largest leaks:
*** WARNING: Cannot convert addresses to symbols in output below.
*** Reason: Cannot find 'pprof' (is PPROF_PATH set correctly?)
*** If you cannot fix this, try running pprof directly.
Leak of 1024 bytes in 1 objects allocated from:
    @ 400ba3
    @ 400de0
    @ 7fe1be24bec5
    @ 400899
    @ 0
```
To check only part of a program, create a HeapLeakChecker object to snapshot the heap, run the code to be checked, and then call assert(checker.NoLeaks()). For details see: http://goog-perftools.sourceforge.net/doc/heap_checker.html
More detailed information is available through google-pprof, for example:
```
$ google-pprof ./bug "/tmp/bug.1353._main_-end.heap" --inuse_objects --lines --heapcheck --edgefraction=1e-10 --nodefraction=1e-10 --gv
```
For more TCMalloc configuration options see: http://gperftools.googlecode.com/svn/trunk/doc/tcmalloc.html
Valgrind is a collection consisting of the Valgrind core and a set of tool plugins. Besides checking memory errors, it can analyze function calls, cache usage, multithreading races, heap and stack usage, and more. Only the memcheck tool is covered here; it is so commonly used that it is the default tool. It works by running the program on a virtual machine, so execution is tens of times slower. Fortunately many real programs are I/O bound, so the slowdown is often tolerable. An advantage is that the target program does not need to be recompiled. Memcheck records every heap block in a hash table and keeps information about these memory regions in shadow memory, which lets it check the legality of every memory access.
Options can be added at run time as needed, for example:
```
$ valgrind --tool=memcheck --error-limit=no --track-origins=yes --trace-children=yes --track-fds=yes ./bug
```
A memory overrun, for example, is reported like this:
```
==1735== Invalid write of size 1
==1735==    at 0x4008A7: overrun() (problem.cpp:26)
==1735==    by 0x400C2B: main (problem.cpp:127)
==1735==  Address 0x51fc460 is not stack'd, malloc'd or (recently) free'd
==1735==
```
Result for use after free:
```
==1739== Invalid write of size 1
==1739==    at 0x4C2E51C: __GI_strncpy (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1739==    by 0x40098B: use_after_free() (problem.cpp:46)
==1739==    by 0x400C3F: main (problem.cpp:133)
==1739==  Address 0x51fc040 is 0 bytes inside a block of size 1,024 free'd
==1739==    at 0x4C2BDEC: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1739==    by 0x400975: use_after_free() (problem.cpp:45)
==1739==    by 0x400C3F: main (problem.cpp:133)
==1739==
==1739== Invalid write of size 1
==1739==    at 0x4C2E5AC: __GI_strncpy (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1739==    by 0x40098B: use_after_free() (problem.cpp:46)
==1739==    by 0x400C3F: main (problem.cpp:133)
==1739==  Address 0x51fc045 is 5 bytes inside a block of size 1,024 free'd
==1739==    at 0x4C2BDEC: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1739==    by 0x400975: use_after_free() (problem.cpp:45)
==1739==    by 0x400C3F: main (problem.cpp:133)
```
Result for access to uninitialized memory:
```
==1742== Conditional jump or move depends on uninitialised value(s)
==1742==    at 0x4EB17F1: _IO_file_overflow@@GLIBC_2.2.5 (fileops.c:867)
==1742==    by 0x4E819CF: vfprintf (vfprintf.c:1661)
==1742==    by 0x4E8B498: printf (printf.c:33)
==1742==    by 0x400AA6: access_uninit() (problem.cpp:72)
==1742==    by 0x400C5A: main (problem.cpp:142)
==1742==  Uninitialised value was created by a heap allocation
==1742==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1742==    by 0x400A87: access_uninit() (problem.cpp:71)
==1742==    by 0x400C5A: main (problem.cpp:142)
```
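Problems such as memory overrun and use after free are hard because the crash usually does not happen at the original point of corruption. Valgrind makes it much easier to catch that point. One case remains difficult: if object A is freed and the same memory is then allocated again as object B, while the program still accesses it through a dangling pointer to A, the access overwrites live data or reads wrong data. Valgrind cannot detect this, because it does no semantic analysis. However, Valgrind's allocation policy is configurable: adjusting the size and priority of the queue of freed memory (for example with Memcheck's --freelist-vol option) keeps freed memory from being reused right away, which raises the chance of catching this kind of problem.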
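The idea of delaying reuse can be sketched as a small "quarantine". This is only an illustration of the principle and not Valgrind's code: the `Quarantine` class, its methods, and the 0xFD poison byte are all assumptions made for the example. Freed blocks are poisoned and parked in a FIFO queue, and are handed back to the allocator only once the queue exceeds a byte budget.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <deque>

// Toy quarantine: delays the real free() so a freed block is not reused
// immediately. Class and member names are hypothetical.
class Quarantine {
public:
    explicit Quarantine(std::size_t max_bytes) : max_bytes_(max_bytes) {}
    ~Quarantine() { while (!blocks_.empty()) release_oldest(); }
    Quarantine(const Quarantine&) = delete;
    Quarantine& operator=(const Quarantine&) = delete;

    // Called instead of free(): poison the block and park it in the queue.
    void deferred_free(void* p, std::size_t size) {
        assert(p != nullptr);
        std::memset(p, 0xFD, size);  // make stale reads stand out
        blocks_.push_back({p, size});
        held_bytes_ += size;
        while (held_bytes_ > max_bytes_) release_oldest();
    }

    std::size_t held_bytes() const { return held_bytes_; }
    std::size_t held_blocks() const { return blocks_.size(); }

private:
    struct Block { void* ptr; std::size_t size; };

    void release_oldest() {
        Block b = blocks_.front();
        blocks_.pop_front();
        held_bytes_ -= b.size;
        std::free(b.ptr);  // only now can the allocator reuse the memory
    }

    std::deque<Block> blocks_;
    std::size_t max_bytes_;
    std::size_t held_bytes_ = 0;
};
```

A larger budget keeps dangling pointers pointing at poisoned, unreused memory for longer, at the cost of a bigger memory footprint.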
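For stack memory, Memcheck only detects access to uninitialized data; it does not detect out-of-bounds access in stack or global arrays. That is the job of SGCheck, which complements Memcheck. To use SGCheck, pass --tool=exp-sgcheck to valgrind.
Memcheck also offers a range of options for tuning its checking policy; see the Valgrind User Manual or http://valgrind.org/docs/manual/mc-manual.html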
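AddressSanitizer (ASan) started as an LLVM feature and was added to GCC in 4.8; ARM support followed in GCC 4.9. No third-party library is needed: it is switched on with a compile-time flag. It replaces Mudflap (which is no longer supported as of GCC 4.9; specifying it does nothing). ASan inserts extra instructions around memory accesses at compile time and uses shadow memory to record and check memory validity. The official slowdown figure is about 2x.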
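The shadow-memory check can be illustrated with a toy model. Real ASan maps an address to its shadow byte as `(addr >> 3) + offset`, with a platform-specific offset; the sketch below drops the offset and works on a small simulated arena. `ToyAsanShadow` and its method names are hypothetical, and only the shadow encoding (0 means all 8 bytes are addressable, k in 1..7 means only the first k bytes are, a negative value means none are) follows the published algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Shadow value for an 8-byte granule that must not be touched.
constexpr std::int8_t kPoisoned = -1;

// Toy model of an ASan-style shadow: one shadow byte per 8 bytes of a
// simulated arena. Names are made up for illustration.
class ToyAsanShadow {
public:
    explicit ToyAsanShadow(std::size_t arena_bytes)
        : shadow_((arena_bytes + 7) / 8, kPoisoned) {}

    // Mark [addr, addr + size) addressable; addr must be 8-byte aligned.
    void unpoison(std::size_t addr, std::size_t size) {
        assert(addr % 8 == 0 && addr + size <= shadow_.size() * 8);
        std::size_t g = addr >> 3;
        for (; size >= 8; size -= 8) shadow_[g++] = 0;
        if (size > 0) shadow_[g] = static_cast<std::int8_t>(size);
    }

    // Mark every granule covering [addr, addr + size) unaddressable again.
    void poison(std::size_t addr, std::size_t size) {
        assert(addr % 8 == 0 && addr + size <= shadow_.size() * 8);
        for (std::size_t g = addr >> 3; g < (addr + size + 7) >> 3; ++g) {
            shadow_[g] = kPoisoned;
        }
    }

    // The check conceptually inserted before an access of 1, 2, 4 or 8
    // bytes that does not cross an 8-byte boundary.
    bool access_is_bad(std::size_t addr, std::size_t access_size) const {
        assert((addr & 7) + access_size <= 8);
        assert((addr >> 3) < shadow_.size());
        const std::int8_t k = shadow_[addr >> 3];
        return k != 0 &&
               static_cast<std::int8_t>((addr & 7) + access_size) > k;
    }

private:
    std::vector<std::int8_t> shadow_;
};
```

In this model an allocation would call `unpoison` on the user region and leave the surrounding redzones poisoned, and a free would call `poison` on the whole block.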
To use it, add the flags below to CFLAGS. Note that when linking against shared objects, only the executable needs the flag.
$ g++ -Wall -g problem.cpp -o bug -fsanitize=address -fno-omit-frame-pointer
Run the program directly; when an error is detected, the report looks like this:
```
==22543==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61900000fea0 at pc 0x400f22 bp 0x7ffe3c21be90 sp 0x7ffe3c21be88
WRITE of size 1 at 0x61900000fea0 thread T0
    #0 0x400f21 in overrun() /home/jzj/code/problem.cpp:26
    #1 0x401731 in main /home/jzj/code/problem.cpp:127
    #2 0x7fb2a46b8ec4 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21ec4)
    #3 0x400d08 (/home/jzj/code/bug+0x400d08)

==26753==ERROR: AddressSanitizer: attempting double-free on 0x61900000fa80 in thread T0:
    #0 0x7f591b4ba5c7 in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x545c7)
    #1 0x400e46 in double_free() /home/jzj/code/problem.cpp:17
    #2 0x40173b in main /home/jzj/code/problem.cpp:130
    #3 0x7f591b0c2ec4 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21ec4)
    #4 0x400d08 (/home/jzj/code/bug+0x400d08)
```
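Detecting certain problems requires dedicated options. For example, to catch accesses to stack memory that has already been released (use after return), add: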
ASAN_OPTIONS=detect_stack_use_after_return=1
To detect memory leaks, add:
ASAN_OPTIONS=detect_leaks=1
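For the full list of options see: https://code.google.com/p/address-sanitizer/wiki/Flags
AddressSanitizer is one member of the Sanitizer family. Some functionality lives in the other tools: memory leak detection is in LeakSanitizer, uninitialized memory read detection is in MemorySanitizer, and data race detection is in ThreadSanitizer. They all began as LLVM features and were later ported to GCC, so with GCC it is best to use 4.9, or at least 4.8 or later.
AddressSanitizer cannot detect reads of uninitialized memory, but MemorySanitizer (MSan) can. It consists of a compiler instrumentation module and a run-time library. At the time of writing it supports only Linux x86_64. To use it, add -fsanitize=memory -fPIE -pie to the compile options; for more detailed reports, also add -fno-omit-frame-pointer and -fsanitize-memory-track-origins. It implements part of Valgrind's functionality but is faster because it uses compile-time instrumentation. Unfortunately it is available only in LLVM and not yet in GCC, so it is skipped here.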
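Memwatch is a lightweight memory problem detector, mainly for allocation/free problems and out-of-bounds accesses. Using the C preprocessor, Memwatch replaces all ANSI C memory allocation functions so that it can record allocation behavior. Note that it is not guaranteed to be thread safe. In terms of performance, large allocations are unaffected, but small allocations suffer because Memwatch cannot use the memory pool of the original allocation functions; the worst case is a 3-5x slowdown. It makes it easy to simulate low-memory conditions. To help with accesses to uninitialized and freed memory, Memwatch poisons the memory (0xFE is written on allocation, 0xFD on free), which makes debugging easier when something goes wrong.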
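The preprocessor-based hooking and poisoning described above can be sketched roughly as follows. This is not Memwatch's code: `tracked_malloc`, `tracked_free`, `live_allocs`, and the `TRACKED_*` macros are hypothetical names, and the macros use their own names rather than redefining `malloc`/`free` to keep the example self-contained. Only the poison values 0xFE and 0xFD are taken from the text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <map>

// Hypothetical tracking layer in the spirit of a preprocessor-based hook.
struct AllocInfo { std::size_t size; const char* file; int line; };

inline std::map<void*, AllocInfo>& live_allocs() {
    static std::map<void*, AllocInfo> table;
    return table;
}

inline void* tracked_malloc(std::size_t n, const char* file, int line) {
    assert(file != nullptr);
    void* p = std::malloc(n);
    if (p != nullptr) {
        std::memset(p, 0xFE, n);            // poison fresh memory
        live_allocs()[p] = {n, file, line};
    }
    return p;
}

// Returns false (and frees nothing) for a double free or wild free.
inline bool tracked_free(void* p, const char* file, int line) {
    auto it = live_allocs().find(p);
    if (it == live_allocs().end()) {
        std::fprintf(stderr, "bad free of %p at %s:%d\n", p, file, line);
        return false;
    }
    std::memset(p, 0xFD, it->second.size);  // poison before releasing
    live_allocs().erase(it);
    std::free(p);
    return true;
}

// The preprocessor supplies the call site, which is how a log can name
// the file and line of each allocation.
#define TRACKED_MALLOC(n) tracked_malloc((n), __FILE__, __LINE__)
#define TRACKED_FREE(p)   tracked_free((p), __FILE__, __LINE__)
```

Anything still present in `live_allocs()` at exit is a candidate leak, and its recorded file and line point at the allocation site.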
Using it requires changing the source code. The library must be downloaded separately:
http://www.linkdata.se/sourcecode/memwatch/
Then include the header in the code to be checked:
```
#include "memwatch.h"
```
Then compile with the following macros defined:
$ gcc -DMEMWATCH -DMW_STDIO test.c memwatch.c -o test
By default the results are written to memwatch.log. For example, a program with a double free produces:
```
Modes: __STDC__ 64-bit mwDWORD==(unsigned int)
mwROUNDALLOC==8 sizeof(mwData)==56 mwDataSize==56

double-free: <3> test.c(17), 0x25745e0 was freed from test.c(16)

Stopped at Sun Jun 14 10:57:15 2015

Memory usage statistics (global):
 N)umber of allocations made: 1
 L)argest memory usage      : 1024
 T)otal of all alloc() calls: 1024
 U)nfreed bytes totals      : 0
```
Output for a memory leak:
```
Modes: __STDC__ 64-bit mwDWORD==(unsigned int)
mwROUNDALLOC==8 sizeof(mwData)==56 mwDataSize==56

Stopped at Sun Jun 14 10:56:22 2015

unfreed: <1> test.c(63), 1024 bytes at 0x195f5e0 {FE FE FE FE FE FE FE FE FE FE FE FE FE FE FE FE ................}

Memory usage statistics (global):
 N)umber of allocations made: 1
 L)argest memory usage      : 1024
 T)otal of all alloc() calls: 1024
 U)nfreed bytes totals      : 1024
```
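Electric Fence is mainly used to track down buffer overflows, both reads and writes. It uses the hardware to catch the offending instruction. For each allocation it allocates one extra page (or group of pages) and marks the pages outside the buffer as inaccessible. If the program touches that region, the page-table permissions on the extra page cause a segmentation fault. Memory released by free() is also made inaccessible, so accessing it faults as well. Because permissions are set per page, placing the extra page after the allocated region catches overflows; to catch underflows, set the environment variable EF_PROTECT_BELOW to put the guard page before the region. Since Electric Fence needs at least two pages to satisfy each allocation, memory usage becomes very large; the upside is that catching illegal accesses in hardware is fast. It trades space for time.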
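A minimal sketch of the guard-page trick, assuming a POSIX system with `mmap` and `mprotect`. `GuardedBlock`, `guarded_alloc`, and `guarded_free` are hypothetical names and this is not Electric Fence's implementation; in particular, this sketch unmaps a block on free, whereas the text says Electric Fence makes freed memory inaccessible.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>
#include <unistd.h>

// Layout: [ data pages ... ][ guard page (PROT_NONE) ]
// The returned buffer ends exactly where the guard page begins.
struct GuardedBlock {
    void* base;          // start of the whole mapping
    std::size_t mapped;  // total mapped bytes, guard page included
    char* user;          // pointer handed to the caller
};

inline GuardedBlock guarded_alloc(std::size_t size) {
    assert(size > 0);
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const std::size_t data_pages = (size + page - 1) / page;
    const std::size_t mapped = (data_pages + 1) * page;

    void* base = mmap(nullptr, mapped, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) return {nullptr, 0, nullptr};

    char* guard = static_cast<char*>(base) + data_pages * page;
    if (mprotect(guard, page, PROT_NONE) != 0) {
        munmap(base, mapped);
        return {nullptr, 0, nullptr};
    }
    return {base, mapped, guard - size};  // user[size] is the first guard byte
}

inline void guarded_free(const GuardedBlock& b) {
    if (b.base != nullptr) munmap(b.base, b.mapped);
}
```

Writing to `user[size]` would hit the PROT_NONE page and raise SIGSEGV at the faulting instruction. The returned pointer is generally not word-aligned, since the buffer is pushed right up against the guard page.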
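It currently supports Windows and Linux, and the C and C++ languages. Its limitations include being unable to detect use of uninitialized memory or memory leaks. It is also not thread safe. On Ubuntu, a prebuilt package saves compiling it: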
$ sudo apt-get install electric-fence
It is a library that must be linked into the program:
$ g++ -Wall -g problem.cpp -o bug -lefence
Alternatively use LD_PRELOAD, but remember not to link other malloc debugger libraries at the same time.
$ export LD_PRELOAD=libefence.so.0.0
The environment variables EF_PROTECT_BELOW, EF_PROTECT_FREE, EF_ALLOW_MALLOC_0, and EF_FILL control its behavior. See the manual: http://linux.die.net/man/3/efence
For example, a memory overrun and a double free produce the following output:
```
  Electric Fence 2.2 Copyright (C) 1987-1999 Bruce Perens <bruce@perens.com>
Segmentation fault (core dumped)

  Electric Fence 2.2 Copyright (C) 1987-1999 Bruce Perens <bruce@perens.com>

ElectricFence Aborting: free(7fc1c17c8c00): address not from malloc().
Illegal instruction (core dumped)
```
It cannot print detailed information to a log, but if core dumps were enabled before the run:
$ ulimit -c unlimited
the core dump can be opened in gdb for analysis:
$ gdb ./bug -c core
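Note that on most platforms, when the block size is not a multiple of the word size, the allocator adds padding bytes for word alignment. An overrun that stays inside the padded area cannot be detected. Setting EF_ALIGNMENT to 1 in the program prevents the byte padding and makes off-by-one problems easier to detect.
DUMA (http://duma.sourceforge.net/) is a fork of Electric Fence that adds other features, such as leak detection and Windows support.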
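Dmalloc is a classic memory checking tool, although it has not been updated for years. It detects illegal accesses by padding allocated regions with magic numbers, so it can detect that a problem occurred but not which instruction caused it. Dmalloc detects only out-of-bounds writes, not out-of-bounds reads. It also checks only heap memory allocated with the malloc family of functions (not sbrk() or mmap()) and cannot check stack or static memory. In essence it also works by hooking memory management functions such as malloc(), realloc(), calloc(), and free(), as well as memory manipulation functions such as strcat() and strcpy(). It supports x86 and ARM, the C and C++ languages, and multithreading.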
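The magic-number padding technique can be sketched as below. This is an illustration only, not Dmalloc's implementation: `fenced_malloc`, `fences_intact`, `fenced_free`, the fence width, and the magic byte 0xC5 are all assumptions chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical fence-post allocator.
// Layout: [size header][front fence][user data][back fence]
// Header and fences are each kFence bytes, which keeps user data aligned.
constexpr std::size_t kFence = alignof(std::max_align_t);
constexpr unsigned char kMagic = 0xC5;
static_assert(kFence >= sizeof(std::size_t), "header must fit in one slot");

inline void* fenced_malloc(std::size_t size) {
    unsigned char* raw =
        static_cast<unsigned char*>(std::malloc(size + 3 * kFence));
    if (raw == nullptr) return nullptr;
    std::memcpy(raw, &size, sizeof size);                  // remember the size
    std::memset(raw + kFence, kMagic, kFence);             // front fence
    std::memset(raw + 2 * kFence + size, kMagic, kFence);  // back fence
    return raw + 2 * kFence;
}

// True if both fences still hold the magic pattern.
inline bool fences_intact(const void* user) {
    assert(user != nullptr);
    const unsigned char* p = static_cast<const unsigned char*>(user);
    std::size_t size;
    std::memcpy(&size, p - 2 * kFence, sizeof size);
    const unsigned char* front = p - kFence;
    const unsigned char* back = p + size;
    for (std::size_t i = 0; i < kFence; ++i) {
        if (front[i] != kMagic || back[i] != kMagic) return false;
    }
    return true;
}

// Checks the fences, then releases the block. Returns the check result.
inline bool fenced_free(void* user) {
    const bool ok = fences_intact(user);
    std::free(static_cast<unsigned char*>(user) - 2 * kFence);
    return ok;
}
```

The damage is noticed only when the fences are next inspected, which matches the limitation described above: the tool can tell that an overwrite happened but not which instruction did it, and out-of-bounds reads leave no trace at all.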
To use it, download the source package from the official site (http://dmalloc.com/releases/), then build and install:
$ tar zxvf dmalloc-5.5.2.tgz
$ cd dmalloc-5.5.2
$ ./configure
$ make && make install
The source code needs a small change: just add the following header:
```
#ifdef DMALLOC
#include "dmalloc.h"
#endif
```
Then add -DDMALLOC -DDMALLOC_FUNC_CHECK to CFLAGS when compiling, for example:
$ g++ -Wall -g -DDMALLOC -DDMALLOC_FUNC_CHECK problem.cpp -o bug -ldmalloc
Dmalloc options are configured through the DMALLOC_OPTIONS environment variable, for example:
$ export DMALLOC_OPTIONS=log=logfile,check-fence,check-blank,check-shutdown,check-heap,check-funcs,log-stats,log-non-free,print-messages,log-nonfree-space
For details on these options, see:
http://dmalloc.com/docs/latest/online/dmalloc_26.html
http://dmalloc.com/docs/latest/online/dmalloc_27.html
The options can also be set with the dmalloc command. Running dmalloc -v shows the current settings.
When an error occurs, the output looks like this:
<code class="hljs http has-numbering">1434270937: 2: error details: checking user pointer
1434270937: 2: pointer '0x7fc235336808' from 'unknown' prev access 'problem.cpp:35'
1434270937: 2: ERROR: _dmalloc_chunk_heap_check: free space has been overwritten (err 67)
1434270937: 2: error details: checking pointer admin
1434270937: 2: pointer '0x7fc235336808' from 'problem.cpp:37' prev access 'problem.cpp:35'
1434270937: 2: ERROR: free: free space has been overwritten (err 67)

1434271030: 3: error details: finding address in heap
1434271030: 3: pointer '0x7f0a7e29d808' from 'problem.cpp:27' prev access 'unknown'
1434271030: 3: ERROR: free: tried to free previously freed pointer (err 61)</code>
Dmalloc also provides functions such as dmalloc_mark(), dmalloc_log_changed() and dmalloc_log_unfreed() for printing memory information and analyzing changes in memory:
http://dmalloc.com/docs/5.3.0/online/dmalloc_13.html
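The idea behind taking a mark and later asking what changed can be sketched without Dmalloc itself. The `AllocTracker` class below is a hypothetical stand-in written for this illustration; it is not Dmalloc's API and its method names are assumptions. Each allocation gets an increasing sequence number, a mark is the current number, and anything allocated after the mark that is still live is a leak candidate.

```cpp
#include <cstddef>
#include <cstdlib>
#include <map>
#include <vector>

class AllocTracker {
public:
    // Allocate and record the block with the next sequence number.
    void* allocate(std::size_t size) {
        void* p = std::malloc(size);
        if (p != nullptr) live_[p] = ++counter_;
        return p;
    }

    // Free a tracked block; pointers the tracker does not know are ignored.
    void release(void* p) {
        if (live_.erase(p) != 0) std::free(p);
    }

    // Snapshot of "now": blocks allocated later get a larger number.
    unsigned long mark() const { return counter_; }

    // Blocks allocated after `mark` that have not been released yet.
    std::vector<void*> unfreed_since(unsigned long mark) const {
        std::vector<void*> out;
        for (const auto& entry : live_) {
            if (entry.second > mark) out.push_back(entry.first);
        }
        return out;
    }

private:
    std::map<void*, unsigned long> live_;  // live pointer -> sequence number
    unsigned long counter_ = 0;
};
```

Taking a mark before a suspicious operation and listing the unfreed blocks afterwards narrows a leak down to that operation, which is the kind of question the Dmalloc functions above are meant to answer.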
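Dr. Memory is one of the heavyweight memory-checking tools. It detects uninitialized memory accesses, out-of-bounds accesses, accesses to freed memory, double frees and memory leaks, plus handle leaks and GDI API usage errors on Windows. It supports Windows, Linux and Mac on IA-32 and AMD64, and like other tools based on binary instrumentation it does not require changing the target program's binary. One drawback is that it currently only handles 32-bit programs on x86. An ARM port appears to be in progress. Its advantage is a small impact on the program's normal execution, with better performance than Valgrind. The official site is http://www.drmemory.org/. Dr. Memory is built on the DynamoRIO binary translator. The original code does not run directly. It is first translated into a code cache, and the code in that cache calls shared instrumentation to perform the memory checks.
Dr. Memory provides downloadable packages for each platform: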
https://github.com/DynamoRIO/drmemory/wiki/Downloads
It can be used right after downloading. First compile the test program to be checked:
$ g++ -m32 -g -Wall problem.cpp -o bug -fno-inline -fno-omit-frame-pointer
(Compiling a 32-bit program on a 64-bit host requires installing libc6-dev-i386 and g++-multilib.)
Then add Dr. Memory's bin directory to PATH, for example:
$ export PATH=/home/jzj/tools/DrMemory-Linux-1.8.0-8/bin:$PATH
After that, launch the target program under Dr. Memory:
$ drmemory -- ./bug
For more usage, see drmemory -help or http://drmemory.org/docs/page_options.html.
For problems such as a double free or a heap overflow, it gives results like the following:
<code class="hljs avrasm has-numbering">~~Dr.M~~
~~Dr.M~~ Error #1: INVALID HEAP ARGUMENT to free 0x08ceb0e8
~~Dr.M~~ # 0 replace_free [/work/drmemory_package/common/alloc_replace.c:2503]
~~Dr.M~~ # 1 double_free [/home/jzj/code/problem.cpp:23]
~~Dr.M~~ # 2 main [/home/jzj/code/problem.cpp:157]
~~Dr.M~~ Note: @0:00:00.127 in thread 26159
~~Dr.M~~ Note: memory was previously freed here:
~~Dr.M~~ Note: # 0 replace_free [/work/drmemory_package/common/alloc_replace.c:2503]
~~Dr.M~~ Note: # 1 double_free [/home/jzj/code/problem.cpp:22]
~~Dr.M~~ Note: # 2 main [/home/jzj/code/problem.cpp:157]

~~Dr.M~~
~~Dr.M~~ Error #1: UNADDRESSABLE ACCESS beyond heap bounds: writing 0x0988f508-0x0988f509 1 byte(s)
~~Dr.M~~ # 0 overrun [/home/jzj/code/problem.cpp:32]
~~Dr.M~~ # 1 main [/home/jzj/code/problem.cpp:154]
~~Dr.M~~ Note: @0:00:00.099 in thread 26191
~~Dr.M~~ Note: prev lower malloc: 0x0988f0e8-0x0988f4e8
~~Dr.M~~ Note: instruction: mov $0x6a -> (%eax)</code>
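Most of the tools above check heap memory. For stack memory, GCC itself provides some checking mechanisms. With -fstack-protector, GCC emits extra instructions to check for buffer and stack overflows. The principle is to add a guard variable to each protected function, initialize it on function entry and verify it on function exit. The related flags are -fstack-protector, -fstack-protector-strong and -fstack-protector-all. Example: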
$ g++ -Wall -O2 -U_FORTIFY_SOURCE -fstack-protector-all problem.cpp -o bug
At run time the stack overflow is detected:
<code class="hljs diff has-numbering">*** stack smashing detected ***: ./bug terminated
Aborted (core dumped)</code>
For thread stacks, see pthread_attr_setguardsize().
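The guard-variable principle can be imitated by hand to see why it works. The sketch below is only a simulation under stated assumptions: the "stack frame" is an ordinary byte array holding a 16-byte buffer followed by a canary, and the names (`Frame`, `enter`, `canary_intact`, `copy_and_check`) and the fixed canary value are invented for the example. It does not reproduce the code GCC generates.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

namespace canary_demo {

constexpr std::size_t kBufSize = 16;
constexpr std::uint64_t kCanary = 0xDEADC0DEFEEDFACEULL;  // fixed value, for the sketch only

// Simulated frame: [16-byte local buffer][8-byte canary]
struct Frame {
    unsigned char bytes[kBufSize + sizeof(std::uint64_t)];
};

// "Function entry": place the canary right after the buffer.
void enter(Frame& f) {
    std::memcpy(f.bytes + kBufSize, &kCanary, sizeof kCanary);
}

// "Function exit": verify that the canary was not overwritten.
bool canary_intact(const Frame& f) {
    std::uint64_t value;
    std::memcpy(&value, f.bytes + kBufSize, sizeof value);
    return value == kCanary;
}

// Deliberately unchecked copy into the 16-byte buffer.
// The caller must keep len <= sizeof(Frame::bytes) so the write stays
// inside the simulated frame.
bool copy_and_check(const char* src, std::size_t len) {
    Frame f;
    enter(f);
    std::memcpy(f.bytes, src, len);  // overflows the buffer when len > kBufSize
    return canary_intact(f);
}

}  // namespace canary_demo
```

A copy of 16 bytes leaves the canary untouched, while 17 bytes or more corrupts it. That corruption is the condition a real stack protector reports as stack smashing before the function returns.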
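Rational Purify is a commercial product from IBM and costs money, so I have not used it and can only offer moral support. It is very similar to Valgrind: it is also based on binary instrumentation and is available for various platforms. Another tool, Insure++, is based on compile-time and binary instrumentation and can detect memory problems such as use-after-free, out-of-bounds accesses, wild frees and memory leaks. It is also paid software, so again only moral support.
In general, when you hit a strange memory problem, first try the checking mechanisms built into Glibc and GCC, because they are easy to enable. If they find nothing, and the toolchain is reasonably new and you can rebuild the program, try ASan next, because it is powerful and efficient. After that, if the program is I/O bound or the slowdown is acceptable, use Valgrind or Dr. Memory. Both are powerful and need no recompilation, but they are slower, and the latter does not support 64-bit programs or the ARM platform. Finally, depending on the situation and your specific needs, consider tools such as Memwatch, Dmalloc and Electric Fence.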