Finding Memory Leaks in C
The place matters
The main idea of this article is to describe an approach for finding memory leaks in C code on macOS. It covers one of the possible options and presents a skeleton that may be extended if necessary.
In contrast to the previous article, this approach is not entirely POSIX compatible, since it uses the backtrace and backtrace_symbols functions, which are not part of the IEEE 1003.1 standard. If your operating system does not provide these functions, you will have to find alternatives on your own.
The approach was tested only on macOS Catalina 10.15.6 with the clang compiler, version 11.0.3.
The article is more practical than theoretical. It touches on some deep and exciting system mechanisms, but it only mentions them and does not dive into the system runtime. Though that could be quite an exciting journey, it is beyond the scope of this article.
The previous article considered a basic approach to detecting memory leaks. All that approach does is tell the developer whether any memory leaks happened in the running application and how many. Generally speaking, that is not enough in most cases. We usually want not just to know that there are leaks in the application, but also to fix them. To fix them, we need to know where they happened. More precisely, we want to know the exact place where the leaked piece of memory was allocated.
Below we will consider one possible way to achieve that.
Just run
The process is much the same as in the previous article:
1. Download the file with the leak-checking code here.
2. Place it anywhere in your project. Just make sure that it is compiled and linked with the rest of the project, so add it to your Makefile, project file, etc.
3. In your main.c file, add the declaration of the check_leaks function at the beginning of the file.
4. Add a check_leaks function call just before main() returns.
    void check_leaks(void);

    int main(int argc, const char *argv[]) {
        //...
        check_leaks();
        return 0;
    }
5. Run your application.
6. Check the output.
If you have any memory leaks, you should see detailed information about them: their addresses, the leaked memory sizes, and the call stacks.
Dig a little bit deeper
If you just need to run the code to detect memory leaks, you may stop reading here. This section contains some technical details about the main concepts behind our solution. Let's consider the main idea of the approach we used. It is based on two key concepts:
- Intercept malloc/free function calls.
- Obtain the call stack info to get more details about where a leak happened.
First of all, let's define the terms. What is a memory leak? In a narrow sense, it is a piece of memory that was allocated and was never freed. So our task is to detect all memory allocations and all memory deallocations in our application. After that, we can compare the allocation and deallocation counts, and if they differ, we have memory leaks.
The other task is to detect, as precisely as possible, the places where the leaked pieces were allocated.
Let's look into these tasks one by one.
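The counting idea behind the first task can be sketched as follows. This is a minimal illustration, not the code from the downloadable file: the names count_alloc, count_free, leaked_blocks, and report_leak_count are hypothetical, and in a real checker the two counters would be updated from inside the intercepted malloc and free.

```c
#include <stdio.h>

/* Hypothetical counters; a real checker would update them
   from inside the intercepted malloc and free. */
static int alloc_counter = 0;
static int dealloc_counter = 0;

void count_alloc(void) { alloc_counter++; }
void count_free(void) { dealloc_counter++; }

/* Number of allocations that have no matching deallocation. */
int leaked_blocks(void) {
    return alloc_counter - dealloc_counter;
}

/* Prints a one-line summary to stderr. */
void report_leak_count(void) {
    int leaks = leaked_blocks();
    if (leaks > 0)
        fprintf(stderr, "Memory leaks detected: %d\n", leaks);
    else
        fprintf(stderr, "No memory leaks detected\n");
}
```

A plain difference of counters only says how many blocks leaked, which is why the second task, finding where they were allocated, needs extra bookkeeping per allocation.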
System functions interception
To intercept the malloc/free functions, we used the dlsym function. dlsym is a part of the POSIX standard, so you may easily find its description in man dlsym or in the POSIX standard document (see References section). Here we will just demonstrate how it may be used for our goals.
    #include <dlfcn.h>
    #include <stdio.h>

    static void *(*real_malloc)(unsigned long) = 0;   // 1
    int malloc_counter = 0;                            // 2

    // Looks up the next `malloc` in the library search order (the system one).
    static void malloc_init(void) {                    // 3
        real_malloc = (void *(*)(unsigned long))dlsym(RTLD_NEXT, "malloc");
        if (real_malloc == 0)
            fprintf(stderr, "Error in `dlsym`: %s\n", dlerror());
    }

    // Replaces the system `malloc` for the whole application.
    void *malloc(unsigned long size) {                 // 4
        if (real_malloc == 0)
            malloc_init();                             // 5

        malloc_counter++;                              // 6

        return real_malloc(size);                      // 7
    }
The code snippet above demonstrates the main idea of system functions interception. We will follow it not line by line but in the order of the calls.
In (4) we placed a function that has exactly the same signature as the system malloc function. When we call malloc anywhere in our application, it is this implementation that will be called. Next, in (5), we check whether we have already initialized our leak detection logic, and if we haven't, we call malloc_init. malloc_init does the second trick: it calls the dlsym function, which returns a pointer to the real malloc function, and stores it in the real_malloc static variable declared at (1).
Then in (6) we increment the malloc_counter variable declared at (2), which lets us count the allocations and, at the end of the program, the memory leaks.
As the final step (7), we call the real_malloc function, which does the real memory allocation, and return its pointer to the caller.
In the same way we intercept the free function, and we may intercept others.
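A free interceptor built the same way might look like the sketch below. It is an illustration under stated assumptions rather than the code from the downloadable file: free_counter, real_free, and the resolving guard are hypothetical names, and the _GNU_SOURCE define is only there because glibc hides RTLD_NEXT without it (macOS does not need it).

```c
#define _GNU_SOURCE /* assumption: needed on glibc for RTLD_NEXT */
#include <dlfcn.h>
#include <stddef.h>
#include <stdlib.h>

static void (*real_free)(void *) = NULL; /* pointer to the system free */
static int resolving = 0;                /* set while dlsym is running */
int free_counter = 0;                    /* hypothetical counter */

void free(void *ptr) {
    if (real_free == NULL) {
        /* If dlsym itself ends up calling free, drop that call
           instead of recursing into the lookup again. */
        if (resolving)
            return;
        resolving = 1;
        real_free = (void (*)(void *))dlsym(RTLD_NEXT, "free");
        resolving = 0;
        if (real_free == NULL)
            return; /* lookup failed: nothing safe to call */
    }

    if (ptr != NULL)
        free_counter++; /* free(NULL) releases nothing, so skip it */

    real_free(ptr);
}
```

Skipping NULL keeps the counter comparable with the number of successful allocations. On older glibc versions the program may also need to be linked with -ldl.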
Note that any function resolved through the dynamic linker, not only malloc and free, may be intercepted in this way. It gives developers a lot of room to experiment with and tune their applications.
Obtaining call stack
Knowing where a malloc call was made lets the developer find the leak quickly and fix it. The call stack is one of the things that can help with that. It is not 100% precise, since it gives no information about the file name and the line where the call was made, but it still provides a lot of information and makes it easier to find the leaking call.
    // Requires #include <execinfo.h>
    void *callstack[128];                                // 1
    int frames = backtrace(callstack, 128);              // 2
    char **strs = backtrace_symbols(callstack, frames);  // 3
This is the example from man backtrace. It gives the main idea of how we can obtain call stack information.
In (1) we declare an array of pointers to void, where we will store the return addresses of the functions on the call stack. Then in (2) we call the backtrace function, which fills the array with those addresses and returns the number of frames it stored. In (3) we call backtrace_symbols, which 'converts' the addresses into descriptions of the functions. As a result, we obtain an array of C strings with information about the functions on the stack, including their names.
Note how the memory behind strs is owned. According to man backtrace, the array returned by backtrace_symbols is allocated with malloc and should be released with free, while the individual strings in it must not be freed separately. Because backtrace_symbols allocates through malloc, calling it from inside an intercepted malloc needs a guard against re-entering the interceptor; otherwise the calls can recurse until the stack overflows.
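Put together, capturing and printing the current call stack might look like the sketch below. The function name print_call_stack and the MAX_FRAMES limit are hypothetical; the array returned by backtrace_symbols is released with a single free call, as the backtrace man page describes.

```c
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_FRAMES 128 /* hypothetical limit, same as in the man page example */

/* Prints the current call stack to `out`, one frame per line.
   Returns the number of frames printed, or 0 if symbolization failed. */
int print_call_stack(FILE *out) {
    void *callstack[MAX_FRAMES];
    int frames = backtrace(callstack, MAX_FRAMES);
    char **strs = backtrace_symbols(callstack, frames);

    if (strs == NULL)
        return 0;

    for (int i = 0; i < frames; i++)
        fprintf(out, "%s\n", strs[i]);

    free(strs); /* one free for the whole array; the strings are not freed separately */
    return frames;
}
```

A leak checker would typically store only the raw addresses from backtrace at allocation time and call backtrace_symbols later, when the report is printed, which keeps the work done inside the interceptor small.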
After obtaining the call stack info, we may parse it and use it in any way. See the full implementation for details. You may well want to do something different, so feel free to write your own implementation that fits your needs.
What's next
Certainly, not all the allocation functions have been considered in this article. There are still calloc, realloc, and others. You may intercept them too if you use them in your project. You may also write logs into a file instead of stderr, group leaks that share the same call stack, and so on.
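Grouping leaks that share a call stack could be sketched as below. Everything here is hypothetical (the leak_group type, add_leak, and the fixed MAX_FRAMES and MAX_GROUPS limits); it assumes each leak comes with the raw frame addresses captured by backtrace at allocation time, and it uses static storage so the grouping itself never allocates.

```c
#include <stddef.h>
#include <string.h>

#define MAX_FRAMES 16 /* frames compared per call stack (assumption) */
#define MAX_GROUPS 64 /* distinct call stacks tracked (assumption) */

typedef struct {
    void *frames[MAX_FRAMES]; /* call stack shared by the group */
    int depth;                /* number of valid entries in frames */
    int count;                /* leaks with this call stack */
    size_t total_bytes;       /* sum of their sizes */
} leak_group;

static leak_group groups[MAX_GROUPS];
static int group_count = 0;

/* Adds one leak to the group with an identical call stack, creating
   the group if needed. Returns the group index, or -1 if the table
   is full or depth is negative. */
int add_leak(void *const *frames, int depth, size_t size) {
    if (depth < 0)
        return -1;
    if (depth > MAX_FRAMES)
        depth = MAX_FRAMES; /* compare only the innermost frames */

    size_t bytes = (size_t)depth * sizeof(void *);
    for (int i = 0; i < group_count; i++) {
        if (groups[i].depth == depth &&
            memcmp(groups[i].frames, frames, bytes) == 0) {
            groups[i].count++;
            groups[i].total_bytes += size;
            return i;
        }
    }

    if (group_count == MAX_GROUPS)
        return -1;

    leak_group *g = &groups[group_count];
    memcpy(g->frames, frames, bytes);
    g->depth = depth;
    g->count = 1;
    g->total_bytes = size;
    return group_count++;
}
```

A report can then print each call stack once, together with the number of leaks and the total leaked size for that stack.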
In fact, this approach gives you full control over memory allocation management. If you are brave enough, you may experiment with the code and implement almost any logic for dealing with memory leaks.
Happy coding.
Translated from: https://medium.com/swlh/finding-memory-leaks-in-c-2-0-c0f150fd2b42