
Python Memory Leak Troubleshooting

Author: 繁依Fanyi0 | 2024-08-07 02:05:32


1. Tools
2. Case study
3. References

Notes from tracking down a memory leak in a Python program.

1. Tools

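Tool	Description
gc	Built-in module in the Python standard library
tracemalloc `recommended`	Part of the standard library since `Python 3.4`
mem_top `recommended`	A wrapper around `gc` that reports the sorted Top N entries; fast to run
guppy	Collects statistics on the objects in the heap and is quite practical, but the analysis is slow
objgraph	Draws object reference graphs; best suited to programs with few object types and a simple structure
pympler	Summarizes memory usage by type and reports object sizes
pyrasite	A very powerful third-party library that can inject into a running Python process and modify its data and code on the fly

Each tool's official documentation has detailed explanations and basic usage examples; this article only covers the most common usage.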

1.1 gc

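gc is a built-in module available in both Python 2 and Python 3, and it is very convenient to use.

Commonly used functions:

gc.collect(generation=2) Runs a full collection when called without arguments. When hunting a leak, call it manually before taking statistics so that garbage that has not been collected yet does not skew the numbers;
gc.get_objects() Returns a list of all objects tracked by the collector;
gc.get_referrers(*objs) Returns the list of objects that directly refer to any of objs. Only containers that support garbage collection are located; extension types that refer to other objects but do not support garbage collection will not be found;
gc.get_referents(*objs) Returns the list of objects directly referred to by any of the arguments. When hunting a leak this is usually the list to inspect, because it shows what a large container is holding;
sys.getsizeof(obj) (from the sys module, not gc) Returns the size of the object in bytes. Only the memory directly attributed to the object is counted, not the memory of the objects it refers to.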
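Because sys.getsizeof only reports the shallow size, the real footprint of a container has to be summed over everything it references. The following is a minimal sketch of that idea, assuming it is acceptable to walk gc.get_referents and to skip types, modules and functions so the walk does not wander into the whole interpreter. deep_getsizeof is a hypothetical helper, not part of gc or sys.

```python
import gc
import sys
import types

# Objects we do not descend into: they would pull in the whole interpreter.
_SKIP = (type, types.ModuleType, types.FunctionType, types.BuiltinFunctionType)


def deep_getsizeof(root):
    """Sum sys.getsizeof over root and everything reachable from it."""
    seen = set()   # ids of objects that were already counted
    stack = [root]
    total = 0
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        total += sys.getsizeof(obj)
        stack.extend(r for r in gc.get_referents(obj) if not isinstance(r, _SKIP))
    return total


nested = [list(range(1000)) for _ in range(10)]
print("shallow:", sys.getsizeof(nested), "bytes")
print("deep:   ", deep_getsizeof(nested), "bytes")
```

The shallow figure only covers the outer list's pointer array, while the deep figure also includes the ten inner lists and their integers.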

Example usage:

import gc
import sys

def top_memory(limit=3):
    # Collect first so that pending garbage does not distort the result
    gc.collect()
    objs_by_size = []
    for obj in gc.get_objects():
        size = sys.getsizeof(obj)
        objs_by_size.append((obj, size))
    # Sort by allocated size, largest first
    sorted_objs = sorted(objs_by_size, key=lambda x: x[1], reverse=True)
    for obj, size in sorted_objs[:limit]:
        print(f"size: {size/1024/1024:.2f}MB, type: {type(obj)}, obj: {id(obj)} ")
        # Print the objects that this object refers to
        for item in gc.get_referents(obj):
            print(f"{item}\n")

1.2 tracemalloc

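Part of the standard library since Python 3.4.

The tracemalloc module is a debugging tool for tracing the memory blocks allocated by Python. It provides the following information:

The traceback of where an object was allocated
Statistics on allocated memory blocks per file and per line: total size, number of blocks, and average block size
The difference between two snapshots, which helps detect memory leaks

Commonly used functions:

tracemalloc.start() Starts tracing Python memory allocations; it can be called at runtime
tracemalloc.take_snapshot() Takes a snapshot of the traces of memory blocks allocated by Python and returns a new Snapshot instance
Snapshot.compare_to() Computes the differences from an older snapshot

Code example: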

import tracemalloc
tracemalloc.start()
# ... start your application ...

snapshot1 = tracemalloc.take_snapshot()
# ... call the function leaking memory ...
snapshot2 = tracemalloc.take_snapshot()

top_stats = snapshot2.compare_to(snapshot1, 'lineno')

print("[ Top 10 differences ]")
for stat in top_stats[:10]:
    print(stat)

The official Python documentation for tracemalloc has very detailed explanations and usage examples.
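To see the snapshot comparison in action without a real application, here is a self-contained sketch with a deliberate leak. The names leaked and leaky_call are hypothetical stand-ins for the code under suspicion; grouping by 'traceback' instead of 'lineno' is a choice made here so that the call chain is shown as well.

```python
import tracemalloc

leaked = []  # hypothetical module-level cache that is never cleared


def leaky_call():
    leaked.append(bytearray(100_000))  # about 100 KB retained per call


tracemalloc.start(10)  # keep up to 10 frames per allocation
snapshot1 = tracemalloc.take_snapshot()
for _ in range(50):
    leaky_call()
snapshot2 = tracemalloc.take_snapshot()
tracemalloc.stop()

# The largest difference comes first in the result.
top = snapshot2.compare_to(snapshot1, "traceback")[0]
print(f"+{top.size_diff / 1024:.0f} KiB in {top.count_diff} new blocks")
for line in top.traceback.format():
    print(line)
```

The first entry should point at the bytearray line inside leaky_call, which is the kind of evidence that narrows a leak down to a single allocation site.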

1.3 mem_top

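mem_top is essentially a wrapper around functions of the gc module. Calling mem_top.mem_top() returns a report (a string that can be printed or logged) of the top N entries ranked three ways: by number of references, by memory size, and by object count per type.

Install: pip install mem-top

Function signature: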

mem_top(
    limit=10,                           # limit of top lines per section
    width=100,                          # width of each line in chars
    sep='\n',                           # char to separate lines with
    refs_format='{num}\t{type} {obj}',  # format of line in "refs" section
    bytes_format='{num}\t {obj}',       # format of line in "bytes" section
    types_format='{num}\t {obj}',       # format of line in "types" section
    verbose_types=None,                 # list of types to sort values by `repr` length
    verbose_file_name='/tmp/mem_top',   # name of file to store verbose values in
)

Example output of mem_top.mem_top(limit=3, width=200):

refs:
1638	<type 'dict'> {'IPython.core.error': <module 'IPython.core.error' from '/Users/skyler/Documents/py-env/venv2.7/lib/python2.7/site-packages/IPython/core/error.pyc'>, 'ipython_genutils.py3compat': <module 'ipython_ge
765		<type 'list'> [u'd = {\n    "@babel/core": "^7.24.4",\n    "@babel/plugin-proposal-class-properties": "^7.18.6",\n    "@babel/preset-env": "^7.9.5",\n    "@jest/globals": "^29.7.0",\n    "babel-eslint": "^10.1.0",\
765		<type 'list'> [u'd = {\n    "@babel/core": "^7.24.4",\n    "@babel/plugin-proposal-class-properties": "^7.18.6",\n    "@babel/preset-env": "^7.9.5",\n    "@jest/globals": "^29.7.0",\n    "babel-eslint": "^10.1.0",\

bytes:
49432	 {'IPython.core.error': <module 'IPython.core.error' from '/Users/skyler/Documents/py-env/venv2.7/lib/python2.7/site-packages/IPython/core/error.pyc'>, 'ipython_genutils.py3compat': <module 'ipython_ge
33000	 set(['disp', 'union1d', 'all', 'issubsctype', 'atleast_2d', 'setmember1d', 'restoredot', 'ptp', 'blackman', 'pkgload', 'tostring', 'tri', 'arrayrange', 'array_equal', 'item', 'indices', 'loads', 'roun
12584	 {u'': 0, u'pmem_top.mem_top(limit=3, width=200) ': 37, u'primem_top.mem_top(limit=3, width=200) ': 39, u'printmem_top.mem_top() ': 23, u'print mem_top.mem_top(limit) ': 29, u'print mem_top.mem_top(lim

types:
8581	 <type 'function'>
7527	 <type 'tuple'>
6102	 <type 'dict'>
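To make the refs and types sections less of a black box, here is a rough standard-library approximation of those two rankings. It is a sketch under the assumption that "refs" counts an object's direct referents; it is not mem_top's actual implementation, and top_refs and top_types are hypothetical names.

```python
import gc
from collections import Counter


def top_refs(limit=3):
    """gc-tracked objects with the most direct referents, largest first."""
    ranked = sorted(
        ((len(gc.get_referents(obj)), obj) for obj in gc.get_objects()),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return ranked[:limit]


def top_types(limit=3):
    """Most common types among gc-tracked objects."""
    return Counter(type(obj) for obj in gc.get_objects()).most_common(limit)


big = [object() for _ in range(300_000)]  # stand-in for a runaway container
for n_refs, obj in top_refs():
    print(n_refs, type(obj).__name__)
for tp, n in top_types():
    print(n, tp.__name__)
```

A container that keeps growing floats to the top of the refs ranking, which is exactly the signal to look for in a leaking process.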

1.4 guppy

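guppy is a very powerful tool, but it also has an obvious drawback: it is slow to run, which makes it unsuitable for debugging in production.

Install: pip install guppy (this is the Python 2 package; on Python 3 the fork is published as guppy3)

Note: the library inspects the dir-related attributes of the objects it examines. If a class implements its own __dir__ incorrectly, the library can raise an exception during initialization.

Typical usage: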

import datetime
import guppy

# hpy() creates the session context that gives access to heap information
analyzer = guppy.hpy()

def do_something():

    # run your app ...

    print("==={} heap total===".format(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")))
    # Return a detailed breakdown of the heap
    heap = analyzer.heap()
    print(heap)
    # byvia groups objects by how they are referenced; heap[0] is the row that uses the most memory
    print("==={} references===".format(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")))
    references = heap[0].byvia
    print(references)
    print("==={} references detail===".format(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")))
    print(references[0].kind)  # kind (classification)
    print(references[0].shpaths)  # shortest reference paths
    print(references[0].rp)  # reference pattern

Output:

===2024-07-21 16:27:12 heap total===
Partition of a set of 785315 objects. Total size = 104732120 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0 396372  50 35974232  34  35974232  34 unicode
     1  23029   3 23814136  23  59788368  57 dict (no owner)
     2 143799  18 13556704  13  73345072  70 str
     3  75473  10  7372992   7  80718064  77 tuple
     4   1085   0  2634680   3  83352744  80 dict of module
     5   2764   0  2500384   2  85853128  82 type
     6  19206   2  2458368   2  88311496  84 types.CodeType
     7  15857   2  2409224   2  90720720  87 list
     8  19402   2  2328240   2  93048960  89 function
     9   2764   0  2215840   2  95264800  91 dict of type
<931 more rows. Type e.g. '_.more' to view.>
===2024-07-21 16:27:14 references===
Partition of a set of 396372 objects. Total size = 35974232 bytes.
 Index  Count   %     Size   % Cumulative  % Referred Via:
     0  18748   5  1371888   4   1371888   4 '.keys()[0]'
     1  13046   3   974352   3   2346240   7 '.keys()[1]'
     2   9958   3   724328   2   3070568   9 '.keys()[2]'
     3   9027   2   658576   2   3729144  10 '.keys()[3]'
     4   8636   2   632264   2   4361408  12 '.keys()[4]'
     5   8175   2   607032   2   4968440  14 '.keys()[5]'
     6    715   0   515688   1   5484128  15 '.func_doc', '[0]'
     7   6557   2   502880   1   5987008  17 '.keys()[6]'
     8   5785   1   428904   1   6415912  18 '.keys()[7]'
     9   5168   1   392432   1   6808344  19 '.keys()[8]'
<3213 more rows. Type e.g. '_.more' to view.>
===2024-07-21 16:27:16 references detail===
<via '.keys()[0]'>
 0: hpy().Root.i0_modules['kombu'].__dict__.keys()[0]
Reference Pattern by <[dict of] class>.
 0: _ --- [-] 18748 <via '.keys()[0]'>: 0x7ff3f82dec30, 0x7ff3f82decc0...
 1: a      [-] 18753 dict (no owner): 0x7ff3f82f7050*24, 0x7ff3f82f73b0*3...
 2: aa ---- [-] 317 dict (no owner): 0x7ff3f88e43b0*1, 0x7ff3f88e44d0*1...
 3: a3       [-] 77 dict of aliyunsdkcore.endpoint.endpoint_resolver_rules.En...
 4: a4 ------ [-] 77 aliyunsdkcore.endpoint.endpoint_resolver_rules.EndpointR...
 5: a5         [-] 77 list: 0x7ff3f88f65f0*6, 0x7ff3f897e7d0*6...
 6: a6 -------- [-] 77 dict of aliyunsdkcore.endpoint.chained_endpoint_resolv...
 7: a7           [+] 77 aliyunsdkcore.endpoint.chained_endpoint_resolver.Chai...
 8: aab ---- [-] 80 dict (no owner): 0x7ff3f88e44d0*1, 0x7ff3f88e8b90*1...
 9: aaba      [-] 78 dict of aliyunsdkcore.retry.retry_condition.DefaultConfi...
<Type e.g. '_.more' for more.>

Besides the official documentation, the doc attribute of these objects also lists what is available:

analyzer = guppy.hpy()
heap = analyzer.heap()
print("============== Heap Documents ====================")
print(analyzer.doc)
print("============= Heap Status Documents ================")
print(heap.doc)

Output:

============== Heap Documents ====================
Top level interface to Heapy. Available attributes:
Anything            Nothing             Via                 iso
Class               Rcs                 doc                 load
Clodo               Root                findex              monitor
Id                  Size                heap                pb
Idset               Type                heapu               setref
Module              Unity               idset               test
Use eg: hpy().doc.<attribute> for info on <attribute>.
============= Heap Status Documents ================
biper               byvia               get_examples        parts
brief               count               get_render          pathsin
by                  dictof              get_rp              pathsout
byclass             diff                get_shpaths         referents
byclodo             disjoint            imdom               referrers
byid                doc                 indisize            rp
byidset             dominos             kind                shpaths
bymodule            domisize            maprox              size
byrcs               dump                more                sp
bysize              er                  nodes               stat
bytype              fam                 owners              test_contains
byunity             get_ckc             partition           theone

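The Heap Status documentation shows that byvia is only one of several ways to partition the heap. A few of them:

byvia groups heap entries by how they are referenced (the attribute or index through which the referrer reaches them);
bysize groups heap entries by the individual size of each object;
bytype groups heap entries by object type; all dict entries are merged into a single entry;
byrcs groups heap entries by the kinds of their referrers;
bymodule groups heap entries by module;
byunity groups heap entries by total size, that is, everything falls into a single group that reports the overall total;
byidset groups heap entries by idset;
byid groups heap entries by memory address;

In practice, byvia and bysize are enough for many scenarios.

For more usage examples, see "guppy/heapy - Profile Memory Usage in Python".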

1.5 objgraph

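Install: pip install objgraph

For a quick overview of the objects in memory, call show_most_common_types().
objgraph keeps track of the counts of all live objects; call show_growth() to see what changed since the previous call. On the first call every type is reported as growth.

Typical usage: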

import objgraph
import datetime

def do_something():

    # run your app ...

    print("==={} show_most_common_types===".format(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")))
    objgraph.show_most_common_types(limit=5)
    print("==={} show_growth===".format(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")))
    objgraph.show_growth(limit=5)

Output:

===2024-07-21 16:41:14 show_most_common_types===
function 18495
list     16072
dict     10912
tuple    6515
weakref  3773
===2024-07-21 16:41:14 show_growth===
function    18495    +18495
list        16072    +16072
dict        10903    +10903
tuple        6503     +6503
weakref      3773     +3773

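objgraph can also render the reference relationships between objects as a graph; this relies on Graphviz, with xdot as the interactive viewer.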
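The before/after comparison behind show_growth can be approximated with the standard library alone, which is handy when installing a package is not an option. This is a sketch of the idea, not objgraph's implementation; type_counts, growth and Leaky are hypothetical names.

```python
import gc
from collections import Counter


def type_counts():
    """Count gc-tracked objects per type name."""
    gc.collect()
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth(before, after, limit=5):
    """Types whose count increased between two counts, biggest increase first."""
    rows = [(name, after[name], after[name] - before[name]) for name in after]
    grown = [row for row in rows if row[2] > 0]
    return sorted(grown, key=lambda row: row[2], reverse=True)[:limit]


class Leaky:
    """Hypothetical class standing in for whatever the application leaks."""


before = type_counts()
kept = [Leaky() for _ in range(1000)]  # simulated leak
after = type_counts()
for name, count, delta in growth(before, after):
    print(f"{name:<12}{count:>8}{delta:>+10}")
```

Taking the first count after the application has warmed up avoids the "everything is new" effect seen in the first show_growth call.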

1.6 pympler

Install: pip install pympler

Typical usage:

import datetime
from pympler import tracker, muppy, summary

tr = tracker.SummaryTracker()

def do_something():

    # run your app ...

    print("==={} mem total===".format(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")))
    all_objects = muppy.get_objects()
    sum1 = summary.summarize(all_objects)
    summary.print_(sum1)
    print("==={} mem diff===".format(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")))
    tr.print_diff()

Output:

===2024-07-21 16:17:47 mem total===
                    types |   # objects |   total size
========================= | =========== | ============
                     dict |       35489 |     32.33 MB
                      str |       57287 |      5.50 MB
                  unicode |       41150 |      3.55 MB
                     type |        2748 |      2.37 MB
                     code |       17055 |      2.08 MB
                     list |       16024 |      1.80 MB
                    tuple |       12969 |      1.74 MB
                      set |        1704 |    539.06 KB
                  weakref |        3741 |    321.49 KB
      function (__init__) |        1426 |    167.11 KB
        getset_descriptor |        2294 |    161.30 KB
         _sre.SRE_Pattern |         241 |    116.76 KB
              abc.ABCMeta |         124 |    109.70 KB
       wrapper_descriptor |        1371 |    107.11 KB
  collections.OrderedDict |         341 |    103.82 KB

===2024-07-21 16:17:47 mem diff===
                types |   # objects |   total size
===================== | =========== | ============
                 list |       19695 |      3.77 MB
                  str |       23061 |      1.44 MB
                 dict |         505 |    344.71 KB
              unicode |         285 |     97.78 KB
                 type |          91 |     80.27 KB
                 code |         560 |     70.00 KB
                  int |        2421 |     56.74 KB
          _io.BytesIO |           1 |     24.25 KB
                tuple |         296 |     20.49 KB
     _sre.SRE_Pattern |          25 |      9.86 KB
              weakref |          97 |      8.34 KB
    collections.deque |           7 |      4.77 KB
    getset_descriptor |          54 |      3.80 KB
  function (__repr__) |          32 |      3.75 KB
  function (__init__) |          31 |      3.63 KB

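Drawback: the statistics take a long time to compute. Running them inside the program can easily block the process, so pympler is not suitable for debugging in production.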
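When a full pympler summary is too slow, a cheaper per-type table can be built from gc and sys directly. The sketch below is only an approximation under two assumptions: it sees gc-tracked objects only (so plain str and int instances are missing) and it uses shallow sizes. summarize_by_type is a hypothetical helper, not a pympler function.

```python
import gc
import sys
from collections import defaultdict


def summarize_by_type(limit=5):
    """Return (type name, object count, total shallow bytes), largest total first."""
    totals = defaultdict(lambda: [0, 0])  # type name -> [count, bytes]
    for obj in gc.get_objects():
        entry = totals[type(obj).__name__]
        entry[0] += 1
        entry[1] += sys.getsizeof(obj, 0)  # 0 if the object cannot report a size
    rows = [(name, count, size) for name, (count, size) in totals.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)[:limit]


for name, count, size in summarize_by_type():
    print(f"{name:>25} | {count:>11} | {size / 1024:>9.2f} KB")
```

The result is coarser than the pympler tables above, but it needs only a single pass over gc.get_objects().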

1.7 pyrasite

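Install: pip install pyrasite pyrasite-gui urwid meliae

It also depends on the system gdb (version 7.3+).

On paper the tool is very powerful: given a Python process ID it can inspect the state of the running process, which makes live inspection very convenient.
Unfortunately, I could not get it to work on either macOS or CentOS.

The original requirement was to troubleshoot a Python 2 program, so the attempt was also made in a Python 2.7 environment:

Error 1: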

    Complete output from command python setup.py egg_info: 'NewConnectionError('<pip._vendor.requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x25c5150>: Failed to establish a new connection: [Errno 101] Network is unreachable',)': /simple/cython/
      Could not find a version that satisfies the requirement Cython (from versions: )
    No matching distribution found for Cython
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-build-RqQ7F6/meliae/setup.py", line 96, in <module>
        config()
      File "/tmp/pip-build-RqQ7F6/meliae/setup.py", line 93, in config
        setup(**kwargs)
      File "/usr/lib/python2.7/site-packages/setuptools/__init__.py", line 161, in setup
        _install_setup_requires(attrs)
      File "/usr/lib/python2.7/site-packages/setuptools/__init__.py", line 156, in _install_setup_requires
        dist.fetch_build_eggs(dist.setup_requires)
      File "/usr/lib/python2.7/site-packages/setuptools/dist.py", line 721, in fetch_build_eggs
        replace_conflicting=True,
      File "/usr/lib/python2.7/site-packages/pkg_resources/__init__.py", line 782, in resolve
        replace_conflicting=replace_conflicting
      File "/usr/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1065, in best_match
        return self.obtain(req, installer)
      File "/usr/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1077, in obtain
        return installer(requirement)
      File "/usr/lib/python2.7/site-packages/setuptools/dist.py", line 777, in fetch_build_egg
        return fetch_build_egg(self, req)
      File "/usr/lib/python2.7/site-packages/setuptools/installer.py", line 130, in fetch_build_egg
        raise DistutilsError(str(e))
    distutils.errors.DistutilsError: Command '['/usr/bin/python2', '-m', 'pip', '--disable-pip-version-check', 'wheel', '--no-deps', '-w', '/tmp/tmpryTZj0', '--quiet', 'Cython']' returned non-zero exit status 1

    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-RqQ7F6/meliae/

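The dependency installation error was resolved with pip install -U pip.

After a successful installation, the target Python process was found to have PID 75055.

Running pyrasite-memory-viewer 75055 produced error 2: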

Traceback (most recent call last):
  File "/Users/skyler/Documents/py-env/venv2.7/bin/pyrasite-memory-viewer", line 8, in <module>
    sys.exit(main())
  File "/Users/skyler/Documents/py-env/venv2.7/lib/python2.7/site-packages/pyrasite/tools/memory_viewer.py", line 150, in main
    objects = loader.load(filename)
  File "/Users/skyler/Documents/py-env/venv2.7/lib/python2.7/site-packages/meliae/loader.py", line 541, in load
    source, cleanup = files.open_file(source)
  File "/Users/skyler/Documents/py-env/venv2.7/lib/python2.7/site-packages/meliae/files.py", line 32, in open_file
    source = open(filename, 'rb')
IOError: [Errno 2] No such file or directory: '/tmp/pyrasite-75055-objects.json'

After simply creating the file with touch /tmp/pyrasite-75055-objects.json, running the command again failed with:

Traceback (most recent call last):
  File "/Users/skyler/Documents/py-env/venv2.7/bin/pyrasite-memory-viewer", line 8, in <module>
    sys.exit(main())
  File "/Users/skyler/Documents/py-env/venv2.7/lib/python2.7/site-packages/pyrasite/tools/memory_viewer.py", line 150, in main
    objects = loader.load(filename)
  File "/Users/skyler/Documents/py-env/venv2.7/lib/python2.7/site-packages/meliae/loader.py", line 556, in load
    max_parents=max_parents)
  File "/Users/skyler/Documents/py-env/venv2.7/lib/python2.7/site-packages/meliae/loader.py", line 635, in _load
    factory=objs.add):
  File "/Users/skyler/Documents/py-env/venv2.7/lib/python2.7/site-packages/meliae/loader.py", line 629, in iter_objs
    % (line_num, len(objs), mb_read, input_mb, tdelta))
UnboundLocalError: local variable 'line_num' referenced before assignment

In the end, pyrasite could not be made to work on either macOS or CentOS for this article.

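2. Case study

Environment:

CentOS 7
Python 2.7
mem-top==0.2.1

mem_top is used here because it runs fast and does not interfere with the business process while it is serving requests.

A global counter, count, is defined, and the current memory usage of the process is logged once every 100 calls: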

import logging
import mem_top

logger = logging.getLogger("mem-debug")  # configure the logger as needed
count = 0  # module-level call counter (must be initialized before use)

def do_something():
    # run your app ...

    global count
    if count % 100 == 0:
        # Log the mem_top report once every 100 calls
        msg = mem_top.mem_top(limit=3, width=400)
        logger.info("{} {}".format(count, msg))
    else:
        logger.debug(count)
    count += 1

Excerpt of the output:

refs:
157613189	<type 'list'> [<function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x
5742	<type 'list'> ['# module pyparsing.py\n', '#\n', '# Copyright (c) 2003-2018  Paul T. McGuire\n', '#\n', '# Permission is hereby granted, free of charge, to any person obtaining\n', '# a copy of this software and associated documentation files (the\n', '# "Software"), to deal in the Software without restriction, including\n', '# without limitation the rights to use, copy, modify, merge, publish,\n', '# distribut
4240	<type 'dict'> {'oss2.task_queue': <module 'oss2.task_queue' from '/usr/lib/python2.7/site-packages/oss2/task_queue.pyc'>, 'requests.Cookie': None, 'aliyunsdkcdn.request.v20180510': <module 'aliyunsdkcdn.request.v20180510' from '/usr/lib/python2.7/site-packages/aliyunsdkcdn/request/v20180510/__init__.pyc'>, 'elasticsearch.client.cat': <module 'elasticsearch.client.cat' from '/usr/lib/python2.7/site-packages/elas

bytes:
1377112608	 [<function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x7f777e945398>, <function search_function at 0x
196888	 {'oss2.task_queue': <module 'oss2.task_queue' from '/usr/lib/python2.7/site-packages/oss2/task_queue.pyc'>, 'requests.Cookie': None, 'aliyunsdkcdn.request.v20180510': <module 'aliyunsdkcdn.request.v20180510' from '/usr/lib/python2.7/site-packages/aliyunsdkcdn/request/v20180510/__init__.pyc'>, 'elasticsearch.client.cat': <module 'elasticsearch.client.cat' from '/usr/lib/python2.7/site-packages/elas
49432	 {'FOLLOWLOCATION': 52, 'NETRC_IGNORED': 0, 'E_WRITE_ERROR': 23, 'CONTENT_LENGTH_UPLOAD': 3145744, 'SSLVERSION_TLSv1_0': 4, 'SSLVERSION_TLSv1_1': 5, 'SSLVERSION_TLSv1_2': 6, 'E_COULDNT_CONNECT': 7, 'NETRC_OPTIONAL': 1, 'IOCTLFUNCTION': 20130, 'MAX_SEND_SPEED_LARGE': 30145, 'QUOTE': 10028, 'E_ABORTED_BY_CALLBACK': 42, 'INFOTYPE_TEXT': 0, 'READDATA': 10009, 'POLL_NONE': 0, 'E_CONV_REQD': 76, 'MAXCONN

types:
19638	 <type 'function'>
11322	 <type 'dict'>
7124	 <type 'tuple'>

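The log shows that the leak is caused by <function search_function at 0x7f777e945398>: a single list holds about 157 million references to it and occupies roughly 1.28 GiB.

A global search for search_function in our own code found no usage. At this point other tools could be used to locate the caller through the reference path; I simply brute-forced it with a global search in the directory where the dependencies are installed.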

> cd ./py-env/venv2.7/lib/python2.7/site-packages/
> find . -type f -name "*.py" | xargs grep search_function
./gnupg/_util.py:    codecs.register(encodings.search_function)

This pinpoints the problem in the source code of the third-party library gnupg.

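gnupg is an encryption/decryption module. To handle non-UTF-8 encodings, the library registers a codec search function internally but never unregisters it (Python 2.7 has no unregister function; codecs.unregister was added in Python 3.10), so every pass through that code adds another entry.
Since all of our service code uses UTF-8, that logic is not needed. After commenting out the register line, tests showed that memory no longer leaked.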
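The mechanism can be reproduced in a few lines. This is a sketch under the assumption that CPython keeps registered search functions in an internal list without deduplication, which is consistent with the mem_top output above. fake_search and referrer_list_sizes are hypothetical names, and the cleanup only runs where codecs.unregister exists (Python 3.10+).

```python
import codecs
import gc


def fake_search(encoding_name):
    """Hypothetical search function; returning None means 'codec not found'."""
    return None


def referrer_list_sizes(func):
    """Lengths of all lists that directly reference func."""
    return [len(r) for r in gc.get_referrers(func) if isinstance(r, list)]


for _ in range(1000):
    codecs.register(fake_search)  # repeated registration of the same function

sizes = referrer_list_sizes(fake_search)
print("lists holding fake_search:", sizes)

# Clean up where possible; unregistering a function that is not registered
# is harmless, so one call per registration is a safe upper bound.
unregister = getattr(codecs, "unregister", None)
if unregister is not None:
    for _ in range(1000):
        unregister(fake_search)
```

A list with at least 1000 entries shows up after only 1000 calls, so a long-running service that hits this code on every request grows without bound.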

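Because time was tight, and the author of the library appears not to have maintained it for a long time, the problem was solved by replacing gnupg==2.2.0 with python-gnupg==0.4.6.

python-gnupg source repository: vsajip/python-gnupg
gnupg source repository: isislovecruft/python-gnupg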

3. References
