Debugging driver code with the crash information in dmesg

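I was recently adding a feature to a driver: given a process name, look up the PID of the matching process. The change caused a crash, so let's track down where the problem is.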

First, here is the crash information from dmesg:

[ 4534.975026] BUG: unable to handle kernel NULL pointer dereference at 0000000000000430
[ 4534.976059] IP: [<ffffffffc0747e78>] bts_write+0x1b8/0x830 [bts]
[ 4534.977065] PGD 2195a2067 PUD 219c6f067 PMD 0 
[ 4534.978066] Oops: 0000 [#3] SMP 
[ 4534.979027] Modules linked in: bts(OE) chr(OE) hid_generic usbhid hid rfcomm bnep bluetooth intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm arc4 ath9k amdkfd ath9k_common ath9k_hw amd_iommu_v2 ath radeon snd_hda_codec_idt snd_hda_codec_generic snd_hda_codec_hdmi crct10dif_pclmul snd_hda_intel crc32_pclmul snd_hda_codec mac80211 snd_hda_core aesni_intel aes_x86_64 joydev snd_hwdep hp_wmi snd_pcm sparse_keymap input_leds lrw serio_raw gf128mul glue_helper ppdev ablk_helper lp parport_pc snd_seq_midi cfg80211 snd_seq_midi_event snd_rawmidi snd_seq ttm cryptd snd_seq_device snd_timer mei_me drm_kms_helper mei drm snd i2c_algo_bit soundcore hp_accel lpc_ich lis3lv02d tpm_infineon input_polldev parport video 8250_fintek hp_wireless mac_hid wmi psmouse ahci libahci firewire_ohci sdhci_pci firewire_core e1000e sdhci crc_itu_t ptp pps_core [last unloaded: bts]
[ 4534.985521] CPU: 0 PID: 3462 Comm: ops_main Tainted: G      D W  OE   4.2.0-42-generic #49~14.04.1-Ubuntu
[ 4534.986561] Hardware name: Hewlett-Packard HP ProBook 6470b/179C, BIOS 68ICE Ver. F.45 10/07/2013
[ 4534.987607] task: ffff8802203a5280 ti: ffff880220298000 task.ti: ffff880220298000
[ 4534.988636] RIP: 0010:[<ffffffffc0747e78>]  [<ffffffffc0747e78>] bts_write+0x1b8/0x830 [bts]
[ 4534.989674] RSP: 0018:ffff88022029bd38  EFLAGS: 00010246
[ 4534.990663] RAX: ffffffff81c15840 RBX: 0000000000000006 RCX: 0000000000000002
[ 4534.991635] RDX: 0000000000000002 RSI: ffff88022029bd51 RDI: ffff8802203a5859
[ 4534.992587] RBP: ffff88022029be98 R08: ffffffffc074b060 R09: 315f6e65706f5f34
[ 4534.993573] R10: 00007fd6ff1ba6a0 R11: 0000000000000246 R12: 0000000000000000
[ 4534.994497] R13: ffffffff81c15840 R14: ffff8802203a5858 R15: ffff8800b8e7b000
[ 4534.995411] FS:  00007fd6ff3cb740(0000) GS:ffff88023ec00000(0000) knlGS:0000000000000000
[ 4534.996324] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4534.997232] CR2: 0000000000000430 CR3: 000000022b479000 CR4: 00000000001406f0
[ 4534.998334] Stack:
[ 4534.999528]  ffff88022029bd68 ffffffff811f833e ffff88022029bd68 ffffffffc074a201
[ 4535.000466]  ffffffff81c15840 7700007472617473 6174732065746972 6563617274207472
[ 4535.001395]  646e616d6d6f6320 253a726f72726520 7320737462000a73 6f72726520706f74
[ 4535.002401] Call Trace:
[ 4535.003364]  [<ffffffff811f833e>] ? terminate_walk+0x6e/0xe0
[ 4535.004328]  [<ffffffff811ede38>] __vfs_write+0x18/0x40
[ 4535.005283]  [<ffffffff811ee479>] vfs_write+0xa9/0x190
[ 4535.006244]  [<ffffffff810dbefd>] ? call_rcu_sched+0x1d/0x20
[ 4535.007182]  [<ffffffff811ef1e6>] SyS_write+0x46/0xa0
[ 4535.008111]  [<ffffffff817c36f2>] entry_SYSCALL_64_fastpath+0x16/0x75
[ 4535.009038] Code: 00 00 49 8b 84 24 40 03 00 00 48 89 85 c0 fe ff ff 4c 8b ad c0 fe ff ff 4d 8d a5 c0 fc ff ff 49 81 fc 00 55 c1 81 75 bc 45 31 e4 <45> 8b a4 24 30 04 00 00 48 c7 c7 1d a2 74 c0 31 c0 44 89 e6 e8 
[ 4535.011028] RIP  [<ffffffffc0747e78>] bts_write+0x1b8/0x830 [bts]
[ 4535.011968]  RSP <ffff88022029bd38>
[ 4535.012902] CR2: 0000000000000430
[ 4535.013850] ---[ end trace bd7d268405d6447e ]---

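The dmesg log shows BUG: unable to handle kernel NULL pointer dereference at 0000000000000430. Taken literally, the kernel dereferenced a NULL pointer; the faulting address (CR2) is 0x430 while R12 is 0, which is the usual picture when a struct member is read through a NULL pointer. The second piece of information is just as important: bts_write+0x1b8/0x830 [bts]. It names the faulting function and the offset within it: the fault is in bts_write, 0x1b8 bytes from the start of the function (0x830 is the size of the function, and [bts] is the module).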

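With this information in hand, the first step is to disassemble the object file produced when building the driver (xxx.o). The objdump tool that ships with Linux is all you need: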

//If you are unsure of the options, run objdump --help (or -H); note that -h prints the section headers instead
curtis@curtis-virtual-machine:/mnt/hgfs/share/write_code/runqueue$ objdump --help
Usage: objdump <option(s)> <file(s)>
 Display information from object <file(s)>.
 At least one of the following switches must be given:
  -a, --archive-headers    Display archive header information
  -f, --file-headers       Display the contents of the overall file header
  -p, --private-headers    Display object format specific file header contents
  -P, --private=OPT,OPT... Display object format specific contents
  -h, --[section-]headers  Display the contents of the section headers
  -x, --all-headers        Display the contents of all headers
  -d, --disassemble        Display assembler contents of executable sections
  -D, --disassemble-all    Display assembler contents of all sections
  -S, --source             Intermix source code with disassembly
  -s, --full-contents      Display the full contents of all sections requested
  -g, --debugging          Display debug information in object file
  -e, --debugging-tags     Display debug information using ctags style
  -G, --stabs              Display (in raw form) any STABS info in the file
  -W[lLiaprmfFsoRt] or
  --dwarf[=rawline,=decodedline,=info,=abbrev,=pubnames,=aranges,=macro,=frames,
          =frames-interp,=str,=loc,=Ranges,=pubtypes,
          =gdb_index,=trace_info,=trace_abbrev,=trace_aranges,
          =addr,=cu_index]
                           Display DWARF info in the file
  -t, --syms               Display the contents of the symbol table(s)
  -T, --dynamic-syms       Display the contents of the dynamic symbol table
  -r, --reloc              Display the relocation entries in the file
  -R, --dynamic-reloc      Display the dynamic relocation entries in the file
  @<file>                  Read options from <file>
  -v, --version            Display this program's version number
  -i, --info               List object formats and architectures supported
  -H, --help               Display this information

 The following switches are optional:
  -b, --target=BFDNAME           Specify the target object format as BFDNAME
  -m, --architecture=MACHINE     Specify the target architecture as MACHINE
  -j, --section=NAME             Only display information for section NAME
  -M, --disassembler-options=OPT Pass text OPT on to the disassembler
  -EB --endian=big               Assume big endian format when disassembling
  -EL --endian=little            Assume little endian format when disassembling
      --file-start-context       Include context from start of file (with -S)
  -I, --include=DIR              Add DIR to search list for source files
  -l, --line-numbers             Include line numbers and filenames in output
  -F, --file-offsets             Include file offsets when displaying information
  -C, --demangle[=STYLE]         Decode mangled/processed symbol names
                                  The STYLE, if specified, can be `auto', `gnu',
                                  `lucid', `arm', `hp', `edg', `gnu-v3', `java'
                                  or `gnat'
  -w, --wide                     Format output for more than 80 columns
  -z, --disassemble-zeroes       Do not skip blocks of zeroes when disassembling
      --start-address=ADDR       Only process data whose address is >= ADDR
      --stop-address=ADDR        Only process data whose address is <= ADDR
      --prefix-addresses         Print complete address alongside disassembly
      --[no-]show-raw-insn       Display hex alongside symbolic disassembly
      --insn-width=WIDTH         Display WIDTH bytes on a single line for -d
      --adjust-vma=OFFSET        Add OFFSET to all displayed section addresses
      --special-syms             Include special symbols in symbol dumps
      --prefix=PREFIX            Add PREFIX to absolute paths for -S
      --prefix-strip=LEVEL       Strip initial directory names for -S
      --dwarf-depth=N        Do not display DIEs at depth N or greater
      --dwarf-start=N        Display DIEs starting with N, at the same depth
                             or deeper
      --dwarf-check          Make additional dwarf internal consistency checks.      

objdump: supported targets: elf64-x86-64 elf32-i386 elf32-x86-64 a.out-i386-linux pei-i386 pei-x86-64 elf64-l1om elf64-k1om elf64-little elf64-big elf32-little elf32-big pe-x86-64 pe-i386 plugin srec symbolsrec verilog tekhex binary ihex
objdump: supported architectures: i386 i386:x86-64 i386:x64-32 i8086 i386:intel i386:x86-64:intel i386:x64-32:intel i386:nacl i386:x86-64:nacl i386:x64-32:nacl l1om l1om:intel k1om k1om:intel plugin

The following i386/x86-64 specific disassembler options are supported for use
with the -M switch (multiple options should be separated by commas):
  x86-64      Disassemble in 64bit mode
  i386        Disassemble in 32bit mode
  i8086       Disassemble in 16bit mode
  att         Display instruction in AT&T syntax
  intel       Display instruction in Intel syntax
  att-mnemonic
              Display instruction in AT&T mnemonic
  intel-mnemonic
              Display instruction in Intel mnemonic
  addr64      Assume 64bit address size
  addr32      Assume 32bit address size
  addr16      Assume 16bit address size
  data32      Assume 32bit data size
  data16      Assume 16bit data size
  suffix      Always display instruction suffix in AT&T syntax
Report bugs to <http://www.sourceware.org/bugzilla/>.

//Use -D to disassemble all sections, and redirect the output to a file for later inspection
curtis@curtis-HP-ProBook-6470b:~/Desktop/per_bts/drv$ objdump bts.o -D > err.txt

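By default objdump prints AT&T syntax; if you are not used to it, add -M intel to get Intel syntax. The next step is to find the base address of the faulting function. Open the file in vim and search for bts_write: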

0000000000000cc0 <bts_write>:
     cc0:       e8 00 00 00 00          callq  cc5 <bts_write+0x5>
     cc5:       55                      push   %rbp
     cc6:       b9 20 00 00 00          mov    $0x20,%ecx
     ccb:       48 89 e5                mov    %rsp,%rbp
     cce:       41 57                   push   %r15
     cd0:       41 56                   push   %r14
     cd2:       45 31 f6                xor    %r14d,%r14d
     cd5:       41 55                   push   %r13
     cd7:       4c 8d ad c8 fe ff ff    lea    -0x138(%rbp),%r13
     cde:       41 54                   push   %r12
     ce0:       53                      push   %rbx
     ce1:       48 89 d3                mov    %rdx,%rbx
     ce4:       ba fe 00 00 00          mov    $0xfe,%edx
     ce9:       48 81 ec 38 01 00 00    sub    $0x138,%rsp
     cf0:       4c 8b bf d0 00 00 00    mov    0xd0(%rdi),%r15
     cf7:       48 8d bd c8 fe ff ff    lea    -0x138(%rbp),%rdi
     cfe:       65 48 8b 04 25 28 00    mov    %gs:0x28,%rax
     d05:       00 00
     d07:       48 89 45 c8             mov    %rax,-0x38(%rbp)

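The listing shows that the function's base address is 0xcc0. To locate the faulting instruction, add the offset 0x1b8: 0xcc0 + 0x1b8 = 0xe78.
The next step is to map that address to a line of source code, which calls for another tool, addr2line.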

Note: if the driver was built without “KBUILD_CFLAGS+= -g” in the Makefile, the object file carries no debug information, and addr2line cannot map the oops address to a line number!

A sample Makefile:

obj-m += good.o

KBUILD_CFLAGS+= -g

all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean

curtis@curtis-HP-ProBook-6470b:~/Desktop/per_bts/drv$ addr2line -h
Usage: addr2line [option(s)] [addr(s)]
 Convert addresses into line number/file name pairs.
 If no addresses are specified on the command line, they will be read from stdin
 The options are:
  @<file>                Read options from <file>
  -a --addresses         Show addresses
  -b --target=<bfdname>  Set the binary file format
  -e --exe=<executable>  Set the input file name (default is a.out)
  -i --inlines           Unwind inlined functions
  -j --section=<name>    Read section-relative offsets instead of addresses
  -p --pretty-print      Make the output easier to read for humans
  -s --basenames         Strip directory names
  -f --functions         Show function names
  -C --demangle[=style]  Demangle function names
  -h --help              Display this information
  -v --version           Display the program's version

curtis@curtis-HP-ProBook-6470b:~/Desktop/per_bts/drv$ addr2line -C -f -e bts.o e78
find_pid
/home/curtis/Desktop/per_bts/drv/bts_driver.c:108

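This gives both the faulting function and the line number: the function is find_pid and the line is 108. (The oops names bts_write rather than find_pid, presumably because the compiler inlined find_pid into it.) Find that function in the source: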

static int find_pid(char *string_name)
{
        unsigned int pid;
        /* Fixed: was `char *find_name = &string_name;`, the address of the parameter itself */
        char *find_name = string_name;
        struct task_struct *task;

        task = find_task(find_name);
        pid = task->pid;        /* line 108: faults when find_task returns NULL */
        printk("Have find pid is %d\n", pid);
        return pid;
}

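A closer look shows that find_task did not return the process's task_struct, so task was NULL and dereferencing it caused the fault. The root cause was that, after fairly extensive changes to the surrounding code, find_name ended up being initialized incorrectly: the parameter is already a pointer to the string, so taking its address passed find_task the wrong pointer. With that line fixed, the problem was solved.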

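Loading the driver's debug symbols in crash

In the crash utility, mod -s loads the symbols for a single module from its object file (mod -S takes a directory and loads the symbols for all loaded modules). Here driver stands for the module name:

mod -s driver /path/to/driver.o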
