1. Background
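The earlier article "qemu 单步调试 arm64 linux kernel" (single-stepping the arm64 Linux kernel under QEMU, https://xie.infoq.cn/article/6befd29e6d671f1527b3a03d6) showed how to single-step the kernel itself. But we also often write small test drivers, so how do we debug the driver part?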
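2. Environment setup
To debug a driver we first need a simple driver, so a minimal hello world is used here. The program is very simple: it registers a character device (a misc device), prints a variable when the device node is opened with cat, and defines a global variable that we will inspect from gdb.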
#include <linux/module.h>     /* Needed by all modules */
#include <linux/kernel.h>     /* Needed for KERN_INFO */
#include <linux/init.h>       /* Needed for the __init/__exit macros */
#include <linux/fs.h>         /* Needed for struct file_operations */
#include <linux/miscdevice.h> /* Needed for misc device */

#define DEVICE_NAME "Helloworld"

/* Global variable, inspected from gdb later on */
int global_hello_value = 996;

/* Exists only to give gdb a breakpoint target in the .text section */
void test_for_debug(void)
{
    printk(KERN_INFO "just for debug the text section %d\n", global_hello_value);
}

/* Called when /dev/Helloworld is opened, e.g. by cat */
static int hello_open(struct inode *inode, struct file *file)
{
    test_for_debug();
    return 0;
}

static struct file_operations hello_fops = {
    .owner = THIS_MODULE,
    .open = hello_open,
};

static struct miscdevice hello_misc = {
    .minor = MISC_DYNAMIC_MINOR,
    .name = DEVICE_NAME,
    .fops = &hello_fops,
};

/* Module init: placed in .init.text because of __init */
static int __init hello_start(void)
{
    int ret;

    ret = misc_register(&hello_misc);
    if (ret < 0) {
        printk(KERN_EMERG " %s register failed %d.\n", DEVICE_NAME, ret);
        return ret;
    }
    printk(KERN_INFO "Hello world\n");
    return 0;
}

static void __exit hello_end(void)
{
    test_for_debug();
    misc_deregister(&hello_misc);
    printk(KERN_INFO "hello_end Goodbye\n");
}

MODULE_LICENSE("GPL");
MODULE_AUTHOR("geek");
MODULE_DESCRIPTION("A simple Hello world misc driver!");
MODULE_VERSION("0.1");
module_init(hello_start);
module_exit(hello_end);
Build the .ko with the make modules target:
make ARCH=arm64 CROSS_COMPILE=aarch64-none-linux-gnu- modules -j8
When the build finishes, the corresponding .ko file is generated in the hello driver's source directory.
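The article does not show the Makefile that sits next to hello_driver.c. As a minimal sketch, assuming the driver lives in drivers/my_driver inside the kernel tree (the path that appears in the gdb output later) and consists of the single file hello_driver.c, the helper below writes a one-line kbuild Makefile. The function name write_module_makefile is made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal kbuild Makefile for the hello driver.
# Assumption: the module is built from a single source file, hello_driver.c.
write_module_makefile() {
    dir=$1                                   # e.g. drivers/my_driver
    mkdir -p "$dir"
    printf 'obj-m += hello_driver.o\n' > "$dir/Makefile"
}

# Example use from the top of the kernel tree (not executed here):
#   write_module_makefile drivers/my_driver
#   make ARCH=arm64 CROSS_COMPILE=aarch64-none-linux-gnu- M=drivers/my_driver modules
```

Passing M= restricts the build to that one directory. The plain make modules invocation shown earlier also works once the directory is hooked into drivers/Makefile.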
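Put the .ko into our rootfs. Following the rootfs image section of the previous article (无人知晓: "qemu搭建arm64 linux kernel调试环境", building an arm64 Linux kernel debugging environment with QEMU), mount rootfs.img, copy the .ko in, and then unmount the image: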
sudo mount rootfs.img rootfs
sudo mkdir rootfs/driver
sudo cp ../hello_driver.ko rootfs/driver/
sudo umount rootfs
3. Loading the .ko
insmod prints "Hello world" and creates the /dev/Helloworld node:
~ # insmod driver/hello_driver.ko
[ 11.203785] Hello world
~ # cd /dev/
/dev # ls
Helloworld ptypc tty36 ttyp0
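After connecting gdb, try to look at the module's global_hello_value variable: it cannot be displayed, and the test_for_debug function cannot be found either, because gdb has only loaded the symbols from vmlinux so far: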
(gdb) p global_hello_value
No symbol "global_hello_value" in current context.
(gdb) b test_for_debug
Function "test_for_debug" not defined.
Make breakpoint pending on future shared library load? (y or [n]) n
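4. Debugging the .ko
Debugging a .ko takes three steps: load the .ko and find the address it was loaded at; load the .ko symbols into gdb at exactly that address; finally set breakpoints and start debugging.
Method 1: after insmod, read the load address from the files under /sys/module/XXXX/sections/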
/driver # cat /sys/module/hello_driver/sections/.text
0xffff80007a860000
Load the .ko symbols by running the following in gdb:
(gdb) add-symbol-file drivers/my_driver/hello_driver.ko -s .text 0xffff80007a860000
add symbol table from file "drivers/my_driver/hello_driver.ko" at
.text_addr = 0xffff80007a860000
(y or n) y
Reading symbols from drivers/my_driver/hello_driver.ko...
Then set the breakpoint:
(gdb) b test_for_debug
Breakpoint 1 at 0x4: test_for_debug. (3 locations)
(gdb) c
Continuing.
After continue, cat the device node inside QEMU to hit the breakpoint.
In the gdb window:
(gdb) bt
#0 0xffff80007a860004 in test_for_debug () at drivers/my_driver/hello_driver.c:12
#1 hello_open (inode=0xffff000002f11610, file=0xffff00000381bb00) at drivers/my_driver/hello_driver.c:16
#2 0xffff8000808623d8 in misc_open (inode=0xffff000002f11610, file=0xffff00000381bb00) at drivers/char/misc.c:165
#3 0xffff8000802c8250 in chrdev_open (inode=0xffff000002f11610, filp=0xffff00000381bb00) at fs/char_dev.c:414
#4 0xffff8000802bcfcc in do_dentry_open (f=0xffff00000381bb00, inode=0xffff000002f11610, open=0xffff8000802c8194 <chrdev_open>) at fs/open.c:929
#5 0xffff8000802beda0 in vfs_open (path=0x2 <hello_end+2>, file=0xffff000002f11610) at fs/open.c:1063
#6 0xffff8000802d6350 in do_open (op=<optimized out>, file=<optimized out>, nd=<optimized out>) at fs/namei.c:3640
#7 path_openat (nd=0xffff800082adbc40, op=0x4 <hello_end+4>, flags=2055602176) at fs/namei.c:3797
#8 0xffff8000802d706c in do_filp_open (dfd=-100, pathname=0xffff000002e40000, op=0xffff800082adbd74) at fs/namei.c:3824
#9 0xffff8000802bf048 in do_sys_openat2 (dfd=-100, filename=0x646c <error: Cannot access memory at address 0x646c>, how=0x3 <hello_end+3>) at fs/open.c:1422
#10 0xffff8000802bf388 in do_sys_open (mode=<optimized out>, flags=<optimized out>, filename=<optimized out>, dfd=<optimized out>) at fs/open.c:1437
#11 __do_sys_openat (mode=<optimized out>, flags=<optimized out>, filename=<optimized out>, dfd=<optimized out>) at fs/open.c:1453
#12 __se_sys_openat (mode=<optimized out>, flags=<optimized out>, filename=<optimized out>, dfd=<optimized out>) at fs/open.c:1448
#13 __arm64_sys_openat (regs=0xffff80007a860000 <hello_open>) at fs/open.c:1448
#14 0xffff800080027738 in __invoke_syscall (syscall_fn=<optimized out>, regs=<optimized out>) at arch/arm64/kernel/syscall.c:37
#15 invoke_syscall (regs=0xffff800082adbeb0, scno=58833664, sc_nr=2055602176, syscall_table=0x2 <hello_end+2>) at arch/arm64/kernel/syscall.c:51
#16 0xffff800080027840 in el0_svc_common (regs=0xffff800082adbeb0, scno=1, syscall_table=0x2 <hello_end+2>, sc_nr=<optimized out>) at arch/arm64/kernel/syscall.c:136
#17 0xffff8000800278fc in do_el0_svc (regs=0xffff000002f11610) at arch/arm64/kernel/syscall.c:155
#18 0xffff800081016224 in el0_svc (regs=0xffff800082adbeb0) at arch/arm64/kernel/entry-common.c:678
#19 0xffff800081016688 in el0t_64_sync_handler (regs=0xffff80007a860000 <hello_open>) at arch/arm64/kernel/entry-common.c:696
#20 0xffff800080011d4c in el0t_64_sync () at arch/arm64/kernel/entry.S:59
One problem remains: why does reading global_hello_value fail?
(gdb) p global_hello_value
Cannot access memory at address 0x288
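This is because only the .text section was specified when the symbols were loaded, not .data, so gdb has no run-time address for the variable.
Quit gdb and reload the .ko symbols with the command below (the addresses of .text, .data and the other sections are all under /sys/module/hello_driver/sections/):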
(gdb) add-symbol-file drivers/my_driver/hello_driver.ko -s .text 0xffff80007a860000 -s .data 0xffff80007a862000
add symbol table from file "drivers/my_driver/hello_driver.ko" at
.text_addr = 0xffff80007a860000
.data_addr = 0xffff80007a862000
(y or n) y
Reading symbols from drivers/my_driver/hello_driver.ko...
(gdb) p global_hello_value
$1 = 996
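Typing each section address by hand is error-prone, so method 1 can be scripted. The sketch below runs in the guest as root and prints a ready-to-paste add-symbol-file command from the section files. The function name gen_add_symbol_file and the list of sections it looks for are choices made for this example, not something from the original setup.

```shell
#!/bin/sh
# Hypothetical helper: build a gdb add-symbol-file command from the
# per-section address files that sysfs exposes for a loaded module.
gen_add_symbol_file() {
    sections_dir=$1   # e.g. /sys/module/hello_driver/sections
    ko_path=$2        # path of the .ko as gdb sees it on the host
    cmd="add-symbol-file $ko_path"
    # Sections that are missing for this module are simply skipped.
    for sec in .text .data .bss .rodata; do
        f="$sections_dir/$sec"
        [ -r "$f" ] || continue
        cmd="$cmd -s $sec $(cat "$f")"
    done
    printf '%s\n' "$cmd"
}

# Example use inside the guest (not executed here):
#   gen_add_symbol_file /sys/module/hello_driver/sections drivers/my_driver/hello_driver.ko
```

The printed line can be pasted into the gdb session on the host. The module has to be loaded already, which is why this does not help with init-time bugs.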
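Method 2: set a breakpoint that stops inside load_module. This makes it possible to debug the initialization path, for example the function registered with module_init. (If the driver has a bug there, the section information under /sys/module never gets a chance to be created.)
For how a .ko is loaded, see the article "linux ko模块动态加载源码分析" (a source-level analysis of dynamic kernel module loading).
The core idea is to set a breakpoint at do_init_module and then extract the module's load information from struct module.
Note: the module_init function lives in the .init.text section, so add-symbol-file must also be given the address of .init.text before a breakpoint can be set in the module's init function.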
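(gdb) b do_init_module
With the breakpoint set, continue and load the .ko in the guest. When the breakpoint hits, use gdb to print the section names and addresses stored in struct module: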
(gdb) n
2523 freeinit->init_text = mod->mem[MOD_INIT_TEXT].base;
(gdb) p *mod->sect_attrs->attrs@20
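The raw dump of mod->sect_attrs->attrs is verbose and shows the addresses in decimal. As a rough sketch, the helper below writes a small gdb command file that loops over the sections and prints one name and one hex address per line. It assumes the 6.6-era layout in which each attrs[] entry has a battr.attr.name and an address field and the count is in nsections; other kernel versions may differ. The name write_sections_gdb is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: write a gdb command file that lists a module's
# sections while stopped in do_init_module (where `mod` is in scope).
# Assumption: struct module_sect_attrs has nsections and attrs[], and each
# entry has battr.attr.name and address (as in Linux 6.6).
write_sections_gdb() {
    cat > "$1" <<'EOF'
set $i = 0
while $i < mod->sect_attrs->nsections
  printf "%s 0x%lx\n", mod->sect_attrs->attrs[$i].battr.attr.name, mod->sect_attrs->attrs[$i].address
  set $i = $i + 1
end
EOF
}

# Example use on the host (not executed here):
#   write_sections_gdb dump_sections.gdb
#   then, at the do_init_module breakpoint:  (gdb) source dump_sections.gdb
```

The lines for .text, .init.text and .data are the ones needed for add-symbol-file.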
Set a breakpoint at hello_start:
(gdb) add-symbol-file drivers/my_driver/hello_driver.ko -s .text 18446603338276798464 -s .init.text 18446603338276823040 -s .data 18446603338276806656
add symbol table from file "drivers/my_driver/hello_driver.ko" at
.text_addr = 0xffff80007a860000
.init.text_addr = 0xffff80007a866000
.data_addr = 0xffff80007a862000
(y or n) y
Reading symbols from drivers/my_driver/hello_driver.ko...
(gdb) b hello_start
Breakpoint 2 at 0xffff80007a866000: file drivers/my_driver/hello_driver.c, line 34.
Thread 2 hit Breakpoint 2, hello_start () at drivers/my_driver/hello_driver.c:34
warning: Source file is more recent than executable.
34 ret = misc_register(&hello_misc);
(gdb) bt
#0 hello_start () at drivers/my_driver/hello_driver.c:34
#1 0xffff800080014dbc in do_one_initcall (fn=0xffff80007a866000 <hello_start>) at init/main.c:1232
#2 0xffff800080120d20 in do_init_module (mod=0xffff80007a862180) at kernel/module/main.c:2530
#3 0xffff800080122dfc in load_module (info=0xffff800082af3ac8, uargs=0xffff0000035e9d80 "\004", flags=0) at kernel/module/main.c:2981
#4 0xffff800080123020 in __do_sys_init_module (umod=0x23756d60, len=38752, uargs=0x5bdf21 "") at kernel/module/main.c:3058
#5 0xffff800080123140 in __se_sys_init_module (uargs=<optimized out>, len=<optimized out>, umod=<optimized out>) at kernel/module/main.c:3038
#6 __arm64_sys_init_module (regs=0x0 <hello_end>) at kernel/module/main.c:3038
#7 0xffff800080027738 in __invoke_syscall (syscall_fn=<optimized out>, regs=<optimized out>) at arch/arm64/kernel/syscall.c:37
#8 invoke_syscall (regs=0xffff800082af3eb0, scno=56532352, sc_nr=0, syscall_table=0x0 <hello_end>) at arch/arm64/kernel/syscall.c:51
#9 0xffff800080027840 in el0_svc_common (regs=0xffff800082af3eb0, scno=-48, syscall_table=0x0 <hello_end>, sc_nr=<optimized out>) at arch/arm64/kernel/syscall.c:136
#10 0xffff8000800278fc in do_el0_svc (regs=0x0 <hello_end>) at arch/arm64/kernel/syscall.c:155
#11 0xffff800081016224 in el0_svc (regs=0xffff800082af3eb0) at arch/arm64/kernel/entry-common.c:678
#12 0xffff800081016688 in el0t_64_sync_handler (regs=0x0 <hello_end>) at arch/arm64/kernel/entry-common.c:696
#13 0xffff800080011d4c in el0t_64_sync () at arch/arm64/kernel/entry.S:595
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
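In method 2 the section addresses were passed to add-symbol-file in decimal, exactly as gdb printed them from struct module, while gdb echoed them back in hex. A quick way to cross-check the two forms from a shell is printf, assuming a printf that handles 64-bit unsigned values (bash and coreutils do):

```shell
#!/bin/sh
# Convert a decimal kernel address to the hex form used elsewhere in gdb.
to_hex() {
    printf '0x%x\n' "$1"
}

# The three decimal addresses used for .text, .init.text and .data:
for addr in 18446603338276798464 18446603338276823040 18446603338276806656; do
    to_hex "$addr"
done
# prints:
#   0xffff80007a860000
#   0xffff80007a866000
#   0xffff80007a862000
```

These match the .text_addr, .init.text_addr and .data_addr values that gdb reported when the symbols were added.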
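Method 3: use the lx-symbols helper command from the vmlinux-gdb.py script
Loading through the gdb scripts also takes three steps: build the gdb script environment; lift gdb's script auto-load restriction; run lx-symbols to pick up the .ko load address. Besides lx-symbols, the scripts of course include many other helpers for debugging.
Reference: https://www.codenong.com/cs105354913/
Build the script environment: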
make ARCH=arm64 CROSS_COMPILE=aarch64-none-linux-gnu- scripts_gdb
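If vmlinux-gdb.py appears in the root of the kernel tree, the build succeeded.
To lift the gdb script auto-load restriction, add the line set auto-load safe-path / to ~/.config/gdb/gdbinit (this is a gdb setting, not an environment variable).
Without it, loading vmlinux from the Linux source directory makes gdb report the restriction and suggest this fix.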
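Setting the safe path to / trusts auto-load scripts from anywhere on disk. A narrower alternative is gdb's add-auto-load-safe-path command, which trusts only one directory. The sketch below appends that line to a gdbinit file once. The function name allow_autoload is made up for this example, and the kernel path in the usage comment is the one that appears in the lx-symbols output later in this article.

```shell
#!/bin/sh
# Hypothetical helper: allow gdb to auto-load scripts from one kernel tree
# only, by adding an add-auto-load-safe-path line to a gdbinit file.
allow_autoload() {
    gdbinit=$1        # e.g. "$HOME/.config/gdb/gdbinit"
    kernel_dir=$2     # root of the kernel source tree
    line="add-auto-load-safe-path $kernel_dir"
    mkdir -p "$(dirname "$gdbinit")"
    touch "$gdbinit"
    # Append only if the exact line is not present yet (idempotent).
    grep -qxF "$line" "$gdbinit" || printf '%s\n' "$line" >> "$gdbinit"
}

# Example use (not executed here):
#   allow_autoload "$HOME/.config/gdb/gdbinit" /home/geek/workspace/linux/linux-6.6.1
```

Restart gdb afterwards so that the new gdbinit line takes effect.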
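After adding the setting and restarting gdb, loading the vmlinux-gdb.py script failed with an error.
The error occurs because the kernel was built with CONFIG_DEBUG_INFO_REDUCED enabled, which prevents the gdb scripts from working fully. Remove this option from arch/arm64/configs/defconfig, rebuild the kernel, and then load vmlinux again.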
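Before rebuilding, it can save a round trip to confirm that the generated .config really has the options the gdb scripts depend on. The sketch below checks a .config file for CONFIG_GDB_SCRIPTS=y and for CONFIG_DEBUG_INFO_REDUCED being off, the two options the kernel's gdb debugging documentation calls out. The function name check_gdb_config is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: report .config settings that would break the
# vmlinux-gdb.py helper scripts. Returns 0 when the config looks fine.
check_gdb_config() {
    cfg=$1            # path to the kernel .config
    rc=0
    if ! grep -q '^CONFIG_GDB_SCRIPTS=y$' "$cfg"; then
        echo "CONFIG_GDB_SCRIPTS is not enabled"
        rc=1
    fi
    if grep -q '^CONFIG_DEBUG_INFO_REDUCED=y$' "$cfg"; then
        echo "CONFIG_DEBUG_INFO_REDUCED is still enabled"
        rc=1
    fi
    return "$rc"
}

# Example use from the top of the kernel tree (not executed here):
#   check_gdb_config .config
```

As an alternative to editing defconfig, the in-tree scripts/config tool can flip the options in an existing .config, for example scripts/config --disable DEBUG_INFO_REDUCED --enable GDB_SCRIPTS followed by make olddefconfig.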
After rebuilding the kernel, remember to regenerate the scripts as well:
make ARCH=arm64 CROSS_COMPILE=aarch64-none-linux-gnu- scripts_gdb
(gdb) lx-symbols
loading vmlinux
scanning for modules in /home/geek/workspace/linux/linux-6.6.1
loading @0xffff80007a860000: /home/geek/workspace/linux/linux-6.6.1/drivers/my_driver/hello_driver.ko
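This command loads the .ko symbols automatically, so breakpoints can be set right away, without the tedious add-symbol-file section setup described earlier. (To debug the driver's init path, however, method 2 is still needed.)
Method 3 takes a bit more setup, but vmlinux-gdb.py provides a large number of debugging helpers, which apropos lx lists. They can make debugging noticeably more efficient.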
(gdb) apropos lx
function lx_clk_core_lookup -- Find struct clk_core by name
function lx_current -- Return current task.
function lx_dentry_name -- Return string of the full path of a dentry.
function lx_device_find_by_bus_name -- Find struct device by bus and name (both strings)
function lx_device_find_by_class_name -- Find struct device by class and name (both strings)
function lx_i_dentry -- Return dentry pointer for inode.
function lx_module -- Find module by name and return the module variable.
function lx_per_cpu -- Return per-cpu variable.
function lx_radix_tree_lookup -- Lookup and return a node from a RadixTree.
function lx_rb_first -- Lookup and return a node from an RBTree
function lx_rb_last -- Lookup and return a node from an RBTree.
function lx_rb_next -- Lookup and return a node from an RBTree.
function lx_rb_prev -- Lookup and return a node from an RBTree.
function lx_task_by_pid -- Find Linux task by PID and return the task_struct variable.
function lx_thread_info -- Calculate Linux thread_info from task variable.
function lx_thread_info_by_pid -- Calculate Linux thread_info from task variable found by pid
lx-clk-summary -- Print clk tree summary
lx-cmdline -- Report the Linux Commandline used in the current kernel.
lx-configdump -- Output kernel config to the filename specified as the command
lx-cpus -- List CPU status arrays
lx-device-list-bus -- Print devices on a bus (or all buses if not specified)
lx-device-list-class -- Print devices in a class (or all classes if not specified)
lx-device-list-tree -- Print a device and its children recursively
lx-dmesg -- Print Linux kernel log buffer.
lx-dump-page-owner -- Dump page owner
lx-fdtdump -- Output Flattened Device Tree header and dump FDT blob to the filename
lx-genpd-summary -- Print genpd summary
lx-getmod-by-textaddr -- Look up loaded kernel module by text address.
lx-interruptlist -- Print /proc/interrupts
lx-iomem -- Identify the IO memory resource locations defined by the kernel
lx-ioports -- Identify the IO port resource locations defined by the kernel
lx-list-check -- Verify a list consistency
lx-lsmod -- List currently loaded modules.
lx-mounts -- Report the VFS mounts of the current process namespace.
lx-page_address -- struct page to linear mapping address
lx-page_to_pfn -- struct page to PFN
lx-page_to_phys -- struct page to physical address
lx-pfn_to_kaddr -- PFN to kernel address
lx-pfn_to_page -- PFN to struct page
lx-ps -- Dump Linux tasks.
lx-slabinfo -- Show slabinfo
lx-slabtrace -- Show specific cache slabtrace
lx-sym_to_pfn -- symbol address to PFN
lx-symbols -- (Re-)load symbols of Linux kernel and currently loaded modules.
lx-timerlist -- Print /proc/timer_list
lx-version -- Report the Linux Version of the current kernel.
lx-virt_to_page -- virtual address to struct page
lx-virt_to_phys -- virtual address to physical address
lx-vmallocinfo -- Show vmallocinfo
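For example, to see information about all tasks, run lx-ps. To see the details of a task_struct, cast the address it prints to a struct task_struct * pointer.
For example, to print the name of a process: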
(gdb) p ((struct task_struct*)0xffff000002d68ec0)->comm
$6 = "watchdogd\000\000\000\000\000\000"