[Debug] Heap corruption and Dangling pointer
===

###### tags: `debug` `heap corruption` `dangling pointer` `memory deallocation`

[ToC]

# Preface
Recently I was given the task of modularizing a number of protocols in the kernel. When I tried to unload my LKM, lacp.ko, the kernel crashed in several different ways. It was annoying enough that I am recording the problem here.

# Symptom
There are 3 cases.

## Case 1
The kernel crashed at kernel/cred.c when unloading lacp.ko.
```
~ # rmmod lacp
~ # ------------[ cut here ]------------
kernel BUG at kernel/cred.c:107!
Internal error: Oops - BUG: 0 [#1] SMP ARM
Modules linked in: usb_storage xxxx(PO) yyyy(PO) zzzz(PO) wwwww(PO) [last unloaded: lacp]
```

## Case 2
After lacp.ko was unloaded, the kernel crashed when the module was inserted again.
```shell
/modules # rmmod lacp.ko
rmmod: can't unload 'lacp': unknown symbol in module, or unknown parameter
/modules # insmod lacp.ko
Unable to handle kernel paging request at virtual address ff771200
pgd = cbca8000
[ff771200] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in: lacp(+) llll xxxx(PO) zzzz(PO) yyyy(PO) wwww(PO) [last unloaded: lacp]
```

## Case 3
The kernel crashed at kernel/rcutree.c when unloading lacp.ko.
```shell
/modules # rmmod lacp.ko
/modules # ------------[ cut here ]------------
WARNING: at kernel/rcutree.c:1620 rcu_process_callbacks+0x484/0x534()
Modules linked in: qqqq llll(PO) xxxxx(PO) zzzz(PO) yyyy(PO) [last wwww: lacp]
```

# Root Cause
Crashes like these are usually associated with memory management. Hence, we have to review the init and exit code, specifically the parts that dynamically allocate and free memory. After tracing the code further, I got a picture of the allocated memory. The diagram below shows how the blocks relate to each other.

![](https://i.imgur.com/7c1bPje.png)

Each rectangle represents a block of memory on the heap. An arrow pointing to a rectangle represents a pointer to that block. For instance, A is a pointer that points to X.

In lacp_mem_init, 6 blocks of memory are allocated: X, Y, Y1, Y2, Y3 and Z. Y1, Y2 and Y3 are allocated after Y. Z is allocated last, and Z->sub4 also points to X.

### Memory allocation
The memory allocation sequence is therefore:
1. X
2. Y
3. Y1
4. Y2
5. Y3
6. Z

### Memory deallocation
The blocks were freed in this order:
1. Y1
2. Y2
3. Y3
4. Y
5. X
6. X
7. Z

There are two problems here.
1. X is freed twice.
2. The deallocation sequence is wrong. It is not the reverse of the allocation sequence.

The double free corrupts the heap, and freeing X while Z->sub4 still points to it leaves a dangling pointer. Either problem results in undefined behavior, which is why the symptoms differ from one run to the next.

# Correction
The deallocation sequence should be Z->Y3->Y2->Y1->Y->X.
Besides, after freeing a block of memory, we have to set the pointer to NULL. For instance,
```c
#include <stdlib.h>

int *x = (int *)malloc(sizeof(int));
int y = 3;
if (x != NULL)
    *x = y;   /* store the value in the block; `x = y` would overwrite the pointer */
free(x);
x = NULL;     /* clear the pointer so it cannot dangle or be freed again */
```

# Summary
Due to the dynamic nature of allocating and deallocating memory, the heap is vulnerable to the following typical corruption problems:
- boundary overrun: a program writes beyond the malloc region.
- boundary underrun: a program writes in front of the malloc region.
- access to uninitialized memory: a program attempts to read memory that has not yet been initialized.
- access to freed memory: a program attempts to read or write to memory that has been freed.
- double frees: a program frees some structure that it had already freed. In such a case, a subsequent reference can pick up a meaningless pointer, causing a segmentation violation.
- erroneous frees: a program calls free() on addresses that were not returned by malloc, such as static, global, or automatic variables, or other invalid expressions. See the malloc(3f) man page for more information.

Therefore, we must be more cautious when we deallocate heap memory.
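
As a closing illustration of the correction described above, here is a minimal user-space sketch of an init/exit pair that allocates in the order X, Y, Y1, Y2, Y3, Z and frees in the reverse order, releasing X exactly once. The struct layout and the names `lacp_ctx`, `lacp_mem_init_sketch` and `lacp_mem_exit_sketch` are assumptions made up for this example; only the block labels and `Z->sub4` come from the diagram, and the real lacp.ko code may look quite different. It uses `calloc`/`free` so it can be compiled outside the kernel; in a module the equivalent calls would presumably be `kzalloc`/`kfree`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical layout: names mirror the diagram labels, not the real lacp.ko types. */
struct block_x { int data; };
struct block_y { int *y1, *y2, *y3; };      /* Y owns Y1, Y2 and Y3 (assumed) */
struct block_z { struct block_x *sub4; };   /* Z only borrows X, it does not own it */

struct lacp_ctx {
    struct block_x *x;
    struct block_y *y;
    struct block_z *z;
};

/* Free in reverse allocation order: Z, Y3, Y2, Y1, Y, X.
 * Every pointer is cleared after free(), so a second call is harmless. */
void lacp_mem_exit_sketch(struct lacp_ctx *ctx)
{
    assert(ctx != NULL);

    if (ctx->z) {
        ctx->z->sub4 = NULL;   /* drop the alias; X is freed once, via ctx->x */
        free(ctx->z);
        ctx->z = NULL;
    }
    if (ctx->y) {
        free(ctx->y->y3); ctx->y->y3 = NULL;
        free(ctx->y->y2); ctx->y->y2 = NULL;
        free(ctx->y->y1); ctx->y->y1 = NULL;
        free(ctx->y);
        ctx->y = NULL;
    }
    free(ctx->x);              /* free(NULL) is a no-op */
    ctx->x = NULL;
}

/* Allocate in the order X, Y, Y1, Y2, Y3, Z.
 * Returns 0 on success, -1 on failure (after undoing partial work). */
int lacp_mem_init_sketch(struct lacp_ctx *ctx)
{
    assert(ctx != NULL);
    ctx->x = NULL;
    ctx->y = NULL;
    ctx->z = NULL;

    ctx->x = calloc(1, sizeof(*ctx->x));
    if (!ctx->x)
        goto fail;
    ctx->y = calloc(1, sizeof(*ctx->y));
    if (!ctx->y)
        goto fail;
    ctx->y->y1 = calloc(1, sizeof(int));
    ctx->y->y2 = calloc(1, sizeof(int));
    ctx->y->y3 = calloc(1, sizeof(int));
    if (!ctx->y->y1 || !ctx->y->y2 || !ctx->y->y3)
        goto fail;
    ctx->z = calloc(1, sizeof(*ctx->z));
    if (!ctx->z)
        goto fail;
    ctx->z->sub4 = ctx->x;     /* second pointer to X, as in the diagram */
    return 0;

fail:
    lacp_mem_exit_sketch(ctx);
    return -1;
}
```

The design choice worth noting is that exactly one pointer (`ctx->x`) is treated as the owner of X, while `Z->sub4` is cleared rather than freed. Because the exit routine also resets every pointer to NULL, calling it twice does not free anything a second time.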