---
title: memory leak
---

**Memory leaks** fall into two broad categories:

## User space memory leak:
As the name suggests, this is a memory leak that occurs in user space.
It can be detected with **valgrind**.

Installation:
```shell=
sudo apt install valgrind  # Ubuntu, Debian, etc.
sudo yum install valgrind  # RHEL, CentOS, Fedora, etc.
```

To run valgrind, pass the executable as an argument (along with any parameters to the program).
```shell=
valgrind --leak-check=full \
         --show-leak-kinds=all \
         --track-origins=yes \
         --verbose \
         --log-file=valgrind-out.txt \
         ./executable exampleParam1
```
executable -> avoid passing a wrapper script such as exe.sh where possible; the leak may not be caught. valgrind does not follow child processes unless `--trace-children=yes` is given, so it ends up checking the shell rather than the program the script launches.

qemu example:
```shell=
valgrind --leak-check=full --show-leak-kinds=all --log-file=qemu.log \
    ./x86_64-softmmu/qemu-system-x86_64 \
    -drive if=none,id=drive2,cache=none,format=raw,file=/home/howard/Desktop/Ubuntu20G-1604-2.img \
    -device virtio-blk,drive=drive2,scsi=off \
    -m 512M -enable-kvm \
    -net tap,ifname=tap2 \
    -net nic,model=virtio,macaddr=ae:ae:00:00:00:26 \
    -vga std \
    -chardev socket,id=mon2,path=$HOME/vm2.monitor,server,nowait \
    -mon chardev=mon2,id=monitor,mode=readline
```

cuju example:
```shell=
valgrind --leak-check=full --show-leak-kinds=all --log-file=qemu.log \
    ./x86_64-softmmu/qemu-system-x86_64 \
    -drive if=none,id=drive0,cache=none,format=raw,file=/home/howard/Desktop/Ubuntu20G-1604-2.img \
    -device virtio-blk,drive=drive0 \
    -m 3G -enable-kvm \
    -net tap,ifname=tap0 \
    -net nic,model=virtio,vlan=0,macaddr=ae:ae:00:00:00:25 \
    -cpu host -vga std \
    -chardev socket,id=mon,path=/home/howard/vm1.monitor,server,nowait \
    -mon chardev=mon,id=monitor,mode=readline
```

valgrind reports lost memory in several categories:
- definitely lost: a real memory leak; no pointer to the block remains.
- indirectly lost: an indirect memory leak. The structure itself has leaked, so any members it allocated leak along with it. Fixing the first problem fixes these as well.
- possibly lost: a block was allocated and stored in a pointer ptr, but ptr was later changed to point into the middle of the block.
- still reachable: memory that was not freed when the program ended but is still pointed to, typically through global variables. In this case the operating system reclaims the memory once the program exits.

Ref:
http://blog.yslin.tw/2014/03/c-valgrind.html
http://wen00072.github.io/blog/2014/11/29/catching-leakage-use-valgrind-checking-c-memory-leak/
http://valgrind.org/docs/manual/faq.html#faq.deflost

## Kernel space memory leak:
As the name suggests, this is a memory leak that occurs in kernel space.
It can be detected with **kmemleak**.

Enable the following options in the kernel's .config file:
- `CONFIG_SLUB_DEBUG=y`
- `CONFIG_DEBUG_KMEMLEAK=y`
- `CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=n`
- `CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=40000`

The size of the kmemleak buffer is set through `CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE`.
The kmemleak-related code is in the file at the following path:
kernel-4.9/drivers/misc/mediatek/mem/mtk_memcfg.c

**-> Next, compile the kernel -> load that kernel -> reboot**

### Detection method & steps (assuming a leak exists):
1. First scan: `echo scan > /sys/kernel/debug/kmemleak` (starts the scan).
   Then `cat /sys/kernel/debug/kmemleak` prints many backtraces, but some of them are false positives (kmemleak is known to produce false reports).
2. Then run `echo clear > /sys/kernel/debug/kmemleak` to clear the log.
   Second scan: `echo scan > /sys/kernel/debug/kmemleak` (starts the scan).
   Wait a while for leaks to accumulate, then `cat /sys/kernel/debug/kmemleak`.
   Many of the false positives from the first scan are gone, and many repeated backtraces appear. Call such a backtrace A.
3. The signature of a real leak in kmemleak's output is that backtrace A shows up more and more often and keeps growing; that backtrace is the leak point.

#### The scanning algorithm steps:
1. mark all objects as white (remaining white objects will later be considered orphan)
2. scan the memory starting with the data section and stacks, checking the values against the addresses stored in the rbtree. If a pointer to a white object is found, the object is added to the gray list
3. scan the gray objects for matching addresses (some white objects can become gray and be added at the end of the gray list) until the gray set is finished
4. the remaining white objects are considered orphan and reported via /sys/kernel/debug/kmemleak

Ref:
https://www.itread01.com/content/1542343443.html
https://www.kernel.org/doc/html/latest/dev-tools/kmemleak.html

<font color="#f00">In the end, the problem turned out to be that the page obtained from gfn_to_page was never released, which caused the memory leak.
It must be paired with kvm_release_page_clean(page) to free it.</font>

```c=
/* Release a page previously obtained through a KVM lookup such as gfn_to_page. */
void kvm_release_page_clean(struct page *page)
{
	WARN_ON(is_error_page(page));

	kvm_release_pfn_clean(page_to_pfn(page));
}
EXPORT_SYMBOL_GPL(kvm_release_page_clean);

/* Drop the page reference, skipping error/no-slot and reserved pfns. */
void kvm_release_pfn_clean(pfn_t pfn)
{
	if (!is_error_noslot_pfn(pfn) && !kvm_is_reserved_pfn(pfn))
		put_page(pfn_to_page(pfn));
}
```
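
The fix above comes down to pairing every lookup that takes a page reference with a matching release. The following is a minimal user-space sketch of that pairing, not kernel code: `struct mock_page`, `mock_gfn_to_page`, `mock_release_page_clean` and the two `leaked_refs_*` helpers are hypothetical stand-ins invented for illustration, and the single static page is an assumption made to keep the example small.

```c
#include <assert.h>

/* Hypothetical stand-in for struct page: it only carries a reference count. */
struct mock_page {
    int refcount;
};

/* A single backing page that starts with one reference held by its owner. */
static struct mock_page backing_page = { .refcount = 1 };

/* Mock lookup: hands back the page with one extra reference taken. */
static struct mock_page *mock_gfn_to_page(void)
{
    backing_page.refcount++;
    return &backing_page;
}

/* Mock release: drops the reference taken by the lookup. */
static void mock_release_page_clean(struct mock_page *page)
{
    assert(page->refcount > 0);
    page->refcount--;
}

/* Leaky pattern: look up and use the page but never release it.
 * Returns how many references were left behind. */
int leaked_refs_without_release(int iterations)
{
    int before = backing_page.refcount;

    for (int i = 0; i < iterations; i++) {
        struct mock_page *page = mock_gfn_to_page();
        (void)page; /* ... use the page ... */
    }
    return backing_page.refcount - before;
}

/* Fixed pattern: every lookup is paired with a release.
 * Returns how many references were left behind (expected to be zero). */
int leaked_refs_with_release(int iterations)
{
    int before = backing_page.refcount;

    for (int i = 0; i < iterations; i++) {
        struct mock_page *page = mock_gfn_to_page();
        /* ... use the page ... */
        mock_release_page_clean(page);
    }
    return backing_page.refcount - before;
}
```

Calling `leaked_refs_without_release(n)` leaves n references behind, one per lookup, which is the kind of steadily growing count that repeated kmemleak scans make visible. `leaked_refs_with_release(n)` leaves none.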