Some Research about the Linux Heap

# Some Research about the Linux Heap ###### tags: `Linux` `security` `project`  ## References > [Heap Exploitation](https://guyinatuxedo.github.io/25-heap/index.html) [heap-exploitation](https://heap-exploitation.dhavalkapil.com/) [MallocInternals - glibc wiki](https://sourceware.org/glibc/wiki/MallocInternals) [Heap Exploit 學習筆記](https://medium.com/@b3rm1nG/heap-exploit-%E5%AD%B8%E7%BF%92%E7%AD%86%E8%A8%98-d724d0afa59b) [Bins and Chunks - heap-exploitation](https://heap-exploitation.dhavalkapil.com/diving_into_glibc_heap/bins_chunks#unsorted-bin) [House of Force - heap-exploitation](https://heap-exploitation.dhavalkapil.com/attacks/house_of_force) [TCACHE exploitation](https://medium.com/@b3rm1nG/tcache-exploitation-871044f8b210 ) [how2heap](https://github.com/shellphish/how2heap) [opp556687 - Linux heap exploit](https://hackmd.io/@opp556687/BkAm4kiyD) ## Terminologies in the linux heap ### arena 每一個 heap 都屬於一個 arena，一個 arena 可能包含多個 references to heaps 以及 heaps 裡的 linked lists of chunks，也稱為 free lists。被指派至某個 arena 的 thread 就會從此 arena 的 free lists 裡動態配置記憶體。 main arena 使用程式一開始就有的 heap (right after .bss)，其餘的則透過 `mmap()` 來分配記憶體給他們的 heap(s)。 ### heap 動態分配的 memory (eg: `malloc()` / `new`)，為一連續的虛擬記憶體區塊，被細分為許多的 chunks ### chunk 一塊可以被 allocated, freed 或是與鄰近 chunks 結合起來的記憶體區塊，minimum size = `4*sizeof(void*)` (32 bytes in x86_64) ```c struct malloc_chunk { INTERNAL_SIZE_T mchunk_prev_size; /* Size of previous chunk, if it is free. */ INTERNAL_SIZE_T mchunk_size; /* Size in bytes, including overhead. */ struct malloc_chunk* fd; /* double links -- used only if this chunk is free. */ struct malloc_chunk* bk; /* Only used for large blocks: pointer to next larger size. */ struct malloc_chunk* fd_nextsize; /* double links -- used only if this chunk is free. */ struct malloc_chunk* bk_nextsize; }; typedef struct malloc_chunk* mchunkptr; ``` Free chunks 會以 `fd`, `bk` 串成一個 doubly linked list，allocated chunks 則不會用到這兩個指標，`malloc()` 會回傳原本 `fd` 的位址 Chunk size 恆為 8 的倍數，因此原本 size field 的 3 LSBs 恆為 0不需要儲存，可以當成 flags 來用，其中最後一個位元表示 previous chunk is in use (by the application) x86_64 系統而言，chunk header 占了 0x10 (16) bytes，因此 data (`malloc()` return 的位址) 是從 `(void *) mchunkptr + 0x10` 的地方開始儲存 ![image alt](https://miro.medium.com/max/640/1*CjulobLkVT16622tKU5bhA.webp) ### bin free (available) chunks 會根據大小以不同的串列連結起來，以便 library 可以快速的找到適合的 chunk 以滿足需求，bins 指的就是這些串列 - fast bins: 較小 (64-bit 系統而言貌似是指 size <= 160 bytes, **MAX_FAST_SIZE: 160** in 64-bit system) 的 chunks 會被存在 fast bins 裡面，也就是一個元素為 singly-linked list 的陣列，每個 list/bin 負責維護固定大小的 chunks。被加入 fast bins 的 chunks 不會被與鄰近的 chunk 合併 - unsorted bin: heap 裡僅有一個 unsorted bin，原先屬於 small bins, large bins 的 chunks 被釋放後會被放至此處。之後在 `malloc` 中這些 chunks 才會被排序並被移動至 small, large bins - small bins: 類似 fast bins 但 small bins chunks 在被加入到 small bins 時會先與鄰近 chunks 結合以合併 (coalesce) 成更大的 chunks，small bins 為雙向鏈結 - large bins: 當一個 chunk 所在的 bin (就是 large bin) 包含了不同大小的 chunks 時 (**MIN_LARGE_SIZE: 1024** in 64-bit system)，此 chunk 就被稱為 "large" chunk。與 small bins 一樣，chunks 在被加入到 large bins 時會先與鄰近 chunks 結合以合併 (coalesce) 成更大的 chunks，large bins 為雙向鏈結 - top chunk: 每個 arena 都會紀錄一個 "top" chunk，通常為最大的 available chunk。此為**鄰近 unmapped memory 的 chunk，也就表示其位於 arena 當中最上方 (記憶體位址最大)** 且 top chunk 不屬於任何 bin。**House of force** 攻擊就是透過修改 top chunk 的內容以做到隨機任意記憶體讀寫 ([opp556687 - Linux heap exploit#House-of-Force](https://hackmd.io/@opp556687/BkAm4kiyD#House-of-Force)) ### Thread Local Cache (**tcache**) `malloc` 能夠知道有多個 threads 的存在，但也就僅此而已。`malloc` 中並沒有針對多執行緒的優化，它認定 kernel 會以有效率的方法來處理這些議題每個執行緒會有一個 thread-local variable ([> Thread-Local Storage](https://web.mit.edu/rhel-doc/3/rhel-gcc-en-3/thread-local.html)) 來表示它最後使用的 arena，若該 arena 正在被使用的話則當另一執行緒需要使用該 arena 時 (為避免 [race condition](https://en.wikipedia.org/wiki/Race_condition)) 此執行緒會被 blocked 直到該 arena 變成 available。每個執行緒還會有一個稱為 **tcache** 的快取，其中包含了一些不需要 lock 任何 arena 就能直接存取的 chunks，他們以一個元素為 singly-linked list 的陣列來儲存 (像是 fastbins)，每個元素連結了單一大小的 chunks，也就是陣列的 index 表示 chunks size。`free()` 的時候會先看看 tcache 有無空間存放目標 chunk，沒有的話才會放到其他 bins ```c /* We overlay this structure on the user-data portion of a chunk when the chunk is stored in the per-thread cache. */ typedef struct tcache_entry { struct tcache_entry *next; } tcache_entry; /* There is one of these for each thread, which contains the per-thread cache (hence "tcache_perthread_struct"). Keeping overall size low is mildly important. Note that COUNTS and ENTRIES are redundant (we could have just counted the linked list each time), this is for performance reasons. */ typedef struct tcache_perthread_struct { char counts[TCACHE_MAX_BINS]; tcache_entry *entries[TCACHE_MAX_BINS]; } tcache_perthread_struct; ``` - `TCACHE_MAX_BINS` 應該是 64 - `tcache_entry` 為單向鏈結串列的節點，`next` 指向下一塊 chunk 的 payload - `tcache_perthread_struct` 用來維護 thread local cache - `counts` 表示 bin 裡面的 chunks counts，同個 bin 裡面最多只能放 7 塊，超過的話會被放到 fastbins, unsorted bin 裡面 - `entries` 相當於 tcache 裡面的 bins，也就是共有 64 個 bin --- > 底下演算法主要來自 [MallocInternals - glibc wiki](https://sourceware.org/glibc/wiki/MallocInternals) ## Malloc Algorithm (abstract explanation) - 如果在 tcache 中可以找到合適的 chunk 就將其回傳 - 如果要求的空間夠大則使用 `mmap()` 以直接向 OS 要求記憶體空間。此空間大小的門檻是動態決定的，且這樣分配的空間數量是有限制的 - 如果與要求空間大小適合的 fastbin 裡找的到 chunk 就使用它。若有其他的 availbable chunks 就將他們填到 tcache 裡面 - 如果與要求空間大小適合的 smallbin 裡找的到 chunk 就使用它。若有其他的 availbable chunks 就將他們填到 tcache 裡面 - 如果要求的空間夠大 (>= 1024 in 64-bit system) 則將 fastbins 的全部 chunks 移至 unsorted bin 並進行合併 (coalesce) - 開始將 chunks 由 unsorted bin 移到 small / large bins 並進行合併 (coalesce)，在過程中如果有發現大小合適的 chunk 就使用它 - 接著搜尋適合的 large bin，沒發現夠大的 chunk 就著搜尋更大的 large bins 直至發現一塊夠大的 chunk - 如果 fastbins 裡面還有 chunks (在要求的空間較小 (<= 1024 in 64-bit system) 時會發生) 就進行合併 (consolidate) 並重複前兩步 - 分離一部分的 top chunk (?) > 不確定這邊是甚麼意思，**原文 `Split off part of the "top" chunk, possibly enlarging "top" beforehand.`** > > [Core Functions - heap-exploitation](https://heap-exploitation.dhavalkapil.com/diving_into_glibc_heap/core_functions#void-_int_malloc-mstate-av-size_t-bytes) 根據上面的文章，這邊應該是指如果到最後還沒有回傳適合的 chunk，會在最後把 top chunk 分為兩塊，remainder chunk 成為新的 top chunk 而另一塊將被回傳給使用者 ## Free Algorithm (abstract explanation) 實際上的釋放 (freeing) 一塊記憶體並不是將其還給 OS，`free()` 事實上是將 a chunk of memory 標記為 "free to be reused"，在 OS 看來此空間依舊屬於該 process。如果 "top" chunk 足夠大的話，其一部分的記憶體區塊可能會被 unmapped 並還給 OS。 - 如果 tcache 有足夠空間便將 chunk (`free()` 的目標) 存在 tcache 並 return - 如果 chunk 夠小的話則將其放在適合的 fastbins 裡面 (該 chunk 不會被合併 (coalesce)) - 如果該 chunk 之前是透過 `mmap()` 配置的話即將其 `munmap()` - 確認該 chunk 是否與另一 available chunk 相鄰，是的話就將它們合併 (coalesce) - 如果合併後的 chunk 不是 top chunk 的話將其放到 unsorted bin - 如果該 chunk 夠大就將任何的 fastbins 合併 (coalesce) (?)，再確認 top chunk 是否足夠大，能夠將其一部分的記憶體區塊 unmapped 並還給 OS > 這邊也不確定是甚麼意思，**原文 `coalesce any fastbins`** ## Linux Heap Exploitation ### double free - 即對同一塊 chunk 進行兩次 free 的呼叫，會使 bin 裡面出現兩塊同樣的 chunk，一般而言 glibc 會檢查但僅會檢查上一次 free 的對象與這次是否相同，而不會對整個 bin 甚至是所有 free lists 的 chunks 都進行檢查 - 因此可以在 double free 中間穿插對另一塊 chunk 的 `free()` 來通過上述檢查，使同一個 bin 中有兩塊同樣的 free chunks ### House of Force (top chunk related) 1. 根據上方 malloc algorithm 可知如果到最後還沒有回傳適合的 chunk，會在最後把 top chunk 分為兩塊，remainder chunk 成為新的 top chunk 而另一塊將被回傳給使用者 2. 在此前提下若能透過 overflow 來控制 top chunk size 為一個非常大的數字 (64 個 1, 0xff...f)，就能不透過 `mmap()` 來分配空間 3. 透過 malloc 分配一個接近任意大小的 chunk (`malloc(evil_size)`)，能使我們得到一塊緊鄰目標 (接近任意位址) 的空間把這個大小定義為 evil_size，新的 top chunk 位址、舊的 top chunk 位址分別定義為 new_top、old_top，目標位址定義為 dest，可以由以下式子得出 evil_size 的值 (**2*sizeof(long) == 0x10** 為 chunk header size) 1. new_top = old_top + evil_size + 2*sizeof(long) 2. evil_size + 2*sizeof(long) = new_top - old_top 3. evil_size = new_top - old_top - 2*sizeof(long) 4. evil_size = dest - 2\*sizeof(long) - old_top - 2*sizeof(long) 5. evil_size = dest - old_top - 4*sizeof(long) Reference: [shellphish/how2heap/house_of_force](https://github.com/shellphish/how2heap/blob/master/glibc_2.27/house_of_force.c) 4. 最後我們就能讓 `malloc()` 回傳一個接近任意的位址，也就能達到接近任意位址的讀寫 - CTF: HITCON-Training-Lab11 (解法1 bamboobox1.py) - [Write up](https://github.com/sShaAanGg/ADC-projects/tree/main/project2/house-of-force) ### fastbin attack 1. 在 tcache 已滿的情況下，最後一個 `free()` 的目標 chunk 會被放到對應大小的 fastbin 的最前面 2. 如果連續呼叫 `free()` 並傳入同一塊 chunk 的話，在第二次的時候由於同樣的 chunk 還是該 fastbin 的第一個節點因此會偵測到有 double free 的出現，程式會被終止執行 3. 假如我們先 malloc 3 塊大小為 8 的 chunk，就可以利用上述兩點，先 free chunk 1 接著 free chunk 2 再 free chunk 1，這樣就會使存著 chunk 大小為 8 的 fastbin 前三個 available chunks 變成 chunk 1, chunk 2, chunk 1 4. 現在呼叫 `malloc(8)` 兩次會產生一個奇特的狀況，即我們擁有 chunk1 的讀寫存取權限而 chunk1 卻也同時位於對應的 fastbin 的 head，此時只要修改 chunk1 的前 8 bytes (free chunk 中的 `fd` 欄位) 為我們想要偽造的 chunk address 再呼叫 malloc(8) 即可拿到一塊偽造的 chunk，但為通過對 chunk size 的檢查，需要先把偽造的 chunk 的 size 也設定為 8