cs:app
下圖為常見的虛擬記憶體架構圖,其中有些資料結構的大小(如陣列)是在程式運行期間動態決定的,所以必需有一個可以動態成長的區域,稱作 heap,對於每一個進程(process),kernel 會維護一個指針 brk
用來指向 heap 的頂部
本次實驗要求學生實作動態記憶體分配器,包含 malloc
、free
及 realloc
,這類分配器我們稱為顯式分配器(explicit allocator
),程式必需顯式地呼叫 free
來釋放記憶體空間。一個顯式分配器必需符合以下約束條件
malloc
在 32 位元系統為 8 Byte 對齊,64 位元則為 16 Byte 對齊而分配器的撰寫必需在這些限制條件下,盡可能地找到吞吐率(Throughput)及空間利用率(Utilization)之間的平衡點
在 CMU Lab Writeup [2] 中可找到本次實驗的評分標準,其中 libc
的 malloc
的吞吐率
要處理好吞吐率與空間利用率的平衡,分配器的實作必需考慮以下問題
動態記憶體管理器需要一些資料結構,來區別 block 的邊界、block 是否已經分配等,課本給出一個基本的 block 格式,注意 malloc
回傳的指針指向 payload (而非 header) 區域的起始地址
Implicit free list 將 heap 視為一連續的已分配以及 free block 的序列。要留意的是因為 header & footer 各需要 4 Byte 的空間,代表最小的 block 大小為 8 Byte,如果程式僅要求 1 Byte 的空間,對齊要求為 8 Byte,我們還是需要配置 8(header + footer) + 8(1 + 7) = 16 Byte。
為了避免合併時複雜的邊界處理,我們在 heap 的起始端放置 prologue block,而尾端放置 epilogue block,分配器維護一個私有的全局變數 heap_listp
,指向 prologue block 第二個 block 的起始位置,就像一般的 block pointer 一樣
課本提供了 Impilict free list 的基本架構,首先我們定義一些基本的常數及 Macro
/* single word (4) or double word (8) alignment */
#define ALIGNMENT 8
/* rounds up to the nearest multiple of ALIGNMENT */
#define ALIGN(size) (((size) + (ALIGNMENT-1)) & ~0x7)
#define SIZE_T_SIZE (ALIGN(sizeof(size_t)))
/* Constants and macros */
#define WSIZE 4 /* Word & header/footer size in Byte */
#define DSIZE 8 /* Double words size in Byte */
#define CHUNKSIZE (1<<12) /* Extend heap size (4K Byte) when failed to find a proper block */
#define MAX(x, y) ((x) > (y)? (x) : (y))
/* PACK size & allocated bit into a word */
#define PACK(size, alloc) ((size) | (alloc))
/* read & write a word at address p */
#define GET(p) (*(unsigned int*)(p))
#define PUT(p, val) (*(unsigned int*)(p) = (val))
/* get size and allocated bit of a block */
#define GET_SIZE(p) (GET(p) & ~0x7)
#define GET_ALLOC(p) (GET(p) & 0x1)
/* compute address of header and footer from a given pointer of a block */
#define HDRP(bp) ((char *)(bp) - WSIZE)
#define FTRP(bp) ((char *)(bp) + GET_SIZE(HDRP(bp)) - DSIZE)
/* compute address of next and previous block from a given pointer of a block */
#define NEXT_BLKP(bp) ((char *)(bp) + GET_SIZE(((char *)(bp) - WSIZE)))
#define PREV_BLKP(bp) ((char *)(bp) - GET_SIZE(((char *)(bp)- DSIZE)))
在使用 malloc
及 free
之前,我們必需初始化 heap
heap_listp
指向 prologue block 的第二個 blockCHUNKSIZE
的 free block/*
* mm_init - initialize the malloc package.
*/
int mm_init(void)
{
/* Create initial empty heap */
if (heap_listp = mem_sbrk(4 * WSIZE) == (void *) - 1)
return -1;
PUT(heap_listp, 0);
PUT(heap_listp + WSIZE, PACK(DSIZE, 1));
PUT(heap_listp + 2 * WSIZE, PACK(DSIZE, 1));
PUT(heap_listp + 3 * WSIZE, PACK(0, 1));
heap_listp += DSIZE;
/* Extend heap by CHUNKSIZE */
if (extend_heap(CHUNKSIZE/WSIZE) == NULL)
return -1;
return 0;
}
mem_sbrk
為 CMU 提供的向系統要求 heap 空間的介面,撰寫於 memlib.c
中。memlib.c
維護一個私有的全局變數 mem_brk
紀錄 heap 頂端的地址。mem_sbrk
不允許 heap 縮減及超過允許的地址範圍。mem_brk
的回傳值為舊的 mem_brk
指針
/*
* mem_sbrk - simple model of the sbrk function. Extends the heap
* by incr bytes and returns the start address of the new area. In
* this model, the heap cannot be shrunk.
*/
void *mem_sbrk(int incr)
{
char *old_brk = mem_brk;
if ( (incr < 0) || ((mem_brk + incr) > mem_max_addr)) {
errno = ENOMEM;
fprintf(stderr, "ERROR: mem_sbrk failed. Ran out of memory...\n");
return (void *)-1;
}
mem_brk += incr;
return (void *)old_brk;
}
extend_heap
跟系統要求 words * 4/(words+1) * 4
Byte 的空間,有兩種情境會被呼叫
我們以初始化 heap 來簡單說明 extend_heap
函式在做的事情
初始化 empty heap 時 heap block 分配的狀態
呼叫 mem_sbrk
後 heap 的狀態
建立新 free block 的 header 及 footer,mem_sbrk
回傳的指針指向 old_mem_brk
,其實就是新 free block 的 block pointer 位置,因此我們使用 Macro HDRP
及 FTRP
取得 header & footer 的地址,並設定 block size & allocated bit
更新 epilogue header,以 block_ptr
為輸入呼叫 NEXT_BLPR
,取得的地址為mem_brk
,再呼叫 HDRP
即可取得 epilogue header 的地址
若因為 heap 空間不夠而呼叫 extend_heap
,此時 heap 的尾端可能是 free block,呼叫 coalesce(bp)
確認並合併(若需要) free block
static void *extend_heap(size_t words) {
char *bp;
int size;
size = (words % 2)? (words + 1) * WSIZE: words * WSIZE;
if ((long)(bp = mem_sbrk(size)) == -1)
return NULL;
PUT(HDRP(bp), PACK(size, 0)); /* Header */
PUT(FTRP(bp), PACK(size, 0)); /* Footer */
PUT(HDRP(NEXT_BLKP(bp)), PACK(0, 1)); /* New epilogue header */
return coalesce(bp);
}
合併需要考慮 free block 前後 block 的狀態,共有四種可能性,我改寫了課本的程式碼,讓程式碼較為精簡
除 case 1 之外,都需利用 block pointer 重新設定 header & footer,另外因為 FTRP
的指標是透過 header 的 size 計算,所以必需先完成 header 再進行 footer 的設定
static void *coalesce(void *bp) {
size_t prev_alloc, next_alloc;
size_t size;
prev_alloc = GET_ALLOC(FTRP(PREV_BLKP(bp)));
next_alloc = GET_ALLOC(HDRP(NEXT_BLKP(bp)));
size = GET_SIZE(HDRP(bp));
if (prev_alloc && next_alloc) { /* case 1 */
return bp;
}
else if (prev_alloc && (!next_alloc)) { /* case 2 */
size += GET_SIZE(HDRP(NEXT_BLKP(bp)));
}
else if ((!prev_alloc) && next_alloc) { /* case 3 */
size += GET_SIZE(FTRP(PREV_BLKP(bp)));
bp = PREV_BLKP(bp);
}
else { /* case 4 */
size += GET_SIZE(HDRP(NEXT_BLKP(bp)));
size += GET_SIZE(FTRP(PREV_BLKP(bp)));
bp = PREV_BLKP(bp);
}
PUT(HDRP(bp), PACK(size, 0));
PUT(FTRP(bp), PACK(size, 0));
return bp;
}
釋放的動作也需要考慮合併
/*
* mm_free - Freeing a block does nothing.
*/
void mm_free(void *bp)
{
size_t size = GET_SIZE(HDRP(bp));
PUT(HDRP(bp), PACK(size, 0));
PUT(FTPR(bp), PACK(size, 0));
coalesce(bp);
}
分配 block 需要考慮
NULL
而我們必需自行實作
find_fit
:找尋合適的 free block 放置新分配的 blockplace
:放置新分配的 block/*
* mm_malloc - Allocate a block by incrementing the brk pointer.
* Always allocate a block whose size is a multiple of the alignment.
*/
void *mm_malloc(size_t size)
{
size_t asize;
char* bp;
/* ignore spurious request */
if (size == 0)
return NULL;
/* adjust size for alignment */
if (size <= DSIZE)
asize = 2 * DSIZE;
else
asize = (((size + DSIZE) + (DSIZE - 1))/ DSIZE) * DSIZE; /* csapp p73. (x + (1<<k) - 1) >> k */
/* search free list for a fit */
if ((bp = find_fit(asize)) != NULL) {
place(bp, asize);
return bp;
}
/* expand heap if no fit found */
size_t extend_size;
extend_size = MAX(asize, CHUNKSIZE);
if ((bp = extend_heap(extend_size/WSIZE)) == NULL)
return NULL;
place(bp, asize);
return bp;
}
課本介紹了三種常見的分配 block 搜索策略,本文採用 first fit 作為分配的策略
static void *find_fit(size_t asize) {
size_t blk_size;
void *bp = heap_listp;
for (; (blk_size = GET_SIZE(HDRP(bp))) != 0; bp = NEXT_BLKP(bp)) {
if ((GET_ALLOC(HDRP(bp)) == 0) && (blk_size >= asize))
return bp;
}
return NULL;
}
找到合適的 free block 之後,必需設置分配的 block
static void place(void *bp, size_t asize) {
size_t blk_size = GET_SIZE(HDRP(bp));
size_t split_size = blk_size - asize;
if (split_size < 16) {
PUT(HDRP(bp), PACK(blk_size, 1));
PUT(FTRP(bp), PACK(blk_size, 1));
}
else {
PUT(HDRP(bp), PACK(asize, 1));
PUT(FTRP(bp), PACK(asize, 1));
bp = NEXT_BLKP(bp);
PUT(HDRP(bp), PACK(split_size, 0));
PUT(FTRP(bp), PACK(split_size, 0));
}
}
mm_realloc
需要配合資料結構一併更新,否則最後兩個測試不會通過, mm_realloc
要求如下
The call to mm_realloc changes the size of the memory block pointed to by ptr (the old block) to size bytes and returns the address of the new block. Notice that the address of the new block might be the same as the old block, or it might be different, depending on your implementation, the amount of internal fragmentation in the old block, and the size of the realloc request.
The contents of the new block are the same as those of the old ptr block, up to the minimum of the old and new sizes. Everything else is uninitialized. For example, if the old block is 8 bytes and the new block is 12 bytes, then the first 8 bytes of the new block are identical to the first 8 bytes of the old block and the last 4 bytes are uninitialized. Similarly, if the old block is 8 bytes and the new block is 4 bytes, then the contents of the new block are identical to the first 4 bytes of the old block.
/*
* mm_realloc - Implemented simply in terms of mm_malloc and mm_free
*/
void *mm_realloc(void *ptr, size_t size)
{
void *oldptr = ptr;
void *newptr;
size_t copySize;
newptr = mm_malloc(size);
if (newptr == NULL)
return NULL;
size = GET_SIZE(HDRP(oldptr));
copySize = GET_SIZE(HDRP(newptr));
if (size < copySize)
copySize = size;
memcpy(newptr, oldptr, copySize-WSIZE);
mm_free(oldptr);
return newptr;
}
./mdriver -a -g -v -t tracefiles/
Results for mm malloc:
trace valid util ops secs Kops
0 yes 99% 5694 0.009243 616
1 yes 99% 5848 0.007030 832
2 yes 99% 6648 0.011351 586
3 yes 100% 5380 0.008369 643
4 yes 66% 14400 0.000088164384
5 yes 92% 4800 0.007295 658
6 yes 92% 4800 0.006617 725
7 yes 55% 12000 0.156297 77
8 yes 51% 24000 0.294409 82
9 yes 27% 14401 0.122143 118
10 yes 34% 14401 0.002531 5689
Total 74% 112372 0.625372 180
Perf index = 44 (util) + 12 (thru) = 56/100
correct:11
perfidx:56
另一種紀錄 free block 的資料結構為 explicit free list,其透過 double-linked list 將所有 free block 串連起來,由於 free block 的 payload 區域是沒有被使用的,因此我們可以把 linked list 指向前後 free block 的指針放在 payload 區
此時 free block list 中兩個相鄰的 block 其記憶體地址不一定是相鄰的
邏輯上
物理上
我們先定義一些操作 double-linked list 的 Macro
NEXT_PTR(bp)/PREV_PTR(bp)
:取得 free block 放置 next / prev
指針的指針GET_NEXT(bp)/GET_PREV(bp)
:取得指向 next / prev
free block 的指針PUT_PTR(p, ptr)
:將 ptr
放在 NEXT_PTR(bp)/PREV_PTR(bp)
指向的記憶體空間/* compute address of next and previous free block from a given pointer of a free block */
#define NEXT_PTR(bp) ((char *)(bp) + WSIZE)
#define PREV_PTR(bp) ((char *)(bp))
#define GET_NEXT(bp) (*(char **)(NEXT_PTR(bp)))
#define GET_PREV(bp) (*(char **)(bp))
#define PUT_PTR(p, ptr) (*(unsigned int *)(p) = (unsigned int)(ptr))
在初始化 heap 時需要同時初始化 explicit free list,extend_heap
要需初始化 next
& prev
指針為 NULL
/*
* mm_init - initialize the malloc package.
*/
int mm_init(void)
{
/* Create initial empty heap */
if ((heap_listp = mem_sbrk(4 * WSIZE)) == (void *) - 1)
return -1;
PUT(heap_listp, 0);
PUT(heap_listp + WSIZE, PACK(DSIZE, 1));
PUT(heap_listp + 2 * WSIZE, PACK(DSIZE, 1));
PUT(heap_listp + 3 * WSIZE, PACK(0, 1));
heap_listp += DSIZE;
/* Extend heap by CHUNKSIZE */
char *bp;
exp_listp = NULL;
if ((bp = extend_heap(CHUNKSIZE/WSIZE)) == NULL)
return -1;
/* Set Null pointers*/
PUT_PTR(PREV_PTR(bp), NULL);
PUT_PTR(NEXT_PTR(bp), NULL);
return 0;
}
static void *extend_heap(size_t words) {
char *bp;
size_t size;
size = (words % 2)? (words + 1) * WSIZE: words * WSIZE;
if ((long)(bp = mem_sbrk(size)) == -1)
return NULL;
PUT(HDRP(bp), PACK(size, 0)); /* Header */
PUT(FTRP(bp), PACK(size, 0)); /* Footer */
PUT(HDRP(NEXT_BLKP(bp)), PACK(0, 1)); /* New epilogue header */
PUT_PTR(PREV_PTR(bp), NULL);
PUT_PTR(NEXT_PTR(bp), NULL);
return coalesce(bp);
}
mm_malloc
時是否切割出新的 free block 會影響 free list 的維護
若切割
若不切割
static void place(void *bp, size_t asize) {
size_t blk_size = GET_SIZE(HDRP(bp));
size_t split_size = blk_size - asize;
void *prev_blk = GET_PREV(bp);
void *next_blk = GET_NEXT(bp);
if (split_size < 16) {
PUT(HDRP(bp), PACK(blk_size, 1));
PUT(FTRP(bp), PACK(blk_size, 1));
remove_free_blk(bp, prev_blk, next_blk);
}
else {
PUT(HDRP(bp), PACK(asize, 1));
PUT(FTRP(bp), PACK(asize, 1));
// if (exp_listp == bp)
// exp_listp = NEXT_BLKP(bp);
bp = NEXT_BLKP(bp);
PUT(HDRP(bp), PACK(split_size, 0));
PUT(FTRP(bp), PACK(split_size, 0));
insert_free_blk(bp);
// if (prev_blk)
// PUT_PTR(NEXT_PTR(prev_blk), bp);
// if (next_blk)
// PUT_PTR(PREV_PTR(next_blk), bp);
// PUT_PTR(PREV_PTR(bp), prev_blk);
// PUT_PTR(NEXT_PTR(bp), next_blk);
}
}
static void remove_free_blk(void *bp) {
void *prev_blk = GET_PREV(bp);
void *next_blk = GET_NEXT(bp);
if (!prev_blk && !next_blk) /* empty linked-list */
exp_listp = NULL;
else if (!prev_blk && next_blk) /* first node */
{
PUT_PTR(PREV_PTR(next_blk), NULL);
exp_listp = next_blk;
}
else if (prev_blk && !next_blk) /* last node */
{
PUT_PTR(NEXT_PTR(prev_blk), NULL);
} else
{
PUT_PTR(PREV_PTR(next_blk), prev_blk);
PUT_PTR(NEXT_PTR(prev_blk), next_blk);
}
}
static void insert_free_blk(void *bp) {
char *next_blk = exp_listp;
PUT_PTR(PREV_PTR(bp), NULL);
PUT_PTR(NEXT_PTR(bp), next_blk);
if (next_blk)
PUT_PTR(PREV_PTR(next_blk), bp);
exp_listp = bp;
}
block 的釋放要額外考慮被釋放的 free block 該如何插入 free list,常見有兩種作法,本文採用 LIFO
合併同樣分為 4 個 case
static void *coalesce(void *bp) {
size_t prev_alloc, next_alloc;
size_t size;
void *next_blk = NEXT_BLKP(bp);
void *prev_blk = PREV_BLKP(bp);
prev_alloc = GET_ALLOC(FTRP(PREV_BLKP(bp)));
next_alloc = GET_ALLOC(HDRP(NEXT_BLKP(bp)));
size = GET_SIZE(HDRP(bp));
if (prev_alloc && next_alloc) { /* case 1 */
insert_free_blk(bp);
return bp;
}
else if (prev_alloc && (!next_alloc)) { /* case 2 */
size += GET_SIZE(HDRP(NEXT_BLKP(bp)));
remove_free_blk(next_blk);
}
else if ((!prev_alloc) && next_alloc) { /* case 3 */
size += GET_SIZE(FTRP(PREV_BLKP(bp)));
remove_free_blk(prev_blk);
bp = prev_blk;
}
else { /* case 4 */
size += GET_SIZE(HDRP(NEXT_BLKP(bp)));
size += GET_SIZE(FTRP(PREV_BLKP(bp)));
remove_free_blk(next_blk);
remove_free_blk(prev_blk);
bp = prev_blk;
}
PUT(HDRP(bp), PACK(size, 0));
PUT(FTRP(bp), PACK(size, 0));
insert_free_blk(bp);
return bp;
}
find_fit
配合 explicit free list 做相應的修改
static void *find_fit(size_t asize) {
void *bp = exp_listp;
for (; bp != NULL; bp = GET_NEXT(bp)) {
if (GET_SIZE(HDRP(bp)) >= asize)
return bp;
}
return NULL;
}
./mdriver -a -g -v -t tracefiles/
Results for mm malloc:
trace valid util ops secs Kops
0 yes 89% 5694 0.000408 13946
1 yes 92% 5848 0.000197 29685
2 yes 94% 6648 0.000452 14705
3 yes 96% 5380 0.000402 13393
4 yes 66% 14400 0.000227 63436
5 yes 88% 4800 0.000519 9247
6 yes 85% 4800 0.000580 8276
7 yes 55% 12000 0.004255 2820
8 yes 51% 24000 0.003309 7253
9 yes 26% 14401 0.130565 110
10 yes 34% 14401 0.002683 5367
Total 71% 112372 0.143597 783
Perf index = 42 (util) + 40 (thru) = 82/100
correct:11
perfidx:82
Segregated fit free list 為 explicit free list 的陣列,每一個 list 對應不同大小的 free block 如下示意
實作邏輯與 explicit free list 雷同,程式碼改動如下:
mm_init
需初始化整個 free list 陣列
static void *seg_listp[13];
...
/*
* mm_init - initialize the malloc package.
*/
int mm_init(void)
{
/* Create initial empty heap */
if ((heap_listp = mem_sbrk(4 * WSIZE)) == (void *) - 1)
return -1;
PUT(heap_listp, 0);
PUT(heap_listp + WSIZE, PACK(DSIZE, 1));
PUT(heap_listp + 2 * WSIZE, PACK(DSIZE, 1));
PUT(heap_listp + 3 * WSIZE, PACK(0, 1));
heap_listp += DSIZE;
/* Extend heap by CHUNKSIZE */
char *bp;
for (int i = 0; i < 13; i++)
seg_listp[i] = NULL;
if ((bp = extend_heap(CHUNKSIZE/WSIZE)) == NULL)
return -1;
/* Set Null pointers*/
PUT_PTR(PREV_PTR(bp), NULL);
PUT_PTR(NEXT_PTR(bp), NULL);
return 0;
}
find_fit
從符合大小的 free list 開始迭代,若找不到合適的 free block 則搜索下一個更大的 free list,若整個 free list 中沒有合適的 block 就回傳NULL
static void *find_fit(size_t asize) {
void *bp;
int idx = get_list_idx(asize);
for (; idx < 13; idx++) {
bp = seg_listp[idx];
for (; bp != NULL; bp = GET_NEXT(bp)) {
if (GET_SIZE(HDRP(bp)) >= asize)
return bp;
}
}
return NULL;
}
remove_free_blk
& insert_free_blk
必需依照 block 大小從陣列中取出對應的 free list。insert_free_blk
將 block 由小至大排列,讓大的 block 盡可能擺在較後面的位置,提昇空間利用率
static void remove_free_blk(void *bp) {
void *prev_blk = GET_PREV(bp);
void *next_blk = GET_NEXT(bp);
size_t size = GET_SIZE(HDRP(bp));
int idx = get_list_idx(size);
if (!prev_blk && !next_blk) /* empty linked-list */
seg_listp[idx] = NULL;
else if (!prev_blk && next_blk) /* first node */
{
PUT_PTR(PREV_PTR(next_blk), NULL);
seg_listp[idx] = next_blk;
}
else if (prev_blk && !next_blk) /* last node */
{
PUT_PTR(NEXT_PTR(prev_blk), NULL);
} else
{
PUT_PTR(PREV_PTR(next_blk), prev_blk);
PUT_PTR(NEXT_PTR(prev_blk), next_blk);
}
}
/* insert free block into segragated free list in order of block size */
static void insert_free_blk(void *bp) {
size_t size = GET_SIZE(HDRP(bp));
int idx = get_list_idx(size);
void *root = seg_listp[idx];
void *prev = NULL;
void *next = root;
while (next) {
if (GET_SIZE(HDRP(bp)) < GET_SIZE(HDRP(next))) break;
prev = next;
next = GET_NEXT(next);
}
if (next == root)
{
PUT_PTR(PREV_PTR(bp), NULL);
PUT_PTR(NEXT_PTR(bp), next);
if (next)
PUT_PTR(PREV_PTR(next), bp);
seg_listp[idx] = bp;
} else
{
PUT_PTR(PREV_PTR(bp), prev);
PUT_PTR(NEXT_PTR(bp), next);
PUT_PTR(NEXT_PTR(prev), bp);
if (next)
PUT_PTR(PREV_PTR(next), bp);
}
}
get_list_idx
計算 block 對應的 list index
static int get_list_idx(size_t asize) {
int idx = 0;
while (asize > 1 && idx < 12) {
asize >>= 1;
idx++;
}
return idx;
}
./mdriver -a -g -v -t tracefiles/
Results for mm malloc:
trace valid util ops secs Kops
0 yes 99% 5694 0.000479 11895
1 yes 99% 5848 0.000362 16168
2 yes 99% 6648 0.000396 16779
3 yes 100% 5380 0.000306 17582
4 yes 66% 14400 0.000425 33858
5 yes 96% 4800 0.001190 4033
6 yes 95% 4800 0.001127 4258
7 yes 55% 12000 0.011627 1032
8 yes 51% 24000 0.039899 602
9 yes 31% 14401 0.117794 122
10 yes 30% 14401 0.002925 4923
Total 75% 112372 0.176530 637
Perf index = 45 (util) + 40 (thru) = 85/100
correct:11
perfidx:85
完整程式碼請參考 Github 連結