owned this note
owned this note
Published
Linked with GitHub
# quiz10,quiz13,quiz14
redo the quiz:
## quiz10
* 首先是第一題 AAA 的部份
```clike
static int mylog_mmap(struct file *file, struct vm_area_struct *vma) {
struct mylog *mylog = AAA;
unsigned long pfn_start =
(virt_to_phys(mylog->buff) >> BBB) + CCC;
unsigned long size = vma->vm_end - vma->vm_start;
return remap_pfn_range(vma, vma->vm_start, pfn_start, size,
vma->vm_page_prot);
}
```
宣告一個指向結構 `mylog` 的 pointer
而 b,c 選項不存在於 file structure 之中,a 選項則會產生出 NULL pointer dereference 的問題,而以下是對 `*private_data` 之描述:
>private_data is a useful resource for preserving state information across system calls and is used by most of our sample modules.
也就是說,mylog 這個 pointer 會指向保存相關state information 的地方,故選 **D** **private_data**。
* BBB 和 CCC:
```clike=
static int mylog_mmap(struct file *file, struct vm_area_struct *vma) {
struct mylog *mylog = AAA;
unsigned long pfn_start =(virt_to_phys(mylog->buff) >> BBB) + CCC;
unsigned long size = vma->vm_end - vma->vm_start;
return remap_pfn_range(vma, vma->vm_start, pfn_start, size,
vma->vm_page_prot);
}
```
根據我的理解,這段 code 的用意是在將 kernel memory 中拿來紀錄的記憶體區段,映射到 userspace 之中。
首先,`pfn_start` 在 manpage 中的描述,他是做為 `remap_pfn_range` 的 **physical address of kernel memory** ,也就是說,是被映射之 kernel memory 的實體記憶體的起始位置 [(reference)](https://www.kernel.org/doc/htmldocs/kernel-api/API-remap-pfn-range.html)。
所以這行的行為就是要算出被映射的起始位址,然而 `virt_to_phys` 的功能就是將紀錄的記憶體區段(log buffer in kernel)的虛擬位置轉換成實體位置,在得到實體位置之後將它除以 PAGE_SIZE ( 可利用 `getconf PAGE_SIZE` 得知 page size 為 4096) 故須位移 PAGE_SHIFT 12 ,在得知是第幾個 page 後再加上 page offset 之後便能夠得知 physical address 故 **BBB 選 PAGE_SHIFT**, **CCC 選 vma->vm_pgoff**
[reference](https://shanetully.com/2014/12/translating-virtual-addresses-to-physcial-addresses-in-user-space/)
## quiz13
題目:
* AAA:
從整個 queue 的 init 就能夠大概得知他的運作原理:
它的原理是用 mmap 去映射一個檔案(fd)到兩塊 virtual address space:
以下映射出第一段
```clike=
if (mmap(q->buffer, s, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED,
q->fd, 0) == MAP_FAILED)
queue_error_errno("Could not map buffer into virtual memory");
```
映射第二段:
以第一段映射到的位置加上一塊 buffersize 大小作為第二段映射的起始位置(造成兩塊記憶體相鄰)
```clike=
mmap(q->buffer + s, s, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED,
104 q->fd, 0) == MAP_FAILED)
```
接著是題目存在的 code 區段
```clike=
/** Insert into queue *q* a message of *size* bytes from *buffer*
*
* Blocks until sufficient space is available in the queue.
*/
void queue_put(queue_t *q, uint8_t *buffer, size_t size) {
pthread_mutex_lock(&q->lock);
// Wait for space to become available
while ((q->size - (q->tail - q->head)) < (size + sizeof(message_t)))
pthread_cond_wait(&q->writeable, &q->lock);
// Construct header
message_t m = {.len = size, .seq = q->tail_seq++};
// Write message
memcpy(&q->buffer[q->tail], &m, sizeof(message_t));
memcpy(&q->buffer[AAA], buffer, size);
// Increment write index
q->tail += BBB;
pthread_cond_signal(&q->readable);
pthread_mutex_unlock(&q->lock);
}
```
很明顯的,這個 funtion 是將資料從 "buffer" 放入 queue 之中,然而它紀錄資料的方式不僅僅是放入資料,可以看到它用一個特殊的 structure 叫做 `message_t` 來紀錄有關 buffer 中的 meta data,所以在進行資料的放入時,須考慮 meta data 的空間
在 AAA 處就是要將 buffer 的資料(上一行放 meta data) 放入,所以選 **q->tail + sizeof(message_t)** 也就是放完 meta data 的位置。
* BBB 的部份:
在 meta data 和資料本體都放完後,q->tail 要加上兩者的大小,故選 **size+sizeof(message_t)**
* CCC 的部份:
```clike=
/** Retrieves a message of at most *max* bytes from queue *q* and writes
* it to *buffer*.
*
* Blocks until a message of no more than *max* bytes is available.
*
* Returns the number of bytes in the written message.
*/
size_t queue_get(queue_t *q, uint8_t *buffer, size_t max)
{
pthread_mutex_lock(&q->lock);
// Wait for a message that we can successfully consume to reach the front of
// the queue
message_t m;
for (;;) {
// Wait for a message to arrive
while ((q->tail - q->head) == 0)
pthread_cond_wait(&q->readable, &q->lock);
// Read message header
memcpy(&m, &q->buffer[q->head], sizeof(message_t));
// Message too long, wait for someone else to consume it
if (m.len > max) {
while (q->head_seq == m.seq)
pthread_cond_wait(&q->writeable, &q->lock);
continue;
}
// We successfully consumed the header of a suitable message, so proceed
break;
}
// Read message body
memcpy(buffer, &q->buffer[q->head + sizeof(message_t)], m.len);
// Consume the message by incrementing the read pointer
q->head += m.len + sizeof(message_t);
q->head_seq++;
// When read buffer moves into 2nd memory region, we can reset to the 1st
// region
if (q->head >= q->size) {
CCC;
}
pthread_cond_signal(&q->writeable);
pthread_mutex_unlock(&q->lock);
return m.len;
}
```
這個區段正如 command 所提到,會將資料由 queue 取出放入 buffer,而 CCC 的部份便是在將資料取出後,需要調整的 head 和 tail 標示,因為 mapping 完的架構大致是這樣:
![](https://i.imgur.com/66mnBcG.png)
因此,舉例來說,若是放資料在 3,4,1 的位置,真正存在記憶體中的東西是:1,x(未知),3,4
此時因為已將資料取出,導致 head 的位置位在:
`q->head += m.len + sizeof(message_t);`
所以可以將 head 往前調整至前面的位置(映射至相同實體位置的虛擬位置,差一個 buffer 的 size ),然後 tail 也需要調整一個 buffer 的 size (需來到映射區段一以配合 head ),選 **(a)q->head -= q->size; q->tail -= q->size;**
## quiz14
(1)
* AAA:
根據 `naive_malloc_internal` 的 command 寫到:
>places some header information at the start of the first page, then returns a pointer to the first byte after the header
`/* number of pages to allocate */
size_t pages = AAA;`
所以除了本身的 data 所用的 memory 要加上額外 header 的空間,故選 **(size + sizeof(struct page_header)) / page_size + 1**
* BBB:
在參考 linux manpage 中所提到關於 mmap prot flags 的用途:
>PROT_EXEC Pages may be executed.
>PROT_READ Pages may be read.
>PROT_WRITE Pages may be written.
>PROT_NONE Pages may not be accessed.
也就是說被映射的記憶體在存取上會有限制,
藉由 mmap 傳入不同的 flag 代表這塊記憶存取的權限:
* 直接用 code 進行實驗:
在使用 `PROT_READ` 和 `PROT_READ | PROT_EXEC` 之後皆發生了記憶體存取錯誤
`
Segmentation fault (core dumped)
`
但在使用 `PROT_READ | PROT_WRITE` 之後便成功的執行,因為在此段就執行了記憶體的寫入:
```clike=
/* copy values to page header */
page->pages = pages;
page->page_size = page_size;
return page;
```
所以需要有寫入的權利,選 **PROT_READ | PROT_WRITE**
(2)
* A1:
由當中一段 return "is split" flag 的程式碼得知:
```clike=
/* Given the index of a node, this returns the "is split" flag of the parent.
*/
static int parent_is_split(size_t index)
{
index = (index - 1) / 2;
return (node_is_split[index / 8] >> (index % 8)) & 1;
}
```
它在存取 "is split" flag 時需要執行的動作=>`(node_is_split[index / 8] >> (index % 8)) & 1;`
這邊 A1 需要做的事情是翻轉它(0 變 1,1 變 0) 所以依照上述存取方式再加上一個翻轉的動作=>選
**node_is_split[index / 8] ^= 1 << (index % 8);**
* A2:
在 A2 段的程式碼註解提到:
```clike=
/*
* Given the requested size passed to "malloc", this function returns the index
* of the smallest bucket that can fit that size.
*/
static size_t bucket_for_request(size_t request)
{
size_t bucket = BUCKET_COUNT - 1;
size_t size = MIN_ALLOC;
while (size < request) {
A2
}
return bucket;
}
```
它會根據 request 的記憶體需求去找到合適的 bucket size 進行配置,在程式碼的 command 中有講到關於 bucket index 和 記憶體配置大小的關係
> Given a bucket index, the size of the allocations in that bucket can be found with "(size_t)1 << (MAX_ALLOC_LOG2 - bucket)".
因此,一開始 bucket index 從 27 找起,代表它是從 (size_t) 1 << (31-27) 的 size (16 bytes MIN_ALLOC) 開始尋找適合配置的空間,直到找到裝的下的記憶體空間,所以 bucket index 從 max(27) 開始扣(每扣 1 代表要用大兩倍的 size),選 **`bucket--;size *= 2; `**
* A3:
以下是 A3 的程式碼區段:
```clike=
/*
* The tree is always rooted at the current bucket limit. This call grows the
* tree by repeatedly doubling it in size until the root lies at the provided
* bucket index. Each doubling lowers the bucket limit by 1.
*/
static int lower_bucket_limit(size_t bucket)
{
while (bucket < bucket_limit) {
size_t root = node_for_ptr(base_ptr, bucket_limit);
uint8_t *right_child;
/*
* If the parent isn't SPLIT, that means the node at the current bucket
* limit is UNUSED and our address space is entirely free. In that case,
* clear the root free list, increase the bucket limit, and add a single
* block with the newly-expanded address space to the new root free
* list.
*/
if (!parent_is_split(root)) {
list_remove((list_t *) base_ptr);
list_init(&buckets[--bucket_limit]);
list_push(&buckets[bucket_limit], (list_t *) base_ptr);
continue;
}
/*
* Otherwise, the tree is currently in use. Create a parent node for the
* current root node in the SPLIT state with a right child on the free
* list. Make sure to reserve the memory for the free list entry before
* writing to it. Note that we do not need to flip the "is split" flag
* for our current parent because it's already on (we know because we
* just checked it above).
*/
right_child = ptr_for_node(root + 1, bucket_limit);
if (!update_max_ptr(right_child + sizeof(list_t))) return 0;
list_push(&buckets[bucket_limit], (list_t *) right_child);
A3
```
這個 lower_bucket_fnction 的作用是把 root 調整到傳入的 bucket index 上,同時代表增加 tree 的 size 。此時,會出現兩種狀態,一種是 parent_is_split 的狀態,另一則是 no spllit 的狀態,在 "is_split" 的狀態之下註解描述到:
>the tree is currently in use. Create a parent node for the current root node in the SPLIT state with a right child on the free list.
也就是說要增加 tree size => 增加 parent node 的空間(調整現在 root node)
```clike=
right_child = ptr_for_node(root + 1, bucket_limit);
if (!update_max_ptr(right_child + sizeof(list_t))) return 0;
list_push(&buckets[bucket_limit], (list_t *) right_child);
A3
```
所以,以上操作會檢視是否 allocator 能夠配置足夠的空間,若不夠會直接 return 0,相反的,若是有足夠的空間就可以把 right child 這個 entry 加到此 bucket limit 之下,同時 init 下一個 更大 size 的 bucket (size 每兩倍 limit -1), 選 **list_init(&buckets[- -bucket_limit]);**
* A4:
A4 的程式碼區段如下:
```clike=
/*
* Otherwise, grow the tree one more level and then pop a block off
* the free list again. Since we know the root of the tree is used
* (because the free list was empty), this will add a parent above
* this node in the SPLIT state and then add the new right child
* node to the free list for this bucket. Popping the free list will
* give us this right child.
*/
if (!lower_bucket_limit(bucket - 1)) return NULL;
A4
```
因為 `list_pop` 若 free list is empty 會回傳 NULL 指標,代表若是此 bucket 尚具有 free list 則會來到 A4 的程式區段 ( `list_post` 非回傳 null),所以會進行 "grow the tree one more level" 然後在從同一個 bucket 在 pop 一次,選 **`ptr = (uint8_t *)list_pop(&buckets[bucket]);`**
* A5:
code 部份:
```clike=
/*
* If we get here, we know our buddy is UNUSED. In this case we should
* merge with that buddy and continue traversing up to the root node. We
* need to remove the buddy from its free list here but we don't need to
* add the merged parent to its free list yet. That will be done once
* after this loop is finished.
*/
A5
i = (i - 1) / 2;
bucket--;
```
在 A5 的程式碼區段中的註解到:
>We need to remove the buddy from its free list
然而上方程式碼註解曾提到在 node_is_split 中訪問各節點的存取方法:
```clike=
Move to parent: index = (index - 1) / 2;
Move to left child: index = index * 2 + 1;
Move to right child: index = index * 2 + 2;
Move to sibling: index = ((index - 1) ^ 1) + 1;
```
故要存取 buddy(sibling) 方法為:`index = ((index - 1) ^ 1) + 1;`
選 `**list_remove((list_t *)ptr_for_node(((i - 1) ^ 1) + 1, bucket));**`