# 2021q1 quiz6 * [2021q1 第 6 週測驗題 ](https://hackmd.io/@sysprog/linux2021-quiz6) ## 測驗 2 - hashmap - twoSum ### malloc array of pointer to structure 問題 請問在第六週測驗題內 hash table 的實作 在 `map_init()` 內 `map->ht` 這個 pointer to array of `hlist_head` 是什麼時候定義出來的呢? 為什麼在這一行 `(map->ht)[i].first = NULL;` 可以直接對 array 內的成員做初始化呢? ```c= int *twoSum(int *nums, int numsSize, int target, int *returnSize) { map_t *map = map_init(10); ... } // in map_init() map_t *map_init(int bits) { map_t *map = malloc(sizeof(map_t)); if (!map) return NULL; map->bits = bits; map->ht = malloc(sizeof(struct hlist_head) * MAP_HASH_SIZE(map->bits)); if (map->ht) { for (int i = 0; i < MAP_HASH_SIZE(map->bits); i++) (map->ht)[i].first = NULL; } else { free(map); map = NULL; } return map; } ``` #### 推導過程 `map->ht` 是 pointer to array of `hlist_head`, 然後在 line 13 我們分配了一個 element 數量為 2^bits^ 的 array 給 `map->ht` 這個 pointer to `hlist_head`, 他會指向整個 array 的第一個 element, 我們可以使用 `(map->ht)[i]` 來 access 每個 element 以下為 gdb 的實驗結果 ```shell Starting program: quiz6/twoSum Breakpoint 1, map_init (bits=21845) at twoSum.c:11 11 map_t *map_init(int bits) { (gdb) n 12 map_t *map = malloc(sizeof(map_t)); (gdb) n 13 if (!map) (gdb) n 16 map->bits = bits; (gdb) n 17 map->ht = malloc(sizeof(struct hlist_head) * MAP_HASH_SIZE(map->bits)); (gdb) n 18 if (map->ht) { (gdb) ptype map->ht type = struct hlist_head { struct hlist_node *first; } * (gdb) n 19 for (int i = 0; i < MAP_HASH_SIZE(map->bits); i++) (gdb) n 20 (map->ht)[i].first = NULL; (gdb) ptype (map->ht)[0] type = struct hlist_head { struct hlist_node *first; } (gdb) ``` :::warning 可改用 GDB 的 [watch](https://sourceware.org/gdb/onlinedocs/gdb/Set-Watchpoints.html) 命令 :notes: jserv ::: 由上面的 code 可得知 `map->ht` 的型態確實為 pointer to `hlist_head` ```cpp (gdb) ptype map->ht type = struct hlist_head { struct hlist_node *first; } * ``` 而 `(map->ht)[0]` 的型態為 hlist_head ```cpp (gdb) ptype (map->ht)[0] type = struct hlist_head { struct hlist_node *first; } ``` #### gdb `watch` command 實驗 * set watchpoint at `(map->ht)[10].first` * 使用 * 預想當我們 assign `NULL` 給 `(map->ht)[10].first` 時會停止程式 * 問題: 為什麼第一次會無法觸發,但是第二次可以呢? * [stackoverflow - GDB: How to force a watchpoint to not be deleted after a function returned?](https://stackoverflow.com/questions/25938830/gdb-how-to-force-a-watchpoint-to-not-be-deleted-after-a-function-returned) * [Is the “struct hack” technically undefined behavior?](https://stackoverflow.com/questions/3711233/is-the-struct-hack-technically-undefined-behavior) 這裡的行數與一開始不同! ```shell= (gdb) b 18 Breakpoint 1 at 0x11ca: file twoSum.c, line 18. (gdb) r Starting program: /home/ubuntu/repo/unknowntpo/sandbox/linux/linux2021spring/quiz6/twoSum Breakpoint 1, map_init (bits=10) at twoSum.c:18 18 if (map->ht) { (gdb) watch -l (map->ht)[10].first Hardware watchpoint 2: -location (map->ht)[10].first (gdb) c Continuing. [Inferior 1 (process 67597) exited normally] (gdb) info breakpoints Num Type Disp Enb Address What 1 breakpoint keep y 0x00005555555551ca in map_init at twoSum.c:18 breakpoint already hit 1 time 2 hw watchpoint keep y -location (map->ht)[10].first (gdb) r Starting program: /home/ubuntu/repo/unknowntpo/sandbox/linux/linux2021spring/quiz6/twoSum Breakpoint 1, map_init (bits=10) at twoSum.c:18 18 if (map->ht) { (gdb) c Continuing. Hardware watchpoint 2: -location (map->ht)[10].first Old value = <unreadable> New value = (struct hlist_node *) 0x0 map_init (bits=10) at twoSum.c:19 19 for (int i = 0; i < MAP_HASH_SIZE(map->bits); i++) ``` :::info 對於兩次實驗結果不符合預期有點在意,是不是自己漏掉了什麼條件沒考慮到? ::: #### :question: 延伸問題: 是否可以越界存取 heap 內超過 array 範圍的地址? 程式碼如下: ```c=1 #include <stdio.h> #include <stdlib.h> #define TABLESIZE 10 int main() { int a[TABLESIZE]; // pointer to array of integer int *pa; pa = malloc(sizeof(int) * TABLESIZE); // init array elements for (int i = 0; i < TABLESIZE; i++) (pa)[i] = 5; // try to access pa[40] printf("pa[40] = %d\n", pa[40]); } ``` 使用 gdb 測試 ```shell=1 Reading symbols from ./malloc... (gdb) b 11 Breakpoint 1 at 0x11a4: file malloc.c, line 11. (gdb) r Starting program: /home/ubuntu/xxx/malloc/malloc Breakpoint 1, main () at malloc.c:11 warning: Source file is more recent than executable. 11 pa = malloc(sizeof(int) * TABLESIZE); (gdb) n 12 for (int i = 0; i < TABLESIZE; i++) (gdb) n 13 (pa)[i] = 5; (gdb) n ... 16 printf("pa[40] = %d\n", pa[40]); (gdb) pa[40] = 0 17 } (gdb) x/40xw &pa[0] 0x5555555592a0: 0x00000005 0x00000005 0x00000005 0x00000005 0x5555555592b0: 0x00000005 0x00000005 0x00000005 0x00000005 0x5555555592c0: 0x00000005 0x00000005 0x00000411 0x00000000 0x5555555592d0: 0x345b6170 0x3d205d30 0x000a3020 0x00000000 0x5555555592e0: 0x00000000 0x00000000 0x00000000 0x00000000 0x5555555592f0: 0x00000000 0x00000000 0x00000000 0x00000000 0x555555559300: 0x00000000 0x00000000 0x00000000 0x00000000 0x555555559310: 0x00000000 0x00000000 0x00000000 0x00000000 0x555555559320: 0x00000000 0x00000000 0x00000000 0x00000000 0x555555559330: 0x00000000 0x00000000 0x00000000 0x00000000 ``` 結果發現 `pa[40]` 是可以被 access 的! 如果 `pa[40]`所在的地址已經被別人使用的話會不會造成安全問題呢? 根據林志恩同學的回答 >如果存取超出範圍會是未定義行為。 以下為 Committee Draft — December 2, 2010 ISO/IEC 9899:201x J.2 Undefined behavior An array subscript is out of range, even if an object is apparently accessible with the given subscript (as in the lvalue expression a[1][7] given the declaration int a[4][5]) (6.5.6). 其他相關討論也可以看這篇 stackoverflow 的文章: [Is it undefined behaviour to access an array beyond its end, if that area is allocated? [duplicate]](https://stackoverflow.com/questions/12354413/is-it-undefined-behaviour-to-access-an-array-beyond-its-end-if-that-area-is-all) ## [測驗 3](https://hackmd.io/@sysprog/linux2021-quiz6#%E6%B8%AC%E9%A9%97-3)