Try   HackMD

2020q1 Homework1 (lab0)

contributed by < unknowntpo >

實驗精神

“If I had eight hours to chop down a tree, I’d spend six hours sharpening my axe.” – Abraham Lincoln

「如果我有 8 小時可以砍 1 棵樹,我會花 6 小時把斧頭磨利。」 – 亞伯拉罕.林肯 (類似漢語「工欲善其事,必先利其器」)

實驗初稿 lab0-c SandBox

Map:

  • lab0-c 架構是什麼?

    • what does dudect do?
  • What's the relationship of queue and linked list?

  • how to install git hook using shell script?

  • how to explain the structure of Makefile?

    • how to control the verbosity?
    • how to embedded valgrind into Makefile?

TODO :

Image Not Showing Possible Reasons
  • The image file may be corrupted
  • The server hosting the image is unavailable
  • The image path is incorrect
  • The image format is not supported
Learn More →
代表目前進度

  • queue 實作
    • q_new:
    • q_free:
    • q_insert_head:
    • q_insert_tail:
    • q_remove_head:
    • q_size:
    • q_reverse:
    • q_sort:
  • makefile 學習
  • 思考如何使用 gdb debug 整個 lab0-c
  • Valgrind 實驗
  • 論文 Dude, is my code constant time?
  • qtest 命令直譯器的實作
  • 參考 AdrianHuang 同學 repo 撰寫好的 git commit
    • 模仿開發流程
  • RIO 套件分析與可改善項目探討
    • 以 cpfile.c 來探討 CSAPP RIO pacakge

      TODO: 改善 rio_package 使 rio_readlineb() 能在一次讀完一整個 單字時,就判斷是否該停止讀取,而不是每讀一個 byte 就要判斷是否為 \n or EOF

Outline

queue.c 實作:

q_size()

Return 時順便做 null queue 的檢查

int q_size(queue_t *q)
{
    return q ? q->size : 0;
}

q_new()

queue_t *q_new()
{
    queue_t *q = malloc(sizeof(queue_t));

    if(!q)
        return NULL;
    
    q->head = NULL;
    q->tail = NULL;
    q->size = 0; // set the size to zero.
   
    return q;
}

q_free 實作

q_insert_head 實作

  • 參考 jserv 老師的 code review 使用 NULL_PTR_GUARD() macro 來簡化對 NULL pointer 的檢查
// in queue.c
/* Null pointer guard */
#define NULL_PTR_GUARD(q) \
    do {                  \
        if (!q)           \
            return false; \
    } while (0)

code: 解釋:

  • Step1:

    • 一開始先檢查是否傳遞的 q 為 NULL queue
    • 並使用 strlen() 求得字串長度,儲存在型態為 size_t 的變數 len
  • Step2:

    • 為新的 node newh 配置記憶體空間,
      • 使用 NULL_PTR_GUARD 做 null pointer 檢查,
        • 如果配置失敗,就 return false
      • newh->value 配置記憶體空間,以便之後把輸入的字串 s 複製到新的 node 上
        • 一樣進行 null pointer 的檢查,不過如果這裡配置失敗的話,必須先釋放整個 newh 結構體的記憶體空間才能 return false, 所以沒辦法用 NULL_PTR_GUARD, 只能手動寫一個。
  • Step3:

  • Step4:

    • 更新 q->headq->tail 所指向的位置
      • q->tail 要分成兩種情況討論
        • 如果 q_insert_head()執行前節點數量為 0,
          • 這時就必須讓 q->tail 指向新的 node newh,
        • 如果 q_insert_head()執行前節點數量為 1,
          • 這時就不能更動 q->tail 的位置
    • 更新 q->size 的數值
  • Step5:

    • return true
bool q_insert_head(queue_t *q, char *s)
{
    size_t len = strlen(s);
    NULL_PTR_GUARD(q);

    list_ele_t *newh = malloc(sizeof(list_ele_t));
    NULL_PTR_GUARD(newh);

    newh->value = malloc(len + 1);
    if (!newh->value) {
        free(newh);
        return false;
    }

    strncpy(newh->value, s, len);
    newh->value[len] = '\0';

    newh->next = q->head;
    q->head = newh;
    if (!q->tail)
        q->tail = newh;

    q->size++;

    return true;
}

q_insert_tail 實作

  • 使用 newt 代表新插入的 node
  • 大致上與 q_insert_head() 相同,不同的地方是,必須考慮到兩種情況
    • Empty queue
    • q->size >= 1
  • Empty queue
    • 此時 q->headq->tail 都指向 NULL
    • 在更新 q->headq->tail 時必須 讓他們同時指向newt
  • q->size >= 1
    • 此時 q->headq->tail 都不為 NULL, 分別指向 queue 的 開頭與結尾
      • 我們只需要更新 q->tail 的位置
      • 首先讓 q->tail->next = newt 來把新的 node 接在尾端
      • 再來更新 q->tail 的位置讓他指向新的尾端,也就是 newt
bool q_insert_tail(queue_t *q, char *s)
{
    NULL_PTR_GUARD(q);
    size_t len = strlen(s);

    list_ele_t *newt = malloc(sizeof(list_ele_t));
    NULL_PTR_GUARD(newt);
    newt->next = NULL;
    newt->value = malloc(len + 1);
    if (!newt->value) {
        free(newt->value);
        return false;
    }

    strncpy(newt->value, s, len);
    newt->value[len] = '\0';

    if (!q->tail) {  // empty queue
        q->tail = newt;
        q->head = newt;
    } else {
        q->tail->next = newt;
        q->tail = q->tail->next;
    }

    q->size++;

    return true;
}

q->remove_head() 實作

先確認是否為下列3種狀況

  • !sp
  • !q
  • !q->head

再來決定 要 copy 的字元數量

如果 bufsize == strlen(q->head->value) 呢?

  • 可能有一個 byte 沒有 copy 到
  • line 16 待驗證
  • * ok /* FIXME: verify if bufsize == strlen(q->head->value) */
  • make test fail, 待解決

commit:
Fix the boundary bug of sp in q_remove_head()

Consider the condition when
bufsize == strlen(q->head->value) implies that
ncopy == strlen(q->head->value) == bufsize, there's no place for placing '\0' at last of the sp. It may cause error.

  • 如果 bufsize - 1 大於等於要刪除的字串長度

    • ncopy 就是 要刪除的字串長度
      • Image Not Showing Possible Reasons
        • The image file may be corrupted
        • The server hosting the image is unavailable
        • The image path is incorrect
        • The image format is not supported
        Learn More →
        : 只要關注 不包含 \0 的字串長度就好
  • 如果 bufsize 小於要刪除的字串長度

    • ncopy 就是 bufsize - 1
      • 因為要留一個 byte 放置 \0
/* * Attempt to remove element from head of queue. * Return true if successful. * Return false if queue is NULL or empty. * If sp is non-NULL and an element is removed, copy the removed string to *sp * (up to a maximum of bufsize-1 characters, plus a null terminator.) * The space used by the list element and the string should be freed. */ bool q_remove_head(queue_t *q, char *sp, size_t bufsize) { list_ele_t *old_h; if (!q || !q->head || !sp) return false; size_t ncopy = (bufsize - 1) >= strlen(q->head->value) ? strlen(q->head->value) : bufsize-1; strncpy(sp, q->head->value, ncopy); sp[ncopy] = '\0'; old_h = q->head; q->head = q->head->next; free(old_h->value); free(old_h); q->size--; return true; }

q_reverse() 實作

Swap 概念實作 q_reverse()

一開始先把 q->head, q->tail 交換位置,再來使用 for-loop 把箭頭一個個反向, 最後把 q->tail 設為 0 後結束 function

void q_reverse(queue_t *q)
{
    if (!q || q->size == 1)
        return;
    list_ele_t *pre, *cur, *nxt;

    /* Swap q->head and q->tail */
    pre = q->head;
    q->head = q->tail;
    q->tail = pre;

    /* reverse the linked list */
    for(q->head->next = q->tail, pre = q->tail, cur = pre->next, nxt = cur->next; ;) {
        /* reverse */
        cur->next = pre;

        /* update the pointer */
        pre = cur;
        if (cur == q->head) {
            q->tail->next = NULL;
            return; // means we're done
        } else {
            cur = nxt;
            nxt = nxt->next;
        }

    }
}

步驟說明

假設一開始為

5 -> 2 -> 1 -> 3 -> 4 -> x
|                   |   
head                tail

一開始先 swap q->headq->tail ,避免移動指標到下一個 node 時不小心 dereference NULL pointer (在 nxt = nxt->next這一行, 如果 nxt->next == NULL 時就會出錯)

5 -> 2 -> 1 -> 3 -> 4 -> x
|                   |   
tail                head

再來就是使用 for-loop 把 linked list 的箭頭反向, 一開始初始化迴圈時,

  • 先把 q->headq->tail 接起來,
    • 避免移動指標到下一個 node 時不小心 dereference NULL pointer (在 nxt = nxt->next這一行, 如果 nxt->next == NULL 時就會出錯)
  • 設定 pre cur nxt 指標的位置

Round 1:

# After initialize for-loop
------------------------   
|                      |
v                      |      
5 -> 2 -> 1 -> 3 -> 4 -- 
|    |    |         |   
tail cur  nxt       head
pre

code: reverse arrow cur->next = pre;

# state:
------------------------   
|                      |
v                      |      
5 <->2    1 -> 3 -> 4 -- 
|    |    |         |   
tail cur  nxt       head
pre


code: pre = cur

# state:
------------------------   
|                      |
v                      |      
5 <->2    1 -> 3 -> 4 -- 
|    |    |         |   
tail cur  nxt       head
     pre


code: inside if statement

  • false
    • so update cur, nxt
      • cur = nxt; nxt = nxt->next;
# state:
------------------------   
|                      |
v                      |      
5 <->2    1 -> 3 -> 4 -- 
|    |    |    |    |    
tail pre  cur  nxt  head

Round 2:

code: reverse arrow cur->next = pre;

# state:
------------------------   
|                      |
v                      |      
5 <->2 <- 1 -> 3 -> 4 -- 
|    |    |    |    |    
tail pre  cur  nxt  head


code: pre = cur

# state:
------------------------   
|                      |
v                      |      
5 <->2 <- 1 -> 3 -> 4 -- 
|    |    |    |    |    
tail      cur  nxt  head
          pre

code: inside if statement

  • false
    • so update cur, nxt
      • cur = nxt; nxt = nxt->next;
# state:
------------------------   
|                      |
v                      |      
5 <->2 <- 1 -> 3 -> 4 -- 
|    |    |    |    |    
tail      pre  cur  head
                    nxt

Round 3:

code: reverse arrow cur->next = pre;

# state:
------------------------   
|                      |
v                      |      
5 <->2 <- 1 <- 3    4 -- 
|    |    |    |    |    
tail      pre  cur  head
                    nxt

code: pre = cur

# state:
------------------------   
|                      |
v                      |      
5 <->2 <- 1 <- 3    4 -- 
|    |    |    |    |    
tail           cur  head
               pre  nxt

code: inside if statement

  • false
    • so update cur, nxt
      • cur = nxt; nxt = nxt->next;
# state:
------------------------   
|                      |
v                      |      
5 <->2 <- 1 <- 3    4 -- 
|              |    |    
tail                head
nxt            pre  cur

Round 4

code: reverse arrow cur->next = pre;

# state:
                         
5 <->2 <- 1 <- 3 <- 4  
|              |    |    
tail                head
nxt            pre  cur

code: pre = cur

# state:
                         
5 <->2 <- 1 <- 3 <- 4  
|                   |    
tail                head
nxt                 cur
                    pre

code: inside if statement

  • true
    • so set tail->next to NULL and return
      • q->tail->next = NULL; return;
# final state:
                         
x <- 5 <- 2 <- 1 <- 3 <- 4  
     |                   |    
     tail                head
     nxt                 cur
                         pre

Valgrind

Trace-12-malloc


作業範本 (HackMD)

參考 makefile build 出檔案的方式來學習

  • target: e.g. linked-list 實作
  • prerequisites: c structure. NULL pointer. Dynamic_memory allocation
  • command: 學習過程
target: prerequisites
    command

作業完成步驟

  • 開發環境設定
  • 取得程式碼並進行開發 :
    • clang-format 工具和一致的程式撰寫風格
    • Git Hooks 進行自動程式碼排版檢查
    • 撰寫 Git Commit Message 和自動檢查機制
    • 牛刀小試
      • Image Not Showing Possible Reasons
        • The image file may be corrupted
        • The server hosting the image is unavailable
        • The image path is incorrect
        • The image format is not supported
        Learn More →
        實作 q_size tutorial
    • 以 Valgrind 分析記憶體問題
    • Valgrind 使用案例
  • 自動測試程式
  • 作業要求

問題集與材料閱讀

Linked Lists 基礎知識

課內學習資源:

須具備基礎知識:

  • Image Not Showing Possible Reasons
    • The image file may be corrupted
    • The server hosting the image is unavailable
    • The image path is incorrect
    • The image format is not supported
    Learn More →
    C structure 基本概念
  • Dynamic memory allocation
  • Image Not Showing Possible Reasons
    • The image file may be corrupted
    • The server hosting the image is unavailable
    • The image path is incorrect
    • The image format is not supported
    Learn More →
    Typedef 基本概念
    • K&R 6.7 Typedef
    • Typedef 在 structure 的應用

K&R 6.1 Basics of structures

Declaration of structure

可行的宣告方式:

struct {...} x, y, z;

or

struct struct_tag {
    type member1;  // type can be int, struct, float
    type member2;
} struct_variable;

It's important to note that:

  • struct_tag is optional
  • when struct_variable is declared, system will allocate memory to it.

To access the member member1 in a structure struct_tag, we can use:

struct_variable.member1;

善用 Typedef 簡化程式宣告

For example, Consider following code mysqrt.c which calculate the distance between origin and point pt:

#include <stdio.h>
#include <math.h>

typedef struct {
   int x;
   int y;
} point;

int main() {
    point pt = {10, 20}; // declare and initialize a structure and its members
    double dist;
    printf("pt = (%d,%d)\n", pt.x, pt.y);
    dist = sqrt( (double)pt.x * pt.x + (double)pt.y * pt.y);
    printf("distance from origin to pt = %lf\n", dist);
}

compile it.

$ gcc mysqrt.c -o mysqrt 

Output:

$ mysqrt.c:23: undefined reference to `sqrt'
collect2: error: ld returned 1 exit status

初步猜測 ld 可能與 linker 有關 透過 google 搜尋關鍵字: undefined reference to sqrt

stack overflow

有人問一樣的問題

Answer:

The math library must be linked in when building the executable. How to do this varies by environment, but in Linux/Unix, just add -lm to the command. The math library is named libm.so, and the -l command option assumes a lib prefix and .a or .so suffix.wallyk

我的理解 linker 在連結的時候沒有找到 libm.so 這個檔案裡的 sqrt() function

修改 gcc 指令:

$ gcc mysqrt.c -o mysqrt -lm 

得到正確結果

$ ./mysqrt
pt = (10,20)
distance from origin to pt = 22.360680
衍伸問題

libm.so 是什麼呢? <math.h> 有什麼關係? 如何解釋 undefined reference ?

請參閱 你所不知道的 C 語言:動態連結器篇你所不知道的 C 語言:連結器和執行檔資訊,有對應的解說。

Image Not Showing Possible Reasons
  • The image file may be corrupted
  • The server hosting the image is unavailable
  • The image path is incorrect
  • The image format is not supported
Learn More →
jserv

收到,待補上心得 unknowntpo7:55 March 7 2020

Typedef 後的變數是否只能與同一個 Typedef 下的變數相容

參考 ISO / IEC 98996.7.7 Type definitions:

In a declaration whose storage-class specifier is typedef, each declarator defines an identifier to be a typedef name that denotes the type specified for the identifier in the way described in 6.7.5. Any array size expressions associated with variable length array declarators are evaluated each time the declaration of the typedef name is reached in the order of execution. A typedef declaration does not introduce a new type, only a synonym for the type so specified. That is, in the following declarations:

typedef T type-ident;
type-ident D;

type_ident is defined as a typedef name with the type specified by the declaration specifiers in T (known as T), and the identifier in D has the type derived-declaratortype-list T where the derived-declarator-type-list is specified by the declarators of D. A typedef name shares the same name space as other identifiers declared in ordinary declarators.

(Page 135).

Keywords:

  • storage-class specifier.
  • declarator

我的理解: consider the code below:

typedef int myint;
myint x;
  • declaration specifier: int
  • type identifier: myint
  • declarator: x
  • A typedef does not introduce a new type, only a synonym of the type so specified.
  • A typedef name shares the same name space as other identifiers declared in ordinary declarators.

參照 Example 2:

After the declarations:

typedef struct s1 { int x; } t1, *tp1;
typedef struct s2 { int x; } t2, *tp2;
  • Type t1 and the type pointed to by tp1 are compatible.
  • Type t1 is also compatible with type struct s1,
  • but not compatible with the types:
    • struct s2,
    • t2,
    • the type pointed to by tp2,
    • int.

關於 typedef 的 name space

根據 typedef statement in C

關於 typedef 的 namespace, 如果 typedef 的宣告放在所有 function 外面,那他的 scope 就是 global ,那所有的 function 都能使用這個 typedef. 但如果 typedef 的宣告放在某個 function 裡面,他的 scope 就是 local ,那在其他地方就不能使用它。

e.g. Consider the code below,

#include<stdio.h> void foo(void); int main() { typedef unsigned char uchar; uchar ch = 'a'; printf("ch inside main() : %c\n", ch); foo(); return 0; } void foo(void) { uchar ch = 'a'; // Error unsigned char ch = 'z'; printf("ch inside foo() : %c\n", ch); }

因為在

typedef unsigned char uchar

的 scope 沒有包含foo() 所以出現以下 error:

test_typedef.c: In function ‘foo’:
test_typedef.c:16:5: error: unknown type name ‘uchar’; did you mean ‘char’?
     uchar ch = 'a'; // Error
     ^~~~~
     char
test_typedef.c:17:19: error: conflicting types for ‘ch’
     unsigned char ch = 'z';
                   ^~
test_typedef.c:16:11: note: previous definition of ‘ch’ was here
     uchar ch = 'a'; // Error
衍生問題

namespace 是什麼? scope 是什麼?

這些詞彙是 Programming Language Theory (PLT) 的術語,可簡易理解為:

  1. namespace 著重於 identifier (即編譯器解析原始程式碼之後所任得的單元,可以是變數或函式名稱) 的區別方式;
  2. scope 在中文翻譯是「作用區域」,不特別區隔 identifier,但考慮到同樣名稱但真正有效的範圍,例如:
int i = 0;
int foo() { i++; }
int main() {
    int i = 100;
    for (int i = 1; i < 100; i++) ;
}

變數 i 的數值取決於在哪裡取得 (evaluate),顯然在 foo, main, for 都會不同,而且參照的物件也不同,有 local 也有 global。

Image Not Showing Possible Reasons
  • The image file may be corrupted
  • The server hosting the image is unavailable
  • The image path is incorrect
  • The image format is not supported
Learn More →
jserv

K&R 6.2 - Check whether a point is in rectangle or not

The full code is in gist.

參考資料 你所不知道的 C 語言:前置處理器應用篇

Linux Kernel 開發日誌 2020.3.9 中, K&R p.130 的 canonrect() function macro 作動不對 導致數值比較錯誤

Debug 後發現

#define MAX(a,b) ((a) > (b) ? (a) : (b))

relational operator > 寫反了。

延伸問題

gdb debug 後發現

(gdb) l 96
91       * Return: a rect which data type is 'struct rect'.
92       */
93      rect cononrect (rect r)
94      {
95          rect temp;
96          temp.pt1.x = min(r.pt1.x, r.pt2.x);
97          temp.pt1.y = min(r.pt1.y, r.pt2.y);
98          temp.pt2.x = max(r.pt1.x, r.pt2.x);
99          temp.pt2.y = max(r.pt1.y, r.pt2.y);
100         return temp;
Breakpoint 1, cononrect (r=...) at rect.c:96
(gdb) print temp
$1 = {pt1 = {x = 1, y = 0}, pt2 = {x = 5, y = 5}}
(gdb) print r
$2 = {pt1 = {x = 10, y = 10}, pt2 = {x = 0, y = 0}}
(gdb)

cononrect() 內, temp 一開始的數值是

$1 = {pt1 = {x = 1, y = 0}, pt2 = {x = 5, y = 5}}

推測是因為沒有被初始化的關係,所以有非預期的數值

pt1.x = 1
pt2.x = 5
pt2.y = 5

出現 所以 在 gdb 內 assign variable 給 temp 模擬 line 95 如果我們已經初始化 temp 的狀況。

(gdb) set variable temp = {{0,0},{0,0}}

之後再修改程式碼:

rect temp = {{0, 0}, {0, 0}};

再使用 gdb 來測試

Breakpoint 1, cononrect (r=...) at rect.c:95
95          rect temp = {{0, 0}, {0, 0}};
(gdb) n
96          temp.pt1.x = min(r.pt1.x, r.pt2.x);
(gdb) print temp
$3 = {pt1 = {x = 0, y = 0}, pt2 = {x = 0, y = 0}}
(gdb)

發現到我們成功初始化 temp 了!

但其實我們不需要初始化,畢竟不管 temp 內的 member 是什麼數字,之後都會馬上被 assign 成別的數。


K&R p.131 pointer and structure operator precedence

consider a structure

struct {
    int len;
    char *str;
}*p;

murmur: 這個宣告是否合法? unknowntpo

then,

  • ++p -> len:

    • access the address of structure member len,
    • and imcrement len
  • (++p)->len:

    • imcrement p and then access structure member len.
  • *p->str++:

    • access str: (p->str)
    • evaluate str: (p->str)++
    • get the value of str.
    • evaluate++: increment the value that str point to.

查詢 imcrement operator increment 的時間 *p->str++: 解釋可能錯誤,該如何更好地用英文解釋這些運算的行為呢? 我對用詞的掌握好像不太精準,可能會造成別人誤解

  • (*p->str)++:
    • get access to str :(p->str)
    • get the value of str: *(p->str)
    • imcrement the value that str pointed to. *(p->str)++
  • *p++->str:
    • p++: 先 evaluate p++, 在下一個 sequence point 前 increment p;
    • p++->str: get access to str
    • get the value of str
  • 撰寫程式來測試這些操作是否為 Undefined Behavior
#include <stdio.h> struct { int len; char *str; }*p; int main() { p->len = 1; p->str = "hello"; //printf("p -> len = %d", p->len); //printf("++p -> len = %d", ++p->len); return 0; }

遇到 segmentation fault.

(gdb) b 10
Breakpoint 1 at 0x5fe: file pointer.c, line 10.
(gdb) r
Starting program: /home/ubuntu/course_jserv_linux_kernel/week1/struct/pointer

Breakpoint 1, main () at pointer.c:10
10          p->len = 1;
(gdb) n

Program received signal SIGSEGV, Segmentation fault.
0x0000555555554605 in main () at pointer.c:10
10          p->len = 1;
(gdb)

segmentation fault: 查詢 Core Dump (Segmentation fault) in C/C++

推測可能是

  • structure 的宣告與
  • structure pointer 的觀念沒弄清楚。
  • 搞不懂 *p 的 scope, 所以存取了不能沒有存取權限的記憶體。

閱讀 K&R 6.4 Pointers to Structures 後再修改程式。

小心得: 基礎觀念還是要仔細學習,不能輕率, 不然到最後還是要繞一大圈,反而更花時間 unknowntpoThu, Mar 12, 2020 11:08 AM