2021q1 Homework1 (lab-0)

# 2021q1 Homework1 (lab-0) contributed by < `qwe661234` > ###### tags: `linux2021` :::info TODO: - [ ] select funciton ::: ## 作業要求 1 q_insert_tail 和 q_size 需要將原本 O(n) 時間複雜度的實作改寫為 O(1)時間複雜度 ## 測試測試所有的測資: > make test 針對單一個測試可以用: > ./qtest -f <command file> 得到更詳細的程式運行狀況，方便修改程式的錯誤 ## 實作 ### queue structure ``` c= typedef struct { list_ele_t *head; list_ele_t *tail; int size; } queue_t; ``` ### q_new 配置記憶體產生一個新的 queue，並檢查記憶體是否成功配置 ``` c= queue_t *q_new() { queue_t *q = malloc(sizeof(queue_t)); if (q == NULL) { return NULL; } q->head = NULL; q->tail = NULL; q->size = 0; return q; } ``` ### q_free 釋放 queue 的記憶體，釋放前要先把 queue 裡面存的 data 佔用的記憶體先釋放掉 ``` c= void q_free(queue_t *q) { if (q == NULL) { return; } list_ele_t *node; node = q->head; while (node) { list_ele_t *current = node; node = node->next; free(current->value); free(current); } free(q); } ``` ### q_insert_head 新增一個 list_ele 和配置記憶體給為其存取的字串，並將其設置為新的開頭，需檢查 q 是否存在以及是否有成功配置記憶體給新增的 list_ele 及其存取的字串 ``` c= bool q_insert_head(queue_t *q, char *s) { list_ele_t *newh; if (q == NULL) { return false; } newh = malloc(sizeof(list_ele_t)); if (newh == NULL) { return false; } int slen = strlen(s); newh->value = malloc((slen + 1) * sizeof(char)); if (newh->value == NULL) { free(newh); return false; } strncpy(newh->value, s, slen); newh->value[slen] = '\0'; newh->next = q->head; q->head = newh; q->size++; if (q->size == 1) { q->tail = q->head; } return true; } ``` ### q_insert_tail 方法和 insert_head 相似，只是改成將其設為結尾，新增變數 q->tail 可使此 function 的時間複雜度達到 O(1)，檢查部分要多檢查 q_size 是否為 0，如果不為0則將開和結尾都設成此 list_ele ``` c= bool q_insert_tail(queue_t *q, char *s) { list_ele_t *newt; if (q == NULL) { return false; } newt = malloc(sizeof(list_ele_t)); if (newt == NULL) { return false; } int slen = strlen(s); newt->value = malloc((slen + 1) * sizeof(char)); if (newt->value == NULL) { free(newt->value); free(newt); return false; } strncpy(newt->value, s, slen); newt->value[slen] = '\0'; newt->next = NULL; if (q->size == 0) { q->head = newt; q->tail = newt; } else { q->tail->next = newt; q->tail = newt; } q->size++; return true; } ``` ### q_remove_head 將 head 移除，移除需要將原本存在 head 中的內容存於 sp，儲存時要先確認 buffer 大小以免造成記憶體錯誤 ``` c= bool q_remove_head(queue_t *q, char *sp, size_t bufsize) { if (q == NULL || q->head == NULL) { return false; } list_ele_t *target = q->head; if (sp != NULL) { int slen = strlen(target->value); if (bufsize >= (slen + 1)) { strncpy(sp, target->value, slen); sp[slen] = '\0'; } else { strncpy(sp, target->value, bufsize - 1); sp[bufsize - 1] = '\0'; } } q->head = q->head->next; q->size--; if (q->size == 0) { q->tail = NULL; } free(target->value); free(target); return true; } ``` ### q_size 用變數 q->size 來 maintain，可以達到 O(1) ``` c= int q_size(queue_t *q) { if (q == NULL) { return 0; } return q->size; } ``` ### q_reverse front 用來存前面的 element, back 用來存後面的 element，將 back 的 next 指向 front，但在指向之前要先存在 temp 中，否則 next 會指向 front 後無法繼續往下一個 element 做 reverse ``` c= void q_reverse(queue_t *q) { if (q == NULL || q->head == q->tail) { return; } list_ele_t *front, *back, *temp; front = q->head; back = front->next; q->tail = q->head; while (back != NULL) { temp = back->next; back->next = front; if (front == q->head) { front->next = NULL; } front = back; back = temp; } q->head = front; } ``` ### q_sort 以類似龜兔賽跑的方式切分 list，fast 會一次跑兩個 list_ele, 而 slow 只會一次跑一個，以此方式將 queue 切成兩份。以此方式遞迴切分至 queue 的 element 只剩一個或空的，接著將其連接並排序，連接完成後找到 queue 的尾巴並 assign 給 q_tail ``` c= list_ele_t *merge(list_ele_t *l1, list_ele_t *l2) { list_ele_t *newh = l1; if (strcmp(l1->value, l2->value) <= 0) { l1 = l1->next; } else { newh = l2; l2 = l2->next; } list_ele_t *ptr = newh; while (1) { if (l1 != NULL && l2 != NULL) { if (strcmp(l1->value, l2->value) <= 0) { ptr->next = l1; l1 = l1->next; } else { ptr->next = l2; l2 = l2->next; } } else { if (l2 == NULL) { ptr->next = l1; } else { ptr->next = l2; } break; } ptr = ptr->next; } return newh; } list_ele_t *mergeSortList(list_ele_t *head) { // merge sort if (head == NULL || head->next == NULL) { return head; } list_ele_t *fast = head->next; list_ele_t *slow = head; // split list while (fast != NULL && fast->next != NULL) { slow = slow->next; fast = fast->next->next; } fast = slow->next; slow->next = NULL; // sort each list list_ele_t *l1 = mergeSortList(head); list_ele_t *l2 = mergeSortList(fast); // merge sorted l1 and sorted l2 return merge(l1, l2); } void q_sort(queue_t *q) { if (q == NULL || q->head == q->tail) { return; } q->head = mergeSortList(q->head); list_ele_t *ptr = q->head; while (ptr->next != NULL) { ptr = ptr->next; } q->tail = ptr; } ``` 原本 function merge 是用 recursive 的方式，但由於效能不佳，無法通過第15個測資，所以改以 nonrecursive 的方式改寫 ```c= list_ele_t *merge(list_ele_t *l1, list_ele_t *l2) { // merge with recursive if (l2 == NULL) { return l1; } if (l1 == NULL) { return l2; } if (strcmp(l1->value, l2->value) <= 0) { l1->next = merge(l1->next, l2); return l1; } else { l2->next = merge(l1, l2->next); return l2; } } ``` ## 測試結果 100 / 100 ## Address Sanitizer Error message: ``` c= int oldval = *plist->valp; ``` 但一直找不到原因，後來參考其他人的開發紀錄才知道原因出在上面宣告的 simulation 是 1個 bytes 的 bool，而下面是用 int 來讀，並以４個 bytes 的方式存取，所以會出現此錯誤，改動三處型別即可解決 ``` c= bool simulation = false; -> int simulation = false; extern bool simulation; -> extern int simulation; static bool echo = 0; -> static int echo = 0; ``` ## Valgrind & Massif 執行Valgrind >make valgrind >valgrind --tool=massif ./qtest -f testfile >massif-visualizer outputfile 測資16的 outputfile ![](https://i.imgur.com/RLbudWU.png) 測資17的 outputfile ![](https://i.imgur.com/7qyV9YG.png) ![](https://i.imgur.com/0AV3Kks.png) ## CERN 筆記許多 vulnerability 關於buffer overflow 和 string mauniplation. 這大多會以segmentation fault 來呈現 Error. gets, strcpy, sprintf 不安全原因為未檢查 buffer長度，且 strcpy 可能 overwrite 目標字串相鄰的記憶體位置。 1. gets -> fgets 2. strcpy -> strlcpy 或是 strcpy -> strncpy 但第二個方法較麻煩，因為他不會自動產生結尾字元 '\0'，要自己另外檢查並接上。 3. sprintf() -> snprintf() strcat 和 strcmp 也有風險存在，但未提及修改方式 4. printf format string e.g. print(argv[1]) ```c= #FormatString.c #include <stdio.h> int main(int argc, char **argv) { char *secret = "This is a secret!\n"; printf external link(argv[1]); return 0; } ``` 當輸入 ./FormatString %s 出現意想不到的結果 ```bash= $ gcc -mpreferred-stack-boundary=2 FormatString.c -o FormatString $ ./FormatString %s This is a secret! ``` 所以最好將 format string 寫死，不要讓 user 自己去 input 5. file openiing Symbolic link attack ([symbolic link 是什麼](https://en.wikipedia.org/wiki/Symbolic_link)) ```c= #include <stdio.h> #include <stdlib.h> #include <unistd.h> #define MY_TMP_FILE "/tmp/file.tmp" int main(int argc, char* argv[]) { FILE * f; if (!access(MY_TMP_FILE, F_OK)) { printf external link("File exists!\n"); return EXIT_FAILURE; } /* At this point the attacker creates a symlink from /tmp/file.tmp to /etc/passwd */ tmpFile = fopen(MY_TMP_FILE, "w"); if (tmpFile == NULL) { return EXIT_FAILURE; } fputs("Some text...\n", tmpFile); fclose(tmpFile); /* You successfully overwrote /etc/passwd (at least if you ran this as root) */ return EXIT_SUCCESS; } ``` 這樣你的密碼就被偷走了，所以 open file 前要先用 unlink() 將可能存在的 symbolic link 去除，並且不要 overwrite 已存在的檔案 reference: https://security.web.cern.ch/recommendations/en/codetools/c.shtml ## console 分析在 qtest 中會呼叫 function run_console, run_console funciton 的 input 為 inputfile, input file 有三種情況: 1. input file 為空會 call linenoise 來讀取使用者的輸入 2. 如果 input file 不為空且存在會先將 file 推入 RIO 中並 call select funciton 3. 如果 input file 不為空但不存在則會報錯 ```c= bool run_console(char *infile_name) { if (!push_file(infile_name)) { report(1, "ERROR: Could not open source file '%s'", infile_name); return false; } if (!has_infile) { char *cmdline; while ((cmdline = linenoise(prompt)) != NULL) { interpret_cmd(cmdline); linenoiseHistoryAdd(cmdline); /* Add to the history. */ linenoiseHistorySave(HISTORY_FILE); /* Save the history on disk. */ linenoiseFree(cmdline); } } else { while (!cmd_done()) cmd_select(0, NULL, NULL, NULL, NULL); } return err_cnt == 0; } ``` ## Select function ```c= int cmd_select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout) { int infd; fd_set local_readset; if (cmd_done()) return 0; if (!block_flag) { /* Process any commands in input buffer */ if (!readfds) readfds = &local_readset; /* Add input fd to readset for select */ infd = buf_stack->fd; FD_SET(infd, readfds); if (infd == STDIN_FILENO && prompt_flag) { printf("%s", prompt); fflush(stdout); prompt_flag = true; } if (infd >= nfds) nfds = infd + 1; } if (nfds == 0) return 0; int result = select(nfds, readfds, writefds, exceptfds, timeout); if (result <= 0) return result; infd = buf_stack->fd; if (readfds && FD_ISSET(infd, readfds)) { /* Commandline input available */ FD_CLR(infd, readfds); result--; if (has_infile) { char *cmdline; cmdline = readline(); if (cmdline) interpret_cmd(cmdline); } } return result; } ``` ## Linenoise ### TTY == terminal (Teletype / Teletypewriter) 的英文縮寫就是 tty 終端機介面有二種模式： 1. 正規模式: 又稱為 cooked 模式。在這種模式中，終端設備會處理特殊字元，且會以一次一列的方式將輸入傳給應用程式。例如 Linux 的 shell 指令。 2. 非正規模式: 又稱為 raw 模式。在這種模式中，終端設備不會處理特殊字元，且會以一次一個字元的方式將輸入傳給應用程式。例如在 Linux 使用 vim 編輯程式。 ### function linenoise input prompt 是自定義的命令提示字串, 用來檢視是否為使用 terminal 的狀態, 如果使用 terminal 會執行 funciton `linenoiseRaw` ```c= char *linenoise(const char *prompt) { char buf[LINENOISE_MAX_LINE]; int count; if (!isatty(STDIN_FILENO)) { /* Not a tty: read from file / pipe. In this mode we don't want any * limit to the line size, so we call a function to handle that. */ return linenoiseNoTTY(); } else if (isUnsupportedTerm()) { size_t len; printf("%s",prompt); fflush(stdout); if (fgets(buf,LINENOISE_MAX_LINE,stdin) == NULL) return NULL; len = strlen(buf); while(len && (buf[len-1] == '\n' || buf[len-1] == '\r')) { len--; buf[len] = '\0'; } return strdup(buf); } else { count = linenoiseRaw(buf,LINENOISE_MAX_LINE,prompt); if (count == -1) return NULL; return strdup(buf); } } ``` ### function linenoiseRaw 會呼叫 function `enableRawMode` 將目前 termios 的參數設為 Raw mode, 接著以 function `linenoiseEdit` 來讀取使用者輸入, 讀去後以 funciton `disableRawMode` 關閉 Raw mode, 最後回傳讀取 buffer 的字串長度。 ```c= static int linenoiseRaw(char *buf, size_t buflen, const char *prompt) { int count; if (buflen == 0) { errno = EINVAL; return -1; } if (enableRawMode(STDIN_FILENO) == -1) return -1; count = linenoiseEdit(STDIN_FILENO, STDOUT_FILENO, buf, buflen, prompt); disableRawMode(STDIN_FILENO); printf("\n"); return count; } ``` ### function EnableRawMode termios 是在 Linux 中設定終端機的參數，例如鮑率、字元長度等。termios的結構如下： ```c= struct termios { tcflag_t c_iflag; //輸入模式 tcflag_t c_oflag; //輸出模式 tcflag_t c_cflag; //控制模式 tcflag_t c_lflag; //局部模式 cc_t c_cc[NCCS]; //特殊控制字元 } ``` function `enableRawMode` 就是以 funciton `tcgetattr`用來取得目前的 termios 也就是終端機設定, 接著將當前的終端機設定改為 RawMode, 再以 funciton `tcsetattr` 將終端機設定設為修改過的 termios ```c= static int enableRawMode(int fd) { struct termios raw; if (!isatty(STDIN_FILENO)) goto fatal; if (!atexit_registered) { atexit(linenoiseAtExit); atexit_registered = 1; } if (tcgetattr(fd,&orig_termios) == -1) goto fatal; raw = orig_termios; /* modify the original mode */ /* input modes: no break, no CR to NL, no parity check, no strip char, * no start/stop output control. */ raw.c_iflag &= ~(BRKINT | ICRNL | INPCK | ISTRIP | IXON); /* output modes - disable post processing */ raw.c_oflag &= ~(OPOST); /* control modes - set 8 bit chars */ raw.c_cflag |= (CS8); /* local modes - choing off, canonical off, no extended functions, * no signal chars (^Z,^C) */ raw.c_lflag &= ~(ECHO | ICANON | IEXTEN | ISIG); /* control chars - set return condition: min number of bytes and timer. * We want read to return every single byte, without timeout. */ raw.c_cc[VMIN] = 1; raw.c_cc[VTIME] = 0; /* 1 byte, no timer */ /* put terminal in raw mode after flushing */ if (tcsetattr(fd,TCSAFLUSH,&raw) < 0) goto fatal; rawmode = 1; return 0; fatal: errno = ENOTTY; return -1; } ``` ### funciton linenoiseEdit 先以 function 'write' 將 prompt 輸出, 接著以 funciton `read` 一次讀入一個輸入, 其中以 switch 定義了許多輸入對應的操作。常見輸入 * 一般字元 call function `linenoiseEditInsert` 將讀入的字元接在 buffer 後面 * tab call function `completeLine` 來完成 auto complete ```c= static int linenoiseEdit(int stdin_fd, int stdout_fd, char *buf, size_t buflen, const char *prompt) { struct linenoiseState l; /* Populate the linenoise state that we pass to functions implementing * specific editing functionalities. */ l.ifd = stdin_fd; l.ofd = stdout_fd; l.buf = buf; l.buflen = buflen; l.prompt = prompt; l.plen = strlen(prompt); l.oldpos = l.pos = 0; l.len = 0; l.cols = getColumns(stdin_fd, stdout_fd); l.maxrows = 0; l.history_index = 0; /* Buffer starts empty. */ l.buf[0] = '\0'; l.buflen--; /* Make sure there is always space for the nulterm */ /* The latest history entry is always our current buffer, that * initially is just an empty string. */ linenoiseHistoryAdd(""); if (write(l.ofd,prompt,l.plen) == -1) return -1; while(1) { char c; int nread; char seq[3]; nread = read(l.ifd,&c,1); if (nread <= 0) return l.len; /* Only autocomplete when the callback is set. It returns < 0 when * there was an error reading from fd. Otherwise it will return the * character that should be handled next. */ if (c == 9 && completionCallback != NULL) { c = completeLine(&l); /* Return on errors */ if (c < 0) return l.len; /* Read next character when 0 */ if (c == 0) continue; } switch(c) { case ENTER: /* enter */ history_len--; free(history[history_len]); if (mlmode) linenoiseEditMoveEnd(&l); if (hintsCallback) { /* Force a refresh without hints to leave the previous * line as the user typed it after a newline. */ linenoiseHintsCallback *hc = hintsCallback; hintsCallback = NULL; refreshLine(&l); hintsCallback = hc; } return (int)l.len; case CTRL_C: /* ctrl-c */ errno = EAGAIN; return -1; case BACKSPACE: /* backspace */ case 8: /* ctrl-h */ linenoiseEditBackspace(&l); break; case CTRL_D: /* ctrl-d, remove char at right of cursor, or if the line is empty, act as end-of-file. */ if (l.len > 0) { linenoiseEditDelete(&l); } else { history_len--; free(history[history_len]); return -1; } break; case CTRL_T: /* ctrl-t, swaps current character with previous. */ if (l.pos > 0 && l.pos < l.len) { int aux = buf[l.pos-1]; buf[l.pos-1] = buf[l.pos]; buf[l.pos] = aux; if (l.pos != l.len-1) l.pos++; refreshLine(&l); } break; case CTRL_B: /* ctrl-b */ linenoiseEditMoveLeft(&l); break; case CTRL_F: /* ctrl-f */ linenoiseEditMoveRight(&l); break; case CTRL_P: /* ctrl-p */ linenoiseEditHistoryNext(&l, LINENOISE_HISTORY_PREV); break; case CTRL_N: /* ctrl-n */ linenoiseEditHistoryNext(&l, LINENOISE_HISTORY_NEXT); break; case ESC: /* escape sequence */ /* Read the next two bytes representing the escape sequence. * Use two calls to handle slow terminals returning the two * chars at different times. */ if (read(l.ifd,seq,1) == -1) break; if (read(l.ifd,seq+1,1) == -1) break; /* ESC [ sequences. */ if (seq[0] == '[') { if (seq[1] >= '0' && seq[1] <= '9') { /* Extended escape, read additional byte. */ if (read(l.ifd,seq+2,1) == -1) break; if (seq[2] == '~') { switch(seq[1]) { case '3': /* Delete key. */ linenoiseEditDelete(&l); break; } } } else { switch(seq[1]) { case 'A': /* Up */ linenoiseEditHistoryNext(&l, LINENOISE_HISTORY_PREV); break; case 'B': /* Down */ linenoiseEditHistoryNext(&l, LINENOISE_HISTORY_NEXT); break; case 'C': /* Right */ linenoiseEditMoveRight(&l); break; case 'D': /* Left */ linenoiseEditMoveLeft(&l); break; case 'H': /* Home */ linenoiseEditMoveHome(&l); break; case 'F': /* End*/ linenoiseEditMoveEnd(&l); break; } } } /* ESC O sequences. */ else if (seq[0] == 'O') { switch(seq[1]) { case 'H': /* Home */ linenoiseEditMoveHome(&l); break; case 'F': /* End*/ linenoiseEditMoveEnd(&l); break; } } break; default: if (linenoiseEditInsert(&l,c)) return -1; break; case CTRL_U: /* Ctrl+u, delete the whole line. */ buf[0] = '\0'; l.pos = l.len = 0; refreshLine(&l); break; case CTRL_K: /* Ctrl+k, delete from current to end of line. */ buf[l.pos] = '\0'; l.len = l.pos; refreshLine(&l); break; case CTRL_A: /* Ctrl+a, go to the start of the line */ linenoiseEditMoveHome(&l); break; case CTRL_E: /* ctrl+e, go to the end of the line */ linenoiseEditMoveEnd(&l); break; case CTRL_L: /* ctrl+l, clear screen */ linenoiseClearScreen(); refreshLine(&l); break; case CTRL_W: /* ctrl+w, delete previous word */ linenoiseEditDeletePrevWord(&l); break; } } return l.len; } ```