2019q1 Homework1 (lab0)

# 2019q1 Homework1 (lab0) contributed by < `tony2037` > ###### tags: 劉宥辰 ## Document interpretation We are about to implement a **queue**, supporting both last-in, first-out **(LIFO)** and first-in-first-out **(FIFO)** queueing disciplines. The underlying data structure is a **singly-linked list**, enhanced to make some of the operations more efficient. * There are several functions mentioned in the document: * Memory management: Functions `malloc` and `free`. * Stringoperations: Functions `strlen`,`strcpy`,and `strncpy`.(Beware of the behavior of `strncpy` when truncating strings!) (==Overflow== may happen, and actually happend to me when I called this `strncpy`) * [quote] `The final list element has its next pointer set to NULL.` * This is a crucial condition to determine how to traverse the singly-linked list. * [quote] `To store a string of length l, the array has l + 1 elements, with the first l storing the codes (typically ASCII1 format) for the characters and the final one being set to 0.` * This implies the facts that * String is an array of characters. * end with `\0` (How we know where is the end). * [quote] `In our C code, a queue is a pointer of type queue_t *. We distinguish two special cases: a NULL queue is one for which the pointer has value NULL. An empty queue is one pointing to a valid structure, but the head field has value NULL. Your code will need to deal properly with both of these cases, as well as queues containing one or more elements.` * NULL queue: queue_t *q; q = NULL * An empty queue: queue_t *q = addr; q->size = 0; q->head = NULL; * [quote] `For functions that provide strings as arguments, you must create and store a copy of the string by calling malloc to allocate space (remember to include space for the terminating character) and then copying from the source to the newly allocated space.` * This implies that we need to: * Allocate a space for a given string * Take the terminating character in consideration * [quote] `When it comes time to free a list element, you must also free the space used by the string. You cannot assume any fixed upper bound on the length of a string—you must allocate space for each string based on its length.` * Memory leak issue * [quote] `q_insert_tail and q_size will require some effort on your part to meet the required performance standards.` * `q_insert_tail` * `q_size` ## Problems that happen to me ### `q_insert_head` * Origin ```clike list_ele_t *newh; newh = malloc(sizeof(list_ele_t)); if (newh == NULL) { printf("Allocate failed\n"); return false; } /* Don't forget to allocate space for the string and copy it */ /* What if either call to malloc returns NULL? */ memset(newh, 0, sizeof(list_ele_t)); newh->next = q->head; q->head = newh; return true; ``` :::warning 注意遵守一致的程式碼縮排風格 (4 spaces 而非 tab)，在 GitHub 和 HackMD 都是 :notes: jserv ::: > Have revised > [name=Ztex] * The stastic analysis mentions that I have ```either circulor loop or infinite loop```,and `memory leak` issue. * By using `gdb` I found that: * If failure happends when i try to allocate a space for `new->value`, I should `free` `newh` * I should take care of `empty queue` * Now ```clike bool q_insert_head(queue_t *q, char *s) { ... /* Don't forget to allocate space for the string and copy it */ /* What if either call to malloc returns NULL? */ memset(newh, 0, sizeof(list_ele_t)); newh->value = malloc(sizeof(char) * strlen(s) + 1); if (newh->value == NULL) { printf("Allocate failed\n"); (+) free(newh); return false; } memset(newh->value, 0, sizeof(char) * strlen(s) + 1); strcpy(newh->value, s); (+) if (q->size == 0) { (+) newh->next = NULL; (+) q->head = newh; (+) q->size++; (+) return true; (+) } newh->next = q->head; q->head = newh; q->size++; return true; } ``` ### `q_free` * When I use `qtest` and `free` command to evaluate, it states that `3 blocks are still allocated` ```shell $>./qtest cmd> new [] cmd> ih ttt [ttt] cmd> ih yyy [yyy, ttt] cmd> ih zzz [zzz, yyy, ttt] cmd> free Freeing queue ERROR: Freed queue, but 3 blocks are still allocated ``` * By using `gdb`, I was sure that `char* value` the spaces had been freed. * There were actual 3 elements in queue. I then realize that its because I forgot to free element's pointers. :::danger 文字訊息「不要」用圖片去表示，考量點: 1. 後續搜尋和文字擷取會很困難; 2. 有些檢視者可能有視覺障礙 (有全盲生參與本系列課程)，他們無法完整地閱讀共筆，進而充分討論，請尊重他們的權益; :notes: jserv ::: > Copy that. Have removed the image, and will make sure the mistake will not happen again. > [name=Ztex] * Revise ```clike= void q_free(queue_t *q) { /* How about freeing the list elements and the strings? */ /* Free queue structure */ if (q == NULL) { return; } list_ele_t *tmp = q->head; (+) list_ele_t *prev = q->head; while (tmp != NULL) { free(tmp->value); (+) prev = tmp; tmp = tmp->next; (+) free(prev); } free(q); } ``` ## Result ==港動的不得了 QAQ== **The following picture is the result states having pass the test** ![](https://i.imgur.com/HMRrFrK.png) ## How the scoring system works? * Firstly, take a look at `Makefile` [Big shot Jserv teachs Makefile](https://hackmd.io/c/rJKbX1pFZ/https%3A%2F%2Fhackmd.io%2Fs%2FSySTMXPvl) ```shell CC = gcc CFLAGS = -O0 -g -Wall -Werror GIT_HOOKS := .git/hooks/applied all: $(GIT_HOOKS) qtest $(GIT_HOOKS): @scripts/install-git-hooks @echo queue.o: queue.c queue.h harness.h $(CC) $(CFLAGS) -c queue.c qtest: qtest.c report.c console.c harness.c queue.o $(CC) $(CFLAGS) -o qtest qtest.c report.c console.c harness.c queue.o test: qtest scripts/driver.py scripts/driver.py clean: rm -f *.o *~ qtest rm -rf *.dSYM (cd traces; rm -f *~) ``` `test: qtest scripts/driver.py` implies that we should check `scripts/driver.py` out * scripts/driver.py ```python if __name__ == "__main__": run(sys.argv[0], sys.argv[1:]) ``` so I use `pdb` to check out what's the content of `sys.argv` ```shell $>pdb scripts/driver.py -> import subprocess (Pdb) b 148 Breakpoint 1 at /home/e24056310/lab0-c/scripts/driver.py:148 (Pdb) r > /home/e24056310/lab0-c/scripts/driver.py(148)<module>() -> run(sys.argv[0], sys.argv[1:]) (Pdb) p sys.argv ['scripts/driver.py'] ``` humm ... ok let's check out `run` ```python def run(name, args): ... optlist, args = getopt.getopt(args, 'hp:t:v:A') for (opt, val) in optlist: if opt == '-h': usage(name) elif opt == '-p': prog = val elif opt == '-t': tid = int(val) elif opt == '-v': vlevel = int(val) levelFixed = True elif opt == '-A': autograde = True else: print("Unrecognized option '%s'" % opt) usage(name) if not levelFixed and autograde: vlevel = 0 t = Tracer(qtest = prog, verbLevel = vlevel, autograde = autograde) t.run(tid) ``` Looks like just some parameters handlings, nothing special here, guess `Tracer` is what we're looking for. Before we dig in `Tracer.run`, some variables belonged to this object catch my attention. ```python class Tracer: traceDirectory = "./traces" qtest = "./qtest" ... traceDict = { 1 : "trace-01-ops", 2 : "trace-02-ops", 3 : "trace-03-ops", 4 : "trace-04-ops", 5 : "trace-05-ops", 6 : "trace-06-string", 7 : "trace-07-robust", 8 : "trace-08-robust", 9 : "trace-09-robust", 10 : "trace-10-malloc", 11 : "trace-11-malloc", 12 : "trace-12-malloc", 13 : "trace-13-perf", 14 : "trace-14-perf", 15 : "trace-15-perf" } traceProbs = { 1 : "Trace-01", 2 : "Trace-02", 3 : "Trace-03", 4 : "Trace-04", 5 : "Trace-05", 6 : "Trace-06", 7 : "Trace-07", 8 : "Trace-08", 9 : "Trace-09", 10 : "Trace-10", 11 : "Trace-11", 12 : "Trace-12", 13 : "Trace-13", 14 : "Trace-14", 15 : "Trace-15" } maxScores = [0, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7] ``` I check out `traces` directory as the result ```shell $> ls traces trace-01-ops.cmd trace-03-ops.cmd trace-05-ops.cmd trace-07-robust.cmd trace-09-robust.cmd trace-11-malloc.cmd trace-13-perf.cmd trace-15-perf.cmd trace-02-ops.cmd trace-04-ops.cmd trace-06-string.cmd ... $> cat traces/trace-01-ops.cmd # Test of insert_head and remove_head option fail 0 option malloc 0 new ih gerbil ih bear ih dolphin rh dolphin rh bear rh gerbil ``` look like scripts used for `./qtest`, they must be. ```shell $> pdb scripts/driver.py > /home/e24056310/lab0-c/scripts/driver.py(3)<module>() -> import subprocess (Pdb) b 77 Breakpoint 1 at /home/e24056310/lab0-c/scripts/driver.py:77 (Pdb) b 87 Breakpoint 2 at /home/e24056310/lab0-c/scripts/driver.py:87 (Pdb) b 67 Breakpoint 3 at /home/e24056310/lab0-c/scripts/driver.py:67 (Pdb) r --- Trace Points > /home/e24056310/lab0-c/scripts/driver.py(77)run() -> if tid == 0: (Pdb) p scoreDict {1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0, 13: 0, 14: 0, 15: 0} (Pdb) r > /home/e24056310/lab0-c/scripts/driver.py(87)run() -> tname = self.traceDict[t] (Pdb) n > /home/e24056310/lab0-c/scripts/driver.py(88)run() -> if self.verbLevel > 0: (Pdb) p tname 'trace-01-ops' (Pdb) r +++ TESTING trace trace-01-ops: > /home/e24056310/lab0-c/scripts/driver.py(67)runTrace() -> try: (Pdb) p clist ['./qtest', '-v', '1', '-f', './traces/trace-01-ops.cmd'] (Pdb) p fname './traces/trace-01-ops.cmd' (Pdb) p vname '1' (Pdb) n > /home/e24056310/lab0-c/scripts/driver.py(68)runTrace() -> retcode = subprocess.call(clist) (Pdb) n # Test of insert_head and remove_head > /home/e24056310/lab0-c/scripts/driver.py(72)runTrace() -> return retcode == 0 (Pdb) ``` In the `Tracer.run` section, two variables and one function catch my attention, `scoreDict`, `tname` and `self.runTrace`. I set up breakpoints on line ==77== ==87== ==67==. So I figure it out: 1. `scoreDict` used for recording scores in different `Traces` `{1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0, 13: 0, 14: 0, 15: 0}` 2. `tname` is the test working on `'trace-01-ops'` 3. `subprocess.call` is used for running the command described by args. ==Wait for command to complete==, then ==return the returncode attribute== (I quote the document, the reference link is below). [https://docs.python.org/2/library/subprocess.html](https://docs.python.org/2/library/subprocess.html) Here is an example what its parameters look like ```python (Pdb) p clist ['./qtest', '-v', '1', '-f', './traces/trace-01-ops.cmd'] ``` 4. At the end `Tracer.runTrace` ```shell Pdb) n # Test of insert_head and remove_head > /home/e24056310/lab0-c/scripts/driver.py(72)runTrace() -> return retcode == 0 ``` 5. look back to `Tracer.run` ```python ok = self.runTrace(t) maxval = self.maxScores[t] tval = maxval if ok else 0 print("---\t%s\t%d/%d" % (tname, tval, maxval)) score += tval maxscore += maxval scoreDict[t] = tval ``` Humm~ * Conclusion * Read the script files in `Traces` directory * Deal with parameters * By calling `subprocess.call` to execute `qtest` and get the retcode * If success then get points * Add it up to get the final score :::info > Q: How does the scoring system remind me what kind of problem I probably have ? > [name=Ztex] ::: :::info > I guess the reminder part is in `harness.h` > ```clike > /* > Return whether any errors have occurred since last time checked > */ > bool error_check(); > ``` > [name=Ztex] ::: ## Microscope `qtest` * Start with `Makefile` ```shell CC = gcc CFLAGS = -O0 -g -Wall -Werror GIT_HOOKS := .git/hooks/applied all: $(GIT_HOOKS) qtest $(GIT_HOOKS): @scripts/install-git-hooks @echo queue.o: queue.c queue.h harness.h $(CC) $(CFLAGS) -c queue.c qtest: qtest.c report.c console.c harness.c queue.o $(CC) $(CFLAGS) -o qtest qtest.c report.c console.c harness.c queue.o test: qtest scripts/driver.py scripts/driver.py clean: rm -f *.o *~ qtest rm -rf *.dSYM (cd traces; rm -f *~) ``` `Line 11` and `Line 14` implies `qtest` depends on several files. `queue.o` and `qtest.c` catch my attention, because `queue.o` depends on `queue.c` and `queue.h` that I've been working on them, and `qtest.c` is the first dependency. * `main` in `qtest.c` ```clike int main(int argc, char *argv[]) { /* To hold input file name */ char buf[BUFSIZE]; char *infile_name = NULL; char lbuf[BUFSIZE]; char *logfile_name = NULL; int level = 4; int c; while ((c = getopt(argc, argv, "hv:f:l:")) != -1) { switch (c) { case 'h': usage(argv[0]); break; case 'f': strncpy(buf, optarg, BUFSIZE); buf[BUFSIZE - 1] = '\0'; infile_name = buf; break; case 'v': level = atoi(optarg); break; case 'l': strncpy(lbuf, optarg, BUFSIZE); buf[BUFSIZE - 1] = '\0'; logfile_name = lbuf; break; default: printf("Unknown option '%c'\n", c); usage(argv[0]); break; } } queue_init(); init_cmd(); console_init(); set_verblevel(level); if (level > 1) { set_echo(true); } if (logfile_name) set_logfile(logfile_name); add_quit_helper(queue_quit); bool ok = true; ok = ok && run_console(infile_name); ok = ok && finish_cmd(); return ok ? 0 : 1; } ``` Look at `Line 46` and `Line 47`, `run_console` and `finish_cmd` may be critical as them determine `ok` ```shell (gdb) b run_console (gdb) b finish_cmd Breakpoint 2 at 0x4049c6: file console.c, line 628. (gdb) r Starting program: /home/e24056310/lab0-c/qtest Breakpoint 1, run_console (infile_name=0x0) at console.c:637 (gdb) p infile_name $1 = 0x0 (gdb) p *infile_name Cannot access memory at address 0x0 (gdb) n (gdb) n cmd> cmd> ``` uhh ... I guess this function operates literally like its name ... take care of `console` ```shell cmd>help Commands: # ... | Display comment free | Delete queue help | Show documentation 1 85 ih str [n] | Insert string str at head of queue n times (default: n == 1) it str [n] | Insert string str at tail of queue n times (default: n == 1) log file | Copy output to file new | Create new queue option [name val] | Display or set options quit | Exit program reverse | Reverse queue ... ``` It should be the function, `cmd_select`, taking care of how commands work. * `cmd_select` ```shell (gdb) n (gdb) p cmdline $1 = 0x608800 <linebuf> "new\n" (gdb) n (gdb) s interpret_cmd (cmdline=0x608800 <linebuf> "new\n") at console.c:231 (gdb) n (gdb) p cmdline $2 = 0x608800 <linebuf> "new\n" ``` `step` `into interpret_cmd` , because it looks like a part taking care of executing command. ```shell (gdb) p *argv $6 = 0x60d8a0 "new" (gdb) s interpret_cmda (argc=1, argv=0x60d880) at console.c:210 ``` `step` into `interpret_cmda` ... I found that given command `new`, the process will go to the function `do_new` after `step`ping into `next_cmd->operation` :::info > I'm not sure here! should I say `go to` ==`branch`== `do_new` **or** `go to` ==`function`== `do_new` ?????? > [name=Ztex] ::: The message below show a snapshot of `gdb` ```clike static bool interpret_cmda(int argc, char *argv[]) │ │209 { │ >│210 if (argc == 0) │ │211 return true;atoi(optarg); │ │212 /* Try to find matching command */ │ │213 cmd_ptr next_cmd = cmd_list; │ │214 bool ok = true;(lbuf, optarg, BUFSIZE); │ │215 while (next_cmd && strcmp(argv[0], next_cmd->name) != 0) │ │216 next_cmd = next_cmd->next; │ │217 if (next_cmd) { │ │218 ok = next_cmd->operation(argc, argv); │ │219 if (!ok)tf("Unknown option '%c'\n", c); │ │220 record_error(); │ │221 } else { │ │222 report(1, "Unknown command '%s'", argv[0]); │ │223 record_error(); │ │224 ueuok = false; │ │225 } │ │226 return ok; │ │227 } ``` intuitively, I should check out the structure `cmd_ptr` [How to locate where a type and variable is first declared with gdb ](https://stackoverflow.com/questions/21022354/how-to-locate-where-a-type-and-variable-is-first-declared-with-gdb) ```shell (gdb) info types cmd_ptr All types matching regular expression "cmd_ptr": File console.h: typedef struct CELE * cmd_ptr; ``` * `CELE` `cmd_ptr` ```clike typedef bool (*cmd_function)(int argc, char *argv[]); /* Information about each command */ /* Organized as linked list in alphabetical order */ typedef struct CELE cmd_ele, *cmd_ptr; struct CELE { char *name; cmd_function operation; char *documentation; cmd_ptr next; }; ``` `cmd_function` is a function pointer which takes an `int` argument `agrv`, and a `pointer to char array` and return `bool` :::info I'm not sure about the explanation above. I remember there is a toolkit released aims to explain this. 我記得有一個工具可以解釋C的宣告. 好像是GNU toolkit chain裡面的樣子？但我忘記名字了 > [name=Ztex] ::: :::warning [cdecl](https://cdecl.org/) is your friend. :notes: jserv ::: > `declare cmd_function as pointer to function (int, array of pointer to char) returning bool` > THX Teacher. > [name=Ztex] > ```shell= (gdb) p next_cmd->name $1 = 0x4062a2 "#" (gdb) p next_cmd->documentation $2 = 0x406280 " ...", ' ' <repeats 12 times>, "| Display comment" (gdb) p next_cmd->operation $3 = (cmd_function) 0x403d78 <do_comment_cmd> (gdb) n (gdb) n (gdb) p next_cmd->name $1 = 0x4062a2 "#" (gdb) p next_cmd->documentation $2 = 0x406280 " ...", ' ' <repeats 12 times>, "| Display comment" (gdb) p next_cmd->operation $3 = (cmd_function) 0x403d78 <do_comment_cmd> (gdb) b do_comment_cmd Breakpoint 2 at 0x403d87: file console.c, line 316. (gdb) p next_cmd->next->operatoin There is no member named operatoin. (gdb) p next_cmd->next $4 = (cmd_ptr) 0x60b220 (gdb) p next_cmd->next->operation $5 = (cmd_function) 0x401241 <do_free> (gdb) b do_free Breakpoint 3 at 0x401250: file qtest.c, line 117. (gdb) (gdb) p next_cmd->next->next $6 = (cmd_ptr) 0x60b010 (gdb) p next_cmd->next->next->operation $7 = (cmd_function) 0x403c92 <do_help_cmd> (gdb) p next_cmd->next->next->next->operation $8 = (cmd_function) 0x401378 <do_insert_head> (gdb) p next_cmd->next->next->next->next->operation $9 = (cmd_function) 0x401635 <do_insert_tail> (gdb) b do_insert_head Breakpoint 4 at 0x401387: file qtest.c, line 144. (gdb) b do_insert_tail Breakpoint 5 at 0x401644: file qtest.c, line 206. ``` ==Assumption== :::info $\because$ `next_cmd->operation = <do_comment_cmd>` `next_cmd->next->operation = <do_free>` `next_cmd->->next->next->operation = <do_help_cmd>` `next_cmd->next->next->next->operation = <do_insert_head>` $\therefore$ I assume that `CELE` structure is a `singly-linked-list` elements. Linking these elements together to have a `singly-linked list`, which stands for a series of work respect to corresponding command. eg. the example above is a singly-linked list for command `new`. ::: ==Verify the assumtion== > Honestly, I'm confused about how to execute verification. > Intuitively, I'll probably try to trace out a complete linked list to a specific command (like `new` command for example.). Try to figure out what the pragram does when I `new` a `queue`. :::info ~~But I'm curious about what shoud I do to verify my assumption more effectively and strictly.~~ ~~求救, 我不確定怎麼更有效率且嚴謹地去驗證我的假設. 現在我所想到的是直覺地嘗試追蹤完整條linked list進而了解在對應的指令下CELE linked list 做了哪些操作.~~ ::: > [name=Ztex] > :::info ```clike= typedef bool (*cmd_function)(int argc, char *argv[]); /* Information about each command */ /* Organized as linked list in alphabetical order */ typedef struct CELE cmd_ele, *cmd_ptr; struct CELE { char *name; cmd_function operation; char *documentation; cmd_ptr next; }; ``` 1. 以上是一個定義在"console.h"名為CELE的==singly-linked list== element 2. 經過**gdb**追蹤(給定**new** command), 我發現在執行**new**這個command時, 一個由CELE組成的singly-linked list 被實例化(instance 我不確定這邊的用語正不正確)出來 3. 透過**gdb**發現每個element的**operation** function pointer 指向用於操作queue的functions(eg.**do_free**, **do_insert_head** ) 4. 問題: 1. 假設這個由CELE組成的singly-linked list, 是對應特定的指令實例化出來的(也就是**new** 的 linked-list 跟 **show** 的linked-list 是不一樣的), 如何有效率且嚴謹的驗證假設? * 目前想到的辦法是給定**new**, **show** command在特過**gdb** 來看對應的linked-list是否相同. 2. 假設上述假設為真，如何有效率的追蹤一條linked-list 如何被實例化出來? * 目前想到的辦法是找出實例化出來的地方, 然後把相關的functions 從頭到尾code review一次 ::: :::danger 先改善你的語文表達和提問，請參見 [提問的智慧](https://github.com/ryanhanwu/How-To-Ask-Questions-The-Smart-Way) :notes: jserv ::: * `cmd_list` In `console.c`, a global static `cmd_ptr` type variable named `cmd_list` is declared. ```clike static cmd_ptr cmd_list = NULL; ``` when get into console, `init_cmd` function is called to do initialization. ```clike= /* Initialize interpreter */ void init_cmd() { cmd_list = NULL; param_list = NULL; err_cnt = 0; quit_flag = false; add_cmd("help", do_help_cmd, " | Show documentation"); add_cmd("option", do_option_cmd, " [name val] | Display or set options"); add_cmd("quit", do_quit_cmd, " | Exit program"); add_cmd("source", do_source_cmd, " file | Read commands from source file"); add_cmd("log", do_log_cmd, " file | Copy output to file"); add_cmd("time", do_time_cmd, " cmd arg ... | Time command execution"); add_cmd("#", do_comment_cmd, " ... | Display comment"); add_param("verbose", &verblevel, "Verbosity level", NULL); add_param("error", &err_limit, "Number of errors until exit", NULL); add_param("echo", &echo, "Do/don't echo commands", NULL); #if 0 add_param("megabytes", &mblimit, "Maximum megabytes allowed", NULL); add_param("seconds", &timelimit, "Maximum seconds allowed", change_timeout); #endif init_in(); init_time(&last_time); first_time = last_time; } ``` `add_cmd` aims to add new command. Sveral fiexed commands are appended in the singly-linked list initially. (`help`, `option`, `quit`, `source` ... so on) ```clike= /* Add a new command */ void add_cmd(char *name, cmd_function operation, char *documentation) { cmd_ptr next_cmd = cmd_list; cmd_ptr *last_loc = &cmd_list; while (next_cmd && strcmp(name, next_cmd->name) > 0) { last_loc = &next_cmd->next; next_cmd = next_cmd->next; } cmd_ptr ele = (cmd_ptr) malloc_or_fail(sizeof(cmd_ele), "add_cmd"); ele->name = name; ele->operation = operation; ele->documentation = documentation; ele->next = next_cmd; *last_loc = ele; } ``` the code above implies that a new `cmd_ptr` is created according to given arguments and appended to the **global** singly-linked list `cmd_list`. The information above explains why I got same elements in the front part of `cmd_list` everytimes, even giving different command. ::: :::danger 又陳好率 :notes:神祕人 :::