# 2019q1 Homework1 (lab0)
contributed by <`johnnylord` >
###### tags: `sysprog2019`
## Implementation of queue
### Design of the queue data structure
As the docs said, `q_insert_tail` and `q_size` should operate in time ==O(1)==, I need to **add additional field** to track the tail and the size of queue.
```clike
/* Queue structure */
typedef struct {
list_ele_t *head; // Track the head of queue
list_ele_t *tail; // Track the tail of queue
unsigned int size; // Track the size of queue
} queue_t;
```
### q_new()
Create a new, empty queue.
```clike
queue_t *q_new()
{
queue_t *q = malloc(sizeof(queue_t));
// If there is no free space for malloc
if (!q) {
return NULL;
}
q->head = NULL;
q->tail = NULL;
q->size = 0;
return q;
}
```
When construct or new a brand new queue, **make sure you initialize all the fields in it**. Otherwise, there is a big chance that other functions(e.g `q_insert_head`, `q_insert_tail`, etc) might use the wrong value of these fields.
### q_free()
Free all storage used by a queue
```clike
void q_free(queue_t *q)
{
list_ele_t *last = NULL;
// Check if q can be freed or not
if (!q) {
return;
}
/* Free all the list elements, including their strings */
for (list_ele_t *ptr = q->head; ptr != NULL; last = ptr, ptr = ptr->next) {
if (last) {
free(last);
}
free(ptr->value);
}
// Free the last element left by above loop if exist
if (last) {
free(last);
}
free(q);
}
```
As **the space of the string** of each list element is allocated from the **heap**, I have to free the space of string **before** freeing the list element. Otherwise, there will be lots of **memory leak** in the heap space.
### q_insert_head()
Attempt to insert a new element at the head of the queue (LIFO discipline).
```clike
bool q_insert_head(queue_t *q, char *s)
{
list_ele_t *newh;
char *value;
// Check if q is valid queue or not
if (!q) {
return false;
}
// Allocate memory for new element
if (!(newh = malloc(sizeof(list_ele_t)))) {
return false;
}
if (!(value = malloc(sizeof(char) * (strlen(s) + 1)))) {
free(newh);
return false;
}
// copy string into value
strcpy(value, s);
newh->value = value;
// Update linked list
newh->next = q->head;
q->head = newh;
if (!q_size(q)) {
q->tail = newh;
}
q->size++;
return true;
}
```
Thanks to the static analysis feature provided by `cppcheck`, it helps me find sections that cause memory leak. Because I've already allocated some space at Line 12 in the source code, I need to free the space if there is any exception occurs in the following section.
```clike=16
if (!(value = malloc(sizeof(char) * (strlen(s) + 1)))) {
free(newh);
return false;
}
```
### q_insert_tail()
Attempt to insert a new element at the tail of the queue.
```clike
bool q_insert_tail(queue_t *q, char *s)
{
list_ele_t *newh;
char *value;
// Check if q is valid queue or not
if (!q) {
return false;
}
// Allocate memory for new element
if (!(newh = malloc(sizeof(list_ele_t)))) {
return false;
}
if (!(value = malloc(sizeof(char) * (strlen(s) + 1)))) {
free(newh);
return false;
}
// copy string into value
strcpy(value, s);
newh->value = value;
// Update linked list
newh->next = NULL;
if (!q_size(q)) {
q->head = q->tail = newh;
} else {
q->tail->next = newh;
q->tail = newh;
}
q->size++;
return true;
}
```
There is no big difference between `q_insert_tail` and `q_insert_head`. Just be careful when dealing with the update of the linked list. As I've kept the information of tail of the queue in `q->tail`, I can append the new element directly next to `q->tail` without traversing the queue all the way from `q->head` to the end of the queue.
### q_remove_head()
Attempt to remove the element at the head of the queue.
```clike
bool q_remove_head(queue_t *q, char *sp, size_t bufsize)
{
list_ele_t *remove_ele;
if (!q || !q_size(q)) {
return false;
}
// Update linked list
remove_ele = q->head;
q->head = q->head->next;
// Copy removed element's value to sp
if (sp) {
strncpy(sp, remove_ele->value, bufsize - 1);
sp[bufsize - 1] = '\0';
}
// Free the removed element
free(remove_ele->value);
free(remove_ele);
q->size--;
return true;
}
```
Just write the code as the docs said. Easy~
### q_size()
Compute the number of elements in the queue.
```clike
int q_size(queue_t *q)
{
if (!q) {
return 0;
}
return q->size;
}
```
As I keep the `size` information in the queue data structure itself, I don't have to traverse the queue and count the element. Just return the size information of the queue. Easiest part~~
**However, I think the return type of this function should be `unsigned int`. The size of the queue won't be negative.**
### q_reverse()
Reorder the list so that the queue elements are reversed in order.
```clike
void q_reverse(queue_t *q)
{
list_ele_t *last, *ptr, *next, *tmp;
if (!q || !q_size(q)) {
return;
}
// Reverse the linked-list order
last = NULL;
ptr = q->head;
next = q->head->next;
while (ptr != NULL) {
ptr->next = last;
last = ptr;
ptr = next;
if (next) {
next = next->next;
}
}
// Update the head and tail fields of the q
tmp = q->head;
q->head = q->tail;
q->tail = tmp;
return;
}
```
## How qtest works
Here is the entry point of `qtest`. First of all, do associated action based on the options and arguments user type in.
```clike
#define BUFSIZE 256
int main(int argc, char *argv[])
{
/* To hold input file name */
char buf[BUFSIZE];
char *infile_name = NULL;
char lbuf[BUFSIZE];
char *logfile_name = NULL;
int level = 4;
int c;
while ((c = getopt(argc, argv, "hv:f:l:")) != -1) {
switch (c) {
case 'h':
usage(argv[0]);
break;
case 'f':
strncpy(buf, optarg, BUFSIZE);
buf[BUFSIZE - 1] = '\0';
infile_name = buf;
break;
case 'v':
level = atoi(optarg);
break;
case 'l':
strncpy(lbuf, optarg, BUFSIZE);
buf[BUFSIZE - 1] = '\0';
logfile_name = lbuf;
break;
default:
printf("Unknown option '%c'\n", c);
usage(argv[0]);
break;
}
}
// Skip ...
}
```
> Reference
> `$ man 3 getopt`
Then, it does some essential initialization(e.g queue, console, etc).
```clike
int main(int argc, char *argv[]) {
// Skip ...
queue_init();
init_cmd();
console_init();
// Skip ...
}
```
* **queue_init()**
Initialize the essential global variable for queue and register two signal handlers.
```clike
/* Signal handlers */
void sigsegvhandler(int sig)
{
trigger_exception(
"Segmentation fault occurred. You dereferenced a NULL or invalid "
"pointer");
}
void sigalrmhandler(int sig)
{
trigger_exception(
"Time limit exceeded. Either you are in an infinite loop, or your "
"code is too inefficient");
}
static void queue_init()
{
fail_count = 0;
q = NULL;
signal(SIGSEGV, sigsegvhandler);
signal(SIGALRM, sigalrmhandler);
}
```
> Use `sigaction` instead of `signal` as the docs said below
:::danger
Portability
The only portable use of signal() is to set a signal's disposition to SIG_DFL or SIG_IGN.
**The semantics when using signal() to establish a signal handler vary across systems** (and POSIX.1 explicitly permits this variation); do not use it for this purpose.
**POSIX.1 solved the portability mess by specifying sigaction(2)**, which provides explicit control of the semantics when a signal handler is invoked; use that interface instead ofsignal().
> Reference
> `man 2 signal` > Section 'Portability'
:::
* **init_cmd()**
Setup the commands that can be used in the console.
```clike
// In console.c
void init_cmd()
{
cmd_list = NULL;
param_list = NULL;
err_cnt = 0;
quit_flag = false;
add_cmd("help", do_help_cmd, " | Show documentation");
add_cmd("option", do_option_cmd,
" [name val] | Display or set options");
add_cmd("quit", do_quit_cmd, " | Exit program");
add_cmd("source", do_source_cmd,
" file | Read commands from source file");
add_cmd("log", do_log_cmd, " file | Copy output to file");
add_cmd("time", do_time_cmd, " cmd arg ... | Time command execution");
add_cmd("#", do_comment_cmd, " ... | Display comment");
add_param("verbose", &verblevel, "Verbosity level", NULL);
add_param("error", &err_limit, "Number of errors until exit", NULL);
add_param("echo", &echo, "Do/don't echo commands", NULL);
#if 0
add_param("megabytes", &mblimit, "Maximum megabytes allowed", NULL);
add_param("seconds", &timelimit, "Maximum seconds allowed",
change_timeout);
#endif
init_in();
init_time(&last_time);
first_time = last_time;
}
```
Here's the data type of some variables used in `init_cmd`. As they are all `static` variable, these variables are only visible in `console.c`.
```clike
// In console.c
static cmd_ptr cmd_list = NULL;
static param_ptr param_list = NULL;
static int err_cnt = 0;
static bool quit_flag = false;
```
```clike
// In console.h
/* Implementation of simple command-line interface */
/* Each command defined in terms of a function */
typedef bool (*cmd_function)(int argc, char *argv[]);
/* Information about each command */
/* Organized as linked list in alphabetical order */
typedef struct CELE cmd_ele, *cmd_ptr;
struct CELE {
char *name;
cmd_function operation;
char *documentation;
cmd_ptr next;
};
```
We can see that the type of `cmd_list` is `cmd_ptr`. And `cmd_ptr` is an alias of `struct CELE*`. When look into the `struct CELE` further, it's a one way linked-list, and each element stores the information of its command metadata(e.g `name` and `documentation`) and its associated **function's address**.
> Here is the circumstance that we can do sorting on the linked-list(if the `cmd_list` is out of order), as the `cmd_list` is sorted in alphabetical order. ==Jserv teacher asked student, "Why we need to do sorting on linked-list?"==
Let's take a look at how `add_cmd` work...
```clike
/* Add a new command */
void add_cmd(char *name, cmd_function operation, char *documentation)
{
cmd_ptr next_cmd = cmd_list;
cmd_ptr *last_loc = &cmd_list;
while (next_cmd && strcmp(name, next_cmd->name) > 0) {
last_loc = &next_cmd->next;
next_cmd = next_cmd->next;
}
cmd_ptr ele = (cmd_ptr) malloc_or_fail(sizeof(cmd_ele), "add_cmd");
ele->name = name;
ele->operation = operation;
ele->documentation = documentation;
ele->next = next_cmd;
*last_loc = ele;
}
```
Here is a interesting part, the function use double pointer, `cmd_ptr*`, to keep track of the last location(either the address of **head of the list** or the address of `next` field inside the structure). **In this way, the code can be written in a more general way**.
If I implements this, I might write the code in this way... I have to use `if` statement to determine the last location, and take associated action. (Not general)
```clike
void add_cmd(char *name, cmd_function operation, char *documentation)
{
cmd_ptr prev = NULL;
cmd_ptr ptr = cmd_list;
while (ptr && strcmp(name, ptr->name) > 0) {
prev = ptr;
ptr = ptr->next;
}
cmd_ptr ele = (cmd_ptr) malloc_or_fail(sizeof(cmd_ele), "add_cmd");
ele->name = name;
ele->operation = operation;
ele->documentation = documentation;
ele->next = ptr;
if (pre)
pre->next = ele;
else
cmd_list = ele;
}
```
### Console section
The technique that it used to deal with the I/O is something called [RIO(Robust I/O)](https://hackmd.io/s/H1TtmVTTz). RIO is a buffered IO. One of the advantage of buffered I/O is that it can decrease the number of time it calls system call(e.g read, write, etc)
Here is a simple demo that show the difference when we used buffered I/O and unbuffered I/O.
* **buffered version**
```clike
// buffered.c
# include <stdio.h>
int main(int argc, char *argv[])
{
/* Buffered input */
printf("a..");
printf("b..");
printf("c..");
printf("d..");
fflush(NULL); // or use printf("\n");
return 0;
}
```
```
$ gcc -o buffered buffered.c
$ strace -e write ./buffered
write(1, "a..b..c..d..", 12a..b..c..d..) = 12
```
* **non-buffered version**
```clike
// nonbuffered.c
# include <stdio.h>
int main(int argc, char *argv[])
{
/* nonbuffered input */
printf("a..\n");
printf("b..\n");
printf("c..\n");
printf("d..\n");
return 0;
}
```
```
$ gcc -o nonbuffered nonbuffered.c
$ strace -e write ./nonbuffered
write(1, "a..\n", 4a..
) = 4
write(1, "b..\n", 4b..
) = 4
write(1, "c..\n", 4c..
) = 4
write(1, "d..\n", 4d..
) = 4
```
I use `strace` tool to analyze the system call the program calls. If I don't buffered the input, like `nonbuffered.c`, there will be lots of I/O system call.
> The more system calls, the more context-switch. And in consequence, slow down the program.
Here is the data structure of RIO.
```clike
#define RIO_BUFSIZE 8192
typedef struct RIO_ELE rio_t, *rio_ptr;
struct RIO_ELE {
int fd; /* File descriptor */
int cnt; /* Unread bytes in internal buffer */
char *bufptr; /* Next unread byte in internal buffer */
char buf[RIO_BUFSIZE]; /* Internal buffer */
rio_ptr prev; /* Next element in stack */
};
rio_ptr buf_stack;
```
In the process, all the opened test files will have their own RIO data structure. `fd` will keep the file descriptor of the file, and `buf` is the buffer for its file.
And here is how the program utilize this data structure
```clike=
/* Read command from input file.
When hit EOF, close that file and return NULL
*/
static char *readline()
{
int cnt;
char c;
char *lptr = linebuf;
if (buf_stack == NULL)
return NULL;
for (cnt = 0; cnt < RIO_BUFSIZE - 2; cnt++) {
if (buf_stack->cnt <= 0) {
/* Need to read from input file */
buf_stack->cnt = read(buf_stack->fd, buf_stack->buf, RIO_BUFSIZE);
buf_stack->bufptr = buf_stack->buf;
if (buf_stack->cnt <= 0) {
/* Encountered EOF */
pop_file();
if (cnt > 0) {
/* Last line of file did not terminate with newline. */
/* Terminate line & return it */
*lptr++ = '\n';
*lptr++ = '\0';
if (echo) {
report_noreturn(1, prompt);
report_noreturn(1, linebuf);
}
return linebuf;
} else
return NULL;
}
}
/* Have text in buffer */
c = *buf_stack->bufptr++;
*lptr++ = c;
buf_stack->cnt--;
if (c == '\n')
break;
}
if (c != '\n') {
/* Hit buffer limit. Artificially terminate line */
*lptr++ = '\n';
}
*lptr++ = '\0';
if (echo) {
report_noreturn(1, prompt);
report_noreturn(1, linebuf);
}
return linebuf;
}
```
Basically, this function will contiuously read content from the file until it encounter `\n` or reach `EOF`...
```clike=15
/* Need to read from input file */
buf_stack->cnt = read(buf_stack->fd, buf_stack->buf, RIO_BUFSIZE);
buf_stack->bufptr = buf_stack->buf;
```
And then it take out each character in the buffered, and process then.
```clike
static char *readline()
{
int cnt;
char c;
char *lptr = linebuf;
if (buf_stack == NULL)
return NULL;
for (cnt = 0; cnt < RIO_BUFSIZE - 2; cnt++) {
if (buf_stack->cnt <= 0) {
/* Need to read from input file */
buf_stack->cnt = read(buf_stack->fd, buf_stack->buf, RIO_BUFSIZE);
buf_stack->bufptr = buf_stack->buf;
// skip ...
}
/* Have text in buffer */
c = *buf_stack->bufptr++;
*lptr++ = c;
buf_stack->cnt--;
if (c == '\n')
break;
}
// Skip ...
return linebuf;
}
```
After reading a line, we need to interpret the line. The process will then call some functions in order `...` -> `parse_args` -> `interpret_cmda` -> `...`.
The `parse-args` will transform the line into a list of arguments. And send the number of arguments(`argc`) and the list of arguments(`argv`) to `interpret_cmda` for further processing.
Inside `interpret_cmda`...
```clike=
/* Execute a command that has already been split into arguments */
static bool interpret_cmda(int argc, char *argv[])
{
if (argc == 0)
return true;
/* Try to find matching command */
cmd_ptr next_cmd = cmd_list;
bool ok = true;
while (next_cmd && strcmp(argv[0], next_cmd->name) != 0)
next_cmd = next_cmd->next;
if (next_cmd) {
ok = next_cmd->operation(argc, argv);
if (!ok)
record_error();
} else {
report(1, "Unknown command '%s'", argv[0]);
record_error();
ok = false;
}
return ok;
}
```
We can see that it will try to find the function named `argv[0]` in the `cmd_list`, and call the function. If the return value or the value of `ok` is not true, then it will update the global variable `err_cnt` in `record_error()`.
```clike=7
while (next_cmd && strcmp(argv[0], next_cmd->name) != 0)
next_cmd = next_cmd->next;
if (next_cmd) {
ok = next_cmd->operation(argc, argv);
if (!ok)
record_error();
}
```
```clike
void record_error()
{
err_cnt++;
if (err_cnt >= err_limit) {
report(1, "Error limit exceeded. Stopping command execution");
quit_flag = true;
}
}
```
Then it will check if there is any error(`err_cnt` should be `0`) during the process and return associated value.
```clike
bool run_console(char *infile_name)
{
if (!push_file(infile_name)) {
report(1, "ERROR: Could not open source file '%s'", infile_name);
return false;
}
while (!cmd_done()) {
cmd_select(0, NULL, NULL, NULL, NULL);
}
return err_cnt == 0;
}
```
Finally, the process return `1` if there is any error that occurs during the process, and `0` otherwise.
```clike
#define BUFSIZE 256
int main(int argc, char *argv[])
{
/* To hold input file name */
char buf[BUFSIZE];
char *infile_name = NULL;
char lbuf[BUFSIZE];
char *logfile_name = NULL;
int level = 4;
int c;
// Skip ...
bool ok = true;
ok = ok && run_console(infile_name);
ok = ok && finish_cmd();
return ok ? 0 : 1;
}
```
## How automatic grading system works
By seeing the `Makefile`, I can tell that it executes the `driver.py` script in the scripts/ folder.
```make
test: qtest scripts/driver.py
scripts/driver.py
```
Thte `Tracer` stores the information about each test case, and store the score we can get.
```python=
class Tracer:
traceDirectory = "./traces"
qtest = "./qtest"
command = qtest
verbLevel = 0
autograde = False
useValgrind = False
traceDict = {
1 : "trace-01-ops",
2 : "trace-02-ops",
3 : "trace-03-ops",
4 : "trace-04-ops",
5 : "trace-05-ops",
6 : "trace-06-string",
7 : "trace-07-robust",
8 : "trace-08-robust",
9 : "trace-09-robust",
10 : "trace-10-malloc",
11 : "trace-11-malloc",
12 : "trace-12-malloc",
13 : "trace-13-perf",
14 : "trace-14-perf",
15 : "trace-15-perf"
}
traceProbs = {
1 : "Trace-01",
2 : "Trace-02",
3 : "Trace-03",
4 : "Trace-04",
5 : "Trace-05",
6 : "Trace-06",
7 : "Trace-07",
8 : "Trace-08",
9 : "Trace-09",
10 : "Trace-10",
11 : "Trace-11",
12 : "Trace-12",
13 : "Trace-13",
14 : "Trace-14",
15 : "Trace-15"
}
maxScores = [0, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7]
# ...
```
This script will try to run each test case and figure out the score we get.
```python
def run(self, tid = 0):
# Skip ...
for t in tidList:
tname = self.traceDict[t]
if self.verbLevel > 0:
print("+++ TESTING trace %s:" % tname)
ok = self.runTrace(t)
maxval = self.maxScores[t]
tval = maxval if ok else 0
print("---\t%s\t%d/%d" % (tname, tval, maxval))
score += tval
maxscore += maxval
scoreDict[t] = tval
print("---\tTOTAL\t\t%d/%d" % (score, maxscore))
# Skip ...
```
```python
def runTrace(self, tid):
# Skip ...
fname = "%s/%s.cmd" % (self.traceDirectory, self.traceDict[tid])
vname = "%d" % self.verbLevel
clist = [self.command, self.qtest, "-v", vname, "-f", fname]
try:
retcode = subprocess.call(clist)
except Exception as e:
print("Call of '%s' failed: %s" % (" ".join(clist), e))
return False
return retcode == 0
```
## Things that I'm still working on...
- Learn how to use `gdb` to trace the code and analyze the behaviour during the process
> Inspired by [ztex](https://hackmd.io/s/BkYsI4FBN)'s hackmd
- Learn `valgrind` to analyze memory usage
> Inspired by [Naetw](https://hackmd.io/s/BysQssYHN)'s hackmd