contributed by < cantfindagoodname >
$ cat /proc/version
Linux version 5.13.0-30-generic (buildd@lcy02-amd64-003) (gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #33~20.04.1-Ubuntu SMP Mon Feb 7 14:25:10 UTC 2022
$ gcc --version
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Note : current progress
# Github Action
$ Run make test || (cat scripts/weeping.raw ; exit 1)
...
--- TOTAL 100/100
# Local machine
$ make test
...
+++ TESTING trace trace-14-perf:
# Test performance of insert_tail, reverse, and sort
ERROR: Time limit exceeded. Either you are in an infinite loop, or your code is too inefficient
--- trace-14-perf 0/6
...
--- TOTAL 94/100
q_new
is expected to return a list_head
object that acts as the head of list.
It allocates the list_head
object that points to itself, implemented in INIT_LIST_HEAD(head)
struct list_head *q_new()
{
struct list_head *head = malloc(sizeof *head);
if (head == NULL)
return NULL;
INIT_LIST_HEAD(head);
return head;
}
q_free
iterates through the queue then free all storage used by queue.
list_for_each_entry_safe(...)
is used because the function would remove element of current iteration of loop. list_for_each_entry_safe(...)
would have a pointer safe
to store the address of entry->next
, so that when entry->next
points to other address, list_for_each_entry_safe
can still find the address for next iteration.
q_free()
intention is to free every memory space allocated by the for the queue. Which includes the list_head
structure. And so, the function would iterate through the queue one by one, releasing memory of both the char *
value and the memory for container itself, using q_release_element
that was provided by default.
When the iterator exits the loop, every element in the queue is freed, and only the list_head
or queue is remain, as the objective of the q_free()
function is to release every memory space allocated for the queue, the remaining list_head
structure is released as well.
Improve your writing. You should mention the approach how you release the intended resouces.
void q_free(struct list_head *l)
{
if (l == NULL)
return;
element_t *elem, *safe;
list_for_each_entry_safe (elem, safe, l, list) {
q_release_element(elem);
}
free(l);
}
q_insert_head
inserts an element to head of queue, which next of head of list points to.
Linux kernel API already had list_add(...)
function that does that.
q_insert_head
assigns the value field then insert the element after head of list.
If any of the malloc()
fails, free every previously allocated memory then return false
.
bool q_insert_head(struct list_head *head, char *s)
{
if (head == NULL)
return false;
element_t *elem = malloc(sizeof *elem);
if (elem == NULL)
return false;
elem->value = malloc(strlen(s) + 1);
if(elem->value == NULL){
free(elem);
return false;
}
strncpy(elem->value, s, strlen(s) + 1);
list_add(&elem->list, head);
return true;
}
q_insert_tail
inserts an element to tail of queue, which prev of head of list points to.
Linux kernel API already had list_add_tail(...)
function that does that.
q_insert_tail
assigns the value field then insert the element before head of list.
If any of the malloc()
fails, free every previously allocated memory then return false
.
bool q_insert_tail(struct list_head *head, char *s)
{
if (head == NULL)
return false;
element_t *elem = malloc(sizeof *elem);
if (elem == NULL)
return false;
elem->value = malloc(strlen(s) + 1);
if(elem->value == NULL){
free(elem);
return false;
}
strncpy(elem->value, s, strlen(s) + 1);
list_add_tail(&elem->list, head);
return true;
}
q_remove_head
removes the head of queue, which next of head of list points to. Linux Kernel API has a function list_del(...)
that unlinks a node from the list. q_remove_head
indicates the head of queue to unlink with list_del(...)
.
The value of removed node would be copied to sp
. As it is required to make this function works in constant time, the character is copied one at a time until it reaches end of string. After that, it would continue doing some assignment operation to ensure the loop will run for bufsize - 1
iterations no matter then length of string to be copied.
element_t *q_remove_head(struct list_head *head, char *sp, size_t bufsize)
{
if (head == NULL || list_empty(head))
return NULL;
struct list_head *node = head->next;
list_del_init(node);
if (sp != NULL) {
size_t len = (bufsize - 1) ^
(((bufsize - 1) ^
(strlen(list_entry(node, element_t, list)->value))) &
-(strlen(list_entry(node, element_t, list)->value) <
(bufsize - 1)));
char *psp = list_entry(node, element_t, list)->value;
for (int i = 0; i < bufsize - 1; ++i) {
sp[i ^ ((i ^ len) & -(i > len))] = *(psp + (i & -(i < len)));
}
sp[len] = '\0';
}
return list_entry(node, element_t, list);
}
q_remove_tail
removes the head of queue, which next of head of list points to. Linux Kernel API has a function list_del(...)
that unlinks a node from the list. q_remove_tail
indicates the tail of queue to unlink with list_del(...)
The value of removed node would be copied to sp
. As it is required to make this function works in constant time, the character is copied one at a time until it reaches end of string. After that, it would continue doing some assignment operation to ensure the loop will run for bufsize - 1
iterations no matter then length of string to be copied.
element_t *q_remove_tail(struct list_head *head, char *sp, size_t bufsize)
{
if (head == NULL || list_empty(head))
return NULL;
struct list_head *node = head->prev;
list_del_init(node);
if (sp != NULL) {
strncpy(sp, list_entry(node, element_t, list)->value, bufsize);
sp[bufsize - 1] = '\0';
}
return list_entry(node, element_t, list);
}
q_size
returns the number of elements in the queue.
It uses list_for_each(...)
macro from Linux Kernel API that iterates through the list, which would tranverse through each element exactly once in the process.
int q_size(struct list_head *head)
{
int size = 0;
if (head == NULL || list_empty(head))
return size;
struct list_head *node;
list_for_each (node, head) {
++size;
}
return size;
}
q_delete_mid
deletes the middle element ( take floor if q_size
is odd ) of the queue.
As the queue is implemented as a circular, doubly-linked list, it is possible to find the middle element of the queue by going from both tail-to-head and head-to-tail, a step at a time.
The function would first moves forward then backward as the desired result is floor of q_size
.
bool q_delete_mid(struct list_head *head)
{
if (head == NULL || list_empty(head))
return false;
struct list_head *forward = NULL, *backward = NULL;
for(forward = head->next,backward = head->prev;forward != backward;backward = backward->prev){
forward = forward->next;
if (forward == backward)
break;
}
list_del_init(forward);
q_release_element(list_entry(forward, element_t, list));
return true;
}
Rewrite with for iteration. Shorten the code.
Question :
Why is for loop better than while loop?
Even though the code is shorter, i feel like while loop is easier to read / understand
bool q_delete_mid(struct list_head *head)
{
if (head == NULL || list_empty(head))
return false;
struct list_head *forward, *backward;
forward = head->next;
backward = head->prev;
while (forward != backward) {
forward = forward->next;
if (forward == backward)
break;
backward = backward->prev;
}
list_del_init(forward);
q_release_element(list_entry(forward, element_t, list));
return true;
}
q_delete_dup
removes every element that has one or more duplicate copy of itself.
As stated in the documentation, it is only used after sorting the queue, which means every duplicates must be consecutive.
The fuction is currently implemented with a foreach loop, which then it will check if for current and next entry is duplicate of each other, then delete if it is.
After it found a new element when deleting the duplicates, it will then delete once again for the last remaining duplicate.
In theory, we could delete all the duplicate element by detaching them from the original queue. However, as we have still have to iterate through each node to release them, there should not be much imporvement in performance.
bool q_delete_dup(struct list_head *head)
{
if (head == NULL)
return false;
element_t *elem, *safe, *last = NULL;
list_for_each_entry_safe (elem, safe, head, list) {
if (&safe->list == head)
break;
if (!strcmp(elem->value, safe->value)) {
last = safe;
list_del_init(&elem->list);
q_release_element(elem);
} else {
if (last != NULL && last == elem) {
list_del_init(&elem->list);
q_release_element(elem);
}
}
}
if (last == list_last_entry(head, element_t, list)) {
list_del_init(&elem->list);
q_release_element(elem);
}
return true;
}
q_swap
would swap every two adjacent elements in the queue.
The implementation is pretty straight forward as it simply tranverse through the list and relink all 6 edges connecting the two elements.
void q_swap(struct list_head *head)
{
int phase = 0;
struct list_head *node, *safe;
list_for_each_safe (node, safe, head) {
if (phase) {
struct list_head *swap = node->prev;
swap->prev->next = node;
node->prev->next = safe;
node->next = swap;
safe->prev = swap;
node->prev = swap->prev;
swap->prev = node;
}
phase ^= 1;
}
}
q_reverse
reverses the queue.
Similar the rule of a LIFO structure, when repeatedly inserting an element to the head of the queue, the very first element inserted would became the last element by the time the function reaches the end of the queue.
void q_reverse(struct list_head *head)
{
if (head == NULL || list_empty(head))
return;
struct list_head *node, *safe;
list_for_each_safe (node, safe, head) {
list_del(node);
list_add(node, head);
}
}
q_sort
sorts the queue.
The implementation is with a simple merge sort algorithm.
The merge sort algorithm consists of three functions, merge
, merge_sort
, and q_sort
.
The body of merge
comes mainly from lib/list_sort.c, the only modification made is on the comparison if statement.
q_sort
itself only breaks the circular, doubly-linked list into a non-circular, singly-linked list before merge_sort
, then connects them back into a circular, doubly-linked list after merge_sort
.
The algorithm will search for the middle element then breaks the queue into 2 singly-linked list, then it will merge
them recursively.
Middle of list is found by using Floyd's Tortoise and Hare algorithm which originally used as algorithm to show that list is cirular. Conveniently, the " tortoise " would stop by the middle of list at the end of the algorithm. I opt for this algorithm to find the middle node instead of the algorithm in q_delete_middle
as the queue is no longer a circular linked list at this point.
Reference :
struct list_head *merge(struct list_head *a, struct list_head *b)
{
struct list_head *head = NULL, **tail = &head;
for (;;) {
if (strcmp(list_entry(a, element_t, list)->value,
list_entry(b, element_t, list)->value) <= 0) {
*tail = a;
tail = &a->next;
a = a->next;
if (!a) {
*tail = b;
break;
}
} else {
*tail = b;
tail = &b->next;
b = b->next;
if (!b) {
*tail = a;
break;
}
}
}
return head;
}
struct list_head *merge_sort(struct list_head *head)
{
if (head == NULL || head->next == NULL)
return head;
struct list_head *left = head;
struct list_head *right = left->next;
while (right && right->next) {
left = left->next;
right = right->next->next;
}
right = left->next;
left->next = NULL;
left = merge_sort(head);
right = merge_sort(right);
return merge(left, right);
}
void q_sort(struct list_head *head)
{
if (head == NULL || list_empty(head) || list_is_singular(head))
return;
struct list_head *start = head->next;
struct list_head *final = head->prev;
start->prev = final->next = NULL;
INIT_LIST_HEAD(head);
head->next = merge_sort(start);
head->next->prev = head;
struct list_head *temp = head->next;
for (; temp->next != NULL; temp = temp->next) {
temp->next->prev = temp;
}
head->prev = temp;
temp->next = head;
}
Indent the above source listing. Don't paste from the terminal. Instead, use proper editors which fit the use of system-wide clipboard.
qtest
提供新的命令 shuffle
Makefile
, we have -g
in it which means we gdb is available to trace the program.CFLAGS = -O1 -g -Wall -Werror -Idudect -I.
qtest
actually work, I place a break point under a function qtest
would invoke.(gdb) list q_new
15 /*
16 * Create empty queue.
17 * Return NULL if could not allocate space.
18 */
19 struct list_head *q_new()
20 {
21 struct list_head *head = malloc(sizeof *head);
22 if (head == NULL)
23 return NULL;
24 INIT_LIST_HEAD(head);
(gdb) b 21
Breakpoint 1 at 0x67b8: file queue.c, line 21.
bt
to print backtrace of function stack.(gdb) bt
#0 q_new () at queue.c:21
#1 0x000055555555825b in do_new (argc=<optimized out>, argv=<optimized out>) at qtest.c:129
#2 0x00005555555592e1 in interpret_cmda (argc=argc@entry=1, argv=argv@entry=0x555555566e20) at console.c:185
#3 0x000055555555982d in interpret_cmd (cmdline=<optimized out>, cmdline@entry=0x5555555669f0 "new") at console.c:208
#4 0x000055555555a2dd in run_console (infile_name=<optimized out>) at console.c:649
#5 0x00005555555587f6 in main (argc=1, argv=0x7fffffffdec8) at qtest.c:992
interpret_cmda
, I found cmd_list
that should be where the commands are initialized.cmd_ptr next_cmd = cmd_list;
cmd_list
, the macro ADD_COMMAND
and function add_cmd
is found in init_cmd
.ADD_COMMAND(help, " | Show documentation");
ADD_COMMAND(option, " [name val] | Display or set options");
ADD_COMMAND(quit, " | Exit program");
ADD_COMMAND(source, " file | Read commands from source file");
ADD_COMMAND(log, " file | Copy output to file");
ADD_COMMAND(time, " cmd arg ... | Time command execution");
add_cmd("#", do_comment_cmd, " ... | Display comment");
add_param("simulation", &simulation, "Start/Stop simulation mode", NULL);
add_param("verbose", &verblevel, "Verbosity level", NULL);
add_param("error", &err_limit, "Number of errors until exit", NULL);
add_param("echo", &echo, "Do/don't echo commands", NULL);
ADD_COMMAND
and prototype of add_cmd
in console.h
./* Add a new command */
void add_cmd(char *name, cmd_function operation, char *documentation);
#define ADD_COMMAND(cmd, msg) add_cmd(#cmd, do_##cmd, msg)
console_init
of qtest.c
.ADD_COMMAND(new, " | Create new queue");
ADD_COMMAND(free, " | Delete queue");
ADD_COMMAND(
ih,
" str [n] | Insert string str at head of queue n times. "
"Generate random string(s) if str equals RAND. (default: n == 1)");
ADD_COMMAND(
it,
" str [n] | Insert string str at tail of queue n times. "
"Generate random string(s) if str equals RAND. (default: n == 1)");
ADD_COMMAND(
rh,
" [str] | Remove from head of queue. Optionally compare "
"to expected value str");
ADD_COMMAND(
rt,
" [str] | Remove from tail of queue. Optionally compare "
"to expected value str");
ADD_COMMAND(
rhq,
" | Remove from head of queue without reporting value.");
ADD_COMMAND(reverse, " | Reverse queue");
ADD_COMMAND(sort, " | Sort queue in ascending order");
ADD_COMMAND(
size, " [n] | Compute queue size n times (default: n == 1)");
ADD_COMMAND(show, " | Show queue contents");
ADD_COMMAND(dm, " | Delete middle node in queue");
ADD_COMMAND(
dedup, " | Delete all nodes that have duplicate string");
ADD_COMMAND(swap,
" | Swap every two adjacent nodes in queue");
console_init
, I use swap
as reference to implement ADD_COMMAND
for shuffle
.ADD_COMMAND(shuffle, " | Shuffle list");
do_shuffle
, do_swap
is used as reference.static bool do_shuffle(int argc, char *argv[])
{
if (argc != 1) {
report(1, "%s takes no arguments", argv[0]);
return false;
}
if (!l_meta.l)
report(3, "Warning: Calling shuffle on null queue");
error_check();
int cnt = q_size(l_meta.l);
if (cnt < 2)
report(3, "Warning: Calling shuffle on single node");
error_check();
set_noallocate_mode(true);
if (exception_setup(true))
q_shuffle(l_meta.l);
exception_cancel();
set_noallocate_mode(false);
bool ok = true;
show_queue(3);
return ok && !error_check();
}
q_shuffle
is included in q_test.c
.extern void q_shuffle(struct list_head *);
q_shuffle
is implemented following Fisher-Yates shuffle algorithm.void q_shuffle(struct list_head *head)
{
int size = q_size(head);
struct list_head *node, *end = head;
while (size > 0) {
node = head->next;
int roll = rand() % size--;
for (int i = 0; i < roll; ++i)
node = node->next;
list_move_tail(end->prev, node);
list_move_tail(node, end);
end = end->prev;
}
}
Image from Wikipedia
qtest
.cmd> new
l = []
cmd> it RAND 8
l = [tardudv ummpv mbnegzc mfsffj fhfgbs eglgvqo gtlll skhzpncj]
cmd> shuffle
l = [skhzpncj tardudv eglgvqo gtlll fhfgbs mfsffj ummpv mbnegzc]
cmd> shuffle
l = [skhzpncj eglgvqo mfsffj fhfgbs tardudv mbnegzc ummpv gtlll]
cmd> sort
l = [eglgvqo fhfgbs gtlll mbnegzc mfsffj skhzpncj tardudv ummpv]
cmd> reverse
l = [ummpv tardudv skhzpncj mfsffj mbnegzc gtlll fhfgbs eglgvqo]
cmd> sort
l = [eglgvqo fhfgbs gtlll mbnegzc mfsffj skhzpncj tardudv ummpv]
cmd> shuffle
l = [eglgvqo tardudv skhzpncj ummpv mbnegzc gtlll mfsffj fhfgbs]
cmd> shuffle
l = [ummpv skhzpncj tardudv eglgvqo fhfgbs mbnegzc gtlll mfsffj]
qtest
提供新的命令 web
All codes of this section comes from lab-0 description and tiny-web-server with slight modification.
I study from tiny-web-server to better grasp how a web-server works.
After reverting to the inital commit, I removed some of the codes from main function that are not essential.
main
int main(int argc, char** argv) {
struct sockaddr_in clientaddr;
int default_port = 9999, clientlen = sizeof clientaddr;
int listenfd, connfd;
listenfd = open_listenfd(default_port);
while(1) {
connfd = accept(listenfd, (SA *)&clientaddr, &clientlen);
process(connfd, &clientaddr);
close(connfd);
}
return 0;
}
The program would run a web-server that let its user browse his directory.
It is clear that the main
function only listen
the port and would accept
a file descriptor that are requesting for service.
Then it leaves everything else to process
.
process
void process(int fd, struct sockaddr_in *clientaddr) {
rio_t rio;
char buf[MAXLINE], method[MAXLINE], uri[MAXLINE], version[MAXLINE];
rio_readinitb(&rio, fd);
rio_readlineb(&rio, buf, MAXLINE);
sscanf(buf, "%s %s %s", method, uri, version);
/* read all */
while(buf[0] != '\n' && buf[1] != '\n') { /* \n || \r\n */
rio_readlineb(&rio, buf, MAXLINE);
}
char* filename = uri;
if(uri[0] == '/') {
filename = uri + 1;
if (strlen(filename) == 0){
filename = ".";
}
}
struct stat sbuf;
int status = 200, size = 0;
int ffd = open(filename, O_RDONLY, 0);
if(ffd <= 0) {
status = 404;
char *msg = "File not found";
client_error(fd, status, "Not found", msg);
size = strlen(msg);
} else {
fstat(ffd, &sbuf);
if(S_ISREG(sbuf.st_mode)) {
size = sbuf.st_size;
server_static(fd, ffd ,sbuf.st_size);
} else if(S_ISDIR(sbuf.st_mode)) {
status = 200;
char *msg = "Directory listing is not implemented yet";
size = strlen(msg);
client_error(fd, status, "OK", msg);
} else {
status = 400;
char *msg = "Unknow Error";
size = strlen(msg);
client_error(fd, status, "Error", msg);
}
}
log_access(status, size, clientaddr, filename);
close(ffd);
}
Currently, the web-server handles request by browsing user's directory.
I modify the program slighty to turn the program into an echo web-server.
// process
void process(int fd, struct sockaddr_in *clientaddr)
{
http_request req;
parse_request(fd, &req);
handle_request(fd, req.filename);
/* ... */
char *p = req.filename;
while(*p && (*p) != '\0'){
++p;
if(*p == '/'){
*p = ' ';
}
}
}
// handle_request
void handle_request(int out_fd, char *filename)
{
writen(out_fd, filename, strlen(filename));
close(out_fd);
}
Shell 1
curl --http0.9 localhost:9999/hello/world
Shell 2
$ ./tiny
listen on port 9999, fd is 3
accept request, fd is 4, pid is 25999
127.0.0.1:60416 200 - 'hello world' (text/plain)
I then follow the description to port tiny.c
to qtest
.
Using ADD_COMMAND
, I moved entiretly of main
function of tiny.c
to do_web()
.
After fixing some dependencies for tiny.c
and rest of the program, I call web
from qtest
.
<cmd> web
now opens the echo web-server, however, I can no longer make a request via a console with the web-server on.
Following the description, I try to make the program respond to both stdin_fd
from console and listenfd
from web-server.
select()
system call does exactly this, with the code from description, the program would try to respond to both stdin and socket.
if (listenfd) {
fd_set set;
FD_ZERO(&set);
FD_SET(listenfd, &set);
FD_SET(stdin_fd, &set);
int rv = select(listenfd + 1, &set, NULL, NULL, NULL);
struct sockaddr_in clientaddr;
socklen_t clientlen = sizeof clientaddr;
int connfd;
switch (rv) {
case -1:
perror("select"); /* an error occurred */
continue;
case 0:
printf("timeout occurred\n"); /* an timeout occurred */
continue;
default:
if (FD_ISSET(listenfd, &set)) {
connfd = accept(listenfd, (SA *) &clientaddr, &clientlen);
char *p = process(connfd, &clientaddr);
strncpy(buf, p, strlen(p) + 1);
close(connfd);
free(p);
return strlen(p);
} else if (FD_ISSET(stdin_fd, &set)) {
nread = read(l.ifd, &c, 1);
if (nread <= 0)
return l.len;
}
break;
}
} else {
nread = read(l.ifd, &c, 1);
if (nread <= 0)
return l.len;
}
When the web-server is not active yet, the program will execute code as like before. If web-server is active, it will use select()
system call to monitor both the file descriptors as readfds
.
select
system callFrom select(),
Note well: Upon return, each of the file descriptor sets is modified in place to indicate which file descriptors are currently "ready". Thus, if using select() within a loop, the sets must be reinitialized before each call.
First we adds file descriptor to set. Then after file descriptor returns, we can use FD_ISSET()
to determine which of the file descriptor are currently "ready". Which means they are requesting for service.
After setting-up the system call, the program would send a message to the console but qtest
would not read the command from the web-server.
So following the description again I made modification to run_console()
to handle the request instead of handling request in do_web()
. do_web()
now would only open the port to listen to the webserver.
qtest
can now handle request from the web-server.
Improve your writing!
Shell 1
$ ./qtest
cmd> web
listen on port 9999, fd is 3
cmd> accept request, fd is 4, pid is 27571
127.0.0.1:60420 200 - 'new' (text/plain)
l = []
cmd> accept request, fd is 4, pid is 27571
127.0.0.1:60422 200 - 'something not a command' (text/plain)
Unknown command 'something'
cmd> accept request, fd is 4, pid is 27571
127.0.0.1:60424 200 - 'it RAND 10' (text/plain)
l = [yqllkmd mdkaoeer uvjlzx rbido bbafug yggjrhg mynrmzf qcugofg ouxir qerxyef]
cmd> accept request, fd is 4, pid is 27571
127.0.0.1:60426 200 - 'reverse' (text/plain)
l = [qerxyef ouxir qcugofg mynrmzf yggjrhg bbafug rbido uvjlzx mdkaoeer yqllkmd]
cmd> accept request, fd is 4, pid is 27571
127.0.0.1:60428 200 - 'sort' (text/plain)
l = [bbafug mdkaoeer mynrmzf ouxir qcugofg qerxyef rbido uvjlzx yggjrhg yqllkmd]
cmd> accept request, fd is 4, pid is 27571
127.0.0.1:60430 200 - 'quit' (text/plain)
Freeing queue
Shell 2
$ curl --http0.9 localhost:9999/new
I received new
$ curl --http0.9 localhost:9999/something/not/a/command
I received something/not/a/command
$ curl --http0.9 localhost:9999/it/RAND/10
I received it/RAND/10
$ curl --http0.9 localhost:9999/reverse
I received reverse
$ curl --http0.9 localhost:9999/sort
I received sort
$ curl --http0.9 localhost:9999/quit
I received quit
lib/list_sort.c
与 q_sort
的效能落差size, sort, list_sort
5000, 005, 002
10000, 010, 005
15000, 015, 009
20000, 021, 012
25000, 028, 017
30000, 033, 020
35000, 041, 024
40000, 051, 030
45000, 069, 040
50000, 077, 042
55000, 079, 046
60000, 090, 057
65000, 108, 064
70000, 121, 068
75000, 133, 083
80000, 152, 082
85000, 158, 095
90000, 179, 114
95000, 200, 117
100000, 197, 126
105000, 221, 134
110000, 232, 140
115000, 254, 157
120000, 267, 166
125000, 285, 175
130000, 303, 182
135000, 306, 190
140000, 327, 196
145000, 336, 198
150000, 347, 212
155000, 363, 222
160000, 384, 226
165000, 396, 244
170000, 409, 256
175000, 438, 263
180000, 455, 271
185000, 470, 288
190000, 467, 288
195000, 488, 301
200000, 512, 308
205000, 519, 310
210000, 547, 328
215000, 559, 333
220000, 587, 342
225000, 602, 358
230000, 618, 368
235000, 621, 372
240000, 652, 392
245000, 671, 402
250000, 687, 414
255000, 705, 424
260000, 720, 425
265000, 749, 427
270000, 740, 437
275000, 765, 423
280000, 753, 453
285000, 786, 460
290000, 779, 457
295000, 782, 468
300000, 807, 475
305000, 822, 483
310000, 884, 506
315000, 902, 528
320000, 911, 534
325000, 944, 556
330000, 928, 548
335000, 929, 551
340000, 951, 564
Data was gained by writing a simple trace file for
qtest
interpreter and send into a simple program as input to extract only their run time and size withtime sort
andsize
respectively.
It should be clear that list_sort
outperforms q_sort
at both large and small data size.
Both list_sort
and q_sort
uses merge sort to do the sorting, in theory, they both have the same Time Complexity
When we draw two function
Rate of growth ( Time Complexity ) of both q_sort
and list_sort
have tendency to go inbetween their
Hence it should be safe to say that they both have time complexity
Hence, the disparity in performance does not come from their time complexity. Instead, they come from the hidden constant in
First, I analyze the list_sort
function.
list_sort
的運作原理list_sort()
__attribute__((nonnull(2,3)))
void list_sort(void *priv, struct list_head *head, list_cmp_func_t cmp)
{
struct list_head *list = head->next, *pending = NULL;
size_t count = 0;
if (list == head->prev)
return;
head->prev->next = NULL;
do {
size_t bits;
struct list_head **tail = &pending;
for (bits = count; bits & 1; bits >>= 1)
tail = &(*tail)->prev;
if (likely(bits)) {
struct list_head *a = *tail, *b = a->prev;
a = merge(priv, cmp, b, a);
a->prev = b->prev;
*tail = a;
}
list->prev = pending;
pending = list;
list = list->next;
pending->next = NULL;
count++;
} while (list);
list = pending;
pending = pending->prev;
for (;;) {
struct list_head *next = pending->prev;
if (!next)
break;
list = merge(priv, cmp, pending, list);
pending = next;
}
merge_final(priv, cmp, head, pending, list);
}
__attribute__((nonnull(2,3)))
From the gcc Compiler User Guide, we know that this line simply indicates which referenced argument of a function must be non-null pointer. Along with the fact cmp
does not need an addition priv
data, it is safe to pass NULL
as the function first argument when calling list_sort
.
struct list_head *list = head->next, *pending = NULL;
size_t count = 0;
if (list == head->prev)
return;
head->prev->next = NULL;
This segment of code simply initialize some variables of the list. It also convert the list into a null-terminated singly-linked list.
We should also note that list
acts as iterator for the loop and pending
acts as a sorted sublists waiting to be merged.
do {
size_t bits;
struct list_head **tail = &pending;
for (bits = count; bits & 1; bits >>= 1)
tail = &(*tail)->prev;
if (likely(bits)) {
struct list_head *a = *tail, *b = a->prev;
a = merge(priv, cmp, b, a);
a->prev = b->prev;
*tail = a;
}
list->prev = pending;
pending = list;
list = list->next;
pending->next = NULL;
count++;
} while (list);
This do while loop is the main body of the function, every iteration, it would seperate an element as a new sublist out from list
, then, it will merge two sublist as soon as they consists of
To trace the code line-by-line, we would first reach the bottom part of the loop as count
is zero initially.
list->prev = pending;
pending = list;
list = list->next;
pending->next = NULL;
count++;
This segment of code seperate an element from list
to form a new sublist. Previous sublist is not lost as prev
pointer is still maintaining a "list of lists".
The first 2 iteration can represents what this segment does, as the program won't enter if (likely(bits))
branch when count
is either 0 or 1.
Before
count
= 0
count = 0
count = 1
NULL
.struct list_head **tail = &pending;
for (bits = count; bits & 1; bits >>= 1)
tail = &(*tail)->prev;
The program will always try to merge the first two sublist with equal number of elements starting from pending
. The very last sublist is at the previous most node starting from pending
. What would be indicated by tail
is the list about to be merge, with tail->prev
.
From the comment of the original source of code, we could get a hint of what bits & 1
is trying to do.
Each time we increment
count
, we set one bit (bit k) and clear bits k-1 .. 0. Each time this happens (except the very first time for each bit, whencount
increments to), we merge two lists of size into one list of size .
The comment states that the merge only NOT happen when count
increments to count
would have value of count
" of pending
elements had
How many time to iterate through the list of lists though, is following the value of count
. However, as the nearest sublist to pending
with same elements in it have to be selected, the carry isn't added to directly, and would only add up when there is nothing else by the right to its digit number to add. When this happens, the column would be the one merging.This number happens to be the first clear bit of count
counting from the least significant bit.
Example, from
Every digit with carry align with the least most significant clear bit, with one exception of the addition of least most significant bit.
These carry at
digit means the sublist from pending
should be thetail
The only exception that don't merge is when the digit is first set.
if (likely(bits)) {
struct list_head *a = *tail, *b = a->prev;
a = merge(priv, cmp, b, a);
a->prev = b->prev;
*tail = a;
}
likely
can be seen as a macro that compiler optimization to do branch prediction, provided the probability of how 'likely' it is not
This segment of code mainly merges two lists tail
and tail->prev
, which is sublist with the same number of element.
list = pending;
pending = pending->prev;
for (;;) {
struct list_head *next = pending->prev;
if (!next)
break;
list = merge(priv, cmp, pending, list);
pending = next;
}
This segment is reached after list
reach the end of list. It merges all of pending
list to the final sorted list except the last pending
.
merge_final(priv, cmp, head, pending, list);
This function would merge the list one final time. Quoting from the comments in the source.
Combine final list merge with restoration of standard doubly-linked list structure. This approach duplicates code from merge(), but runs faster than the tidier alternatives of either a separate final prev-link restoration pass, or maintaining the prev links throughout.
This functions rebuilds the list into the original circular, doubly-linked list.
uses prev
As stated before, both q_sort
and list_sort
uses merge sort to sort the queue. And both of them are optimal mergesort, that is , in worst case, they did the minimal number of comparison.
A fact about minimal number of comparison that can be performed while executing merge sort is
binary tree ofq_sort
binary tree oflist_sort
Both tree, even though has a different structure, is optimal merge sort.
We uses number of element to compare to represent each node of tree, and merge sort can be seen as optimal when their external path length is always minimal.
q_sort
would always split the list in half in every iteration of sorting. Which resembles how a balanced binary tree works, and from its definition, the external path length of a balanced binary tree is always minimal. Hence, q_sort
is an optimal merge sort.
Referenced from commit message of
list_sort.c
commit043b3f7
, queue merge sort
list_sort
could be seen as a variant of queue merge sort, which resembles Huffman encoding, where all of its leaf is 1. By induction, it is proven that the external path length of the merge tree is minimal. Hence, list_sort
is an optimal merge sort.
Screenshot to showlist_sort
is a variant ofqueue_merge_sort
With equal number of comparison, q_sort
does have a major flaw. That is, the algorithm to find the merge point is very inefficient.
q_sort
would use fast
and slow
to tranverse from head of list, taking
list_sort
however, would tranverse through the pending
list, taking
From what I've red from the documentation of list_sort
, they did take into consideration of the size of cache. So long as the cache can fit in
q_sort
however, does not take cache size into consideration at all. When the element is large enough, the CPU utilization may be lower as q_sort
would take in a large portion of memory at a time, slowing down the system as the program is waiting for I/O. Then CPU would give even more CPU time resource to the program, making the program even slower.
Address Sanitizer
修正 qtest
執行過程中的錯誤From lab0 requirements, it is stated that calling help command in qtest
interpreter will trigger error with Address Sanitizer, however, I wasn't able to reproduce the error in my machine.
However, Address Sanitizer did report another error that isn't caught by the compiler and cppcheck.
Which is called heap-buffer-overflow.
================================================================= [1/1969]
==12654==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6040000001fe at pc 0x7f5a91a76480 bp 0x7ffeddf02680 sp 0x7ffeddf01e28
READ of size 1024 at 0x6040000001fe thread T0
#0 0x7f5a91a7647f (/lib/x86_64-linux-gnu/libasan.so.5+0x9b47f)
#1 0x55f95ab81a7d in memcpy /usr/include/x86_64-linux-gnu/bits/string_fortified.h:34
#2 0x55f95ab81a7d in q_remove_head /home/user/Desktop/linux2022/lab0-c/queue.c:112
#3 0x55f95ab7c44b in do_remove /home/user/Desktop/linux2022/lab0-c/qtest.c:395
#4 0x55f95ab7c699 in do_rh /home/user/Desktop/linux2022/lab0-c/qtest.c:454
#5 0x55f95ab7ed33 in interpret_cmda /home/user/Desktop/linux2022/lab0-c/console.c:189
#6 0x55f95ab7f6d2 in interpret_cmd /home/user/Desktop/linux2022/lab0-c/console.c:212
#7 0x55f95ab80f4b in run_console /home/user/Desktop/linux2022/lab0-c/console.c:658
#8 0x55f95ab7de77 in main /home/user/Desktop/linux2022/lab0-c/qtest.c:1030
#9 0x7f5a916c10b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)
#10 0x55f95ab7aced in _start (/mnt/lda/linux2022/lab0-c/qtest+0x8ced)
0x6040000001fe is located 0 bytes to the right of 46-byte region [0x6040000001d0,0x6040000001fe)
allocated by thread T0 here:
#0 0x7f5a91ae8bc8 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dbc8)
#1 0x55f95ab8128f in test_malloc /home/user/Desktop/linux2022/lab0-c/harness.c:138
SUMMARY: AddressSanitizer: heap-buffer-overflow (/lib/x86_64-linux-gnu/libasan.so.5+0x9b47f)
Shadow bytes around the buggy address:
0x0c087fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c087fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c087fff8000: fa fa 00 00 00 00 00 fa fa fa 00 00 00 00 00 fa
0x0c087fff8010: fa fa 00 00 00 00 00 fa fa fa 00 00 00 00 00 fa
0x0c087fff8020: fa fa 00 00 00 00 00 fa fa fa 00 00 00 00 00 fa
=>0x0c087fff8030: fa fa 00 00 00 00 00 fa fa fa 00 00 00 00 00[06]
0x0c087fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c087fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c087fff8060: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c087fff8070: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c087fff8080: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
==12654==ABORTING
From the Log, we can determine that the overflow came from memcpy
of q_remove_head
.
READ of size 1024 at 0x6040000001fe thread T0
#0 0x7f5a91a7647f (/lib/x86_64-linux-gnu/libasan.so.5+0x9b47f)
#1 0x55f95ab81a7d in memcpy /usr/include/x86_64-linux-gnu/bits/string_fortified.h:34
#2 0x55f95ab81a7d in q_remove_head /home/user/Desktop/linux2022/lab0-c/queue.c:112
#3 0x55f95ab7c44b in do_remove /home/user/Desktop/linux2022/lab0-c/qtest.c:395
#4 0x55f95ab7c699 in do_rh /home/user/Desktop/linux2022/lab0-c/qtest.c:454
#5 0x55f95ab7ed33 in interpret_cmda /home/user/Desktop/linux2022/lab0-c/console.c:189
#6 0x55f95ab7f6d2 in interpret_cmd /home/user/Desktop/linux2022/lab0-c/console.c:212
#7 0x55f95ab80f4b in run_console /home/user/Desktop/linux2022/lab0-c/console.c:658
#8 0x55f95ab7de77 in main /home/user/Desktop/linux2022/lab0-c/qtest.c:1030
#9 0x7f5a916c10b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)
#10 0x55f95ab7aced in _start (/mnt/lda/linux2022/lab0-c/qtest+0x8ced)
My implementation of memcpy
would always tries to copy bufsize - 1
characters to sp
, but the size allocated for list_entry(...,element_t,...)->value
is not always bufsize - 1
, in fact, it would be bufsize - 1
in size only at few circustances. Hence, heap-buffer-overflow occurs.
memcpy(sp, list_entry(node, element_t, list)->value, buf_size - 1);
sp[buf_size - 1] = '\0';
A simple fix is to simply replace buf_size - 1
with the smaller entry of buf_size - 1
and size of list_entry(...,element_t,...)->value
.
size_t len = (bufsize - 1) ^ (((bufsize - 1) ^ (strlen(list_entry(node, element_t, list)->value))) & -(strlen(list_entry(node, element_t, list)->value) < (bufsize - 1)));
memcpy(sp, list_entry(node, element_t, list)->value, len);
sp[len] = '\0';
Valgrind
排除 qtest
實作的記憶體錯誤When using Valgrind
to run qtest
directly with $ valgrind ./qtest
, there would not be any error messages shown by Valgrind
. However, when running it with Makefile
$ make valgrind
, the error messages would show up.
==13243== 5 bytes in 1 blocks are still reachable in loss record 1 of 2
==13243== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==13243== by 0x4A5250E: strdup (strdup.c:42)
==13243== by 0x111F4F: linenoiseHistoryAdd (linenoise.c:1276)
==13243== by 0x112C71: linenoiseHistoryLoad (linenoise.c:1365)
==13243== by 0x10DA17: main (qtest.c:1019)
==13243==
==13243== 160 bytes in 1 blocks are still reachable in loss record 2 of 2
==13243== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==13243== by 0x111F0F: linenoiseHistoryAdd (linenoise.c:1264)
==13243== by 0x112C71: linenoiseHistoryLoad (linenoise.c:1365)
==13243== by 0x10DA17: main (qtest.c:1019)
==13243==
After some attempts to recreate the error, I found out that the error would only show up only when qtest
is trying to interpret command from other source file. And the error messages are clearly associated with linenoiseHistory
.
When I was trying to implement the web-server, I learnt that linenoise
provides some tool that allows a more friendly interface for reading user-input. Which they are not needed when reading from a external file.
The problem came from the program somehow still allocate memory for history
even though linenoise
is not needed at all.
Following the backtrace, I discover that the main
function from qtest.c
would still call linenoiseHistoryLoad()
which would then call linenoiseHistoryAdd()
that would allocate memory.
Normally, the allocated memory would be freed in run_console()
of console.c
. However, this is not the case, as run_console()
would test for external file with has_infile
and will never call freeHistory()
.
My fix is to simply add a boolean guard has_infile
to stop qtest.c
from calling any of the linenoise
function if it is reading from external file.
if(!has_infile){
linenoiseSetCompletionCallback(completion);
linenoiseHistorySetMaxLen(HISTORY_LEN);
linenoiseHistoryLoad(HISTORY_FILE);
}
Central Limit Theorem and normal distribution assumed standard deviation is known. This assumption is not reasonable in real scenarios however. In said scenarios, is where t-distribution could be useful.
t-Distribution estimate the standard deviation of population with standard deviation of sample, if sample size is large enough, the distribution will not differs by alot from the standard normal distribution.
A statistics called degree of freedom is used to measure how close t-Distribution is to normal distribution. Where more distribution means larger sample size.
When we look at the shape of t-Distribution, we can see that the tail is thicker at lower degree of freedom, as the confidence value is based on area of distribution (p = 0.95), when the tail is thicker, the threshold to reject is higher, which means its harder to reject the value.
This is useful especially when we don't have a large sample size (data), as the null hypothesis is less likely to be rejected before we got a larger sample size, unless the program is very confident of the result.
The program would perform a leakage detection test, then test if the execution time is statistically different. If they are, the code is not constant time. The tool is said to not rely on static analysis but on statistical analysis, and don't need to model underlying hardware to do the test.
constant value
// in prepare_inputs
randombytes(input_data, n_measure * chunk_size);
for (size_t i = 0; i < n_measure; i++) {
classes[i] = randombit();
if (classes[i] == 0)
memset(input_data + (size_t) i * chunk_size, 0, chunk_size);
}
random value
for (size_t i = 0; i < N_MEASURE; ++i) {
/* Generate random string */
randombytes((uint8_t *) random_string[i], 7);
random_string[i][7] = 0;
}
extraction of x86 cycle counter
unsigned int hi, lo;
__asm__ volatile("rdtsc\n\t" : "=a"(lo), "=d"(hi));
return ((int64_t) lo) | (((int64_t) hi) << 32);
I don't see cropping at
report()
and t-test only test for first-order statistics
check ttest.c
The paper did include False positives as a topic in the discussion section.
When I test for constant time with trace-17
, I don't seem to pass the test on my local machine, however the code never fail the test at Github Action. Implying either false positive at github action virtual ware, or the test depends on the underlying hardware (which defeats the point of this paper is no need to model underlying hardware to measure leakage).