2019q1 Homework1 (lab0)

tags: `sysprog` `2019` `2019spring`

contributed by < czvkcui >

認真看作業要求，除了提及如何逐步達到要求，還需要進行:

改善程式效能
解釋自動評分系統運作的原理
提及 qtest 的行為和裡頭的技巧

還未達成符合預期目標，請繼續付出！

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

jserv

queue implement

q_new,q_free,q_insert_head,q_insert_tail,q_remove_head,q_size,q_reverse implement

q_new





















typedef struct
{
    list_ele_t *head, *tail; /* Linked list of elements */
    /*
      You will need to add more fields to this structure
      to efficiently implement q_size and q_insert_tail
    */
    size_t size;
} queue_t;

queue_t *q_new()
{
    queue_t *q = malloc(sizeof(queue_t));
    if (q)
    {
        q->head = NULL;
        q->tail = NULL;
        q->size = 0;
    }
    return q;
}

size保存linked list 的長度
q_new 初始化queue_t結構

q_free















void q_free(queue_t *q)
{
    /* Free queue structure */
    if (!q) return;
    list_ele_t* t;
    while (q->head)
    {
        t = q->head->next;
        free(q->head->value);
        free(q->head);
        q->head = t;
    }
    free(q);
}

q_insert_head























bool q_insert_head(queue_t *q, char *s)
{
    list_ele_t *newh = malloc(sizeof(list_ele_t));
    int len = strlen(s);
    if (!q) goto error;
    if (!newh) goto error;
    newh->value = malloc((len + 1) * sizeof(char));
    if (!newh->value) goto error;
    strncpy(newh->value, s, len);
    newh->value[len] = '\0';
    newh->next = q->head;

    if (!q->head)
    {
        q->tail = newh;
    }
    q->head = newh;
    q->size++;
    return true;
error:
    if (newh) free(newh);
    return false;
}

利用goto來統一處理例外狀況

q_insert_tail


























bool q_insert_tail(queue_t *q, char *s)
{
    list_ele_t *newh = malloc(sizeof(list_ele_t));
    int len = strlen(s);
    if (!q) goto error;
    if (!newh) goto error;
    newh->value = malloc((len + 1) * sizeof(char));
    if (!newh->value) goto error;
    strncpy(newh->value, s, len);
    newh->value[len] = '\0';
    newh->next = NULL;

    if (!q->tail)
    {
        q->head = newh;
        q->tail = newh;
    }

    q->tail->next = newh;
    q->tail = newh;
    q->size++;
    return true;
error:
    if (newh) free(newh);
    return false;
}

q_remove_head



















bool q_remove_head(queue_t *q, char *sp, size_t bufsize)
{
    if (!q || !q->head) goto error;
    char* str = q->head->value;
    if (sp)
    {
        strncpy(sp, str, bufsize - 1);
        sp[bufsize - 1] = '\0';
    }
    list_ele_t* t = q->head;
    q->head = q->head->next;
    free(t->value);
    free(t);
    q->size--;
    if (q->size == 0) q->tail = NULL;
    return true;
error:
    return false;
}

q_size





int q_size(queue_t *q)
{
    if (!q || !q->head) return 0;
    return q->size;
}

q_reverse




















void q_reverse(queue_t *q)
{
    if (!q || !q->head || !q->head->next) return;
    if (q->head == q->tail) return;

    q->tail = q->head;
    q->head = q->head->next;

    list_ele_t* back = q->tail, *curr = q->head->next;

    while (curr != NULL)
    {
        q->head->next = back;
        back = q->head;
        q->head = curr;
        curr = curr->next;
    }
    q->head->next = back;
    q->tail->next = NULL;
}

解釋自動評分系統運作的原理

scripts

git hook

we have 2 hooks inside our scripts:

pre-commit.hook
commit-msg.hook

pre-commit.hook

trigger on git commit command
check format is suitable with clang format
use cppcheck checking

commit-msg.hook

check commit message with regular expression

qtest command interpret

初始化階段：

queue_init();
init_cmd();
console_init();
set_verblevel(level);

init_cmd和console_init會向console註冊所使用的command

command parse

main->run_console->cmd_select->interpret_cmd->interpret_cmda

interpret_cmda執行

cmd_ptr next_cmd = cmd_list;
bool ok = true;
while (next_cmd && strcmp(argv[0], next_cmd->name) != 0)
        next_cmd = next_cmd->next;
if (next_cmd) {
    ok = next_cmd->operation(argc, argv);
    if (!ok)
        record_error();
} else {
    ... // error handling
}

cmd_list是一個link list，它保存了queue_init和init_cmd所註冊的command。透過循序搜尋找到符合的cmd，然後執行對應的函式(next_cmd->operation)。

report.[hc]

static void check_exceed(size_t new_bytes)
{
    size_t limit_bytes = (size_t) mblimit << 20;
    size_t request_bytes = new_bytes + current_bytes;
    if (mblimit > 0 && request_bytes > limit_bytes) {
        report_event(MSG_FATAL,
                     "Exceeded memory limit of %u megabytes with %lu bytes",
                     mblimit, request_bytes);
    }
}

Q:Why limit_bytes = mblimit<<20 ?
A:mblimit 作為判斷malloc有無超出上限，mblimit<<20代表最高可以使用的大小(KB)，在本例中0<<20還是為0(mblimit預設為0表示無上限)

void init_time(double *timep)
{
    (void) delta_time(timep);
}

Q:Why we need (void) before delta_time(timep) ?
A:某些編譯器可能會警告使用者沒有處理返回值，加上(void)來處理返回值(轉形成void)，但是像是printf等常用的function可以在宣告時加上__attribute__(noreturn)來避免。

在gcc中 -Wall 預設會開啟Wunused-result，不過在我的平台上(gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0)，刪除(void)和加上-Wunused-result都沒有效果。此外也可以在function宣告加上__attribute__((warn_unused_result))強迫警告使用者處理返回值。

getrusage：

獲取資源的使用情形

/* Number of bytes resident in physical memory */
size_t resident_bytes()
{
    struct rusage r;
    size_t mem = 0;
    int code = getrusage(RUSAGE_SELF, &r);
    if (code < 0) {
        report_event(MSG_ERROR, "Call to getrusage failed");
    } else {
        mem = r.ru_maxrss * 1024;
    }
    return mem;
}

declaration：
int getrusage(int who, struct rusage *usage);
usage接受一個參數負責返回結果，而who則有以下參數

RUSAGE_SELF：
只統計本身process的使用資訊(包含thread)
RUSAGE_CHILDREN：
會統計所有children和calling的process的資訊
RUSAGE_THREAD(linux 2.6.26以後)：
統計calling thread的資訊
rusage structure:

struct rusage {
               struct timeval ru_utime; /* user CPU time used */
               struct timeval ru_stime; /* system CPU time used */
               long   ru_maxrss;        /* maximum resident set size */
               long   ru_ixrss;         /* integral shared memory size */
               long   ru_idrss;         /* integral unshared data size */
               long   ru_isrss;         /* integral unshared stack size */
               long   ru_minflt;        /* page reclaims (soft page faults) */
               long   ru_majflt;        /* page faults (hard page faults) */
               long   ru_nswap;         /* swaps */
               long   ru_inblock;       /* block input operations */
               long   ru_oublock;       /* block output operations */
               long   ru_msgsnd;        /* IPC messages sent */
               long   ru_msgrcv;        /* IPC messages received */
               long   ru_nsignals;      /* signals received */
               long   ru_nvcsw;         /* voluntary context switches */
               long   ru_nivcsw;        /* involuntary context switches */
           };

console.[hc]

cmd_select ,run_console:

bool run_console(char *infile_name)
{
    if (!push_file(infile_name)) {
        report(1, "ERROR: Could not open source file '%s'", infile_name);
        return false;
    }
    while (!cmd_done()) {
        cmd_select(0, NULL, NULL, NULL, NULL);
    }
    return err_cnt == 0;
}

harness.[hc]

test_malloc and test_free

queue.c因為有include harness.h 因此呼叫的malloc和free為test_malloc和test_free。
struct of block_ele_t

typedef struct BELE {
    struct BELE *next;
    struct BELE *prev;
    size_t payload_size;
    size_t magic_header; /* Marker to see if block seems legitimate */
    unsigned char payload[0];
    /* Also place magic number at tail of every block */
} block_ele_t;

payload[0]是C99後引入的語法，可以允許使用者在runtime時才決定該array大小。

block_ele_t *new_block =
        malloc(size + sizeof(block_ele_t) + sizeof(size_t));

size是是真正要給使用者使用的空間大小(bytes)，加上block_ele_t的大小，sizeof(size_t)則是儲存footer的空間。在test_malloc中，block_ele_t會依序填上MAGICHEADER、size、 FOOTER、payload,接著將next指向link list的開頭(allcated則指向新的new block，舊的prev指標指向新的block)，再由allocated_count++後完成記憶體分配。test_free會檢查要被釋放的block的MAGICHEADER和MAGICFOOTER。當payload存取的資料超過payload的大小時會覆蓋自己的MAGICFOOTER，MAGICHEADER則可以檢查自己的block是否被別人所覆蓋^[1]。最後test_free將block移出double link list，allocated_count--釋放block。

trigger_exception,exception_setup,exception_cancel

jmp_read的type是volatile，在參考from C 語言 setjmp 與 longjmp 函數用法教學後得知:

未開啟最佳化options(-O)只有 register 宣告的變數會改變(not change:volatile.global,static,local)
- with -O option: register,local宣告的變數值均會改變
sigsetjmp,siglongjmp:
- 和setjmp和longjmp一樣都是nonlocal jmp
- 但是setjmp 和longjmp並不會在jump儲存當前的signal mask

void trigger_exception(char *msg)
{
    error_occurred = true;
    error_message = msg;
    if (jmp_ready)
        siglongjmp(env, 1);
    else
        exit(1);
}

當exception觸發(SIGSEGV,SIGALRM)，jmp_rady會被trigger成true，而進行setlongjmp至上一個exception_setup位置(if (sigsetjmp(env, 1)))，印出錯誤訊息後返回。
exception_cancel：清除'jmp_ready'和error_message，若是超時則會觸發alarm。

qtest.c

bool do_new(int argc, char *argv[])
{
    if (argc != 1) {
        report(1, "%s takes no arguments", argv[0]);
        return false;
    }
    bool ok = true;
    if (q != NULL) {
        report(3, "Freeing old queue");
        ok = do_free(argc, argv);
    }
    error_check();
    if (exception_setup(true))
        q = q_new();
    exception_cancel();
    qcnt = 0;
    show_queue(3);
    return ok && !error_check();
}

根據exception_setup和exception_cancel可以得知

static void queue_init()
{
    fail_count = 0;
    q = NULL;
    signal(SIGSEGV, sigsegvhandler);
    signal(SIGALRM, sigalrmhandler);
}

queue_init會使qtest在初始化時註冊SIGSEGV和SIGALRM，當中斷發生時OS會通知signal handler做對應的處理，在本例中SIGSEGV發生時會交由sigsegvhandler，而SIGALARM由sigalrmhandler處理，兩者都是印出錯誤訊息後，呼叫trigger_exception。

效能改善

Test in parallel

修改driver.py 以 process pool來執行：
Performance counter stats for 'python scripts/driver.py'(with process pool):

   1276.391265      task-clock (msec)         #    1.327 CPUs utilized          
           842      context-switches          #    0.660 K/sec                  
            58      cpu-migrations            #    0.045 K/sec                  
       135,961      page-faults               #    0.107 M/sec                  
 3,941,655,638      cycles                    #    3.088 GHz                    
 6,034,081,563      instructions              #    1.53  insn per cycle         
 1,167,962,182      branches                  #  915.050 M/sec                  
     2,832,393      branch-misses             #    0.24% of all branches        

   0.961677493 seconds time elapsed

Performance counter stats for 'python scripts/driver.py':

   1007.687420      task-clock (msec)         #    0.797 CPUs utilized          
           803      context-switches          #    0.797 K/sec                  
            12      cpu-migrations            #    0.012 K/sec                  
       131,010      page-faults               #    0.130 M/sec                  
 3,107,377,907      cycles                    #    3.084 GHz                    
 5,928,785,192      instructions              #    1.91  insn per cycle         
 1,146,581,106      branches                  # 1137.834 M/sec                  
     2,129,417      branch-misses             #    0.19% of all branches        

   1.264532373 seconds time elapsed

In [2]: (1.264532373-0.961677493)/1.264532373 * 100
Out[2]: 23.949950706402355

compiler opt

在用perf record分析時，發現有些部份可能可以交由編譯器最佳化，例如每次malloc都會被呼叫但是回傳值都是一樣的fail_allocation

orig

        21,424,519      cache-misses              #   91.576 % of all cache refs      ( +-  0.05% )
        23,395,281      cache-references                                              ( +-  0.17% )
     1,580,167,387      instructions              #    1.51  insn per cycle           ( +-  0.05% )
     1,049,546,666      cycles                                                        ( +-  1.31% )
           246,893      branch-misses                                                 ( +-  2.07% )

       0.343512614 seconds time elapsed                                               ( +-  1.37% )

gcc -O2 flag

        21,198,677      cache-misses              #   92.116 % of all cache refs      ( +-  0.06% )
        23,013,085      cache-references                                              ( +-  0.14% )
     1,242,986,343      instructions              #    1.39  insn per cycle           ( +-  0.05% )
       893,054,339      cycles                                                        ( +-  1.19% )
           232,820      branch-misses                                                 ( +-  1.79% )

       0.291908910 seconds time elapsed

gcc -O3 flag

        21,257,036      cache-misses              #   92.071 % of all cache refs      ( +-  0.06% )
        23,087,689      cache-references                                              ( +-  0.15% )
     1,245,395,797      instructions              #    1.36  insn per cycle           ( +-  0.06% )
       916,791,967      cycles                                                        ( +-  1.35% )
           243,474      branch-misses                                                 ( +-  1.96% )

       0.299784791 seconds time elapsed                                               ( +-  1.45% )

clang -O2 flag

        21,258,614      cache-misses              #   91.996 % of all cache refs      ( +-  0.04% )
        23,108,203      cache-references                                              ( +-  0.16% )
     1,242,551,526      instructions              #    1.34  insn per cycle           ( +-  0.06% )
       924,695,143      cycles                                                        ( +-  1.43% )
           261,872      branch-misses                                                 ( +-  8.10% )

       0.302822045 seconds time elapsed                                               ( +-  1.53% )

clang -O3 flag

        21,243,947      cache-misses              #   92.033 % of all cache refs      ( +-  0.03% )
        23,082,922      cache-references                                              ( +-  0.11% )
     1,241,602,442      instructions              #    1.35  insn per cycle           ( +-  0.05% )
       917,623,562      cycles                                                        ( +-  1.30% )
           249,315      branch-misses                                                 ( +-  5.23% )

       0.299945914 seconds time elapsed                                               ( +-  1.28% )

2019q1 Homework1 (lab0)

tags: `sysprog` `2019` `2019spring`

queue implement

q_new

q_free

q_insert_head

q_insert_tail

q_remove_head

q_size

q_reverse

解釋自動評分系統運作的原理

scripts

git hook

pre-commit.hook

commit-msg.hook

qtest command interpret

初始化階段：

command parse

interpret_cmda執行

report.[hc]

getrusage：

console.[hc]

cmd_select ,run_console:

harness.[hc]

test_malloc and test_free

trigger_exception,exception_setup,exception_cancel

qtest.c

效能改善

Test in parallel

compiler opt

驗證常數時間複雜度^[2]

Reference

2019q1 Homework1 (lab0)

tags: sysprog 2019 2019spring

queue implement

q_new

q_free

q_insert_head

q_insert_tail

q_remove_head

q_size

q_reverse

解釋自動評分系統運作的原理

scripts

git hook

pre-commit.hook

commit-msg.hook

qtest command interpret

初始化階段：

command parse

interpret_cmda執行

report.[hc]

getrusage：

console.[hc]

cmd_select ,run_console:

harness.[hc]

test_malloc and test_free

trigger_exception,exception_setup,exception_cancel

qtest.c

效能改善

Test in parallel

compiler opt

驗證常數時間複雜度[2]

Reference

tags: `sysprog` `2019` `2019spring`

驗證常數時間複雜度^[2]