Try   HackMD

2017q1 Homework4 (phonebook-concurrent)

contributed by <ryanwang522>

開發環境

  • OS: ubuntu 16.04 LTS
  • L1d cache: 32K
  • L1i cache: 32K
  • L2 cache: 256K
  • L3 cache: 3072K
  • Architecture:x86_64
  • CPU op-mode(s): 32-bit, 64-bit
  • Byte Order: Little Endian
  • CPU(s): 4
  • Model name: Intel® Core i5-4200H CPU @ 2.80GHz

原始版本

  • 先看懂程式碼:
    • text-align()

      先利用 memset 將每個字元都寫為\0 ,接著在呼叫 strncpy() 將所讀到的字串覆寫到 wbuf 中,使得每一行的加上 null-chracters 的總長度固定,方便使用 mmap() 利用 offset 的操作對映射在 memory 的 file 資料進行動作。

    • pthread_setconcurrency()

      這裡我查了一下 Linux manual page

      ​​​​​​​​NOTES
      ​​​​​​​The default concurrency level is 0.
      
      ​​​​​​​Concurrency levels are only meaningful for M:N threading implementations, where at any moment a subset of a process's set  of  user-level  threads
      ​​​​​​​may be bound to a smaller number of kernel-scheduling entities.  Setting the concurrency level allows the application to give the system a hint as
      ​​​​​​​to the number of kernel-scheduling entities that should be provided for efficient execution of the application.
      
      ​​​​​​​Both LinuxThreads and NPTL are 1:1 threading implementations, so setting the concurrency level has no meaning.  In other  words,  on  Linux  these
      ​​​​​​​functions merely exist for compatibility with other systems, and they have no effect on the execution of a program.
      

      裡頭寫到在 linux 環境裡 concurrency level 並不影響程式執行,我把程式裡頭的函式去掉也發現執行情況並沒有太大差異,不知道有沒有地方弄錯@@

      • 認為只是為了跨平台而加的指令,功能是明確說明要用多少 kernel thread 去對應多少 user thread ,就是 OS 課程提到的 thread modeling。
    • 接著就是進行分配工作給每個 thread 的部份,比較特別的是這裡是直接 malloc 所有 entry 所需的記憶體,省去頻繁呼叫 malloc() 的 overhead,也是透過 text-align 所得到的效益。

    ​​​​ entry_pool = (entry *)malloc(sizeof(entry) * ​​​​ file_size / MAX_LAST_NAME_SIZE); ​​​​ ​​​​ for (int i = 0; i < THREAD_NUM; i++) ​​​​ // Created by malloc, remeber to free them. ​​​​ thread_args[i] = createThread_arg(map + MAX_LAST_NAME_SIZE * i, map + file_size, i, ​​​​ THREAD_NUM, entry_pool + i); ​​​​ /* Deliver the jobs to all threads and wait for completing */ ​​​​ clock_gettime(CLOCK_REALTIME, &mid); ​​​​ for (int i = 0; i < THREAD_NUM; i++) ​​​​ pthread_create(&threads[i], NULL, (void *)&append, (void *)thread_args[i]); ​​​​ for (int i = 0; i < THREAD_NUM; i++) ​​​​ pthread_join(threads[i], NULL);
    • 最後再把每個 thread 所創的 linked-list 頭尾相接起來,但這裡不同於之前擁有排序好的特性。由 findName() 的時間比較可以看出並沒有相差太多,依然需要 Linear search,時間複雜度為 O(n)。
    • 在看 code 時發現合併 linked-list 時的 pHead 指向了原本頭節點的下一個節點
    ​​​​headPtr = thread_args[i]->lEntry_head->pNext;
    • 透過呼叫 show_entry() 來檢查正確性,顯示只有 349896 個節點,看來是在相接的時候少接了原本的頭節點,將程式更改為:
    ​​​​headPtr = thread_args[i]->lEntry_head;

Refactoring

  • 參考老師給東霖 csielee 的建議 以及 async.hasync.c,嘗試利用封裝概念去增加 main.c 的可讀性
  • 根據上面的理解,去掉 pthread_setconcurrency() 的呼叫

Interface 實作

  • 利用在 .h 檔宣告所需函式結構,在透過 function pointer 在 .c 檔進行實作。
  • .h
extern struct __API { ​ void (*initialize)(); ​ entry *(*findName)(char lastName[]); ​ entry *(*append)(char *fileName); ​ void (*free)(); } Phonebook;
  • .c
struct __API Phonebook = { .initialize = phonebook_init, .findName = phonebook_findName, .append = phonebook_append, .free = phonebook_free, };
  • 透過在 .c 檔的實作,在 main.c 就可以直接進行同樣的呼叫,編譯時期不同執行檔會各自連結到相對應的函式實作,不必在透過頻繁的 #ifdef 來達到不同版本的運作
Phonebook.init()
  • 重構後的 main.c,大部分程式碼都移植到對應的 .c 檔裡了!
int main(int argc, char *argv[]) { struct timespec start, end; double cpu_time1,cpu_time2; Phonebook.initialize(); /* Build the entry */ entry *pHead; printf("size of entry : %lu bytes\n", sizeof(entry)); /* Compute execution time */ clock_gettime(CLOCK_REALTIME, &start); pHead = Phonebook.append(DICT_FILE); clock_gettime(CLOCK_REALTIME, &end); cpu_time1 = diff_in_second(start, end); /* Find the given entry */ /* the givn last name to find */ char input[MAX_LAST_NAME_SIZE] = "zyxel"; assert(Phonebook.findName(input) && "Did you implement findName() in " IMPL "?"); assert(0 == strcmp(Phonebook.findName(input)->lastName, "zyxel")); #if defined(__GNUC__) __builtin___clear_cache((char *) pHead, (char *) pHead + sizeof(entry)); #endif /* Compute the execution time */ clock_gettime(CLOCK_REALTIME, &start); Phonebook.findName(input); clock_gettime(CLOCK_REALTIME, &end); cpu_time2 = diff_in_second(start, end); /* Write the execution time to file. */ FILE *output; output = fopen(OUTPUT_FILE, "a"); fprintf(output, "append() findName() %lf %lf\n", cpu_time1, cpu_time2); fclose(output); printf("execution time of append() : %lf sec\n", cpu_time1); printf("execution time of findName() : %lf sec\n", cpu_time2); /* Release memory */ Phonebook.free(); return 0; }
  • 但由於 text_align() 移動到 Phonebook.append() 裡,所以測出來的時間才會暴增

新增功能

  • 刪除節點功能 - Phonebook.remove()
    • 只有將目標節點從 linked list 中去除,並沒有 free 掉一開始就已經 malloc 好的空間,空間的釋放交給程式結束前一次 free 掉 entry_pool 所指向的大空間
  • 實作
static void phonebook_remove(char lastName[]) { entry *e = findName(lastName, headPtr); if (e) { if (e != headPtr) prev->pNext = e->pNext; else headPtr = e->pNext; } else { printf("Target not exist.\n"); return; } free(e->dtl); }

利用 IPC 同步 align & append