Try   HackMD

2016q3 Homework1 (phonebook)

contributed by <heathcliffYang>

目標

  1. 在mac上安裝Lubuntu
  2. 學會效能分析工具perf
  3. 學會製圖軟體gnuplot
  4. 針對phonebook進行效能優化
  5. 把背景知識補齊

Because I spent much time installing Lubuntu on mac, I read reference first.

apt-get
  1. It cannot find any packages because my os version is too old.
  2. Then, another problem is "Unable to locate package", I change the download source.

Code 理解

Implementation note

$ lscpu | grep cache
Check the size of cache.

$ echo 1 | sudo tee /proc/sys/vm/drop_caches
Clean the cache.

eog runtime.png
Watch the picture.

Try & Problems

嘗試用開頭字母索引加速搜尋 - 9/26 12 am

1. 想法

Set a linked list to store the searching flags which is the first word in the segment of the same first letter.

2. orig perf stats

cache-misses: 2,009,169 # 96.121% of all cache refs (± 0.71%)
cache-references: 2,090,254 (± 0.72%)
instructions: 260,893,651 # 1.42 insns per cycle (± 0.02%)
cycles: 183,226,677 (± 0.31%)
time elapsed: 0.072529778 seconds (± 2.26%)

3. opt perf stats

cache-misses: 7,562 # 41.626% of all cache refs (± 1.47%)
cache-references: 18,166 (± 0.63%)
instructions: 627,073 # 0.53 insns per cycle (± 0.17%)
cycles: 1,191,255 (± 0.73%)
time elapsed: 0.160321086 seconds (± 11.76%)
Segmentation fault

4.

5. code

.h

typedef struct __PHONE_BOOK_ENTRY {
    char lastName[MAX_LAST_NAME_SIZE];
    struct __PHONE_BOOK_ENTRY *pNext;
} entry;

typedef struct __PHONE_BOOK_MORE {
    char firstName[16];
    char email[16];
    char phone[10];
    char cell[10];
    char addr1[16];
    char addr2[16];
    char city[16];
    char state[2];
    char zip[5];
} more;

static entry *searchIndex; // for index searching
static entry *indexHead;   // for index searching
static entry *temp;

.c

entry *findName(char lastName[], entry *pHead)
{
    while (pHead != NULL) {
        if (lastName[0]==pHead->lastName[0]) {
            if (strcasecmp(lastName, pHead->lastName) == 0)
                return pHead;
            pHead = pHead->pNext;
        }
        else {
            pHead = indexHead->pNext;
            indexHead = indexHead->pNext;
        }
    }

    return NULL;
}

entry *append(char lastName[], entry *e)
{
    e->pNext = (entry *) malloc(sizeof(entry));
    e = e->pNext;
    if (temp->lastName[0]=='0') {
        /* build the entry for index searching */
        indexHead = (entry *) malloc(sizeof(entry));
        searchIndex = indexHead;
    }
    if (temp->lastName[0]!=e->lastName[0]) {
        searchIndex->pNext = e;
        searchIndex=searchIndex->pNext;
    }
    strcpy(e->lastName, lastName);
    temp=e;
    e->pNext = NULL;

    return e;
}

作圖遇到問題 - 9/26 2am

Plot won't change when the code is changed.

Remove opt.txt and orig.txt. Because the program won't remove the previous 100 data. From hugikun999

Debug: Segmentation fault - 9/26 11am

1. idea(1) & Solved Segmentation fault

​Forgot to malloc() for temp

2. opt perf stats

cache-misses: 470,826 # 77.273% of all cache refs (± 0.14%)
cache-references: 605,423 (± 0.37%)
instructions: 270,811,645 # 0.53 insns per cycle (± 0.02%)
cycles: 150,092,320 (± 0.30%)
time elapsed: 0.054915752 seconds (± 1.66%)

3. 執行時間圖

Debug: flag 設置缺失 - 9/27 9am

1. 找到一個問題 & 解決

flag 設置缺失: 因為在e的lastName尚未設置好時,就檢查開頭字母,printf檢查結果如下

set the flag,1; temp:a, e:set the flag,2; temp:a, e:

解決方法: 將strcpy() 提前,讓e的lastName有東西

2. opt perf stat

 Performance counter stats for './phonebook_opt' (100 runs):

           227,675      cache-misses              #   78.157 % of all cache refs      ( +-  0.14% )
           291,305      cache-references                                              ( +-  0.39% )
       186,204,940      instructions              #    1.76  insns per cycle          ( +-  0.03% )
       106,058,289      cycles                                                        ( +-  0.14% )

       0.037255178 seconds time elapsed                                          ( +-  1.01% )

cache-miss 仍偏高

3. 執行時間圖

4. 在findName增加一個strlen()計算lastName長度去判斷

Performance counter stats for './phonebook_opt' (100 runs):

           328,439      cache-misses              #   82.416 % of all cache refs      ( +-  0.04% )
           398,513      cache-references                                              ( +-  0.35% )
       189,556,866      instructions              #    1.70  insns per cycle          ( +-  0.03% )
       111,267,313      cycles                                                        ( +-  0.22% )

       0.043569709 seconds time elapsed                                          ( +-  2.10% )

cache-miss 亦增加

兩者時間都增加了

在append部分加入smaz - 9/28 11am

Algorithms for compress small text strings

Huffman coding *

Smaz *

It can compress text by 40-50% in the average case.

Target function:
int smaz_compress(char *in, int inlen, char *out, int outlen);

重寫code


Hash function

PThread

優勢

  • Thread需要的系統資源比另外複製一個process要少
  • 平行運算

Creating and Terminating Threads

  • int pthread_create(pthread_t *restrict thread, const pthread_attr_t *restrict attr, void *(*start_routine)(void*), void *restrict arg);
  • pthread_exit (status)
  • Thread Attributes: thread 的屬性,該thread要被定義成joinable才能被join

Joining & Detaching

  • 4個步驟創造一個 joinable的thread
    1. 宣告一個資料型態為pthread_attr_t 的pthread 的屬性變數
    2. pthread_attr_init()初始化屬性變數
    3. pthread_attr_setdetachstate()設定其屬性為detached status
    4. When done, free library resources used by the attribute with pthread_attr_destroy()
  • detaching:

參考資料1
參考資料2

These topics are not covered here, however a good overview of "how things work" under Linux can be found in the sched_setscheduler man page.

Efficiency Improvement Strategy

1. Analysis the data set

  • A letter may appear repeatedly in a line.
  • The words are sorted by letter order.
  • The first letter and length of the word are properties.

2. Struct

  • Pick up necessary members.
  • Add the number which represents the length of the word. (yet)
  • Add the line index the word in. -> line index is fuzzy with respect to the letter.

3. append

  • Give each word the corresponding line number and count the length.
  • Set a to z flag.

4. findName

  • Divide the data set as five parts by the line number.
  • Compare the length number.
  • Check the first letter.
  • Check the whole words.

5. Binary search tree*

  • Use letter h as a root node.

背景知識

1. cache:

一種SRAM,存取速度較能與CPU匹配,因此會將常用的資料保存在cache中,讓CPU不需要等待資料進來。

2. branch prediction:

根據同一條指令的歷史紀錄進行預測,讀取接下來最可能被執行的指令,並非順序讀取;對重複性的分支指令序列()能得到比較好的預測結果,若是swithc-case()則無法。

Static Prediction

Dynamic Prediction

Q:造成branch-miss的原因?
待讀完1
待讀完2
待讀完3

C programming:

strlen(stringName); returns the length of string.

typedef struct _a_{ unsigned long size; struct _a_ *next; } b;
typedef extends struct a as a data type, b;

int strcasecmp(const char *s1, const char *s2);
strcasecmp() compares s1 and s2 with ignoring the letter case.
Return 0: length equal
Return >0: s1>s2
Return <0: s1<s2

int *ptr; ptr = malloc(sizeof(int));
malloc() allocates dynamic memory with sizeof() calculating the space it needs. If it succeeds, malloc() will return the pointer to the space.

ptr=fopen("filename","r");

fgets(s,n,ptr)
fopen() r represents the open mode.
fgets() reads data line by line from the file. The first parameter is the array which stores the input data. The second one is the character number at most. The third one is the pointer to the file.

clock_gettime()

<< : shift left