2017q1 Homework1 phonebook

# 2017q1 Homework1 phonebook contributed by < `SeanCCC` > ### Reviewed by `divazone` * 我認為應該要將紀錄時間用導去檔案存起來，不應該直接導去/dev/null，時間紀錄不是垃圾 * 我覺得find name有優化的空間，可能可以避免memory leak * github上的提交訊息要清楚明瞭，在code內也可以做一些簡單的註解說明 ## 開發環境 ``` Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian CPU(s): 1 On-line CPU(s) list: 0 Thread(s) per core: 1 Core(s) per socket: 1 Socket(s): 1 NUMA node(s): 1 Vendor ID: GenuineIntel CPU family: 6 Model: 62 Model name: Intel(R) Xeon(R) CPU E5-2630L v2 @ 2.40GHz Stepping: 4 CPU MHz: 2399.998 BogoMIPS: 4799.99 Virtualization: VT-x Hypervisor vendor: KVM Virtualization type: full L1d cache: 32K L1i cache: 32K L2 cache: 256K L3 cache: 15360K NUMA node0 CPU(s): 0 ``` ## 準備期 * 安裝雙系統之災難 * 在要進入try Lubuntu後會黑屛，上網查過發現是顯示卡的問題 * 在選擇try Lubuntu時按e進入編輯，然後加入nomodeset即可解決 * 在分割磁碟時，發現硬碟上有四個主磁區後便沒辦法增加磁區，只好移除掉一個看起來沒用的 * 在安裝grub時說放不進去/target讓我難過 * 發現是網路卡沒有插好，插好之後就沒有這個問題 * 安裝到grub2後，安裝程式崩潰 * 索性直接用一顆新硬碟安裝Lubuntu，手滑新硬碟掉地，新硬碟卒。 * 轉而嘗試虛擬主機，然後發現VM不能抓cache-miss * 轉而嘗試raspbian，發現apt-get找不到要自己build * 上網查詢發現DigitalOcean的VPS可以做cache-miss，實測可以成功抓到 * DigitalOcean * ssh key，開啟VPS時給予DO，第一次登入可以不用密碼，直接用私鑰，安全性高 * 新增root以外使用者，設定ssh_config 和 sudoers * 忘記安裝gcc導致make失敗 * 執行" apt-get update & apt-get install gcc " * 透過lscpu取得硬體資訊 * 可以獲得CPU型號和cache大小等等資訊 * 結果同開發環境 * make run * 紀錄運算所花費時間 * 以下是origin的數據統計 ``` size of entry : 136 bytes execution time of append() : 0.064278 sec execution time of findName() : 0.008735 sec ``` * make plot * 閱讀Makefile後發現其實是執行"gnuplot scripts/runtime.gp" * 閱讀runtime.gp後發現是拿output去依照scripts/runtime.gp設定的方法輸出圖表 * plot兩組數字總是一樣 * 在phoneboot_opt.h裡定義OPT為1即可 * make cache-test * 實際上是透過perf執行數據統計 * 以下是origin的數據統計，有88.778%的cache-misses ``` Performance counter stats for './phonebook_orig' (100 runs): 1,882,921 cache-misses # 88.778 % of all cache refs ( +- 0.19% ) 2,120,939 cache-references ( +- 0.29% ) 262,799,092 instructions # 1.11 insns per cycle ( +- 0.02% ) 237,529,409 cycles ( +- 0.66% ) 0.094042119 seconds time elapsed ( +- 0.73% ) ``` * Github ssh key * 利用如下指令新增VPS上的ssh key，然後設定到github裡 ``` ssh-keygen -t rsa -C "your_email@example.com" ``` * 利用如下指令確認github的ssh key運行正常 ``` ssh -T git@github.com ``` * 結果如下 ``` Hi SeanCCC! You've successfully authenticated, but GitHub does not provide shell access. ``` ## 開發紀錄 ### 針對極高的cache-misses做改善 * 那100筆的時間記錄太煩人了，並且output已經存在orig.txt和opt.txt了，因此直接導入/dev/null * 修改原理：cache是比記憶體快好幾倍的儲存空間，因此CPU會先嘗試尋找需要的資料是不是在cache裡面，如果不在就是cache-miss，這時要花費好幾倍的時間去記憶體讀取，如果記憶體又沒有，就要去置換空間，每一段的讀取花費的時間都是倍數成長，應該要避免。 * 修改方案：參考老師影片中前人的做法，使用將struct縮小來減少每個entry的數量。如此可以期望能cache到更多entry，減少cache-misses，原始碼如下。 * 修改結果（數據與圖表在下方）：cache-misses從將近90%降到57.7%。 ```c= typedef struct __PHONE_BOOK_ENTRY { char lastName[MAX_LAST_NAME_SIZE]; char firstName[16]; char email[16]; char phone[10]; char cell[10]; char addr1[16]; char addr2[16]; char city[16]; char state[2]; char zip[5]; struct __PHONE_BOOK_ENTRY *pNext; } entry_detail; typedef struct __LAST_NAME_ENTRY { char lastName[MAX_LAST_NAME_SIZE]; entry_detail *detail; struct __LAST_NAME_ENTRY *pNext; } entry; ``` * 結果如下 ``` perf stat --repeat 100 \ -e cache-misses,cache-references,instructions,cycles \ ./phonebook_orig > /dev/null Performance counter stats for './phonebook_orig' (100 runs): 1,846,951 cache-misses # 89.464 % of all cache refs ( +- 0.45% ) 2,064,471 cache-references ( +- 0.54% ) 262,409,563 instructions # 1.04 insns per cycle ( +- 0.02% ) 252,014,055 cycles ( +- 0.90% ) 0.128133241 seconds time elapsed ( +- 2.91% ) perf stat --repeat 100 \ -e cache-misses,cache-references,instructions,cycles \ ./phonebook_opt > /dev/null Performance counter stats for './phonebook_opt' (100 runs): 426,313 cache-misses # 57.716 % of all cache refs ( +- 0.61% ) 738,644 cache-references ( +- 1.28% ) 244,287,423 instructions # 1.56 insns per cycle ( +- 0.02% ) 156,112,220 cycles ( +- 1.33% ) 0.082167426 seconds time elapsed ( +- 4.13% ) ``` * 結果plot如下 ![](https://i.imgur.com/FmUAtbC.png) ### hash 與 binary tree待補 ## 問題討論 1.回答以下問題：本例選用的 dataset (也就是輸入電話簿的姓名) 是否存在問題？ * 有代表性嗎？跟真實英文姓氏差異大嗎？ * 裡面有許多名字根本不會出現，但是根據同學的共筆其實字首的分布滿接近現實的。 * 資料難道「數大就是美」嗎？ * 如這次程式，資料大讀取就花時間，讀了不用更是一種浪費。因此數大就是美只適用於資料有充分利用並且儲存在結構妥當的struct的情況。 * 如果我們要建立符合真實電話簿情境的輸入，該怎麼作？ * 可以利用各種名字表單，例如電話簿。 ## 疑問 :::warning 如果偵測效率的程式本身就需要用到cpu，是不是代表永遠無法完美偵測程式效率?畢竟偵測行為本身不是沒有效能代價的。 ::: ## 參考資料 * http://wiki.csie.ncku.edu.tw/ * http://linux.vbird.org/ * 老師作業介紹影片 * davis8211 同學解決圖表輸出的方法 * https://zh.wikipedia.org/wiki/%E7%BC%93%E5%AD%98

Syntax	Example	Reference
# Header	Header	基本排版
- Unordered List	Unordered List
1. Ordered List	Ordered List
- [ ] Todo List	Todo List
> Blockquote	Blockquote
Bold font	Bold font
Italics font	Italics font
~~Strikethrough~~	~~Strikethrough~~
19^th^	19^th
H~2~O	H₂O
++Inserted text++	Inserted text
==Marked text==	Marked text
[link text](https:// "title")	Link
![image alt](https:// "title")	Image
`Code`	`Code`	在筆記中貼入程式碼
```javascript var i = 0; ```	`var i = 0;`
:smile:		Emoji list
{%youtube youtube_id %}	Externals
$L^aT_eX$	L^aT_eX
:::info This is a alert area. :::	This is a alert area.