# 210413 Memory Usage Test Results

## Test Procedure

1. The program starts and loads the required objects
2. Analyze the first file (2019_Oct_Data.csv)
    + dataset amount after removing official messages: 1602289
    + dataset amount after removing duplicate messages: 8175
3. Analyze the second file (2019_Nov_Data.csv)
    + dataset amount after removing official messages: 2231899
    + dataset amount after removing duplicate messages: 9987
4. Analyze the third file (2019_Aug-Sep_Data.csv)
    + dataset amount after removing official messages: 973174
    + dataset amount after removing duplicate messages: 113813
5. The program ends

+ RAM usage is recorded to a log file every 5 seconds.

### Code

```python
from datetime import datetime

# `model` and `preprocess_from_csv_to_list` are loaded in step 1 (not shown here).
msg_file_list = ['2019_Oct_Data', '2019_Nov_Data', '2019_Aug-Sep_Data']
for msg_file in msg_file_list:
    # Print a timestamp so each file's start time can be matched against the RAM log.
    print(datetime.now().strftime("%Y/%m/%d %H:%M:%S") + ' ' + msg_file)
    # The `with` block closes the file on exit, so no explicit f.close() is needed.
    with open(msg_file + '.csv', 'r', encoding="utf-8") as f:
        csv_list = preprocess_from_csv_to_list(f)
    csv_df = model.batch_analysis(csv_list, pinyin_mode=True, batch_size=500)
    save_path = msg_file + '_result.csv'
    csv_df.to_csv(save_path, index=False)
```

## Test Machine

+ **CPU:** i7-6700
+ **RAM:** 16 GB
+ **GPU:** None

## Test Results

+ Note 1: "size" in the plots is the number of records in that file (after removing duplicates).
+ Note 2: Although the y-axis of the plots tops out at 8000 MB (about 8 GB), **the machine actually has 16 GB of RAM**.

### 1. Default

+ Elapsed time: about 1 hour 6 minutes (21:46:12 -> 22:52:43)



### 2. Batch size = 500

+ Elapsed time: about 33 minutes (00:55:20 -> 01:27:48)


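
The test procedure states that RAM usage was written to a log file every 5 seconds, but the logging code itself is not part of this note. The following is a minimal sketch of one way such a logger could look, not the script actually used for these measurements. It assumes a Linux machine (it reads `/proc/self/status`), and the names `read_rss_mb`, `log_memory`, and `ram.log` are hypothetical.

```python
import threading
from datetime import datetime


def read_rss_mb():
    """Resident memory of this process in MB (assumption: Linux, via /proc)."""
    with open("/proc/self/status", encoding="utf-8") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1]) / 1024  # the kernel reports kB
    raise RuntimeError("VmRSS not found in /proc/self/status")


def log_memory(log_path, stop_event, interval=5.0, reader=read_rss_mb):
    """Append 'timestamp MB' lines to log_path every `interval` seconds.

    Hypothetical helper: runs until `stop_event` is set. `reader` can be
    swapped for another memory probe on non-Linux systems.
    """
    with open(log_path, "a", encoding="utf-8") as log:
        while True:
            stamp = datetime.now().strftime("%Y/%m/%d %H:%M:%S")
            log.write(f"{stamp} {reader():.1f}\n")
            log.flush()  # keep the log readable while the analysis is running
            if stop_event.wait(interval):  # True as soon as a stop is requested
                break
```

To use it alongside the analysis loop, one would create a `threading.Event`, start `log_memory` in a `threading.Thread` with `args=("ram.log", stop_event)` before the loop, then call `stop_event.set()` and `join()` the thread after the last file is saved. Because the timestamp format matches the one printed per file, the log lines can be aligned with each file's start time.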