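# SERVER API VERSION 2

# Exporting and importing the environment

## Export the current conda environment

Use the following command:

**`conda-env export -n your_env_name > your_env_name.yml`**

Example:

**`conda-env export -n test_env > test_env.yml`**

## Create a new environment from a yml configuration

Use the following command:

**`conda-env create -n new_env_name -f=your_yml_config.yml`**

Example:

**`conda-env create -n new_env -f=test_env.yml`**

# SERVER API program files

**The code is split into 4 files:**

1. sever.py: the main program in which the SERVER does its computation. It receives data, runs the prediction, and returns the prediction results.
2. run.py: the main data-processing program, in 3 parts:
    * Fetch the raw data from the forum in the public area, merge it into a single file, and send it to the server for prediction. Once the prediction results come back, sample them (8000 records) and split the sample into 8 files for manual correction.
    * Every day, merge the corrected data with the previous day's training data and retrain a new model.
    * Every day, compute metrics comparing the corrected results with the SERVER's predictions: number of prediction errors, AUC, accuracy, and the confusion matrix.
3. rebuild_model.py: the module that rebuilds the model. It stores each day's trained model in 'model/mode/' and also updates the model under 'model/model_for_server'.
4. data.py: all data-processing functions, as follows:
    * get_data(): fetch data from the public area
    * to_AITeam_Result(): write data out to 'AITeam_Result_2'
    * sampling_data(): sample 8000 records from the predicted data. All records predicted Y are taken, and the rest of the sample is N
    * split_sampling_data(): split the sampled data
    * catch_correct_data_to_file(): pick out the data that has already been corrected and save it separately under the 'adj_ad_for_train' directory
    * write_into_statistic_data(): write the statistics (number of prediction errors, accuracy, AUC, confusion matrix) to a .CSV file
    * find_miss_date(check_dir): find the dates that exist in "finish_adj" but not in "check_dir"
    * create_new_train(): build the new training data
    * adjust_prediction(): adjust the prediction results to match the manually corrected results (whether the item is an ad)
    * load_transform(): data preprocessing, including jieba segmentation, comma handling, whitespace handling, and so on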
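
The sampling and splitting steps performed by sampling_data() and split_sampling_data() could look roughly like the sketch below. It is not the code in data.py: the names `sample_records` and `split_records`, the `pred` field, and the record layout are assumptions made for illustration. It also assumes the N records are drawn at random, and that if more than 8000 records are predicted Y, all of them are kept and no N records are added.

```python
import random


def sample_records(records, total=8000, seed=0):
    """Keep every record predicted 'Y', then fill up to `total` with random 'N' records."""
    ys = [r for r in records if r["pred"] == "Y"]
    ns = [r for r in records if r["pred"] == "N"]
    # Number of N records needed; never negative, never more than are available.
    n_fill = min(len(ns), max(0, total - len(ys)))
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return ys + rng.sample(ns, n_fill)


def split_records(records, parts=8):
    """Split into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(records), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append(records[start:start + size])
        start += size
    return chunks


if __name__ == "__main__":
    # Hypothetical predicted data: every 10th record is predicted 'Y'.
    demo = [{"id": i, "pred": "Y" if i % 10 == 0 else "N"} for i in range(20000)]
    sample = sample_records(demo)
    chunks = split_records(sample)
    print(len(sample), [len(c) for c in chunks])
```

Taking every Y record means the manual reviewers see all predicted ads, while the N records in the sample give them a chance to catch missed ones.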
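
The daily metrics that run.py computes and write_into_statistic_data() records (prediction errors, accuracy, AUC, confusion matrix) can be illustrated with a small pure-Python sketch. The function names, the 'Y'/'N' label encoding, and the presence of a per-record score from the server are assumptions, not details confirmed by the project. In practice a library such as scikit-learn would normally be used, and the pairwise AUC here is quadratic in the number of records.

```python
def auc(labels, scores):
    """AUC as the probability that a random 'Y' record outscores a random 'N' record."""
    pos = [s for l, s in zip(labels, scores) if l == "Y"]
    neg = [s for l, s in zip(labels, scores) if l == "N"]
    if not pos or not neg:
        raise ValueError("AUC needs at least one 'Y' and one 'N' record")
    # Ties between a positive and a negative score count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def daily_metrics(corrected, predicted, scores):
    """Compare manually corrected labels with the server's predictions."""
    if not corrected:
        raise ValueError("no records to evaluate")
    tp = fp = tn = fn = 0
    for truth, pred in zip(corrected, predicted):
        if truth == "Y":
            if pred == "Y":
                tp += 1
            else:
                fn += 1
        else:
            if pred == "Y":
                fp += 1
            else:
                tn += 1
    total = tp + fp + tn + fn
    errors = fp + fn
    return {
        "errors": errors,
        "accuracy": (total - errors) / total,
        "auc": auc(corrected, scores),
        # Rows are the corrected label (N, Y); columns are the prediction (N, Y).
        "confusion_matrix": [[tn, fp], [fn, tp]],
    }


if __name__ == "__main__":
    # Hypothetical example data.
    corrected = ["Y", "N", "Y", "N"]
    predicted = ["Y", "N", "N", "N"]
    scores = [0.9, 0.2, 0.4, 0.1]
    print(daily_metrics(corrected, predicted, scores))
```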