# [Kevin] 2019/7 July daily plan

## Schedule

- 7/1 ~ 7/9
  - Plan
    - Debug the code of the inference part.
    - Make every for loop in the program work.
- 7/1
  - Settle in:
    - Choose my seat, computer, screen, and chair.
    - Install the operating system (Ubuntu 16.04) and the development environment.
- 7/2
  - Meet and get to know my student:
    - His name is Achmad Kripton Nugraha. I can call him Kripton.
  - Outcome:
    - Narrowed down the problem. Inference has a bug because the DataFrame becomes empty after executing feature engineering.
  - Note:
    - df works.
    - df_idx works.
    - df_new is an empty DataFrame, so the error happens in feature_engineering.
  - Tomorrow's plan:
    - Check the functions in feature_engineering and try to narrow down the problem.

---------------------------------------------------------

- 7/3
  - Outcome:
    - Narrowed down the problem. The cause of the bug is that df.dropna() drops every row, because some columns contain only NaN.
  - Note:
    - Only these columns are needed: [Refilling, Hot Temp, Cold Temp, Warm Temp, Hot Valve, Warm Valve, Cold Valve]
    - Feature engineering will generate the other features I need.
    - Power saving data is generated from the Refilling data (9th step of feature engineering).
    - Time shifting generates the power saving data of yesterday and of last week.
    - After that, time shifting calls df.dropna() to drop all NaN data in the DataFrame. (This is why I got an empty DataFrame: the current input data has some useless features that are full of NaN, which makes the whole DataFrame get dropped.)
    - DataFrame.shape during the feature_engineering process: (30240, 23), (30240, 23), (30240, 23), (21600, 25), (21600, 30), (1440, 30), (1440, 30), (1440, 30), (1440, 31), (0, 33) >>> after time shifting.
  - Tomorrow's plan:
    - Find out which features should be dropped, and where the code drops them.

-----------------------------------------------------------------

- 7/4
  - Outcome:
    - The feature_engineering module works now.
  - Note:
    - The inference data I put in was wrong: it had too many useless columns, so drop them first.
      Columns_drop = ['Sterilizing', 'ErrorCode', 'SavingPower', 'ColdTemp_Insulation_Low', 'ColdTemp_Insulation_High', 'HotTemp_Insulation_Low', 'ColdTemp_Insulation', 'WarmTemp_Insulation', 'HotTemp_Insulation_High', 'HotTemp_Insulation', 'WaterLevel', 'TDS', 'Status', 'Consumption', 'Filter_Usage', 'Filter_Hint', 'Usage_CC', 'Usage_L', 'Usage_MT']
    - After data_engineering, the columns are:
      Columns = ['Heating', 'Cooling', 'Refilling', 'WarmTemp', 'ColdTemp', 'HotTemp', 'Consump', 'day', 'hour_class', 'cool_time_back23', 'consump_sum_back23', 'hottemp_mean_back23', 'warmtemp_mean_back23', 'coldtemp_mean_back23', 'power_saving', 'Last_week', 'yesterday']
    - DAYS. For data_engineering we input 21 days of data; 15 days remain after the weekends are dropped, and finally 10 days are left after dropna() in time shifting.
    - FEATURES. We have 9 features at first, drop 3, and then add 11 during data engineering, so we have 17 features after data engineering.
    - DataFrame.shape during the feature_engineering process: (30240, 7), (30240, 7), (30240, 7), (21600, 9), (21600, 14), (1440, 14), (1440, 14), (1440, 14), (1440, 15), (960, 17)
  - Tomorrow's plan:
    - Write code to generate the answer for power saving prediction from the raw data.

---------------------------------------------------------

- 7/5
  - Outcome:
    - Learned how to maintain the dispensers.
      - Step 1: Use the GreenMeter app to check that the meter is working.
      - Step 2: Use the Gateway app to check that the Gateway is transferring data.
      - Step 3: Use Bluetooth to check that the Gateway's WiFi is available.
      - Step 4: Replace the Module's battery if it is dead. The Module should hook onto the red wire.
  - Note:
    - Compare df_real (answer) with df_heatmap (prediction) to calculate the percentage of energy we saved.
    - The power_saving answer is derived from refilling: if refilling > 5 hrs && < 12 hrs, power_saving = 1; else power_saving = 0.
    - There are some small differences between the code and the definition above.
    - Learned how to maintain the dispensers.
  - Tomorrow's plan:
    - Write code to generate the answer for power saving prediction from the raw data.

------------------------

- 7/8
  - Outcome:
    - Finished the power saving answer (ps_time.py).
      - Step 1: Read the CSV file and drop columns and rows.
      - Step 2: Execute ps_time.main to calculate the answer, real_pstime.
      - Step 3: Drop columns, then reshape ps_time (row index = time, columns = date).
      - Step 4: Save as a CSV file.
  - Note:
    - The original power_saving code means:
      ```
      block       :s     e     s     e
      refilling   :1...11100000111...1
      power_saving:....00111111100....
      #block means the consecutive 1s of refilling
      #block size means the length
      ```
    - So I decided to modify the code so that power_saving is not 1 at the end and beginning of a block (111...1).
  - Tomorrow's plan:
    - Modify the power_saving code to conform to the meaning of power_saving we defined (if refilling >= 5 hrs && <= 12 hrs, power_saving = 1; else power_saving = 0).

--------------------------

- 7/9
  - Outcome:
    - Modified the power_saving and group code (in inference_one_month.py).
  - Note:
    - The start and end of a refilling = 1 block should not have power_saving = 1.
    - The time range should be 12 hrs >= time >= 5 hrs.
    - Group tolerates one minute of data. But when counting block size the data is in 15-minute units, so it should not tolerate any.
    - The code under if index==0 is wrong; it should count the block size before the first block.
  - Tomorrow's plan:
    - Make sure the final metrics have the same size for each column, so we can distinguish the output results correctly.
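
The 7/3 finding can be reproduced on a toy frame. This is only a minimal sketch, not the project's feature_engineering code: the values are made up, and TDS is assumed to be the all-NaN column purely for illustration (the notes do not say which columns were entirely NaN).

```python
import numpy as np
import pandas as pd

# Toy data: TDS is a hypothetical column that is NaN in every row.
df = pd.DataFrame({
    "Refilling": [1, 0, 1, 1],
    "HotTemp": [95.0, 94.5, np.nan, 96.0],
    "TDS": [np.nan] * 4,
})

# dropna() removes every row containing at least one NaN.
# Because TDS is NaN everywhere, no row survives.
empty = df.dropna()

# Dropping the all-NaN columns first keeps the usable rows.
cleaned = df.dropna(axis=1, how="all").dropna()

print(empty.shape)    # (0, 3)
print(cleaned.shape)  # (3, 2)
```

Dropping a fixed list such as Columns_drop, as done on 7/4, has the same effect when the useless columns are known in advance.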
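
Step 3 of the 7/8 outcome (row index = time, columns = date) could look roughly like the following. The frame long_df, its sample dates, and the flagged interval are hypothetical stand-ins; the real ps_time.py is not shown in these notes.

```python
import pandas as pd

# Hypothetical long-format result: one power_saving flag per 15-minute timestamp.
idx = pd.date_range("2019-07-01", periods=96 * 2, freq="15min")
long_df = pd.DataFrame({"power_saving": 0}, index=idx)
long_df.loc["2019-07-01 01:00":"2019-07-01 06:00", "power_saving"] = 1

# Split the timestamp into date and time, then pivot so that
# each row is a time of day and each column is a date.
wide = long_df.assign(
    date=long_df.index.date,
    time=long_df.index.time,
).pivot(index="time", columns="date", values="power_saving")

print(wide.shape)  # (96, 2)
```

A frame in this shape can then be written out with wide.to_csv(...), which would correspond to Step 4.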
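
The power_saving rule from the 7/8 and 7/9 notes can be sketched as below. This is an interpretation, not the code in inference_one_month.py: it assumes that the duration being measured is the run of refilling == 0 between two blocks (as the diagram suggests), that the data is sampled every 15 minutes, and that a run touching the start or end of the data is counted like any other. The function name label_power_saving is made up for this example.

```python
import pandas as pd

SAMPLES_PER_HOUR = 4  # assumes 15-minute data


def label_power_saving(refilling: pd.Series, min_hours: int = 5, max_hours: int = 12) -> pd.Series:
    """Return 1 inside runs of refilling == 0 that last min_hours..max_hours, else 0."""
    gap = refilling.eq(0).astype(int)
    # A new run starts whenever the value differs from the previous sample.
    run_id = gap.ne(gap.shift()).cumsum()
    run_len = gap.groupby(run_id).transform("size")
    lo = min_hours * SAMPLES_PER_HOUR
    hi = max_hours * SAMPLES_PER_HOUR
    # Only the zeros are labelled, so the first and last 1 of each
    # refilling block stay at power_saving = 0.
    return (gap.eq(1) & run_len.between(lo, hi)).astype(int)


# 1 hour of refilling, a 5-hour gap, 1 hour of refilling, a 2.5-hour gap, 30 minutes of refilling.
refilling = pd.Series([1] * 4 + [0] * 20 + [1] * 4 + [0] * 10 + [1] * 2)
labels = label_power_saving(refilling)
print(int(labels.sum()))  # 20
```

Working in whole 15-minute samples means the bounds are 20 and 48 samples, with no tolerance for a missing sample inside a run.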