# Formatter Design

## Requirements

Take EDP queried data as input and output the following widget responses:

- score card
- line chart

There are also two modes to support:

- normal
- comparison (compare metrics between two different time intervals)

## Request

Example 1:

```
 impressions | performance_datetime | performance_datetime_index | campaign_channel
-------------+----------------------+----------------------------+------------------
      839591 |           1648771200 |                          0 | Android Push
       40149 |           1648771200 |                          0 | InApp
      125457 |           1648771200 |                          0 | iOS Push
     1255733 |           1648857600 |                          1 | Android Push
       52629 |           1648857600 |                          1 | InApp
      286768 |           1648857600 |                          1 | iOS Push
      939040 |           1648944000 |                          2 | Android Push
       52562 |           1648944000 |                          2 | InApp
           1 |           1648944000 |                          2 | InWeb
      194820 |           1648944000 |                          2 | iOS Push
     1230936 |           1649030400 |                          3 | Android Push
       54023 |           1649030400 |                          3 | InApp
      232125 |           1649030400 |                          3 | iOS Push
     1271609 |           1649116800 |                          4 | Android Push
       43161 |           1649116800 |                          4 | InApp
           2 |           1649116800 |                          4 | InWeb
      211339 |           1649116800 |                          4 | iOS Push
     1076898 |           1649203200 |                          5 | Android Push
       43513 |           1649203200 |                          5 | InApp
      178588 |           1649203200 |                          5 | iOS Push
     1114294 |           1649289600 |                          6 | Android Push
       38571 |           1649289600 |                          6 | InApp
           1 |           1649289600 |                          6 | InWeb
      202949 |           1649289600 |                          6 | iOS Push
     1147923 |           1649376000 |                          7 | Android Push
       29250 |           1649376000 |                          7 | InApp
      793111 |           1649462400 |                          8 | Android Push
       21037 |           1649462400 |                          8 | InApp
      884273 |           1649548800 |                          9 | Android Push
       25554 |           1649548800 |                          9 | InApp
```

Example 2:

```
 unique_conversions_overall | performance_datetime | performance_datetime_index |   channel
----------------------------+----------------------+----------------------------+--------------
                        355 |           1648771200 |                          0 | Android Push
                       3390 |           1648771200 |                          0 | InApp
                         40 |           1648771200 |                          0 | iOS Push
                        438 |           1648857600 |                          1 | Android Push
                       6019 |           1648857600 |                          1 | InApp
                         29 |           1648857600 |                          1 | iOS Push
                        405 |           1648944000 |                          2 | Android Push
                       4096 |           1648944000 |                          2 | InApp
                         31 |           1648944000 |                          2 | iOS Push
                        535 |           1649030400 |                          3 | Android Push
                       3987 |           1649030400 |                          3 | InApp
                         47 |           1649030400 |                          3 | iOS Push
                        882 |           1649116800 |                          4 | Android Push
                       6386 |           1649116800 |                          4 | InApp
                         33 |           1649116800 |                          4 | iOS Push
                        566 |           1649203200 |                          5 | Android Push
                       4502 |           1649203200 |                          5 | InApp
                         43 |           1649203200 |                          5 | iOS Push
                        558 |           1649289600 |                          6 | Android Push
                       2972 |           1649289600 |                          6 | InApp
                         46 |           1649289600 |                          6 | iOS Push
                        479 |           1649376000 |                          7 | Android Push
                       2939 |           1649376000 |                          7 | InApp
                        361 |           1649462400 |                          8 | Android Push
                       2410 |           1649462400 |                          8 | InApp
                        622 |           1649548800 |                          9 | Android Push
                       3314 |           1649548800 |                          9 | InApp
                        371 |           1649635200 |                         10 | Android Push
                       2128 |           1649635200 |                         10 | InApp
```

Questions:

- Are column names included in the request? More generally, in what form is the request passed to the formatter: CSV, JSON, or something similar?
- How do the unifier, EDP driver, composer, and formatter work together?
    - Code level (i.e. different Python modules)
    - Service level
    - Other ways?

Answers:

- Yes, each row is a dict mapping column name to value.
- Code level for now. (Update 2022/05/03: EDP driver -> formatter via HTTP response.)

## Response

### Normal

#### Score card

```json=
dataset: {
  dimensions: ['Channel'],
  source: [
    ['Android Push', 24600],
    ['iOS Push', 13800],
    ['SMS', 34100],
    ['Web Push', 29000],
    ['Email', 76200],
    ['Line', 1500000],
  ],
}
```

#### Time Series

```json=
{
  dataset: {
    dimensions: ['date', 'regular', 'trigger', 'in-web', 'in-app', 'journey'],
    source: [
      [unix_timestamp_in_sec, 12, 55, 66, 2],
      [unix_timestamp_in_sec, 6, 16, 23, 1],
      [unix_timestamp_in_sec, 8, 30, 74, 7],
      [unix_timestamp_in_sec, 90, 2, 52, 3],
      [unix_timestamp_in_sec, 100, 40, 59, 11],
    ]
  },
  series: [
    {type: 'line'},
    {type: 'line'},
    {type: 'line'},
    {type: 'line'}
  ],
}
```

### Comparison

#### Score card

```json=
dataset: {
  dimensions: ['Channel'],
  source: [
    ['Android Push', 24600],
    ['iOS Push', 13800],
    ['SMS', 34100],
    ['Web Push', 29000],
    ['Email', 76200],
    ['Line', 1500000],
  ],
  comparison: {
    'Android Push': {
      uplift: 0.2,
    },
    ...
  }
}
```

#### Time Series

```json=
{
  dataset: {
    dimensions: ['date', 'conversion', 'Previous Period'],
    source: [
      [1646092800, 12, 3],
      [1646179200, 6, 2],
      [1646265600, 8, 94],
      [1646352000, 90, 38],
      [1646438400, 121, 58],
      [1646524800, 100, 48],
      [1646611200, 30, NaN]
    ],
    comparison: {
      // the data point of 1646092800 is mapped to 1645488000
      1646092800: {
        previous_time: 1645488000,
        uplift: 0.2,
      },
      ...
    }
  },
  series: [
    {type: 'line'},
    {type: 'line', is_comparison_mode: true},
  ],
}
```

## Pseudo Code

### Normal & Comparison

```python
from typing import Dict, List, Optional, Tuple, Union

# One EDP result row; values follow the column order given in `dims`.
DataRow = Tuple[Union[str, float], ...]
# Column (dimension) IDs, one per value in a DataRow.
Dims = Tuple[str, ...]


def interpolation(data: List[DataRow], beg_time: int, end_time: int, interval: int):
    pass


def convert_query_result_to_widget_format(
    data: List[DataRow],
    dims: Dims,
    list_prev_dims: Optional[List[Dims]],
    datetime_option: Optional[Dict],
):
    """
    data -- EDP queried result, e.g.
        [(839591, 1648771200, 'Android Push'), (40149, 1648771200, 'InApp')]
    dims -- Column (dimension) IDs, e.g.
        ('impression', 'datetime', 'campaign_type')
    list_prev_dims -- Specified dims for previous periods; used by comparison mode.
    datetime_option -- Specified time range and interval in seconds
        (usually dim 0 if it exists?), e.g.
        {
            'datetime_range': [1648771200, 1648794300],
            'interval': 43200
        }
    """
    # 1. interpolation
    if datetime_option:
        data = interpolation(
            data,
            datetime_option['datetime_range'][0],
            datetime_option['datetime_range'][1],
            datetime_option['interval'],
        )

    # 2. real formatting work
    pass
```
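
The two steps in the pseudo code above (interpolation, then formatting) could look roughly like the sketch below for a normal-mode time series. It is only an illustration under several assumptions that this document does not settle: rows arrive as dicts from column name to value (as stated in the Answers), missing time buckets are filled with `0`, `datetime_range` is inclusive on both ends, and series are ordered alphabetically. The names `fill_missing_buckets` and `to_time_series_widget` are hypothetical.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def fill_missing_buckets(
    rows: List[Row],
    metric: str,
    time_col: str,
    series_col: str,
    beg_time: int,
    end_time: int,
    interval: int,
    fill_value: Any = 0,  # assumption: gaps are shown as 0, not null
) -> List[Row]:
    """Return rows with one entry per (time bucket, series) pair."""
    series_names = sorted({r[series_col] for r in rows})
    seen = {(r[time_col], r[series_col]) for r in rows}
    filled = list(rows)
    # Assumption: the range is inclusive on both ends.
    for ts in range(beg_time, end_time + 1, interval):
        for name in series_names:
            if (ts, name) not in seen:
                filled.append({metric: fill_value, time_col: ts, series_col: name})
    return sorted(filled, key=lambda r: (r[time_col], r[series_col]))


def to_time_series_widget(
    rows: List[Row], metric: str, time_col: str, series_col: str
) -> Dict[str, Any]:
    """Pivot long-format rows into the dataset/series layout shown above."""
    series_names = sorted({r[series_col] for r in rows})
    by_time: Dict[int, Dict[str, Any]] = {}
    for r in rows:
        by_time.setdefault(r[time_col], {})[r[series_col]] = r[metric]
    source = [
        [ts] + [by_time[ts].get(name) for name in series_names]
        for ts in sorted(by_time)
    ]
    return {
        "dataset": {"dimensions": ["date"] + series_names, "source": source},
        "series": [{"type": "line"} for _ in series_names],
    }


# A few rows taken from Example 1; InWeb and InApp each miss one day.
rows = [
    {"impressions": 1255733, "performance_datetime": 1648857600, "campaign_channel": "Android Push"},
    {"impressions": 52629, "performance_datetime": 1648857600, "campaign_channel": "InApp"},
    {"impressions": 939040, "performance_datetime": 1648944000, "campaign_channel": "Android Push"},
    {"impressions": 1, "performance_datetime": 1648944000, "campaign_channel": "InWeb"},
]
filled = fill_missing_buckets(
    rows, "impressions", "performance_datetime", "campaign_channel",
    beg_time=1648857600, end_time=1648944000, interval=86400,
)
widget = to_time_series_widget(
    filled, "impressions", "performance_datetime", "campaign_channel"
)
```

With these four rows, `widget["dataset"]["source"]` has one row per day and one column per channel, with the gaps filled in by `fill_value`. Whether gaps should be `0`, `NaN`, or omitted is a product decision; the Comparison example above uses `NaN` for a missing previous-period point, so the fill value is kept as a parameter.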