# Click Overmind overview - Arbitrage handling

## I. Generate Arbitrage feed

```plantuml
@startuml
autonumber

actor "Arbitrage Start Trigger" as arb_trigger
participant "Traffic Acquisition System" as tas
participant "Tracing System" as ts
database "AWS S3" as s3

arb_trigger -> tas ++: Trigger Arbitrage feed generation
tas -> s3: Read Job Advert parquets placed by Job Feed Processing
tas -> tas: ...
tas -> ts: Create new feed level business process
loop for each job
    tas -> ts: Create new job level business process (tfj1)
    note left
        See example below
    endnote
end loop
tas -> tas: ...
tas -> s3 --: Write latest feed XML
@enduml
```

### Tracing System record example for a `tfj1`:

```yml
global_trace_id: {"uuid": 24070a67-1642-40c1-aee7-94585f9595ab, "schema_id": 'tfj1'}
source: traffic-acquisition-system
local_trace_id: 14c4facc-a6c9-4325-a433-05b874ba4025
created_at: 2022-11-25 11:06:43.782+0000
global_state:
  job_pool_specific_description_based_id: 60468a5e37aa2328eef5d1763c13bd88eb87cae005724cb80ed6be3adbcf81a7
  outgoing_cpc: 0.3499999940395355
  feed_cpc: 0.25 # incoming_cpc has been renamed to feed_cpc
  business_strategy: instant_monetization
  expiry_check: 0
  silent_change: 0
  job_unique_id: 10e2ff380137406cb2a6af867c180acf
  publisher_id (nullable): instant_monetization
  campaign_id (nullable): instant_monetization
local_state: NULL
parent_global_trace_id: {"uuid": 4aaa11e2-682e-41f2-be96-f3b95a7d96c4, "schema_id": 'tff1'}
```

## II. Job seeker clicks

```plantuml
@startuml
autonumber

actor "Job Seeker" as js
participant "Click Overmind" as co
participant "Tracing System" as ts
participant "Central Job Advert Store" as cjas
queue incoming_clicks as incq
queue outgoing_clicks as ocq

js -> co ++: click
co -> co: Pick/generate user ID
co -> ts ++: Join job level business process (tfj1)
ts --> co --: tfj1 state data
co -> ts ++: Create new click level business process (jc1)
ts --> co --: new click level global trace id ("click_id")
co -> cjas ++: Get job by job_pool_specific_description_based_id
cjas --> co --: Current Job Advert (job_unique_id may be different)
co -> co: Merge data - apply hierarchy of sources
co -> co: Detect anomalies
co -> co: Apply URL tags (e.g. conversion tracker, etc.)
co --> js: Redirect
opt success creating click level business process
    co -> incq: incoming click msg
end opt
opt redirected to partner
    co -> ocq: outgoing click msg
end opt
@enduml
```

### Request data

* URL path param
  * job level global trace id (e.g. `24070a67164240c1aee794585f9595abtfj1`)
* Supported URL query params (others are ignored) - **these override state data from the Tracing System**
  * campaign_id (passed as `cpc_cid`)
  * publisher_id (passed as `cpc_pid`)
  * incoming_cpc
  * utm_campaign (defaults to `utm_campaign` if not provided)
  * utm_source
  * utm_medium
  * utm_content
  * utm_term
* Read cookies
  * lensa_guest_id
  * (planned) lensa_access_token (with Job Seeker ID)
* (+ other client info, e.g. IP address, user agent)

#### Example URL

```
https://lensa.com/cgw/inc/24070a67164240c1aee794585f9595abtfj1
    ?utm_source=Lensa_ARB_test
    &utm_medium=cpc
    &utm_campaign=ARB_internal_test
    &incoming_cpc=0
    &cpc_cid=001
    &cpc_pid=007
    &cpc-jid=10e2ff380137406cb2a6af867c180acf
    &source=pandologic_feed
    &utm_content=24070a67164240c1aee794585f9595abtfj1
```

### Job Advert (simplified) - received from Central Job Advert Store

```yml
unique_id: 13dcb1ee6200498380454e4163796ffd
remote_id: 551485348
job_pool_slug_id: pandologic_feed
price_cpc: ??? # Not used
is_expired: false
partner_url: https://practicelinkltd.thejobnetwork.com/Job/551485348?etd=LAFFER2UEHPRBIJH7HQABA6754LROKXW5A2I3IMYKZ2YNGXGMJNNGZAHEFPN3W5Q6BLLLQUWJN7KVTBSW72ARR57LRLRWNADMN3N4ZK7MPKQYEEWXBNKLJDGJURCMNB43ABSZAXDVMZRU3JHNO3DXMMVOA%3d%3d%3d%3d%3d%3d
classified_title:
  cleaned_title: Physician - Pain Medicine
location:
  city: Seymour
  state: IN
  display_name: Seymour, IN
company:
  normalized_name: Schneck Medical Center
  hash_id: 6a61be4cf3d4fcce6b7c4e55b20274449abe2978
```

### Produced incoming_clicks message

```json
// Headers
{
  "event_type": "click_entered",
  "is_anomaly": "true"
}

// Content
{
  "anomalies": [
    "duplicate"
  ],
  "utm_tags": {
    "utm_source": "Lensa_ARB_test",
    "utm_medium": "cpc",
    "utm_campaign": "ARB_internal_test",
    "utm_term": null,
    "utm_content": "24070a67164240c1aee794585f9595abtfj1"
  },
  "incoming_cpc": 0, // Overridden in URL query param
  "publisher_id": "007", // Overridden in URL query param
  "campaign_id": "001", // Overridden in URL query param
  "global_trace_id": "328d175ed2294aa083f825c6a0104fa3jc1",
  "job_level_business_process_name": "traffic_flow_job",
  "job_pool_specific_description_based_id": "60468a5e37aa2328eef5d1763c13bd88eb87cae005724cb80ed6be3adbcf81a7",
  // current job_unique_id from Central Job Advert Store
  "job_unique_id": "13dcb1ee6200498380454e4163796ffd",
  "cleaned_title": "Physician - Pain Medicine",
  "job_pool_slug_id": "pandologic_feed",
  "job_remote_id": "551485348",
  "city": "Seymour",
  "state": "IN",
  "company_normalized_name": "Schneck Medical Center",
  "company_hash": "6a61be4cf3d4fcce6b7c4e55b20274449abe2978",
  "job_url": "https://practicelinkltd.thejobnetwork.com/Job/551485348?etd=LAFFER2UEHPRBIJH7HQABA6754LROKXW5A2I3IMYKZ2YNGXGMJNNGZAHEFPN3W5Q6BLLLQUWJN7KVTBSW72ARR57LRLRWNADMN3N4ZK7MPKQYEEWXBNKLJDGJURCMNB43ABSZAXDVMZRU3JHNO3DXMMVOA%3d%3d%3d%3d%3d%3d",
  "is_expired": false,
  "ip_address": "54.208.108.236",
  "user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Safari/605.1.15",
  "guest_id": "4d45cacd-69d0-4e70-9e7a-1f3b0f0cd344",
  "outgoing_cpc": 0.3499999940395355,
  "clicked_at": 1669630502,
  "job_level_global_trace_id": "24070a67164240c1aee794585f9595abtfj1"
}
```

### Produced outgoing_clicks message

```json
// Headers
{
  "event_type": "click_redirected",
  "is_anomaly": "true"
}

// Content
{
  "anomalies": [
    "duplicate"
  ],
  "utm_tags": {
    "utm_source": "Lensa_ARB_test",
    "utm_medium": "cpc",
    "utm_campaign": "ARB_internal_test",
    "utm_term": null,
    "utm_content": "24070a67164240c1aee794585f9595abtfj1"
  },
  "outgoing_cpc": 0.3499999940395355,
  "global_trace_id": "328d175ed2294aa083f825c6a0104fa3jc1",
  "job_level_business_process_name": "traffic_flow_job",
  "job_pool_specific_description_based_id": "60468a5e37aa2328eef5d1763c13bd88eb87cae005724cb80ed6be3adbcf81a7",
  "job_unique_id": "13dcb1ee6200498380454e4163796ffd",
  "cleaned_title": "Physician - Pain Medicine",
  "job_pool_slug_id": "pandologic_feed",
  "job_remote_id": "551485348",
  "city": "Seymour",
  "state": "IN",
  "company_normalized_name": "Schneck Medical Center",
  "company_hash": "6a61be4cf3d4fcce6b7c4e55b20274449abe2978",
  "job_url": "https://practicelinkltd.thejobnetwork.com/Job/551485348?etd=LAFFER2UEHPRBIJH7HQABA6754LROKXW5A2I3IMYKZ2YNGXGMJNNGZAHEFPN3W5Q6BLLLQUWJN7KVTBSW72ARR57LRLRWNADMN3N4ZK7MPKQYEEWXBNKLJDGJURCMNB43ABSZAXDVMZRU3JHNO3DXMMVOA%3d%3d%3d%3d%3d%3d",
  "ip_address": "54.208.108.236",
  "user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Safari/605.1.15",
  "guest_id": "4d45cacd-69d0-4e70-9e7a-1f3b0f0cd344",
  "job_seeker_id": null,
  "clicked_at": 1669630502,
  "is_incoming": true
}
```

## III/A. Consume incoming_clicks for legacy system support

```plantuml
@startuml
autonumber

actor "AWS Lambda" as lambda
queue incoming_clicks as incq
participant "Legacy Click Consumers" as lcc
database incoming_click
database jobalerts

lambda -> incq: poll
lambda -> lcc ++: incoming_clicks msg
lcc -> lcc: Transform for incoming_click db
note left
    See below
end note
lcc -> incoming_click: Insert incoming_click
opt expired or spike
    lcc -> lcc: Transform for jobalerts db
    note left
        We assume there is no outgoing_clicks msg, so this consumer
        must insert a jobstop-type record into jobalerts.click_tracking.
        Transforms and inserts are the same as what is described
        for the outgoing_clicks consumer further below.
    endnote
    lcc -> jobalerts: Insert click_tracking
    lcc -> jobalerts: Insert click_tracking_incoming_click
    lcc -> jobalerts: Insert click_tracking_id_trace_id_mapping
end opt
@enduml
```

### Inserted incoming_click.incoming_click

Fields with non-obvious content:

* unique_id = job_unique_id
* click_source = `jobsense` or `jobsense.widget` if we recognize the business process name, else job_level_business_process_name
* global_tracing_id = job_level_global_trace_id
* incoming_click_id = global_trace_id (click-level)
* recommendation_time = NULL (constant)

```yml
id: 244110
unique_id: 13dcb1ee6200498380454e4163796ffd
job_pool_specific_description_based_id: 60468a5e37aa2328eef5d1763c13bd88eb87cae005724cb80ed6be3adbcf81a7
global_tracing_id: 24070a67164240c1aee794585f9595abtfj1
publisher_id: 007
campaign_id: 001
click_source: traffic_flow_job
incoming_price: 0.0
outgoing_price: 0.35
is_expired: 0
cleaned_title: Physician - Pain Medicine
company_normalized_name: Schneck Medical Center
company_hash: 6a61be4cf3d4fcce6b7c4e55b20274449abe2978
city: Seymour
state: IN
ip_address: 54.208.108.236
user_agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Safari/605.1.15
utm_tags: '{"utm_campaign": "ARB_internal_test", "utm_source": "Lensa_ARB_test", "utm_medium": "cpc", "utm_content":"24070a67164240c1aee794585f9595abtfj1", "utm_term": null}'
recommendation_time: null
incoming_click_time: 2022-11-28 10:15:02.0
incoming_click_id: 328d175ed2294aa083f825c6a0104fa3jc1
utm_source: Lensa_ARB_test
utm_medium: cpc
utm_content: 24070a67164240c1aee794585f9595abtfj1
utm_campaign: ARB_internal_test
utm_term: null
cid: 4d45cacd-69d0-4e70-9e7a-1f3b0f0cd344
```

## III/B. Consume outgoing_clicks for legacy system support

```plantuml
@startuml
autonumber

actor "AWS Lambda" as lambda
queue outgoing_clicks as ocq
participant "Legacy Click Consumers" as lcc
database jobalerts

lambda -> ocq: poll
lambda -> lcc ++: outgoing_clicks msg
lcc -> lcc: Transform for jobalerts db
note left
    See below
endnote
lcc -> jobalerts: Insert click_tracking
lcc -> jobalerts: Insert click_tracking_incoming_click
lcc -> jobalerts: Insert click_tracking_id_trace_id_mapping
@enduml
```

#### outgoing_clicks msg transformations for inserting to jobalerts.click_tracking

* cid = job_seeker_id if present, else guest_id
* utm_content =
  * `spike_stopped` if spike anomaly
  * `expired` if job expired
  * else the actual utm_content
* type =
  * `job_sense` if job_level_business_process_name is a known JobSense business process
  * `jobstop` if spike anomaly or job expired
  * else `onsite`
* Constant NULLs:
  * user_src
  * client_id
  * jobtrack_id
  * title_similarity
  * geo_similarity

### Inserted jobalerts.click_tracking_id_trace_id_mapping

```yml
id: 2
click_tracking_id: 312400275
global_trace_id: 328d175ed2294aa083f825c6a0104fa3jc1
```

### Inserted jobalerts.click_tracking_incoming_click

```yml
click_tracking_id: 312400275
incoming_click_id: 328d175ed2294aa083f825c6a0104fa3jc1
```

### Inserted jobalerts.click_tracking

```yml
id: 312400275
type: onsite
src: pandologic_feed
cid: 4d45cacd-69d0-4e70-9e7a-1f3b0f0cd344
user_src: null
client_id: null
ip: 54.208.108.236
useragent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Safari/605.1.15
url: https://practicelinkltd.thejobnetwork.com/Job/551485348?etd=LAFFER2UEHPRBIJH7HQABA6754LROKXW5A2I3IMYKZ2YNGXGMJNNGZAHEFPN3W5Q6BLLLQUWJN7KVTBSW72ARR57LRLRWNADMN3N4ZK7MPKQYEEWXBNKLJDGJURCMNB43ABSZAXDVMZRU3JHNO3DXMMVOA%3d%3d%3d%3d%3d%3d
jobtrack_id: null
position: Physician - Pain Medicine
location_name: Seymour, IN
jobadvert_id: pandologic_feed_551485348
created_at: 2022-11-28 10:15:02.0
utm_source: Lensa_ARB_test
utm_medium: cpc
utm_campaign: ARB_internal_test
cpc: 0.35
title_similarity: null
geo_similarity: null
utm_content: 24070a67164240c1aee794585f9595abtfj1
utm_term: null
```

## IV. Track conversion

```plantuml
@startuml
autonumber

actor Partner
participant "Conversion Tracker" as ct
database jobalerts
queue converted_clicks as ccq

Partner -> ct ++: HTTP: click level global trace id in request or cookie
ct -> jobalerts: Read click_tracking_id_trace_id_mapping
ct -> jobalerts: Insert partner_application_tracking
ct -> ccq --: Send msg
@enduml
```

### Inserted jobalerts.partner_application_tracking

```yml
id: 19872194
click_tracking_id: 312400275
created_at: 2022-11-28 14:57:10
```

### Produced converted_clicks message

```json
// Headers
{}

// Content
{
  "job_click_global_trace_id": "d3569ebd080a47f6b77d3fc0d4be1d73jc1",
  "created_at": 1669647430
}
```
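
### Splitting a global trace id (illustrative sketch)

The compact trace ids used in URLs and messages (e.g. `job_click_global_trace_id` above) appear to be the 32 hex digits of the UUID followed directly by the schema id (`tfj1`, `jc1`), judging from the Tracing System record in section I and the URL path in section II. The Python sketch below splits such an id under that assumption. `TraceId` and `split_trace_id` are hypothetical names used only for illustration; they are not taken from Click Overmind, the Conversion Tracker, or the Tracing System.

```python
import uuid
from typing import NamedTuple


class TraceId(NamedTuple):
    """Hypothetical container for the two parts of a global trace id."""
    value: uuid.UUID   # the UUID part, e.g. 24070a67-1642-40c1-aee7-94585f9595ab
    schema_id: str     # the schema suffix, e.g. "tfj1" (job level) or "jc1" (click level)


def split_trace_id(compact: str) -> TraceId:
    """Split '<32 hex digits><schema_id>' into its parts.

    Assumes the layout inferred from the examples in this document;
    raises ValueError when the input does not fit that layout.
    """
    hex_part, schema_id = compact[:32], compact[32:]
    if len(hex_part) != 32 or not schema_id:
        raise ValueError(f"not a compact global trace id: {compact!r}")
    # uuid.UUID raises ValueError itself if hex_part is not valid hex.
    return TraceId(uuid.UUID(hex=hex_part), schema_id)


if __name__ == "__main__":
    job_level = split_trace_id("24070a67164240c1aee794585f9595abtfj1")
    print(job_level.value, job_level.schema_id)
```

A consumer could use the schema suffix to check that an incoming id is click level (`jc1`) before looking it up in `click_tracking_id_trace_id_mapping`; whether the real Conversion Tracker performs such a check is not stated in this document.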