# Dave Enrollment Job (Reference for Cred.ai Spark Job)

## job_request_dave_user_enrollment_filetransfer_import
###### tags: `pentaho type job`

### Find Dave Card Enrollment Files in file_transfer
###### tags: `pentaho type step`

```sql
SELECT file_transfer_id, full_file_name, created_timestamp
FROM filetransfer.file_transfer_info
WHERE client_name = 'dave'
  AND transfer_status = 'DONE'
  AND short_file_name like 'Dave.com_Enrollment_%'
  AND processed_timestamp is null
ORDER BY created_timestamp asc
```

![](https://i.imgur.com/rjL3251.png)

## dave_user_filetransfer_work
###### tags: `pentaho type job`

### Cardholder File Exist
###### tags: `pentaho type step`
Checks whether a file exists at `${full_file_name}`.

#### Save file to S3 (If Cardholder file Exists)
###### tags: `pentaho type job`
Copies `${full_file_name}` to `${S3_WAREHOUSE_PREFIX}/raw/enrollment/dave/` using a generic Pentaho job.

#### Send email if file not found
###### tags: `pentaho type job`

### Convert results to json args
###### tags: `pentaho type transformation`
This transformation doesn't really serve a purpose. It gets the variable `cardholder_file_path`, converts it to JSON, reads the first element of the JSON object it just created, and finally sets that value as the variable `json_argument`.

#### Get Variables
###### tags: `pentaho type step`
![](https://i.imgur.com/EWOoWK0.png)

#### Json Output
###### tags: `pentaho type step`
![](https://i.imgur.com/mes5MC0.png)
![](https://i.imgur.com/eRgrbS1.png)

#### Json Input
###### tags: `pentaho type step`
![](https://i.imgur.com/6LFve8i.png)

#### Set Variables
###### tags: `pentaho type step`
![](https://i.imgur.com/LFcy6pX.png)

### Job request insert
###### tags: `pentaho type job`
Creates an entry in the `job_requests` table for the next job to run, `/dave/enrollment/user/manager/job_request_dave_cardholder_enrollment.kjb`, passing in the file path produced by the JSON transformation.

#### job request insert helper
###### tags: `pentaho type transformation`

#### variable setter
###### tags: `pentaho type transformation`

### Update job request create timestamp
###### tags: `pentaho type transformation`
Why don't we let the database handle the timestamp? Who knows?

#### Get Variables
###### tags: `pentaho type step`
Gets `created_timestamp` into `${created_timestamp}` and `job_request_id` into `${job_request_id}`.
![](https://i.imgur.com/Sa7BXrC.png)

#### update job request create timestamp
###### tags: `pentaho type step`
Updates the `job_request` entry to `${created_timestamp}`.
![](https://i.imgur.com/fu5DKyP.png)

### Update file transfer info processed timestamp
###### tags: `pentaho type transformation`
Again, just let the database handle this.

#### Get Variables
###### tags: `pentaho type step`
Gets `file_transfer_id` as `${file_transfer_id}`.
![](https://i.imgur.com/FbCRF0t.png)

#### Get system info
###### tags: `pentaho type step`
Creates `current_timestamp` as `system date (variable)`.
![](https://i.imgur.com/I28wxZW.png)

#### update job request create timestamp
###### tags: `pentaho type step`
Sets the `processed_timestamp` on `file_transfer_info` to `${current_timestamp}`. The connection is marked read only, but it's doing updates.
![](https://i.imgur.com/P4NrXyH.png)
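Echoing the "just let the database handle this" note above: a minimal sketch (plain Python with psycopg2, not the actual job) of stamping `processed_timestamp` server-side with `now()` instead of passing a `${current_timestamp}` variable around. The connection details and the `mark_processed` helper name are hypothetical; only the table and column names come from the steps above.

```python
import psycopg2


def mark_processed(conn, file_transfer_id):
    """Stamp processed_timestamp on filetransfer.file_transfer_info.

    Lets the database supply the timestamp via now() instead of
    threading a current_timestamp variable through the job.
    """
    with conn.cursor() as cur:
        cur.execute(
            """
            UPDATE filetransfer.file_transfer_info
               SET processed_timestamp = now()
             WHERE file_transfer_id = %s
               AND processed_timestamp IS NULL
            """,
            (file_transfer_id,),
        )
    conn.commit()


if __name__ == "__main__":
    # Connection settings are placeholders.
    conn = psycopg2.connect(host="localhost", dbname="filetransfer", user="etl")
    mark_processed(conn, 12345)
```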
# Actual enrollment transformations

## job_request_dave_cardholder_enrollment
###### tags: `pentaho type job`

### Adding job-name as variable
###### tags: `pentaho type step`
Sets variable `job_name` as `/dave/enrollment/user/manager/job_request_dave_cardholder_enrollment.kjb`.

### job_request_selector
###### tags: `pentaho type job`

#### find_job_request_by_name
###### tags: `pentaho type transform`

##### Get Variables
Gets vars `job_name` as `${job_name}` and `order_by_created_timestamp` as `${order_by_created_timestamp}`.

##### Job Name Lookup
Checks that the `job_name` exists in `filetransfer.job`, filtering on `name = ${job_name}`.

##### Pending status
Creates a constant `pending_status` with the value `PENDING`.

##### Validate order by
If results are returned, it orders by `order_by_created_timestamp` ascending.

##### select job_request
Returns one row, so it only processes one job_request per iteration.

```sql
select j1.id
from filetransfer.job_request as j1
left outer join filetransfer.job_request as j2
  on j1.blocked_by_job_request_id = j2.id
where j1.job_name = '${job_name}'
  and j1.status = 'PENDING'
  and (j2.status is null or j2.status = 'DONE')
order by j1.created_timestamp ${order_by_created_timestamp}
limit 1
```

##### convert_job_request_id_from_int_to_string
Creates a new field on the dataset, `selected_job_request_id`, with this Java expression: `id == null ? "" : id+""`.

#### Check if job is requested
Runs this JavaScript:

```javascript
!parent_job.getVariable("selected_job_request_id").equals("")
```

##### Select job request id
Gets the aforementioned `selected_job_request_id`.

##### Set Variables
Sets `selected_job_request_id` as a variable named `selected_job_request_id`.

## create_full_enrollment_standard_files
###### tags: `pentaho type job`

### write realm to log
###### tags: `pentaho type step`

### create_full_enrollment_files
###### tags: `pentaho type transformation`

#### strip postal code to 5
Takes the first 5 digits of the `postal_code`.
![](https://i.imgur.com/TSCQjCd.png)

#### add file name constants
Adds the file prefixes and suffixes used later to build the final file names.
![](https://i.imgur.com/guQMd4h.png)

#### get variables
Passes the file paths into variables used to send a successful file, and also sets the file path for an error file. Grabs the row count of the file about to be processed at the start.
![](https://i.imgur.com/zyI6kiM.png)

#### format system date
Converts the system date to `yyyyMMddHHmmss` and names it `system_date_formatted`.

#### add enrollment file column constants
Some have default values and some are set as empty.
![](https://i.imgur.com/nPo3kj2.png)

#### Create file names
Using the constants from the earlier "add file name constants" step, this step builds the full paths and file names with concatenations.
![](https://i.imgur.com/NygjMkS.png)
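Since this page is a reference for the Cred.ai Spark job, here is a minimal PySpark sketch of the column steps above: stripping the postal code to five characters, formatting the system date as `yyyyMMddHHmmss`, and concatenating constants into the output file names. The input path and the prefix/suffix literals are assumptions (the real values live in the screenshotted steps); only `postal_code` and the date format come from the text above.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dave_enrollment_sketch").getOrCreate()

# Hypothetical input; the real job reads the cardholder file copied to
# ${S3_WAREHOUSE_PREFIX}/raw/enrollment/dave/ earlier in the flow.
df = spark.read.option("header", True).csv("s3://warehouse-prefix/raw/enrollment/dave/")

# strip postal code to 5: keep only the first five characters.
df = df.withColumn("postal_code", F.substring("postal_code", 1, 5))

# format system date: one yyyyMMddHHmmss value for the whole run.
system_date_formatted = (
    spark.range(1)
    .select(F.date_format(F.current_timestamp(), "yyyyMMddHHmmss").alias("ts"))
    .first()["ts"]
)

# Create file names: prefix + formatted date + suffix (literals are placeholders
# standing in for the "add file name constants" step).
full_enrollment_file = f"full_enrollment_{system_date_formatted}.csv"
std_enrollment_file = f"std_enrollment_{system_date_formatted}.csv"
print(full_enrollment_file, std_enrollment_file)
```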
#### The job now branches into two paths: the full enrollment file and the std file

#### Full enrollment file path

#### Std file that sets active or inactive
There are two filter steps. The first filter step looks for status = ACTIVE.
![](https://i.imgur.com/fwLHuPG.png)

#### If that is true, it sets the card_status to ACTIVE
![](https://i.imgur.com/YDUUDwF.png)

#### If status = ACTIVE is not true, there is a failure path which checks if status = INACTIVE
![](https://i.imgur.com/FLWkUYR.png)

#### If it is INACTIVE, it sets the card_status to INACTIVE
![](https://i.imgur.com/080W53v.png)

#### Combines rows with status = ACTIVE or INACTIVE into the std enrollment file
![](https://i.imgur.com/hVud2zP.png)

If the status is neither ACTIVE nor INACTIVE, a third failure path sets the reason to "The Status 'ENROLL_STATUS' is not known" and routes the row to a skipped file.

#### label as reason = The Status 'ENROLL_STATUS' is not known
![](https://i.imgur.com/AwURqai.png)

#### replaces the ENROLL_STATUS placeholder with the status value from the source file
![](https://i.imgur.com/hyo7s7a.png)

#### writes to a skip file
![](https://i.imgur.com/lmLp8gU.png)

#### waits for all skips to be added to the skip file
![](https://i.imgur.com/fnpoASi.png)

A consolidated sketch of this ACTIVE/INACTIVE/skip branching appears at the end of this section.

### email_error_file
###### tags: `pentaho type transformation`
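As a reference for the Spark rewrite, here is a minimal PySpark sketch of the std-file branching walked through above (it belongs to `create_full_enrollment_files`, not to `email_error_file`). The input read, output paths, and the assumption that the source column is named `status` are placeholders; only the ACTIVE/INACTIVE values and the "not known" reason text come from the job.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dave_std_enrollment_sketch").getOrCreate()

# Hypothetical input; in the real transformation these rows come from the
# full enrollment stream built earlier in create_full_enrollment_files.
enrollment = spark.read.option("header", True).csv("s3://warehouse-prefix/raw/enrollment/dave/")

is_known = F.col("status").isin("ACTIVE", "INACTIVE")

# ACTIVE / INACTIVE rows go to the std enrollment file with card_status set
# to the same value as status.
known = enrollment.filter(is_known).withColumn("card_status", F.col("status"))

# Everything else (including null status) gets the "not known" reason and is
# routed to the skip file, mirroring the third failure path above.
skipped = enrollment.filter(~F.coalesce(is_known, F.lit(False))).withColumn(
    "reason",
    F.concat(
        F.lit("The Status '"),
        F.coalesce(F.col("status"), F.lit("")),
        F.lit("' is not known"),
    ),
)

# Output locations are placeholders for the paths built in "Create file names".
known.write.mode("overwrite").option("header", True).csv("std_enrollment_file")
skipped.write.mode("overwrite").option("header", True).csv("skip_file")
```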