# Lucidworks Proof of Concept

## POC Configuration

| Configuration | Placeholder | Value |
| -------- | -------- | ------- |
| Customer Code | `customerCode` | lucidworks |
| Mesh Key | `meshKey` | dev |
| Conscia Endpoint | `baseURL` | https://io-staging.conscia.ai |
| API Key | `apiKey` | shared privately |
| Conscia Admin | N/A | https://admin-staging.conscia.ai/applications |

## Platform Components

![](https://hackmd.io/_uploads/SkHa07WQY.png)

## Data Set

The data set is from a grocery store with multiple store locations. The entities and the corresponding files are described in the table below:

| Entity | File(s) | Data Record Type |
| --- | --- | --- |
| Customer | customer.csv.gz | Master |
| Product | product.csv.gz | Master |
| Promo | promo.csv.gz | Master |
| Store | store.csv.gz | Master |
| Transaction Header | transaction_header_YYYYMM.csv.gz | Transactional |
| Transaction Detail | transaction_detail_YYYYMM.csv.gz | Transactional |

The entity relationships and CSV columns are illustrated in the following ER diagram:

```mermaid
erDiagram
    CUSTOMER ||--o{ TRANSACTION_HEADER : generates
    CUSTOMER ||--o{ TRANSACTION_DETAIL : generates
    TRANSACTION_HEADER ||--o{ TRANSACTION_DETAIL : contains
    TRANSACTION_DETAIL ||--o| PROMO : uses
    TRANSACTION_DETAIL ||--|| PRODUCT : has
    TRANSACTION_HEADER ||--|| STORE : "occurred at"
    TRANSACTION_DETAIL ||--|| STORE : "occurred at"
    CUSTOMER {
        string UNIQUE_CUSTOMER_ID
        string TYR_KEY
        string TYR_ACCOUNT_ID
        string CARDNUMBER
        string STATUSNUMBER
        date FIRST_TRANS_DATE
        date LAST_TRANS_DATE
        string SEGMENT_NUMBER
        string SEGMENT_NAME
        string PRIMARY_SITE
        string GG_CUSTOMER_ID
        boolean GG_CUSTOMER_FLAG
        integer POINTS_BALANCE
        decimal CASH_BALANCE
        boolean EMAIL_FLAG
    }
    PRODUCT {
        string PRODUCT_KEY
        string SKU_NO
        string PRODUCT_DESCRIPTION
        string BRAND
        string WeighedItemFlag
        string AttOrganic
        string AttHealth
        string Att_Gluten
        string Att_Natural
        string DiscontinuedFlag
        string PRODUCT_LEVEL_1_DEPTGRP_ID
        string PRODUCT_LEVEL_1_DEPTGRP_DESC
        string PRODUCT_LEVEL_2_DEPTID
        string PRODUCT_LEVEL_2_DEPT_DESC
        string PRODUCT_LEVEL_3_CLASS_ID
        string PRODUCT_LEVEL_3_CLASS_DESC
        string PRODUCT_LEVEL_4_LINEID
        string PRODUCT_LEVEL_4_LINE_DESC
    }
    PROMO {
        string PROMOID
        string PROMO_DESC
        date PROMO_START_DATE
        date PROMO_END_DATE
        string PROMO_TYPE
    }
    STORE {
        string SITEKEY
        string SITENUMBER
        string SITEFULLNAME
        string SITENUMBERANDNAME
        string CITY
        string REGION
        string PROVINCE
        string POSTALCODE
        string STORE_FORMAT
        string PRICE_ZONE
        string PRICE_ZONE_NAME
        date SITEOPENING
    }
    TRANSACTION_HEADER {
        string TRANSACTION_ID
        timestamp TRANSACTION_DATE
        string STORE_ID
        integer ITEM_COUNT
        integer SALE_QTY
        decimal SALES_AMOUNT
        decimal REGULAR_PRICE
        decimal DISCOUNT
        string TYR_CARDNUMBER
    }
    TRANSACTION_DETAIL {
        string TRANSACTION_ID
        string SITEKEY
        string PRODUCT_KEY
        string TYR_KEY
        string PROMOID
        timestamp TRANSACTION_DATE
        integer UNIT_QTY
        decimal WEIGHT
        decimal COGS_COST
        decimal SALES_AMOUNT_ACTUAL
        decimal SALES_AMOUNT_REGULAR
        boolean NON_REVENUE_FLAG
        string TYR_CARDNUMBER
    }
```

When the above entities are represented as JSON, all field names are lowercased. The Master Data schema is flexible, but for this POC we have chosen lowercased field names, so the fields of the records being passed should match.

Example customer record:

```json
{
  "tyr_key": "10000182",
  "unique_customer_id": "A65F0081-778E-43EE-BD46-9C14ACE34356",
  "tyr_account_id": "45010000182",
  "cardnumber": "45010000182",
  "statusnumber": "A",
  "first_trans_date": 1600232398000,
  "last_trans_date": 1631768398000,
  "segment_number": "3",
  "segment_name": "Opal",
  "primary_site": "0",
  "gg_customer_flag": false,
  "points_balance": 1928,
  "cash_balance": 0,
  "email_flag": false
}
```

When passing values from the CSV as JSON, please ensure the following:

- timestamp and date values should be sent as epoch values in milliseconds.
- boolean values (represented as "N", 0, "Y", 1) should be passed as `true` or `false`.

## Creating, updating and removing Master Data Records in a Conscia Collection

Master Data Records are uploaded into a Data Collection inside the Conscia Graph and are typically enriched with Computed Values that are calculated from the Transactional Data Records. For example, `Customers` are Master Data Records and `Product Purchases` are Transactional Data Records. `Product Purchases` could be used to calculate each Customer's "favourite" product category, which could then be used to populate the Customer's `favouriteCategory` field.

Master Data Records are typically used as Lookups for Transactional Data Records. In the above example, the `Product Purchases` Data Records will have a Customer identifier and would need the `Customers` Data Records as a Lookup to get the Customer attributes.

If Master Data Records do not need to be enriched or managed in any way, then they do not need to be uploaded into a Data Collection. Instead, upload them directly into a Data Table (as you would for Transactional Data Records).

| Entity | `collectionCode` (in Conscia Graph) |
| --- | --- |
| Customer | `customer` |
| Product | `product` |
| Promo | `promo` |
| Store | `store` |

### Asynchronous Webservices

Asynchronous webservices are not rate-limited. This is the recommended way to create, update, and delete Master Data.

Asynchronous endpoints return a `jobID`, which can be used to check the status of the job. They can optionally specify a boolean property `sendCompletionEmail`, which defaults to `false` if not specified. If set to `true`, an email is sent to the user associated with the API Key.

Asynchronous endpoints can also optionally specify a string property `eventSource`, which will be associated with the Job.
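
As an illustration of the conversion rules from the Data Set section (lowercased field names, epoch milliseconds, `true`/`false` booleans), the following Python sketch turns a customer CSV row into a record and wraps it in a request body carrying the optional `sendCompletionEmail` and `eventSource` properties. It is a minimal sketch, not part of the Conscia tooling: the helper names are hypothetical, the field typing is taken from the `CUSTOMER` entity in the ER diagram, and the CSV date format (`YYYY-MM-DD`, read as UTC) is an assumption that should be adjusted to the actual files.

```python
import json
from datetime import datetime, timezone

# Field types follow the CUSTOMER entity in the ER diagram.
DATE_FIELDS = {"first_trans_date", "last_trans_date"}
BOOLEAN_FIELDS = {"gg_customer_flag", "email_flag"}
INTEGER_FIELDS = {"points_balance"}
DECIMAL_FIELDS = {"cash_balance"}


def to_epoch_millis(value, fmt="%Y-%m-%d"):
    """Parse a date string as UTC and return epoch milliseconds.

    Both the input format and the UTC interpretation are assumptions.
    """
    parsed = datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp() * 1000)


def to_boolean(value):
    """Map "Y" or 1 to True; any other value (such as "N" or 0) maps to False."""
    return str(value).strip().upper() in {"Y", "1"}


def csv_row_to_record(row):
    """Lowercase the column names and convert typed values."""
    record = {}
    for column, value in row.items():
        field = column.lower()
        if field in DATE_FIELDS:
            record[field] = to_epoch_millis(value)
        elif field in BOOLEAN_FIELDS:
            record[field] = to_boolean(value)
        elif field in INTEGER_FIELDS:
            record[field] = int(value)
        elif field in DECIMAL_FIELDS:
            record[field] = float(value)
        else:
            record[field] = value
    return record


def build_async_create_body(rows, event_source, if_exists="ignore"):
    """Wrap converted records in a body for the asynchronous create endpoint."""
    return {
        "sendCompletionEmail": False,
        "ifExists": if_exists,
        "eventSource": event_source,
        "dataRecords": [csv_row_to_record(row) for row in rows],
    }


if __name__ == "__main__":
    sample_row = {
        "UNIQUE_CUSTOMER_ID": "A65F0081-778E-43EE-BD46-9C14ACE34356",
        "FIRST_TRANS_DATE": "2020-09-16",
        "GG_CUSTOMER_FLAG": "N",
        "POINTS_BALANCE": "1928",
    }
    print(json.dumps(build_async_create_body([sample_row], "customer-123-pipeline"), indent=2))
```

The resulting dictionary can be serialized with `json.dumps` and sent as the body of the create request described in the next section.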
#### Creating Master Data Records (Asynchronously)

Endpoint:

```
POST {{baseURL}}/vue/_api/v1/collections/{{collectionCode}}/records/_async
```

| Property | Description |
| --- | --- |
| ifExists | `ignore` will skip the creation of the record if it already exists. `conflict` will throw an error if the record already exists. |

Example:

```http
POST {{baseURL}}/vue/_api/v1/collections/customer/records/_async
Content-Type: application/json
Authorization: Bearer {{apiKey}}
X-Customer-Code: {{customerCode}}

{
  "sendCompletionEmail": true,
  "ifExists": "ignore",
  "eventSource": "customer-123-pipeline",
  "dataRecords": [
    {
      "tyr_key": "10000182",
      "unique_customer_id": "A65F0081-778E-43EE-BD46-9C14ACE34356",
      "tyr_account_id": "45010000182",
      "cardnumber": "45010000182",
      "statusnumber": "A",
      "first_trans_date": 1600232398000,
      "last_trans_date": 1631768398000,
      "segment_number": "3",
      "segment_name": "Opal",
      "primary_site": "0",
      "gg_customer_flag": false,
      "points_balance": 1928,
      "cash_balance": 0,
      "email_flag": false
    },
    {
      "tyr_key": "10000189",
      "unique_customer_id": "B12F0081-778E-43EE-BD46-9C14ACE34675",
      "tyr_account_id": "45010000189",
      "cardnumber": "45010000189",
      "statusnumber": "A",
      "first_trans_date": 1600232398000,
      "last_trans_date": 1631768398000,
      "segment_number": "3",
      "segment_name": "Opal",
      "primary_site": "0",
      "gg_customer_flag": false,
      "points_balance": 1928,
      "cash_balance": 0,
      "email_flag": false
    }
  ]
}
```

#### Updating Master Data Records (Asynchronously)

Endpoint:

```
PATCH {{baseURL}}/vue/_api/v1/collections/{{collectionCode}}/records/_async
```

| Property | Description |
| --- | --- |
| ifExists | `replace` will replace the full record. `merge` will overwrite the fields specified in the records. |
| ifNotExists | `create` - if the specified identifier does not exist, create the record. `fail` - if the specified identifier does not exist, fail. `ignore` - if the specified identifier does not exist, skip it. |

Example: Update 2 customers' `points_balance` and `cash_balance` values.
The `unique_customer_id` is the record identifier, which must be provided to update existing records.

```http
PATCH {{baseURL}}/vue/_api/v1/collections/customer/records/_async
Content-Type: application/json
Authorization: Bearer {{apiKey}}
X-Customer-Code: {{customerCode}}

{
  "sendCompletionEmail": true,
  "ifNotExists": "ignore",
  "eventSource": "customer-123-pipeline",
  "dataRecords": [
    {
      "unique_customer_id": "A65F0081-778E-43EE-BD46-9C14ACE34356",
      "points_balance": 2501,
      "cash_balance": 12.98
    },
    {
      "unique_customer_id": "B12F0081-778E-43EE-BD46-9C14ACE34675",
      "points_balance": 340,
      "cash_balance": 5.56
    }
  ]
}
```

#### Removing Master Data Records (Asynchronously)

Endpoint:

```
DELETE {{baseURL}}/vue/_api/v1/collections/{{collectionCode}}/records/_async
```

| Property | Description |
| --- | --- |
| ifNotExists | `ignore` - if the specified identifier does not exist, ignore. `fail` - if the specified identifier does not exist, fail. |

Example: Remove two customers by their unique identifiers.

```http
DELETE {{baseURL}}/vue/_api/v1/collections/customer/records/_async
Content-Type: application/json
Authorization: Bearer {{apiKey}}
X-Customer-Code: {{customerCode}}

{
  "sendCompletionEmail": true,
  "ifNotExists": "fail",
  "eventSource": "customer-123-pipeline",
  "dataRecordIDs": [
    "A65F0081-778E-43EE-BD46-9C14ACE34356",
    "B12F0081-778E-43EE-BD46-9C14ACE34675"
  ]
}
```

### Synchronous Webservices

Synchronous webservices are rate-limited to 10 requests per second.

#### Creating Master Data Records (Synchronously)

Endpoint:

```
POST {{baseURL}}/vue/_api/v1/collections/{{collectionCode}}/records
```

| Property | Description |
| --- | --- |
| ifExists | `ignore` will skip the creation of the record if it already exists. `conflict` will throw an error if the record already exists. |

Example:

```http
POST {{baseURL}}/vue/_api/v1/collections/customer/records
Content-Type: application/json
Authorization: Bearer {{apiKey}}
X-Customer-Code: {{customerCode}}

{
  "sendCompletionEmail": true,
  "ifExists": "ignore",
  "eventSource": "test-123-pipeline",
  "dataRecords": [
    {
      "tyr_key": "10000182",
      "unique_customer_id": "A65F0081-778E-43EE-BD46-9C14ACE34356",
      "tyr_account_id": "45010000182",
      "cardnumber": "45010000182",
      "statusnumber": "A",
      "first_trans_date": 1600232398000,
      "last_trans_date": 1631768398000,
      "segment_number": "3",
      "segment_name": "Opal",
      "primary_site": "0",
      "gg_customer_flag": false,
      "points_balance": 1928,
      "cash_balance": 0,
      "email_flag": false
    },
    {
      "tyr_key": "10000189",
      "unique_customer_id": "B12F0081-778E-43EE-BD46-9C14ACE34675",
      "tyr_account_id": "45010000189",
      "cardnumber": "45010000189",
      "statusnumber": "A",
      "first_trans_date": 1600232398000,
      "last_trans_date": 1631768398000,
      "segment_number": "3",
      "segment_name": "Opal",
      "primary_site": "0",
      "gg_customer_flag": false,
      "points_balance": 1928,
      "cash_balance": 0,
      "email_flag": false
    }
  ]
}
```

#### Updating Master Data Records (Synchronously)

Endpoint:

```
PATCH {{baseURL}}/vue/_api/v1/collections/{{collectionCode}}/records
```

| Property | Description |
| --- | --- |
| ifExists | `replace` will replace the full record. `merge` will overwrite the fields specified in the records. |
| ifNotExists | `create` - if the specified identifier does not exist, create the record. `fail` - if the specified identifier does not exist, fail. |

Example:

```http
PATCH {{baseURL}}/vue/_api/v1/collections/{{collectionCode}}/records
Content-Type: application/json
Authorization: Bearer {{apiKey}}
X-Customer-Code: {{customerCode}}

{
  "sendCompletionEmail": true,
  "ifNotExists": "ignore",
  "eventSource": "10739827-EsmInventoryQPipeline",
  "dataRecords": [
    {
      "ItemNo": "645779",
      "IsActive": false,
      "RequestDateTime": "2021-07-14T17:11:50.9346277-04:00"
    },
    {
      "ItemNo": "645780",
      "IsActive": false,
      "RequestDateTime": "2021-07-14T17:11:50.9346277-04:00"
    },
    {
      "ItemNo": "100495",
      "IsActive": false,
      "RequestDateTime": "2021-07-14T17:11:50.9346277-04:00"
    },
    {
      "ItemNo": "100497",
      "IsActive": false,
      "RequestDateTime": "2021-07-14T17:11:50.9346277-04:00"
    }
  ]
}
```

#### Removing Master Data Records (Synchronously)

Endpoint:

```
DELETE {{baseURL}}/vue/_api/v1/collections/{{collectionCode}}/records
```

| Property | Description |
| --- | --- |
| ifNotExists | `ignore` - if the specified identifier does not exist, ignore. `fail` - if the specified identifier does not exist, fail. |

Example:

```http
DELETE {{baseURL}}/vue/_api/v1/collections/customer/records
Content-Type: application/json
Authorization: Bearer {{apiKey}}
X-Customer-Code: {{customerCode}}

{
  "sendCompletionEmail": true,
  "ifNotExists": "fail",
  "dataRecordIDs": [
    "45010000182",
    "45010000189"
  ]
}
```

#### Removing all Master Data Records (Synchronously)

The following endpoint removes all records from a collection.

Endpoint:

```
POST {{baseURL}}/vue/_api/v1/collections/{{collectionCode}}/_truncate
```

Example:

```http
POST {{baseURL}}/vue/_api/v1/collections/customer/_truncate
Content-Type: application/json
Authorization: Bearer {{apiKey}}
X-Customer-Code: {{customerCode}}
```

<!--
### File Uploads

- Upload file to Incoming Data Bucket
- Process incoming data
-->

## Creating and removing Transactional Data Records

Transactional Data Records always reside in a Data Table inside Conscia Intel.
Delimited or [NDJSON](http://ndjson.org/) files can be used. For this POC, we have set up the Data Tables for comma-delimited files. Files may or may not be gzipped.

Note: Transactional Data Records cannot be updated. Full time-based partitions of data may be removed.

The following Data Buckets and Data Tables will be set up for this POC.

| Incoming transactions | `dataBucketCode` | `dataTableCode` |
| --- | --- | --- |
| transaction_header | transaction | txn_header |
| transaction_detail | transaction | txn_detail |

### Asynchronous Webservices

<!--
#### Creating Transactional Data Records

Endpoint:

```
POST {{baseURL}}/vue/_api/v1/data-buckets/{{dataBucketCode}}/data-tables/{{dataTableCode}}/records/_async
```

Example: Add 2 transaction header data records

```http
POST {{baseURL}}/vue/_api/v1/data-buckets/transaction/data-tables/txn_header/records/_async
Content-Type: application/json
Authorization: Bearer {{apiKey}}
X-Customer-Code: {{customerCode}}

{
  "sendCompletionEmail": true,
  "ifExists": "ignore",
  "eventSource": "10739827-EsmInventoryQPipeline",
  "dataRecords": [
    {
      "transaction_id": "3103901111374",
      "transaction_date": 1547182798000,
      "store_id": "31",
      "item_count": 2,
      "sale_qty": 2,
      "sales_amount": 4.46,
      "regular_price": 7.96,
      "discount": 3.5,
      "tyr_cardnumber": "45010000642"
    },
    {
      "transaction_id": "3103901111374",
      "transaction_date": 1547182798000,
      "store_id": "31",
      "item_count": 2,
      "sale_qty": 2,
      "sales_amount": 4.46,
      "regular_price": 7.96,
      "discount": 3.5,
      "tyr_cardnumber": "45010000642"
    }
  ]
}
```
-->

#### Removing Transactional Data Records

Example: Remove the partition covering July 14, 2021 between 5pm and 6pm EST.

```http
DELETE {{baseURL}}/vue/_api/v1/data-buckets/{{dataBucketCode}}/data-tables/{{dataTableCode}}/records/_async
Content-Type: application/json
Authorization: Bearer {{apiKey}}
X-Customer-Code: {{customerCode}}

{
  "sendCompletionEmail": true,
  "partition": "2021-07-14T17:00:00"
}
```

### File Uploads

#### Uploading transactional data

Endpoint:

```
POST {{baseURL}}/fs/customers/{{customerCode}}/meshKey/{{meshKey}}/buckets/{{dataBucketCode}}/upload?folder={{dataTableCode}}
```

Example:

```bash
curl -F "files[]=@transaction_header_201901.csv.gz" \
  --url "{{baseURL}}/fs/customers/{{customerCode}}/meshKey/{{meshKey}}/buckets/transaction/upload?folder=txn_header" \
  -H "Authorization: Bearer {{apiKey}}"
```

#### Verifying file upload success

The following request lists all the files uploaded to the Intel Data Bucket. It can be used to verify that the Data File was uploaded successfully.

```http
POST {{baseURL}}/graphql
Content-Type: application/json
Authorization: Bearer {{apiKey}}

{
  "query": "query ($input: ListDataFilesInput) { listDataFiles(input: $input) { name lastModified size }}",
  "variables": {
    "input": {
      "customerCode": "{{customerCode}}",
      "meshKey": "{{meshKey}}",
      "dataBucketCode": "{{dataBucketCode}}"
    }
  }
}
```

## Checking job status of an asynchronous webservice

```http
GET {{baseURL}}/vue/_api/v1/jobs/{{jobID}}/status
Content-Type: application/json
Authorization: Bearer {{apiKey}}
X-Customer-Code: {{customerCode}}
```

## Additional Notes

Please note that we are not using the Track API for this POC. The Track API can be used when you are logging one event at a time.

## Querying Collections

Endpoint:

```
POST {{baseURL}}/vue/_api/v1/collections/{{collectionCode}}/records/_query
```

Example: Return at most 10 customer records.

```http
POST {{baseURL}}/vue/_api/v1/collections/customer/records/_query
Content-Type: application/json
Authorization: Bearer {{apiKey}}
X-Customer-Code: {{customerCode}}

{
  "limit": 10
}
```

Example: Return the customer records whose `cardnumber` equals a given value.

```http
POST {{baseURL}}/vue/_api/v1/collections/customer/records/_query
Content-Type: application/json
Authorization: Bearer {{apiKey}}
X-Customer-Code: {{customerCode}}

{
  "filter": {
    "$eq": {
      "field": "cardnumber",
      "value": "45170793389"
    }
  }
}
```
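
The query requests above can also be assembled programmatically. The following Python sketch builds the URL, headers, and JSON body for a `_query` call without sending it. The function name `build_query_request` is hypothetical, and sending `limit` and `filter` together in one body is an assumption, since the examples above only show them separately.

```python
import json


def build_query_request(base_url, customer_code, api_key, collection_code,
                        field=None, value=None, limit=None):
    """Return (url, headers, body) for a collection `_query` call.

    Combining `limit` and `filter` in one body is an assumption.
    """
    url = (
        f"{base_url.rstrip('/')}/vue/_api/v1/collections/"
        f"{collection_code}/records/_query"
    )
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
        "X-Customer-Code": customer_code,
    }
    payload = {}
    if limit is not None:
        payload["limit"] = limit
    if field is not None:
        # Equality filter in the same shape as the `$eq` example above.
        payload["filter"] = {"$eq": {"field": field, "value": value}}
    return url, headers, json.dumps(payload).encode("utf-8")


if __name__ == "__main__":
    url, headers, body = build_query_request(
        "https://io-staging.conscia.ai", "lucidworks", "<apiKey>", "customer",
        field="cardnumber", value="45170793389",
    )
    print(url)
    print(body.decode("utf-8"))
```

The returned values can be passed to any HTTP client, for example `urllib.request.Request(url, data=body, headers=headers, method="POST")` from the Python standard library.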