# NACC/Flywheel Error Handling

## Metadata Location

The errors shall be appended to the files themselves under the custom information. The errors shall be nested under `info.qc.<gear-name>.validation.data`:

```json
"info": {
    "qc": {
        "<gear-name>": {
            "job_info": {"<auto-generated>"},
            "validation": {
                "state": "PASS or FAIL",
                "data": [
                    "<error-object1>",
                    ...
                    "<error-objectN>"
                ]
            }
        }
    }
}
```

## Adding Metadata

The metadata will be added using the Flywheel gear toolkit context's built-in [qc-report generator](https://flywheel-io.gitlab.io/public/gear-toolkit/flywheel_gear_toolkit/context/#adding-a-qc-result-or-gear-info). For example:

`context.metadata.add_qc_result('form.json', 'validation', state='FAIL', data=[<error-objects>])`

## Error Object Format

The error object will have the same core keys across all outputs. These core keys are:

```json
{
    "type": "str - Can be 'error' or 'alert'",
    "code": "str - for example, 'EmptyFile', 'MissingHeader', or 'IncorrectColumnName'",
    "flywheel_path": "str - Flywheel path to the container/file",
    "container_id": "str - ID of the source container/file",
    "message": "str - error message description",
    "timestamp": "str - timestamp of when the error was created"
}
```

Each gear may add additional keys as needed for the error reporting.

**NOTE: Adding more keys will clutter the UI extension view.**

### file-validator

For the file-validator gear, the error objects will be defined as follows:

```json
{
    "type": "str - will always be 'error' in this gear",
    "code": "str - Type of the error (e.g. MaxLength)",
    "location": "<location object>",
    "flywheel_path": "str - Flywheel path to the container/file",
    "container_id": "str - ID of the source container/file",
    "value": "str - current value",
    "expected": "str - expected value",
    "message": "str - error message description",
    "timestamp": "str - timestamp of when the error was created"
}
```

The value for `location` will be formatted as follows:

- For JSON input file: `{ "key_path": "str - the json key that raised the error" }`
- For CSV input file:

  ```
  {
      "line": "int - the row number that raised the error",
      "column_name": "str - the column name that raised the error"
  }
  ```

In addition to the JSON-schema-based validation, the following additional checks will be implemented:

- Empty file (JSON and CSV)
- Missing header (CSV only)
  - CSV file has no header
  - NO first row cells match ANY key in the schema
- Incorrect column name in header (CSV only)
  - CSV header present that is not in the JSON schema

These special checks will have an error object as follows:

```json
{
    "type": "str - will always be 'error' in this gear",
    "code": "str - 'EmptyFile', 'MissingHeader', or 'IncorrectColumnName'",
    "location": "",
    "flywheel_path": "str - Flywheel path to the container/file",
    "container_id": "str - ID of the source container/file",
    "value": "",
    "expected": "",
    "message": "str - error message description",
    "timestamp": "str - timestamp of when the error was created"
}
```

### identifier-lookup (owner: NACC)

The identifier-lookup gear takes a CSV as input, reads two columns (ADCID and PTID), and calls the NACC API to get the corresponding NACC ID. Errors are reported in `info.qc.<gear-name>.validation.data` as metadata, where `data` is an array of error objects.

- The error objects will follow the specification in the file-validator section, above.
- The location object will follow the specification defined in the file-validator section for a CSV file input.

This gear will also make the same special checks for

- Empty file
- Missing header
- Incorrect column name in header

using the same error-object format as specified in the file-validator section for special checks.

### form-qc-checker (owner: NACC)

The error object for the form-qc-checker will follow the specification defined in file-validator, with the addition of three new items:

```json
{
    "type": "str - 'error' or 'alert'",
    "code": "str - Type of the error (e.g. MaxLength)",
    "location": "<location object>",
    "flywheel_path": "str - Flywheel path to the container/file",
    "container_id": "str - ID of the source container/file",
    "value": "str - current value",
    "expected": "str - expected value",
    "message": "str - error message description",
    // New items:
    "ptid": "str - the ptid",
    "visit_number": "str - the visit number",
    "Initial": "str - ?"
}
```

**Note that this is the only gear where the "type" key can be either "alert" or "error".**

- The location object will follow the specification defined in the file-validator section for a CSV file input.

# NACC/Flywheel Alert Handling

Alerts will exist next to the errors, under "cleared":

```json
"info": {
    "qc": {
        "<gear-name>": {
            "job_info": {"<auto-generated>"},
            "validation": {
                "state": "PASS or FAIL",
                "data": [
                    "<error-object1>",
                    ...
                    "<error-objectN>"
                ],
                "cleared": ["<clear-object1>", ...]
            }
        }
    }
}
```

The structure of "clear-object" will be as follows:

```json
{
    "alert-hash": "<Hash of the alert this object is referring to>",
    "clear": "bool",
    "provenance": ["<provenance-object>"]
}
```

where a provenance object is as follows:

```json
{
    "user": "<user-email>",
    "clear-set-to": "bool",
    "timestamp": "<timestamp>"
}
```

where "clear-set-to" records the state that the user set the "clear" item to in the level above.

# UI Columns

The UI will have this subset of the columns present in the "error object" keys:

```json
{
    "type": "str - 'error' or 'alert'",
    "code": "str - Type of the error (e.g. MaxLength)",
    "location": "<location object>",
    "value": "str - current value",
    "expected": "str - expected value",
    "message": "str - error message description",
    // New items:
    "ptid": "str - the ptid",
    "visit_number": "str - the visit number",
    "Initial": "str - ?"
}
```

where `flywheel_path` and `container_id` are hidden.

A `subjectURL` and `sessionURL` will also be created, but will be displayed as links where the text is the subject/session label.

An "acquisition" and "file" column will also be created as links, but these will link to the parent session (Flywheel limitation).

There will also be a "cleared status" column, which will contain a button to clear or unclear an alert.

## Column summary:

### Displayed Columns:

```json
{
    // From the original dataview, some or all may be present
    "type": "str - 'error' or 'alert'",
    "code": "str - Type of the error (e.g. MaxLength)",
    "location": "<location object>",
    "value": "str - current value",
    "expected": "str - expected value",
    "message": "str - error message description",
    "ptid": "str - the ptid",
    "visit_number": "str - the visit number",
    "Initial": "str - ?",
    // Derived columns, always present:
    "Ignored": "bool - is the alert ignored?",
    "subject": "str - link to subject",
    "session": "str - link to session",
    "acquisition": "str - link to session",
    "file": "str - link to file"
}
```

### Hidden Columns:

```json
{
    // Values from dataview
    "flywheel_path": "str - flywheel path",
    "container_id": "str - file_id",
    "clearedProv": "str - the cleared provenance",
    // Derived columns
    "hash": "str - a calculated hash for the row"
}
```
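
The clear-object and provenance structures from the Alert Handling section suggest how the "cleared status" button might update a file's metadata. The following is a minimal Python sketch, not the actual implementation: the function name `set_cleared`, the ISO 8601 UTC timestamp format, and the choice to create a clear-object the first time an alert is toggled are all assumptions. The alert hash is taken as given, since this document does not define how it is calculated.

```python
from datetime import datetime, timezone


def set_cleared(cleared, alert_hash, user, clear):
    """Set the cleared state for one alert and record who changed it.

    `cleared` is the list stored under validation["cleared"]. It is
    modified in place, and the affected clear-object is returned.
    (Hypothetical helper; the name and signature are illustrative.)
    """
    # Find the clear-object for this alert, or create it on first use.
    entry = next((c for c in cleared if c["alert-hash"] == alert_hash), None)
    if entry is None:
        entry = {"alert-hash": alert_hash, "clear": False, "provenance": []}
        cleared.append(entry)

    # Update the current state.
    entry["clear"] = clear

    # Append a provenance record so the history of changes is preserved.
    # Assumption: timestamps are ISO 8601 strings in UTC.
    entry["provenance"].append(
        {
            "user": user,
            "clear-set-to": clear,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        }
    )
    return entry


# Example: a user clears an alert, then un-clears it again.
cleared = []
set_cleared(cleared, "abc123", "user@example.org", True)
set_cleared(cleared, "abc123", "user@example.org", False)
```

After these two calls, `cleared` holds a single clear-object whose "clear" value reflects the latest action, while "provenance" keeps one record per click. Appending to the provenance list rather than overwriting it is what would let the hidden `clearedProv` column show the full history.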