# NACC/Flywheel Error Handling
## Metadata Location
The errors shall be appended to the files themselves under the custom information.
The errors shall be nested under `info.qc.<gear-name>.validation.data`
```json
"info": {
    "qc": {
        "<gear-name>": {
            "job_info": {"<auto-generated>"},
            "validation": {
                "state": "PASS or FAIL",
                "data": [ "<error-object1>", ... "<error-objectN>" ]
            }
        }
    }
}
```
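As an illustration of this layout, the following sketch reads the validation state and the error list out of a file's custom information. It assumes the `info` dictionary has already been retrieved; the helper name `get_validation` and the gear name used in the example are illustrative only and not part of any Flywheel API.

```python
from typing import Any, Dict, List, Optional, Tuple


def get_validation(info: Dict[str, Any], gear_name: str) -> Tuple[Optional[str], List[Any]]:
    """Return (state, errors) stored under info.qc.<gear-name>.validation."""
    validation = info.get("qc", {}).get(gear_name, {}).get("validation", {})
    return validation.get("state"), validation.get("data", [])


# Example custom information shaped like the layout above (values are made up).
example_info = {
    "qc": {
        "file-validator": {
            "job_info": {},
            "validation": {
                "state": "FAIL",
                "data": [{"type": "error", "code": "EmptyFile"}],
            },
        }
    }
}

state, errors = get_validation(example_info, "file-validator")
```

Missing levels fall back to an empty result, so a file that has not been processed by the gear yields `(None, [])` rather than raising.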
## Adding Metadata
The metadata will be added using the Flywheel gear toolkit context's built-in
[qc-report generator](https://flywheel-io.gitlab.io/public/gear-toolkit/flywheel_gear_toolkit/context/#adding-a-qc-result-or-gear-info).
For example:
`context.metadata.add_qc_result('form.json', "validation", state="FAIL", data=[<error-objects>])`
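The call can be wrapped in a small helper, sketched below. It assumes that `add_qc_result` stores any extra keyword arguments (here `data`) next to `state` in the QC result; `report_errors` is a hypothetical name, and `context` is expected to be the gear toolkit context.

```python
from typing import Any, Dict, List


def report_errors(context: Any, file_name: str, errors: List[Dict[str, Any]]) -> str:
    """Record validation results for file_name through the gear context.

    Assumption: add_qc_result accepts extra keyword arguments and stores
    them alongside the state under info.qc.<gear-name>.validation.
    """
    # An empty error list means the file passed validation.
    state = "FAIL" if errors else "PASS"
    context.metadata.add_qc_result(file_name, "validation", state=state, data=errors)
    return state
```

Deriving the state from the error list keeps `state` and `data` consistent; a gear that distinguishes alerts from errors may want a different rule.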
## Error Object Format
The error object will have the same core keys across all outputs. These core keys are:
```json
{
"type": "str - Can be 'error' or 'alert'",
"code": "str - for example, 'EmptyFile', 'MissingHeader', or 'IncorrectColumnName'",
"flywheel_path": "str – Flywheel path to the container/file",
"container_id": "str – ID of the source container/file",
"message": "str – error message description",
"timestamp": "str - timestamp of when the error was created"
}
```
Each gear may add additional keys as needed for the error reporting.
**NOTE: Adding more keys will clutter the UI extension view.**
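A minimal sketch of a builder for the core error object follows. The function name `make_error` is hypothetical, and the ISO 8601 UTC timestamp is an assumption, since no timestamp format is fixed here.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def make_error(code: str, message: str, flywheel_path: str, container_id: str,
               error_type: str = "error", **extra: Any) -> Dict[str, Any]:
    """Build an error object with the core keys plus optional gear-specific keys."""
    if error_type not in ("error", "alert"):
        raise ValueError(f"type must be 'error' or 'alert', got {error_type!r}")
    error: Dict[str, Any] = {
        "type": error_type,
        "code": code,
        "flywheel_path": flywheel_path,
        "container_id": container_id,
        "message": message,
        # Assumption: ISO 8601 in UTC; adjust to whatever format is agreed on.
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    error.update(extra)  # gear-specific additions, e.g. "location"
    return error


# Placeholder path and ID; real values come from the Flywheel container.
err = make_error("EmptyFile", "The file is empty", "<flywheel-path>", "<container-id>")
```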
### file-validator
For the file validator gear, the error objects will be defined as follows:
```json
{
"type": "str - will always be 'error' in this gear",
"code": "str - Type of the error (e.g. MaxLength)",
"location": "<location object>",
"flywheel_path": "str – Flywheel path to the container/file",
"container_id": "str – ID of the source container/file",
"value": "str – current value",
"expected": "str – expected value",
"message": "str – error message description",
"timestamp": "str - timestamp of when the error was created"
}
```
The value of `location` will be formatted as follows:
- For JSON input file:
`{ "key_path": "str - the JSON key that raised the error" }`
- For CSV input file:
```
{ "line": "int - the row number that raised the error",
"column_name": "str - the column name that raised the error" }
```
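To show how the location object fits into a file-validator error, here is a sketch of a single length check. The helper names, the wording of `expected` and `message`, and the timestamp format are assumptions for illustration; the actual gear derives its checks from the JSON schema.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def json_location(key_path: str) -> Dict[str, Any]:
    """Location object for a JSON input file."""
    return {"key_path": key_path}


def csv_location(line: int, column_name: str) -> Dict[str, Any]:
    """Location object for a CSV input file."""
    return {"line": line, "column_name": column_name}


def max_length_error(value: str, max_length: int, location: Dict[str, Any],
                     flywheel_path: str, container_id: str) -> Optional[Dict[str, Any]]:
    """Return a file-validator error object if value is too long, else None."""
    if len(value) <= max_length:
        return None
    return {
        "type": "error",
        "code": "MaxLength",
        "location": location,
        "flywheel_path": flywheel_path,
        "container_id": container_id,
        "value": value,
        "expected": f"at most {max_length} characters",
        "message": f"Value has {len(value)} characters, maximum is {max_length}",
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


# Row 3 of a CSV, column "ptid", with a value that is too long.
error = max_length_error("ABCDEFG", 5, csv_location(3, "ptid"),
                         "<flywheel-path>", "<container-id>")
```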
In addition to the JSON-schema-based validation, the following checks will be implemented:
- Empty file (JSON and CSV)
- Missing header (CSV only)
  - The CSV file has no header: no first-row cell matches any key in the schema
- Incorrect column name in header (CSV only)
  - A CSV header is present that is not in the JSON schema
These special checks will have an error object as follows:
```json
{
"type": "str - will always be 'error' in this gear",
"code": "str - 'EmptyFile', 'MissingHeader', or 'IncorrectColumnName'",
"location": "",
"flywheel_path": "str – Flywheel path to the container/file",
"container_id": "str – ID of the source container/file",
"value": "",
"expected": "",
"message": "str – error message description",
"timestamp": "str - timestamp of when the error was created"
}
```
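The special checks for a CSV input could be sketched as follows. The function returns only the error code; the caller would wrap it in the error object above. The name `special_check`, the exact (case-sensitive) header matching, and the order of the checks are assumptions.

```python
import csv
import io
from typing import Optional, Set


def special_check(csv_text: str, schema_keys: Set[str]) -> Optional[str]:
    """Return 'EmptyFile', 'MissingHeader', 'IncorrectColumnName', or None."""
    if not csv_text.strip():
        return "EmptyFile"
    # Read only the first row, which is expected to be the header.
    first_row = next(csv.reader(io.StringIO(csv_text)), [])
    header = [cell.strip() for cell in first_row]
    known = [name for name in header if name in schema_keys]
    if not known:
        # No first-row cell matches any schema key: treat as a missing header.
        return "MissingHeader"
    if len(known) < len(header):
        # At least one header cell is not defined in the schema.
        return "IncorrectColumnName"
    return None
```

For example, with schema keys `{"adcid", "ptid"}`, a file whose first row is `1,2` is reported as `MissingHeader`, while a first row of `adcid,foo` is reported as `IncorrectColumnName`.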
### identifier-lookup (owner: NACC)
The identifier-lookup gear takes a CSV as input, reads two columns (ADCID and PTID), and
calls the NACC API to get the corresponding NACC ID. Errors are reported as metadata under
`info.qc.<gear-name>.validation.data`, where "data" is an array of error
objects.
- The error objects will follow the specification in the file-validator section, above.
- The location object will follow the specification defined in the file-validator
section for a CSV file input.
This gear will also make the same special checks for:
- Empty file
- Missing header
- Incorrect column name in header

These use the same error-object format as specified in the file-validator section for special
checks.
### form-qc-checker (owner: NACC)
The error object for the form-qc-checker will follow the specification defined in
file-validator, with the addition of three new items:
```json
{
"type": "str - 'error' or 'alert'",
"code": "str - Type of the error (e.g. MaxLength)",
"location": "<location object>",
"flywheel_path": "str – Flywheel path to the container/file",
"container_id": "str – ID of the source container/file ",
"value": "str – current value",
"expected": "str – expected value",
"message": "str – error message description" ,
//New Items:
"ptid": "str - the ptid",
"visit_number": "str - the visit number",
"Initial": "str - ?"
}
```
**Note that this is the only gear where the "type" key can be either "alert" or "error"**
- The location object will follow the specification defined in the file-validator section
for a CSV file input.
# NACC/Flywheel Alert Handling
The cleared status of alerts will be stored next to the errors, under "cleared":
```json
"info": {
    "qc": {
        "<gear-name>": {
            "job_info": {"<auto-generated>"},
            "validation": {
                "state": "PASS or FAIL",
                "data": [ "<error-object1>", ... "<error-objectN>" ],
                "cleared": [ "<clear-object1>", ... ]
            }
        }
    }
}
```
The structure of "clear-object" will be as follows:
```json
{
"alert-hash": "<Hash of the alert this object is referring to>",
"clear": bool,
"provenance": ["<provenance-object>"]
}
```
where a provenance object is as follows:
```json
{
"user": "<user-email>",
"clear-set-to": bool,
"timestamp": "<timestamp>"
}
```
where "clear-set-to" records the state to which that user set the "clear" item in the level above.
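A sketch of how a clear-object might be updated: each change sets `clear` and appends a provenance entry whose `clear-set-to` records the new value. The function name `set_clear`, the example email address, and the ISO 8601 timestamp format are assumptions.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def set_clear(clear_object: Dict[str, Any], user: str, clear: bool) -> Dict[str, Any]:
    """Set "clear" and append a provenance entry recording who changed it."""
    clear_object["clear"] = clear
    clear_object.setdefault("provenance", []).append({
        "user": user,
        "clear-set-to": clear,
        # Assumption: ISO 8601 in UTC.
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })
    return clear_object


entry = {"alert-hash": "<hash>", "clear": False, "provenance": []}
set_clear(entry, "user@example.org", True)   # alert cleared
set_clear(entry, "user@example.org", False)  # alert un-cleared again
```

Appending rather than overwriting keeps the full history, so the current state is always the `clear-set-to` of the last provenance entry.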
# UI Columns
The UI will display this subset of the "error object" keys as columns:
```json
{
"type": "str - 'error' or 'alert'",
"code": "str - Type of the error (e.g. MaxLength)",
"location": "<location object>",
"value": "str – current value",
"expected": "str – expected value",
"message": "str – error message description" ,
//New Items:
"ptid": "str - the ptid",
"visit_number": "str - the visit number",
"Initial": "str - ?"
}
```
where `flywheel_path` and `container_id` are hidden.
A `subjectURL` and `sessionURL` will also be created, and will be displayed as links whose text is the subject/session label.
An "acquisition" and "file" column will also be created; both will be links, but they will point to the parent session (a Flywheel limitation).
There will also be a "cleared status" column, which will contain a button to clear or unclear an alert.
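As a rough illustration of the displayed/hidden split, the sketch below separates one error object into the two groups. It covers only the keys that come from the error object itself; the derived link and cleared-status columns are not modeled, and the names `HIDDEN_KEYS` and `split_columns` are hypothetical.

```python
from typing import Any, Dict, Tuple

# Keys that are kept in the data but not shown in the UI.
HIDDEN_KEYS = ("flywheel_path", "container_id")


def split_columns(error: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split an error object into (displayed, hidden) column dictionaries."""
    displayed = {k: v for k, v in error.items() if k not in HIDDEN_KEYS}
    hidden = {k: v for k, v in error.items() if k in HIDDEN_KEYS}
    return displayed, hidden
```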
## Column summary:
### Displayed Columns:
```json
{
// From the original dataview, some or all may be present
"type": "str - 'error' or 'alert'",
"code": "str - Type of the error (e.g. MaxLength)",
"location": "<location object>",
"value": "str – current value",
"expected": "str – expected value",
"message": "str – error message description",
"ptid": "str - the ptid",
"visit_number": "str - the visit number",
"Initial": "str - ?",
// Derived Columns, always present:
"Ignored": "bool - is the alert ignored?",
"subject": "str - link to subject",
"session":"str - link to session",
"acquisition": "str - link to session",
"file": "str - link to file",
}
```
### Hidden Columns:
```json
{
// values from dataview
"flywheel_path": "str - flywheel path",
"container_id": "str - file_id",
"clearedProv": "str - the cleared provenance",
// Derived columns
"hash": "str - a calculated hash for the row"
}