# TDP System Validation
An overview of data file processing is illustrated below.

TDP validates data files (and the data therein) sequentially: it *first* checks that the uploaded file is the correct file type and *finally* checks for consistency across related records within a file.

Key points about this workflow, which are described in more detail further below:
* Records are not stored in the TDP db until they pass the record pre-parsing checks (Step 5).
* Data quality checks are not performed until Step 6, which means the data OFA works with may need additional cleaning.
* Any data file that is successfully submitted (Step 2) will be transferred to ACFTitan.
    * This transfer feature was introduced in August 2022 (before parsing/validation was implemented) to ensure any file submitted via TDP could still be processed through TDRS. This feature will be deprecated in May 2024.

```mermaid
graph TD;
Start(Start) --> Upload_File{1. File uploaded?};
Upload_File -->|No| End1{Stop};
Upload_File -->|Yes| Submit_File{2. File submitted?};
Submit_File -->|No| End2{Stop};
Submit_File -->|Yes| File_Storage{3. Store file in S3};
File_Storage{3. Store file in S3}-->Cat1{4. File pre-parsing checks};
Cat1 -.->|Fail| End3{Generate feedback report};
Parsing -.->|Fail| End3{Generate feedback report};
Cat2 -.->|Fail| End3{Generate feedback report};
Cat3 -.->|Fail| End3{Generate feedback report};
Cat4 -.->|Fail| End3{Generate feedback report};
End3 --> End4 --> End5(End);
Cat1{4. CATEGORY1: File pre-parsing checks} --> |Pass| Parsing{5. CATEGORY1: Record pre-parsing checks} -->|Parse/store record in db| Cat2{6. CATEGORY2: Out-of-range value checks}-->|Pass| Cat3{7. CATEGORY3: Record inconsistency checks} --> |Pass| Cat4{8. CATEGORY4: Case inconsistency checks} --> |Pass| End4{Transfer file to ACFTitan};
%% Format some blocks
style Start fill:#CCCCCC,stroke:#333,stroke-width:2px;
style End1 fill:#ffcccc,stroke:#333,stroke-width:2px;
style End2 fill:#ffcccc,stroke:#333,stroke-width:2px;
style End3 fill:#ffd750,stroke:#333,stroke-dasharray: 3,3;
style End4 fill:#8FBC8F,stroke:#333,stroke-width:2px;
style End5 fill:#CCCCCC,stroke:#333,stroke-width:2px;
%% Define nodes with labels and font colors using HTML-like labels
Start((<span style="color: black;">Start</span>));
End1((<span style="color: black;">Stop</span>));
End2((<span style="color: black;">Stop</span>));
End5((<span style="color: black;">End</span>));
```
---
---
## Data submission workflow: `success`
Success is defined here as `no errors detected by the system`.
**File uploaded :heavy_check_mark:**

**File submitted :heavy_check_mark:**
* on-screen banner

* submission history shows `Accepted` status with the total number of unique cases submitted. No errors were detected by the system, so no feedback report is generated.

* Data submission email notification

**Data file stored in S3 :heavy_check_mark:**
* From the Django Admin Console (DAC), admins can access the data file stored in S3 (with validation metrics)

**Data parsed :heavy_check_mark:**
* From DAC, admins can see the parsed records from the data file

**Data file transferred**
* From DAC, admins can see evidence that the file has been transferred to ACFTitan (*screenshot of file landing on ACFTitan also included*)


---
---
## What happens when the data submission workflow is `NOT successful`?
There are a few points of `failure` in the workflow--each with different impacts--which are summarized below:
* Step 1. File upload failed
* Step 2. File submission failed
* Steps 3-8. File submitted but errors were detected
* File not transferred to ACFTitan
---
### Step 1. File upload
* Only *plain-text* files with a `.txt` extension OR files that follow legacy file naming conventions (e.g. `ADS.E2J.NDM3.TS01`) can be uploaded. **If anything else is uploaded, an error message is returned to the end user.** Admins will NOT know this unless end users reach out for help.

<u>Important Note</u>: This file is not in the TDP system, so it will not be transferred to ACFTitan.

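
The filename rule in Step 1 can be sketched as below. This is a minimal illustration, not TDP's actual check: the function name `is_accepted_filename` and the regular expression are hypothetical, and the pattern is inferred only from the single legacy example shown above (`ADS.E2J.NDM3.TS01`), so the real rule may accept other forms. Content-type checking is left out of the sketch.

```python
import re

# Hypothetical pattern inferred from the one legacy example in this document
# (ADS.E2J.NDM3.TS01). The real naming convention may be broader or stricter.
LEGACY_NAME = re.compile(r"^ADS\.E2J\.[A-Z]{3}\d\.TS\d+$", re.IGNORECASE)


def is_accepted_filename(filename: str) -> bool:
    """Return True for .txt files or files matching the assumed legacy pattern."""
    if filename.lower().endswith(".txt"):
        return True
    return LEGACY_NAME.match(filename) is not None


print(is_accepted_filename("section1.txt"))       # .txt extension
print(is_accepted_filename("ADS.E2J.NDM3.TS01"))  # legacy name
print(is_accepted_filename("section1.xlsx"))      # anything else is refused
```
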
---
### Step 2. File submission
* Files must pass security scanning to be submitted for processing. Infected files are thrown out and an error message is returned to the user:

Additionally, admins will see evidence of this from Django Admin Console (DAC) :arrow_down:

<u>Important Note</u>: This file is not in the TDP system, so it will not be transferred to ACFTitan.

---
### Step 3. File storage
:information_source: No failures have been observed (yet) at this stage of the workflow. If a file is successfully submitted, it should be stored in S3 _indefinitely_.

---
### Step 4. File pre-parsing checks (Category 1)
These types of errors are considered to be violations of the expected file layout, so **`NONE`** of the records can be parsed. Most of these types of errors are related to problems detected in the `HEADER` record (see examples below). When this happens:
* file status is `Rejected` and an error report is generated.

* the file is _still_ stored in S3
* the file is _still_ transferred to ACFTitan :warning:
---
### Step 5. Record pre-parsing checks (Category 1)
These types of errors are considered to be violations of the expected record layout, so at least some of the records in the file cannot be parsed. This type of error is related to the length of the record. When this happens:
* file status is `Partially accepted with errors` and an error report is generated. The record with the incorrect length (T2 record in the example below) is NOT parsed. :warning:

* the file is _still_ stored in S3
* the file is _still_ transferred to ACFTitan :warning:
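
A record-length check of this kind could look roughly like the sketch below. The lengths in `EXPECTED_LENGTH` are placeholder values chosen for illustration and are NOT the real record lengths, which come from the transmission file layouts. The function name and the error message wording are also assumptions.

```python
# Placeholder lengths for illustration only; the real values are defined in
# the transmission file layouts, not in this document.
EXPECTED_LENGTH = {"T1": 12, "T2": 10}


def split_by_length(lines):
    """Separate records that can be parsed from records with a bad length."""
    parsable, errors = [], []
    for line_number, line in enumerate(lines, start=1):
        record_type = line[:2]
        expected = EXPECTED_LENGTH.get(record_type)
        if expected is None:
            # HEADER, TRAILER and unknown record types are skipped in this sketch.
            continue
        if len(line) == expected:
            parsable.append(line)
        else:
            errors.append(
                f"Line {line_number}: {record_type} record length is "
                f"{len(line)}, expected {expected}"
            )
    return parsable, errors


sample = ["HEADER", "T1" + "x" * 10, "T2" + "x" * 5]
good, bad = split_by_length(sample)
print(good)  # only the T1 record is kept for parsing
print(bad)   # the short T2 record is reported, not parsed
```
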
---
### Step 6. Out-of-range value checks (Category 2)
These types of errors are based on the [coding instructions](https://www.acf.hhs.gov/ofa/policy-guidance/acf-ofa-pi-23-04). Any item value that is not listed as a possible value in the coding instructions will yield an error. When this happens:
* file status is `Accepted with errors` and an error report is generated. The records are still parsed. :warning:

* the file is _still_ stored in S3
* the file is _still_ transferred to ACFTitan :warning:
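
As a rough sketch, an out-of-range check compares each item value against its allowed set. Item 76 and its values are taken from the sample Cat 2 message later in this document. The dictionary layout, the record keys, and the function name are assumptions made for illustration.

```python
# Item 76 and its allowed values come from the sample Cat 2 message in this
# document; the structure of this lookup table is a hypothetical choice.
ALLOWED_VALUES = {
    "76": ("Citizenship/Alienage", {0, 1, 2, 9}),
}


def out_of_range_errors(record: dict) -> list[str]:
    """Return one message per item whose value is not in its allowed set."""
    errors = []
    for item, (name, allowed) in ALLOWED_VALUES.items():
        if record.get(item) not in allowed:
            errors.append(
                f"Item {item} ({name}) must be in set of values {sorted(allowed)}"
            )
    return errors


print(out_of_range_errors({"76": 5}))
print(out_of_range_errors({"76": 9}))
```
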
---
### Step 7. Record inconsistency checks (Category 3)
These types of checks look for inconsistency in the values across related items in a record. When errors are found:
* file status is `Accepted with errors` and an error report is generated. The records are still parsed. :warning:

* the file is _still_ stored in S3
* the file is _still_ transferred to ACFTitan :warning:
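
A record inconsistency check is a conditional rule across items in the same record. The sketch below encodes the sample Cat 3 message that appears later in this document. The record keys (`item_30`, `item_34f`) and the function name are hypothetical.

```python
def marital_status_errors(record: dict) -> list[str]:
    """Cat 3 sketch: if Item 30 is 1, Item 34F must be 1 or 2."""
    # The key names are assumptions; only the rule itself is from this document.
    if record.get("item_30") == 1 and record.get("item_34f") not in {1, 2}:
        return [
            "If Item 30 (Family Affiliation) is 1 then "
            "Item 34F (Marital Status) must be in set of values [1, 2]."
        ]
    return []


print(marital_status_errors({"item_30": 1, "item_34f": 3}))  # rule violated
print(marital_status_errors({"item_30": 2, "item_34f": 3}))  # rule does not apply
```
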
---
### Step 8. Case inconsistency checks (Category 4)
<mark>(implementation in-progress)</mark> These types of checks look for inconsistency across related records in a file. When errors are found:
* file status is `Accepted with errors` and an error report is generated. The records are still parsed. :warning:
    * _example_: a family (T1) record for a given month has no associated adult (T2) or child (T3) record showing that someone in the family's case is a TANF recipient.
* the file is _still_ stored in S3
* the file is _still_ transferred to ACFTitan :warning:
* these records will be removed from the db :question:
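
Since this category is still being implemented, the following is only a sketch of how such a cross-record check might work. It assumes records are dictionaries with hypothetical keys `type`, `case_number`, `month`, and `family_affiliation`, and that a family affiliation of 1 marks a TANF recipient (per the sample Cat 4 message later in this document).

```python
def cases_without_recipient(records):
    """Return (case_number, month) keys that have a T1 but no T2/T3 recipient."""
    has_family = set()
    has_recipient = set()
    for rec in records:
        key = (rec["case_number"], rec["month"])
        if rec["type"] == "T1":
            has_family.add(key)
        elif rec["type"] in ("T2", "T3") and rec.get("family_affiliation") == 1:
            # Assumption: family affiliation 1 means "TANF recipient".
            has_recipient.add(key)
    return sorted(has_family - has_recipient)


records = [
    {"type": "T1", "case_number": "A", "month": "202401"},
    {"type": "T2", "case_number": "A", "month": "202401", "family_affiliation": 1},
    {"type": "T1", "case_number": "B", "month": "202401"},
    {"type": "T3", "case_number": "B", "month": "202401", "family_affiliation": 2},
]
print(cases_without_recipient(records))  # case B has no recipient on record
```
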
---
### File not transferred to ACFTitan
The file transfer feature is temporary and will be deprecated when TDRS is decommissioned in May 2024. That said, it is rare for a successfully submitted file not to be transferred. When this happens, the cause could be:
- Network traffic (a lot of activity in the system)
- A configuration issue (e.g. missing environment variables that are used to connect to ACFTitan)
- The ACFTitan server being down

Admins can see this from TDP (see below):

---
### What about files with `Pending` status?
This could mean a few things:
- file parsing/validation is still in progress (usually for large files)
- the file was submitted before parsing/validation was implemented
- an uncaught exception occurred (usually because the file is poorly formatted; common among STTs who do not use FTANF)
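
The statuses described in Steps 4-8 and in this section can be summarized with a small decision function. This is a sketch under stated assumptions: the error-kind labels (`file_precheck`, `record_precheck`, and so on) are hypothetical names, and the precedence used when several kinds of error occur in one file is a guess rather than documented TDP behavior.

```python
def file_status(parsing_finished: bool, error_kinds: set[str]) -> str:
    """Map the kinds of errors found in a file to a submission status."""
    if not parsing_finished:
        # Covers in-progress parsing, pre-validation submissions, and crashes.
        return "Pending"
    if "file_precheck" in error_kinds:
        return "Rejected"
    if "record_precheck" in error_kinds:
        return "Partially accepted with errors"
    if error_kinds:
        # Any remaining kind (Category 2, 3, or 4 in this sketch).
        return "Accepted with errors"
    return "Accepted"


print(file_status(False, set()))
print(file_status(True, {"record_precheck", "cat3"}))
print(file_status(True, set()))
```
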
---
## Error descriptions
This section contains links to legacy error descriptions (for TDRS) and TDP error descriptions.
### TDRS error descriptions
#### [system-level edit codes](https://www.acf.hhs.gov/sites/default/files/documents/ofa/tanf_system_edits_fatal_and_warning.pdf)
#### [TANF](https://www.acf.hhs.gov/ofa/policy-guidance/final-tanf-ssp-moe-data-reporting-system-transmission-files-layouts-and-edits#tanf-edits)
#### [SSP](https://www.acf.hhs.gov/ofa/policy-guidance/final-tanf-ssp-moe-data-reporting-system-transmission-files-layouts-and-edits#ssp-edits)
#### [Tribal TANF](https://www.acf.hhs.gov/ofa/policy-guidance/final-tanf-ssp-moe-data-reporting-system-transmission-files-layouts-and-edits#tanf-edits)
Tribal TANF edit codes are the same as TANF, except for any edit codes related to work-eligible individual (TANF Item 48), because Tribal TANF does not have this item.

---
### TDP error descriptions, by category

Error categories 3 through 6 (below) are based on validation checks across record types and sections. The relationship between sections and record types is illustrated herein:

---
#### Category 1: File pre-check
- header record missing
- header record incorrect length
- HEADER calendar quarter does not match fiscal period
- etc.
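
The calendar quarter check in the list above relies on converting between fiscal and calendar periods. The sketch below uses the U.S. federal fiscal year, which begins on October 1, so fiscal Q1 is calendar Q4 of the previous year. How TDP obtains each value (for example, the calendar quarter from the `HEADER` record and the fiscal period from the upload form) is an assumption here, as are the function names.

```python
def fiscal_to_calendar(fiscal_year: int, fiscal_quarter: int) -> tuple[int, int]:
    """Convert a federal fiscal (year, quarter) to a calendar (year, quarter)."""
    if fiscal_quarter not in (1, 2, 3, 4):
        raise ValueError("fiscal_quarter must be 1-4")
    if fiscal_quarter == 1:
        # Fiscal Q1 (Oct-Dec) falls in calendar Q4 of the previous year.
        return fiscal_year - 1, 4
    return fiscal_year, fiscal_quarter - 1


def header_matches_period(header_year, header_quarter, fiscal_year, fiscal_quarter):
    """True when the header's calendar quarter matches the fiscal period."""
    return (header_year, header_quarter) == fiscal_to_calendar(
        fiscal_year, fiscal_quarter
    )


print(fiscal_to_calendar(2024, 1))
print(header_matches_period(2024, 1, 2024, 1))
```
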
---
#### Category 1: Record pre-check
---
#### Category 2: Out-of-range check
**TANF Section 1**
* Workbook of validation checks by record type and category type [**here**](https://hhsgov.sharepoint.com/:x:/r/sites/TANFDataOFA/Shared%20Documents/TANF%20Data%20System/TDP_System_Validation/TANF/TDP_Validation_TANF_Section1.xlsx?d=w25787a245fb24f10934542aadddfff1f&csf=1&web=1&e=9qVwVG)
    * Includes error categories 2 - 4.
* The error messages, as they would appear in TDP feedback reports, are still being refined. Below are some examples of how they should appear, by category type:
    * Cat 2: `Item 76 (Citizenship/Alienage) must be in set of values [0, 1, 2, 9]`
    * Cat 3: `If Item 30 (Family Affiliation) is 1 then Item 34F (Marital Status) must be in set of values [1, 2].`
    * Cat 4: `Item 30 (Family Affiliation) or Item 67 (Family Affiliation) must be 1`
---
#### Category 3: Record value consistency check
---
#### Category 4: Case consistency check
---
#### Category 5: Section consistency check
**<mark>(not yet implemented)</mark> Errors re: inconsistent values across related sections of data**: These types of errors are based on additional data quality checks OFA conducts. Because sections of data can be submitted at different points in time, current thinking suggests that these checks would need to run against data from the db. Example: `total # of families reported in Section 1 > total # of families reported in Section 3`.

---
#### Category 6: Historical consistency check
**<mark>(not yet implemented)</mark> Errors re: inconsistent values across related records and/or sections over time**: These errors are also based on additional data quality checks OFA conducts and would benefit from checks against data from the db. Example: `state did not submit enough case records to meet annual sample size requirements`.