# Gerbil Project Future Plan
## Terms (just for Shane's reference)
- Datacell: A unique combination of stimulus type (e.g., Audio/Visual) and frequency (e.g., 10Hz).
- Stoppoint: The timestamp up to which the data is used for training, e.g., 20, 30, 40.
## For the entire dataset
### General structure
Traverse the entire dataset: iterate over all types (A/V & Freq), fetch each type of data, train a model for each type, and plot the result for each type.
- file management:
  - save/read parameters for each type of data
  - save/read trained models
  - save the data used for model training
  - save results into a CSV file
  - save plot images into a folder
  - `config_and_result.json` contains all config and all data we need for plotting (initialized as empty at the beginning and filled in during training)

**Side note: we could use one file to save all config and results since their structures are basically the same**
```json
{
  "Audio": {
    "10Hz": {
      "data_path": "path/to/data",
      "save_path": "path/to/models",
      "log_path": "path/to/logs",
      "plot_path": "path/to/plots",
      "stoppoints": [
        {
          "criteria": "angle",
          "criteria_value": 20,
          "timestamp": 12, // either sum trace or sum timestamp; decide later
          "result": {"acc": "", "loss": ""}
        },
        {
          "criteria": "percentage",
          "criteria_value": 40,
          "timestamp": 35,
          "result": {"acc": "", "loss": ""}
        }
      ]
    }
  }
}
```
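As a rough sketch of how this combined config/result file could be read, filled in after training, and written back (all function names here are hypothetical, not existing project code):

```python
import json

CONFIG_PATH = "config_and_result.json"  # hypothetical default location

def load_config(path=CONFIG_PATH):
    """Read the combined config/result file; start empty if it does not exist yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}

def save_result(config, stim_type, freq, stoppoint_index, acc, loss,
                path=CONFIG_PATH):
    """Fill in the result slot for one stoppoint of one datacell and persist it."""
    cell = config[stim_type][freq]
    cell["stoppoints"][stoppoint_index]["result"] = {"acc": acc, "loss": loss}
    with open(path, "w") as f:
        json.dump(config, f, indent=2)
```

One caveat: the `//` comment in the example above is not valid JSON, so the real file would need to drop it (or move that note into this document) before `json.load` can parse it.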
- find the timestamp at which to stop (a function that takes either `timestamp_of_train`, `angle_of_train`, or `percent_of_train`)
- read config for each datacell
- read data, init the model, train the model, save output (details in the next section)
- Finally, we can plot the graph based on this info.
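A minimal sketch of the stoppoint-finding step, assuming the trial data exposes cumulative angle and percentage values aligned with each timestamp (the parameter names match the three options listed above; everything else is a placeholder):

```python
def find_stop_timestamp(timestamps, angles, percentages,
                        timestamp_of_train=None,
                        angle_of_train=None,
                        percent_of_train=None):
    """Return the first timestamp at which the chosen criterion is met.

    Exactly one of the three criteria should be given. `angles` and
    `percentages` are cumulative values aligned index-by-index with
    `timestamps`.
    """
    if timestamp_of_train is not None:
        return timestamp_of_train
    if angle_of_train is not None:
        for t, a in zip(timestamps, angles):
            if a >= angle_of_train:
                return t
    elif percent_of_train is not None:
        for t, p in zip(timestamps, percentages):
            if p >= percent_of_train:
                return t
    raise ValueError("no criterion given, or threshold never reached")
```

Whether "angle" means sum over traces or sum over timestamps (the open question noted in the config above) only changes how `angles` is computed upstream, not this function.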
### Structure
- loop through all types
  - loop through all freqs
    - load data
    - loop through all stoppoints
      - init model
      - train model
      - save result
- data analysis
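The loop structure above could be sketched as a small driver, with every helper (`load_data`, `init_model`, `train`, `save_result`) passed in as a placeholder for the project's actual code:

```python
def run_all(config, load_data, init_model, train, save_result):
    """Batch-train: iterate types -> freqs -> stoppoints, saving each result.

    `config` follows the `config_and_result.json` layout; the four
    callables are hypothetical stand-ins for the real pipeline pieces.
    """
    for stim_type, freqs in config.items():       # e.g. "Audio", "Visual"
        for freq, cell in freqs.items():          # e.g. "10Hz"
            data = load_data(cell["data_path"])   # load once per datacell
            for i, stop in enumerate(cell["stoppoints"]):
                model = init_model(cell)          # fresh model per stoppoint
                acc, loss = train(model, data, stop["timestamp"])
                save_result(stim_type, freq, i, acc, loss)
```

Because the helpers are injected, the same driver handles the "train one model vs. batch train" requirement from the Problems section: pass a config containing a single datacell to train just one.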
## For each data cell
- It should be pretty straightforward:
- there should be a function that reads the configs
- there should be a function that loads the data (from config)
- there should be a function that initializes the model (from config)
- training
- save the result to a JSON file
## Problems
- the result structure is not perfect
- [x] we need to retain as much info as possible (like angle or percentage and stoppoint) without making it too complicated
- [ ] we need to make it dynamic so we can choose to either train one model or batch train (for future convenience)
- [ ] we need to make it as modular as possible so we can cooperate better.
## Deadlines
### Current focus -- Run our model on a single datacell
- Build a file handling system based on our data and generate the config file: 11/7
- Build the timestamp-finding algorithm: 11/14